Google Search Console page indexing issues

Topic summary

Google Search Console shows many pages not indexed across several new sites, prompting guidance on common causes and fixes.

Key issues cited:

  • Robots.txt blocks and a “Crawl-delay: 10” setting; questions on whether filtered product URLs should be allowed (see the robots.txt sketch after this list).
  • ‘Noindex’ tags excluding many pages.
  • Canonical tags leading to “Alternate page with proper canonical tag” statuses.
  • 404 and soft 404 errors; ongoing validation failures.
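
A minimal robots.txt sketch tying these pieces together is below; the Disallow pattern and sitemap URL are hypothetical placeholders, not taken from any site in the thread. Note that Googlebot ignores Crawl-delay (some other crawlers honor it), so this directive does not slow Google down.

    # robots.txt sketch (all paths and URLs are hypothetical examples)
    User-agent: *
    # Throttles crawlers that honor this directive; Googlebot ignores it
    Crawl-delay: 10
    # Blocks filtered/faceted product URLs, a common choice to avoid
    # crawling near-duplicate pages; remove this line if they should rank
    Disallow: /*?filter=

    Sitemap: https://www.example.com/sitemap.xml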

Recommended actions:

  • Review robots.txt (use the robots.txt tester) and remove unintended blocks; consider whether filtered results should remain blocked.
  • Locate and remove unintended ‘noindex’ meta tags (example after this list).
  • Fix 404s by updating or removing the dead URLs, or by adding 301 redirects to live replacement pages (sketch below); address soft 404s (thin or empty pages).
  • Check the Page indexing (formerly Coverage) and Manual Actions reports; submit updated sitemaps; request indexing and validate fixes.
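
A stray ‘noindex’ usually comes from a theme or SEO plugin setting rather than hand-written code. In the page source, the tag to look for (and remove from pages that should be indexed) is:

    <!-- Excludes the page from search results; remove it, or the
         equivalent X-Robots-Tag HTTP header, on pages meant to rank -->
    <meta name="robots" content="noindex">

For 404s that have a natural replacement, a 301 redirect preserves the old URL's value. As a sketch, assuming an Apache server and hypothetical paths, an .htaccess rule could look like:

    # .htaccess sketch: permanently redirect an old URL to its successor
    # (both URLs are hypothetical examples)
    Redirect 301 /old-product https://www.example.com/products/new-product

On Nginx the equivalent is a "return 301" rule in the server config. Soft 404s are different: the server returned 200 but Google judged the page thin or empty, so the fix is to add real content or serve a true 404/410.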

Notes on terms:

  • Robots.txt: file controlling crawler access; Crawl-delay limits crawl rate, though Googlebot ignores this directive.
  • Noindex: instructs search engines not to index a page.
  • Canonical tag: marks the preferred URL among duplicates; alternate pages are normal (see the example below).
  • 404/soft 404: not found/low-value pages treated as errors.
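
For reference, a canonical tag sits in the page head; the URL below is a hypothetical example:

    <!-- Names the preferred URL for this content among duplicates -->
    <link rel="canonical" href="https://www.example.com/preferred-page/">

“Alternate page with proper canonical tag” means Google found a duplicate, followed this tag, and indexed the canonical URL instead; no action is needed unless the wrong URL was chosen as canonical.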

Status: No final resolution; the screenshots in the thread are needed to pinpoint the specific errors; discussion remains open with actionable steps provided.

Hi,

I am having the same issues. I have a robots.txt file, I have a sitemap, and I have good redirects that are still working, but I am still getting lots of errors.

The domain is yellowshield.co.uk.