Technical SEO
Indexing
Indexing is how search engines store and organize your content so it can appear in search results; if a page is not indexed, it cannot be found through search. Understanding how indexing works helps you make sure important content gets indexed and keep content that shouldn't appear in search results out of the index.
How indexing works
After search engines crawl your site and discover content, they analyze it and decide whether to add it to their index—a massive database of web pages that can be searched. Only indexed content can appear in search results.
Not all crawled content gets indexed. Search engines may skip duplicate content, thin or low-quality pages, pages that violate their quality guidelines, or pages that are explicitly blocked from indexing. During indexing, each page is evaluated for quality, uniqueness, and relevance before it is added to the index.
You can check whether your content is indexed using Google Search Console's page indexing report (formerly the Coverage report). It shows which pages are indexed, which aren't, and the reason for each exclusion. Knowing your indexing status helps you identify and fix the issues that keep important content out of search results.
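For a quick manual spot check, you can also search Google with the site: operator; the URL below is a hypothetical example, so substitute a real page from your own site.

```
site:example.com/menu/espresso-drinks
```

If the page doesn't appear for a query like this, confirm its status in Search Console before assuming it is missing from the index.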
Ensuring content gets indexed
Help search engines index your content by:
- Submitting sitemaps: Provide an XML sitemap in Google Search Console so search engines can discover your content (see the example sitemap below)
- Internal linking: Link to important pages from other pages on your site so search engines can discover them
- Avoiding duplicate content: Ensure each page has unique, valuable content
- Proper site structure: Create a clear structure that makes it easy for search engines to navigate
- Quality content: Create content that meets quality guidelines and provides value
Common reasons content doesn't get indexed include pages blocked by robots.txt, pages marked noindex, duplicate content, low-quality content, and pages with no internal links pointing to them. Fix these issues to help important content get indexed.
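As a concrete illustration, here is a minimal XML sitemap sketch; the domain example.com, the URLs, and the lastmod dates are placeholders, not required values.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per page you want search engines to discover -->
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-05-01</lastmod>
  </url>
  <url>
    <loc>https://example.com/menu/</loc>
    <lastmod>2024-04-18</lastmod>
  </url>
</urlset>
```

Host the file at a stable URL (commonly /sitemap.xml), submit that URL in Search Console, and optionally reference it from robots.txt with a Sitemap: line so other crawlers can find it too.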
Controlling what gets indexed
Sometimes you want to prevent content from being indexed. Use these methods:
- Noindex tags: Add a noindex meta tag to pages you don't want in search results, such as thank-you pages or internal tools (see the snippet after this list)
- Robots.txt: Block crawlers from accessing certain pages (though this doesn't guarantee they won't be indexed if linked from elsewhere)
- Password protection: Protect pages behind login to prevent indexing
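As a sketch, the noindex directive is a robots meta tag placed in the page's head; the page below is hypothetical, and the equivalent X-Robots-Tag HTTP header can be used for non-HTML files such as PDFs.

```html
<!-- Example: a thank-you page that should stay out of search results -->
<head>
  <meta name="robots" content="noindex">
  <title>Thanks for your order</title>
</head>
```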
Use noindex for pages you don't want in search results but still want reachable via direct links. Use robots.txt to keep crawlers away from pages or sections that shouldn't be crawled at all, such as internal search results or endless filter URLs. Be careful: a page blocked in robots.txt can't have its noindex tag read, and blocking important pages can keep them from being crawled and indexed properly, so don't combine the two on the same page.
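A minimal robots.txt sketch, assuming a hypothetical /internal-tools/ section you don't want crawled; the Sitemap line is optional and points to the sitemap discussed above.

```
# robots.txt, served from the site root
User-agent: *
Disallow: /internal-tools/

Sitemap: https://example.com/sitemap.xml
```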
Examples
Well-indexed content
Example: A coffee shop website where all important pages are indexed: homepage, menu pages, location pages, and blog posts. The site has a sitemap submitted to Search Console, important pages are linked from the homepage and navigation, and each page has unique, valuable content.
This site makes it easy for search engines to discover and index important content, which helps the site appear in search results.
Indexing problems
Example: A website where important pages aren't indexed because they have no internal links, are blocked by robots.txt, or are marked as noindex accidentally. The site has no sitemap, and search engines can't discover important content.
This prevents important content from appearing in search results, limiting the site's visibility and ability to serve customers through search.