“Noindex” directives instruct search engines to exclude a page from their indices, rendering it ineligible to appear in search results.
The most common way to instruct search engines not to index a page is to include a Meta Robots tag with a “noindex” directive within the <head> section of the HTML page, as shown below:
<meta name="robots" content="noindex" />
Around 2007, major search engines began implementing support for “noindex” directives in Meta Robots tags. Meta Robots tags can also include other directives, such as “follow” or “nofollow”, which direct search engines to crawl or not crawl the links found on the current page.
Typically, webmasters use the “noindex” directive to keep content that is not intended for search engines out of the index.
Some common use cases for “noindex” directives include internal site-search results pages, login and account pages, thank-you or confirmation pages, and staging or preview versions of pages.
Additionally, search engines support a “noindex” directive delivered via the HTTP response headers for a given page. While this approach is less common and can be harder to identify using common SEO tools, it is sometimes easier for engineers or webmasters to include based on their server configuration.
The name and value for a “noindex” response header are as follows:
X-Robots-Tag: noindex
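As an illustration, on an Apache server this header can be set in the main configuration or an .htaccess file (a sketch, assuming mod_headers is enabled; the PDF pattern is an example, not a requirement):

```apache
# Send a noindex directive for all PDF files (requires mod_headers)
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex"
</FilesMatch>
```

This approach is particularly useful for non-HTML resources such as PDFs, which cannot carry a Meta Robots tag.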
The accidental inclusion of a “noindex” tag or directive on a valuable page can cause that page to be removed from search engine indices and lose all of its organic search traffic.
For example, if a new website version is launched but the “noindex” tags that were added to keep the unfinished pages out of search engine indices are left in place, the new version can immediately stop receiving traffic from search engines.
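Given the stakes, it can be worth checking pages for these directives programmatically. The following is a minimal sketch using only the Python standard library; the function and class names are illustrative, and a production version would also need case-insensitive header matching and would fetch the page over HTTP:

```python
# Sketch: detect a "noindex" directive in an HTML document or its
# response headers. Names here are illustrative, not a standard API.
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content values of <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attr_map = dict(attrs)
            if attr_map.get("name", "").lower() == "robots":
                self.directives.append(attr_map.get("content", "").lower())


def has_noindex(html_text, headers=None):
    """Return True if the page or its headers carry a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    if any("noindex" in d for d in parser.directives):
        return True
    # Simplified header check; real HTTP header names are case-insensitive.
    x_robots = (headers or {}).get("X-Robots-Tag", "").lower()
    return "noindex" in x_robots


page = '<html><head><meta name="robots" content="noindex,follow" /></head></html>'
print(has_noindex(page, {}))  # → True
```

Running a check like this against key pages after a site launch can catch a leftover “noindex” before it causes a drop in traffic.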
Webmasters have often used Meta Robots tags or response headers to signal to search engines that the current page should not be indexed but that the links on it should still be crawled, as with the following Meta Robots tag:
<meta name="robots" content="noindex,follow" />
This is commonly used for paginated listing pages. For example, “noindex,follow” may be applied to a blog’s archive listings, to prevent the archive pages themselves from appearing in search results but allow search engines to crawl, index and rank the blog posts themselves.
However, this approach may not work as intended. Google has explained that its systems eventually treat a “noindex,follow” directive as “noindex,nofollow” – in other words, they will eventually stop crawling the links on any page with a “noindex” directive. This can prevent the link destination pages from being indexed at all, or reduce their PageRank or authority, lowering their rankings for relevant keywords.
While never officially supported, search engines honored “noindex” directives in robots.txt rules for some time. Because robots.txt rules with wildcards can apply to many pages at once without any changes to the pages themselves, this method was preferred by many webmasters. Google does not recommend using robots.txt files to set “noindex” directives and retired the code that supported these rules in September 2019.
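For historical context, an unofficial robots.txt rule of this kind looked like the following (the path is illustrative); Google no longer honors this syntax, and pages blocked this way may still be indexed:

```
User-agent: *
Noindex: /private/
```

Today, Google recommends a Meta Robots tag or an X-Robots-Tag response header instead, since those are the only supported ways to keep a crawlable page out of the index.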