Canonical tags allow webmasters to indicate the original or preferred URL for pages that may be duplicated across multiple URLs.
These tags help search engines select the correct URL for inclusion in their results, consolidate link signals to a single preferred page, and avoid spending crawl budget on requesting and rendering multiple URLs for the same content.
The most common way to add canonical tags to pages is by adding a <link> tag within the <head> of an HTML document.
As with hreflang tags, canonical <link> tags must set the “rel” (relationship) attribute to “canonical”, as below:
<link rel="canonical" href="https://www.example.com/article1/" />
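To confirm that a page actually declares the canonical you expect, you can parse its HTML and read the tag back. The following is a minimal sketch using only Python's standard library; the class and function names (CanonicalFinder, find_canonicals) are illustrative assumptions, and it only looks at <link> tags inside <head>, as described above.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects href values of <link rel="canonical"> tags found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attr_map = dict(attrs)
            # rel may hold several space-separated tokens; compare case-insensitively
            rel_tokens = (attr_map.get("rel") or "").lower().split()
            if "canonical" in rel_tokens and attr_map.get("href"):
                self.canonicals.append(attr_map["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonicals(html):
    """Return every canonical href declared in the document's <head>."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


sample = (
    "<html><head>"
    '<link rel="canonical" href="https://www.example.com/article1/" />'
    "</head><body></body></html>"
)
print(find_canonicals(sample))
```

A page that returns more than one value here is sending conflicting signals, which is worth flagging in an audit.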
While less commonly used, search engine crawlers also support the inclusion of canonical directives in HTTP response headers.
This solution is often best when the URL in question is not an HTML document, such as a PDF file.
Canonical HTTP response headers should be added as in this example from Google’s canonical guidelines:
Link: <http://www.example.com/downloads/white-paper.pdf>; rel="canonical"
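If you want to check this header programmatically, the value can be parsed with a few lines of Python. This is a simplified sketch, not a full parser for the Link header grammar: the function name canonical_from_link_header is an assumption, and it does not handle quoted parameter values that themselves contain "<", ";" or ",".

```python
import re


def canonical_from_link_header(header_value):
    """Return the first target URL whose rel parameter includes "canonical", else None."""
    # Each link-value looks like: <target>; param=value; param="value"
    for match in re.finditer(r"<([^>]*)>([^<]*)", header_value):
        target = match.group(1)
        # Drop the comma that separates this link-value from the next one
        params = match.group(2).strip().rstrip(",")
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() != "rel":
                continue
            rel_tokens = value.strip().strip('"').lower().split()
            if "canonical" in rel_tokens:
                return target
    return None


header = '<http://www.example.com/downloads/white-paper.pdf>; rel="canonical"'
print(canonical_from_link_header(header))
```

The header value itself can come from any HTTP client; only the parsing step is shown here.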
According to their guidelines, Google assumes that all URLs included in XML sitemaps are “suggested as canonicals” or “pages you consider most important on your site.”
While there is no guarantee that search engines will select all URLs in sitemaps as canonical, for large websites this is often the easiest way to indicate preferred content for indexing.
Google specifically suggests excluding non-canonical URLs from sitemaps as best practice, so it’s important to ensure your sitemaps are “clean” and include no extraneous or duplicate URLs.
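One way to keep a sitemap “clean” is to filter the URL list before writing it out. The sketch below assumes you already have, for each crawled URL, the canonical it declares (the (url, canonical) pairs and the build_sitemap name are hypothetical), and it keeps only URLs that are their own canonical, listing each one once.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """pages: iterable of (url, canonical_url) pairs. Returns sitemap XML as a string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url, canonical in pages:
        # Keep only self-canonical URLs, and list each one once
        if url != canonical or url in seen:
            continue
        seen.add(url)
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical crawl data: (URL as crawled, canonical it declares)
pages = [
    ("https://www.example.com/article1/", "https://www.example.com/article1/"),
    ("https://www.example.com/article1/?ref=home", "https://www.example.com/article1/"),
    ("https://www.example.com/article2/", "https://www.example.com/article2/"),
    ("https://www.example.com/article2/", "https://www.example.com/article2/"),
]
print(build_sitemap(pages))
```

With this input, the parameterised and repeated URLs are dropped and only the two self-canonical article URLs remain in the output.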
Some common use cases for including canonical tags on your pages:
While officially supported by search engines, it’s important to note that canonical tags are effectively “hints” or suggestions. From Google’s canonical tag guidelines:
“Note that even if you explicitly designate a canonical page, Google might choose a different canonical for various reasons, such as performance or content.”
To check which URL Google has selected as canonical, use the URL Inspection tool in Google Search Console.