It’s a mobile world and you’re staying on the leading edge: Your mobile page speed is fast and you’ve started publishing AMP HTML versions of your pages. Google’s mobile-first index? You’re ready.
If you’ve tried to read an article on your smartphone over the past few years, you’ve probably been frustrated by slow loading times and ads that get in the way of the content you’re trying to read. The Accelerated Mobile Pages Project is an open-source initiative championed by Google to deliver a lightning-fast mobile content experience.
Google cares a lot about the mobile experience because the majority of their users are on mobile devices (Search Engine Land, 2015). Perhaps more importantly, they’ve identified key behavioral stats like this one: 53% of mobile site visits are abandoned when a page takes longer than 3 seconds to load (DoubleClick study, 2016). Abandoned visits translate directly into lower engagement and monetization for publishers.
We’ve been seeing incredible crawl trends that reflect Google’s interest. Take for example this graph showing rapid growth and volume of Google Smartphone crawl of AMP URLs soon after a site launched them.
Whether it’s because Google has been championing AMP or because it’s preparing for the mobile-first index, there’s no doubt Google is giving a lot of attention to AMP URLs.
Here is a list of questions to answer to make sure everything is going as expected with your AMP URLs.
Let’s start with discovery. The two primary methods to enable discovery of AMP pages are:
A <link rel="amphtml"> tag in the <head> of your non-AMP web pages
A <link rel="canonical"> tag in the <head> of your AMP pages
How can you be sure the pages that should have AMP URLs actually do?
If you’re not publishing AMP HTML URLs for every page on your website, odds are you have some logic to determine which pages get AMP versions (articles on a news website, for example). In an ideal world you’d have a method for keeping track of which URLs should have an AMP version (maybe a table or flag in a database).
Identify your AMP URLs
The point here is to be able to generate a list of URLs that you can crawl to identify whether the amphtml link is present in the <head>, as in this example (the URL is a placeholder):
<link rel="amphtml" href="https://www.example.com/article/amp/">
If you don’t have such a list, you will need to crawl your whole website with a tool that can capture the amphtml link from the <head> of your pages.
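If your crawler cannot capture the amphtml link for you, a small script can do it. The following is a minimal sketch using only the Python standard library; the class and function names are illustrative, and it assumes you have already fetched each page's HTML as a string.

```python
from html.parser import HTMLParser


class HeadLinkParser(HTMLParser):
    """Collects <link rel="..." href="..."> tags found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.links = {}  # rel value -> first href seen for that rel

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attr = dict(attrs)
            href = attr.get("href")
            # rel can hold several space-separated values
            for rel in (attr.get("rel") or "").lower().split():
                if href:
                    self.links.setdefault(rel, href)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_amphtml_link(html):
    """Return the href of the amphtml link in <head>, or None if absent."""
    parser = HeadLinkParser()
    parser.feed(html)
    return parser.links.get("amphtml")
```

Links placed outside the <head> are ignored on purpose, since that is where the tag is expected to live.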
Once you’ve crawled your list or your site and identified which pages have the AMP link and which do not, you need to determine whether the results are as expected. If not, you will need to find out why AMP links are missing. There could be many reasons your AMP links aren’t showing when you expect them to, and those reasons are likely to be particular to each website. We recommend revisiting your logic for when AMP URLs should be published and made discoverable, and reviewing the implementation with your engineers to find the gap.
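Comparing what you expected with what the crawl found comes down to two set differences. Here is a small sketch; the function name and the shape of the inputs (a list of expected URLs, and a dict mapping each crawled URL to its amphtml href or None) are assumptions for illustration.

```python
def audit_amp_coverage(expected_urls, crawl_results):
    """Compare URLs that should have an AMP version with what the crawl found.

    expected_urls: iterable of URLs that should carry an amphtml link
    crawl_results: dict mapping crawled URL -> amphtml href (or None)
    """
    expected = set(expected_urls)
    with_amp = {url for url, href in crawl_results.items() if href}
    return {
        # should have an AMP link, but none was found
        "missing": sorted(expected - with_amp),
        # has an AMP link, but was not on the expected list
        "unexpected": sorted(with_amp - expected),
    }
```

The "missing" list is the one to take to your engineers; the "unexpected" list tells you whether your tracking logic is out of date.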
The second important step for AMP HTML discovery is ensuring that the AMP URLs have a rel canonical link in the <head> that points to their base URL. The exception here is if you’re only publishing AMP pages, in which case they should have a canonical link that references themselves.
Crawl, Capture and Check
The process here is much the same as the first crawl, but with a small difference. Here is our three-step process:
You can either use the list of AMP URLs discovered from the first crawl or use another source, such as your log files or web analytics (filtering for URLs that contain a pattern such as /amp/ or /amp.html or amp.domain.com).
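If you go the log file or analytics route, the filtering can be sketched as below. The three patterns are the ones mentioned above; the function name is illustrative, and you should swap in whatever host and path conventions your own site uses.

```python
from urllib.parse import urlsplit


def looks_like_amp(url, amp_hosts=("amp.domain.com",)):
    """Heuristic: does this URL match one of the site's AMP URL patterns?"""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path.lower()
    return (
        host in amp_hosts               # dedicated AMP subdomain
        or "/amp/" in path              # AMP directory segment
        or path.endswith("/amp.html")   # AMP file name
    )


def filter_amp_urls(urls):
    """Keep only the URLs that look like AMP URLs, without duplicates."""
    return sorted({url for url in urls if looks_like_amp(url)})
```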
In general, you want the canonical link on the AMP URL to:
This way all the signals a search engine might use for indexing and ranking can be aggregated correctly. Search engines haven’t yet said non-canonical URLs shouldn’t link to AMP URLs. But AMP URLs should have a canonical link and it would make the most sense for that link to point to an indexable URL.
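Once the canonical link has been captured for each AMP URL, the check itself is simple. The sketch below assumes you already know which URLs on your site are indexable (the indexable_urls set is a hypothetical input from your own crawl data), and the function name and result labels are illustrative.

```python
def check_amp_canonical(amp_url, canonical_url, indexable_urls, amp_only=False):
    """Classify the canonical link found on one AMP URL.

    amp_only: True when the page exists only as AMP, in which case the
    canonical link is expected to reference the AMP URL itself.
    """
    if not canonical_url:
        return "missing canonical"
    if canonical_url == amp_url:
        return "ok" if amp_only else "self-canonical but a base URL is expected"
    if amp_only:
        return "AMP-only page should reference itself"
    if canonical_url not in indexable_urls:
        return "canonical target is not indexable"
    return "ok"
```

This compares URLs as exact strings, so normalize trailing slashes and protocol beforehand if your site is inconsistent about them.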
Use a spreadsheet to check canonicalization
For this step, we used a process similar to checking whether your canonical tags contradict your internal linking.
We expect all AMP URLs to deliver a 200 status code. Websites change, of course, so that may not always be the case. The process above will help you identify cases where the status code changed unintentionally.
If you have eliminated your base URLs, then we’d expect the AMP URLs to give the same response code as their base. By eliminating base URLs, we mean:
Most methods of crawling URLs should be delivering the HTTP status code, so you should have this information as a result of your earlier crawls meant to QA the AMP HTML discovery.
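With status codes for both the AMP URL and its base URL in hand, the two expectations above can be checked in one pass. This is a sketch under one stated assumption: a base URL that no longer returns 200 is treated as eliminated, which may not match how you define it. The function name and row layout are illustrative.

```python
def flag_status_problems(rows):
    """Flag AMP URLs whose status code is not what we would expect.

    rows: iterable of (amp_url, amp_status, base_status) tuples;
    base_status may be None if the base URL was not crawled.
    """
    problems = []
    for amp_url, amp_status, base_status in rows:
        if base_status is not None and base_status != 200:
            # Assumption: a non-200 base URL has been eliminated,
            # so the AMP URL should return the same code.
            if amp_status != base_status:
                problems.append(
                    (amp_url, f"AMP returns {amp_status}, base returns {base_status}")
                )
        elif amp_status != 200:
            problems.append((amp_url, f"AMP returns {amp_status}, expected 200"))
    return problems
```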
While it’s great that you’ve published your AMP URLs and verified they are discoverable, there is more to do. Are they being crawled? Are they getting traffic?
Please share your comments below about your experience with publishing AMP HTML pages. What troubles have you encountered and how did you resolve them? How do you monitor their performance? What use cases did we miss here?