In the previous post in our ongoing Crawl Budget series, we explored Crawl Ratio and the reasons you should optimize it. Once you understand your Crawl Ratio, it’s useful to understand Crawl Frequency so that you can make the most of Google’s behavior.
In Google’s article defining Crawl Budget and the factors you can optimize to influence it, Gary Illyes described the crawl rate limit as the component of Crawl Budget that caps the maximum fetching rate for a website. The limit can change based on site performance and on limits set in Google Search Console.
Assuming the limit hasn’t been reached, however, the crawl rate can be influenced by crawl demand. The article defines crawl demand as Google’s attempts to keep URLs fresh because they are popular, or to prevent URLs from going stale in the index.
In Botify, Crawl Frequency is the number of days per month that Googlebot requests a URL. We will see below that there is a clear relationship between traffic and frequency of crawl.
When Google is ignoring chunks of your website, it’s a good indicator that you have something to improve. Remember: crawling is the first step on the road to organic traffic. If pages aren’t getting crawled, they’re not going to get into the index or, if they were indexed, they are at risk of dropping out of it.
A number of site issues can reduce crawl frequency.
On the other hand, understanding what Google is crawling frequently is a good indicator of what Google thinks is worthwhile, what it needs to keep fresh in the index. Understanding the characteristics of those pages can inform what you might need to do to improve the remainder.
If Google is crawling some pages frequently but they aren’t producing many visits, that may be wasted crawl budget; take steps to improve more important pages so that crawl activity shifts to them.
As with determining Crawl Ratio, you need to have a crawl of your website structure joined with Googlebot requests for your URLs from your server logs.
For most websites, Google crawls at least one URL per day, so you wouldn’t learn much by simply asking, ‘Did Google crawl my site today?’ You need to measure this at the URL level.
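As a minimal sketch of that URL-level measurement, the snippet below counts the distinct days on which Googlebot requested each URL. The URLs, dates, and `googlebot_hits` structure are hypothetical placeholders; in practice the entries would come from your server logs, filtered to verified Googlebot requests within the last 30 days.

```python
from collections import defaultdict
from datetime import date

# Hypothetical parsed log entries: (url, date of Googlebot request).
# Real data would come from server logs filtered to verified Googlebot.
googlebot_hits = [
    ("/articles/a", date(2024, 5, 1)),
    ("/articles/a", date(2024, 5, 1)),  # a second hit on the same day counts once
    ("/articles/a", date(2024, 5, 2)),
    ("/articles/b", date(2024, 5, 3)),
]

def crawl_frequency(hits):
    """Days per month each URL was requested by Googlebot."""
    days_seen = defaultdict(set)
    for url, day in hits:
        days_seen[url].add(day)  # a set deduplicates multiple hits per day
    return {url: len(days) for url, days in days_seen.items()}

print(crawl_frequency(googlebot_hits))
# {'/articles/a': 2, '/articles/b': 1}
```

Counting distinct days rather than raw hits matches the definition above: a URL fetched ten times in one day was still only crawled on one day that month.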
As with Crawl Ratio, Crawl Frequency varies by website. In these examples of Crawl Frequency visualizations, each section of the pie chart represents a range of days crawled by Google. The green section, for example, means 17.4% of URLs in the site structure were crawled 24 or more days out of the preceding 30 days.
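The pie-chart segmentation described above can be sketched as a simple bucketing step over per-URL frequencies. The bucket boundaries and the `freq_by_url` data below are illustrative, not Botify’s exact ranges; URLs that are in the site structure but never requested by Googlebot count as zero.

```python
from collections import Counter

def frequency_bucket(days_crawled):
    """Map a URL's days-crawled-per-30 to a chart segment.
    Boundaries are illustrative, not Botify's exact ranges."""
    if days_crawled >= 24:
        return "24+ days"
    if days_crawled >= 12:
        return "12-23 days"
    if days_crawled >= 1:
        return "1-11 days"
    return "not crawled"

# Hypothetical per-URL frequencies, including an uncrawled URL.
freq_by_url = {"/a": 28, "/b": 15, "/c": 3, "/d": 0}

shares = Counter(frequency_bucket(d) for d in freq_by_url.values())
for bucket, count in shares.items():
    print(bucket, f"{100 * count / len(freq_by_url):.1f}%")
```

Each segment’s share is its URL count divided by all URLs in the site structure, which is what makes an uncrawled segment visible at all.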
Below, we can see that Google is crawling nearly all of this publisher’s core offering – its articles – and it’s doing so with some regularity. The publisher’s categories, collections, and aggregations aren’t being crawled as often, however, so there may be an opportunity to improve those templates and/or their place in the site structure.
Focusing just on the Crawl Frequency of those articles, we can see in the table that there is still a long tail of Crawl Frequency and visits. It’s possible that Google no longer needs to crawl the end of the tail because there’s no more demand for that content in the index. Or maybe it’s too deep. Maybe the publisher has deprioritized this content for the same reason Google is ignoring it – it’s old and there is no longer demand on the site for it.
Beyond confirming that the relationship between crawl frequency and traffic exists, this data has practical applications.
Whether you’re testing new page title or meta description templates or a new user experience, you’re going to need the variant pages to be crawled so the changes are reflected in the index and you can gauge the ultimate impact on searchers.
That process takes time. If you select pages for your test and control groups that get crawled every day, or better yet multiple times a day, you minimize the time it takes to see results in the index and from searchers (a click-through rate change in response to a new page title or meta description, for example).
If you select pages at random without accounting for Crawl Frequency, it could take additional weeks or months to see results. A faster result, even if negative, will help you learn and iterate more quickly. The classic growth concept of failing fast applies here.
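Selecting test pages by crawl frequency can be sketched as a filter-and-sort over the per-URL data. The threshold of 24 days and the group size are illustrative assumptions, not fixed rules, and `pick_test_pages` is a hypothetical helper name.

```python
def pick_test_pages(freq_by_url, min_days=24, n=50):
    """Choose frequently crawled URLs for an SEO test group,
    so changes are recrawled (and measurable) as fast as possible.
    min_days and n are illustrative thresholds, not fixed rules."""
    eligible = [url for url, days in freq_by_url.items() if days >= min_days]
    # Most frequently crawled first, capped at the desired group size.
    return sorted(eligible, key=lambda url: -freq_by_url[url])[:n]

# Hypothetical per-URL crawl frequencies (days per 30).
freq_by_url = {"/articles/a": 30, "/articles/b": 10, "/articles/c": 25}
print(pick_test_pages(freq_by_url, min_days=24, n=2))
# ['/articles/a', '/articles/c']
```

The same selection should be applied to the control group so both groups are recrawled on a comparable schedule.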
If you can use XML sitemaps to update search engines about new pages, you should. But you can increase the speed to discovery, and perhaps to indexing, by also using highly crawled pages to link to new content. This is not a new idea in SEO, but it is still a useful one.
For those less frequently crawled URLs, work on improving their Crawl Frequency.
Crawl Frequency is a useful metric for identifying what Google deems important in your site and can be used for practical purposes to optimize your site for more organic traffic and, ultimately, increased revenue.
There is still more to come in future articles in our ongoing Crawl Budget series.
And, in case you missed them, don’t forget to start with Part 1: Google Confirms SEOs Should Control Their Crawl Budget and Part 2: What is Crawl Ratio, and Why Does it Matter?
We’d love to hear how you’ve used Crawl Frequency metrics to improve your organic search performance. Please leave comments and feedback below!