4 Ways to Optimize Your Crawl Budget & Increase Your Website ROI


As a technical SEO, there’s almost nothing more satisfying than digging into a complex indexing problem and finding a solution that helps search engines find your content — especially when your stakeholders are counting on that content to drive revenue.  

In our most recent article on the Botify Blog, we covered the differences (and synergies) between the SEO funnel and the marketing funnel, and the cascading effects technical SEO can have on revenue. When we stop to think about how technical SEO improves the bottom-line metric our execs care about most, namely SEO ROI, crawl budget has a huge impact.

If you need a refresher, crawl budget is the maximum number of pages a search engine will crawl on a given website in a given timeframe. Because search engines don’t have unlimited time and resources to crawl all the content on the web all the time, they prioritize which pages they’ll look at based on how healthy and popular a site is.

This budget is why search engines miss more than 51% of an enterprise site’s content. The good news is — you can do something about it. (Psst! If you’re curious about how to calculate your site’s crawl budget, take a look at our recap of TechSEO Boost 2019, where G2’s Jori Ford breaks it all down!)

*Crawl budget equation*

Whether you have an e-commerce site with a huge faceted navigation or a publishing site that’s constantly adding new content, there’s almost always room for large websites to improve their crawl budget.

Here’s what to consider.    

1. Use your robots.txt file to tell search engines what not to crawl

By using your site’s robots.txt file, you can tell search engine bots what to crawl and what to ignore. If you’re unfamiliar, robots.txt files live at the root of websites and look like this:
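As a minimal sketch, a robots.txt file might contain rules like these (the Disallow paths and sitemap URL here are hypothetical placeholders — yours will differ):

```text
# Applies to all crawlers
User-agent: *
# Hypothetical paths: block internal search results and cart pages
Disallow: /search
Disallow: /cart

# Point crawlers at your sitemap
Sitemap: https://www.example.com/sitemap.xml
```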

Visit Google’s documentation for more information on creating robots.txt files

So how do these files help preserve your crawl budget?

Let’s say, for example, you have a large e-commerce site with a faceted navigation that lets you sort the content without changing it (e.g. sorting by price, lowest to highest). You’d want to disallow search engines from crawling those sort pages because they’re duplicates of the original page. You don’t want search engines wasting time on them since you don’t want them in the index anyway. 
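To block those parameterized sort URLs, a wildcard rule in robots.txt might look like this (assuming the facet uses a `?sort=` query parameter — adjust the pattern to match your own URL structure):

```text
User-agent: *
# Block any URL containing a sort parameter (hypothetical parameter name)
Disallow: /*?sort=
Disallow: /*&sort=
```

Google supports the `*` wildcard in Disallow rules, so a single pattern can cover every sorted variation of every category page.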

This reminds us of a story Ryan Ricketts, Technical SEO Manager at REI, shared at our Crawl2Convert conference: his team cut their website down from 34 million URLs to 300,000 and saw drastic crawl budget improvements. It also calls to mind how HubSpot’s Aja Frost cut down thin pages to increase traffic.

Your robots.txt file can be an important step to take in directing search engines away from your unimportant content and towards your critical content. If you’re a Botify customer, know that our crawler will follow the rules defined for Google in your website’s robots.txt file. However, you can also set up a virtual robots.txt file to override those rules.  

It’s important to note that disallowing search engines from certain sections or pages on your site does not guarantee that search engines won’t index those pages. If there are links to those pages elsewhere, such as in your content or sitemap, search engines may still find and index them. 

Which brings us to our second point. 

2. Clean up your internal linking 

To avoid wasting your crawl budget, make sure you’re linking to the live, preferred version of your URLs throughout your content. As a general rule, you should avoid linking to URLs if they’re not the final destination for your content. 

For example, you should avoid linking to:

  • Redirected URLs
  • The non-canonical version of a page
  • URLs returning a 404 status code 

Don’t waste your crawl budget by sending search engine bots through multiple middlemen (a.k.a. chains and loops) to find your content. Instead, link to the ultimate destination. Finding and fixing 301 redirect errors can be a big step toward improving your crawl budget; you can learn more in our recent blog post.
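The idea of skipping middlemen can be sketched in code. This is a toy illustration (not a Botify feature): given a map of known redirects, which you might build from a crawl export or your server config, it resolves each link to its final destination and flags loops along the way:

```python
def resolve_redirect(url, redirect_map, max_hops=10):
    """Follow a chain of redirects to its final URL.

    redirect_map maps each URL to the URL it redirects to.
    Returns (final_url, hops); raises ValueError on a loop
    or a chain longer than max_hops.
    """
    seen = {url}
    hops = 0
    while url in redirect_map:
        url = redirect_map[url]
        hops += 1
        if url in seen or hops > max_hops:
            raise ValueError("redirect loop or overly long chain")
        seen.add(url)
    return url, hops

# Hypothetical chain: /old-shoes -> /shoes -> /products/shoes
redirects = {
    "/old-shoes": "/shoes",
    "/shoes": "/products/shoes",
}
final, hops = resolve_redirect("/old-shoes", redirects)
print(final, hops)  # /products/shoes 2
```

Every internal link that points at `/old-shoes` costs bots two extra requests; updating those links to point straight at the final URL gives that budget back to your content.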

3. Make sure your XML sitemap is up to date 

Avoiding common XML sitemap mistakes is a great step to take if you want to improve your crawl budget. After all, it’s the map that search engines use to find your so-called treasure.

What mistakes are those?

  • Listing non-compliant pages such as non-200s, non-canonicals, non-HTML files, and noindexed URLs
  • Forgetting to update your sitemap after URLs change during a site migration
  • Omitting important pages

Including only live, preferred URLs and making sure you’re not leaving out key pages that you want search engines to crawl and index is critical. Have old product pages? Make sure to expire them and remove them from your sitemap. 
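As a sketch, a clean sitemap containing only live, canonical URLs might look like this (the example.com URLs and dates are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Only live (200), canonical, indexable URLs belong here -->
  <url>
    <loc>https://www.example.com/products/shoes</loc>
    <lastmod>2020-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/products/boots</loc>
    <lastmod>2020-01-10</lastmod>
  </url>
</urlset>
```

When a product expires or a URL changes during a migration, the corresponding `<url>` entry should be updated or removed at the same time.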

Botify can help you audit your sitemap for errors to reduce your crawl waste.   

4. Optimize your JavaScript  

If your website makes heavy use of JavaScript, you may be wasting your crawl budget on JavaScript files and API calls. 

Consider this example.

A customer with a large enterprise website switched from client-side rendering to server-side rendering. Almost immediately, we could see from log file analysis that Google was spending more time on the website’s critical content. Because Google was receiving the fully-loaded page from the server, there was no need for it to spend time on JavaScript files and API calls.

In other words, switching to server-side rendering can free up search engine bots to spend more time on your important pages. 
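A rough way to see this in your own log files is to bucket Googlebot requests by resource type. This sketch assumes combined-format access logs and a simple path check (the `/api/` prefix and sample lines are hypothetical; real log formats and API paths vary):

```python
import re
from collections import Counter

# Matches the request path and user agent in a combined-format log line
LINE_RE = re.compile(r'"(?:GET|POST) (\S+) HTTP[^"]*" \d+ \d+ "[^"]*" "([^"]*)"')

def googlebot_hits_by_type(log_lines):
    """Count Googlebot requests to pages vs. JS/API resources."""
    counts = Counter()
    for line in log_lines:
        m = LINE_RE.search(line)
        if not m or "Googlebot" not in m.group(2):
            continue  # unparseable line or non-Googlebot request
        path = m.group(1).split("?", 1)[0]
        if path.endswith(".js") or path.startswith("/api/"):
            counts["js_or_api"] += 1
        else:
            counts["page"] += 1
    return counts

sample = [
    '1.2.3.4 - - [10/Oct/2020:13:55:36 +0000] "GET /products/shoes HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '1.2.3.4 - - [10/Oct/2020:13:55:37 +0000] "GET /static/app.js HTTP/1.1" 200 9000 "-" "Googlebot/2.1"',
    '5.6.7.8 - - [10/Oct/2020:13:55:38 +0000] "GET /products/shoes HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
print(googlebot_hits_by_type(sample))  # page: 1, js_or_api: 1
```

If the `js_or_api` share of Googlebot's hits is high and shrinks after a rendering change, that's the same signal we saw in the customer's logs above.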

JavaScript can also slow down page load times, which, in addition to frustrating your visitors, can mean that search engine bots visit fewer of your pages. That makes pinpointing your slowest-loading JavaScript resources critical to protecting your crawl budget.

Making it easier for bots and searchers to find your best content 

Applying these optimizations on a site with millions of pages can open up a wealth of opportunity — not only for your crawl budget, but your site’s traffic and revenue, too!

When you cut out the excess (a.k.a. crawl waste), you’re not only opening the door for search engines to find your most critical content. You’re also increasing the likelihood that more searchers will discover (and convert on!) that content.
*SEO funnel*

Crawl budget isn’t just a technical thing. It’s a revenue thing. So bring the bots – and visitors – only to the good stuff!


