
Getting Your Pages Crawled, Rendered, and Indexed

15th August 2019 | Botify News

This is part three in a multi-part series about enterprise SEO. Check out the first two installments in this series: The Future of Enterprise SEO and An Enterprise SEO Methodology: From Crawling to Conversions.

The modern web requires a new approach to search. One that’s transparent, predictable, and reliable. In this latest installment in our SEO methodology series, we explore why enterprise websites in particular need to close the gap on missed opportunities by focusing on crawling, rendering, and indexing.

Building your enterprise website on a strong technical foundation

Before searchers can find your content, search engines need to be able to access and understand it. In the earlier days of the web, when sites were simpler and mostly static HTML, this step of the search process could largely be taken for granted.

Today, ignoring your technical foundation is simply not an option. We’re going to explore why focusing on crawling, rendering, and indexing is a crucial first step in any enterprise’s SEO methodology.

How are search engines crawling my website?

Some of the biggest missed opportunities on enterprise websites stem from crawl issues. In other words, Google and other search engines may be missing a lot of your content. Our data suggests that Google misses up to half of the content on enterprise websites.

This means you have:

  • Half the opportunities to rank in search engines
  • Half the organic traffic potential
  • Half the ability to profit from organic search

But why does this happen?

This underlying “why” is the reason we came up with our unique SEO methodology, and why we built tools to help make SEO more transparent.

Instead of your ranking and traffic issues being a mystery, we can look to search engines themselves to pinpoint the real issues that need to be fixed. We can do this through a process called log file analysis, which is a traditionally complex task that we’ve made much easier with the Botify Log Analyzer.

Every time a user requests a page from your website, your server stores information about that request in a log file. That means that, by analyzing your log files, you can see whether search engines are crawling your pages and how often.
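
To make that concrete, here's a rough sketch of what pulling Googlebot requests out of a raw access log could look like. It assumes the common "combined" log format and a local file named access.log, both of which will vary by server; a real analysis would also verify that hits claiming to be Googlebot actually come from Google's IP ranges.

```python
import re
from collections import Counter

# A minimal sketch, assuming the "combined" access log format; the file
# name and exact regex will vary with your server setup.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<user_agent>[^"]*)"'
)

googlebot_hits = Counter()
with open("access.log") as log:
    for line in log:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("user_agent"):
            googlebot_hits[match.group("url")] += 1

# URLs Googlebot requested most often during this log window
for url, hits in googlebot_hits.most_common(10):
    print(hits, url)
```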

Analyzing log files in their raw form is no easy feat. They're sometimes tricky for SEOs to get access to, hard to read, and difficult to draw any meaningful conclusions from. But Botify pulls your log files into our platform, automatically parsing that data into meaningful reports that are easy to take action on.

Instead of wondering “how are search engines crawling my website?” you can know with certainty.
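
Once you have both sides, the URLs you know about and the URLs Googlebot actually requested, the comparison behind that certainty is simple set math. The sketch below assumes a local sitemap.xml and reuses the hypothetical googlebot_hits counter from the previous snippet; both stand in for whatever your own crawl and log data look like, and URLs from the two sources would need to be normalized to the same form before comparing.

```python
import xml.etree.ElementTree as ET

# A simplified sketch of the crawl comparison: URLs you know about (here,
# from a local copy of your XML sitemap) versus URLs Googlebot actually
# requested (the keys of googlebot_hits from the previous sketch).
# In practice, normalize both sets to the same URL form before comparing.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
site_urls = {
    loc.text.strip()
    for loc in ET.parse("sitemap.xml").findall(".//sm:loc", NS)
}

crawled_by_google = set(googlebot_hits)  # illustrative; from the log sketch above

print("Known to you, never crawled by Google:", len(site_urls - crawled_by_google))
print("Crawled by Google, not in your sitemap:", len(crawled_by_google - site_urls))
print("In both:", len(site_urls & crawled_by_google))
```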

How are search engines rendering my JavaScript?

The web has changed since its earliest days, and it’s only getting more advanced. Websites that used to be static HTML have now been made more dynamic with the advent of JavaScript.

Google realized that they had to start rendering web pages like a modern browser if they wanted to keep up, so they did. In their own words:

*”In 1998 when our servers were running in Susan Wojcicki’s garage, we didn’t really have to worry about JavaScript or CSS. A lot has changed since then. The web is full of rich, dynamic, amazing websites that make heavy use of JavaScript. Today, [we have the] capability to render richer websites — meaning we see your content more like modern Web browsers: include the external resources, execute JavaScript, and apply CSS.”*

This was a necessary step in keeping pace with the advancements of the web, but it was also resource-intensive. Google has finite resources — they simply cannot spend an unlimited amount of time crawling every page of a website and rendering all its resources. To cope with this, Google introduced a second wave of indexing.

Essentially, Google performs an initial crawl of a web page and indexes what they find. Later, as resources become available, they go back and render the JavaScript on those pages. That means links and content that depend on JavaScript may be missed, which can harm your organic performance.

The answer is not “don’t use JavaScript.” Google itself has dedicated much effort toward helping SEOs understand how to work with websites that use JavaScript, even producing a video series on JavaScript SEO. Because JavaScript is not going away, SEOs need to be able to work with it and not fight against it.

Botify set out to solve for this new reality with JavaScript Crawl, the first cloud-based solution that can analyze JavaScript at enterprise scale. In other words, you can find JavaScript-executed links and text across your site, so you can better understand what Google is seeing or missing.

We think that if Google has advanced its crawling to be able to understand your JavaScript, then your crawler should too.
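
As a rough illustration of that raw-versus-rendered gap (and not a picture of how Botify's JavaScript Crawl works internally), you can compare the links visible in a page's raw HTML with the links present after a headless browser has executed its JavaScript. The sketch below assumes the requests and playwright packages are installed, uses a placeholder URL, and extracts links with a deliberately crude regex.

```python
import re
import requests
from playwright.sync_api import sync_playwright

# A small illustration of the raw-versus-rendered gap. The URL is a
# placeholder; link extraction here is intentionally rough.
URL = "https://www.example.com/some-page"
href_pattern = re.compile(r'href="([^"#]+)"')

# What a first-wave, HTML-only crawl sees
raw_html = requests.get(URL, timeout=30).text
raw_links = set(href_pattern.findall(raw_html))

# What a rendering crawl sees after JavaScript has executed
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto(URL, wait_until="networkidle")
    rendered_html = page.content()
    browser.close()
rendered_links = set(href_pattern.findall(rendered_html))

print("Links only present after rendering:", rendered_links - raw_links)
```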

How can I get search engines to index my important content?

Once you have a better understanding of how search engines are crawling and rendering your content, you can execute optimizations that help Google index your important content and keep non-critical content (e.g., duplicate content, site search results pages) out of the index.

To do this, you need to be aware of how search engines are crawling and rendering your website.

  • Do they have access to your non-critical content? Some sections of your website should be blocked simply because you don't want or need them crawled or indexed. You can use your robots.txt file to tell Googlebot where that non-critical content lives so it doesn't waste time on it (see the sketch after this list).
  • Are they spending time on unimportant content? If you’re not blocking Google from your non-critical content, they might spend time crawling it, which could be a waste of your crawl budget.
  • Are they ignoring your important content? All of the important content on your website is competing for Google's attention. If Google is wasting time crawling unimportant pages, it may miss some of your truly important content. And if Google doesn't have time to crawl a page, it might not be indexed, or at least not as soon as you'd like.
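
For the first point, Python's standard library offers a quick way to sanity-check which URLs your robots.txt blocks for a given user agent. The domain and example paths below are placeholders; they are not a recommendation for what to block on your site.

```python
from urllib import robotparser

# Check whether specific URLs are blocked for Googlebot by robots.txt.
# The domain and example paths are placeholders.
rp = robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()

for path in ("/products/blue-widget", "/search?q=widgets", "/cart"):
    url = "https://www.example.com" + path
    allowed = rp.can_fetch("Googlebot", url)
    print(("allowed " if allowed else "blocked ") + url)
```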

Knowing which of your pages Google is visiting (or missing) is the first step to ensuring that you’re optimizing for Google’s index.

Making the most of your enterprise website’s organic search potential

Enterprise organizations have big goals. In order to reach those goals, you’ll need a website that’s operating at maximum capacity.

Botify helps you understand the barriers that might be preventing search engines like Google from finding all your important content. Transparency into these issues then enables you to take the necessary steps to correct course, maximizing your organic search potential.

But the job’s not done yet! Once your website has a strong technical foundation, you’ll need content that speaks to the real questions your audience is asking. Stick around for the next entry in our series where we’ll dive into real searcher questions and how to create content that sufficiently answers them.
