TechSEO Boost 2019: Everything You Need To Know

27th December 2019 | Leslie Ramey

Last week, we had the privilege of sponsoring and attending TechSEO Boost. We’re grateful to have played a role in such an inspiring technical SEO conference, bringing together professionals from across the globe to talk through the industry’s biggest challenges.

At the event, we heard from thought leaders at Google, HubSpot, Merkle, and more on topics like bias in search and predictive search. Plus, our very own Frank Vitovitch, VP of Solutions, presented his research on how issues with faceted navigation impact SEO.

Take a look through the insights from some of our favorite talks!

The rise of predictive search with Dawn Anderson

A topic that’s been on a lot of people’s minds this year is searcher queries and intent. It’s a complicated topic, and suffice it to say that sometimes the searcher isn’t even sure of their own intent. With initiatives like Google’s BERT rolling out to 70 languages, we’re getting closer to a solution to the problem, but only marginally.

Dawn Anderson, Managing Director at Bertey, navigated us through the issues that create difficulties in matching searchers with the results they’re after.

Choice, choice, and more choice

Information overload? Maybe. In the e-commerce world, it’s been proven that too much choice leads to less buying. Just think of the power of a newsletter with a few featured products or a blog with an influencer’s favorites — versus clicking through hundreds of pages of hats. On the average e-commerce site, the conversion rate is 2% with 98% of visitors not buying anything. So, how on earth do you measure the searcher’s journey in order to innovate and anticipate?

The real problems at hand

In her talk, Dawn stressed that queries are complicated. Words themselves are problematic, ambiguous, polysemous (multiple meanings), and synonymous. How can search possibly handle it all accurately?

When it comes to spoken search (as with Google Assistant), there are even more complications. To prove her point, Dawn referenced a famous sketch, “Four Candles” by The Two Ronnies, where “fork handles” is misheard as “four candles”. It’s a cute example of the ways in which words can be misunderstood (and a quirky metaphor for how technology can easily misunderstand humans).

Later, when Dawn discussed the implications of searcher intent tied to timing, she again mentioned the skit. When the actors from the skit passed away, there was a massive increase in searches for “Four Candles” — and out of context, the phenomenon could be quite confusing!

Instances like this can make search seem quite unpredictable, but patterns do exist when it comes to queries, timing, and the user journey. For example, patterns often emerge when you compare when someone searches for a loan with when they purchase a house (and the in-between steps along the way, like inspections). As another example, a search for “Liverpool Manchester” could represent an intent to plan a vacation, while at other times of the year the same search could be tied to the Liverpool vs. Manchester North West Derby. The passage of time also adds new meaning. Take the word “apple”: once only a fruit, it now also names a multinational technology company.

How search is shifting

Innovations like Google Discover aim to bridge the gap between the searcher and their preferred results by gathering data to anticipate what they’d be interested in before they search. The idea is to provide a highly personalized experience, but it’s still far from perfect. Take a look at our recent breakdown of Google Discover to learn about the pros and cons.

While we’re on the right track, there’s still a long way to go before the user journey becomes truly demystified (if it ever really does!). With that said, Dawn closed her talk by giving us a glimpse at what the future holds for the next 20 years of search.

Take a look at the 3 shifts coming up:

  • From text to visual information: More photos, videos, UX, and accessibility.
  • From the query to the queryless: Like Google Discover, we’re seeing more technologies with the mission of anticipating the content a person will like.
  • From answers to journeys: As a whole, we’re getting closer to understanding the “journeys” (or habits) behind different search experiences (like buying a car and everything that leads up to and follows the actual purchase). People are creatures of habit, and there’s a lot to uncover when you really dig into the data!

While there’s no uniform ranking tool that fills in the existing gaps, it’s important to think more comprehensively about the content you’re sharing and how to get it to the right people. While search is still complex, bits of predictability exist underneath it all, and with the right mindset and tools you can put your best foot forward towards the future.

For more on her session at TechSEO Boost, you can check out her slides or watch the recording.

Conquering crawl budget with Jori Ford

Jori Ford, Senior Director of SEO & Content at G2, gave us an in-depth look at the technical meaning of crawl budget.

Let’s get down to it. How do you define crawl budget?

To start us off, Gary Illyes said it best: “Taking crawl rate and crawl demand together we define crawl budget as the number of URLs Googlebot can and wants to crawl.”

Ford then explained the significance of crawl rate and crawl demand.

Crawl rate is variable: it goes up when Google can access your site frequently and get a response, and it goes down when Google can’t.

Here’s the crawl rate breakdown:

Crawl rate = (server response time x duration) / error rate

Put in terms you can measure: Crawl rate = (TTFB x duration) / % server errors, where TTFB is time to first byte.

Crawl demand equates to popularity x freshness. In this case, popularity refers to URLs that are popular on the internet and therefore tend to be crawled more often to keep them fresh in the rankings. Popularity can be measured by CTR (click-through rate).

Using this information, crawl budget is: (avg. TTFB x duration / % server errors) x (CTR x avg. time between updates)
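As a rough sketch, the formula above can be written out in Python. It only transcribes the rule of thumb as recapped here; it is not Google's calculation. The function names, the units (seconds, percent), and the sample numbers are assumptions for illustration, so treat the result as a relative score for comparing sites or time periods rather than a count of URLs.

```python
def crawl_rate(avg_ttfb_s: float, duration_s: float, server_error_pct: float) -> float:
    """Crawl rate as recapped above: (TTFB x duration) / % server errors."""
    if server_error_pct <= 0:
        # The formula divides by the error rate, so it is undefined at zero.
        raise ValueError("server_error_pct must be greater than zero")
    return (avg_ttfb_s * duration_s) / server_error_pct


def crawl_demand(ctr: float, avg_time_between_updates: float) -> float:
    """Crawl demand as recapped above: popularity (CTR) x freshness."""
    return ctr * avg_time_between_updates


def crawl_budget(avg_ttfb_s: float, duration_s: float, server_error_pct: float,
                 ctr: float, avg_time_between_updates: float) -> float:
    """Crawl budget = crawl rate x crawl demand."""
    rate = crawl_rate(avg_ttfb_s, duration_s, server_error_pct)
    return rate * crawl_demand(ctr, avg_time_between_updates)


# Made-up inputs, chosen only to show how the pieces combine.
score = crawl_budget(avg_ttfb_s=0.5, duration_s=60, server_error_pct=2,
                     ctr=0.25, avg_time_between_updates=8)
print(score)
```

Keeping the two factors as separate functions mirrors the split between crawl rate (largely an engineering concern) and crawl demand (largely a content concern).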

In summary, the most important factors for managing your site’s crawl budget are:

  • time to first byte
  • server error rate
  • CTR
  • freshness

Communicating with your engineers, content SEOs, and any other team members involved in the success of these factors is key. Help them understand how they can contribute to the solution.

Quantifying your site’s crawl budget

To quantify the crawl budget of your site, you should determine the pages you want to be crawled (a.k.a. your revenue-generating and top converting pages), determine which pages Google is crawling, and determine the frequency at which they’re being crawled. Botify’s Log Analyzer will show you just that!
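If you want a feel for what that involves, here is a minimal Python sketch of the same three questions answered from raw server logs. It assumes logs in the common "combined" access-log format; the sample lines, paths, and the priority list are made up. Note that matching on the user-agent string alone can be spoofed, so a production setup would also verify that requests really come from Google (for example with a reverse DNS lookup).

```python
import re
from collections import Counter

# Combined log format: ip ident user [time] "METHOD path proto" status size "referer" "agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests per path whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits


# Hypothetical log lines for illustration only.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = [
    f'66.249.66.1 - - [26/Dec/2019:10:00:00 +0000] "GET /products/hat HTTP/1.1" 200 512 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [26/Dec/2019:11:00:00 +0000] "GET /products/hat HTTP/1.1" 200 512 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [26/Dec/2019:11:05:00 +0000] "GET /search?q=hats HTTP/1.1" 200 300 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.9 - - [26/Dec/2019:11:06:00 +0000] "GET /products/scarf HTTP/1.1" 200 640 "-" "Mozilla/5.0"',
]

# Pages you want crawled (hypothetical revenue-generating pages).
PRIORITY_PAGES = {"/products/hat", "/products/scarf"}

hits = googlebot_hits(SAMPLE_LOG)
never_crawled = [page for page in sorted(PRIORITY_PAGES) if page not in hits]
print(hits.most_common())
print(never_crawled)
```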

Furthermore, segmenting your data in Botify by pagetype and template can help you zero in on whether your most critical pages are making the cut. Segmentation also helps pinpoint whether any noncritical pages are unintentionally eating up your crawl budget.
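To illustrate the idea of segmentation in general terms (this is not how Botify implements it), the sketch below groups crawl counts by page type using simple URL-prefix rules. The rules, paths, and counts are all hypothetical.

```python
from collections import Counter

# Hypothetical mapping from URL prefix to page type; the first match wins.
SEGMENT_RULES = [
    ("/products/", "product"),
    ("/category/", "category"),
    ("/search", "internal search"),
]


def segment_for(path: str) -> str:
    """Return the page type for a URL path, or 'other' if no rule matches."""
    for prefix, name in SEGMENT_RULES:
        if path.startswith(prefix):
            return name
    return "other"


def hits_by_segment(hits_by_path: dict) -> Counter:
    """Aggregate per-URL crawl counts into per-segment totals."""
    totals = Counter()
    for path, count in hits_by_path.items():
        totals[segment_for(path)] += count
    return totals


# Made-up crawl counts per URL.
crawl_counts = {
    "/products/hat": 12,
    "/products/scarf": 3,
    "/category/winter": 5,
    "/search?q=hats&sort=price": 40,
    "/about": 1,
}

for segment, total in hits_by_segment(crawl_counts).most_common():
    print(f"{segment}: {total}")
```

In this invented data, internal search pages attract more crawling than every product page combined, which is the kind of imbalance segmentation is meant to surface.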

It’s important to note that when you’re calculating your crawl budget you’re considering the average of every page, since crawl budget is determined by your entire site.

Learn more about crawl budget by watching Jori’s presentation or viewing her slides.

A glimpse into Google’s JavaScript renderer with Martin Splitt

Google Developer Martin Splitt brought life (and laughter) to one of the most jarring topics in the SEO industry: JavaScript. While his talk got very technical (and you might want to watch the recording for yourself!), we recapped our favorite parts for you.

How does Google render a webpage?

First, the browser makes a request to the server. The moment Google starts to receive some text, it starts to understand what it’s seeing. This process begins with data from the DOM tree (a.k.a. your h1s, txt, img, p, etc.). The DOM tree represents the structure of the content and each element’s relationship to one another.

Next, the page goes through layout (working out where everything goes on the page). The last stage is called painting. This is when the pixels are filled in. Google doesn’t care about the pixels, so it skips this part.

JavaScript is fundamentally made to change the DOM tree and layout tree. JavaScript interactively changes content by manipulating the DOM tree. In some cases, this causes Google to have to re-layout, but overall the DOM tree is smart and understands when there’s more code coming (rather than re-layouting each time).

The DOM tree can grow a lot due to JavaScript. If you do client-side rendering, for example, your DOM tree may look very small and then explode as JavaScript runs.
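To make that growth concrete, here is a small Python sketch that counts elements in two HTML snapshots: the markup the server sends for a client-side rendered page, and the markup after JavaScript has run. Both snippets are invented examples, and counting start tags is only a rough stand-in for DOM size.

```python
from html.parser import HTMLParser


class ElementCounter(HTMLParser):
    """Counts every opening tag encountered while parsing."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        self.count += 1


def count_elements(html: str) -> int:
    parser = ElementCounter()
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical server response for a client-side rendered page: an empty shell.
INITIAL_HTML = (
    '<html><body><div id="app"></div>'
    '<script src="app.js"></script></body></html>'
)

# Hypothetical markup after the script has filled in the content.
RENDERED_HTML = (
    '<html><body><div id="app"><h1>Hats</h1>'
    '<ul><li>Fedora</li><li>Beanie</li></ul></div>'
    '<script src="app.js"></script></body></html>'
)

print(count_elements(INITIAL_HTML), count_elements(RENDERED_HTML))
```

Comparing the two counts for your own pages is a quick way to see how much of your content only exists after rendering.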

Let’s take a step into Google’s WRS (Web Rendering Service)

To get a URL ready for indexing, Google first takes the HTML that it crawled and then pulls in more resources during rendering to produce the DOM tree and render tree. It programmatically opens a tab in a headless Chromium browser, loads the page, and waits until there is zero network activity or until a maximum wait time (based on the average load time) has passed. (“Headless” just means a browser with no visual representation or screen: all the underlying technology of a browser, but the output is code rather than an interactive screen.) Lastly, the browser runs some JavaScript inside the page to get the HTML.
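The waiting step described above (stop once network activity reaches zero, or give up after a maximum time) is a common pattern in headless rendering. The Python sketch below shows the generic pattern only; it is not Google's code, and the `pending_requests` callback, the polling interval, and the timeout values are assumptions.

```python
import time


def wait_until_idle(pending_requests, max_wait_s: float, poll_s: float = 0.01) -> str:
    """Poll until no requests are in flight or the time limit is reached.

    pending_requests: a callable returning the number of in-flight requests
    (hypothetical; a real renderer would get this from the browser).
    Returns "idle" or "timeout".
    """
    deadline = time.monotonic() + max_wait_s
    while time.monotonic() < deadline:
        if pending_requests() == 0:
            return "idle"
        time.sleep(poll_s)
    return "timeout"


# Simulated page whose outstanding requests drain over three polls.
remaining = iter([2, 1, 0])
print(wait_until_idle(lambda: next(remaining), max_wait_s=1.0))

# Simulated page that never stops making requests (for example, endless polling).
print(wait_until_idle(lambda: 1, max_wait_s=0.05))
```

The practical takeaway is the second case: a page that never goes quiet is cut off at the time limit, so content that loads late may not make it into the rendered HTML.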

The indexing pipeline components Google uses to render pages call a service wrapper to connect the WRS with the crawling infrastructure and the cache. Google caches aggressively to make sure it’s not wasting your crawl budget and time (due to a slow network, etc.).

Here are a few things to consider about Google’s renderer.

  • Google doesn’t use cookies. Instead, they have a fake functionality in place to expire cookies.
  • The date/time of crawling and indexing may differ. For example, if crawling fails, Google might index what they have anyway.
  • Service worker registrations and permission prompts are rejected.
  • If the code has a “random number” functionality, Google will pass a function that replaces the random number with another that stays the same throughout rendering so that the content stays the same.
  • For render timings, Google cares about:
    • Queue time: the time from dispatch to working (typically 5 seconds at median)
    • Render time: time spent rendering to execute the JavaScript content
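The point about random numbers can be illustrated with a seeded pseudo-random generator, which is one common way to make "random" values repeat from one run to the next. This is an assumption about the general technique, not a description of what Google's renderer actually does; the seed value and function name are hypothetical.

```python
import random


def make_deterministic_random(seed: int = 0):
    """Return a random() replacement that yields the same sequence for a given seed."""
    generator = random.Random(seed)
    return generator.random


# Two independent "renders" of the same page, each with its own generator.
first_render = make_deterministic_random()
second_render = make_deterministic_random()

first_values = [first_render() for _ in range(3)]
second_values = [second_render() for _ in range(3)]

# Both renders see identical "random" values, so content built from them matches.
print(first_values == second_values)
```

If your page uses random numbers to pick which content to show (for example, rotating featured products), expect a renderer that behaves like this to always see the same pick.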

Go more in depth into Google’s WRS by watching Martin Splitt’s talk.

Tackling faceted navigation with Frank Vitovitch

Botify’s own Frank Vitovitch, VP of Solutions, presented his research about faceted navigation and its implications for SEO.

How can faceted navigation cause SEO problems?

If you’re doing it wrong (even in the slightest), faceted navigation can negatively impact your SEO. Your site could suffer from duplicate content, crawl budget issues, internal PageRank dilution, and a potential loss in external link equity. That’s a lot at stake!

For example, blocking facets with robots.txt can cause problems of its own: an overly broad rule can block unintended areas of your site and cause lost link equity.
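As a sketch of how that can happen, the Python example below checks a few URLs against a robots.txt rule that was meant to block filter pages but is written too broadly. The rule, the URLs, and the site structure are invented. Python's standard-library parser uses simple prefix matching, so treat it as an approximation of how Googlebot reads the file rather than an exact replica.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule intended to block /shop/filter pages only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /shop/f
"""


def blocked_urls(robots_txt: str, urls, user_agent: str = "Googlebot"):
    """Return the URLs that the given robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


URLS = [
    "https://example.com/shop/filter?color=red",  # intended to be blocked
    "https://example.com/shop/footwear/boots",    # blocked by accident
    "https://example.com/shop/hats",              # still crawlable
]

for url in blocked_urls(ROBOTS_TXT, URLS):
    print(url)
```

Running a check like this against a list of your important URLs before deploying a robots.txt change is a cheap way to catch accidental blocks.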

Here’s what you should consider when managing your faceted navigation

Are the pages Google’s crawling driving traffic to your site? They should be. Consider how many facets you’re letting Google crawl. If Google is crawling 20 million URLs but 19 million of those URLs are faceted pages, that’s a lot of wasted crawl budget (a.k.a. crawl waste).
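The arithmetic behind that example is worth spelling out. The short Python sketch below uses the figures from the paragraph above; the function name is just for illustration, and it assumes (as the example does) that the faceted URLs are the ones not worth crawling.

```python
def crawl_waste_share(total_crawled: int, faceted_crawled: int) -> float:
    """Share of crawled URLs that are faceted pages."""
    if total_crawled <= 0:
        raise ValueError("total_crawled must be greater than zero")
    return faceted_crawled / total_crawled


share = crawl_waste_share(total_crawled=20_000_000, faceted_crawled=19_000_000)
print(f"{share:.0%} of crawled URLs are faceted pages")  # prints "95% of crawled URLs are faceted pages"
```

In other words, only 1 in 20 crawled URLs in this scenario is a non-faceted page.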

A good way to determine if a page in your faceted navigation is worth having crawled is to map out search demand for each major facet category. Is it generating significant demand, limited demand, or no demand? Take running shoes, for example. Pages by brand, gender, or type would generate significant demand, while pages for weight, size, or width would generate no demand. The latter should be blocked from being crawled.

Once you’ve asked these questions, you’ve got three options:

  1. If there’s sufficient search demand for the page to exist, AND you have the inventory to support the page (products or content), then go ahead and index the page.
  2. If there’s sufficient search demand for the page to exist, BUT YOU DON’T have the inventory to support the page (products or content), then don’t index the page.
  3. If there’s NOT sufficient search demand for the page to exist, then block the crawl.
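The three options above boil down to a small decision rule, sketched here in Python. The enum and function names are made up for illustration, and deciding what counts as "sufficient" demand or inventory is left to you.

```python
from enum import Enum


class FacetAction(Enum):
    INDEX = "index the page"
    NOINDEX = "don't index the page"
    BLOCK_CRAWL = "block the crawl"


def facet_action(has_search_demand: bool, has_inventory: bool) -> FacetAction:
    """Apply the three options above to a single faceted page."""
    if not has_search_demand:
        # Option 3: no demand, so don't spend crawl budget on it.
        return FacetAction.BLOCK_CRAWL
    if has_inventory:
        # Option 1: demand plus products or content to support the page.
        return FacetAction.INDEX
    # Option 2: demand exists, but there is nothing to show yet.
    return FacetAction.NOINDEX


# Hypothetical facets for a running shoes category.
facets = {
    "brand": (True, True),
    "gender": (True, False),
    "width": (False, True),
}

for name, (demand, inventory) in facets.items():
    print(f"{name}: {facet_action(demand, inventory).value}")
```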

Sounds simple enough, but fixing issues with faceted navigation can be a massive job. At Botify, our clients take one of two approaches. If their facets only need a little bit of love, then they should start slow and gradually limit crawling and indexing of specific facets. If the problem is much bigger, they should massively reduce crawling and indexing before gradually adding facets back into the mix. Essentially, tear it down and build it back up.

Catch the rest of Frank’s talk here or skim through his slides.

Watch the talks, spread the knowledge!

TechSEO Boost 2019 was full of in-depth looks at some of the biggest topics in SEO. While we only recapped a few talks of the day, there was so much more to soak up! So, whether you’re taking a break from the relatives this holiday season or unwinding after chocolates and cheesecake, feel free to take a few moments to watch all of the live recordings — bit by bit — for yourself.

Who knows what you’ll learn!
