Log File Analysis Website Migration

Successful Website Migrations With Botify Log Analyzer

Anyone who has managed a website migration knows what a great source of stress it can be. If not properly anticipated, a migration can put existing traffic at great risk. Not only organic traffic, but traffic from all sources!

For a pain-free, successful migration, don’t leave anything to chance:

  • Anticipate: prepare for the migration well ahead
  • Plan: carefully implement redirections – the obvious ones as well as the not-so-obvious ones
  • Check: test your redirections on real data
  • Monitor: verify that all goes according to plan by monitoring Google’s reaction and organic visit trends

Botify’s Logs Analyzer provides actionable data and detailed monitoring throughout the process.

Think Migration Very Early On, Starting with New Content Perimeter

From an SEO perspective, migration management ideally starts with the new site’s design stage.

First, check that all your website’s important pages will exist on the new website, from a robot’s perspective. Organic traffic can only be preserved if valuable SEO target pages still exist. That may sound trivial, but in some cases it’s not. For instance, some new pages may not exist through their own URL in the new website because they are now generated through Ajax refreshes.

Second, carefully review pages dismissed as ‘not important’. Website owners may be eager to get rid of old content they do not want to promote for understandable reasons, such as information about discontinued products. However, a part of this old content could generate a non-negligible, stable volume of visits (organic as well as referrals, through old links).
These pages could be worth keeping, as time-proven entry points to the website – with easy navigation to new, sexier pages. In many cases, business managers don’t suspect a business choice needs to be made. But SEO managers or traffic managers know better and can bring this to their attention.

The next steps will bring to light such unsuspected valuable pages, if any.

Redirections Preserve Organic Traffic Under Two Conditions

Existing organic traffic will be transferred to the new website, in a durable way, if and only if:

  • the new SEO target pages have very similar content (at least the same semantic universe, the same main keywords). In other words, if the new page answers the same questions as the pre-migration page.
  • there are page-to-page permanent redirections (HTTP 301) between old and new pages with similar content. This means bulk redirects won’t do the trick.

Example of a bulk redirection:
All pages from the old news section are redirected to the new news home.

www.mywebsite.com/news/*  → www.mywebsite.com/companynews/

Example of one-to-one redirections (this does not mean there have to be as many rules as URLs: a single rewriting rule can cover many URLs):

www.mywebsite.com/products/productA_num1234.htm  → www.mywebsite.com/catalog/productA-id1234.html
www.mywebsite.com/products/productB_num4567.htm  → www.mywebsite.com/catalog/productB-id4567.html
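
As an illustration of how one rewriting rule can cover many URLs, here is a minimal Python sketch. The pattern and the function name `redirect_target` are assumptions based only on the two example URLs above. A real site would express the same rule in its web server's or application's own redirect configuration.

```python
import re

# Hypothetical pattern inferred from the two example URLs:
# /products/<name>_num<digits>.htm
OLD_PRODUCT = re.compile(r"^/products/(?P<name>[A-Za-z0-9]+)_num(?P<id>\d+)\.htm$")


def redirect_target(old_path):
    """Return the new path for an old product path, or None if the rule does not apply."""
    match = OLD_PRODUCT.match(old_path)
    if match is None:
        return None
    # Each old product page maps to its own new page: one rule, one-to-one targets.
    return "/catalog/{name}-id{id}.html".format(**match.groupdict())
```

Paths that do not match the pattern return `None`, so they can be handled by other rules or flagged for review rather than being sent to a generic page.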

A page-to-page permanent redirect will transfer most of the old page’s popularity to the new page, while bulk redirects will have at best a short-lived effect mainly based on redirected crawl volume.

It makes sense: a 301 HTTP status code means that the content is no longer here, but it’s not gone: it can now be found there. If a very large number of URLs (URLs of pages that did not have the same content) are redirected to the same URL, it clearly means that the redirects don’t lead to similar content. That’s exactly why Google can treat a large number of 301 redirects to the same page as “soft 404s”: pages that are considered 404s (page not found) although they do not return a 404 status code.
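
One way to spot bulk redirections in a redirect plan is to count how many old URLs point to each target. The sketch below assumes the plan is available as a dictionary mapping old paths to new paths. The function name and the threshold are illustrative only and do not reflect any rule published by Google.

```python
from collections import Counter


def crowded_targets(redirect_map, threshold=100):
    """Return {target: count} for targets that receive at least `threshold` redirects.

    `redirect_map` maps old paths to new paths. The default threshold is an
    arbitrary illustration; pick one that fits the size of the site.
    """
    counts = Counter(redirect_map.values())
    return {target: n for target, n in counts.items() if n >= threshold}
```

Targets returned by this check are candidates for a closer look: either the old pages really do share one equivalent new page, or they deserve page-to-page redirections.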

How to Build the One-to-One Redirections To-Do List

Ideally, the vast majority of URLs from the old website should be redirected one-to-one, provided that there is content to redirect to. But that probably won’t happen in the real world because of cost and resource constraints – implementation resources have limits, urgent matters always come up, and the result is always a compromise. In all likelihood, only high priority URLs will get one-to-one redirections, and we’ll have to settle for bulk redirections for the rest. This means we have to list valuable URLs (from an SEO perspective) that we will focus on for one-to-one redirections.

  • Inventory current SEO-active pages

How do we make sure no valuable URL falls through the cracks?

Web server logs are a convenient data source: they trace every user visit, every crawl from every robot, every redirection… Any request the web server ever replies to is saved in these log files.

With the Botify Logs Analyzer, you can easily export SEO active URLs over a given period, for instance the last month or the last 60 days. SEO active URLs are URLs that generated at least one organic visit over the period. They are listed with the number of visits each one generated, ordered by number of visits.
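
To make the notion of an SEO active URL concrete, here is a rough Python sketch of the same idea applied to raw logs. It is not how the Logs Analyzer works internally. It assumes logs in the common "combined" format, counts only requests answered with HTTP 200, and treats a visit as organic when the referrer's host looks like a search engine. That referrer heuristic is deliberately crude and only for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed "combined" log format: ip - - [date] "METHOD path proto" status size "referrer" "agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative list of search engine host fragments, not exhaustive.
SEARCH_ENGINES = ("google.", "bing.", "yahoo.")


def is_organic(referrer):
    """Crude check: does the referrer's host look like a search engine?"""
    host = urlsplit(referrer).hostname or ""
    return any(host.startswith(e) or ("." + e) in host for e in SEARCH_ENGINES)


def seo_active_urls(lines):
    """Return [(path, visits), ...] for paths with organic visits, most visited first."""
    visits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("status") == "200" and is_organic(m.group("referrer")):
            visits[m.group("path")] += 1
    return visits.most_common()
```

The function accepts any iterable of lines, so it can be fed an open log file directly, for example `seo_active_urls(open("access.log"))` where the file name is a placeholder.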

[Figure: Botify Logs Analyzer, export of SEO active URLs]

  • Other pages worth redirecting

Any page that brings either traffic or popularity is worth redirecting. External links potentially bring both, and are sometimes overlooked.

Google Webmaster Tools (GWT) provides information on external linking: the ‘Search Traffic / Links to Your Site’ section lists top referrers (the top 1000 domains with links to your website) and sample target pages on your website (top pages with counts).

That should be more than enough to decide which pages need one-to-one redirections to similar content on the new website. Tools specialized in external linking can provide additional insight, such as domain value for top referrers.

[Figure: Google Webmaster Tools, external links]

With top active pages from recent logs, and top external links targets, we cover the most important URLs that need one-to-one redirections from an SEO perspective.

Test Your Redirections on Real Data

Once all planned redirections are in place in a pre-production environment, it’s time to test them on real data. URLs recently crawled by Google are an ideal test set, as they represent Google’s current view of your website.

Botify’s Logs Analyzer provides the list of URLs crawled by Google.

[Figure: Botify Logs Analyzer, export of crawled URLs]

Select the last 30 or 60 days, for instance. Depending on the website’s size, we can use all crawled URLs or only the top crawled URLs. In the example above, we would take 30 days or less, and the top 50K to 100K crawled URLs.

The next step is to crawl these URLs, just like Google would, using the crawler which comes with the Logs Analyzer.

To do so, we will need to:

  • Translate the URLs we selected from the Logs Analyzer into URLs that work in the pre-production environment (usually a simple search and replace to use the pre-production domain).
  • Feed this list of URLs to the crawler.
    This doesn’t take long as the crawl will stop after querying the list of URLs: we don’t need to get the new pages (target of redirections), nor follow links, we only need HTTP response headers with the type of response (HTTP 200, 404, 301 etc.) and redirection target locations.
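
For readers who want to see the two steps spelled out, here is a minimal Python sketch using only the standard library. It is a stand-in for illustration, not the crawler that comes with the Logs Analyzer. The pre-production host name is hypothetical, the input URLs are assumed to include a scheme, and the check assumes the server answers `HEAD` requests the same way it answers `GET`.

```python
import http.client
from urllib.parse import urlsplit, urlunsplit

# Hypothetical pre-production host; replace with the real one.
PREPROD_HOST = "preprod.mywebsite.example"


def to_preprod(url, preprod_host=PREPROD_HOST):
    """Swap the production host for the pre-production host, keeping path and query."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=preprod_host))


def check_url(url, timeout=10):
    """Send one HEAD request and return (status, Location header or None).

    http.client does not follow redirects, so a 301 is reported as a 301
    together with its target, and the target page itself is never fetched.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("HEAD", target)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

A test run would then call `check_url(to_preprod(url))` for each URL in the exported list and store the status and redirection target next to the original URL.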

Then, check the URLs status codes – this is one of the standard outputs provided by Botify.
There should be:

  • A vast majority of 301 redirections
  • No HTTP 200 status codes (that would be old pages responding with content)
  • No or very few 404s (or at least 404s that are expected)
  • No or very few 5xx status codes (server errors); if there are any, they can be a sign of rewriting errors during redirections.
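
The checklist above can be turned into a small summary over the test results. The sketch below assumes the results are available as `(url, status)` pairs. The function names and the grouping into labels are illustrative choices, not a Botify output format.

```python
from collections import Counter


def classify(status):
    """Group a status code into the categories used in the checklist."""
    if status == 301:
        return "301"
    if status == 200:
        return "200"
    if status == 404:
        return "404"
    if 500 <= status <= 599:
        return "5xx"
    return "other"


def status_report(results):
    """Return (counts per category, list of (url, status) pairs that are not 301s)."""
    counts = Counter()
    to_review = []
    for url, status in results:
        label = classify(status)
        counts[label] += 1
        if label != "301":
            to_review.append((url, status))
    return counts, to_review
```

Everything in the review list is worth a manual look: an unexpected 200 means an old page still answers with content, and a 5xx may point to a faulty rewriting rule.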

There might still be a bit of work to do after this redirection check, so ideally, this is done a little bit in advance, and not right before going live.

Post-Migration Monitoring

After the big day, it is important to monitor Google’s crawl and organic traffic.

  • Google crawl by HTTP status code
[Figure: Botify Logs Analyzer, Google crawl by HTTP status code]

(days with no data correspond to missing logs)

Right after the migration, expected results are:

  • Very similar to results from a redirection check in pre-production, for errors and redirections (mainly permanent redirects, no or few 404s, no or few 5xx),
  • Plus some HTTP 200 (OK) from the new website. In this particular example, the post-migration volume of HTTP 200 crawl is particularly low because the new website includes a much lower number of pages allowed to robots (the rest being counter-productive for SEO).

Over time, we will monitor how fast Google ‘digests’ the redirections and starts going mostly straight to the new pages (which are also crawled directly from the new website). This will result in a lower volume of redirected crawl. Redirected URLs with a significant traffic history will still be crawled for a significant length of time; that’s to be expected. Those that receive external links will continue to be crawled as long as they are linked.
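
The same trend can be approximated from raw logs. The sketch below is an illustration, not the Logs Analyzer's method. It assumes the "combined" log format and identifies Googlebot by user agent string only. User agents can be spoofed, so a production check would also verify the requesting IP, for example by reverse DNS lookup.

```python
import re
from collections import Counter, defaultdict

# Assumed "combined" log format; the day is the part of the timestamp before the first colon.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<day>[^:\]]+):[^\]]*\] "[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_crawl_by_day(lines):
    """Return {day: {status: count}} for requests whose user agent mentions Googlebot."""
    daily = defaultdict(Counter)
    for line in lines:
        m = LOG_LINE.match(line)
        if m and "Googlebot" in m.group("agent"):
            daily[m.group("day")][m.group("status")] += 1
    return {day: dict(counts) for day, counts in daily.items()}
```

Plotting the share of 301 responses per day from this output should show the redirected crawl volume decreasing as Google picks up the new URLs.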

Note: in the Logs Analyzer, we can also zoom in to get the same view by type of page.

  • Volume of SEO active pages by type of page
[Figure: Botify Logs Analyzer, SEO active pages by type of page]

The post-migration volume of active pages greatly depends on the number of pages on the new website and the number of one-to-one redirections.
A downward trend is not necessarily a bad thing – provided that visits don’t follow a similar trend.
In this example, the purple pages were deliberately removed, and the green pages deliberately closed to robots, as they were generating an extremely small number of organic visits while consuming a significant amount of Google crawl. Visits, as shown below, are going up.

  • Volume of organic visits by type of page
[Figure: Botify Logs Analyzer, organic visits by type of page]

In this case, the migration went smoothly!
