Fred Thoughts (t37.net) is a blog about startups, user experience and productivity. It was started back in 2006 in French, and has also been available in English since last August. Its author, Fred de Villamil, is, during the daytime (and sometimes late at night!), Systems Admin at Botify, which means he has two reasons to be a heavy Botify Analytics user.
What did you discover, the first time you performed a Botify Analytics analysis on your blog?
Fred de Villamil: I performed my first crawl on January 11, 2014. I wanted to check that Botify Analytics had been deployed properly in a pre-production environment. I’m the first and probably the most intensive Botify Analytics user. My blog’s analysis results were far from excellent, but I did not have much time on my hands: we were very busy with a product launch at Botify, and I started working on my website much later.
I had removed my robots.txt by mistake during a CMS update. Suddenly, I had close to 4,000 additional indexable pages (for a total of 5,163): pagination, dates, tags (with one or two duplicates, or pagination variants), old URLs that did not mean anything any more… A large number of URLs were duplicated with and without the .html suffix, and many articles included links to such URLs.
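For context, a blog’s robots.txt typically blocks exactly this kind of machinery from being crawled. A minimal sketch of such a file (the paths are illustrative, not Fred’s actual rules, which depend on his CMS’s URL layout):

```
# Illustrative robots.txt; real paths depend on the CMS.
User-agent: *
Disallow: /tag/        # tag archives
Disallow: /category/   # category archives
Disallow: /page/       # pagination
```

Removing such a file makes every one of those archive and pagination URLs indexable at once, which matches the sudden jump to 5,163 pages described above.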
My first crawl uncovered quite a few pieces of interesting information: more than 600 duplicate H1 tags, more than 1,800 missing or duplicate meta descriptions, duplicate Title tags, loads of HTTP 404 errors, a few HTTP 500 errors, and over 100 HTTP 301 redirects that could easily have been avoided.
What actions did you take?
I started correcting problems in April, and finished in September.
First, I put the robots.txt back online and started a new crawl to see the difference it made. Things were already much better, and I was all set to start working on the rest.
I used the URL Explorer to identify all duplicate URLs. My blogging tool is quite old and has evolved significantly over time. My permalink format changed three times in eight years: /yyyy/mm/dd/post-slug, then /post-slug, then /post-slug.html. I used the exported data to update old links (using scripts, not by hand!). I also added a few 301 redirects at the server level, so that external links could find their way to my website, and a few HTTP 410 (Gone) responses for pages that no longer exist.
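The server-level redirects he mentions might look like this in nginx (a hypothetical sketch: the patterns are inferred from the permalink formats above, not taken from his actual configuration):

```nginx
# Old dated permalinks (/yyyy/mm/dd/post-slug) -> current .html form (301).
# The regex is quoted because it contains curly braces.
location ~ "^/\d{4}/\d{2}/\d{2}/(?<slug>[^/]+)/?$" {
    return 301 /$slug.html;
}

# A page that is gone for good: answer 410 (Gone) instead of 404.
location = /some-dead-page {
    return 410;
}
```

The 301 tells search engines to transfer the old URL’s equity to the new one; the 410 tells them the page is intentionally gone and can be dropped from the index faster than a 404 would suggest.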
I also decided to deindex all pagination, category pages (merged with tag pages) and tags. I thought I could afford to do this because my internal linking is built by an in-house plugin based on tag usage and words articles have in common. This did not have the expected result: traffic started decreasing slightly in May.
And then, I started another crawl!
In September, I set out to tackle another important problem: internal linking and duplicate, over-multiplied tags. A large number of tags were duplicated between French and English, between singular and plural forms, or across near-identical meanings. I exported the URL list and was able to remove a significant number of duplicates. This improved the blog’s internal linking, as my “related articles” plugin is based on tags. Unfortunately, there is still much to do on this front: my last crawl shows that there are still isolated pages I need to work on.
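A minimal sketch of the kind of tag normalization this cleanup involves (the tag list and the merge rules below are made up for illustration; a real cleanup would also need a manual French/English synonym map, which no script can infer):

```python
# Collapse near-duplicate tags: case, accents, and trivial plurals.
import unicodedata

def normalize(tag: str) -> str:
    # Lowercase, strip accents, and drop a trailing "s" as a crude plural rule.
    t = unicodedata.normalize("NFKD", tag.lower())
    t = "".join(c for c in t if not unicodedata.combining(c))
    return t[:-1] if t.endswith("s") and len(t) > 3 else t

def dedupe(tags):
    seen = {}
    for tag in tags:
        # Keep the first spelling encountered for each normalized key.
        seen.setdefault(normalize(tag), tag)
    return sorted(seen.values())

tags = ["Startups", "startup", "Sécurité", "securite", "UX"]
print(dedupe(tags))  # the accented/plural variants collapse to one tag each
```

The output of such a pass is a mapping from old tags to surviving ones, which can then drive the URL updates and redirects described earlier.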
What results did you get?
Two interesting results.
First, raw organic traffic increased: it is now 60% higher than it was a year ago. Here’s how organic visits from Google have evolved since the beginning of 2014, compared to the same period the previous year:
The peak in January 2013 is due to a post about Java and security, published the day a much-discussed vulnerability was disclosed. Initially, this post was targeted at social networks, but it also yielded great results from Google searches, which I didn’t expect.
The steady upward trend is quite recent, and as it has held for a few weeks now, I have good hope that it will continue. Considering the timing, I put this down to the internal linking optimizations I implemented recently. I’m quite confident that things can only improve further, as there is still work to do in this area. The fact that SEMrush noticed a change is also a good sign.
The second interesting result is an increase in the percentage of crawlable pages that bring in organic visits. The availability of Google Analytics imports in Botify Analytics made it easier to track this indicator. Between July and November of this year, the blog’s active pages ratio went from 28% to 41%. This still leaves some potential for improvement.
The graph below shows pages which generate visits from Google (active pages):
How long afterwards did you start another crawl?
I crawl my website several times a week, actually for each Botify Analytics deployment in pre-production or in production. This allows me to validate that there are no regressions in the product, and that I did not make any mistake on my blog the previous week.
I also start a new crawl, to look at things in more detail, every time I change my blog’s theme, a plugin, or URL redirections. This allows me to make sure that I didn’t break anything in the website structure, in individual pages, or at the server level. I notice immediately when a page returns HTTP 500 (Server Error), without waiting for the analysis results, as the crawl shows HTTP status codes in real time.
As soon as the analysis is ready, I check the “Interesting Facts”, in particular those related to performance. Metrics related to organic visits and visits from social networks are quite gratifying when they are rising. I can’t wait to be able to compare two consecutive crawls to spot progress right away!
Here are my blog’s latest results:
Let’s talk about advanced features. Have you been using the Virtual Robots.txt? The Custom User-Agent?
I use the Virtual Robots.txt on a regular basis, actually every time I want to change my robots.txt. This allows me to validate the list of pages that will be removed from Google’s index, and to verify the impact on internal linking, before going into production.
The Custom UA is extremely convenient for dealing with limitations at the service provider’s level: traffic limits from a given IP address can be lifted by authorizing this specific user-agent without any restriction. The Custom UA can also be used to filter server log files for advanced analyses.
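Filtering logs on a custom user-agent can be as simple as a substring match. A minimal sketch (the UA token and the log lines below are made up; adapt both to your own crawl settings and log format):

```python
# Keep only the crawler's requests, identified by a custom user-agent.
CUSTOM_UA = "MyBotifyCrawl/1.0"  # hypothetical UA configured for the crawl

def crawler_lines(lines, ua=CUSTOM_UA):
    # A simple substring match is enough when the UA token is unique.
    return [line for line in lines if ua in line]

log = [
    '1.2.3.4 - - [01/Nov/2014] "GET /post.html HTTP/1.1" 200 "Mozilla/5.0"',
    '5.6.7.8 - - [01/Nov/2014] "GET /tag/ux HTTP/1.1" 404 "MyBotifyCrawl/1.0"',
]
for line in crawler_lines(log):
    print(line)
```

Comparing the URLs the crawler hit against the URLs real visitors hit is the starting point for the kind of log-based analysis mentioned above.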
What’s your advice to someone who is about to use Botify Analytics for the first time? Someone familiar with SEO issues or not?
Someone who needs guidance can rely on colored indicators to locate problems. Colors indicate where action is required: the more an indicator leans towards red, the more serious the issue. Suggested Patterns are very convenient. Insights such as “50% of 404 errors come from URLs starting with /tag/ and containing ‘page’” save a lot of time when trying to understand a problem.
For seasoned SEO managers, the URL Explorer is a gold mine. It lets you query all the data down to the finest detail by stacking conditions. The bigger the website, the more essential the URL Explorer becomes.
What are your next steps?
I still have a lot of work to do to optimize tags, and internal linking in general. I think I will allow Google to reindex some tags and see how the search engine reacts to those expressions (in terms of rankings and generated traffic).
I also started a couple of experiments to bring in traffic from social networks, Twitter in particular. Botify Analytics already provided some visibility into this. Since I started these experiments, the number of pages visited via social networks has doubled.
Also, when I find the time, I will select a couple of expressions to position my blog on, and implement on-site and off-site strategies to see where this leads.
Although SEO is not my main hat, I find all things SEO very interesting because they allow for many experiments. My blog is my lab. Sometimes, attempts are successful, sometimes they are more questionable. In 2008, my blog disappeared completely from Google’s index, as did my name.
Thanks Fred! Something tells us that the upward trend will last this time… Fingers crossed!