We tend to assume that on large websites with a long tail audience and a lot of new content, fresh content generates a significant portion of overall organic traffic. Wrong.
Here are a few examples from different types of large websites. The graph below answers the question “how old are pages that generate organic visits?”: it shows each website’s volume of active pages (pages that generated at least one organic visit) over one month (October), and colors indicate how long ago Google first discovered these pages (first crawl):
Quite an eye opener, isn’t it? On the first three websites, most active pages are over one year old. These examples are not exceptions. Older – much older!! – pages being more active than new ones is actually the most common pattern. Notice that even on the classifieds website, close to half of active pages are at least two months old (“0 month” means first crawled during the current month). As far as the editorial / news website is concerned, keep in mind that this is Google’s universal search only (no Google News traffic – which is another SEO matter entirely, with high focus on content freshness and news sitemaps, which cover only the last 48 hours).
And still, on many websites, SEO attention tends to focus on recent pages (quick links to recent content, etc.), while older pages remain under-optimized (too deep, not enough internal linking, etc.).
The thing is, this often goes unnoticed. The age of active pages is not something that Web Analytics software tracks. Nor is it covered by website structural analysis, which works on a snapshot of the website. Only log analysis covering a wide time frame can provide this indicator, because it needs both the date Google first crawled a page and the organic visits that page received later.
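To make the indicator concrete, here is a minimal sketch of how it could be computed from already-parsed log records. It is not how Botify computes it: the function names, the `(url, date)` record shape and the sample URLs are all assumptions made for illustration, and it supposes that search engine crawler hits and organic visits have already been extracted from the logs.

```python
from collections import Counter
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later` (never negative)."""
    return max(0, (later.year - earlier.year) * 12 + (later.month - earlier.month))


def age_of_active_pages(crawl_hits, organic_visits, period_start, period_end):
    """Count active pages by age in months.

    crawl_hits: iterable of (url, date) for crawler hits (hypothetical shape).
    organic_visits: iterable of (url, date) for organic visits (hypothetical shape).
    Age is measured from the first crawl seen in the logs to the period start;
    0 means the page was first crawled during the analyzed month.
    """
    first_crawl = {}
    for url, day in crawl_hits:
        if url not in first_crawl or day < first_crawl[url]:
            first_crawl[url] = day

    # A page is active if it received at least one organic visit in the period.
    active = {url for url, day in organic_visits if period_start <= day <= period_end}

    ages = Counter()
    for url in active:
        if url in first_crawl:
            ages[months_between(first_crawl[url], period_start)] += 1
        else:
            ages["unknown"] += 1  # visited, but no crawl found in the logs
    return ages


# Made-up sample data for the month of October.
crawl_hits = [
    ("/ad/1", date(2013, 10, 3)),
    ("/ad/2", date(2013, 8, 15)),
    ("/ad/2", date(2013, 10, 1)),
    ("/ad/3", date(2012, 9, 1)),
]
organic_visits = [
    ("/ad/1", date(2013, 10, 5)),
    ("/ad/2", date(2013, 10, 7)),
    ("/ad/3", date(2013, 10, 9)),
    ("/ad/3", date(2013, 10, 10)),
    ("/ad/4", date(2013, 9, 30)),  # outside the period, so not active
]
ages = age_of_active_pages(crawl_hits, organic_visits, date(2013, 10, 1), date(2013, 10, 31))
```

With this sample, one active page falls in each of the 0, 2 and 13 month buckets. The wider the time frame covered by the logs, the more reliable the first crawl date.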
The impact on traffic is real: let’s keep in mind that for very long tail traffic, active pages volume closely reflects organic visits volume.
On the forum website we took as an example, there are around 560K active discussion pages over the one-month period, which generate an average of a dozen organic visits per page; less than 4% make more than 100 visits; 90% make 30 visits or less. On the classifieds website, there are around 55K active ad pages, which generate on average 3 visits each; 5% make more than 10 visits, and 80% make 3 visits or less.
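Figures like these can be derived from a simple count of organic visits per active page. The sketch below is one possible way to do it; the function name, the thresholds and the sample counts are illustrative assumptions, not data from the websites above.

```python
def visit_distribution(visits_per_page, high, low):
    """Summarize how organic visits are spread across active pages.

    visits_per_page: dict mapping URL -> organic visits over the period.
    high / low: visit thresholds chosen by the analyst (assumed parameters).
    """
    counts = list(visits_per_page.values())
    total_pages = len(counts)
    return {
        "active_pages": total_pages,
        "mean_visits": sum(counts) / total_pages,
        "share_above_high": sum(c > high for c in counts) / total_pages,
        "share_at_or_below_low": sum(c <= low for c in counts) / total_pages,
    }


# Made-up counts for five ad pages.
sample = {"/ad/1": 1, "/ad/2": 2, "/ad/3": 3, "/ad/4": 3, "/ad/5": 11}
stats = visit_distribution(sample, high=10, low=3)
```

When most pages sit at or below the low threshold, as in the long tail described here, the number of active pages is a good proxy for the number of organic visits.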
For this indicator to have a true meaning, we need to consider pages by type of audience: separate navigation pages (home and top categories with top tail audience, lower-level categories with middle tail audience) from content pages (discussions / articles / products / ads with long tail audience).
Navigation pages will of course be very old active pages, unless your website went through a migration recently. They don’t change and have been generating traffic for as long as they have existed. When looking at the “age of active pages” indicator, we want to focus on the type of page that is regularly published. This is why, in the examples above, we showed content pages and not the full websites. This is achieved through URL categorization, which assigns each URL to a page type based on patterns in its path.
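As an illustration, URL categorization can be sketched as an ordered list of pattern rules. The category names and URL patterns below are hypothetical and would need to match your own URL scheme; this is not the configuration format of any particular tool.

```python
import re

# Ordered rules: the first matching pattern wins. All patterns are hypothetical.
CATEGORY_RULES = [
    ("home", re.compile(r"^/$")),
    ("content", re.compile(r"^/ads/\d+")),        # individual ad pages
    ("navigation", re.compile(r"^/category/")),   # category listings
]


def categorize(path: str) -> str:
    """Return the page type for a URL path, or 'other' if no rule matches."""
    for name, pattern in CATEGORY_RULES:
        if pattern.search(path):
            return name
    return "other"


# Keep only content pages before computing the age of active pages.
paths = ["/", "/ads/123-bike", "/category/bikes/", "/about"]
content_paths = [p for p in paths if categorize(p) == "content"]
```

Filtering on the content category before computing the indicator keeps old navigation pages from skewing the picture.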
The “age of active pages” indicator can be found in the Botify Log Analyzer’s “Advanced” tab. The maximum age that can appear on a graph is limited by the amount of logs available for analysis: if your log analyzer has been up and running for 12 months, then all older pages will appear as 12 months old. The last time range in the graph should therefore be read as a minimum age.
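The capping effect can be shown with a small sketch. The function name and label format are assumptions for illustration only: any page whose first crawl seen in the logs is as old as the log history itself gets a “or more” label, since its real age cannot be known.

```python
def age_label(age_months: int, log_history_months: int) -> str:
    """Label an age bucket, marking the oldest one as a minimum.

    age_months: age computed from the first crawl seen in the logs.
    log_history_months: how many months of logs are available (assumed known).
    """
    if age_months >= log_history_months:
        # The page may have been crawled before the logs start.
        return f"{log_history_months} months or more"
    return f"{age_months} month(s)"


labels = [age_label(a, 12) for a in (0, 5, 12, 30)]
```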
In the examples above, we showed data consolidated by month for graphs to be easier to read, but as with any indicator in Botify Log Analyzer, you can also consolidate information by week or day.
Here is, for instance, daily data for ads on the classifieds website:
Understanding active pages’ age patterns for content with a long tail audience will help focus on high-yield optimizations, which often involve improving crawl ratio on content to make sure Google discovers as many pages as possible. Chances are that these older pages are deeper than newer ones. It will probably make sense to work on your website structure (depth, pagination, internal linking, reattaching orphan pages…). And also on performance (see why page performance matters for SEO, and how to optimize robots’ crawl performance). Optimizing a long tail audience is all about allowing search engine robots to crawl more – active pages volume is mechanically linked to crawled pages volume.