How well does Google keep up with your new content? If, like many websites, yours regularly publishes new pages (weekly, daily, or constantly), you need to know. Are all new pages explored by Google immediately? Do new pages generate organic visits right away? Or do older pages keep generating the bulk of the traffic?
Figuring that out will help prioritize and target your SEO optimizations. That’s why Botify Log Analyzer includes the following indicators:
New pages resulting from the website’s normal “life”
New content of an existing type is regularly published.
Editorial websites: new articles are created on a daily basis. For the most part, they are placed in sections that already exist.
E-commerce websites: new products are regularly added (periodical product catalog updates).
Classifieds: new ads are constantly added by users.
New pages resulting from one-time changes
A new section or new type of page is added to the website, or a type of page which was disallowed to robots is now allowed.
New, unwanted or unexpected pages
Two sorts of unexpected new pages can appear in the new crawl:
The example below illustrates all three cases on an e-commerce website: normal periodical changes (product catalog updates), a one-time change, and unwanted pages.
Mainly products and, to a lesser extent, product duplicates.
Now, let’s see new URLs crawled:
It’s interesting to compare the global typology of all crawled URLs with the typology of new crawled URLs. The former indicates what the website looks like from Google’s perspective, the latter how the search engine sees the website evolve.
The vast majority of new URLs crawled are product duplicates.
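The comparison between the two typologies can be sketched in a few lines of Python. This is only an illustration, not how Botify Log Analyzer computes it: the page type labels, the sample URLs, and the `is_new` flag (assumed to be precomputed from your logs) are all hypothetical.

```python
from collections import Counter

def type_shares(page_types):
    """Return each page type's share of the given list, as a fraction of 1."""
    counts = Counter(page_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ptype: count / total for ptype, count in counts.items()}

# Hypothetical sample of crawled URLs: (url, page_type, is_new).
crawled = [
    ("/p/1", "product", False),
    ("/p/2", "product", False),
    ("/p/4", "product", True),
    ("/p/1?ref=a", "product_duplicate", True),
    ("/p/2?ref=b", "product_duplicate", True),
    ("/p/3?ref=c", "product_duplicate", True),
    ("/list/shoes", "list", False),
    ("/list/bags", "list", False),
]

# Typology of everything Google crawled vs. typology of new URLs only.
all_shares = type_shares([ptype for _, ptype, _ in crawled])
new_shares = type_shares([ptype for _, ptype, is_new in crawled if is_new])

for ptype in sorted(all_shares):
    print(f"{ptype}: all={all_shares[ptype]:.0%} new={new_shares.get(ptype, 0):.0%}")
```

A page type whose share is much larger among new URLs than among all crawled URLs, like the duplicates in this toy sample, is the kind of gap worth investigating.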
If we take a closer look at our example:
New URLs crawled in yellow and bright green over the first days are new navigation pages.
URLs in darker green are duplicates of product pages. They are flagged as “warning” and will appear in the “Alerting Crawl” graph. The counter above the graph shows that there are 129.4K warning URLs crawled, which account for most of the new URLs crawled.
New product pages appear in pink. There are very few actual new products compared to all the product duplicates Google keeps finding.
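For readers who want to reproduce this kind of breakdown from their own server logs, here is a minimal sketch. It assumes a URL counts as "new" on the first day a search engine bot requests it within the period you analyze, and that each log entry has already been reduced to a hypothetical `(day, url, page_type)` tuple. The function name and sample data are illustrative, not part of any Botify API.

```python
from collections import defaultdict
from datetime import date

def new_urls_per_day(crawl_events):
    """Count new URLs by (first crawl day, page type).

    crawl_events: iterable of (day, url, page_type) tuples, in any order.
    A URL is counted once, on the earliest day it appears.
    """
    first_seen = {}
    for day, url, page_type in crawl_events:
        if url not in first_seen or day < first_seen[url][0]:
            first_seen[url] = (day, page_type)

    counts = defaultdict(int)
    for day, page_type in first_seen.values():
        counts[(day, page_type)] += 1
    return dict(counts)

# Hypothetical crawl events extracted from logs.
events = [
    (date(2024, 1, 1), "/p/1", "product"),
    (date(2024, 1, 2), "/p/1", "product"),  # repeat crawl, not new
    (date(2024, 1, 2), "/p/1?sort=asc", "product_duplicate"),
    (date(2024, 1, 2), "/p/1?sort=desc", "product_duplicate"),
    (date(2024, 1, 3), "/p/1?sort=asc", "product_duplicate"),  # repeat crawl
]

daily_new = new_urls_per_day(events)
for (day, page_type), count in sorted(daily_new.items()):
    print(day, page_type, count)
```

Note that "new" here is relative to the window of events you feed in: a URL first crawled before the window starts would wrongly look new, so in practice you would need enough log history to cover it.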
Let’s zoom in on product pages (select a page type at the top of the page, instead of showing “all website”):
New product pages are regularly published and crawled; they just become insignificant among the large number of new duplicates.
Considering that, on this website, product pages represent 74% of active pages (see below) and 56% of organic visits, dealing with product duplicates is an absolute necessity and a top priority.
As expected, product duplicates don’t generate any organic visits.
Page type distribution (distribution by page category or tag) can differ significantly between all active pages and new active pages, depending on the type of content, and the portion of organic traffic which is expected to be generated by fresh content.
In our example, content freshness is not key to generating organic traffic. New active pages represent 22% of all active pages, but visits on new active pages represent less than 5% of organic visits.
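One way to make this comparison explicit is to divide the share of visits by the share of pages. The sketch below uses a hypothetical helper and made-up counts, chosen only so that the shares match the 22% and "less than 5%" figures quoted above; the real counts are not given in this article.

```python
def freshness_ratio(new_active_pages, all_active_pages, visits_on_new, all_visits):
    """Compare new pages' share of visits with their share of active pages.

    Returns (page_share, visit_share, ratio). A ratio well below 1 means new
    pages bring fewer visits than their number would suggest.
    """
    page_share = new_active_pages / all_active_pages
    visit_share = visits_on_new / all_visits
    return page_share, visit_share, visit_share / page_share

# Hypothetical counts: 22% of active pages are new, 4.5% of visits land on them.
page_share, visit_share, ratio = freshness_ratio(
    new_active_pages=2_200,
    all_active_pages=10_000,
    visits_on_new=4_500,
    all_visits=100_000,
)
print(f"pages: {page_share:.1%}, visits: {visit_share:.1%}, ratio: {ratio:.2f}")
```

With these illustrative numbers, a new active page brings roughly a fifth of the visits an average active page does, which is consistent with freshness not being a traffic driver on this site.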
Here, list pages are new pages for the most part (as seen in the new crawl graph), so it is not surprising to see they generate new active pages. It would be interesting to see how active they remain over time.
Also, while Google seems to easily distinguish between products and product duplicates (the latter don’t generate organic visits), the search engine does not seem to distinguish as easily between lists and duplicate lists: duplicate lists are almost as active as “real” list pages.
If pages are mainly active when they are new, it usually means that they are active because they are new:
In this other example (a classifieds website), the distribution of new active pages is closer to the overall distribution of all active pages:
Differences still deserve to be examined.
Have you made any observations about your fresh content versus older content? Let us know in the comments!