Segmentation is key for meaningful, actionable SEO data analysis. However relevant the indicators, we won’t have any information to act on if we only see global, average values that flatten out any significant local phenomenon on the website.
We have two types of segmentation:
Suggested Patterns and Categories are based on URL content. This means segments are defined by combinations of rules such as ‘contains
Categories can also be defined by a regular expression for more complex URL patterns, but it always comes down to URL format.
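To make the idea of URL rules concrete, here is a minimal sketch of how a segment could be expressed as a combination of rules. The helper names (`contains`, `matches`, `all_of`) and the example URLs are assumptions made for illustration; this is not Botify's rule syntax.

```python
import re

# Hypothetical rule builders: each returns a predicate over a URL string.
def contains(fragment):
    return lambda url: fragment in url

def matches(pattern):
    compiled = re.compile(pattern)
    return lambda url: compiled.search(url) is not None

def all_of(*rules):
    # A segment is defined by a combination of rules that must all hold.
    return lambda url: all(rule(url) for rule in rules)

# Illustrative segment: product pages whose path ends in a numeric id.
product_pages = all_of(contains("/product/"), matches(r"/\d+\.html$"))

print(product_pages("https://example.com/product/red-shoes/1234.html"))  # True
print(product_pages("https://example.com/blog/1234.html"))               # False
```

Whether a rule is a simple substring test or a regular expression, the input is always the URL itself.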
So why the distinction between Suggested Patterns and Categories? For two important reasons:
1) Suggested Patterns can overlap. Categories can’t.
2) Suggested Patterns are detected automatically using a Botify algorithm. Categories are the result of human thought and the rules are created manually.
That’s because they don’t have the same objective.
To identify URL Suggested Patterns, we take the website’s full set of analysis data, and process it to see what ‘shapes’ are emerging for each indicator.
Patterns are identified for sets of URLs that share a specific value for a given indicator (good or bad, for that matter). For instance, for ‘http status code = 404’ (page not found), or for ‘unique title tag = true’. When the indicator does not have a predefined set of possible values (when it’s just a number, for instance page load time), we search for URL patterns for the highest and lowest values.
So we provide one or several URL patterns per indicator value, depending on what came out of the data. When we provide several patterns, each pattern is an additional clue to help locate the problem. The patterns are only relevant to that problem: that is how they are selected.
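The general idea can be sketched as follows. This is a deliberately naive illustration, not the Botify algorithm: it assumes a pattern is simply the first path segment of a URL, and the function names, the `min_share` threshold and the sample data are all hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

def first_segment(url):
    """Return the first path segment, e.g. '/old' for '/old/a'."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    return "/" + parts[0] if parts else "/"

def suggest_patterns(pages, indicator, value, min_share=0.5):
    """List prefixes that hold at least min_share of the URLs with this value."""
    affected = [url for url, data in pages.items() if data.get(indicator) == value]
    if not affected:
        return []
    counts = Counter(first_segment(url) for url in affected)
    return [
        (prefix, n / len(affected))
        for prefix, n in counts.most_common()
        if n / len(affected) >= min_share
    ]

# Made-up crawl data for illustration only.
pages = {
    "https://example.com/old/a": {"http_code": 404},
    "https://example.com/old/b": {"http_code": 404},
    "https://example.com/old/c": {"http_code": 404},
    "https://example.com/blog/x": {"http_code": 404},
    "https://example.com/blog/y": {"http_code": 200},
    "https://example.com/": {"http_code": 200},
}

print(suggest_patterns(pages, "http_code", 404))  # [('/old', 0.75)]
```

Running the same function for another indicator value would yield a different set of patterns, which is why patterns found for one problem say nothing about another.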
In our Log Analyzer, the requirements are totally different. We want a segmentation that makes sense site-wide and remains valid over time. We also want to use the same segmentation for all indicators. There are a variety of metrics, such as pages that generate/don’t generate organic visits, pages crawled by search engine bots, crawl freshness on pages that generate visits, … to name a few.
We can’t have overlapping segments: for many indicators, we show the distribution over all segments, and these graphs use cumulative data. With overlapping segments they wouldn’t make any sense, because the total from the graphs would be more than the actual total. So the segmentation we apply is a categorization, where each URL belongs to one category only.
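A small sketch shows why overlap breaks cumulative totals. It assumes one possible way to guarantee exclusivity, an ordered rule list where the first match wins plus a catch-all category; the rule names and paths are hypothetical.

```python
import re
from collections import Counter

# Hypothetical ordered rules: the first matching rule wins.
CATEGORY_RULES = [
    ("product", re.compile(r"^/product/")),
    ("sale", re.compile(r"sale")),
]

def categorize(path):
    """Assign exactly one category to a path, falling back to 'other'."""
    for name, pattern in CATEGORY_RULES:
        if pattern.search(path):
            return name
    return "other"

paths = [
    "/product/shoes",
    "/product/sale-shoes",
    "/product/sale-hats",
    "/sale/",
    "/about",
]

# Exclusive categories: every path is counted once.
exclusive = Counter(categorize(p) for p in paths)

# Overlapping segments: a path is counted under every rule it matches.
overlapping = Counter(
    name for p in paths for name, pattern in CATEGORY_RULES if pattern.search(p)
)

print(sum(exclusive.values()))    # 5, equal to len(paths)
print(sum(overlapping.values()))  # 6, more than the actual total
```

With the exclusive version, the per-category counts always add up to the number of URLs, so a stacked distribution graph stays meaningful.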
This approach is well suited to a full, detailed structural analysis, or to monitoring trends over time. This is very different from analyzing a website at one point in time and trying to understand why, in that snapshot, one indicator is not good. The time spent setting it up is not the same either (zero for suggested patterns).
Of course, categories can also be used for diagnosis purposes. But their main goal remains to provide a general-purpose, stable, site-wide segmentation, while suggested patterns provide a very focused, almost ‘single use’ view to shed light on a given issue.
Botify customers can categorize their URLs within the Logs Analyzer, and we provide a more flexible type of segmentation through Suggested Patterns in Botify Analytics reports. In another post, we explain the difference between Suggested Patterns and categorization.