Duplicate content is any content that appears in more than one place on the internet, such as an article published on both your website and another website. Depending on how the canonical URL reference is handled, search engines can treat the two copies as duplicates, which can affect your search position. Content that appears on two pages of your own website can also be considered duplicate content. This is often seen on international websites where the same content is published on both the US and the AU website instances, or on retail sites that manage local pages with the same information. It can also happen in marketing campaigns where the message differs only slightly for personalization or other business reasons.
Content can also be considered “duplicate content” if it’s essentially the same but with minor variations.
When two messages are semantically similar, search algorithms can parse them and treat them as the same content. An old rule of thumb held that as long as 60% of the content was different, search spiders would see it as unique. However, many in the SEO community have pulled back from this rule and instead work to prevent the need for duplicate content in the first place.
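To make the idea of near-duplicate text more concrete, here is a minimal sketch that scores how much two passages overlap. It compares overlapping three-word windows ("shingles") using Jaccard similarity, a common textbook technique. This is an illustration only, not a description of how Google or any other search engine measures similarity, and the function names and sample messages are hypothetical.

```python
import re
from typing import Set, Tuple


def shingles(text: str, k: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(a: str, b: str, k: int = 3) -> float:
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical campaign messages that differ only in the city name.
message_a = "Save 20% on running shoes this weekend in Austin"
message_b = "Save 20% on running shoes this weekend in Denver"
print(jaccard_similarity(message_a, message_b))  # 0.75
```

Changing a single word still leaves most shingles shared, which is why lightly personalized copies of the same message can look like duplicates.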
Is Duplicate Content Bad For My SEO?
Duplicate content doesn’t result in a penalty from Google, except for severe, repeat cases. However, just because Google will not penalize your website does not mean duplicate content has no impact on SEO. Google’s focus over the years has continued to shift towards providing users with the best on-site content and experience. This means that if your content is the same as another website’s, there isn’t much value for Google in ranking that content highly, since similar content already exists.
There are several ways that duplicate content can impact your SEO strategy if there is a lot of it on your website:
If you have a large enterprise website, your crawl budget can quickly be wasted trying to crawl duplicate pages instead of having spiders focus on the most valuable content pages on your site.
Without instructions from you, the site owner, Google spiders can be confused about which page should be crawled and indexed. For example, if you and an affiliate publish the same article without a canonical URL pointing to the originating article, Google doesn’t have any signals to know which site should receive “credit” for the content.
If you have a large amount of duplicate content, Google may begin to doubt your E-A-T (Expertise, Authoritativeness, and Trustworthiness), affecting your overall appearance in search results beyond just those duplicate content pages.
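The canonical URL mentioned above is declared in a page's HTML with a link tag whose rel attribute is "canonical". As a rough sketch of how you might check whether a page declares one, the following uses only Python's standard library. The class name, function name, and example.com URL are hypothetical, and a real audit would also need to fetch pages and handle canonicals sent as HTTP headers.

```python
from html.parser import HTMLParser
from typing import List, Optional


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.canonicals: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, or be missing.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonical(html: str) -> Optional[str]:
    """Return the first canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


# Hypothetical republished article that points back to the original.
page = (
    "<html><head>"
    '<link rel="canonical" href="https://www.example.com/original-article">'
    "</head><body>Republished article</body></html>"
)
print(find_canonical(page))  # https://www.example.com/original-article
```

A result of None on a republished article suggests that search engines have no explicit signal about which copy is the original.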
What Are Common Ways Duplicate Content Occurs?
Duplicate content isn’t always the result of nefarious intentions. Instead, it frequently is the result of technical SEO issues on your website. Some of the most common ways duplicate content can occur are:
URL parameters: While exceptionally helpful for tracking where your traffic is coming from, URL parameters can make it appear that there is duplicate content on your website, because each parameter variation looks like a separate URL serving the same page. Depending on your CMS, there are several ways to prevent this issue, such as blocking crawls of URLs that contain parameters.
Slashes at the end of URLs: To most of us, botify.com/ and botify.com are the same URL, right? While they likely resolve to the same location, search engine spiders will often see this as duplicate content. The easiest way to solve this issue is by maintaining strong URL hygiene or by always redirecting slashed URLs to their non-slashed versions.
www.botify.com vs. botify.com: Like the slashed URL issue above, some websites may be unaware that they are serving the same content on www.site.com as they are on site.com. If this is the case, a 301 redirect can stop search engines from seeing two duplicate versions of your whole website!
Localization: It’s common for websites with a global presence to have local language sites for core business regions. However, if you also run localized websites for the same language, you can run into issues. If your business has US, UK, and AU websites localized to the different variations of English spoken in those countries, the content can often appear as duplicate content. Making sure each website uses the proper hreflang tags will resolve this issue.
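Several of the causes above (tracking parameters, trailing slashes, and www versus non-www hosts) come down to many URL variants pointing at the same page. The sketch below shows one way to collapse those variants into a single form when auditing a list of crawled URLs. The function name, the list of tracking parameters, and the choice to prefer the non-www, non-slashed form are all assumptions for illustration. Your site may standardize the other way, and this does not replace the redirects themselves.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed set of tracking-only parameters; adjust to match your analytics setup.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content", "gclid",
}


def normalize_url(url: str) -> str:
    """Collapse common duplicate-producing URL variants into one form."""
    parts = urlsplit(url)

    # Treat www.site.com and site.com as the same host (assumed preference).
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]

    # Drop trailing slashes so /blog/ and /blog compare equal.
    path = parts.path.rstrip("/")

    # Keep parameters that change the content; drop tracking-only ones.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))


variants = [
    "https://www.example.com/blog/?utm_source=newsletter",
    "https://example.com/blog/",
    "https://example.com/blog",
]
print({normalize_url(u) for u in variants})  # {'https://example.com/blog'}
```

If several crawled URLs normalize to the same string, they are candidates for a redirect or a shared canonical URL.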
Need help trying to locate duplicate content in your ecosystem? Find duplicate or near-duplicate content on your website by viewing the “Similarities/Duplicates” report in Botify or through a site audit. Learn more now!