The way search engines view duplicate content is not the same way that the average search engine optimization (SEO) professional views it. The typical search professional thinks of duplicate content as a percentage value, such as, “These two Web pages have 65% similar content and 35% unique content. If I change one page’s content so that there is only 64% similarity, search engines will not consider these pages to be duplicates.”
As tempting and easy as it is to make this simple calculation, it is not an accurate one. Search engines do not calculate duplicate content with such a simple equation. Many beginner and expert SEO practitioners alike are not aware of the various duplicate content filters that search engines apply at all three points (crawler, indexer, query processor) of the search engine process, some of which I will be discussing at the Chicago Search Engine Strategies Conference.
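To see why a single percentage threshold is misleading, consider one family of techniques search engines are widely believed to use for near-duplicate detection: shingling, where each page is reduced to a set of overlapping word windows and pages are compared by set overlap (Jaccard similarity) rather than by a raw "percent similar" figure. The sketch below is an illustration of that idea only, not any engine's actual implementation; the function names and the window size are my own choices.

```python
# A minimal sketch of w-shingling with Jaccard similarity, one technique
# commonly described for near-duplicate detection. Illustrative only --
# not the actual filter any commercial search engine uses.

def shingles(text, w=3):
    """Return the set of w-word shingles (overlapping word windows)."""
    words = text.lower().split()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}

def jaccard(a, b, w=3):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

page1 = "search engines filter duplicate content before indexing"
page2 = "search engines filter duplicate content after indexing"
print(round(jaccard(page1, page2), 2))  # prints 0.43
```

Notice that changing a single word shifts the similarity score in lumps determined by the window size, not by one smooth percentage point at a time, which is one reason the "get below 65% similar" mental model does not map onto how duplicate detection actually behaves.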
Types of duplicate content filters
For example, some duplicate content filters weed out content before Web pages are added to a search engine index, meaning that many Web pages will not even be available to rank. What is the point of optimizing a Web page if the page is not going to be available to rank? Optimization can take a considerable amount of time and money, especially if you are hiring an SEO firm to do the optimization for you. Web site owners can save considerable time and expense if they are more knowledgeable about the ways that search engines allow and don’t allow pages to rank.
On the flip side, some duplicate content filters are applied after pages are added to the search engine index. Web pages might be available for ranking, but they might not display in search engine results pages (SERPs) as Web site owners might like them to appear. Being proactive can make more pages, and the right pages, available to rank.
By understanding how the commercial Web search engines filter out and display duplicate content, Web site owners can obtain greater search engine visibility and provide a better searcher experience.
Duplicate content and the searcher experience
The truth is that searchers do not wish to see the same content delivered over and over and over again in search results. For example, if a searcher is doing research on a product or service, and he/she keeps seeing the same content delivered in position #1, position #2, position #3, and so forth, the searcher quickly becomes frustrated, not only with the search engine (for delivering “bad” results) but also with the seemingly “unique” companies that provide the search listings.
Web site owners who have affiliates and distribute content through RSS feeds or content licensing commonly believe that more search engine visibility means a greater chance for conversions. What they do not realize is that duplicate content delivery via these methods can actually lead to a poor search experience, a poor branding experience, and a poor user experience. As a Web site usability professional, I constantly see how easily searchers remember the “bad” sites and don’t click on those links again, even if those links appear in top search engine positions for other keyword searches.