May 14, 2008
For some reason, the words “duplicate content” seem to strike fear into the hearts of webmasters everywhere, particularly those who run affiliate programs.
Questions about dupe content are among the most common I get on my Ask Kalena blog. But most webmasters worry about this for the wrong reasons. Let me set the record straight: I don’t believe there is such a thing as a duplicate content penalty. Yes, search engines can detect duplicate pages and will do what they can to avoid including them. But that usually means running a filter-based algorithm that picks out the “original” page and ignores the dupes. eMarketing firm Elliance have put together a terrific graphic that shows how this filter system works.
So search engines are good at filtering out dupe pages. However, duplicate content can cause other issues that you might not have thought about. As Eric Enge points out on the SEOmoz blog:
1. Dupe content can take up a search bot’s crawl budget, meaning that some of your important pages may not be indexed.
2. Dupe pages waste valuable PageRank and link juice.
3. A search engine’s final decision about which is the “original” page and which are the dupes may not be accurate.
I see the third problem happen a lot, particularly with blog posts that have been scraped or articles that have been syndicated. Pages containing my own original articles have sometimes suffered this fate when they are syndicated on a popular hub or authority site. Google incorrectly assumes that the authority site is the originator of the content and ignores my earlier version.
So how do you avoid these issues? Where possible, add the nofollow attribute to links that point to the dupe pages. If you’re syndicating content, ask your publishers to add a noindex robots meta tag to the pages containing your content and/or to include a link back to your original source.
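If you syndicate a lot of articles, checking each publisher's page by hand gets tedious. Here is a rough Python sketch of how you might automate the check, using only the standard library. The names `SyndicationAudit` and `audit_syndicated_page` are my own inventions for illustration, the example.com URL is a placeholder, and the script assumes you already have the page's HTML as a string. It only looks for a robots meta tag containing noindex and for a link whose href exactly matches your original URL, so treat it as a starting point rather than a complete audit.

```python
from html.parser import HTMLParser


class SyndicationAudit(HTMLParser):
    """Collects robots meta directives and link targets found in a page."""

    def __init__(self):
        super().__init__()
        self.robots_directives = set()
        self.link_targets = []

    def handle_starttag(self, tag, attrs):
        # Attribute names arrive lowercased; values may be None for bare attributes.
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() == "robots":
            # A robots meta tag holds a comma-separated list, e.g. "noindex, follow".
            for directive in attr.get("content", "").split(","):
                self.robots_directives.add(directive.strip().lower())
        elif tag == "a" and attr.get("href"):
            self.link_targets.append(attr["href"])


def audit_syndicated_page(html, original_url):
    """Report whether a syndicated copy is noindexed and links back to the original."""
    parser = SyndicationAudit()
    parser.feed(html)
    return {
        "noindex": "noindex" in parser.robots_directives,
        # Exact string match only (hypothetical simplification: no URL normalisation).
        "links_to_original": original_url in parser.link_targets,
    }


# Hypothetical syndicated page used purely as sample input.
page = """
<html><head><meta name="robots" content="noindex, follow"></head>
<body><p>Originally published at
<a href="https://example.com/original-article">example.com</a>.</p></body></html>
"""

print(audit_syndicated_page(page, "https://example.com/original-article"))
```

If either value comes back False for a publisher's page, that is your cue to get in touch and ask them to add the missing tag or link.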