February 14, 2012
In the first installment of this article series I talked about the issue of duplicate content and why it is becoming ever more important to give Google a way to easily recognize you as the originator of the copy appearing on your website.
In this article I aim to show how web publishers like yourself can unwittingly lose upwards of 90% of their potential search traffic to ‘republishers’ or thieves, how your own web content can be used against you, and how you can take back control of your content and reclaim the traffic that is rightfully yours.
What Type of Content Is at Issue?
Content is basically anything we choose to publish online – it all falls under the scope of this article. It can be a product or service description, a blog post, a review of some type, a page of text describing our company, even a ‘policy’ page is classed as content.
The Problem – Content Duplication and Distribution
Whether or not you actively seek to share your web content, you are still affected by this issue.
Duplicated content is being handled differently by Google in 2012 and it’s important to understand how the changes affect you. In Part I of this series, I explained how Article Directories have been devalued by Google due, in most part, to the fact that they are a repository for duplicated content. But what if your content is replicated and published around the web, outside of article directories? How is that handled by Google and in what situations would an issue occur?
Proactively Marketing Your Content
If you are a proactive content or article marketer you’re familiar with the process. You write an article and submit it to your own site first and wait until it appears in Google’s index. Then you submit the article to various directories in pursuit of back-links, traffic and possibly syndication.
Passive Content Marketing
The web is all about sharing. Whether you actively seek to have your content shared or not is largely irrelevant. If you have good content, it’s going to get ‘shared’ whether you like it or not, and not always in ways you might think.
If you feel flattered when someone republishes your content, keep in mind that a large percentage of this republished content is collected and distributed by software – content scrapers. They automate the task of trawling the web and stealing text which matches a target theme. Many use integrated text ‘spinners’ which can churn out multiple versions of your original work. WordPress blogs are rife with heavily spun content collected by automated scrapers.
The Consequences of This Content ‘Distribution’ And How It Affects You
1. If your content is found via Google search outside of your own website, you are losing direct traffic.
2. If Google is unable to establish you as the originator of the content, the copy published on your own website may eventually be devalued or even de-indexed.
I’ve written extensively for a wide variety of niches and I frequently find whole pages of content which I’ve written for a website reproduced elsewhere without my permission. I’m not talking here about content I choose to share, I’m referring to content which I’ve explicitly said ‘DO NOT SHARE’ via my Footer Copyright Policy.
Many people assume this cannot harm you, but at best it is leeching traffic away from your site, and at worst it could place you squarely in the path of Google’s campaign to reassess the value of duplicated content on the web.
Here’s the simple scenario: you write an information piece describing your product and publish it to your website. Someone, or something, takes that web copy and republishes it, or some ‘spun’ version of it, on a different website. That page receives search-related traffic from Google and other search engines which rightfully belongs to you. Your own page, which is strikingly similar to the spun version, may be devalued as a duplicate, so you lose that traffic too. Up to 90% of the traffic which rightfully belongs to you can be lost in this way. I know this for certain; I’ve been testing the theory for over six months.
But I own the content, I wrote it and published it first!
Here’s where the naysayers will step in and claim that as long as you publish the content first on your own website, you can’t become a victim of what I’m describing above. But you can, and you probably already are.
Have you ever noticed how your content sometimes appears on other websites ahead of your own in the SERPs? Article marketers will relate to this phenomenon, since they are more active in content distribution. They write an article for their own website and wait until it is indexed by Google. Then they distribute the same article to various Article Directories in the hope of gaining some benefit from the back-link or from direct traffic. The directory copy is then indexed by Google and, in some instances, appears higher in the SERPs than the article does on the author’s own website. It works the same way with illicitly obtained content, whether or not it is ‘spun.’ Here is an actual forum post I found today which relates to this subject (name removed) –
“Title: Somebody stealing my content and outranking me with it? Hopefully somebody can help me with the following. I’ve been starting with my first website a few months ago, and have been steadily improving. Today when checking the ranking in google for a keyword, I found out that somebody had copied a whole blogpost of mine. Not that big a deal if I still outrank him in Google. Only problem is, he scored higher in Google than me with that blog post (!).
Does anyone have a possible explanation for it, something that I can do about it?”
The point is, Google does not always know where the content originated. The fact that you may have published it first does not, by itself, tell Google that you are the originator.
Google could only determine the originator with certainty if it had real-time indexing of the entire web, which it does not. Think about it for a moment. If Google holds only between 18% and 30% of the entire web in its index at any one time, how can it assume that a piece of content it encounters for the first time does not already exist elsewhere? Mathematically, it must assume the opposite of what you’d expect: that any new content it finds may already have been published.
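To put rough numbers on that reasoning, here is a small illustrative calculation using the 18%–30% coverage figures quoted above. The function and its independence assumption are my own, purely for illustration; nothing here reflects how Google actually works.

```python
# Illustrative arithmetic only, using the article's 18%-30% coverage figures.
# If the index holds a fraction `coverage` of all web pages, any single
# pre-existing copy of your text has a (1 - coverage) chance of being outside
# the index when your page is first crawled.

def chance_duplicate_unseen(coverage: float, copies: int = 1) -> float:
    """Probability that none of `copies` pre-existing duplicates is indexed,
    naively assuming each copy is indexed independently."""
    return (1 - coverage) ** copies

for coverage in (0.18, 0.30):
    print(f"coverage {coverage:.0%}: "
          f"chance one earlier copy is unseen = "
          f"{chance_duplicate_unseen(coverage):.0%}")
```

In other words, even on the optimistic end of those figures, a first-crawled page cannot be safely treated as the first copy ever published.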
It’s Not An Issue For Me, So I Don’t Care
If you’re not convinced about the threat potential you might be using flawed reasoning. Because it hasn’t appeared to be an issue for you up to now you may feel inclined to ignore this and to carry on doing what you do. There are very good reasons why you shouldn’t ignore this information.
Throughout 2011, duplicate publications of the same piece of content have been allowed to coexist. But many people, not just myself, will point to the fact that this is starting to change. We all know that the first hit came in 2011 with the Google algorithm update termed ‘Panda.’ Panda resulted in a massive slap for Article Directories, the notorious publishers of previously published content.
But in a more subtle way Google has started to filter pages from other types of websites which contain largely similar content – and this is where it may impact you. Through its video broadcasts, Webmaster Tools and Webmaster Forums, Google has already begun preparing us for the changes. In the first week of February 2012 Google began rolling out its revised PR gradings and once again sites making use of recycled content are affected. The process has begun, it’s ongoing, it’s happening now. But where will it end and how will it affect you?
It could end with Google making a decision about the origination of all web content and filtering anything deemed ‘not original’ on a much wider scale than it does already. This isn’t hypothetical; as I’ve stated and shown by example, the process is already underway. We just don’t know how far it will go.
Google is in a constant battle to keep up with the people who try to exploit it. There’s an entire industry aimed at search engine manipulation. It started back in the ’90s with abuse of the ‘noframes’ tag, meta keyword stuffing, hidden keywords, cloaking and the like, then moved on to bulk back-link building, content scraping, and so on.
If you’re a fool, you’ll be looking for ways to trick Google into providing you with more traffic. Each time a loophole is exploited (in this case, the mass reproduction of simulated content to gain SERP advantage through article directories, blogs and other publishing platforms), Google has to step in and rewrite the rules.
If you’re sensible about search engine marketing, and you value the long-term health of your online business, you’ll be looking for ways to help Google help you.
One of the ways you can do this is to help Google determine that the content on your website belongs to you, assuming that it does. At the same time you can use a technique that will place a hurdle in the way of people who try to utilize your content without your consent.
Quite simply, use the following technique and you’ll avoid traffic loss through illicit use of your content, and you’ll protect and even improve your own SERPs.
The Google Credit Score
How can we create a situation where Google knows our content is the first copy to be published and at the same time lock it down against unauthorized duplication and republication?
The strategy is to simply ensure that it can have no duplicate.
Imagine that an ‘original author’ parameter in the algorithm is weighted, perhaps on a scale of 1-100, with 100 being absolute certainty that you are the originator of the content. If you score 100, all subsequent processing of the algorithm is weighted in your favor and your content will never be downgraded or removed as a duplicate.
The way to earn a ‘100’ credit score for your work, authority for your website, and more traffic, is to encode all the key content pages on your website with an encrypted ‘uniqueness key.’ That is, give Google something to look at which it knows cannot be recreated outside of your own website. To do this you need to focus on creating content which is dynamic, not just ‘set and forget.’
Some of the simple ways we can implement this are to create content with –
1 – Integrated video
2 – Integrated images
3 – Links out to authority resources
4 – Integrated social media within the page
5 – Google+ and FB ‘like’ on the specific page and not just the homepage
6 – Alternate formats for accessibility, like PDF, audio and audio/video
7 – Some pattern of social bookmarking and interaction
8 – The option for textual interaction with the page, either a simple comments script or full-blown blog
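To make the ‘credit score’ idea concrete, here is a toy sketch that scores a page on the eight elements listed above. The feature names, weights and scale are entirely my own invention for illustration; Google publishes no such formula.

```python
# A hypothetical "uniqueness score" built from the eight elements listed
# above. Weights are invented for illustration; they are not Google's.

WEIGHTS = {
    "integrated_video": 15,      # 1 - integrated video
    "integrated_images": 10,     # 2 - integrated images
    "authority_links": 10,       # 3 - links out to authority resources
    "social_embeds": 10,         # 4 - social media within the page
    "page_level_likes": 15,      # 5 - Google+/FB 'like' on the page itself
    "alternate_formats": 10,     # 6 - PDF, audio, audio/video versions
    "social_bookmarks": 10,      # 7 - social bookmarking and interaction
    "comments_enabled": 20,      # 8 - comments script or full blog
}

def uniqueness_score(page_features: set) -> int:
    """Sum the weights of the elements present on the page, capped at 100."""
    return min(100, sum(w for f, w in WEIGHTS.items() if f in page_features))

plain_text_page = uniqueness_score(set())
rich_page = uniqueness_score({"integrated_video", "integrated_images",
                              "page_level_likes", "comments_enabled"})
print(plain_text_page, rich_page)
```

The point of the sketch is simply that each hard-to-replicate element stacks: a plain text page scores nothing, while a page carrying several of the elements accumulates a score a scraper cannot easily reproduce.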
The key is the way in which you integrate everything into your content. For example, take a section title or some bullet points from your content and place the text within an image file. That way if the text is copied without the image being present it won’t make sense. You’ve now placed a hurdle in the way of content scrapers.
In the same way if you integrate a video or audio clip into the page, describe textually what is present in the video, so if the text is scraped without the video it will be incomplete and won’t make any sense. (Hyperlinks and embedded media files are usually removed by content scrapers, so be sure to describe your links textually, so that if the link is removed, what’s left behind won’t make any sense on the thief’s website).
The sooner you can build a few back-links to your new page the better. These help create a profile for the content around the web and help to raise your credit score. Make a post on Facebook and invite people to visit your page, then provide them with ‘like’ buttons. At the same time, why not ask them to ‘like’ your page from within the body of text.
WordPress and CMS owners can utilize useful plugins to help make content more dynamic. There are plugins to help with syndicated content distribution (which creates a solid footprint on the web for your content) and there are plugins which you can use to dynamically update a static page with texts from blog posts and other sources.
In doing most or all of the above, you’re creating a unique template for your work that cannot be replicated. You’re creating an encryption code which cannot be cracked outside of your own website and you’re giving Google a reliable way to index your content and determine authorship/ownership. The more uniqueness you can apply to a page in the form of components which cannot be easily replicated outside of your website, the more ‘bits’ there are in your encryption code and the more ‘secure’ Google feels about your specific content.
The reward is a high Google Credit Score, or more “Authority,” higher SERPs, and the security of knowing that your page isn’t going to be devalued as duplicate content. You’ve also rendered the auto-content scrapers impotent; you’re supplying them with little of use.
I’ve heard many people comment on the benefits of integrating video and social media into web content. They consistently state that doing so helps them rank higher in Google’s search results. What if the main thing that has happened with their integrated media approach is the establishment of a high Google Credit Score? I believe that it is.
One thing is certain: if your approach to article marketing is via the syndication route, you’re less likely to be concerned with, or affected by, the new rules surrounding duplicate content, at least for now.
In the next article I’ll be covering how you can create an effective article for syndication, how and where to find sources/ outlets for syndication, and how to create your all-important sales funnel.
I’ll leave you with one parting comment. The most valuable asset for any website owner is a contact list. The most effective way of building a contact list is through article syndication.
If you’d like to have an advanced copy of part three of this report sent to you via email, please visit this page on my website http://webdesigndoorcounty.com/spn2.html.
Author of the popular reference guide series “The Internet – No Place for Dummies,” Carl Hruza has operated his successful Web Design/SEO Company since 1998. You can learn more about the author from his website at webdesigndoorcounty.com or view the reference guide series at noplacefordummies.com.