AUG. 24,  ISSUE #985

How to Defend Your Website from the Google Duplicate Proxy Exploit

By Sophie White (c) 2007

There is a current and active way to knock a website out of Google's search engine results. It's simple and effective. This information is already in the public domain, and the more people who know about it, the more likely it is that Google will do something about it. This article explains how the exploit works, how a website gets knocked out of the search engine rankings and, most importantly, how to defend your own website against it.

To understand this exploit, you must first understand Google's Duplicate Content filter. Put simply, Google doesn't want you to search for "blue widget" and have the top 10 search results all be copies of the same article on how great blue widgets are. They want to give you ONE copy of the Great Blue Widget article and 9 other, different results, just on the off chance that you've already read that article and one of the other results is what you actually wanted.


To handle this, every time Google spiders and indexes a page, it checks whether it already has a page that is predominantly the same: a duplicate page, if you will. Nobody outside Google knows exactly how it works this out, but it is likely to be a combination of some or all of: page text length, page title, headings, keyword densities, checks for exactly copied sentence fragments, etc. As a result of this duplicate content filter, a whole industry has grown up around trying to get round it. Just search for "spin article".
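To make the idea of "exactly copied sentence fragments" concrete, here is a minimal sketch in Python. It is not Google's method, which is not public. It assumes one common textbook approach: break each page into overlapping runs of words (often called shingles) and measure how many runs the two pages share. The function names and the fragment size of 5 words are illustrative choices only.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word runs ("shingles") found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one fragment: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(page_a, page_b, size=5):
    """Share of fragments the two pages have in common, from 0.0 to 1.0."""
    a, b = shingles(page_a, size), shingles(page_b, size)
    if not a and not b:
        return 1.0  # two empty pages are trivially identical
    return len(a & b) / len(a | b)
```

A score near 1.0 means the two pages are predominantly the same text. A real filter would presumably combine a signal like this with the other factors listed above.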

Getting back to the story: Google indexes a page and, let's say, the page fails its duplicate content check. What does Google do? These days, it dumps that duplicate page in Google's Supplemental Index. What, you didn't know that Google has 2 indexes? Well, they do: the main one and a supplemental one. Two things are important here: Google will always return results from the main index if it can, and it will only go to the Supplemental Index if it doesn't get enough joy from the main one. What this means is that if your page is in the Supplemental Index, it's almost certain that you will never show up in the search engine results pages, unless there is next to no competition for the phrase that was searched for.

This all seems pretty reasonable to me, so what's the problem? Well, there's another little step I haven't mentioned yet. What happens if someone copies your page, let's say the homepage of your business website, and when Google indexes that copy, it correctly determines that it's a duplicate? Now Google has 2 pages that it knows are duplicates, and it has to decide which to dump in the supplemental index and which to keep in the main one. That's pretty obvious, right? But how does Google know which is the original and which is the copy? They don't. Sure, they have some clever algorithms to work it out, but even if those are 99% accurate, that leaves a lot of problems in the 1% of cases they get wrong!

And this is the heart of the exploit: if someone copies your website's homepage, say, and manages to convince Google that *their* page is the original, your homepage will get tossed into the supplemental index, never to see the light of day in the search engine results pages again. In case I'm not being clear enough, that's bad! But wait, it gets worse:


It's fair to say that when a person physically copies your page and hosts it, you can often get them to take it down through copyright lawyers and cease and desist letters to ISPs and the like, followed by a quick "Reinclusion Request" to Google. But recently there's a new threat that's a whole lot harder to stop: the use of publicly accessible Proxy websites. (If you don't know what a Proxy is, it's basically an intermediary server that fetches web pages on your behalf; a caching proxy keeps copies of that content closer to you so the web runs faster. In principle, they are generally a good thing.)

There are many such web proxies out there, and I won't list any here, but I will describe the process: they send out spiders (much like Google's) that crawl your page and take your content, then they host a copy of your website on their proxy site, nominally so that when their users request your page, they can serve up their local copy quickly rather than having to retrieve it from your server. The big issue is that Google can sometimes decide that the proxy copy of your web page is the original, and yours is not.

Worse again, there's some evidence that people are deliberately and maliciously using proxy servers to cache copies of web pages, then using normal (white and black hat) Search Engine Optimization (SEO) techniques to make those proxy pages rank in the search engines, increasing the likelihood that your legitimate page will be the one dumped by the search engines' duplicate content filters. Danger, Will Robinson!

Even worse still, some of the proxy spiders actively spoof their origins so that you don't realise they come from a proxy: they pretend to be Googlebot, for example, or Yahoo's spider. This is why the major search engines publish guidelines on how to identify and validate their own spiders.
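The validation the search engines describe is usually a two-step DNS check: look up the hostname for the visiting IP address (reverse lookup), confirm that it belongs to the search engine's domain, then look up that hostname again (forward lookup) and confirm that it resolves back to the same IP. The sketch below shows that logic in Python. The function name and the list of trusted hostname suffixes are assumptions for illustration, so check each engine's current guidance before relying on them. The lookup functions are passed in as parameters so the logic can be tried without touching the network.

```python
import socket

# Assumption: hostname suffixes documented by the engines for their crawlers.
# Verify against each engine's current published guidance.
TRUSTED_SUFFIXES = (
    ".googlebot.com",
    ".google.com",
    ".crawl.yahoo.net",
    ".search.msn.com",
)


def is_genuine_spider(ip,
                      reverse_lookup=socket.gethostbyaddr,
                      forward_lookup=socket.gethostbyname_ex):
    """Return True only if ip passes the reverse-then-forward DNS check."""
    try:
        # Step 1: reverse lookup, IP address -> hostname.
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False
    # Step 2: the hostname must end with a trusted search engine domain.
    if not hostname.lower().rstrip(".").endswith(TRUSTED_SUFFIXES):
        return False
    try:
        # Step 3: forward lookup, hostname -> list of IPv4 addresses.
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    # A spoofer can control the reverse record for its own IP, but not
    # the forward record of the search engine's real hostname.
    return ip in addresses
```

Note that `socket.gethostbyname_ex` only handles IPv4, so a production version would need a different forward lookup for IPv6 visitors. The results should also be cached, because two DNS lookups on every request would be slow.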

Now for the big question: how can you defend against this? There are several possible solutions, depending on your web hosting technology and technical competence:


Option 1 - If you are running Apache and PHP on your server, you can set the web server up to check for spiders that purport to be from the main search engines and, using PHP and the .htaccess file, block proxies from other sources. However, this only works for proxies that are playing by the rules and identifying themselves correctly.

Option 2 - If you are using MS Windows and IIS on your server, or if you are on a shared hosting solution that doesn't give you the ability to do anything clever, it's an awful lot harder and you should take the advice of a professional on how to defend yourself from this kind of attack.


Option 3 - This is currently the best solution available, and it applies if you are running a PHP or ASP based website: you set the robots meta tag on ALL pages to noindex and nofollow, then you implement a PHP or ASP script on each page that checks whether the visitor is a valid spider from one of the major search engines and, if so, resets the robots meta tag to index and follow. The important distinction here is that it's easier to validate a real spider, and to discount a spider that's trying to spoof you, than to identify every proxy, because the major search engines publish processes and procedures to do this, including IP lookups and the like.
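The logic of Option 3 can be sketched in a few lines. The article suggests a PHP or ASP script; the version below uses Python purely to illustrate the decision, and the function and parameter names are hypothetical. The spider check itself is passed in as a function, on the assumption that it performs whatever validation the search engines publish (such as a DNS lookup on the visitor's IP address).

```python
from typing import Callable


def robots_meta_tag(remote_ip: str,
                    is_verified_spider: Callable[[str], bool]) -> str:
    """Return the robots meta tag to emit in the page head for this request."""
    # Default for every visitor, including proxy spiders: do not index.
    content = "noindex, nofollow"
    # Only a spider that passes validation is given the indexable version.
    if is_verified_spider(remote_ip):
        content = "index, follow"
    return f'<meta name="robots" content="{content}">'
```

The design choice is to default to the restrictive tag: a visitor that cannot be validated, or a check that fails for any reason, still leaves the page marked noindex, so a proxy that copies the page copies the noindex tag along with it.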

So, stay aware, stay knowledgeable, and stay protected. And if you see that you've suddenly been dumped from the search engine results pages, now you might know why, how and what to do about it.


About The Author
Sophie White is an Internet Marketing and Website Promotion Consultant at Intrinsic Marketing, an SEO and Pay-Per-Click firm dedicated to supplying Better Website ROI.




 

 

SiteProNews - The Net's most widely read Webmaster newsletter



(c) Copyright 2007 All rights reserved. Jayde Online, Inc.
