SiteProNews: Newsletter and webmaster resources site
JULY 10,  ISSUE #812
Pushing Bad Data
Google's Latest Black Eye

By Eric Lester (c) 2006-06-26
Google stopped counting, or at least publicly displaying, the number of pages it indexed in September 2005, after a schoolyard "measuring contest" with rival Yahoo. That count topped out around 8 billion pages before it was removed from the homepage. News broke recently through various SEO forums that Google had suddenly, over the past few weeks, added another few billion pages to the index. This might sound like a reason for celebration, but this "accomplishment" does not reflect well on the search engine that achieved it.

What had people buzzing was the nature of these few billion new pages. They were blatant spam containing Pay-Per-Click (PPC) ads and scraped content, and in many cases they were showing up well in the search results, pushing out far older, more established sites. A Google representative responded to the issue via forums by calling it a "bad data push," an explanation that met with groans throughout the SEO community.

How did someone manage to dupe Google into indexing so many pages of spam in such a short period of time? I'll provide a high-level overview of the process, but don't get too excited. Like a diagram of a nuclear explosive, it isn't going to teach you how to make the real thing; you won't be able to run off and do it yourself after reading this article. Yet it makes for an interesting tale, one that illustrates the ugly problems cropping up with ever-increasing frequency in the world's most popular search engine.

A Dark and Stormy Night

Our story begins deep in the heart of Moldova, sandwiched scenically between Romania and Ukraine. In between fending off local vampire attacks, an enterprising local had a brilliant idea and ran with it, presumably away from the vampires... His idea was to exploit how Google handled subdomains, and not just a little bit, but in a big way.

The heart of the issue is that Google currently treats subdomains much the same way as it treats full domains: as unique entities. This means it will add the homepage of a subdomain to the index and return at some point later to do a "deep crawl." A deep crawl is simply the spider following links from the homepage deeper into the site until it finds everything or gives up and comes back later for more.
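
To make the idea of a deep crawl concrete, here is a minimal sketch in Python. It is not how GoogleBot actually works; the link table, the function name deep_crawl and the max_pages budget are all made up for illustration. It simply walks outward from a homepage, following links breadth-first until it runs out of new pages or hits its budget ("gives up and comes back later").

```python
from collections import deque

# Hypothetical in-memory "site": each page maps to the pages it links to.
LINKS = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/a", "/products/b"],
    "/products/a": ["/products"],
    "/products/b": [],
}

def deep_crawl(start, links, max_pages=100):
    """Follow links breadth-first from the start page, up to max_pages pages."""
    seen = {start}          # pages already queued, so none is visited twice
    order = []              # pages in the order the crawler visited them
    queue = deque([start])
    while queue and len(order) < max_pages:
        page = queue.popleft()
        order.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

print(deep_crawl("/", LINKS))               # the whole site, homepage first
print(deep_crawl("/", LINKS, max_pages=2))  # crawler gives up after two pages
```

The max_pages budget is the interesting part here: a crawler that treats every new hostname as a fresh site starts a fresh budget each time.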

Briefly, a subdomain is a "third-level domain." You've probably seen them before; they look something like this: subdomain.domain.com. Wikipedia, for instance, uses them for languages; the English version is "en.wikipedia.org", the Dutch version is "nl.wikipedia.org." Subdomains are one way to organize large sites, as opposed to multiple directories or even separate domain names altogether.
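
As a rough illustration of the naming scheme, the Python sketch below splits a URL's hostname into its subdomain and its domain. The helper split_host is hypothetical, and its "last two labels are the domain" rule is a simplifying assumption: it works for names like wikipedia.org but would be wrong for suffixes such as co.uk.

```python
from urllib.parse import urlsplit

def split_host(url):
    """Return (subdomain, domain) using a naive 'last two labels' rule."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    domain = ".".join(labels[-2:])      # e.g. "wikipedia.org"
    subdomain = ".".join(labels[:-2])   # e.g. "en", or "" if there is none
    return subdomain, domain

for url in (
    "http://en.wikipedia.org/wiki/Main_Page",
    "http://nl.wikipedia.org/",
    "http://wikipedia.org/",
):
    print(url, "->", split_host(url))
```

Both Wikipedia language editions share the domain wikipedia.org and differ only in the third-level label.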

So, we have a kind of page Google will index virtually "no questions asked." It's a wonder no one exploited this situation sooner. Some commentators believe the reason may be that this "quirk" was introduced after the recent "Big Daddy" update. Our Eastern European friend got together some servers, content scrapers, spambots, PPC accounts, and some all-important, very inspired scripts, and mixed them all together thusly...

Five Billion Served - And Counting...

First, our hero crafted scripts for his servers that would, when GoogleBot dropped by, start generating an essentially endless number of subdomains, each with a single page containing keyword-rich scraped content, keyworded links, and PPC ads for those keywords. Spambots were then sent out to put GoogleBot on the scent via referral and comment spam on tens of thousands of blogs around the world. The spambots provide the broad setup, and it doesn't take much to get the dominoes to fall.

GoogleBot finds the spammed links and, as is its purpose in life, follows them into the network. Once GoogleBot is sent into the web, the scripts running the servers simply keep generating pages: page after page, each with a unique subdomain, each with keywords, scraped content, and PPC ads. These pages get indexed, and suddenly you've got yourself a Google index 3-5 billion pages heavier in under 3 weeks.
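
For a sense of scale, the back-of-the-envelope Python below converts those figures into an average indexing rate. It assumes the reported 3-5 billion count is accurate (a figure Google disputes) and uses a full 21 days; since the reports say "under 3 weeks," the results are lower bounds on the implied rate.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def pages_per_second(pages, days):
    """Average number of pages indexed per second over the given period."""
    return pages / (days * SECONDS_PER_DAY)

for pages in (3_000_000_000, 5_000_000_000):
    rate = pages_per_second(pages, days=21)
    print(f"{pages:,} pages in 21 days is about {rate:,.0f} pages per second")
```

That works out to somewhere between roughly 1,650 and 2,750 new pages every second, around the clock, for three weeks.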

Reports indicate that, at first, the PPC ads on these pages were from AdSense, Google's own PPC service. The ultimate irony, then, is that Google benefits financially from all the ad activity billed to advertisers as their ads appear across these billions of spam pages. The AdSense revenues from this endeavor were the point, after all: cram in so many pages that, by sheer force of numbers, people would find and click on the ads in those pages, making the spammer a nice profit in a very short amount of time.

Billions or Millions? What is Broken?

Word of this achievement spread like wildfire from the DigitalPoint forums through the SEO community. The "general public" is, as of yet, out of the loop, and will probably remain so. A response by a Google engineer appeared on a Threadwatch thread about the topic, calling it a "bad data push." Basically, the company line was that they have not, in fact, added 5 billion pages. Later claims include assurances that the issue will be fixed algorithmically. Those following the situation (by tracking the known domains the spammer was using) see only that Google is removing them from the index manually.

The tracking is accomplished using the "site:" command, which, theoretically, displays the total number of indexed pages from the site you specify after the colon. Google has already admitted there are problems with this command, and "5 billion pages," they seem to be claiming, is merely another symptom of it. These problems extend beyond the site: command to the displayed number of results for many queries, which some feel is highly inaccurate and in some cases fluctuates wildly. Google admits they have indexed some of these spammy subdomains, but so far haven't provided any alternate numbers to dispute the 3-5 billion shown initially via the site: command.

Over the past week the number of the spammy domains & subdomains indexed has steadily dwindled as Google personnel remove the listings manually. There's been no official statement that the "loophole" is closed. This poses the obvious problem that, since the way has been shown, there will be a number of copycats rushing to cash in before the algorithm is changed to deal with it.

Conclusions

There are, at minimum, two things broken here: the site: command and the obscure, tiny bit of the algorithm that allowed billions (or at least millions) of spam subdomains into the index. Google's current priority should probably be to close the loophole before they're buried in copycat spammers. The issues surrounding the use or misuse of AdSense are just as troubling for those who might be seeing little return on their advertising budget this month.

Do we "keep the faith" in Google in the face of these events? Most likely, yes. It is not so much whether they deserve that faith, but that most people will never know this happened. Days after the story broke there's still very little mention in the "mainstream" press. Some tech sites have mentioned it, but this isn't the kind of story that will end up on the evening news, mostly because the background knowledge required to understand it goes beyond what the average citizen is able to muster. The story will probably end up as an interesting footnote in that most esoteric and neoteric of worlds, "SEO History."


About The Author
Mr. Lester worked in the IT industry for 5 years, acquiring knowledge of hosting and website design, before serving for 5 years as the webmaster for Apollo Hosting. Apollo Hosting provides website hosting, ecommerce hosting, VPS hosting, and web design services to a wide range of customers.



  SiteProNews - The Net's most widely read Webmaster newsletter


(c) Copyright 2006 All rights reserved. Jayde Online, Inc.