
August 23, 2009

Google Caffeine – A Taste Test

Earlier this month, Google invited the public to take their next generation of web search, code named Caffeine, for a test drive.

The new search infrastructure is the beginning of Google’s advance towards improving indexing speed and scale as the size of the web grows increasingly cumbersome. Google is seeking feedback on the changes from experienced users and web developers by making Caffeine available via a web developer preview.

From the official blog post:

“It’s the first step in a process that will let us push the envelope on size, indexing speed, accuracy, comprehensiveness and other dimensions. The new infrastructure sits ‘under the hood’ of Google’s search engine, which means that most users won’t notice a difference in search results. But web developers and power searchers might notice a few differences.”

What Is It?

According to Matt Cutts of Google, Caffeine is essentially a rewrite of the search index and it roughly compares with the Big Daddy index of late 2005 / 2006. In other words, it’s a BIG change to Google search.

Here are a few grabs from Mike McDonald’s video interview with Matt Cutts on the subject:

“We’re shooting to get results identical to previous version. We’ll open up a few datacenters with it first and then roll it out.”

“Caffeine will be more powerful, flexible and robust – allowing Google to index faster.”

“(Caffeine) builds a powerful foundation for including any changes we want to do with indexing. Not so much for taking advantage of semantic, real-time indexing, but for getting good infrastructure in place for growth and unlock more power.”

“Webmasters shouldn’t be concerned. Caffeine does not affect your site architecture.”

Something that Matt Cutts hasn’t mentioned, but that has been discussed amongst my colleagues, is whether the Caffeine rollout is at all related to Google’s Bigtable technology.

Bigtable is a distributed storage system for managing structured data, designed to scale to accommodate huge amounts of data. Google uses Bigtable to store data from its various load-heavy apps such as Google Earth and Google Finance. It makes sense that this would eventually roll out to search. Perhaps Caffeine is the new algorithmic skin for the Bigtable search infrastructure?

Search Technology Testing

In the SEO industry, we’re used to Google rolling out algorithm changes without fanfare, and to reacting only once we realize something has shifted, so this announcement came as quite a surprise to me. Paul Carpenter made the same point on the blog:

“… soliciting direct feedback from users before changes are made is something I can’t recall Google embarking on before.”

My first thought was that this was a knee-jerk reaction to the Yahoo / Bing announcement last week. But in his Caffeine blog post, Matt Cutts insists that the announcement had nothing to do with Binghoo and that they’ve had engineers working on it for months. He says that summer is simply a good time to roll it out for testing.

So I decided to conduct my own test to see if I could notice any changes.

The Experiment

I decided to compare de-caffeinated Google against caffeinated Google using five main benchmarks:

     A) Index size
     B) Speed
     C) Site rank
     D) Link type
     E) Keyword density

My tool of choice for the comparison is Facesaerch’s Caffeine Compare. The search queries I decided to test were:

     1) “iPhone cases”
     2) “Les Paul”
     3) “diamond earrings”
     4) “Kalena Jordan”
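
If you want to repeat the site rank comparison by hand, here is a minimal Python sketch of one way to line up two result lists. It assumes you have copied the top results for a query from each version of Google into plain lists; the function name `rank_changes` and the example URLs are hypothetical, and this is not a description of how Caffeine Compare works.

```python
def rank_changes(old_results, new_results):
    """Map each URL to (old_rank, new_rank), counting ranks from 1.

    None means the URL did not appear in that result list.
    """
    old_rank = {url: i for i, url in enumerate(old_results, start=1)}
    new_rank = {url: i for i, url in enumerate(new_results, start=1)}

    # Keep the old ordering first, then append URLs that only the new list has.
    only_new = [url for url in new_results if url not in old_rank]
    changes = {}
    for url in list(old_results) + only_new:
        changes[url] = (old_rank.get(url), new_rank.get(url))
    return changes


# Hypothetical result lists for a single query (placeholder URLs).
old = ["a.example", "b.example", "c.example"]
new = ["b.example", "a.example", "d.example"]

for url, (before, after) in rank_changes(old, new).items():
    print(url, before, after)
```

A URL showing `None` in one column appeared in only one of the two result sets, which is a quick way to spot pages that one index returns and the other doesn't.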

See the Detailed Search Comparison Results Chart at:


• Probably the biggest eyebrow raiser for me was the marked increase in keyword density between SERPs on the old Google and SERPs on Caffeine. In nearly every comparison, the Caffeine SERPs featured site titles and snippets with a much higher phrase and/or keyword density. Coincidence? I doubt it.

• It’s definitely faster. Every search query I tried on Caffeine was returned at a faster speed than with the current Google. Impressive.

• Caffeine seems slightly fresher. Some of the results I observed in Caffeine SERPs and not in regular Google SERPs were more current. For example, blog posts published within the last couple of days.

• Apart from the ego search, old Google outperformed Caffeine in the index size category. But this is likely because only a handful of datacenters have Caffeine on board so far.

• Caffeine definitely has a heavier emphasis on social media, with results from sites like Blogger, LinkedIn, Facebook and Google Profiles featuring more prominently, particularly for name searches. Wiki pages still seem to rank highly in both Caffeinated and Decaffeinated Google.
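
For readers who want to put a number on the keyword density observation above, here is a rough Python sketch. It assumes "keyword density" means the share of words in a title or snippet that also appear in the search query, which is only one possible definition; the function names and the sample title are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_density(text, query):
    """Fraction of the words in `text` that are also words of `query`."""
    words = tokenize(text)
    if not words:
        return 0.0
    query_words = set(tokenize(query))
    hits = sum(1 for word in words if word in query_words)
    return hits / len(words)


def phrase_count(text, query):
    """Number of times the whole query phrase appears, in order, in `text`."""
    words = tokenize(text)
    phrase = tokenize(query)
    n = len(phrase)
    if n == 0:
        return 0
    return sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase)


# Hypothetical SERP title for the query "iPhone cases".
title = "Cheap iPhone cases: buy iPhone cases online"
print(keyword_density(title, "iPhone cases"))
print(phrase_count(title, "iPhone cases"))
```

Running the same functions over the titles and snippets from both result sets gives a like-for-like figure to compare, rather than relying on an eyeball impression.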

Other Observations

Interestingly, a couple of other bloggers have observed different trends in Caffeine SERPs. In his blog post on the subject, Paul Carpenter says:

“… maybe blended results are getting a little less prominence. Certainly some news and image results are appearing further down the page in Caffeine than in the regular, *decaffeinated* results.”

Personally, I didn’t notice this. In fact, with product-related searches, I saw more blended results, with Google Shopping links often ranking higher in Caffeine.

But Everflux is influencing Caffeine results too, as Matthew Rogers of EndofWeb notes:

“The results for any search shift and change on a daily basis, because live-search results are added to the mix, causing a more fluid day-to-day search experience along with providing more relevant data upon request.”

Comparison Tools

Want to conduct your own Caffeine comparison testing? Here are a few tools to use:

Facesaerch’s Caffeine Compare

Black Dog’s Compare Google Caffeine

Doubleshot’s Get Caffeinated

Article by Kalena Jordan, one of the first search engine optimization experts in Australia, who is well known and respected in the industry, particularly in the U.S. As well as running a daily Search Engine Advice Column, Kalena manages Search Engine College – an online training institution offering instructor-led short courses and downloadable self-study courses in Search Engine Optimization and other Search Engine Marketing subjects.