March 14, 2008
How Google Applies the Lessons of Scale
So how does Google apply these lessons of scale? For starters, Google does not buy expensive hardware. Commodity PCs are unreliable, especially when you run thousands of them, but they are cheap and fast. So what’s Google’s strategy? Craig says they exploit the processing power of off-the-shelf PC hardware and make the software responsible for reliability.
Craig revealed that Google buys cheap hardware on a mass scale. The problem is that this cheap hardware is notoriously unreliable: the machines are packed into datacenters by the thousands and run 24 hours a day, so they get very hot. Commodity hardware therefore fails at an accelerated rate. Once you accept that, you need to design recovery procedures to deal with the failures. So Google’s software assumes that any machine holding its data can fail at any moment and works harder to cope with that.
For every server at Google, there is another with exactly the same data on it, the same configuration, the same everything: a clone. Replication provides fault tolerance as well as scalability: if the requested data isn’t returned promptly, the request goes to the backup or clone computer instead. The result is that failures don’t hurt Google, they only reduce capacity. When hardware crashes or software hangs, the request times out and is re-issued. Google has a central control system in place to manage all this.
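The time-out-and-re-issue idea can be illustrated with a minimal Python sketch. This is not Google's implementation; the function `fetch_with_failover`, the `timeout_s` parameter, and the demo replicas are all hypothetical names invented for the illustration. Each replica is modeled as a plain callable that takes a query and returns a result.

```python
import concurrent.futures
import time


class AllReplicasFailed(Exception):
    """Raised when no replica answers in time (hypothetical error type)."""


def fetch_with_failover(replicas, query, timeout_s=0.2):
    """Send the query to each replica in turn.

    If a replica crashes or does not answer within timeout_s seconds,
    the same query is re-issued to the next clone.
    """
    if not replicas:
        raise AllReplicasFailed("no replicas configured")
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(replicas))
    try:
        for replica in replicas:
            future = pool.submit(replica, query)
            try:
                return future.result(timeout=timeout_s)
            except concurrent.futures.TimeoutError:
                continue  # software hang: give up waiting, try the clone
            except Exception:
                continue  # hardware or software crash: try the clone
        raise AllReplicasFailed(f"no replica answered {query!r}")
    finally:
        # Do not block on hung workers; they finish in the background.
        pool.shutdown(wait=False)


# Demo replicas (assumptions for illustration only).
def hung_replica(query):
    time.sleep(0.5)  # simulates a server that responds too slowly
    return "too late"


def crashed_replica(query):
    raise RuntimeError("disk failure")


def healthy_replica(query):
    return f"results for {query}"
```

Calling `fetch_with_failover([hung_replica, healthy_replica], "carol brady", timeout_s=0.05)` returns the healthy clone's answer after the first attempt times out. The caller sees a slower response but no error, which matches the point that failures reduce capacity without hurting users.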
Cooling failures at Google can be exciting! Craig recalls the time when the air conditioning failed entirely at one of the datacenters. The monitoring system recognized that the datacenter was heating up, so they were able to shut down the remaining PCs there within minutes. The fire brigade turned up and it was quite a big event internally. But the best thing was that nobody using Google even noticed! Because of Google’s scalable solution, searchers were unaffected by the major hardware outage.
Craig says that once a week, a person at each datacenter takes a list of all the failed hard disks and walks around with a pile of hard drives, replacing them one at a time. Velcro is Google’s secret weapon! All Google’s hard disks are velcroed in, which makes servicing and replacement very quick. So curiously, hardware failures carry almost no downside at Google, because they are expected and managed via scale.
Google: The Startup
Craig showed the audience a photo of Google’s original PC configuration put together by Larry and Sergey at google.stanford.edu. It consisted of three hard drives and a couple of monitors. Larry and Sergey used Lego to enclose the hard disks and when Lego became too expensive, they used cheap Lego knock-offs! He then showed a picture of Google’s first office inside a residential garage and the hard drive racks that they built in a rented datacenter to save money. Larry and Sergey packed the racks together and used layers of cork between the motherboards so they wouldn’t explode. Eventually they hired people who knew about safe wiring, but they still used floor fans in the datacenter to try to keep the PCs cool.
Google and the Brady Bunch
Google’s Zeitgeist pulls together interesting search trends and patterns generated from the billions of searches conducted on Google. Craig is consistently fascinated by search trends and recalls one event in particular. On the game show Who Wants to Be a Millionaire, a contestant reached the final question for $1 million and it was: “On the TV show The Brady Bunch, what is Carol Brady’s maiden name?” The contestant used his phone-a-friend lifeline and his friend was able to look it up live on Google and provide the correct answer, earning him a million dollars.
The next day, out of interest, Google staff looked at the logs for “carol brady maiden name” and saw a huge spike in traffic when the show aired on the East Coast, then another spike when it aired on the West Coast and then a tiny spike when it aired a few hours later in Hawaii.
So Google Trends is a useful tool for studying data patterns, but Google keeps a bunch of statisticians on staff who check that apparent trends are statistically significant and not just the result of random variation. Craig says that in the same vein, you should look at your site logs and react, but be careful about jumping to conclusions about what the trends say.
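As a rough illustration of that caution applied to your own site logs, the Python sketch below flags an hour as a spike only when its count sits far above the other hours. The talk did not describe Google's statistical methods, so the z-score test, the function name `find_spikes`, and the default threshold of 3 standard deviations are all assumptions chosen for simplicity.

```python
from statistics import mean, stdev


def find_spikes(hourly_counts, threshold=3.0):
    """Return the indices of hours whose query count is a likely spike.

    Each hour is compared with the mean and standard deviation of all
    the *other* hours, so a large spike does not inflate its own baseline.
    """
    if len(hourly_counts) < 3:
        return []  # too little data to estimate a baseline
    spikes = []
    for i, count in enumerate(hourly_counts):
        others = hourly_counts[:i] + hourly_counts[i + 1:]
        baseline = mean(others)
        spread = stdev(others)
        if spread == 0:
            # Perfectly flat baseline: any increase at all stands out.
            is_spike = count > baseline
        else:
            is_spike = (count - baseline) / spread > threshold
        if is_spike:
            spikes.append(i)
    return spikes


# Hypothetical hourly counts for one query, invented for the example.
quiet_day = [10, 12, 11, 9, 10, 13, 11, 10, 12, 9]
show_night = [10, 12, 11, 9, 10, 95, 11, 10, 12, 9]
```

With these invented numbers, `find_spikes(quiet_day)` flags nothing, because the jump to 13 is within normal variation, while `find_spikes(show_night)` flags the hour with 95 queries. A real analysis would also need to account for daily and weekly cycles, which this sketch ignores.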
At the end of his presentation, I asked Craig whether he is concerned that Google’s PageRank algorithm will gradually become less accurate due to the demands of scale. Craig acknowledged that as Google’s indexed data grows, user input and search patterns will become increasingly important. He says PageRank will need to learn to become better at providing search results and scale up accordingly. But scale makes things interesting!
Article by Kalena Jordan, one of the first search engine optimization experts in Australia, who is well known and respected in the industry, particularly in the U.S. As well as running a daily Search Engine Advice Column, Kalena manages Search Engine College – an online training institution offering instructor-led short courses and downloadable self-study courses in Search Engine Optimization and other Search Engine Marketing subjects.