Google’s Caffeine Finally Live

After months (has it already been a year?) of hand-wringing and speculation in the search industry, Google finally announced Caffeine is live. It’s an appropriate name for the update, since it seems like Google’s been injected with a jolt of the good stuff to produce faster, fresher results.

Here’s a nice, concise breakdown of what Caffeine is and how it impacts search results:

What it is: Google’s new web indexing system, which provides fresher results, and more of them, than the previous system. Searchers should be able to find content much sooner after it is published than they could before.

Background Information: When you search for something in Google, you are not actually searching the “live” web. You are searching Google’s index of the web. Think of it as a library: the index is the catalog, and Google’s search engine is the librarian that returns all of the relevant results it can find there.
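To make the idea of searching an index rather than the live web concrete, here is a toy sketch in Python. It is only an illustration of a generic inverted index, not Google’s implementation. The page URLs, the text, and the function names build_index and search are all made up for this example.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the sorted URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    # Start with pages matching the first word, then narrow down.
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)

# Hypothetical pages standing in for the crawled web.
pages = {
    "example.com/a": "fresh coffee news",
    "example.com/b": "coffee brewing guide",
}
index = build_index(pages)
print(search(index, "coffee news"))
```

The search never touches the pages themselves, only the index built from them. That is why a page cannot show up in results until the index has been updated to include it.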

Why Google created Caffeine: Web content is growing exponentially in size and in variety (images, video, real-time updates, and so on). Publishers expect their content to be quickly accessible, and searchers expect the same. In short, Caffeine was built to keep up with the evolution of the web and the rising expectations of those who use it.

Differences between the old index and Caffeine: The old index had several layers, some of which were refreshed more often than others. To refresh any given layer, Google had to analyze the entire web, which meant a significant delay between finding a page and making it available in search results.

Caffeine is not set up in layers, but in spheres. This way the web can be analyzed in small portions, and the index can be updated continuously and globally. As Google finds new pages, or new information on existing pages, it adds them straight to the index, so content becomes searchable far sooner than before.
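The difference between refreshing a whole layer and updating in small portions can be sketched with two toy classes. This is a simplified illustration under my own assumptions, not a description of how Google built either system. LayeredIndex, IncrementalIndex, and their methods are hypothetical names.

```python
from collections import defaultdict

class LayeredIndex:
    """Batch style: crawled pages become searchable only after a full refresh."""

    def __init__(self):
        self.pages = {}
        self.index = defaultdict(set)

    def crawl(self, url, text):
        # The page is stored, but the index is not touched yet.
        self.pages[url] = text

    def refresh(self):
        # Rebuild the whole index from every known page.
        self.index = defaultdict(set)
        for url, text in self.pages.items():
            for word in text.lower().split():
                self.index[word].add(url)

    def search(self, word):
        return sorted(self.index.get(word.lower(), set()))

class IncrementalIndex:
    """Continuous style: each crawled page updates the index right away."""

    def __init__(self):
        self.index = defaultdict(set)
        self.words_by_url = {}

    def crawl(self, url, text):
        new_words = set(text.lower().split())
        # Drop entries for words that no longer appear on this page.
        for word in self.words_by_url.get(url, set()) - new_words:
            self.index[word].discard(url)
        for word in new_words:
            self.index[word].add(url)
        self.words_by_url[url] = new_words

    def search(self, word):
        return sorted(self.index.get(word.lower(), set()))
```

With LayeredIndex, a page passed to crawl stays invisible to search until refresh reprocesses everything. With IncrementalIndex, the same page is searchable as soon as crawl returns, and re-crawling a changed page only touches the entries for that page.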

Conclusion: Caffeine indexes web pages on an enormous scale, processing hundreds of thousands of pages in parallel each second. It is a big step toward getting the rapidly growing volume of content published online into Google’s index and in front of searchers sooner.

-Alie Sockol

