Freshbot and Deepbot

Google uses two robots to crawl web content. They have been dubbed the Freshbot and the Deepbot after their general purposes. The Deepbot performs the once-a-month deep crawl of web content that produces the main Google index. The Freshbot crawls the web on a continuous basis and is responsible for the Everflux effect. It finds content that is updated frequently, such as news sites, forums, blogs and other websites. It appears that when Google finds a new page it checks it frequently at first to see whether there are regular updates. If there are, the site is added to the list of pages to be visited by the Freshbot.

The Freshbot results appear to be compiled into a separate database, which is overwritten every time the Freshbot starts a new cycle. The Freshbot database and the main index are merged to produce search results. This means that fresh content may appear in search results very quickly but then disappear, only to resurface one or two months later in the main Google index. If the page is already in the main index, the Freshbot results may appear for a few days before reverting to the older version, which remains until the site is crawled by the Deepbot.
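The disappearing and reverting behaviour described above can be sketched in a few lines of Python. This is only a toy model built on the assumptions in the text, not a description of Google's actual implementation: the dictionaries, the URLs and the function name merge_results are all hypothetical.

```python
def merge_results(main_index, fresh_index):
    """Toy model: a fresh entry, when present, overrides the main index entry."""
    merged = dict(main_index)
    merged.update(fresh_index)
    return merged


# Hypothetical data: one page already in the main index, one brand new page.
main_index = {"example.com/news": "version from the last deep crawl"}
fresh_index = {
    "example.com/news": "version crawled today",
    "example.com/new-page": "first crawled today",
}

# While the fresh database holds both pages, searchers see the fresh versions.
during_cycle = merge_results(main_index, fresh_index)

# A new Freshbot cycle overwrites the fresh database. Assume neither page has
# been revisited yet, so the fresh database is empty for these two URLs.
after_overwrite = merge_results(main_index, {})

print(during_cycle)
print(after_overwrite)
```

In this model the new page drops out of the merged results after the overwrite, and the existing page reverts to its older version, until the next deep crawl updates the main index.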

At one time the Freshbot used internet addresses beginning with 64. and the Deepbot used addresses beginning with 216., but since around the middle of 2003 the Google robots have all come from machines in the 64.* or 66.* address ranges.
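As a rough illustration, the older address convention could be applied to logfile entries as follows. The function name and the sample addresses are hypothetical, and matching on the first octet alone is very coarse, since many machines that have nothing to do with Google share these prefixes. In practice the check would need to be combined with the robot's user-agent string.

```python
def classify_by_old_convention(ip):
    """Guess the robot from the first octet, using the pre-mid-2003 convention.

    Assumes the request is already known to come from a Google robot.
    """
    if ip.startswith("64."):
        return "freshbot"
    if ip.startswith("216."):
        return "deepbot"
    return "unknown"


# Hypothetical addresses, chosen only to exercise each branch.
for address in ("64.0.0.1", "216.0.0.1", "66.0.0.1"):
    print(address, classify_by_old_convention(address))
```

For logs written after the middle of 2003 this distinction no longer works, because both robots arrive from the 64.* and 66.* ranges.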

Figure 1: Google Robots

Figure 1 is taken from the logfiles of a website. It clearly illustrates a cycle of deep crawls with a smaller number of daily visits. The Googledance follows some time after the deep crawl, once the results have been processed.
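A chart of this kind can be produced by counting robot visits per day in a web server logfile. The sketch below assumes the common Apache combined log format and assumes that the robot identifies itself with "Googlebot" in the user-agent field. The sample lines are invented for illustration and are not the data behind Figure 1.

```python
import re
from collections import Counter

# Matches the date part of a timestamp such as [01/Jun/2003:10:00:00 +0000].
LOG_DATE = re.compile(r"\[(\d{2}/\w{3}/\d{4}):")


def googlebot_hits_per_day(lines):
    """Count log lines per day whose text mentions Googlebot."""
    counts = Counter()
    for line in lines:
        if "Googlebot" not in line:
            continue
        match = LOG_DATE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines in combined log format.
sample = [
    '64.0.0.1 - - [01/Jun/2003:10:00:00 +0000] "GET / HTTP/1.0" 200 512 "-" "Googlebot/2.1"',
    '64.0.0.1 - - [01/Jun/2003:10:05:00 +0000] "GET /a.html HTTP/1.0" 200 900 "-" "Googlebot/2.1"',
    '10.0.0.5 - - [01/Jun/2003:11:00:00 +0000] "GET / HTTP/1.0" 200 512 "-" "Mozilla/4.0"',
    '64.0.0.1 - - [02/Jun/2003:09:00:00 +0000] "GET / HTTP/1.0" 200 512 "-" "Googlebot/2.1"',
]

hits = googlebot_hits_per_day(sample)
for day, count in hits.items():
    print(day, count)
```

Plotted over several months, a deep crawl would show up as a few days with very high counts, and the Freshbot as a low, steady level of visits on the other days.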

