Google Sitemaps

Google has introduced a "Sitemap Protocol". This is basically an XML document that tells Google about the pages on your website, in particular how often they change and therefore how often they should be re-indexed. It has been suggested that these sitemaps can replace HTML sitemaps as a way for Google to index unlinked content. However, PageRank theory suggests that unlinked content will not be included in the Google index, unless Google is going to assign PageRank to sitemap links.

Google's robots can consume a lot of bandwidth re-indexing old content that rarely changes. The Sitemap Protocol seems like a good way of telling the Google robot what content should be scanned, and especially of making new pages, which may be deep within a site, available quickly. Google provides a sitemap tool (written in Python) that can scan a website (on the local machine) in order to build an initial sitemap file.
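
To give an idea of what such a file contains, here is a minimal sketch in Python. It is not Google's sitemap tool. The element names (urlset, url, loc, lastmod, changefreq, priority) and the namespace are those of the sitemaps.org 0.9 schema that the protocol was later standardized under; the build_sitemap helper and the example.com pages are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org 0.9 schema (an assumption for this sketch;
# Google's original 2005 release used its own schema URI).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of
    (loc, lastmod, changefreq, priority) string tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq, priority in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc                # page address
        ET.SubElement(url, "lastmod").text = lastmod        # last change date
        ET.SubElement(url, "changefreq").text = changefreq  # hint: how often it changes
        ET.SubElement(url, "priority").text = priority      # relative weight, 0.0 to 1.0
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: a busy home page and a rarely changing archive page.
pages = [
    ("http://www.example.com/", "2005-06-10", "daily", "1.0"),
    ("http://www.example.com/archive/2004.html", "2004-12-31", "yearly", "0.3"),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Marking the archive page as changing "yearly" is the kind of hint that could, in principle, save a robot from re-fetching it on every visit.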

[10 June 2005]

Google Indexes Stop Words

Stop Words are words that are so common that they have little relevance to a search. Google originally didn't index Stop Words, which saved resources. However, this policy has now changed.
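
As a rough illustration of the difference, the Python sketch below drops stop words before indexing, or keeps them when asked. The STOP_WORDS list and the index_terms function are hypothetical and say nothing about how Google actually implements its index.

```python
# A tiny, illustrative stop word list (an assumption; real lists are longer).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "or"}


def index_terms(text, keep_stop_words=False):
    """Split text into lowercase terms, optionally dropping stop words."""
    words = text.lower().split()
    if keep_stop_words:
        return words
    return [word for word in words if word not in STOP_WORDS]


phrase = "to be or not to be"
without_stop_words = index_terms(phrase)
with_stop_words = index_terms(phrase, keep_stop_words=True)
print(without_stop_words)
print(with_stop_words)
```

With stop words removed, a phrase made up largely of common words loses most of its terms, which is one reason indexing them can matter for exact phrase searches.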

See: Stop Words [10 May 2005]

Google Crosses 100K Barrier

Google used to index little more than the first 100K of a document. Recent tests on large files show that Google has now crossed this barrier and is indexing very large documents, certainly up to 500K in the tests we ran. The preference still seems to be for smaller documents though.

See: Search Engine Robots and Spiders [9 May 2005]

Google's Patent on Historical Data for Sorting Results

Google has recently filed a US patent that discusses using historical data for sorting search results. The idea is to filter spam results and give more prominence to new results excluded by the original PageRank algorithm.

Information Retrieval Based on Historical Data [8 April 2005]

Google's Sneaky Redirect

If you are using Internet Explorer and Google, did you know that any clicks you make on search engine results pages are automagically redirected via Google's search engine?

Google Sneaky Redirect [27 March 2005]

Case Study: The Power of Anchor Text

A poster on alt.internet.search-engines was searching Google using the keywords "surf shop":

http://www.google.com/search?en&q=surf+shop

and found the site: www.surf-shop.com. At first glance this site doesn't seem to be well optimized, but a combination of good anchor text and inbound links lets it beat 4.8 million other results.
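
To see what anchor text a site is collecting, the inbound links on a page can be gathered and tallied. The Python sketch below does this with the standard html.parser module. The AnchorTextCollector class and the sample HTML are made up for illustration, and the tally is not a model of how Google weighs links.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collect (href, anchor text) pairs from the <a> elements of a page."""

    def __init__(self):
        super().__init__()
        self.current_href = None  # href of the <a> being read, if any
        self.buffer = []          # text fragments seen inside that <a>
        self.anchors = []         # collected (href, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.current_href = dict(attrs).get("href")
            self.buffer = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            # Collapse runs of whitespace inside the anchor text.
            text = " ".join("".join(self.buffer).split())
            self.anchors.append((self.current_href, text))
            self.current_href = None


# Invented sample of a page that links to the site in the case study.
sample_html = (
    '<p><a href="http://www.surf-shop.com/">Surf Shop</a> and '
    '<a href="http://www.surf-shop.com/">surf  shop</a>, plus '
    '<a href="http://www.example.org/">another site</a></p>'
)

collector = AnchorTextCollector()
collector.feed(sample_html)

# Tally the anchor text of links that point at the target site.
anchor_counts = Counter(
    text.lower() for href, text in collector.anchors if "surf-shop.com" in href
)
print(anchor_counts)
```

Run over many linking pages, a tally like this shows which keywords other sites use when they link to you.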

Power of Anchor Text [26 March 2005]

Google Toolbar 3

Google has launched a beta of version 3 of its famous toolbar. The autolink feature, which identifies ISBN numbers and postal codes and automatically turns them into links, has caused some controversy.

Google Toolbar 3 [21 March 2005]

Google Cloaking

After Microsoft Cloaking comes Google Cloaking. Sharp-eyed surfers spotted that Google was keyword stuffing the TITLE element and cloaking results in its AdWords help section, although the cloaked content was only visible to the Googlebot. Google being evil or a storm in a tea-cup?

Google Cloaking [7 March 2005]

