Site Maps

A site map gives users a way to navigate directly to content within your website. It is particularly useful when a user arrives via a broken link, for example because a search engine or external site has been slow to update its inbound links after a change to the website. The web server can be configured to send any "Not Found" (HTTP 404) error to the site map.
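
As a rough illustration of the 404 idea, here is a minimal Python sketch written as WSGI middleware. The names `sitemap_on_404` and `demo_app` are made up for this example, and it assumes the wrapped application calls `start_response` before it returns its body. It serves the site map as the body of the error page while keeping the 404 status, which is one possible reading of "redirect"; most web servers can do the same job with a line of configuration.

```python
def sitemap_on_404(app, sitemap_html):
    """Wrap a WSGI app so that any 404 response carries the site map as its body.

    Assumes `app` calls start_response before returning (not lazily
    from inside a generator).
    """
    body = sitemap_html.encode("utf-8")

    def wrapped(environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Remember what the inner app wanted to send.
            captured["status"] = status
            captured["headers"] = headers
            captured["exc_info"] = exc_info

        result = app(environ, capture)

        if captured["status"].startswith("404"):
            # Discard the original error body and send the site map instead.
            if hasattr(result, "close"):
                result.close()
            start_response("404 Not Found", [
                ("Content-Type", "text/html; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ])
            return [body]

        # Any other response passes through unchanged.
        start_response(captured["status"], captured["headers"],
                       captured["exc_info"])
        return result

    return wrapped


def demo_app(environ, start_response):
    """Hypothetical site with a single real page at "/"."""
    if environ.get("PATH_INFO") == "/":
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"home"]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"missing"]
```

Keeping the 404 status, rather than answering with a 200 or a redirect, tells search engines that the old URL is gone while still giving the human visitor somewhere useful to go.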

Site maps have a potentially important role in search engine optimization. They overcome problems with dynamic content that some search engines have trouble navigating.

Some large websites have deep hierarchies of links. Search engines take time to traverse (spider) these hierarchies and index all the pages in a website. A site map can provide deep links to all the content on the website. For efficient indexing, no content should be more than two links away from the home page, so the home page should link directly to the site map.

The site map should be written in plain HTML. Leave natty animated site maps written in JavaScript, Shockwave or other technologies to deep-pocketed "corporates" who design their sites principally to impress the board. The site map is also another place to use relevant anchor text.

Every page on the site should link to the home page; this means that every page is only a couple of clicks away from the site map. Search engines that find your site through a deep link then have a mechanism to explore the rest of the site.
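
The "no more than two links from the home page" rule can be checked mechanically. The Python sketch below assumes you already have the site's internal links as a dictionary mapping each page to the pages it links to; the function names and the example URLs are invented for illustration. It uses a breadth-first search to find the smallest number of clicks needed to reach each page.

```python
from collections import deque


def click_depths(links, home):
    """Return {page: fewest clicks from home} using breadth-first search."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_too_deep(links, home, max_depth=2):
    """List reachable pages that are more than max_depth clicks from home."""
    depths = click_depths(links, home)
    return sorted(page for page, depth in depths.items() if depth > max_depth)


# Hypothetical site: a chain of category pages, three levels deep.
site = {
    "/": ["/products"],
    "/products": ["/products/widgets"],
    "/products/widgets": ["/products/widgets/blue"],
}

# The same site with a site map linked from the home page.
site_with_map = dict(site)
site_with_map["/"] = ["/products", "/sitemap.html"]
site_with_map["/sitemap.html"] = [
    "/", "/products", "/products/widgets", "/products/widgets/blue",
]

print(pages_too_deep(site, "/"))           # ['/products/widgets/blue']
print(pages_too_deep(site_with_map, "/"))  # []
```

Pages that are not reachable at all do not appear in the result, so a fuller check would also compare the keys of `click_depths` against a complete list of the site's pages.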

Google advises against putting more than 100 links on a page (many SEOs advise no more than 50) and against pages of more than 100 Kbytes in length. It seems that Google will index more than 100 links, but it may take repeated visits to the page. Some search engines may not spider to the end of a long page.

This poses a problem for large sites. First of all, the site map should be more than just a series of links. Use headings (H1, H2, H3 etc.) to divide the links into themed areas, and use your topic keywords in those headings. If your site is well structured, this will probably follow the existing themes. If the map grows too large, place topic areas into separate sub-site maps. This increases the depth of the site map, so the more important links should stay on the main map, but it is better to add depth to the site map than to risk search engines giving up because of the quantity of links.
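
One way to apply this is to generate the site map from a list of topics and split it automatically once it passes the link limit. The following Python sketch is only an illustration: the function names, the file names (`sitemap.html`, `sitemap-1.html` and so on) and the default limit of 100 links are assumptions, and it does not handle a single topic that is itself over the limit.

```python
from html import escape


def render_map(title, sections):
    """Render plain HTML: an H1 title, then an H2 and a link list per section.

    sections is a list of (heading, [(url, anchor_text), ...]) pairs.
    """
    parts = [f"<h1>{escape(title)}</h1>"]
    for heading, links in sections:
        parts.append(f"<h2>{escape(heading)}</h2>")
        parts.append("<ul>")
        for url, text in links:
            parts.append(f'  <li><a href="{escape(url)}">{escape(text)}</a></li>')
        parts.append("</ul>")
    return "\n".join(parts)


def build_site_maps(topics, max_links=100):
    """Return {file_name: html}. Split into sub-site maps when over the limit.

    topics maps a themed heading to its list of (url, anchor_text) links.
    """
    total = sum(len(links) for links in topics.values())
    if total <= max_links:
        # Small site: one page, one heading per topic.
        return {"sitemap.html": render_map("Site Map", list(topics.items()))}

    pages = {}
    index_links = []
    for number, (heading, links) in enumerate(topics.items(), start=1):
        name = f"sitemap-{number}.html"
        pages[name] = render_map(f"Site Map: {heading}", [(heading, links)])
        index_links.append((name, heading))
    # The main map now links to one sub-site map per topic.
    pages["sitemap.html"] = render_map("Site Map", [("Topics", index_links)])
    return pages
```

In practice you would probably keep the most important links on the main map as well as in the sub-site maps, as suggested above; the sketch leaves that choice out to stay short.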

Outbound Links and PageRank Dilution

The site map is a good place from which to link to reciprocal link pages. These hold general external links that are of use to anyone visiting the site. You may still want to include some outbound links within your content pages, but these should be reserved for excellent and highly relevant information that complements the content.

There are two reasons for doing this. First, it seems logical to group the maps of external and internal links together, with the external links in a separate hierarchy that hangs off the site map. Second, it limits PageRank dilution without resorting to sneaky tricks such as non-spiderable link pages or JavaScript cloaking. The section on PageRank showed that outbound links reduce the amount of PageRank available to the site. A site map generally has a large number of links within the site; it may have a high PageRank, but this is well distributed to pages even deep within the site's hierarchy. Adding one more link, to a page of reciprocal links, does little to upset that distribution.

Figure 1: Site maps and PageRank dilution

Figure 1 shows this graphically. It is a small site with one inbound link and a links page with a single cross-link. If the links page had been linked straight from the home page, the PageRank available for internal distribution would have been reduced considerably.
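
The same effect can be reproduced on a toy example. The Python sketch below is not the site in Figure 1: the page names and link layouts are invented, and it assumes the commonly quoted simplified formula PR(p) = (1 - d) + d * sum(PR(q) / C(q)) with a damping factor d of 0.85, where C(q) is the number of links on page q. It compares the PageRank retained by the internal pages when the links page hangs off the home page with the case where it hangs off the site map.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iterate the simplified PageRank formula over a {page: [targets]} graph."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    rank = {page: 1.0 for page in pages}
    for _ in range(iterations):
        new_rank = {page: 1.0 - damping for page in pages}
        for page, targets in links.items():
            if not targets:
                continue
            # Each page shares its damped rank equally among its links.
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


INTERNAL = {"home", "a", "b", "sitemap", "links"}

# Layout 1: the links page is linked straight from the home page.
LINKS_FROM_HOME = {
    "ext_in": ["home"],                 # the single inbound link
    "home": ["a", "b", "sitemap", "links"],
    "a": ["home"],
    "b": ["home"],
    "sitemap": ["home", "a", "b"],
    "links": ["home", "ext_out"],       # the single cross-link out
    "ext_out": ["links"],               # ...and its reciprocal link back
}

# Layout 2: the links page is linked from the site map instead.
LINKS_FROM_SITE_MAP = dict(LINKS_FROM_HOME)
LINKS_FROM_SITE_MAP["home"] = ["a", "b", "sitemap"]
LINKS_FROM_SITE_MAP["sitemap"] = ["home", "a", "b", "links"]


def internal_total(links):
    """Sum the PageRank held by the site's own pages."""
    rank = pagerank(links)
    return sum(rank[page] for page in INTERNAL)


print("links page from home:    ", round(internal_total(LINKS_FROM_HOME), 3))
print("links page from site map:", round(internal_total(LINKS_FROM_SITE_MAP), 3))
```

In this toy graph the second layout keeps more PageRank inside the site, because the links page receives a smaller share to begin with and so passes less of it to the external site. The size of the difference depends entirely on the made-up link structure.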

Google Sitemaps

In June 2005, Google released a sitemap protocol. This is an XML file format that describes the resources on a site. It can tell Google when a resource was last modified, and give hints as to how important the resource is and how frequently the robot should index it.
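
To give a feel for the format, here is a minimal Python sketch that writes such a file using only the standard library. It assumes the `http://www.sitemaps.org/schemas/sitemap/0.9` namespace used by later versions of the protocol (the original 2005 release used a Google-hosted schema), and the function name `build_sitemap` and the example URLs are made up. Each `url` entry has a required `loc` and optional `lastmod`, `changefreq` and `priority` hints.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemap protocol, version 0.9 (an assumption, see above).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from a list of dicts.

    Each dict needs "loc" and may carry "lastmod", "changefreq", "priority".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    xml_body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + xml_body


print(build_sitemap([
    {
        "loc": "http://www.example.com/",
        "lastmod": "2005-06-03",     # date the page last changed
        "changefreq": "monthly",     # hint: how often it tends to change
        "priority": 0.8,             # hint: importance relative to other pages
    },
    {"loc": "http://www.example.com/catalog.php?id=7"},
]))
```

The `changefreq` and `priority` values are hints only; the search engine is free to ignore them.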

Helpfully, Google also released a sitemap tool, although, less helpfully, it is written in the Python programming language. The sitemap tool can take a list of URLs from a variety of sources, including Apache-format log files, a local directory (either on the web server itself or on a staging server), a list of URLs in a text file, or specific URLs passed to the program via a configuration file. The program then produces a sitemap file in the correct format.
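
The log-file route is easy to picture. The Python sketch below is not Google's tool, only an illustration of the idea: it assumes log lines in the Apache common log format, keeps requests that were answered with status 200, and returns each URL once. The function name and the example base URL are invented.

```python
import re

# Matches the request and status fields of an Apache common log line, e.g.
#   "GET /index.html HTTP/1.0" 200 2326
REQUEST = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" (\d{3})\b')


def urls_from_access_log(lines, base_url):
    """Return the distinct URLs served successfully, in first-seen order."""
    seen = set()
    urls = []
    for line in lines:
        match = REQUEST.search(line)
        if match is None:
            continue  # not a GET/HEAD request, or not a parsable line
        path, status = match.groups()
        if status != "200":
            continue  # skip errors and redirects
        if path not in seen:
            seen.add(path)
            urls.append(base_url.rstrip("/") + path)
    return urls
```

A typical use would be to open the access log and pass the file object straight in, for example `urls_from_access_log(open("access.log"), "http://www.example.com")`, and then feed the result to whatever writes the sitemap file.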

Google claims that the SEO implications of sitemaps are neutral. It suggests that a sitemap can be used to tell the robot about files that cannot easily be reached by following links on the site. It would also seem useful for stopping the robot from re-indexing pages that change very infrequently, thus saving bandwidth on the server. However, pages with no inbound links will not feature very highly in search results, except for very obscure searches.

The protocol is currently supported only by Google. However, if it enables Google to index more documents, particularly useful content that is hard for its robot to find, other search engines may have to follow suit.
