
Traffic Analysis

Every time someone directly accesses a page on your website, your server writes a line of information to a log file. If you are serious about search engine optimization, these log files are mines of information: they can show you which search engine robots are visiting your site, which pages are causing problems and which keywords people used to find your pages.

Web server log files come in a few standard formats, which means that a wide range of software, both free and commercial, is available to read and analyze the information. Here is an example of a single entry in a log file:


64.203.3.83 - 64.203.3.83.128711091328144596 [01/Aug/2004:03:42:56 +0100] "GET /index.htm HTTP/1.1" 200 15101 "http://www.blogspot.com/widgetblog/index.htm" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

This shows the numeric IP address of the computer requesting the page, the time of the request (server time, followed by its offset from Greenwich Mean Time, here +0100), the request type and the file name. Next come the HTTP status code returned by the server (200 is good, 301 and 302 are redirects, and 404 is an error, usually caused by a broken link), the number of bytes downloaded, the URL of the page the user was viewing prior to this request (the referrer) and the user agent, which identifies the web browser type and operating system.
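To make the field layout concrete, here is a minimal Python sketch that splits the example entry into named fields. It assumes every line follows exactly the layout of the entry above; the pattern, the field names and the `parse_entry` function are illustrative choices, not part of any standard tool. The third field, which the description above does not cover, is simply kept as opaque text.

```python
import re
from datetime import datetime

# Assumed layout, taken from the example entry: IP, a dash, an opaque
# third field, [timestamp], "request", status, bytes, "referrer", "user agent".
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<third_field>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_entry(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # %z reads the "+0100" offset, so the result is a timezone-aware datetime.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

sample = ('64.203.3.83 - 64.203.3.83.128711091328144596 '
          '[01/Aug/2004:03:42:56 +0100] "GET /index.htm HTTP/1.1" 200 15101 '
          '"http://www.blogspot.com/widgetblog/index.htm" '
          '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"')

entry = parse_entry(sample)
print(entry["ip"], entry["path"], entry["status"], entry["size"])
print(entry["referrer"])
```

Lines that do not fit the pattern return `None` rather than raising an error, so a real log with the occasional malformed entry can still be processed.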

This entry comes from the log file of a site that I manage. Each month the site generates close to one million log entries. It can sometimes be useful to look at raw log files, but an individual entry usually doesn't tell us a great deal. What is interesting are trends over time, and for that, log file analysis software is essential.
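As a rough illustration of what such software does, the following Python sketch counts requests and 404 errors per day. It is a toy under stated assumptions, not a substitute for a real analysis package: the `daily_summary` function and its pattern are hypothetical, and the three sample entries are made up for the example.

```python
import re
from collections import Counter

# Pulls only the day and the status code out of each entry; assumes the
# "[dd/Mon/yyyy:...] "request" status" layout of the example entry.
DAY_AND_STATUS = re.compile(r'\[(\d{2}/\w{3}/\d{4}):[^\]]*\] "[^"]*" (\d{3}) ')

def daily_summary(lines):
    """Count total requests and 404 responses for each day."""
    requests = Counter()
    not_found = Counter()
    for line in lines:
        match = DAY_AND_STATUS.search(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        day, status = match.group(1), match.group(2)
        requests[day] += 1
        if status == "404":
            not_found[day] += 1
    return requests, not_found

# Made-up entries, for illustration only.
sample_lines = [
    '192.0.2.1 - - [01/Aug/2004:03:42:56 +0100] "GET /index.htm HTTP/1.1" 200 15101 "-" "ExampleBrowser/1.0"',
    '192.0.2.2 - - [01/Aug/2004:04:10:02 +0100] "GET /old.htm HTTP/1.1" 404 512 "-" "ExampleBrowser/1.0"',
    '192.0.2.1 - - [02/Aug/2004:09:15:30 +0100] "GET /index.htm HTTP/1.1" 200 15101 "-" "ExampleBot/1.0"',
]

requests, not_found = daily_summary(sample_lines)
for day in requests:
    print(day, "requests:", requests[day], "404s:", not_found[day])
```

A sudden rise in the daily 404 count is the kind of trend that a single raw entry cannot reveal.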

Search Engine Optimization Book            
