Referrer Spam
Referrer spam shows just how ingenious black-hatters can be in finding ways to manipulate search engine rankings. When someone clicks on a hyperlink, their browser requests the new web page. As part of that request, made over HTTP (HyperText Transfer Protocol), the browser sends the web address (URL) of the page containing the hyperlink. This address is called the Referrer (the HTTP header itself is spelled "Referer"). The web server hosting the destination page logs this address, which is useful for traffic analysis, for example to judge the effectiveness of inbound links.
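To make the logging step concrete, here is a minimal Python sketch that pulls the Referrer out of one access-log line. It assumes the server writes the common "combined" log format; the sample entry, the address www.example.org and the function name extract_referrer are made up for illustration and do not come from any particular server.

```python
import re

# "Combined" access-log format: the last two quoted fields are the
# Referer header and the User-Agent header sent by the browser.
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def extract_referrer(line):
    """Return the Referer value from one combined-format line, or None."""
    match = COMBINED.match(line.strip())
    if match is None:
        return None  # line is not in the assumed format
    referer = match.group("referer")
    # Servers write "-" when the browser sent no Referer header.
    return None if referer in ("", "-") else referer

# Hypothetical log entry, invented for this example.
sample = (
    '203.0.113.7 - - [05/Apr/2006:13:06:55 +0000] '
    '"GET /index.html HTTP/1.1" 200 5120 '
    '"http://www.example.org/links.html" "Mozilla/5.0"'
)
print(extract_referrer(sample))
```

Note that the server records whatever value the client sends; nothing in the request proves that the user really followed a link from that page.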
There are a number of programs for analysing raw log files, such as AWStats and Webalizer. These produce web pages summarizing user access on a monthly basis. Unfortunately, many of these reports are publicly accessible and indexed by search engines, and with a little knowledge of their format it is possible to locate them. Searching Google for phrases that typically occur within reports such as:
"Generated by Webalizer" or "Created by awstats"
will return thousands of Webalizer and AWStats reports. It is easy to write a script that makes requests to websites with fake Referrer URLs. For example, using the popular cURL tool, whose -e option sets the Referrer, this would be:
C:\> curl.exe -e http://www.mySite.com/ http://www.targetSite.com/
Webalizer lists the top 25 Referrer URLs in its monthly statistics. The spammer merely has to bombard the site with enough requests to figure in this chart. Where the report shows the Referrer as a hyperlink, this creates an inbound link to the spammer's site, containing keywords, which boosts its PageRank or Web Rank. Some of these log pages have surprisingly high Google PageRanks.
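The effect of the bombardment on a "top referrers" chart can be sketched in a few lines of Python. This is only an illustration of ranking by request count, not Webalizer's actual implementation; the function name top_referrers and the traffic figures are invented.

```python
from collections import Counter

def top_referrers(referrers, limit=25):
    """Rank Referrer URLs by how many requests carried them."""
    counts = Counter(r for r in referrers if r)  # skip empty/missing values
    return counts.most_common(limit)

# Hypothetical month of traffic: two genuine referring pages...
genuine = (
    ["http://www.example.org/links.html"] * 40
    + ["http://blog.example.net/post"] * 15
)
# ...and one spammer repeating the same fake Referrer many times.
spam = ["http://www.mySite.com/"] * 500

for url, hits in top_referrers(genuine + spam, limit=3):
    print(hits, url)
```

Because the chart is driven purely by counts, a few hundred scripted requests are enough to outrank genuine referring pages on a low-traffic site.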
The technique is definitely black-hat. It manipulates search engine rankings by creating what are in effect fake inbound links. It subverts the HTTP Referrer mechanism. It clogs log files with bogus information and it consumes resources on the target web-server.
Black-hatters may counter that it is up to server administrators to protect against this form of spam, and there is some truth in that. There is usually no good reason to have log files publicly viewable. The log files and reports should be password protected and preferably not visible to the Internet. Webmasters can also use a robots.txt file to stop search engines from indexing their logs. In any case, log reports have many outbound links on a single page, so the overall benefit of each link is limited.
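A webmaster can check that a robots.txt rule really covers the report pages with Python's standard urllib.robotparser module. The sketch below is illustrative only: the /stats/ directory and the report file name are assumptions about where the reports live, and is_blocked is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

def is_blocked(robots_txt, url, agent="*"):
    """Return True if robots_txt tells the given crawler not to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)

# Assumed layout: log reports are published under /stats/.
ROBOTS = """\
User-agent: *
Disallow: /stats/
"""

print(is_blocked(ROBOTS, "http://www.targetSite.com/stats/usage_200604.html"))
print(is_blocked(ROBOTS, "http://www.targetSite.com/index.html"))
```

A robots.txt file only asks well-behaved crawlers to stay away and does nothing to stop a visitor who knows the address, so password protection remains the stronger measure.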
©1994-2005 All text and images copyright: www.abcseo.com; last updated: Wed Apr 5 13:06:55 2006
