Agent Delivery

Cloaking is performed on the Web server. The web page is actually a script. The following code excerpt is written in PHP, a popular Web scripting language. It demonstrates the basics of creating a cloaked page.

When a browser or spider connects to a Web server across the Internet it uses a protocol called the Hypertext Transfer Protocol, or more commonly HTTP. A protocol is a set of rules governing a conversation. As part of the protocol the client sends some information including its address, the type of browser (called the user agent), and the URL of the previous page it was visiting, if any (the referrer).
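A minimal sketch of such a script might look like the following. It assumes the spider and visitor versions of the page live in two hypothetical files, spider.html and visitor.html; the function names is_spider and choose_page are likewise illustrative only.

```php
<?php
// Minimal user-agent cloaking sketch. The function names and the two
// page files are hypothetical placeholders used for illustration.

/**
 * Return true when the User-Agent header looks like a search engine spider.
 * The test is deliberately naive: any agent containing "bot" matches.
 */
function is_spider(string $userAgent): bool
{
    // stripos() is case-insensitive, so "Googlebot" and "msnbot" both match.
    return stripos($userAgent, 'bot') !== false;
}

/** Pick the file to serve for a given user agent string. */
function choose_page(string $userAgent): string
{
    return is_spider($userAgent) ? 'spider.html' : 'visitor.html';
}

// The header may be absent, so fall back to an empty string.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
$page = choose_page($userAgent);

// Send the chosen version to the client if the file exists.
if (is_readable($page)) {
    readfile($page);
}
```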
If the User Agent contains the string "bot" (as in Googlebot, MSNbot etc.) then we can assume the client is a spider indexing our site and serve a search engine friendly page; otherwise we serve the end-user version of the page. The search engine friendly version could contain a number of spam techniques such as keyword stuffing.

This is quite a simplistic example and is fairly easy to detect. For example, the version of the page held in the Yahoo! or Google cache would differ from that served to the end user. Someone could also connect to the site using the cURL utility with the user agent set to that of a spider.
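To stay in PHP, the following sketch makes the same kind of request with PHP's cURL extension rather than the command-line utility (where the -A option sets the user agent). The function names, the agent strings and the script name fetch.php are assumptions made for illustration. Differing responses are only a hint of cloaking, since dynamic pages can vary between requests anyway.

```php
<?php
// Sketch: request a page while claiming to be a spider, via PHP's cURL
// extension. Function names and agent strings are illustrative assumptions.

/** Build cURL options that send the given User-Agent header. */
function agent_options(string $userAgent): array
{
    return [
        CURLOPT_USERAGENT      => $userAgent, // value of the User-Agent header
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
    ];
}

/** Fetch $url with the given user agent; returns the body, or false on failure. */
function fetch_as(string $url, string $userAgent)
{
    $handle = curl_init($url);
    curl_setopt_array($handle, agent_options($userAgent));
    return curl_exec($handle);
}

// Hypothetical usage from the command line: php fetch.php http://www.example.com/
$url = $_SERVER['argv'][1] ?? null;
if ($url !== null) {
    $asSpider  = fetch_as($url, 'Googlebot/2.1');
    $asBrowser = fetch_as($url, 'Mozilla/4.0 (compatible; MSIE 6.0)');
    // Compare what a "spider" and a "browser" receive from the same URL.
    echo $asSpider === $asBrowser ? "same response\n" : "responses differ\n";
}
```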
Such a request would return the cloaked version of the page.

A more sophisticated cloaking scheme would also check the client IP address. For example, recently the Googlebot has been using IP addresses in the range 64.68.80.01 to 64.68.87.254. After determining the User Agent the cloaked page could then see if the client address falls within this range. Robots also normally send a blank Referrer field. The IP check requires that the cloaker maintains an accurate and up-to-date list of the addresses used by spiders and also has a list of exceptions: we wouldn't want to serve a cloaked page to the Google web page translation service, for example.

We should also keep our cloaked page from being cached by including one of the meta tags that suppress caching in the Head section of the cloaked version of the web page. This may be a clue in itself that the page is cloaked.

Google Cloaking

[Update 7 March 2005] Search engine marketeers were recently up in arms when they found that Google was cloaking its own pages related to the AdWords program. Sharp-eyed surfers spotted that the titles of pages served to end users and those held in the cache were different. The cloaked pages had keyword-stuffed titles. There was some debate as to whether this was spam, as the keywords related to the content, although strictly spam is defined as overt repetition. Google had also put up a robots.txt file so the pages would only be indexed by the Google robot, not other search engines. Nevertheless, after a few days Google removed the pages.
©1994-2006 All text and images copyright: www.abcseo.com; last updated: