
Document Creation Date Ranking

Several techniques are discussed for determining a document's creation (or inception) date, including the time the search engine first discovers the document, whether by link discovery, crawling or direct submission (section 36). Other techniques use the date the document's domain was registered, or dates on blogs, news items or mailing list postings that reference the document (section 37). Gmail, Google News and Google Print could also scan documents for hyperlinks and would presumably form a part of this strategy. The creation date may also be set at the point where the document contains a certain threshold of information, or may be retrieved from web server metadata.

The staleness of a document may be based on factors such as document creation date, anchor growth, traffic, content change and forward/back link growth.

Link Growth Velocity Scoring

Google has a lot of information about the velocity at which documents naturally acquire inbound links. A recent document that has an unnaturally high number of inbound links, or spiky link growth, could be the subject of spam or Google bombing. In the normal situation a new document will have fewer inbound links than an older document but may contain fresher information. The new document's PageRank is in effect penalized by its age, so it could be given a temporary boost.

Link growth velocity rather than total links could also be used to indicate document popularity but would have to be used in conjunction with other factors to avoid spam.
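As a rough illustration of link growth velocity, the sketch below derives per-period growth from cumulative inbound-link counts and flags a spike when the latest period far exceeds the typical earlier growth. The function names, the use of the median as a baseline and the factor of 5 are all assumptions made for this example; the patent does not specify them.

```python
from statistics import median


def link_velocity(cumulative_links):
    """Return per-period link growth from cumulative inbound-link counts."""
    return [later - earlier
            for earlier, later in zip(cumulative_links, cumulative_links[1:])]


def is_spiky(cumulative_links, factor=5.0):
    """Flag growth as spiky if the latest period exceeds `factor` times
    the median of the earlier periods (hypothetical rule, not from the patent)."""
    growth = link_velocity(cumulative_links)
    if len(growth) < 2:
        return False  # not enough history to judge
    baseline = median(growth[:-1])
    # Use a floor of 1 so a flat history does not make every new link a spike.
    return growth[-1] > factor * max(baseline, 1)
```

A signal like this would only be one input among several, since steady growth can also be manufactured.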

It is likely that Google already uses this information in its algorithm, and that this results in the Sandbox and Reverse Sandbox effects. The factors in the patent have already been widely discussed by search engine optimizers.

The Google researchers actually suggest a formula for link velocity scoring:

H=PR ÷ log(F+2)

The PageRank figure (PR) is divided by the log of F plus 2, where F is the time elapsed since the document's inception date. The 2 keeps the logarithm positive for a brand new document, where F is zero. For a given PageRank, newer documents will have a higher score.
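The formula can be tried out with a few lines of Python. This is a minimal sketch: the function name is made up, and the natural logarithm and a document age measured in months are assumptions, since neither the log base nor the time unit is stated here.

```python
import math


def history_adjusted_score(pagerank, age):
    """Compute H = PR / log(F + 2).

    `age` is F, the time elapsed since the document's inception date
    (assumed to be in months for this example).
    """
    if age < 0:
        raise ValueError("age must be non-negative")
    # log(age + 2) is always positive, so the division is safe even at age 0.
    return pagerank / math.log(age + 2)


# Two documents with the same PageRank but different ages.
new_doc = history_adjusted_score(5.0, 1)    # one month old
old_doc = history_adjusted_score(5.0, 24)   # two years old
print(new_doc > old_doc)  # the newer document scores higher
```

Because the logarithm grows slowly, the age adjustment is strongest in a document's first few periods and flattens out afterwards.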

In some cases the score will be adjusted relative to the average age of all the documents in a result set (section 43).

