Detecting Near-Replicas on the Web by Content and Hyperlink Analysis
Near-replicas of documents are very common on the Web. Documents may be replicated completely or partially for different reasons (versions, mirrors, etc.), or the same resource can be associated with different URLs (dynamically generated pages, etc.). Whilst replication can improve the accessibility of information for users, the presence of near-replicated documents can hinder the effect....