Measuring the Search Effectiveness of a Breadth-First Crawl
Author: | Dennis Fetterly, Nick Craswell, Vishwa Vinay |
---|---|
Year of publication: | 2009 |
Subject: | |
Source: | Lecture Notes in Computer Science ISBN: 9783642009570 ECIR |
DOI: | 10.1007/978-3-642-00958-7_35 |
Description: | Previous scalability experiments found that early precision improves as collection size increases. However, that was under the assumption that a collection's documents are all sampled with uniform probability from the same population. We contrast this with a large breadth-first web crawl, an important scenario in real-world Web search, where the early documents have quite different characteristics from the later documents. Having observed that NDCG@100 (measured over a set of reference queries) begins to plateau in the initial stages of the crawl, we investigate a number of possible reasons for this behaviour. These include the web pages themselves, the metric used to measure retrieval effectiveness, and the set of relevance judgements used. |
Database: | OpenAIRE |