Robust query rewriting using anchor data

Authors: Nick Craswell, Dennis Fetterly, Bodo Billerbeck, Marc Najork
Year of publication: 2013
Subject:
Source: WSDM
DOI: 10.1145/2433396.2433440
Description: Query rewriting algorithms can be used as a form of query expansion, by combining the user's original query with automatically generated rewrites. Rewriting algorithms bring linguistic datasets to bear without the need for iterative relevance feedback, but most studies of rewriting have used proprietary datasets such as large-scale search logs. By contrast, this paper uses readily available data, particularly ClueWeb09 link text with over 1.2 billion anchor phrases, to generate rewrites. To avoid overfitting, our initial analysis is performed using Million Query Track queries, leading us to identify three algorithms that perform well. We then test the algorithms on Web and newswire data. Results show good properties in terms of robustness and early precision.
Database: OpenAIRE