Indian Statistical Institute at INEX 2008 Adhoc Track

Authors: Aparajita Sen, Samaresh Maiti, Sukomal Pal, Sukanya Mitra, Debasis Ganguly, Mandar Mitra, Ayan Bandyopadhyay
Publication year: 2009
Subject:
Source: Lecture Notes in Computer Science ISBN: 9783642037603
INEX
DOI: 10.1007/978-3-642-03761-0_9
Description: This paper describes the work that we did at Indian Statistical Institute towards XML retrieval for INEX 2008. Besides the Vector Space Model (VSM) that we have been using since INEX 2006, this year we implemented the Language Modeling (LM) approach in our text retrieval system (SMART) to retrieve XML elements against the INEX Adhoc queries. As we did last year, we considered Content-Only (CO) queries and submitted three runs for the FOCUSED sub-task. Two runs are based on the Vector Space Model and one uses the Language Model. One of the VSM-based runs (VSMfbElts0.4) retrieves sub-document-level elements. The other two runs (VSMfb and LM-nofb-0.20) retrieve elements only at the whole-document level. We applied blind feedback to both VSM-based runs; no query expansion was used in the LM-based run. In general, the relative performance of our document-level runs is respectable (ranked 15/61 and 22/61 according to the official metric). Though our element retrieval run performs reasonably well on the early-precision metrics (ranked 16/61 by iP[0.01]), we think there is plenty of scope to improve our element retrieval strategy. Our immediate next task is therefore to focus on how to improve true element-level retrieval.
Database: OpenAIRE