NII Technical Report (NII-2006-012E): Building a Terabyte-scale Web Data Collection 'NW1000G-04' in the NTCIR-5 WEB Task

Authors: Takaku, Masao; Oyama, Keizo; Aizawa, Akiko; Ishikawa, Haruko; Minamide, Kengo; Kato, Shin; Yamana, Hayato; Hayashi, Junya
Language: English
Publication year: 2006
Subject:
Source: NIIテクニカル・レポート, pp. 1-8
ISSN: 1346-5597
Description: We built a terabyte-scale web data collection, NW1000G-04, which was used in the NTCIR-5 WEB task. This paper describes the process of building the collection and presents detailed statistics on it.
Database: OpenAIRE