Measuring the similarity of short texts by word similarity and tree kernels

Authors: Haisheng Li, Yun Tian, Qiang Cai, Shouxiang Zhao
Year of publication: 2010
Subject:
Source: 2010 IEEE Youth Conference on Information, Computing and Telecommunications.
DOI: 10.1109/ycict.2010.5713120
Description: This paper presents a novel modeling method for measuring the similarity between short texts. We argue that the complete meaning of a sentence or short text depends not only on its words but also on its syntactic structure, so the method takes both word-similarity features and syntactic features into account. The proposed method can be used in a variety of applications, such as automatic document summarization and text knowledge representation and discovery. Experiments on two different data sets show that the proposed method performs better than the measure proposed by Li et al.
Database: OpenAIRE