Pattern Dictionary Development Based on Non-compositional Language Model for Japanese Compound and Complex Sentences.

Authors: Ikehara, Satoru; Tokuhisa, Masato; Murakami, Jin'ichi; Saraki, Masashi; Miyazaki, Masahiro; Ikeda, Naoshi
Source: Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead; 2006, p509-519, 11p
Abstract: A large-scale sentence pattern dictionary (SP-dictionary) for Japanese compound and complex sentences has been developed. The dictionary was compiled on the basis of the non-compositional language model. Sentences with 2 or 3 predicates are extracted from a Japanese-to-English parallel corpus of 1 million sentences, and the compositional constituents contained within them are generalized to produce an SP-dictionary containing a total of 215,000 pattern pairs. In evaluation tests, the SP-dictionary achieved a syntactic coverage of 92% and a semantic coverage of 70%. [ABSTRACT FROM AUTHOR]
Database: Complementary Index