Authors: |
Ikehara, Satoru; Tokuhisa, Masato; Murakami, Jin'ichi; Saraki, Masashi; Miyazaki, Masahiro; Ikeda, Naoshi |
Source: |
Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead; 2006, p509-519, 11p |
Abstract: |
A large-scale sentence pattern dictionary (SP-dictionary) for Japanese compound and complex sentences has been developed. The dictionary has been compiled based on the non-compositional language model. Sentences with 2 or 3 predicates are extracted from a Japanese-to-English parallel corpus of 1 million sentences, and the compositional constituents contained within them are generalized to produce an SP-dictionary containing a total of 215,000 pattern pairs. In evaluation tests, the SP-dictionary achieved a syntactic coverage of 92% and a semantic coverage of 70%. [ABSTRACT FROM AUTHOR] |
Database: |
Complementary Index |