Construction of the corpus of everyday Japanese conversation : An interim report
Author: | Koiso, Hanae, Den, Yasuharu, Iseki, Yuriko, Kashino, Wakako, Kawabata, Yoshiko, Nishikawa, Ken’ya, Tanaka, Yayoi, Usuda, Yasuyuki |
---|---|
Language: | English |
Publication year: | 2018 |
Subject: | |
Source: | Proceedings of the LREC 2018 Special Speech Sessions. :29-29 |
Description: | LREC 2018 Special Speech Sessions "Speech Resources Collection in Real-World Situations"; Phoenix Seagaia Conference Center, Miyazaki; 2018-05-09; application/pdf; National Institute for Japanese Language and Linguistics; Chiba University/National Institute for Japanese Language and Linguistics. In 2016, we launched a new corpus project to build a large-scale, balanced corpus of everyday Japanese conversation, with the aim of exploring the characteristics of conversation in contemporary Japanese through multiple approaches. The corpus targets various kinds of naturally occurring conversations in daily situations, such as conversations during dinner with the family at home, meetings with colleagues at work, and conversations while driving. In this paper, we first give an overview of the corpus, including its size, conversation variations, recording methods, structure, and planned annotations. Next, we report on the current stage of corpus development and on the legal and ethical issues discussed so far. We then present some results of a preliminary evaluation of the data being collected. We focus on whether the 94 hours of conversations collected so far are balanced in their variation, using as a reference the results of a survey of everyday conversational behavior that we conducted earlier to build an empirical foundation for the corpus design. We will publish the whole corpus, consisting of more than 200 hours of recordings, in 2022. |
Database: | OpenAIRE |
External link: |