Database of Figurative Expressions with Indicators from the 'Balanced Corpus of Contemporary Written Japanese'

Author: Kato, Sachi; Kikuchi, Rei; Asahara, Masayuki
Language: Japanese
Year of publication: 2020
Source: 自然言語処理 (Journal of Natural Language Processing). 27(4):853-887
ISSN: 1340-7619
Description: application/pdf
Mejiro University
Chuo University
National Institute for Japanese Language and Linguistics
A database of figurative expressions was constructed from the Balanced Corpus of Contemporary Written Japanese (BCCWJ), with the goal of understanding how figurative expressions are actually used in Japanese. Using the 359 types of figurative expression indicators listed in 'A Stylistic Study of the Figurative' (Hiyuhyogen-no Riron-to Bunrui) as clues, candidate expressions were expanded by checking synonymous examples against the 'Word List by Semantic Principles', and a total of 822 expressions were manually extracted from the 1,290,060 words in the six registers of the core data (Yahoo! Answers, white papers, Yahoo! Blog, books, magazines, and newspapers). Each extracted example was annotated with its vehicle and topic and their Word List by Semantic Principles labels, as well as with type categories such as personification, objectification, biomimicry, and substantiation. Examples were also labeled with categories such as synecdoche, metonymy, contextual metaphor, and idiomatic expression. The annotation above was carried out by linguists. To evaluate how non-experts perceive these figurative expressions, each example was additionally rated on five aspects (figurativeness, novelty, comprehensibility, personification, and substantiation) by 22 to 77 non-experts (average: 33). The usage trends of these indicator-marked figurative expressions in contemporary Japanese were then examined based on their relative frequency in each register and the distribution of their rating values.
日本語の比喩表現の実態把握を目的として,『現代日本語書き言葉均衡コーパス』に基づく指標比喩データベースを構築した.『比喩表現の理論と分類』に掲載されている359種類の比喩指標要素を手掛かりとし,『分類語彙表』に基づいて類義用例を確認しながら指標比喩表現候補を展開し,コアデータ6レジスタ(Yahoo! 知恵袋・白書・Yahoo! ブログ・書籍・雑誌・新聞)1,290,060語から人手で822件抽出した.抽出した比喩用例には,喩辞・被喩辞の情報と,その分類語彙表番号を付与したほか,擬人化・擬物化・擬生化・具象化などの種別情報も付与した.さらに提喩・換喩・文脈比喩・慣用表現などの情報も付与した.上記作業は言語学者によったが,非専門家が比喩表現をどのように捉えるかを評価するために,比喩性・新奇性・わかりやすさ・擬人化・具体化(具象化)の5つの観点について,1事例あたり22–77人分(平均33人分)の評定値を付与した.レジスタ毎の相対度数や評定値の分布により,現代日本語の指標比喩表現の使用傾向を確認した.
Database: OpenAIRE