Method for Estimation of Harmfulness of ID-Exchange BBS Based on Lexical Jargonizations

Author: Abiko, Satosh; Hasegawa, Dai; Ptaszynski, Michal; Nakamura, Kenji; Sakuta, Hiroshi
Language: Japanese
Year of publication: 2018
Source: 情報システム学会誌. 13(2):41-58
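Description: Posts containing illegal or harmful information are on the rise on bulletin boards whose purpose is the exchange of IDs for private chat applications (ID-exchange BBS). Exchanges on ID-exchange BBS use a wide variety of jargon expressions and contain a large amount of intentionally distorted Japanese, which makes harmfulness evaluation difficult with conventional methods. In this study, we therefore classify the jargon expressions found on ID-exchange BBS and examine a method that can judge harmfulness even where surface-level orthographic variation occurs.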
[ENG] Recently, general-purpose forum boards such as Bulletin Board Systems (BBS) have experienced an increase in illegal and harmful activity, especially BBS whose purpose is to exchange user contact IDs for later use in private chat applications (so-called "ID-exchange BBS"). On such BBS, lexical transcription is often jargonized, or intentionally modified, making it difficult to extract information using standard tools. In this study, we first examine the typology of harmful jargonized expressions on ID-exchange BBS. Based on this typology, we propose a method that deals robustly with the intentional transcription modifications found on ID-exchange BBS. For the evaluation, we developed a system applying the proposed method and verified its performance in estimating the harmfulness of BBS entries. We further improved the system by applying an automatic sentence pattern extraction method to separate harmful from non-harmful entries. The experiment confirmed that it was possible to eliminate most of the erroneous transcription modifications with the proposed method.
Database: OpenAIRE