On Text Tiling for Documents: A Neural-Network Approach

Author: Siang-Yun Yoong, 翁湘雲
Year of publication: 2019
Document type: thesis
Description: 107
Segmenting documents or conversation threads into semantically coherent segments has been one of the challenging tasks in Natural Language Processing (NLP). BERT (Bidirectional Encoder Representations from Transformers) is a language representation model that shows outstanding results on many NLP tasks. In this work, we introduce three new text segmentation models that employ BERT for post-training. Extensive experiments are conducted using benchmark datasets, and the experimental results demonstrate that our BERT-based models show significant improvements over the state-of-the-art text segmentation algorithms.
Database: Networked Digital Library of Theses & Dissertations