Description: |
Natural Language Processing (NLP) techniques are widely used in industry to solve a variety of business problems. In this work, we apply NLP techniques to understand call interactions between customers and customer service representatives and to extract useful insights from these conversations. We focus on understanding the transcripts of these calls, a task that falls under long-document understanding. Existing work on text encoding typically addresses short-form text. Deep learning models such as the vanilla Transformer, BERT, and DistilBERT have achieved state-of-the-art performance on a variety of short-form text tasks but perform poorly on long documents. To address this issue, modifications of the Transformer, such as Longformer and BigBird, have been released. However, all of these models require heavy computational resources, which are often unavailable to small companies operating under budget constraints. To address these concerns, we survey a variety of explainable and lightweight text encoders that can be trained easily in a resource-constrained environment. We also propose Hierarchical Self-Attention based models that outperform DistilBERT, Doc2Vec, and single-layer self-attention networks on downstream tasks such as text classification. The proposed architecture has been put into production at the local industry organization that sponsored this research (SafeAuto Inc.), where it helps the company monitor the performance of its customer service representatives.