Single-document and multi-document summarization techniques for email threads using sentence compression
Authors: | Bonnie J. Dorr, David Zajic, Jimmy Lin |
---|---|
Year of publication: | 2008 |
Subject: | Sentence compression; Phrase; Information retrieval; Computer science; Electronic messaging; Technical language; Thread (computing); Library and Information Sciences; Management Science and Operations Research; Automatic summarization; Computer Science Applications; Multi-document summarization; Media Technology; Artificial intelligence; Natural language processing; Sentence; Information Systems |
Source: | Information Processing & Management. 44:1600-1610 |
ISSN: | 0306-4573 |
DOI: | 10.1016/j.ipm.2007.09.007 |
Description: | We present two approaches to email thread summarization: collective message summarization (CMS) applies a multi-document summarization approach, while individual message summarization (IMS) treats the problem as a sequence of single-document summarization tasks. Both approaches are implemented in our general framework driven by sentence compression. Instead of a purely extractive approach, we employ linguistic and statistical methods to generate multiple compressions, and then select from those candidates to produce a final summary. We demonstrate these ideas on the Enron email collection, a very challenging corpus because of its highly technical language. Experimental results point to two findings: that CMS represents a better approach to email thread summarization, and that current sentence compression techniques do not improve summarization performance in this genre. |
Database: | OpenAIRE |