A General Approach to Multimodal Document Quality Assessment
Authors: | Timothy Baldwin, Bahar Salehi, Jianzhong Qi, Aili Shen |
Year of publication: | 2020 |
Subject: |
Computer science, Natural language processing, Artificial intelligence, Context (language use), Readability, Grammaticality, Rendering (computer graphics), Font, Task (project management), Quality (business), Feature learning |
Source: | Journal of Artificial Intelligence Research. 68:607-632 |
ISSN: | 1076-9757 |
DOI: | 10.1613/jair.1.11647 |
Description: | The perceived quality of a document is affected by various factors, including grammaticality, readability, stylistics, and expertise depth, making the task of document quality assessment a complex one. In this paper, we explore this task in the context of assessing the quality of Wikipedia articles and academic papers. Observing that the visual rendering of a document can capture implicit quality indicators that are not present in the document text — such as images, font choices, and visual layout — we propose a joint model that combines the text content with a visual rendering of the document for document quality assessment. Our joint model achieves state-of-the-art results over five datasets in two domains (Wikipedia and academic papers), which demonstrates the complementarity of textual and visual features, and the general applicability of our model. To examine what kinds of features our model has learned, we further train our model in a multi-task learning setting, where document quality assessment is the primary task and feature learning is an auxiliary task. Experimental results show that visual embeddings are better at learning structural features while textual embeddings are better at learning readability scores, which further verifies the complementarity of visual and textual features. |
Database: | OpenAIRE |
External link: |
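The joint model described in the abstract fuses a textual representation with a representation of the document's visual rendering before classifying quality. A minimal sketch of that fusion idea is below, using concatenation followed by a linear softmax classifier; the embedding dimensions, the fusion-by-concatenation choice, the random weights, and the six-class label set are illustrative assumptions, not the authors' actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def joint_quality_scores(text_emb, visual_emb, W, b):
    """Fuse textual and visual embeddings and score quality classes.

    Late fusion by concatenation: the two modality vectors are stacked
    into one feature vector and passed through a linear layer.
    """
    fused = np.concatenate([text_emb, visual_emb])
    return softmax(W @ fused + b)

# Toy dimensions (assumptions): a 300-d text embedding, a 128-d visual
# embedding, and 6 quality classes (e.g. Wikipedia's FA/GA/B/C/Start/Stub).
text_emb = rng.standard_normal(300)
visual_emb = rng.standard_normal(128)
W = rng.standard_normal((6, 428)) * 0.01  # untrained, illustrative weights
b = np.zeros(6)

probs = joint_quality_scores(text_emb, visual_emb, W, b)
print(probs.shape)  # one probability per quality class
```

In the paper's multi-task variant, the same fused representation would additionally feed auxiliary heads (e.g. predicting structural features or readability scores) alongside the primary quality-classification head.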