Design and validation of annotation schemas for aspect-based sentiment analysis in the tourism sector
Author: | Aroa Orrequia-Barea, Antonio Moreno-Ortiz, Soluna Salles-Bernal |
---|---|
Year of publication: | 2019 |
Subject: |
General Computer Science
Computer science; business.industry; media_common.quotation_subject; 05 social sciences; Sentiment analysis; User-generated content; 02 engineering and technology; Data science; Scarcity; Annotation; Software; Iterative refinement; 020204 information systems; Schema (psychology); 0502 economics and business; 0202 electrical engineering, electronic engineering, information engineering; 050211 marketing; business; Social Sciences (miscellaneous); Tourism; media_common |
Source: | BASE-Bielefeld Academic Search Engine |
ISSN: | 1943-4294 1098-3058 |
DOI: | 10.1007/s40558-019-00155-0 |
Description: | The use of linguistic resources beyond the scope of language studies, e.g., for commercial purposes, has become commonplace with the availability of massive amounts of data and the development of software tools to process them. An interesting perspective on these data is provided by Sentiment Analysis, which attempts to identify the polarity of a text, but can also pursue further, more challenging aims, such as the automatic identification of the specific entities and aspects being discussed in the evaluative speech act, along with the polarity associated with them. This approach, known as aspect-based sentiment analysis, seeks to offer fine-grained information from raw text, but its success depends largely on the existence of pre-annotated domain-specific corpora, which in turn calls for the design and validation of an annotation schema. This paper examines the methodological aspects involved in the creation of such an annotation schema and is motivated by the scarcity of information found in the literature. We describe the insights we obtained from the annotation schema generation and validation process within our project, whose objectives include the development of advanced sentiment analysis software for user reviews in the tourism sector. We focus on the identification of the relevant entities and attributes in the domain, which we extract from a corpus of user reviews, and go on to describe the schema creation and validation process. We begin by describing the corpus annotation process and its further iterative refinement by means of several inter-annotator agreement measurements, which we believe is key to a successful annotation schema. |
Database: | OpenAIRE |