Identifying High Value Opportunities for Human in the Loop Lexicon Expansion

Author:	Petar Ristoski, Chad DeLuca, Alfredo Alba, Anna Lisa Gentile, Steve Welch, Daniel Gruhl, Linda Kato, Chris Kau
Year of publication:	2019
Subject:	Subject-matter expert; Information retrieval; Computer science; Lexicon; Set (psychology); Terminology; Ranking (information retrieval)
Source:	WWW (Companion Volume)
Description:	Many real-world analytics problems examine multiple entities or classes that may appear in a corpus. For example, in a customer satisfaction survey analysis there are over 60 categories of (somewhat overlapping) concerns. Each of these is backed by a lexicon of terminology associated with the concern (e.g., “Easy, user friendly process” or “Process confusing, too many handoffs”). These categories need to be expanded by a subject matter expert, as the terminology is not always straightforward (e.g., “handoffs” may also include “ping-pong” and “hot potato” as relevant terms). But given that subject matter expert time is costly, which of the 60+ lexicons should we expand first? We propose a metric for evaluating an existing set of lexicons and providing guidance on which are likely to benefit most from human-in-the-loop expansion. Using our ranking results, we achieved an ≈ 4× improvement in impact when expanding the first few lexicons from our suggested list, as compared to a random selection.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::8ef4ab5f8c41941eba50155b838ff635 https://doi.org/10.1145/3308560.3317305