Identifying High Value Opportunities for Human in the Loop Lexicon Expansion

Authors: Petar Ristoski, Deluca Chad, Alfredo Alba, Anna Lisa Gentile, Steve Welch, Daniel Gruhl, Linda Kato, Kau Chris
Year of publication: 2019
Subject:
Source: WWW (Companion Volume)
Description: Many real-world analytics problems examine multiple entities or classes that may appear in a corpus. For example, in a customer satisfaction survey analysis there are over 60 categories of (somewhat overlapping) concerns. Each of these is backed by a lexicon of terminology associated with the concern (e.g., “Easy, user friendly process” or “Process confusing, too many handoffs”). These categories need to be expanded by a subject matter expert, as the terminology is not always straightforward (e.g., “handoffs” may also include “ping-pong” and “hot potato” as relevant terms). But given that subject matter expert time is costly, which of the 60+ lexicons should we expand first? We propose a metric for evaluating an existing set of lexicons and providing guidance on which are likely to benefit most from human-in-the-loop expansion. Using our ranking results, we achieved an approximately 4× improvement in impact when expanding the first few lexicons on our suggested list, as compared to a random selection.
Database: OpenAIRE