Towards hierarchical affiliation resolution: framework, baselines, dataset
Author: | Backes, Tobias; Hienert, Daniel; Dietze, Stefan |
---|---|
Publication year: | 2022 |
Subject: | News media, journalism, publishing; Publizistische Medien, Journalismus, Verlagswesen; Entity resolution; Affiliation resolution; Formal concept analysis; Association rule learning; Taxonomy induction; Szientometrie, Bibliometrie, Informetrie; Scientometrics, Bibliometrics, Informetrics; Bundesrepublik Deutschland; Hierarchie; Scientometrie; Taxonomie; Federal Republic of Germany; taxonomy; hierarchy; scientometry; 10800 |
Source: | International Journal on Digital Libraries, 23, 3, 267-288 |
Document type: | journal article |
ISSN: | 1432-1300 |
DOI: | 10.1007/s00799-022-00326-1 |
Description: | Author affiliations provide key information when attributing academic performance measures such as publication counts. So far, such measures have been aggregated either manually or only to top-level institutions, such as universities. Supervised affiliation resolution requires a large number of annotated alignments between affiliation strings and known institutions, which are not readily available. We introduce the task of unsupervised hierarchical affiliation resolution, which assigns affiliations to institutions on all hierarchy levels (e.g. departments), discovering the institutions as well as their hierarchical ordering on the fly. From the corresponding requirements, we derive a simple conceptual framework based on the subset partial order that can be extended to account for the discrepancies evident in realistic affiliations from the Web of Science. We implement initial baselines and provide datasets and evaluation metrics for experimentation. Results show that mapping affiliations to known institutions and discovering lower-level institutions work well with simple baselines, whereas unsupervised top-level and hierarchical resolution is more challenging. Our work provides structured guidance for further in-depth studies and improved methodology by identifying and discussing a number of observed difficulties and important challenges that future work needs to address. |
Database: | SSOAR – Social Science Open Access Repository |