Cross-Lingual Citations in English Papers: A Large-Scale Analysis of Prevalence, Usage, and Impact
Author: | Tarek Saier, Michael Färber, Tornike Tsereteli |
---|---|
Year of publication: | 2021 |
Subject: |
FOS: Computer and information sciences
Computer Science - Machine Learning; Economics; I.2.7; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; Citations; Computer Science - Digital Libraries; Library and Information Sciences; H.3.3; H.3.7; Computer Science - Information Retrieval; Machine Learning (cs.LG); Citation analysis; ddc:330; Scholarly data; Digital Libraries (cs.DL); GeneralLiterature_REFERENCE(e.g. dictionaries encyclopedias glossaries); Information Retrieval (cs.IR); Cross-lingual |
Source: | International Journal on Digital Libraries, 23, 179–195 |
ISSN: | 1432-5012 1432-1300 |
DOI: | 10.48550/arxiv.2111.05097 |
Description: | Citation information in scholarly data is an important source of insight into the reception of publications and the scholarly discourse. Outcomes of citation analyses and the applicability of citation-based machine learning approaches heavily depend on the completeness of such data. One particular shortcoming of scholarly data nowadays is that non-English publications are often not included in data sets, or that language metadata is not available. Because of this, citations between publications of differing languages (cross-lingual citations) have only been studied to a very limited degree. In this paper, we present an analysis of cross-lingual citations based on over one million English papers, spanning three scientific disciplines and a time span of three decades. Our investigation covers differences between cited languages and disciplines, trends over time, and the usage characteristics as well as impact of cross-lingual citations. Among our findings are an increasing rate of citations to publications written in Chinese, citations being primarily to local non-English languages, and consistency in citation intent between cross- and monolingual citations. To facilitate further research, we make our collected data and source code publicly available. Projekt DEAL |
Database: | OpenAIRE |