The Quality of Big Data: Development, Problems, and Possibilities of Use of Process-Generated Data in the Digital Age

Authors: Baur, Nina; Graeff, Peter; Braunisch, Lilli; Schweia, Malte
Year of publication: 2020
Subject:
Sozialwissenschaften
Soziologie

Social sciences
sociology
anthropology

big data
mass data
process-generated data
process-produced data
digital data
digital methods
computational social sciences
historical sociology
survey methodology
corpus linguistics
social science methodology
data quality
social research
Erhebungstechniken und Analysetechniken der Sozialwissenschaften
Methods and Techniques of Data Collection and Data Analysis
Statistical Methods
Computer Methods

Datengewinnung
Datenqualität
Digitale Spaltung
Internet
Sozialstruktur
historische Sozialforschung
Methodologie
empirische Sozialforschung
Digitalisierung
historische Entwicklung
data capture
digital divide
social structure
historical social research
methodology
empirical social research
digitalization
historical development
30300
10200
Source: Historical Social Research, 45, 3, 209-243
Document type: journal article
ISSN: 0172-6404
DOI: 10.12759/hsr.45.2020.3.209-243
Description: The paper introduces the HSR Forum on digital data by discussing what big data are. The authors show that big data are not a new type of social science data but one of the oldest. Nor are big data necessarily digital data, although current methodological debates often assume that “big data” are “digital data.” The authors also show that digital data have a major drawback concerning data quality: they do not cover the whole population. Because of so-called digital divides, not everybody is on the internet, and who is on the internet is socially structured. The result is a selection bias. Based on this analysis, the paper concludes that big data and digital data are data like any other: they have specific advantages as well as specific blind spots. Rather than glorifying or demonising them, it is more sensible to discuss which advantages and drawbacks they have, and when and how they, or other types of data, are better suited to answering specific research questions. These are the questions addressed in this HSR Forum.
Database: SSOAR – Social Science Open Access Repository