Abstract: |
Code is the language of contemporary knowledge work, but sociologists still lack methods for analyzing it systematically. This paper proposes and evaluates a measurement framework for using code as data in sociological research. I use basic techniques from software engineering and programming language design to derive features of interest from code corpora, and demonstrate how to analyze these data using standard methods from statistical text analysis. To assess the proposed approach, I analyze a large corpus of code published alongside research articles in the American Economic Review since 2008. I use these data to describe empirical variation in economists' data analyses, and relate this variation to the substantive content of economic research. Using code as data provides new empirical resources to scholars studying the technical dimensions of contemporary scientific research, algorithmic systems, and socioeconomic quantification more generally. |