Using electronic transaction data to add geographic granularity to official estimates of retail sales

Authors: Brian Dumbacher, Darcy Steeg Morris, Carma Hogue
Language: English
Year of publication: 2019
Subject:
Source: Journal of Big Data, Vol 6, Iss 1, Pp 1-23 (2019)
Document type: article
ISSN: 2196-1115
DOI: 10.1186/s40537-019-0242-z
Description: Abstract
Introduction: Economists are interested in more granular, more frequent data to aid in their understanding of the U.S. economy. The most frequent economic data currently available from the U.S. Census Bureau come from monthly economic indicators such as the Monthly Retail Trade Survey, which produces national estimates of retail sales. On the other hand, the most granular data (in terms of geographic and industry detail) come from the Economic Census, which is conducted every five years. The Census Bureau is researching whether organic, third-party Big Data sources, in conjunction with survey data, allow for the production of retail sales estimates that are both monthly and subnational.
Case description: This case study explores the feasibility of using aggregated electronic transaction data from First Data (FD), a large payment processor, to calculate experimental regional and state-level monthly estimates of retail sales. Quality criteria are devised to understand this data source’s representativeness of the target population and consistency with existing survey data. Five retail industries in the FD transaction data are identified as having acceptable quality for estimation. Estimation methodology is developed based on linear mixed models in a Bayesian framework. These models try to take advantage of the timeliness of the FD transaction data and smooth over artifacts of FD’s business activity. Experimental estimates of retail sales are calculated for the period January 2015 through March 2018.
Discussion and evaluation: The experimental estimates are evaluated quantitatively via correlations with external estimates of the number of employees by industry and qualitatively with respect to additional information about the economy. Many features of the experimental estimates seem reasonable, but there are also caution flags such as anomalous trends related to identified FD quality issues.
Conclusions: The FD transaction data offer insight into economic activity at a more granular level. However, using this data source to enhance official estimates of retail sales is challenging; the FD aggregates have limitations in terms of suppression, coverage, and trends. Consequently, fewer industries than expected are identified as having acceptable quality for estimation. Future work involves calculating experimental estimates for more recent months and researching alternative methods for evaluating their accuracy.
Database: Directory of Open Access Journals