Piecewise linear approximation of empirical distributions under a Wasserstein distance constraint

Authors: Philipp Arbenz, William Guevara-Alarcón
Year of publication: 2018
Subject:
Source: Journal of Statistical Computation and Simulation, vol. 88, no. 16, pp. 3193-3216
ISSN: 1563-5163, 0094-9655
DOI: 10.1080/00949655.2018.1506454
Description: Big data applications and Monte Carlo simulations can easily produce data sets with millions of entries. We consider the situation in which the information in a large univariate data set or sample needs to be preserved, stored or transferred. We suggest an algorithm that approximates a univariate empirical distribution by a piecewise linear distribution, which requires significantly less memory to store. The approximation is chosen in a computationally efficient manner such that it preserves the mean and its Wasserstein distance to the empirical distribution is sufficiently small.
Database: OpenAIRE
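
Illustration: the idea summarized in the description can be sketched in a few lines of Python. This is not the algorithm from the paper, only a minimal sketch under stated assumptions: knots are placed at equally spaced probability levels, the number of segments is doubled until the tolerance is met, the mean is matched by a constant shift of the knot values, and the 1-Wasserstein distance is approximated by a midpoint rule on the quantile functions. The function name `piecewise_linear_fit` and the parameters `tol` and `max_segments` are hypothetical.

```python
import numpy as np


def piecewise_linear_fit(sample, tol, max_segments=4096):
    """Fit a piecewise linear quantile function to a univariate sample.

    Assumption-laden sketch, NOT the published algorithm: equally spaced
    probability knots, doubled until the approximate 1-Wasserstein
    distance to the empirical distribution is at most `tol`.
    Returns (p, q, w1): knot probabilities, knot values, distance.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # Midpoints of the intervals on which the empirical quantile
    # function takes the constant value x[i].
    u = (np.arange(n) + 0.5) / n
    k = 2
    while True:
        p = np.linspace(0.0, 1.0, k + 1)
        q = np.quantile(x, p)
        # Mean of a distribution with piecewise linear quantile function
        # is the trapezoid sum over the knots.
        pl_mean = np.sum(np.diff(p) * (q[:-1] + q[1:]) / 2.0)
        # Simplification: shift all knots so the sample mean is preserved.
        q = q + (x.mean() - pl_mean)
        # Midpoint-rule approximation of the integral of
        # |F^{-1}(u) - G^{-1}(u)| over [0, 1].
        w1 = np.mean(np.abs(x - np.interp(u, p, q)))
        if w1 <= tol or k >= max_segments:
            return p, q, w1
        k *= 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=20_000)
    p, q, w1 = piecewise_linear_fit(data, tol=0.02)
    print(f"{p.size} knots instead of {data.size} values, W1 approx {w1:.4f}")
```

Only the arrays `p` and `q` need to be stored; new values can then be drawn with `np.interp(rng.uniform(size=m), p, q)`. A more careful implementation would place knots adaptively, for example more densely in the tails, rather than on an even grid.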