Asymptotically Exact Data Augmentation: Models, Properties, and Algorithms

Authors: Nicolas Dobigeon, Maxime Vono, Pierre Chainais
Contributors: Signal et Communications (IRIT-SC), Institut de recherche en informatique de Toulouse (IRIT); Université Toulouse 1 Capitole (UT1); Université Fédérale Toulouse Midi-Pyrénées; Université Toulouse - Jean Jaurès (UT2J); Université Toulouse III - Paul Sabatier (UT3); Centre National de la Recherche Scientifique (CNRS); Institut National Polytechnique (Toulouse) (Toulouse INP); Institut Universitaire de France (IUF); Ministère de l'Education nationale, de l'Enseignement supérieur et de la Recherche (M.E.N.E.S.R.); Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL), Centrale Lille, Université de Lille; ANR-19-P3IA-0004, ANITI, Artificial and Natural Intelligence Toulouse Institute (2019); Université Toulouse Capitole (UT Capitole); Université de Toulouse (UT); Toulouse Mind & Brain Institut (TMBI)
Year of publication: 2020
Subjects:
FOS: Computer and information sciences
Statistics and Probability
Divide and conquer algorithms
Computer Science - Machine Learning
Computer science
Bayesian inference
Machine Learning (stat.ML)
01 natural sciences
Machine Learning (cs.LG)
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Methodology (stat.ME)
Auxiliary variables
010104 statistics & probability
[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]
[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing
Statistics - Machine Learning
Robustness (computer science)
0502 economics and business
Discrete Mathematics and Combinatorics
0101 mathematics
Robustness
Approximation
Divide-and-conquer
Statistics - Methodology
050205 econometrics
05 social sciences
[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]
[INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV]
Statistics, Probability and Uncertainty
Algorithm
[PHYS.PHYS.PHYS-DATA-AN]Physics [physics]/Physics [physics]/Data Analysis, Statistics and Probability [physics.data-an]
Source: Journal of Computational and Graphical Statistics
Journal of Computational and Graphical Statistics, Taylor & Francis, 2021, 30 (2), pp.335-348. ⟨10.1080/10618600.2020.1826954⟩
ISSN: 1061-8600 (print), 1537-2715 (online)
DOI: 10.1080/10618600.2020.1826954
Description: Data augmentation, through the introduction of auxiliary variables, has become a ubiquitous technique to improve convergence properties, simplify the implementation, or reduce the computational time of inference methods such as Markov chain Monte Carlo (MCMC) algorithms. Nonetheless, introducing appropriate auxiliary variables while preserving the initial target probability distribution and enabling computationally efficient inference cannot be done in a systematic way. To address these issues, this paper studies a unified framework, coined asymptotically exact data augmentation (AXDA), which encompasses both well-established and more recent approximate augmented models. More broadly, this paper shows that AXDA models benefit from interesting statistical properties and yield efficient inference algorithms. In non-asymptotic settings, the quality of the proposed approximation is assessed with several theoretical results, which are illustrated on standard statistical problems. Supplementary materials, including computer code for this paper, are available online.
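To make the idea concrete, the following is a minimal, illustrative Python sketch (not the paper's code or algorithms) of an AXDA-style augmented model on a toy conjugate Gaussian problem: an auxiliary variable z is coupled to the parameter theta through a Gaussian kernel of width rho, and a two-step Gibbs sampler targets the augmented distribution, whose theta-marginal approaches the exact posterior as rho tends to 0. The data value y, the scales sigma, tau, rho, and the sampler settings are assumptions made only for this example.

# Illustrative sketch (not the paper's code): AXDA-style augmentation of a
# conjugate Gaussian model. Target posterior: theta | y with
#   y ~ N(theta, sigma^2),  theta ~ N(0, tau^2).
# AXDA introduces z and a Gaussian coupling kernel of width rho (assumed here):
#   pi_rho(theta, z) proportional to N(y; z, sigma^2) N(z; theta, rho^2) N(theta; 0, tau^2),
# whose theta-marginal approaches the exact posterior as rho -> 0.
import numpy as np

rng = np.random.default_rng(0)
y, sigma, tau, rho = 1.5, 1.0, 2.0, 0.05   # toy data and hyperparameters (assumed)

n_iter, burn_in = 20_000, 2_000
theta, z = 0.0, 0.0
samples = np.empty(n_iter)

for t in range(n_iter):
    # z | theta, y ~ N(m_z, v_z): product of N(y; z, sigma^2) and N(z; theta, rho^2)
    v_z = 1.0 / (1.0 / sigma**2 + 1.0 / rho**2)
    m_z = v_z * (y / sigma**2 + theta / rho**2)
    z = m_z + np.sqrt(v_z) * rng.standard_normal()

    # theta | z ~ N(m_t, v_t): product of N(z; theta, rho^2) and N(theta; 0, tau^2)
    v_t = 1.0 / (1.0 / rho**2 + 1.0 / tau**2)
    m_t = v_t * (z / rho**2)
    theta = m_t + np.sqrt(v_t) * rng.standard_normal()
    samples[t] = theta

# Compare the AXDA Gibbs estimates with the exact conjugate posterior
post_var = 1.0 / (1.0 / sigma**2 + 1.0 / tau**2)
post_mean = post_var * y / sigma**2
print("AXDA Gibbs mean/var:", samples[burn_in:].mean(), samples[burn_in:].var())
print("Exact posterior    :", post_mean, post_var)

In this toy setting both Gibbs conditionals are Gaussian and sampled exactly, and shrinking rho tightens the coupling between z and theta, trading mixing speed for approximation quality; the general AXDA framework studied in the paper formalizes and quantifies this trade-off.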
63 pages
Database: OpenAIRE