Popis: |
Fractional Brownian motion (fBm) features both randomness and strong scale-free correlations, challenging generative models to reproduce the intrinsic memory characterizing the underlying stochastic process. Here we examine a zoo of diffusion-based inpainting methods on a specific dataset of corrupted images, which represent incomplete Euclidean distance matrices (EDMs) of fBm at various memory exponents $H$. Our dataset implies uniqueness of the data imputation in the regime of low missing ratio, where the remaining partial graph is rigid, providing the ground truth for the inpainting. We find that the conditional diffusion generation readily reproduces the built-in correlations of fBm paths in different memory regimes (i.e., for sub-, Brownian and super-diffusion trajectories), providing a robust tool for the statistical imputation at high missing ratio. Furthermore, while diffusion models have been recently shown to memorize samples from the training database, we demonstrate that diffusion behaves qualitatively different from the database search and thus generalize rather than memorize the training dataset. As a biological application, we apply our fBm-trained diffusion model for the imputation of microscopy-derived distance matrices of chromosomal segments (FISH data) - incomplete due to experimental imperfections - and demonstrate its superiority over the standard approaches used in bioinformatics. |