Manifold Fitting
Authors: | Yao, Zhigang; Su, Jiaji; Li, Bingjie; Yau, Shing-Tung |
---|---|
Year of publication: | 2023 |
Subject: | |
Document type: | Working Paper |
Description: | While classical data analysis has addressed observations that are real numbers or elements of a real vector space, many statistical problems of high interest in the sciences now concern data consisting of more complex objects, which take values in spaces that are not naturally (Euclidean) vector spaces but still have some geometric structure. Manifold fitting is a long-standing problem that has finally been addressed in recent years by Fefferman et al. (2020, 2021a). We develop a method with a theoretical guarantee that fits a $d$-dimensional underlying manifold from noisy observations sampled in the ambient space $\mathbb{R}^D$. The new approach uses geometric structures to obtain the manifold estimator in the form of image sets via a two-step mapping approach. We prove that, under certain mild assumptions and with a sample size $N=\mathcal{O}(\sigma^{(-d+3)})$, these estimators are true $d$-dimensional smooth manifolds whose estimation error, as measured by the Hausdorff distance, is bounded by $\mathcal{O}(\sigma^2\log(1/\sigma))$ with high probability. Compared with the existing approaches proposed in Fefferman et al. (2018, 2021b), Genovese et al. (2014), and Yao and Xia (2019), our method is more efficient, attaining very low error rates with a significantly reduced sample size that scales polynomially in $\sigma^{-1}$ and exponentially in $d$. Extensive simulations are performed to validate our theoretical results. Our findings are relevant to various fields involving high-dimensional data in machine learning. Furthermore, our method opens up new avenues for existing non-Euclidean statistical methods, in the sense that it has the potential to unify them for analyzing data on manifolds in the ambient space. Comment: 60 pages |
Database: | arXiv |
External link: |
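
The abstract measures estimation error by the Hausdorff distance. As a rough illustration of that error measure only (not of the paper's fitting method), the sketch below computes the Hausdorff distance between two finite point sets in pure Python. The unit circle standing in for the true manifold ($d=1$, $D=2$), the slightly inflated circle standing in for an estimate, and the names `directed_hausdorff`, `hausdorff`, `eps`, and `n` are all assumptions made for this example.

```python
import math

def directed_hausdorff(A, B):
    """Largest distance from any point of A to its nearest point in B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """Symmetric Hausdorff distance between finite point sets A and B."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

# Hypothetical "true" manifold: the unit circle, sampled at n angles.
n = 400
circle = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
          for k in range(n)]

# Hypothetical estimate: the same circle with its radius inflated by eps.
eps = 0.01
estimate = [((1 + eps) * x, (1 + eps) * y) for x, y in circle]

# Each estimated point lies eps away from its nearest true point,
# so the Hausdorff distance between the two samples is about eps.
err = hausdorff(circle, estimate)
print(f"Hausdorff distance: {err:.4f}")
```

For finite samples this brute-force version costs a number of distance evaluations proportional to the product of the two set sizes, which is fine for a sanity check but would need a spatial index for large point clouds.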