Hypergraphs for multiscale cycles in structured data
Author: | Barbensi, Agnese, Yoon, Hee Rhang, Madsen, Christian Degnbol, Ajayi, Deborah O., Stumpf, Michael P. H., Harrington, Heather A. |
---|---|
Year of publication: | 2022 |
Subject: | Computational Geometry (cs.CG); FOS: Computer and information sciences; FOS: Biological sciences; FOS: Mathematics; Computer Science - Computational Geometry; Algebraic Topology (math.AT); Mathematics - Algebraic Topology; 55N31; 62R40; 55P10; 60C05; 92B05; 92-10; Quantitative Biology - Quantitative Methods; Quantitative Methods (q-bio.QM) |
DOI: | 10.48550/arxiv.2210.07545 |
Description: | Scientific data has been growing in both size and complexity across the modern physical, engineering, life and social sciences. Spatial structure, for example, is a hallmark of many of the most important real-world complex systems, but its analysis is fraught with statistical challenges. Topological data analysis can provide a powerful computational window on complex systems. Here we present a framework to extend and interpret persistent homology summaries to analyse spatial data across multiple scales. We introduce hyperTDA, a topological pipeline that unifies local (e.g. geodesic) and global (e.g. Euclidean) metrics without losing spatial information, even in the presence of noise. Homology generators offer an elegant and flexible description of spatial structures and can capture the information computed by persistent homology in an interpretable way. Here the information computed by persistent homology is transformed into a weighted hypergraph, where hyperedges correspond to homology generators. We consider different choices of generators (e.g. matroid or minimal) and find that centrality and community detection are robust to either choice. We compare hyperTDA to existing geometric measures and validate its robustness to noise. We demonstrate the power of computing higher-order topological structures on spatial curves arising frequently in ecology, biophysics, and biology, but also in high-dimensional financial datasets. We find that hyperTDA can select between synthetic trajectories from the landmark 2020 AnDi challenge and quantifies movements of different animal species, even when data is limited. Comment: 6 Figures, 15 pages and Supplementary Information (including figures) as an Appendix. Associated GitHub repositories: github.com/degnbol/hyperTDA and github.com/irishryoon/minimal_generators_curves |
Database: | OpenAIRE |
External link: |
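
The description states that persistent homology output is turned into a weighted hypergraph whose hyperedges correspond to homology generators. The following is a minimal Python sketch of that idea only, not the hyperTDA implementation: the input format (a list of vertex-index lists paired with a persistence value), the function names `build_hypergraph` and `weighted_degree_centrality`, and the choice of a normalised weighted-degree score as the centrality measure are all assumptions made for illustration. The actual pipeline is in the GitHub repositories named in the record.

```python
from collections import defaultdict


def build_hypergraph(generators):
    """Turn (vertices, persistence) pairs into weighted hyperedges.

    Hypothetical input format: each generator is a list of point indices
    on the curve plus the persistence of the cycle it represents.
    """
    hyperedges = []
    for vertices, persistence in generators:
        if persistence <= 0:
            continue  # skip cycles with no lifetime
        hyperedges.append((frozenset(vertices), float(persistence)))
    return hyperedges


def weighted_degree_centrality(hyperedges):
    """Sum the weights of the hyperedges containing each vertex, then normalise.

    This is a simple stand-in centrality, assumed here for illustration.
    """
    scores = defaultdict(float)
    for vertices, weight in hyperedges:
        for v in vertices:
            scores[v] += weight
    total = sum(scores.values())
    if total == 0:
        return {}
    return {v: s / total for v, s in scores.items()}


# Toy example: two cycles sharing the points 2 and 3.
generators = [([0, 1, 2, 3], 0.5), ([2, 3, 4], 1.5)]
hyperedges = build_hypergraph(generators)
centrality = weighted_degree_centrality(hyperedges)
```

In this toy input, points 2 and 3 lie on both cycles and so receive the highest score, which is the kind of per-point summary the description refers to when it mentions centrality on the hypergraph.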