An unsupervised learning approach to identify novel signatures of health and disease from multimodal data.

Authors: Shomorony I; Human Longevity, Inc., San Diego, CA, 92121, USA.; Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, 61820, USA., Cirulli ET; Human Longevity, Inc., San Diego, CA, 92121, USA., Huang L; Human Longevity, Inc., San Diego, CA, 92121, USA., Napier LA; Human Longevity, Inc., San Diego, CA, 92121, USA., Heister RR; Human Longevity, Inc., San Diego, CA, 92121, USA., Hicks M; Human Longevity, Inc., San Diego, CA, 92121, USA., Cohen IV; Human Longevity, Inc., San Diego, CA, 92121, USA., Yu HC; Human Longevity, Inc., San Diego, CA, 92121, USA., Swisher CL; Human Longevity, Inc., San Diego, CA, 92121, USA., Schenker-Ahmed NM; Human Longevity, Inc., San Diego, CA, 92121, USA., Li W; Human Longevity, Inc., San Diego, CA, 92121, USA.; J. Craig Venter Institute, La Jolla, CA, 92037, USA., Nelson KE; Human Longevity, Inc., San Diego, CA, 92121, USA.; J. Craig Venter Institute, La Jolla, CA, 92037, USA., Brar P; Human Longevity, Inc., San Diego, CA, 92121, USA.; J. Craig Venter Institute, La Jolla, CA, 92037, USA., Kahn AM; Human Longevity, Inc., San Diego, CA, 92121, USA.; Division of Cardiovascular Medicine, School of Medicine, University of California San Diego, La Jolla, CA, 92093, USA., Spector TD; Department of Twin Research and Genetic Epidemiology, King's College London, London, UK., Caskey CT; Human Longevity, Inc., San Diego, CA, 92121, USA.; Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA., Venter JC; Human Longevity, Inc., San Diego, CA, 92121, USA.; J. Craig Venter Institute, La Jolla, CA, 92037, USA., Karow DS; Human Longevity, Inc., San Diego, CA, 92121, USA., Kirkness EF; Human Longevity, Inc., San Diego, CA, 92121, USA.; J. Craig Venter Institute, La Jolla, CA, 92037, USA., Shah N; Human Longevity, Inc., San Diego, CA, 92121, USA. naishas@gmail.com.; J. Craig Venter Institute, La Jolla, CA, 92037, USA. naishas@gmail.com.
Language: English
Source: Genome medicine [Genome Med] 2020 Jan 10; Vol. 12 (1), pp. 7. Date of Electronic Publication: 2020 Jan 10.
DOI: 10.1186/s13073-019-0705-z
Abstract: Background: Modern medicine is rapidly moving towards a data-driven paradigm based on comprehensive multimodal health assessments. Integrated analysis of data from different modalities has the potential to uncover novel biomarkers and disease signatures.
Methods: We collected 1385 data features from diverse modalities, including metabolome, microbiome, genetics, and advanced imaging, from 1253 individuals and from a longitudinal validation cohort of 1083 individuals. We used a combination of unsupervised machine learning methods to identify multimodal biomarker signatures of health and disease risk.
Results: Our method identified a set of cardiometabolic biomarkers that goes beyond standard clinical biomarkers. Stratification of individuals based on the signatures of these biomarkers identified distinct subsets of individuals with similar health statuses. Subset membership was a better predictor for diabetes than established clinical biomarkers such as glucose, insulin resistance, and body mass index. The novel biomarkers in the diabetes signature included 1-stearoyl-2-dihomo-linolenoyl-GPC and 1-(1-enyl-palmitoyl)-2-oleoyl-GPC. Another metabolite, cinnamoylglycine, was identified as a potential biomarker for both gut microbiome health and lean mass percentage. We identified potential early signatures for hypertension and a poor metabolic health outcome. Additionally, we found novel associations between a uremic toxin, p-cresol sulfate, and the abundance of the microbiome genera Intestinimonas and an unclassified genus in the Erysipelotrichaceae family.
Conclusions: Our methodology and results demonstrate the potential of multimodal data integration, from the identification of novel biomarker signatures to a data-driven stratification of individuals into disease subtypes and stages, an essential step towards personalized, preventative health risk assessment.
Database: MEDLINE