Explainable multi-task learning for multi-modality biological data analysis.

Authors: Tang X; John A. Paulson School of Engineering and Applied Sciences, Harvard University, Boston, MA, 02134, USA.; Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA., Zhang J; School of Statistics, University of Minnesota Twin Cities, Minneapolis, MN, 55455, USA., He Y; John A. Paulson School of Engineering and Applied Sciences, Harvard University, Boston, MA, 02134, USA.; Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA., Zhang X; John A. Paulson School of Engineering and Applied Sciences, Harvard University, Boston, MA, 02134, USA., Lin Z; Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, 02138, USA., Partarrieu S; John A. Paulson School of Engineering and Applied Sciences, Harvard University, Boston, MA, 02134, USA., Hanna EB; John A. Paulson School of Engineering and Applied Sciences, Harvard University, Boston, MA, 02134, USA., Ren Z; John A. Paulson School of Engineering and Applied Sciences, Harvard University, Boston, MA, 02134, USA., Shen H; John A. Paulson School of Engineering and Applied Sciences, Harvard University, Boston, MA, 02134, USA., Yang Y; School of Statistics, University of Minnesota Twin Cities, Minneapolis, MN, 55455, USA., Wang X; Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.; Department of Chemistry, MIT, Cambridge, MA, 02139, USA., Li N; John A. Paulson School of Engineering and Applied Sciences, Harvard University, Boston, MA, 02134, USA., Ding J; School of Statistics, University of Minnesota Twin Cities, Minneapolis, MN, 55455, USA. dingj@umn.edu., Liu J; John A. Paulson School of Engineering and Applied Sciences, Harvard University, Boston, MA, 02134, USA. jia_liu@seas.harvard.edu.
Language: English
Source: Nature communications [Nat Commun] 2023 May 03; Vol. 14 (1), pp. 2546. Date of Electronic Publication: 2023 May 03.
DOI: 10.1038/s41467-023-37477-x
Abstract: Current biotechnologies can simultaneously measure multiple high-dimensional modalities (e.g., RNA, DNA accessibility, and protein) from the same cells. A combination of different analytical tasks (e.g., multi-modal integration and cross-modal analysis) is required to comprehensively understand such data, inferring how gene regulation drives biological diversity and functions. However, current analytical methods are designed to perform a single task, only providing a partial picture of the multi-modal data. Here, we present UnitedNet, an explainable multi-task deep neural network capable of integrating different tasks to analyze single-cell multi-modality data. Applied to various multi-modality datasets (e.g., Patch-seq, multiome ATAC + gene expression, and spatial transcriptomics), UnitedNet demonstrates similar or better accuracy in multi-modal integration and cross-modal prediction compared with state-of-the-art methods. Moreover, by dissecting the trained UnitedNet with the explainable machine learning algorithm, we can directly quantify the relationship between gene expression and other modalities with cell-type specificity. UnitedNet is a comprehensive end-to-end framework that could be broadly applicable to single-cell multi-modality biology. This framework has the potential to facilitate the discovery of cell-type-specific regulation kinetics across transcriptomics and other modalities.
(© 2023. The Author(s).)
Database: MEDLINE