Automatic cell type harmonization and integration across Human Cell Atlas datasets

Authors: Chuan Xu, Martin Prete, Simone Webb, Laura Jardine, Benjamin Stewart, Regina Hoo, Peng He, Sarah A. Teichmann
Publication year: 2023
Description: Harmonizing cell types with respect to transcriptomics identity and nomenclature across the single-cell community and assembling them into a common framework is an essential step towards building a standardized Human Cell Atlas. Here we present CellTypist v2.0 (a new version of our automated annotation tool, https://github.com/Teichlab/celltypist), where we develop a predictive clustering tree-based approach to resolve cell type differences across datasets that have different naming conventions, annotation resolution, and technical biases. CellTypist v2.0 accurately quantifies cell-cell transcriptomic similarities and enables robust and efficient cross-dataset meta-analyses. Cell types are placed into a relationship graph that hierarchically defines shared and novel cell subtypes. Application to multiple immune datasets confirms its sensitivity and specificity by recapitulating expert-curated cell annotations. We also apply CellTypist to datasets from eight diseases that all lead to pulmonary fibrosis, and reveal underexplored relationships between healthy cell types and a variety of diseased cell states. Furthermore, we present a workflow for fast cross-dataset integration guided by the harmonized cell types at different levels of annotation granularity. Finally, we apply CellTypist to 12 tissue atlases from 38 datasets, and provide a deeply curated cross-tissue database (www.celltypist.org/organs) with more than 3.6 million cells and 250 cell types. This is complemented by a large collection of machine learning models for automatic cell type annotation across human tissues.
Database: OpenAIRE