scPROTEIN: a versatile deep graph contrastive learning framework for single-cell proteomics embedding.

Authors: Li W (College of Artificial Intelligence, Nankai University, Tianjin, China; AI Lab, Tencent, Shenzhen, China); Yang F (AI Lab, Tencent, Shenzhen, China); Wang F (AI Lab, Tencent, Shenzhen, China); Rong Y (AI Lab, Tencent, Shenzhen, China); Liu L (AI Lab, Tencent, Shenzhen, China); Wu B (AI Lab, Tencent, Shenzhen, China); Zhang H (College of Artificial Intelligence, Nankai University, Tianjin, China; zhanghan@nankai.edu.cn); Yao J (AI Lab, Tencent, Shenzhen, China; jianhuayao@tencent.com).
Language: English
Source: Nature Methods [Nat Methods] 2024 Apr; Vol. 21 (4), pp. 623-634. Date of Electronic Publication: 2024 Mar 19.
DOI: 10.1038/s41592-024-02214-9
Abstract: Single-cell proteomics sequencing technology sheds light on protein-protein interactions, posttranslational modifications and proteoform dynamics in the cell. However, the uncertainty estimation for peptide quantification, data missingness, batch effects and high noise hinder the analysis of single-cell proteomic data. It is important to solve this set of tangled problems together, but the existing methods tailored for single-cell transcriptomes cannot fully address this task. Here we propose a versatile framework designed for single-cell proteomics data analysis called scPROTEIN, which consists of peptide uncertainty estimation based on a multitask heteroscedastic regression model and cell embedding generation based on graph contrastive learning. scPROTEIN can estimate the uncertainty of peptide quantification, denoise protein data, remove batch effects and encode single-cell proteomic-specific embeddings in a unified framework. We demonstrate that scPROTEIN is efficient for cell clustering, batch correction, cell type annotation, clinical analysis and spatially resolved proteomic data exploration.
(© 2024. The Author(s), under exclusive licence to Springer Nature America, Inc.)
Database: MEDLINE