Proteogenomics: From next-generation sequencing (NGS) and mass spectrometry-based proteomics to precision medicine.

Authors: Ang MY; UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000 Kuala Lumpur, Malaysia., Low TY; UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000 Kuala Lumpur, Malaysia. Electronic address: lowteckyew@ppukm.ukm.edu.my., Lee PY; UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000 Kuala Lumpur, Malaysia., Wan Mohamad Nazarie WF; UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000 Kuala Lumpur, Malaysia., Guryev V; European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, Groningen 9713AD, The Netherlands., Jamal R; UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000 Kuala Lumpur, Malaysia.
Language: English
Source: Clinica chimica acta; international journal of clinical chemistry [Clin Chim Acta] 2019 Nov; Vol. 498, pp. 38-46. Date of Electronic Publication: 2019 Aug 14.
DOI: 10.1016/j.cca.2019.08.010
Abstract: One of the best-established areas within multi-omics is proteogenomics, whose underpinning technologies are next-generation sequencing (NGS) and mass spectrometry (MS). Proteogenomics has contributed significantly to genome (re-)annotation, in which novel coding sequences (CDS) are identified and confirmed. By incorporating in-silico translated genome variants into protein databases, single amino acid variants (SAAV) and splice proteoforms can be identified and quantified at the peptide level. The application of proteogenomics in cancer research potentially enables the identification of patient-specific proteoforms, as well as the association of the efficacy of, or resistance to, cancer therapy with different mutations. Here, we discuss how NGS and third-generation sequencing (TGS) data are analyzed and incorporated into the proteogenomic framework. These sequence data mainly originate from whole genome sequencing (WGS), whole exome sequencing (WES) and RNA-Seq. We explain two major strategies for sequence analysis, i.e., de novo assembly and read mapping, followed by the construction of customized protein databases from such data. We also elaborate on the procedures for matching spectra to peptide sequences in proteogenomics, and on the relationship between database size and the false discovery rate (FDR). Finally, we discuss the latest developments in proteogenomics-assisted precision oncology, as well as challenges and opportunities in proteogenomics research.
(Copyright © 2019 Elsevier B.V. All rights reserved.)
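Illustration (not part of the original record): the abstract's idea of adding in-silico translated genome variants to a protein database can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the toy coding sequence, the single-nucleotide variant, and the simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages) are all hypothetical choices made for the example.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_snv(cds, pos, alt):
    """Return the CDS with the base at 0-based position `pos` replaced by `alt`."""
    return cds[:pos] + alt + cds[pos + 1:]


def tryptic_peptides(protein):
    """Simplified trypsin digest: cleave after K/R unless the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        at_end = i + 1 == len(protein)
        if aa in "KR" and (at_end or protein[i + 1] != "P"):
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def variant_peptides(ref_cds, pos, alt):
    """Peptides present in the variant proteoform but absent from the reference."""
    reference = set(tryptic_peptides(translate(ref_cds)))
    variant = tryptic_peptides(translate(apply_snv(ref_cds, pos, alt)))
    return [p for p in variant if p not in reference]
```

With a toy CDS such as `ATGGCTGAAAAGCTGGTTCGTTAA` (hypothetical), calling `variant_peptides(cds, 16, "C")` changes the codon GTT to GCT and returns the single variant-bearing peptide that would be appended to a customized search database. Each such appended entry enlarges the search space, which is why the abstract links database size to the false discovery rate.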
Database: MEDLINE