Interpretation of mass spectrometry data for high-throughput proteomics

Authors:	Helmut E. Meyer, Gerhard Koerting, Johan Gobom, Martin Blueggel, Joachim Klose, Herbert Thiele, Daniel Chamrad
Publication year:	2003
Subject:	Proteomics; Quality Control; Databases, Factual; Polymers; Calibration (statistics); Computer science; computer.software_genre; Peptide Mapping; Biochemistry; Bottleneck; Analytical Chemistry; Automation; Search engine; Software; Electrophoresis, Gel, Two-Dimensional; Instrumentation (computer programming); Measure (data warehouse); business.industry; Proteins; Identification (information); Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization; Calibration; Data mining; business; Metasearch engine; computer; Algorithms; Filtration
Source:	Analytical and Bioanalytical Chemistry. 376:1014-1022
ISSN:	1618-2650 1618-2642
Abstract:	Recent developments in proteomics have revealed a bottleneck in bioinformatics: high-quality interpretation of acquired MS data. The ability to generate thousands of MS spectra per day, and the demand for this, makes manual methods inadequate for analysis and underlines the need to transfer the advanced capabilities of an expert human user into sophisticated MS interpretation algorithms. The identification rate in current high-throughput proteomics studies is not only a matter of instrumentation. We present software for high-throughput PMF identification, which enables robust and confident protein identification at higher rates. This has been achieved by automated calibration, peak rejection, and use of a meta search approach which employs various PMF search engines. The automatic calibration consists of a dynamic, spectral information-dependent algorithm, which combines various known calibration methods and iteratively establishes an optimised calibration. The peak rejection algorithm filters signals that are unrelated to the analysed protein by use of automatically generated and dataset-dependent exclusion lists. In the "meta search" several known PMF search engines are triggered and their results are merged by use of a meta score. The significance of the meta score was assessed by simulation of PMF identification with 10,000 artificial spectra resembling a data situation close to the measured dataset. By means of this simulation the meta score is linked to expectation values as a statistical measure. The presented software is part of the proteome database ProteinScape which links the information derived from MS data to other relevant proteomics data. We demonstrate the performance of the presented system with MS data from 1891 PMF spectra. As a result of automatic calibration and peak rejection the identification rate increased from 6% to 44%.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c00472915f2aa95c95ed35dcf6067608 https://doi.org/10.1007/s00216-003-1995-x
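
Illustrative note: the abstract describes peak rejection by "automatically generated and dataset-dependent exclusion lists" but gives no algorithmic detail. The following is only a minimal sketch of one plausible reading, not the authors' implementation: masses that recur in a large share of the spectra in a dataset are treated as unrelated to any single protein and are removed before searching. The function names, the fixed-width binning, and the `min_fraction` and `bin_width` parameters are all assumptions introduced for this example.

```python
from collections import Counter


def build_exclusion_list(spectra, min_fraction=0.5, bin_width=0.5):
    """Return the set of m/z bins seen in at least `min_fraction` of spectra.

    `spectra` is a list of peak lists (m/z values in Da). Binning by a
    fixed width is a simplification assumed here; peaks that straddle a
    bin edge may be counted in neighbouring bins.
    """
    counts = Counter()
    for peaks in spectra:
        # A set ensures each bin is counted at most once per spectrum.
        counts.update({round(mz / bin_width) for mz in peaks})
    threshold = min_fraction * len(spectra)
    return {b for b, n in counts.items() if n >= threshold}


def reject_peaks(peaks, exclusion_bins, bin_width=0.5):
    """Drop every peak whose bin is on the exclusion list."""
    return [mz for mz in peaks if round(mz / bin_width) not in exclusion_bins]
```

For example, given four peak lists in which a mass near 842.5 appears three times and a mass near 2211.1 appears twice, `build_exclusion_list` with `min_fraction=0.5` flags both bins, and `reject_peaks` keeps only the remaining masses of each spectrum. The peak values used for such a check are made up for illustration and are not data from the paper.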