Maximizing the sensitivity and reliability of peptide identification in large-scale proteomic experiments by harnessing multiple search engines.

Autor: Yu W; Computational Biology, Amgen Inc., Seattle, WA 98119-3105, USA. wyu@amgen.com, Taylor JA, Davis MT, Bonilla LE, Lee KA, Auger PL, Farnsworth CC, Welcher AA, Patterson SD
Jazyk: angličtina
Zdroj: Proteomics [Proteomics] 2010 Mar; Vol. 10 (6), pp. 1172-89.
DOI: 10.1002/pmic.200900074
Abstrakt: Despite recent advances in qualitative proteomics, the automatic identification of peptides with optimal sensitivity and accuracy remains a difficult goal. To address this deficiency, a novel algorithm, Multiple Search Engines, Normalization and Consensus is described. The method employs six search engines and a re-scoring engine to search MS/MS spectra against protein and decoy sequences. After the peptide hits from each engine are normalized to error rates estimated from the decoy hits, peptide assignments are then deduced using a minimum consensus model. These assignments are produced in a series of progressively relaxed false-discovery rates, thus enabling a comprehensive interpretation of the data set. Additionally, the estimated false-discovery rate was found to have good concordance with the observed false-positive rate calculated from known identities. Benchmarking against standard proteins data sets (ISBv1, sPRG2006) and their published analysis, demonstrated that the Multiple Search Engines, Normalization and Consensus algorithm consistently achieved significantly higher sensitivity in peptide identifications, which led to increased or more robust protein identifications in all data sets compared with prior methods. The sensitivity and the false-positive rate of peptide identification exhibit an inverse-proportional and linear relationship with the number of participating search engines.
Databáze: MEDLINE