Improving protein identification from peptide mass fingerprinting through a parameterized multi-level scoring algorithm and an optimized peak detection
Author: | Willy-Vincent Bienvenut, Jean-Charles Sanchez, Christine Hoogland, Robin Gras, Pierre-Alain Binz, Elisabeth Gasteiger, Ron D. Appel, Denis F. Hochstrasser, Amos Marc Bairoch, Marcus Müller |
---|---|
Year of publication: | 1999 |
Subject: | Computer science; Clinical Biochemistry; Analytical chemistry; Parameterized complexity; Peptides/chemistry; Mass spectrometry; Biochemistry; Analytical Chemistry; Ranking (information retrieval); Protein sequencing; Peptide mass fingerprinting; Scoring algorithm; ddc:576; business.industry; Proteins; Pattern recognition; Molecular Weight; Identification (information); Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization; Test set; Calibration; Artificial intelligence; Peptides; business; Algorithms; Proteins/chemistry |
Source: | Electrophoresis, Vol. 20, No. 18 (1999), pp. 3535-3550 |
ISSN: | 1522-2683 0173-0835 |
DOI: | 10.1002/(sici)1522-2683(19991201)20:18<3535::aid-elps3535>3.0.co;2-j |
Description: | We have developed a new algorithm to identify proteins by means of peptide mass fingerprinting. Starting from matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) spectra and environmental data such as species, isoelectric point and molecular weight, as well as chemical modifications or the number of missed cleavages of a protein, the program performs a fully automated identification of the protein. The first step is a peak detection algorithm, which allows precise and fast determination of peptide masses, even when peaks are of low intensity or overlap. In the second step, the masses and environmental data are used by the identification algorithm to search protein sequence databases (SWISS-PROT and/or TrEMBL) for entries that match the input data. A list of candidate proteins is thus selected from the database, and a score calculation ranks them according to the quality of the match. To define the most discriminating scoring calculation, we analyzed the role of each parameter in two directions. The first is based on filtering and exploratory effects, while the second focuses on the levels at which the parameters intervene in the identification process. According to our analysis, all input parameters contribute to the score, though with different weights. Since it is difficult to estimate the weights in advance, they were computed with a genetic algorithm, using a training set of 91 protein spectra with their environmental data. We tested the resulting scoring calculation on a test set of ten proteins and compared the identification results with those of other peptide mass fingerprinting programs. |
Database: | OpenAIRE |
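
The second step described in the abstract (matching measured peptide masses against database entries, then ranking candidates by a weighted score) can be illustrated with a small sketch. This is not the authors' algorithm: the `Candidate` class, the function names, the ppm tolerance, the way each term is normalized, and the weight values are all assumptions made for illustration. In the paper the weights were fitted on a training set rather than chosen by hand, and the masses below are made-up numbers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """Hypothetical database entry with precomputed theoretical peptide masses."""
    name: str
    peptide_masses: List[float]  # theoretical peptide masses in Da
    mol_weight: float            # protein molecular weight in kDa
    pi: float                    # isoelectric point


def count_matches(observed: List[float], theoretical: List[float],
                  tol_ppm: float = 100.0) -> int:
    """Count observed masses lying within tol_ppm of any theoretical mass."""
    hits = 0
    for m in observed:
        tol = m * tol_ppm / 1e6  # absolute tolerance in Da for this mass
        if any(abs(m - t) <= tol for t in theoretical):
            hits += 1
    return hits


def score(c: Candidate, observed: List[float], exp_mw: float, exp_pi: float,
          weights: Dict[str, float], tol_ppm: float = 100.0) -> float:
    """Weighted sum of three terms, each scaled to [0, 1] (assumed scaling)."""
    hits = count_matches(observed, c.peptide_masses, tol_ppm)
    mass_term = hits / len(observed) if observed else 0.0
    mw_term = 1.0 - min(abs(c.mol_weight - exp_mw) / exp_mw, 1.0)
    pi_term = 1.0 - min(abs(c.pi - exp_pi) / 14.0, 1.0)
    return (weights["mass"] * mass_term
            + weights["mw"] * mw_term
            + weights["pi"] * pi_term)


def rank(candidates: List[Candidate], observed: List[float], exp_mw: float,
         exp_pi: float, weights: Dict[str, float]) -> List[Candidate]:
    """Return candidates sorted from best to worst score."""
    return sorted(candidates,
                  key=lambda c: score(c, observed, exp_mw, exp_pi, weights),
                  reverse=True)


# Illustrative data only; none of these values come from the paper.
observed_masses = [1000.05, 1500.10, 2000.00]
candidates = [
    Candidate("B", [1000.0, 1700.0, 2300.0], mol_weight=60.0, pi=8.5),
    Candidate("A", [1000.0, 1500.0, 2000.0], mol_weight=42.0, pi=5.5),
]
weights = {"mass": 0.7, "mw": 0.2, "pi": 0.1}  # hand-picked, not fitted
ranking = rank(candidates, observed_masses, exp_mw=40.0, exp_pi=5.6,
               weights=weights)
```

A weighted sum is only one way to combine the parameters; it is used here because it makes the role of the weights explicit, which is the part the abstract says was optimized.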