The Functional Human C-Terminome

Authors: Sanguthevar Rajasekaran, Oniel Toledo, Steven B. Brooks, Michael W. Hedden, Vishal Thapar, Sean R. Williams, Justin Limtong, Roxanne P. David, Surbhi Sharma, Jacklyn M. Newsome, Martin R. Schiller, Nemanja Novakovic, Kenneth F. Lyon
Language: English
Publication year: 2016
Subject:
0301 basic medicine
Proteomics
Proteomes
lcsh:Medicine
Biochemistry
Database and Informatics Methods
Binding Analysis
Protein sequencing
Human proteome project
lcsh:Science
Databases
Protein

Peptide sequence
Genetics
Mammals
Multidisciplinary
Proteomic Databases
Genomics
Genomic Databases
Proteome
Vertebrates
Sequence Analysis
Cell Binding Assay
Research Article
Protein domain
Molecular Sequence Data
Sequence Databases
Computational biology
Biology
Research and Analysis Methods
Rodents
03 medical and health sciences
Protein Domains
Sequence Motif Analysis
Consensus sequence
Animals
Humans
Amino Acid Sequence
Molecular Biology Techniques
Sequencing Techniques
Gene
Molecular Biology
Chemical Characterization
lcsh:R
Organisms
Biology and Life Sciences
Proteins
Computational Biology
Genome Analysis
030104 developmental biology
Biological Databases
Amniotes
Human genome
lcsh:Q
Source: PLoS ONE
PLoS ONE, Vol 11, Iss 4, p e0152731 (2016)
ISSN: 1932-6203
Description: All translated proteins end with a carboxylic acid commonly called the C-terminus. Many short functional sequences (minimotifs) are located on or immediately proximal to the C-terminus. However, information about the function of protein C-termini has not been consolidated into a single source. Here, we built a new "C-terminome" database and web system focused on human proteins. Approximately 3,600 C-termini in the human proteome have a minimotif with an established molecular function. To help evaluate the function of the remaining C-termini in the human proteome, we inferred minimotifs identified by experimentation in rodent cells, predicted minimotifs based upon consensus sequence matches, and predicted novel highly repetitive sequences in C-termini. Predictions can be ranked by enrichment scores or Genomic Evolutionary Rate Profiling (GERP) scores, a measurement of evolutionary constraint. By searching for new anchored sequences on the last 10 amino acids of proteins in the human proteome, with lengths of 3 to 10 residues and up to 5 degenerate positions in the consensus sequences, we have identified new consensus sequences that predict instances in the majority of human genes. All of this information is consolidated into a database that can be accessed through a C-terminome web system with search and browse functions for minimotifs and human proteins. A known consensus sequence-based predicted function is assigned to nearly half the proteins in the human proteome. Weblink: http://cterminome.bio-toolkit.com.
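The anchored consensus search mentioned in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "anchored" means the match must end at the final residue, that a lowercase `x` marks a degenerate position, and that the length and degeneracy limits are those stated in the abstract. The function names and the toy sequences are hypothetical and invented for illustration.

```python
import re

# Assumption: lowercase 'x' marks a degenerate (any-residue) position.
def consensus_to_regex(consensus: str) -> re.Pattern:
    """Compile a consensus such as 'SxV' into a regex anchored at the C-terminus."""
    if not 3 <= len(consensus) <= 10:
        raise ValueError("consensus length must be 3 to 10 residues")
    if consensus.count("x") > 5:
        raise ValueError("at most 5 degenerate positions are allowed")
    body = "".join("[A-Z]" if c == "x" else re.escape(c) for c in consensus)
    # '$' anchors the match to the last residue of the sequence.
    return re.compile(body + "$")

def find_cterminal_matches(consensus: str, proteins: dict, window: int = 10) -> dict:
    """Return {protein name: matched C-terminal residues} for each hit."""
    pattern = consensus_to_regex(consensus)
    hits = {}
    for name, seq in proteins.items():
        tail = seq[-window:]  # only the last `window` residues are searched
        match = pattern.search(tail)
        if match:
            hits[name] = match.group(0)
    return hits

# Toy sequences, invented for illustration only (not real proteins).
toy_proteins = {
    "toy1": "MAAGKLLPQRSTESDV",
    "toy2": "MKKLLAAGGTTPPQRL",
    "toy3": "MSSV",
}

print(find_cterminal_matches("SxV", toy_proteins))
# → {'toy1': 'SDV', 'toy3': 'SSV'}
```

Restricting the search to the last 10 residues keeps each comparison cheap, so many candidate consensus sequences can be screened against a whole proteome in one pass.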
Database: OpenAIRE