ProDaMa: an open source Python library to generate protein structure datasets

Authors: Manconi Andrea, Armano Giuliano
Language: English
Publication year: 2009
Subject:
Source: BMC Research Notes, Vol 2, Iss 1, p 202 (2009)
Document type: article
ISSN: 1756-0500
DOI: 10.1186/1756-0500-2-202
Description: Abstract. Background: The huge difference between the number of known sequences and known tertiary structures has justified the use of automated methods for protein analysis. Although a general methodology to solve these problems has not yet been devised, researchers are developing more accurate techniques and algorithms, whose training plays an important role in determining their performance. From this perspective, the training data used in experiments is particularly important, and researchers often have to generate specialized datasets that meet their requirements. Findings: To facilitate the task of generating specialized datasets, we devised and implemented ProDaMa, an open source Python library that provides classes for retrieving, organizing, updating, analyzing, and filtering protein data. Conclusion: ProDaMa has been used to generate specialized datasets useful for secondary structure prediction and to develop a collaborative web application aimed at generating and sharing protein structure datasets. The library, the related database, and the documentation are freely available at http://iasc.diee.unica.it/prodama.
Database: Directory of Open Access Journals