Fast and accurate genome-scale identification of DNA-binding sites
Authors: | Vincent Maillol, Eric Rivals, David Martin |
---|---|
Contributors: | Méthodes et Algorithmes pour la Bioinformatique (MAB), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), Institut de Biologie Computationnelle (IBC), Institut National de la Recherche Agronomique (INRA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), ATGC bioinformatics platform., ANR-11-BINF-0002,IBC,Institut de Biologie Computationnelle de Montpellier(2011), ANR-06-MDCA-0014,PlasmoExplore,Fouille des données génomiques de Plasmodium falciparum pour prédire la fonction des gènes orphelins et identifier de nouvelles cibles thérapeutiques(2006), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Université de Montpellier (UM)-Institut National de la Recherche Agronomique (INRA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS), ANR-11-BINF-0002,IBC,Institut de biologie Computationnelle(2011) |
Language: | English |
Publication year: | 2018 |
Subject: |
0301 basic medicine
Computer science stringology binding sites Computational biology computer.software_genre Genome web ACM: H.: Information Systems/H.3: INFORMATION STORAGE AND RETRIEVAL/H.3.3: Information Search and Retrieval 03 medical and health sciences interactive Pattern matching Binding site Transcription factor genome transcription factor Whole genome sequencing search software motif Search engine indexing tool ACM: F.: Theory of Computation/F.2: ANALYSIS OF ALGORITHMS AND PROBLEM COMPLEXITY/F.2.2: Nonnumerical Algorithms and Problems/F.2.2.3: Pattern matching bioinformatics DNA binding site 030104 developmental biology ComputingMethodologies_PATTERNRECOGNITION pattern matching efficiency interface Web service [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM] computer transcriptome |
Source: | 12th International Conference on Bioinformatics and Biomedicine (BIBM), Dec 2018, Madrid, Spain. pp.201-205, ⟨10.1109/BIBM.2018.8621093⟩ |
DOI: | 10.1109/BIBM.2018.8621093 |
Description: | This is the author version of the article published in the conference proceedings. It includes supplementary information. Software called MOTIF is available on the ATGC bioinformatics platform.; International audience; Motivation: Discovering DNA binding sites in genome sequences is crucial for understanding genomic regulation. Currently available computational tools that find binding sites using Position Weight Matrices of known motifs are often applied only to restricted genomic regions because of their long run times. The ever-increasing number of complete genome sequences calls for a new generation of algorithms capable of processing large amounts of data. Results: Here we present MOTIF, a new algorithm that searches for transcription factor binding sites in whole genome sequences in a few seconds. We propose a web service that enables users to search for their own matrix or for multiple JASPAR matrices. Beyond its efficacy, the service properly handles undetermined positions within the genome sequence and provides an output that lists, for each position, the matching word and its score. Availability: MOTIF is freely available for use through an interface at http://www.atgc-montpellier.fr/motif. The source code of the stand-alone search method of MOTIF is freely available at https://gite.lirmm.fr/rivals/motif.git. It is written in C++ and tested on Linux platforms. |
Database: | OpenAIRE |
External link: |