HOCOMOCO in 2024: a rebuild of the curated collection of binding models for human and mouse transcription factors.

Authors: Vorontsov IE; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia., Eliseeva IA; Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Russia., Zinkevich A; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia.; Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119991 Moscow, Russia., Nikonov M; Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119991 Moscow, Russia., Abramov S; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia.; Altius Institute for Biomedical Sciences, 98121 Seattle, WA, USA., Boytsov A; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia.; Altius Institute for Biomedical Sciences, 98121 Seattle, WA, USA., Kamenets V; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia.; Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia.; Institute of Biochemistry and Genetics of the Ufa Federal Research Centre of the Russian Academy of Sciences, 450054 Ufa, Russia., Kasianova A; Skolkovo Institute of Science and Technology, 121205 Moscow, Russia.; Institute for Information Transmission Problems of the Russian Academy of Sciences, 127051 Moscow, Russia., Kolmykov S; Department of Computational Biology, Sirius University of Science and Technology, 354340 Sirius, Krasnodar region, Russia., Yevshin IS; Biosoft.Ru LLC, 630090 Novosibirsk, Russia., Favorov A; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia.; Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA., Medvedeva YA; Research Center of Biotechnology RAS, Russian Academy of Sciences, 119071 Moscow, Russia., Jolma A; Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada., Kolpakov F; Department of Computational Biology, Sirius University of Science and Technology, 354340 Sirius, Krasnodar region, Russia.; Bioinformatics Laboratory, Federal Research Center for Information and Computational Technologies, 630090 Novosibirsk, Russia., Makeev VJ; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia.; Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia.; Institute of Biochemistry and Genetics of the Ufa Federal Research Centre of the Russian Academy of Sciences, 450054 Ufa, Russia., Kulakovskiy IV; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia.; Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Russia.; Laboratory of Regulatory Genomics, Institute of Fundamental Medicine and Biology, Kazan Federal University, 420008 Kazan, Russia.
Language: English
Source: Nucleic acids research [Nucleic Acids Res] 2024 Jan 05; Vol. 52 (D1), pp. D154-D163.
DOI: 10.1093/nar/gkad1077
Abstract: We present a major update of the HOCOMOCO collection that provides DNA binding specificity patterns of 949 human transcription factors and 720 mouse orthologs. To make this release, we performed motif discovery in peak sets that originated from 14 183 ChIP-Seq experiments and reads from 2554 HT-SELEX experiments yielding more than 400 thousand candidate motifs. The candidate motifs were annotated according to their similarity to known motifs and the hierarchy of DNA-binding domains of the respective transcription factors. Next, the motifs underwent human expert curation to stratify distinct motif subtypes and remove non-informative patterns and common artifacts. Finally, the curated subset of 100 thousand motifs was supplied to the automated benchmarking to select the best-performing motifs for each transcription factor. The resulting HOCOMOCO v12 core collection contains 1443 verified position weight matrices, including distinct subtypes of DNA binding motifs for particular transcription factors. In addition to the core collection, HOCOMOCO v12 provides motif sets optimized for the recognition of binding sites in vivo and in vitro, and for annotation of regulatory sequence variants. HOCOMOCO is available at https://hocomoco12.autosome.org and https://hocomoco.autosome.org.
(© The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.)
Database: MEDLINE