Metadata retrieval from sequence databases with ffq.

Authors: Gálvez-Merchán Á; Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA., Min KHJ; Department of Computer Science and Electrical Engineering, Massachusetts Institute of Technology, Cambridge, MA 91125, USA., Pachter L; Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA.; Department of Computing and Mathematical Sciences, Pasadena, CA 91125, USA., Booeshaghi AS; Department of Mechanical Engineering, California Institute of Technology, Pasadena, CA 91125, USA.
Language: English
Source: Bioinformatics (Oxford, England) [Bioinformatics] 2023 Jan 01; Vol. 39 (1).
DOI: 10.1093/bioinformatics/btac667
Abstract: Motivation: Several genomic databases host data and metadata for an ever-growing collection of sequence datasets. While these databases have a shared hierarchical structure, there are no tools specifically designed to leverage it for metadata extraction.
Results: We present a command-line tool, called ffq, for querying user-generated data and metadata from sequence databases. Given an accession or a paper's DOI, ffq efficiently fetches metadata and links to raw data in JSON format. ffq's modularity and simplicity make it extensible to any genomic database exposing its data for programmatic access.
Availability and Implementation: ffq is free and open source, and the code can be found here: https://github.com/pachterlab/ffq.
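Illustrative sketch (not part of the published abstract, and not ffq's own code): the abstract describes taking an accession or DOI, locating it in a shared database hierarchy, and returning JSON with links to raw data. The Python below shows one way that idea could look, assuming the familiar SRA/ENA/DDBJ and GEO accession prefixes. The names `LEVELS`, `classify`, and `collect_urls`, the level labels, and the example record layout are all hypothetical.

```python
import re

# Assumed prefix-to-level mapping for illustration only; ffq's real table
# may differ and covers more databases.
LEVELS = [
    (re.compile(r"^[SED]RP\d+$"), "study"),
    (re.compile(r"^[SED]RS\d+$"), "sample"),
    (re.compile(r"^[SED]RX\d+$"), "experiment"),
    (re.compile(r"^[SED]RR\d+$"), "run"),
    (re.compile(r"^GSE\d+$"), "geo_series"),
    (re.compile(r"^GSM\d+$"), "geo_sample"),
    (re.compile(r"^10\.\d{4,9}/\S+$"), "doi"),
]


def classify(identifier):
    """Return the hierarchy level an identifier appears to belong to, or None."""
    ident = identifier.strip()
    for pattern, level in LEVELS:
        if pattern.match(ident):
            return level
    return None


def collect_urls(record):
    """Recursively gather every string stored under a 'url' key in nested JSON."""
    urls = []
    if isinstance(record, dict):
        for key, value in record.items():
            if key == "url" and isinstance(value, str):
                urls.append(value)
            else:
                urls.extend(collect_urls(value))
    elif isinstance(record, list):
        for item in record:
            urls.extend(collect_urls(item))
    return urls


# Hypothetical nested metadata record; the accessions and URL are placeholders.
example = {
    "accession": "SRX0000001",
    "runs": {
        "SRR0000001": {
            "files": [{"url": "ftp://example.org/SRR0000001.fastq.gz"}]
        }
    },
}

level = classify(example["accession"])
links = collect_urls(example)
```

Because the nesting mirrors the study, sample, experiment, run hierarchy, a single recursive walk is enough to pull out raw-data links regardless of the level at which the query entered.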
(© The Author(s) 2023. Published by Oxford University Press.)
Database: MEDLINE