BMC Biology

Authors: Eric Tannier, Igor V. Sharakhov, Sèverine Bérard, Robert M. Waterhouse, Matthew W. Hahn, Maria F. Unger, Maria V. Sharakhova, Sergey Koren, Scott J. Emrich, Yoann Anselmetti, Adam M. Phillippy, Ashley Peery, Jiyoung Lee, Sergey Aganezov, Cedric Chauve, Livio Ruzzante, Max A. Alekseyev, Paul I. Howell, Daniel Lawson, Simo V. Zhang, Romain Feron, Nora J. Besansky, Maarten J.M.F. Reijnders, Phillip George, Maryam Kamali, Gareth Maslen
Contributors: Université de Lausanne (UNIL); Princeton University; Institut des Sciences de l'Evolution de Montpellier (UMR ISEM); École Pratique des Hautes Études (EPHE); Université Paris sciences et lettres (PSL); Université de Montpellier (UM); Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad); Centre National de la Recherche Scientifique (CNRS); Institut de recherche pour le développement [IRD] : UR226; Virginia Polytechnic Institute and State University [Blacksburg]; Virginia Tech [Blacksburg]; Indiana University [Bloomington]; Indiana University System; Centers for Disease Control and Prevention [Atlanta] (CDC); National Biodefense Analysis and Countermeasures Center [Frederick]; U.S. Social Security Administration; European Bioinformatics Institute [Hinxton] (EMBL-EBI); EMBL Heidelberg; The Wellcome Trust Sanger Institute [Cambridge]; Department of Entomology [Blacksburg]; National Human Genome Research Institute (NHGRI); Artificial Evolution and Computational Biology (BEAGLE); Laboratoire de Biométrie et Biologie Evolutive - UMR 5558 (LBBE); Université Claude Bernard Lyon 1 (UCBL); Université de Lyon; Institut National de Recherche en Informatique et en Automatique (Inria); VetAgro Sup - Institut national d'enseignement supérieur et de recherche en alimentation, santé animale, sciences agronomiques et de l'environnement (VAS); Inria Grenoble - Rhône-Alpes; Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS); Institut National des Sciences Appliquées de Lyon (INSA Lyon); Institut National des Sciences Appliquées (INSA); École Centrale de Lyon (ECL); Université Lumière - Lyon 2 (UL2); University of Notre Dame [Indiana] (UND); Georgetown University [Washington] (GU); Department of Mathematics [Burnaby] (SFU); Simon Fraser University (SFU.ca); Department of Computer Science, University of Tennessee; Tennessee State University
Funding: Physical mapping and PacBio sequencing of A. funestus were supported by the US National Institutes of Health (NIH) National Institute of Allergy and Infectious Diseases (NIAID) grant R21 AI112734 to NJB, with SJE and IVS as co-investigators. IVS was supported by the US NIH NIAID grants R21AI099528 and R21AI135298 and by the US Department of Agriculture National Institute of Food and Agriculture Hatch project 223822. SA and MAA were supported by the US National Science Foundation (NSF) grant IIS-1462107. SA was supported by the US NSF grants CCF-1053753 and DBI-1350041 and by US NIH grants U24CA211000 and R01-HG006677. YA, SB, and ET were supported by the French Agence Nationale pour la Recherche Ancestrome project ANR10-BINF-01-01. SK and AMP were supported by the Intramural Research Program of the NIH National Human Genome Research Institute 1ZIAHG200398. CC was supported by a Mitacs Globalink grant, the Natural Sciences and Engineering Research Council of Canada Discovery Grant RGPIN-249834, and a resource allocation from Compute Canada. MWH and SVZ were supported by US NSF grant DEB-1249633. RMW, LR, MJMFR, and RF were supported by Novartis Foundation for medical-biological research grant #18B116 and Swiss National Science Foundation grant PP00P3_170664.
Language: English
Year of publication: 2020
Subject:
Bioinformatics
Computational biology
Biology
Genome
Synteny
Chromosomes
Mosquito genomes
03 medical and health sciences
0302 clinical medicine
Chromosome (genetic algorithm)
Gene synteny
Biotechnology
Plant Science
General Biochemistry, Genetics and Molecular Biology
Developmental Biology
Cell Biology
Physiology
Ecology, Evolution, Behavior and Systematics
Structural Biology
General Agricultural and Biological Sciences
Orthology
Anopheles
Animals
Comparative genomic analysis
lcsh:QH301-705.5
Gene
030304 developmental biology
Whole genome sequencing
Comparative genomics
0303 health sciences
Genome assembly
Chromosome Mapping
Genomics
06 Biological Sciences
biology.organism_classification
Computational evolutionary biology
Biological Evolution
lcsh:Biology (General)
Genetic Techniques
[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]
030217 neurology & neurosurgery
Research Article
Physical mapping
Source: BMC Biology
BMC Biology, BioMed Central, 2020, 18 (1), pp.1-20. ⟨10.1186/s12915-019-0728-3⟩
ISSN: 1741-7007
DOI: 10.1186/s12915-019-0728-3
Description: Background: New sequencing technologies have lowered financial barriers to whole genome sequencing, but resulting assemblies are often fragmented and far from ‘finished’. Updating multi-scaffold drafts to chromosome-level status can be achieved through experimental mapping or re-sequencing efforts. Avoiding the costs associated with such approaches, comparative genomic analysis of gene order conservation (synteny) to predict scaffold neighbours (adjacencies) offers a potentially useful complementary method for improving draft assemblies. Results: We employed three gene synteny-based methods applied to 21 Anopheles mosquito assemblies to produce consensus sets of scaffold adjacencies. For subsets of the assemblies we integrated these with additional supporting data to confirm and complement the synteny-based adjacencies: six with physical mapping data that anchor scaffolds to chromosome locations, 13 with paired-end RNA sequencing (RNAseq) data, and three with new assemblies based on re-scaffolding or Pacific Biosciences long-read data. Our combined analyses produced 20 new superscaffolded assemblies with improved contiguities: seven for which assignments of non-anchored scaffolds to chromosome arms span more than 75% of the assemblies, and a further seven with chromosome anchoring including an 88% anchored Anopheles arabiensis assembly and, respectively, 73% and 84% anchored assemblies with comprehensively updated cytogenetic photomaps for Anopheles funestus and Anopheles stephensi. Conclusions: Experimental data from probe mapping, RNAseq, or long-read technologies, where available, all contribute to successful upgrading of draft assemblies. Our comparisons show that gene synteny-based computational methods represent a valuable alternative or complementary approach. Our improved Anopheles reference assemblies highlight the utility of applying comparative genomics approaches to improve community genomic resources.
Database: OpenAIRE