Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding

Authors: Ramkumar Lachumanan, Richard Hall, Vinod Scaria, Gregory Vurture, Xueyan Shen, László Orbán, Matthew Boitano, Lawrence S. Hon, Woei Chang Liew, Si Lok, James P Drake, Tamas Dalmay, Dean R. Jerry, Vinaya Kumar Katneni, Junhui Jiang, Alan Christoffels, Andrey A. Yurchenko, Peter van Heusden, Inna S. Kuznetsova, Fritz J. Sedlazeck, Kathiresan Purushothaman, Tyler Garvin, Aleksey Komissarov, Sai Rama Sridatta Prakki, Michael C. Schatz, Amy Hin Yan Tong, Natascha May Thevasagayam, Stephen Turner, Gopikrishna Gopalapillai, Marsel R. Kabilov, Tansyn Noble, Heiner Kuhl, Jonas Korlach, Chen-Shan Chin, Doreen Lau, Stanley Kimbung Mbandi, Vladimir A. Trifonov, Sridhar Sivasubbu, Shubha Vij, Simon Moxon, Siddharth Singh, Darrell Green, Si Yan Ngoh, Jolly M. Saju, Sarah Mwangi, Mario Jonas, Stephen J. O'Brien, Alexey E. Tupikin
Language: English
Year of publication: 2016
Subject:
Source: PLoS Genetics, Vol 12, Iss 4, p e1005954 (2016)
Description: We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping, and shared synteny from closely related fish species to derive a chromosome-level assembly, with a contig N50 over 1 Mb and a scaffold N50 over 25 Mb, that spans ~90% of the genome. The population structure of the L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades, with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.
Author Summary: We describe the genome assembly of the Asian seabass (Lates calcarifer), a marine teleost with aquaculture relevance. Though >500 eukaryotic genome sequences are available in public repositories, the majority are highly fragmented, incomplete assemblies, which explains why considerable effort and resources are often spent on improving their quality after publication. In our study, we employed long-read sequencing combined with genetic and optical mapping and syntenic information to produce a chromosome-level assembly. The largely continuous genome assembly will be useful for comparative genomics and offers an opportunity to examine less-explored regions such as tandem repeats (the core component of centromeres and telomeres). In addition, the population structure of the species was analyzed based on low-coverage genome sequence information from 61 individuals representing diverse geographic locations stretching from North-Western India across South-East Asia and Australia to Papua New Guinea.
Database: OpenAIRE