Next Generation Sequencing-Data Analysis for Cellulose- and Xylan-Degrading Enzymes from Pome Metagenome
Authors: | Hamzah Mohd. Salleh, Ibrahim Ali Noorbatcha, Mohd Noor Mat Isa, Adibah Parman, Muhammad Alfatih Muddathir Abdelrahim, Oualid Abdelkader Bellag, Afidalina Tumian, Farah Fadwa Benbelgacem |
---|---|
Year of publication: | 2018 |
Subject: |
Multidisciplinary
Contig; Gene prediction; Sequence assembly; Engineering and technology; Computational biology; Environmental sciences; Biology; Nanoscience & nanotechnology; Natural sciences; DNA sequencing; De Bruijn graph; Fosmid; Metagenomics; Nano-technology; Gene; Earth and related environmental sciences |
Source: | Sains Malaysiana. 47:2951-2960 |
ISSN: | 0126-6039 |
Description: | A metagenomic DNA library from palm oil mill effluent (POME) was constructed and subjected to high-throughput screening to find genes encoding cellulose- and xylan-degrading enzymes. DNA from 30 positive fosmid clones was sequenced with next-generation sequencing technology, and the raw data (short-insert paired reads) were analyzed with bioinformatic tools. First, the quality of the raw data, 64,821,599 forward and reverse reads of 101 bp each, was assessed using FastQC and SOLEXA. The raw data were then filtered by trimming low-quality bases, discarding short reads, and removing vector sequences; the output was rechecked and the trimming repeated until a high-quality read set was obtained. The second step was de novo assembly of the reads with a de Bruijn graph algorithm, which reconstructed 2900 contigs. The pre-assembled contigs were then ordered, oriented, and spaced according to the estimated distances between them using SSPACE, yielding 2139 scaffolds. Gene prediction with Prodigal, followed by putative ID assignment with BLASTP against the NR protein database, identified 16,386 genes. An acceptable strategy for handling metagenomic NGS data to detect known and potentially unknown genes is presented, and we show the computational efficiency of the de Bruijn graph algorithm for de novo assembly in bioprospecting 21 genes encoding cellulose-degrading enzymes and 6 genes encoding xylan-degrading enzymes with 30.3% to 100% identity. |
Database: | OpenAIRE |
External link: |
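
The description names a de Bruijn graph algorithm for de novo assembly but does not say which assembler or k-mer size was used. The following is only a toy sketch of the general idea, assuming error-free reads and a single unbranched path; the function names `build_de_bruijn` and `walk_contig`, the example reads, and the choice `k=4` are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_contig(graph, start):
    """Extend a contig from `start` for as long as the path is unambiguous."""
    indegree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            indegree[node] += 1

    contig, node, seen = start, start, {start}
    # Stop at a dead end, a branch, a merge point, or a cycle.
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if indegree[nxt] != 1 or nxt in seen:
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


# Hypothetical overlapping reads (illustrative only, not data from the study).
reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
graph = build_de_bruijn(reads, k=4)
contig = walk_contig(graph, "ATG")
print(contig)
```

Real assemblers additionally handle sequencing errors, reverse complements, repeats, and paired-end information, which is why a separate scaffolding step such as the one performed with SSPACE is needed afterwards.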