An Integrated Approach for RNA-seq Data Normalization
Authors: | Zhide Fang, Donald E. Mercante, Kun Zhang, Shengping Yang |
---|---|
Language: | English |
Year of publication: | 2016 |
Subject: |
0301 basic medicine; Normalization (statistics); Cancer Research; RNA-Seq; Genomics; Computational biology; Bioinformatics; 01 natural sciences; lcsh:RC254-282; DNA sequencing; Database normalization; 010104 statistics & probability; 03 medical and health sciences; Gene expression; Medicine; DNA copy number alterations; 0101 mathematics; Gene; Original Research; business.industry; lcsh:Neoplasms. Tumors. Oncology. Including cancer and carcinogens; Housekeeping gene normalization; 030104 developmental biology; Oncology; RNA-seq; business |
Source: | Cancer Informatics, Vol 15, Pp 129-141 (2016) |
ISSN: | 1176-9351 |
Description: | Background: DNA copy number alteration is common in many cancers. Studies have shown that insertion or deletion of DNA sequences can directly alter gene expression, and that DNA copy number and gene expression are significantly correlated. Data normalization is a critical step in the analysis of gene expression data generated by RNA-seq technology. Successful normalization reduces or removes unwanted nonbiological variation in the data while keeping meaningful information intact. However, as far as we know, no attempt has been made to adjust for the variation due to DNA copy number changes in RNA-seq data normalization. Results: In this article, we propose an integrated approach for RNA-seq data normalization. Comparisons show that the proposed normalization can improve power for downstream detection of differentially expressed genes and generate more biologically meaningful results in gene profiling. In addition, our findings show that, because of the effects of copy number changes, some housekeeping genes are not always suitable internal controls for studying gene expression. Conclusions: By using information from DNA copy number, the integrated approach successfully reduces noise due to both biological and nonbiological causes in RNA-seq data, thus increasing the accuracy of gene profiling. |
Database: | OpenAIRE |