The Heterogeneous Landscape and Early Evolution of Pathogen-Associated CpG Dinucleotides in SARS-CoV-2.

Autor: Di Gioacchino A; Laboratoire de Physique de l'Ecole Normale Supérieure, PSL & CNRS UMR8063, Sorbonne Université, Université de Paris, F-75005 Paris, France., Šulc P; School of Molecular Sciences and Center for Molecular Design and Biomimetics, The Biodesign Institute, Arizona State University, 1001 South McAllister Avenue, Tempe, Arizona 85281, USA., Komarova AV; Molecular Genetics of RNA viruses, Department of Virology, Institut Pasteur, CNRS UMR-3569, 75015 Paris, France., Greenbaum BD; Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, 1275 York Avenue New York, NY 10065., Monasson R; Laboratoire de Physique de l'Ecole Normale Supérieure, PSL & CNRS UMR8063, Sorbonne Université, Université de Paris, F-75005 Paris, France., Cocco S; Laboratoire de Physique de l'Ecole Normale Supérieure, PSL & CNRS UMR8063, Sorbonne Université, Université de Paris, F-75005 Paris, France.
Jazyk: angličtina
Zdroj: SSRN [SSRN] 2020 May 27, pp. 3611280. Date of Electronic Publication: 2020 May 27.
DOI: 10.2139/ssrn.3611280
Abstrakt: SARS-CoV-2 infection can lead to acute respiratory syndrome in patients, which can be due in part to dysregulated immune signalling. We analyze here the occurrences of CpG dinucleotides, which are putative pathogen-associated molecular patterns, along the viral sequence. Carrying out a comparative analysis with other ssRNA viruses and within the Coronaviridae  family, we find the CpG content of SARS-CoV-2, while low compared to other betacoronaviruses, widely fluctuates along its primary sequence. While the CpG relative abundance and its associated CpG force parameter are low for the spike protein (S) and comparable to circulating seasonal coronaviruses such as HKU1, they are much greater and comparable to SARS and MERS for the 3'-end of the viral genome. In particular, the nucleocapsid protein (N), whose transcripts are relatively abundant in the cytoplasm of infected cells and present in the 3'UTRs of all subgenomic RNA, has high CpG content. We speculate this dual  nature of CpG content can confer to SARS-CoV-2 high ability to both enter the host and trigger pattern recognition receptors (PRRs) in different contexts. We then investigate the evolution of synonymous mutations since the outbreak of the COVID-19 pandemic. Using a new application of selective forces on dinucleotides to estimate context driven mutational processes, we find that synonymous mutations seem driven both by the viral codon bias and by the high value of the CpG force in the N protein, leading to a loss in CpG content. Sequence motifs preceding these CpG-loss-associated loci match recently identified binding patterns of the Zinc Finger anti-viral Protein (ZAP) protein. Funding: This work was partially supported by the ANR19 Decrypted CE30-0021-01 grants. B.G. was supported by National Institutes of Health grants 7R01AI081848-04, 1R01CA240924-01, a Stand Up to Cancer - Lustgarten Foundation Convergence Dream Team Grant, and The Pershing Square Sohn Prize - Mark Foundation Fellow supported by funding from The Mark Foundation for Cancer Research.
Databáze: MEDLINE