Description: |
Summary: Background: Pairwise single nucleotide polymorphisms (SNPs) are a cornerstone of genomic approaches to the inference of transmission of multidrug-resistant (MDR) organisms in hospitals. However, the impact of many key analytical approaches on these inferences has not yet been systematically assessed. This study aims to make such a systematic assessment. Methods: We conducted a 15-month prospective study (2-month pilot phase, 13-month implementation phase) across four hospital networks including eight hospitals in Melbourne, VIC, Australia. Patient clinical and screening samples containing one or more isolates of meticillin-resistant Staphylococcus aureus, vancomycin-resistant Enterococcus faecium, and extended-spectrum β-lactamase-producing Escherichia coli and Klebsiella pneumoniae were collected and underwent whole genome sequencing. Using the genome data from the four most numerous sequence types from each species, 16 in total, we systematically assessed: (1) the impact of sample and reference genome diversity through multiple core genome alignments using different data subsets and reference genomes, (2) the effect of masking of prophage and regions of recombination in the core genome alignments by assessing SNP distances before and after masking, (3) the differences between a cumulative versus a 3-month sliding-window approach to sample genome inclusion in the dataset over time, and (4) the comparative effects of each of these approaches when applying a previously defined SNP threshold for inferring likely transmission. Findings: 2275 samples were collected (397 during the pilot phase from April 4 to June 18, 2017; 1878 during the implementation phase from Oct 30, 2017, to Nov 30, 2018) from 1870 patients.
Of these 2275 samples, 1537 were identified as arising from the four most numerous sequence types from each of the four target species of MDR organisms in this dataset (16 sequence types in total: S aureus ST5, ST22, ST45, and ST93; E faecium ST80, ST203, ST1421, and ST1424; K pneumoniae ST15, ST17, ST307, and ST323; and E coli ST38, ST131, ST648, and ST1193). Across the species, using a reference genome of the same sequence type provided a greater degree of pairwise SNP resolution, compared with species and outgroup-reference alignments that mostly resulted in inflated SNP distances and the possibility of missed transmission events. Omitting prophage regions had minimal effect; however, omitting recombination regions had a highly variable effect, often inflating the number of closely related pairs. Estimated SNP distances between isolate pairs over time were more consistent using a sliding-window than a cumulative approach. Interpretation: We propose that the use of a closely related reference genome, without masking of prophage or recombination regions, and of a sliding-window approach for isolate inclusion is best for accurate and consistent MDR organism transmission inference, when using core genome alignments and SNP thresholds. These approaches provide increased stability and resolution, so SNP thresholds can be more reliably applied for putative transmission inference among diverse MDR organisms, reducing the chance of incorrectly inferring the presence or absence of close genetic relatedness and, therefore, transmission. The establishment of a broadly applicable and standardised approach, as proposed here, is necessary to implement widespread prospective genomic surveillance for MDR organism transmission. Funding: Melbourne Genomics Health Alliance, and National Health and Medical Research Council of Australia. |