Popis: |
Molecular technologies are applied in epidemiology to distinguish strains of bacterial isolates. The aims of this thesis are to develop and apply methods for analysing molecular data from epidemiological studies of tuberculosis, in order to enhance our understanding of the evolution and spread of this disease. The specific goals are (1) to develop models for the evolution of molecular markers, (2) to develop a tool to visualise relationships among isolates in a sample, (3) to estimate mutation rates of molecular markers, and (4) to study the impact of homoplasy on inferences made using these molecular markers. Our methods focused on two popular technologies for strain differentiation used for tuberculosis: spoligotyping and VNTR typing. First, we developed a spacer deletion model to aid in the study of relationships among spoligotypes in a sample of tuberculosis isolates. Second, we estimated a parameter of genetic diversity that enabled the estimation of the mutation rates of spoligotypes and VNTR loci. Third, we developed models embedding the evolution of molecular markers in a stochastic process describing transmission in a population. We analysed computer simulations of these models to study homoplasy, comparing two models: one that counts new genotypes whenever distinct genotypes arise from distinct parent genotypes, and a second that classifies genotypes according to similarity in their molecular patterns. Analysis of the spacer deletion model (1) showed that a Zipf distribution applies to the deletion of spacers in spoligotypes and (2) led us to develop a new technique to visualise strain relationships in a sample of tuberculosis spoligotypes. Our estimate of the mutation rate of spoligotypes is comparable to those reported previously, while our estimate for the mutation rate of a VNTR locus is at least tenfold higher than reported in recent publications.
Finally, we found that homoplasy undermines inferences from data when too few loci are used (most mutation events are then homoplasious because there are too few possible states) or when the mutation rate is high (more mutation events provide more opportunities for homoplasy). Stochasticity makes the probability of homoplasy highly variable over time. The methods developed in this thesis can be adapted to other molecular markers in other organisms. |