Sequence alignment

Author: Chia, Nicholas Lee-Ping
Language: English
Year of publication: 2006
Subject:
Document type: Text
Description: Sequence alignment is the most widely used tool in molecular biology today. Its uses range from the identification of new genes to the construction of phylogenetic trees. An integral part of any sequence alignment is the interpretation of the resulting scores, that is, the assessment of their statistical significance. This task requires that we first understand the distribution of scores between pairs of random sequences. This problem is statistical in nature, and thus can be approached using methods from statistical physics. In this thesis, I explore three problems related to the significance assessment of sequence alignment scores. The first two address issues in sequence alignment directly, while the third focuses on the Asymmetric Exclusion Process, a physical system that can be mapped onto a sequence alignment problem. This third work grew out of the close relationship between the statistics of the Asymmetric Exclusion Process and my other work on the statistical properties of sequence alignments. More specifically, in the first topic, I examine the commonly used disordered-bonds (Bernoulli randomness) approximation and its effect on sequence alignment scores. Because this approximation is used in many theoretical studies of sequence alignment, it is important to examine the effect it has on the statistics of alignment scores. The second topic also concerns sequence alignment, specifically the rapid and accurate significance assessment for various scoring schemes. This is an important task for tools, such as PSI-BLAST, that iteratively adapt their scoring schemes in each round. In this work, I use theoretical results from both sequence alignment and physics, together with powerful computational tools, to characterize the distribution of rare (high) scoring events in sequence alignment. In doing so, I create a tool for rapid and accurate significance assessment. Some of the methods used for this study are also used in the third work on the Asymmetric Exclusion Process.
Database: Networked Digital Library of Theses & Dissertations