Description: |
Oligonucleotide array hybridization allows one to read a DNA sequence by interpreting the pattern generated when the sequence hybridizes to an array of known oligonucleotides. Because of noise and cross-hybridization, there is uncertainty about the frequency of occurrence of each oligonucleotide in the sequence, and it is necessary to set up a model relating sequences to the data sets they are expected to generate. This model has parameters specifying the noise level and the pattern of cross-hybridization; we use the maximum entropy algorithm MemSys5 to estimate cross-hybridization parameters from data sets corresponding to known sequences, and to choose between different models. To determine an unknown sequence from a data set, a maximum entropy reconstruction provides estimates of the noise level and the frequency of occurrence of each oligonucleotide. This information allows candidate sequences to be evaluated against the data and probabilities to be assigned to them. |
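The evaluation of candidate sequences against a data set can be illustrated with a small sketch. This is not the MemSys5 maximum entropy procedure described above. It is a simplified stand-in that assumes independent Gaussian noise with a known level `sigma`, a single hypothetical cross-hybridization parameter `cross_hyb` (the fraction of signal a probe picks up from each single-mismatch oligonucleotide), and a uniform prior over the candidates. All function and parameter names are illustrative and do not come from the original work.

```python
from itertools import product
import math


def kmer_counts(sequence, k):
    """Count how often each length-k oligonucleotide occurs in the sequence.

    Assumes the sequence contains only the letters A, C, G and T.
    """
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    for i in range(len(sequence) - k + 1):
        counts[sequence[i:i + k]] += 1
    return counts


def expected_signal(counts, cross_hyb):
    """Predict the intensity at each probe (assumed toy model).

    Each probe sees its own oligonucleotide count plus a fraction
    `cross_hyb` of the count of every single-mismatch neighbour.
    """
    signal = {}
    for probe in counts:
        total = float(counts[probe])
        for pos in range(len(probe)):
            for base in "ACGT":
                if base != probe[pos]:
                    neighbour = probe[:pos] + base + probe[pos + 1:]
                    total += cross_hyb * counts[neighbour]
        signal[probe] = total
    return signal


def log_likelihood(data, sequence, k, cross_hyb, sigma):
    """Gaussian log-likelihood of the observed intensities given a sequence."""
    predicted = expected_signal(kmer_counts(sequence, k), cross_hyb)
    log_norm = math.log(sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for probe, value in predicted.items():
        residual = data[probe] - value
        total += -0.5 * (residual / sigma) ** 2 - log_norm
    return total


def candidate_probabilities(data, candidates, k, cross_hyb, sigma):
    """Normalised probabilities for the candidates under a uniform prior."""
    scores = [log_likelihood(data, s, k, cross_hyb, sigma) for s in candidates]
    best = max(scores)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(score - best) for score in scores]
    norm = sum(weights)
    return {s: w / norm for s, w in zip(candidates, weights)}


# Usage: simulate noise-free data from a known sequence, then rank two candidates.
true_sequence = "ACGTAC"
data = expected_signal(kmer_counts(true_sequence, 2), cross_hyb=0.1)
probabilities = candidate_probabilities(
    data, ["ACGTAC", "ACGTTC"], k=2, cross_hyb=0.1, sigma=0.5
)
```

In this toy setting the candidate that generated the data receives the larger probability. In the approach described above, the noise level and the cross-hybridization pattern are estimated from the data instead of being fixed by hand as they are here.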