Abstract: |
This paper discusses the use of graph-theoretic methods for the representation and searching of three-dimensional patterns of side-chains in protein structures. The position of a side-chain is represented by pseudo-atoms, and the relative positions of pairs of side-chains by the distances between them. This description of the geometry can be represented by a labelled graph in which the nodes and the edges represent the pseudo-atoms and the sets of inter-pseudo-atomic distances, respectively. Given such a representation, a protein can be searched for the presence of a user-defined query pattern of side-chains by means of a subgraph-isomorphism algorithm, which is implemented in the program ASSAM. Experiments with one such algorithm, that of Ullmann, show that it provides a way of searching for patterns of side-chains that is both effective and highly efficient. The method is illustrated by searches for the serine protease catalytic triad, for residues involved in the catalytic activity of staphylococcal nuclease, and for the zinc-binding side-chains of thermolysin. The catalytic triad search revealed a second Asp-His-Ser triad-like arrangement of residues in trypsinogen and chymotrypsinogen, in addition to the catalytic residues. The program can also be used to search for hypothetical patterns, as is shown for a pattern of three tryptophan side-chains. These searches demonstrate that the search algorithm can successfully retrieve the great majority of the expected proteins, as well as other, previously unreported proteins that contain the pattern of interest.
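
The graph representation and pattern search described above can be illustrated with a minimal sketch. It is not the ASSAM implementation and it does not use Ullmann's refinement procedure; it is a plain backtracking subgraph search. It also assumes, for brevity, a single pseudo-atom per side-chain. The coordinates, the distance ranges, and the names `protein`, `query_labels`, `query_edges` and `find_matches` are all hypothetical and chosen only for illustration.

```python
from math import dist

# Hypothetical toy structure: (residue type, pseudo-atom coordinates in angstroms).
# Assumption: one pseudo-atom per side-chain, which simplifies the description.
protein = [
    ("ASP", (0.0, 0.0, 0.0)),
    ("HIS", (3.0, 0.0, 0.0)),
    ("SER", (3.0, 4.0, 0.0)),
    ("TRP", (20.0, 20.0, 20.0)),
]

# Query pattern: node labels plus an allowed distance range for each node pair.
# Edge keys are (i, j) with i < j; the ranges are made-up tolerances.
query_labels = ["ASP", "HIS", "SER"]
query_edges = {
    (0, 1): (2.5, 3.5),
    (1, 2): (3.5, 4.5),
    (0, 2): (4.5, 5.5),
}


def find_matches(protein, query_labels, query_edges):
    """Return all assignments of query nodes to protein residue indices."""
    matches = []

    def consistent(assignment, q_new, p_new):
        # The node label (residue type) must agree.
        if protein[p_new][0] != query_labels[q_new]:
            return False
        # Every query edge to an already placed node must be satisfied.
        for q_old, p_old in enumerate(assignment):
            if (q_old, q_new) in query_edges:
                low, high = query_edges[(q_old, q_new)]
                d = dist(protein[p_old][1], protein[p_new][1])
                if not low <= d <= high:
                    return False
        return True

    def extend(assignment):
        q_new = len(assignment)
        if q_new == len(query_labels):
            matches.append(tuple(assignment))
            return
        for p_new in range(len(protein)):
            if p_new not in assignment and consistent(assignment, q_new, p_new):
                extend(assignment + [p_new])

    extend([])
    return matches


matches = find_matches(protein, query_labels, query_edges)
print(matches)
```

Each match is a tuple giving, for every query node in order, the index of the protein residue assigned to it. Ullmann's algorithm reaches the same kind of result but prunes candidate assignments with a refinement step before and during the search, which this sketch omits.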