Abstract
The challenge of identifying DNA regulatory sequences from sequence information alone has gained urgency with the rapid accumulation of new genes in the databases. While most predictive algorithms rely on multiple alignments of known binding sites, here we examine a novel approach based on structural information from the protein-DNA complex. Specific recognition between a protein and its DNA target has been shown to arise from stereochemical complementarity between the protein's amino acids and the DNA bases. The proposed computational scheme uses crystallographic information to define the set of amino acid-base contacts between the proteins of a given DNA-binding protein family and their DNA targets. The compatibility of a given protein with putative regulatory DNA sequences is then evaluated using knowledge-based parameters for amino acid-base interactions. With this procedure, gene upstream regions can be screened for potential binding sites of regulatory proteins. Predictions are demonstrated for the E. coli cyclic AMP receptor protein (CRP), which recognizes DNA via the helix-turn-helix motif, and for various Zif268-like proteins, which belong to the Cys2His2 zinc finger family. The advantages and limitations of this approach are discussed.
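The screening idea described above can be illustrated with a minimal sketch. It assumes a fixed contact map (which amino acid faces which position of the site) and a lookup table of amino acid-base scores. The names `CONTACTS`, `SCORES`, `score_site`, and `scan_upstream`, and all numeric values, are hypothetical placeholders chosen for illustration; they are not the contact sets or parameters used in the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical contact map: (amino acid, offset of the contacted base in the site).
# In the described scheme this would be derived from crystal structures.
CONTACTS: List[Tuple[str, int]] = [("R", 0), ("E", 1), ("R", 2)]

# Hypothetical knowledge-based scores for amino acid-base pairs.
# Pairs not listed are treated as neutral (0.0). Values are made up.
SCORES: Dict[Tuple[str, str], float] = {
    ("R", "G"): 2.0,
    ("E", "C"): 1.0,
}


def score_site(site: str,
               contacts: List[Tuple[str, int]],
               scores: Dict[Tuple[str, str], float]) -> float:
    """Sum the pair scores over all amino acid-base contacts for one candidate site."""
    return sum(scores.get((aa, site[pos]), 0.0) for aa, pos in contacts)


def scan_upstream(sequence: str,
                  contacts: List[Tuple[str, int]],
                  scores: Dict[Tuple[str, str], float],
                  width: int) -> List[Tuple[int, str, float]]:
    """Slide a window over the sequence and rank candidate sites by score."""
    hits = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        hits.append((start, site, score_site(site, contacts, scores)))
    # Highest-scoring candidate first.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    for start, site, score in scan_upstream("TTGCGAA", CONTACTS, SCORES, width=3):
        print(start, site, score)
```

With these placeholder values, the top-ranked window in `TTGCGAA` is `GCG` at offset 2. A real application would also scan the reverse strand and use contact maps and scores specific to the protein family.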
| Original language | English |
|---|---|
| Pages (from-to) | 139-150 |
| Number of pages | 12 |
| Journal | Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing |
| State | Published - 2001 |