Current approaches for identifying and detecting transcription factor binding sites rely on an extensive set of known target genes. Here we describe a novel structure-based approach applicable to transcription factors with no prior binding data. Our approach combines sequence data and structural information to infer context-specific amino acid-nucleotide recognition preferences. These preferences are then used to predict binding sites for novel transcription factors from the same structural family. We apply our approach to the Cys2His2 zinc finger protein family and show that the learned DNA-recognition preferences are compatible with various experimental results. To demonstrate the potential of our algorithm, we use the learned preferences to predict binding site models for novel proteins from the same family. These models are then used in genomic scans to find putative binding sites of the novel proteins.
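The last two steps of the abstract (turning recognition preferences into a binding site model, then scanning a sequence with it) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the preference table `PREFS` holds made-up probabilities, the names `build_model`, `log_odds`, and `scan` are hypothetical, and the sketch assumes one DNA-contacting residue per binding-site position, a uniform background, and uppercase A/C/G/T input. The context-specific preferences described above would replace the single shared table used here.

```python
import math

# Hypothetical recognition preferences P(nucleotide | amino acid).
# The numbers are invented for illustration, not learned values.
PREFS = {
    "R": {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    "N": {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    "D": {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
}


def build_model(contact_residues):
    """Build a position-specific model: one nucleotide distribution
    per DNA-contacting residue (simplifying assumption)."""
    return [PREFS[aa] for aa in contact_residues]


def log_odds(model, site, background=0.25):
    """Log2-odds score of a candidate site against a uniform background."""
    return sum(math.log2(col[nuc] / background) for col, nuc in zip(model, site))


def scan(model, sequence, threshold):
    """Slide the model along the sequence and report windows whose
    score reaches the threshold as (start, window, score) tuples."""
    width = len(model)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = log_odds(model, window)
        if score >= threshold:
            hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    model = build_model("RND")
    for start, window, score in scan(model, "TTGACTT", threshold=3.0):
        print(start, window, round(score, 2))
```

A real genomic scan would also score the reverse strand and choose the threshold from a background score distribution; both are omitted here to keep the sketch short.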
|Original language||American English|
|Number of pages||16|
|Journal||Lecture Notes in Computer Science|
|State||Published - 2005|
|Event||9th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2005 - Cambridge, MA, United States|
Duration: 14 May 2005 → 18 May 2005