Scan2S: Increasing the precision of PROSITE pattern motifs using secondary structure constraints

Lucy Skrabanek*, Masha Y. Niv

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Sequence signature databases such as PROSITE, which include protein pattern motifs indicative of a protein's function, are widely used for function prediction studies, cellular localization annotation, and sequence classification. Correct annotation relies on high precision of the motifs. We present a new and general approach for increasing the precision of established protein pattern motifs by including secondary structure constraints (SSCs). We use Scan2S, the first sequence motif-scanning program to optionally include SSCs, to augment PROSITE pattern motifs. The constraints were derived from either the DSSP secondary structure assignment or the PSIPRED predictions for PROSITE-documented true positive hits. The secondary structure-augmented motifs were scanned against all SwissProt sequences, for which secondary structure predictions were precalculated. Against this dataset, motifs with PSIPRED-derived SSCs exhibited improved performance over motifs with DSSP-derived constraints. The precision of 763 of the 782 PSIPRED-augmented motifs remained unchanged or increased compared to the original motifs; 26 motifs showed an absolute precision increase of 10-30%. We provide the complete set of augmented motifs and the Scan2S program at http://physiology.med.cornell.edu/go/scan2s. Our results suggest a general protocol for increasing the precision of protein pattern detection via the inclusion of SSCs.
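The idea of filtering pattern hits by secondary structure can be illustrated with a minimal sketch. This is not the Scan2S implementation or its constraint syntax, neither of which is described in the abstract. It assumes a simplified subset of the PROSITE pattern language (single residues, `[...]`, `{...}`, `x`, and `(n)` or `(n,m)` repeats), a three-state secondary structure string (`H`, `E`, `C`) aligned to the sequence, and a hypothetical constraint expressed as a regular expression over the matched span. The functions `prosite_to_regex` and `scan`, and the toy sequence, are invented for illustration.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a simplified PROSITE pattern into a Python regex.

    Handles only: single residues, [ABC], {ABC}, x, and (n) / (n,m) repeats.
    Anchors (< and >) and other syntax are deliberately not supported here.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Separate the residue specification from an optional repeat suffix.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":
            token = "."                        # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"    # any residue except those listed
        else:
            token = core                       # single residue or [ABC] class
        if lo is not None:
            token += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(token)
    return "".join(parts)

def scan(sequence, ss, pattern, ss_constraint=None):
    """Return (start, end) spans of pattern hits in `sequence`.

    `ss` is a secondary structure string of the same length as `sequence`.
    `ss_constraint` is a hypothetical regex that the secondary structure of
    the whole matched span must satisfy; None disables the filter.
    """
    # A lookahead lets overlapping hits be reported as well.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    hits = []
    for m in regex.finditer(sequence):
        start, end = m.start(1), m.end(1)
        if ss_constraint is None or re.fullmatch(ss_constraint, ss[start:end]):
            hits.append((start, end))
    return hits

# Toy data (invented): two sequence hits, only the first lies in a helix.
SEQ = "MKCAACLGGGCDDCVKK"
SS = "CCHHHHHCCCCCCCCCC"
PAT = "C-x(2)-C-[LIVM]."

unconstrained = scan(SEQ, SS, PAT)
constrained = scan(SEQ, SS, PAT, ss_constraint="H+")
```

If only the helical hit were a true positive, the unconstrained scan would have a precision of 1/2 on this toy input, while the constrained scan would keep the single true hit and discard the other. That is the kind of precision gain the abstract reports, here on made-up data only.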

Original language: American English
Pages (from-to): 1138-1147
Number of pages: 10
Journal: Proteins: Structure, Function and Genetics
Volume: 72
Issue number: 4
State: Published - Sep 2008

Keywords

  • Pattern
  • Protein motif
  • Regular expression
  • Secondary structure constraint
