Rediscovering secondary structures as network motifs - An unsupervised learning approach

Barak Raveh*, Ronen Basri, Gideon Schreiber

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


Motivation: Secondary structures are key descriptors of a protein fold and its topology. In recent years, they have facilitated intensive computational tasks such as finding structural homologues, fold prediction and protein design. Their popularity stems from an appealing regularity in patterns of geometry and chemistry. However, the definition of secondary structures is subjective in nature. An unsupervised de-novo discovery of these structures would shed light on their nature and improve the way we use them in structural bioinformatics algorithms.

Methods: We developed a new method for unsupervised partitioning of undirected graphs, based on patterns of small recurring network motifs. Our input was the network of all H-bonds and covalent interactions of protein backbones. This method can also be used for other biological and non-biological networks.

Results: In a fully unsupervised manner, and without assuming any explicit prior knowledge, we were able to rediscover the existence of conventional α-helices, parallel β-sheets, anti-parallel sheets and loops, as well as various non-conventional hybrid structures. The relation between connectivity and crystallographic temperature factors establishes the existence of novel secondary structures.
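The idea in the Methods paragraph, grouping graph nodes by the small motifs they take part in, can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes an idealized helix-like backbone in which residue i is covalently linked to i+1 and hydrogen-bonded to i+4, uses short simple cycles as a stand-in for "network motifs", and groups residues that share the same cycle-count signature. All function names (`build_graph`, `cycles_through`, `partition_by_signature`) are hypothetical.

```python
from collections import defaultdict


def build_graph(n, hbonds):
    """Undirected graph over residues 0..n-1.

    Covalent edges join (i, i+1); `hbonds` is a list of (a, b) residue
    pairs. Edge types are not distinguished (a simplifying assumption).
    """
    adj = defaultdict(set)
    for i in range(n - 1):
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    for a, b in hbonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def cycles_through(adj, start, length):
    """Count simple cycles with exactly `length` edges (length >= 3)
    that pass through `start`, by depth-first path enumeration."""
    count = 0
    stack = [(start, (start,))]
    while stack:
        node, path = stack.pop()
        for nxt in adj[node]:
            if nxt == start and len(path) == length:
                count += 1  # path closes into a cycle of the right size
            elif nxt not in path and len(path) < length:
                stack.append((nxt, path + (nxt,)))
    return count // 2  # every cycle is found once in each direction


def partition_by_signature(adj, lengths=(3, 4, 5)):
    """Group nodes whose cycle counts agree for every length in `lengths`."""
    groups = defaultdict(list)
    for v in sorted(adj):
        signature = tuple(cycles_through(adj, v, k) for k in lengths)
        groups[signature].append(v)
    return dict(groups)


# Toy input: 12 residues with helix-like i -> i+4 hydrogen bonds.
helix = build_graph(12, [(i, i + 4) for i in range(8)])
groups = partition_by_signature(helix)
for signature, residues in sorted(groups.items()):
    print(signature, residues)
```

In this toy graph, residues near the chain ends get different signatures from interior ones, which hints at how motif counts could separate regular regions from their boundaries. A real analysis would need H-bonds derived from atomic coordinates and a proper clustering step, neither of which is sketched here.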

Original language: American English
Pages (from-to): e163-e169
Issue number: 2
State: Published - 2007
Externally published: Yes

Bibliographical note

Funding Information:
This work was partially funded by the Israel Ministry of Science and Technology. O.R. thanks Yossi Shaul for fruitful discussions about the problem of assigning secondary structures.

