Bacteria are the unseen majority on our planet: they span millions of species and make up most of the living protoplasm. We propose a novel approach for reconstructing the composition of an unknown mixture of bacteria from a single Sanger-sequencing reaction of the mixture. Our method is based on compressive sensing theory, which deals with the reconstruction of a sparse signal from a small number of measurements. Exploiting the fact that in many cases a bacterial community comprises only a small subset of all known bacterial species, we show the feasibility of this approach for determining the composition of a bacterial mixture. Using simulations, we show that sequencing a few hundred base pairs of the 16S rRNA gene may provide enough information to reconstruct mixtures containing tens of species, out of tens of thousands, even in the presence of realistic measurement noise. Finally, we show promising initial results when applying our method to the reconstruction of a toy experimental mixture of five species. Our approach may offer a simple and efficient way to identify bacterial species compositions in biological samples. Availability: supplementary information, data and MATLAB code are available at http://www.broadinstitute.org/~orzuk/publications/BCS.
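The sparse-reconstruction idea in the abstract can be illustrated with a small sketch. This is not the authors' MATLAB code. It assumes an idealised linear model in which the measured signal at each sequence position is the weighted sum of the bases contributed by each species, and it recovers non-negative species weights with L1-regularised least squares solved by a plain proximal gradient loop (a technique chosen here for simplicity). The random "reference database" and all function names and parameters (`random_sequences`, `mix_signal`, `reconstruct`, `lam`, `n_iter`) are hypothetical.

```python
import random

def random_sequences(n_species, length, rng):
    """Hypothetical stand-in for a 16S reference database: random bases 0..3."""
    return [[rng.randrange(4) for _ in range(length)] for _ in range(n_species)]

def mix_signal(seqs, weights, noise_sd=0.0, rng=None):
    """Idealised mixture measurement: per-position, per-base weighted counts."""
    y = [[0.0] * 4 for _ in range(len(seqs[0]))]
    for seq, w in zip(seqs, weights):
        if w:
            for p, b in enumerate(seq):
                y[p][b] += w
    if noise_sd and rng is not None:
        for row in y:
            for c in range(4):
                row[c] += rng.gauss(0.0, noise_sd)  # additive Gaussian noise
    return y

def residual(seqs, x, y):
    """Return A x - y, where column j of A is the one-hot encoding of seqs[j]."""
    r = [[-v for v in row] for row in y]
    for seq, w in zip(seqs, x):
        if w > 0.0:
            for p, b in enumerate(seq):
                r[p][b] += w
    return r

def reconstruct(seqs, y, lam=0.05, n_iter=3000):
    """Minimise 0.5*||A x - y||^2 + lam*sum(x) subject to x >= 0."""
    n, length = len(seqs), len(seqs[0])
    # Safe step size: lambda_max(A^T A) <= ||A||_F^2 = n * length for one-hot columns.
    step = 1.0 / (n * length)
    x = [0.0] * n
    for _ in range(n_iter):
        r = residual(seqs, x, y)  # computed once per iteration from the old x
        for j, seq in enumerate(seqs):
            grad = sum(r[p][b] for p, b in enumerate(seq)) + lam
            x[j] = max(0.0, x[j] - step * grad)  # gradient step, then project to x >= 0
    return x

def demo():
    rng = random.Random(0)
    seqs = random_sequences(80, 20, rng)       # 80 candidate species, 20 positions
    truth = {5: 0.5, 17: 0.3, 42: 0.2}         # a sparse three-species mixture
    weights = [truth.get(j, 0.0) for j in range(len(seqs))]
    y = mix_signal(seqs, weights)
    x_hat = reconstruct(seqs, y)
    top = sorted(range(len(seqs)), key=lambda j: -x_hat[j])[:3]
    print("largest recovered weights:", [(j, round(x_hat[j], 3)) for j in top])

if __name__ == "__main__":
    demo()
```

In this toy setup there are more candidate species (80) than independent measurements, so the system is underdetermined and sparsity is what makes reconstruction plausible. Whether the true species receive the largest weights depends on sequence length, database size, similarity between sequences, and noise; real 16S sequences are far more similar to one another than the random strings used here.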
Title of host publication: Research in Computational Molecular Biology - 15th Annual International Conference, RECOMB 2011, Proceedings
Editors: Vineet Bafna, S. Cenk Sahinalp
Published: 2011
Conference: 15th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2011 - Vancouver, Canada
Duration: 28 Mar 2011 → 31 Mar 2011
Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Bibliographical note
Publisher Copyright:
© 2011, Springer-Verlag Berlin Heidelberg.