Friday, March 6, 2009
A proposal for a field guide for microbes just like a field guide for birds. "A Genomic Encyclopedia of Bacteria and Archaea (GEBA)"
"There is a glaring gap in microbial genome sequence availability – the currently available genome sequences show a highly biased phylogenetic distribution compared to the extent of microbial diversity known today. This bias has resulted in major limitations in our knowledge of microbial genome complexity and our understanding of the evolution, physiology and metabolic capacity of microbes. Although there have been small efforts in sequencing genomes from across the tree of life for microbes, there are no systematic efforts. There are many reasons why phylogenetic based sequencing in theory should be of great benefit including: (a) improved identification of protein families and orthology groups across species, which will improve annotation of other microbial genomes (b) improved phylogenetic anchoring of metagenomic data, (c) gene discovery (which tends to be maximized by selecting phylogenetically novel organisms, (d) a better understanding of the processes underlying the evolutionary diversification of microbes (e.g., lateral gene transfer and gene duplication) (e) a better understanding of the classification and evolutionary history of microbial species and (f) improved correlations of phenotype and genotype in microbes. Based on the potential benefits, we (JGI) have commenced a pilot project to create a Genomic Encyclopedia of Bacteria and Archaea (GEBA). In this pilot, we plan to sequence ~100 genomes selected based on their phylogenetic novelty. This is being done at two phylogenetic scales. About 60 of the genomes are from across the breadth of bacteria and archaea. The remaining 40 genomes are from within the Actinobacteria. By doing this two tiered selection we can test both the value of breadth from across the bacteria and archaea as well as the value of filling in the phylogenetic gaps within a single phyla. In my talk I will summarize the project and report on the sequencing and analysis of the first 56 genomes. 
I will discuss how we are using this pilot to test protocols that could be used for a scale-up of the GEBA project or for any other large-scale microbial sequencing project. In addition, I will discuss how collaborations with culture collections can be valuable in such a project. Finally, I will report on the results of tests of the value of phylogeny-based sequencing."