GEBA: A Genomic Encyclopedia of Bacteria and Archaea. Jonathan A. Eisen, JGI User Meeting 2009
“Nothing in biology makes sense except in the light of evolution.” T. Dobzhansky (1973)
 
rRNA Tree of Life
The Tree is not Happy
From http://genomesonline.org
At least 40 phyla of bacteria (as of 2002; based on Hugenholtz, 2002). Tree labels: Acidobacteria, Bacteroides, Fibrobacteres, Gemmimonas, Verrucomicrobia, Planctomycetes, Chloroflexi, Proteobacteria, Chlorobi, Firmicutes, Fusobacteria, Actinobacteria, Cyanobacteria, Chlamydia, Spirochaetes, Deinococcus-Thermus, Aquificae, Thermotogae, TM6, OS-K, Termite Group, OP8, Marine Group A, WS3, OP9, NKB19, OP3, OP10, TM7, OP1, OP11, Nitrospira, Synergistes, Deferribacteres, Thermodesulfobacteria, Chrysiogenetes, Thermomicrobia, Dictyoglomus, Coprothermobacter.
Same tree: at least 40 phyla of bacteria; genome sequences are mostly from three phyla.
Same tree: at least 40 phyla of bacteria; genome sequences are mostly from three phyla; some other phyla are only sparsely sampled.
Same tree: at least 40 phyla of bacteria; genome sequences are mostly from three phyla; some other phyla are only sparsely sampled; same trend in Archaea.
Need for Tree Guidance Well Established: common approach within some eukaryotic groups; many small projects funded to fill in some bacterial or archaeal gaps; phylogenetic gaps in bacterial and archaeal projects commonly lamented in the literature.
Tree figure with the same phylum labels: at least 40 phyla of bacteria; genome sequences are mostly from three phyla; some other phyla are only sparsely sampled. Solution I: sequence more phyla. NSF-funded Tree of Life Project: a genome from each of eight phyla (Eisen, Ward, Badger, Wu, Wu, et al.).
Bacterial aTOL Project Aims: improve resolution of deep branches in the bacterial tree; launch biological studies of these phyla and discover functional novelty; leverage data for interpreting environmental surveys.
T. roseum  genome
The Tree of Life is Still Angry
Within-Phylum Diversity Is Immense: each phylum represents billions of years of evolution; some have hundreds of major lineages; new lineages are being discovered all the time; most branches within most phyla have few or no genomes.
Major Lineages of Actinobacteria
Additional Impetus for Tree-Guided Projects: suggestion to sequence all bacteria and archaea in Bergey’s Manual (Stevens et al.); success in sequencing genomes from across the tree in animals; multiple government reports suggest a more systematic approach to sequencing is needed.
Tree figure with the same phylum labels (legend: well sampled phyla): at least 100 phyla of bacteria; genome sequences are mostly from three phyla; most phyla with cultured species are sparsely sampled; lineages with no cultured taxa are even more poorly sampled. Solution: use the tree to really fill gaps.
http://www.jgi.doe.gov/programs/GEBA/pilot.html
GEBA Pilot Project Overview: select 200 organisms using the tree; develop a high-throughput pipeline for strain growth and DNA preparation; sequence and finish 100; annotate, analyze, and release data; assess benefits of tree-guided sequencing.
GEBA Pilot I:  Selecting Targets
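The slides do not spell out how targets were ranked, so the following is only a minimal sketch of one common way to "select organisms using the tree": greedily pick the candidate that adds the most new branch length (phylogenetic diversity, PD) beyond the genomes already sequenced. The tree encoding, the function names, and the toy branch lengths are all assumptions made for illustration, not the GEBA pipeline.

```python
def path_to_root(tree, tip):
    """Return the branches (named by their child node) between tip and root."""
    branches = set()
    node = tip
    while node in tree:              # the root has no entry in `tree`
        branches.add(node)
        node = tree[node][0]         # move to the parent
    return branches


def greedy_pd_selection(tree, sequenced, candidates, n_select):
    """Greedily pick candidates that add the most uncovered branch length.

    tree: dict mapping node -> (parent, branch_length); hypothetical encoding.
    Returns a list of (tip, pd_gain) in the order chosen.
    """
    covered = set()
    for tip in sequenced:
        covered |= path_to_root(tree, tip)

    remaining = set(candidates)
    picks = []
    for _ in range(min(n_select, len(remaining))):
        def gain(tip):
            new_branches = path_to_root(tree, tip) - covered
            return sum(tree[b][1] for b in new_branches)

        best = max(sorted(remaining), key=gain)   # sorted() makes ties deterministic
        picks.append((best, gain(best)))
        covered |= path_to_root(tree, best)
        remaining.remove(best)
    return picks


# Toy tree (made-up branch lengths): A is already sequenced.
toy_tree = {
    "n1": ("root", 0.1), "n2": ("root", 0.5),
    "A": ("n1", 0.2), "B": ("n1", 0.1),
    "C": ("n2", 0.3), "D": ("n2", 0.4),
}
picks = greedy_pd_selection(toy_tree, sequenced=["A"],
                            candidates=["B", "C", "D"], n_select=2)
for tip, pd_gain in picks:
    print(f"{tip}: adds {pd_gain:.2f} branch length")
```

In the toy tree, B sits next to the sequenced genome A and adds little, while C and D lie on a long unsampled branch, so the greedy pass chooses from that clade first. Once one of them is picked, the shared internal branch is covered and the gain of its sister drops.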
 
 
 
 
 
GEBA Pilot II:  The Importance of Project Management
GEBA Project Flowchart (David Bruce, Lynne Goodwin et al.). Site key: 1 = PGF, 2 = LANL, 3 = ORNL. Boxes: GEBA Proposal; Scientific and Technical Review (1); Negotiate Scope of Work; Receive Starting Material (1); Project Initiation; Draft Sequencing and Assembly (1); Finish Sequencing and Assembly (2); IMG (1); Finish Annotation (3); Complete Genome GenBank Submission (1); Draft Annotation (3); Shotgun Genome GenBank Submission (1); IMG-ER (1); Gene-QA (1). Column headings: Sequencing, Annotation. Three "OK?" decision points.
GEBA Pilot III:  Partnership with DSMZ
GEBA Biggest Challenge: Getting DNA. Getting quality DNA is the biggest bottleneck. Solution: beg, borrow, and steal. DSMZ offered to do it for free; ATCC is doing a small number for a fee; in discussions with PCC and other collections.
 
Quantification gel of the genomic DNA isolated from Conexibacter woesei (DSM 14684T). Conexibacter woesei (DSM 14684T) was taken from the German Collection of Microorganisms and Cell Cultures (DSMZ). The genomic DNA was isolated using the Qiagen Genomic 500 DNA Kit (Qiagen 10262). The genomic DNA was 10-250 kb in size as determined by Pulsed Field Gel Electrophoresis (PFGE). The bulk of DNA had a size of 50-250 kb (see attached PFGE image). The DNA concentration is 500 ng/µl as estimated from the gel. Spectrophotometric measurements yielded a DNA concentration of 450 µg/ml; 300 µl of genomic DNA are shipped (150 µg).
Lanes: 1: c(λ-Marker) = 15 ng; 2: c(λ-Marker) = 30 ng; 3: c(λ-Marker) = 50 ng; 4: DNA Molecular Weight Marker II (Roche 236250); 5: DSM 13279, Collinsella stercoris; 6: DSM 43043, Intrasporangium calvum; 7: DSM 18053, Dyadobacter fermentans; 8: DSM 20476, Slackia heliotrinireducens; 9: DSM 18081, Patulibacter minatonensis; 10: DSM 14684, Conexibacter woesei; 11: DSM 11002, Dethiosulfovibrio peptidovorans; 12: DSM 11551, Halogeometricum borinquense; 13: DNA Molecular Weight Marker II (Roche 236250); 14: c(λ-Marker) = 125 ng; 15: c(λ-Marker) = 250 ng; 16: c(λ-Marker) = 500 ng.
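The shipped mass quoted on this slide can be checked with simple arithmetic. The sketch below (the function name is hypothetical) multiplies each stated concentration by the 300 µl volume; it suggests that the quoted 150 µg corresponds to the gel estimate (500 ng/µl), whereas the spectrophotometric value (450 µg/ml, that is 450 ng/µl) would give a somewhat lower mass.

```python
def dna_mass_ug(concentration_ng_per_ul, volume_ul):
    """Total DNA mass in micrograms for a given concentration and volume."""
    return concentration_ng_per_ul * volume_ul / 1000.0   # ng -> µg


volume_ul = 300                                # shipped volume stated on the slide
gel_mass = dna_mass_ug(500, volume_ul)         # gel estimate: 500 ng/µl
spectro_mass = dna_mass_ug(450, volume_ul)     # 450 µg/ml equals 450 ng/µl

print(f"gel estimate:     {gel_mass:.0f} µg")
print(f"spectrophotometer: {spectro_mass:.0f} µg")
```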
GEBA Pilot IV:  Sequencing, Annotation, Data Release
Current Status: >100 in progress; GEBA56 (focus of first paper); 34 finished genomes; 55 submitted to GenBank; released to the IMG-GEBA page and the JGI FTP site; all data is completely open for anyone to use.
IMG/GEBA http://img.jgi.doe.gov/cgi-bin/geba/main.cgi
Adopt a Microbe
GEBA Pilot IV: Assess Benefits of GEBA56. All genomes have some value, but what, if any, is the benefit of tree-guided sequencing over other selection methods?
Why Increase Taxonomic Coverage II? Gene discovery; annotation and functional prediction; metagenomic analysis; mechanisms of diversification; species phylogeny and classification.
 
Value of diverse genomes I: Gene discovery. Premise: new genomes frequently contain genetic novelty; the phylogenetic diversity of a genome should be correlated with novelty. Caveat: does lateral gene transfer wipe out the contribution of phylogenetic diversity to novelty?
Protein Family Rarefaction Curves: take a data set of multiple complete genomes; identify all protein families using MCL; plot # of genomes vs. # of protein families.
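The three steps on this slide can be sketched in a few lines, assuming the MCL clustering has already been run and each genome has been reduced to a set of protein family IDs. The function name, the averaging over random genome orderings, and the toy family IDs are assumptions for illustration only; they are not taken from the GEBA analysis.

```python
import random
from statistics import mean


def rarefaction_curve(genome_families, n_permutations=100, seed=0):
    """Mean number of distinct protein families after sampling k genomes.

    genome_families: dict mapping genome name -> set of family IDs
    (for example MCL cluster IDs). Entry k-1 of the result is the mean
    cumulative family count over random orderings of the genomes.
    """
    rng = random.Random(seed)
    genomes = list(genome_families)
    totals = [[] for _ in genomes]
    for _ in range(n_permutations):
        rng.shuffle(genomes)
        seen = set()
        for k, genome in enumerate(genomes):
            seen |= genome_families[genome]    # add families new to the sample
            totals[k].append(len(seen))
    return [mean(counts) for counts in totals]


# Toy input with made-up family IDs (not GEBA data).
toy = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam2", "fam3", "fam4"},
    "genome_C": {"fam5", "fam6", "fam7"},
}
curve = rarefaction_curve(toy)
for k, n_families in enumerate(curve, start=1):
    print(f"{k} genomes: {n_families:.2f} families on average")
```

A curve that keeps climbing steeply as genomes are added indicates that each new genome still contributes many unseen families, which is the pattern the comparison between closely related and phylogenetically diverse genome sets is meant to expose.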
 
Figure: rarefaction plot. Axis labels: Genome Number; Total Gene Number / Number of proteins. Series: S. agalactiae, Enterobacteriaceae, Actinobacteria, Bacteria from GEBA project.
Novelty 2 - Structural Novelty: of the 17000 protein families in the GEBA56, 1800 are novel in sequence (Wu); structural modeling suggests many are structurally novel too (D'haeseleer); 372 are being crystallized by the PSI (Kerfeld).
Novelty 3: Diversity within known families
Transporter Profiles: Sebaldella termitidis ATCC 33386 has 2x the number of sugar PTS transporters of any other genome
Novelty 4: Unusual distribution patterns
Shotgun Sequencing Detects More Diversity than PCR-methods
First Bacterial Actin-Related Protein: first found by V. Kunin; structure analysis by Patrik D. et al.
Most Closely Related to ARP8
Value of 100 diverse genomes II: Annotation. Premise: increased phylogenetic coverage should improve our ability to annotate genes in other (e.g., reference/model) genomes.
Annotation Improves: conversion of hypotheticals into conserved hypotheticals; linking distantly related members of protein families; non-homology functional prediction methods.
Linking Protein Families Improved
Fusion Based Predictions Improved
Improving Rosetta Stone Predictions
Value of 100 diverse genomes III: Metagenomics. Premise: increased sampling of diverse genomes should improve many aspects of metagenomic analysis. To test: annotation, binning.
Metagenomic Annotation Improves (Slightly)
Compositional Binning Improves (Slightly)
Phylogenetic Binning Improves Slightly
Value of 100 diverse genomes V: Phylogeny
16S Says Hyphomonas Is in Rhodobacterales (Badger et al. 2005)
WGT Says It's Related to Caulobacterales (Badger et al. 2005)
 
 
GEBA - After the Pilot
PD of sequenced organisms
PD with GEBA
 
Tree figure with the same phylum labels (legend: well sampled phyla; poorly sampled; no cultured taxa): at least 40 phyla of bacteria; genome sequences are mostly from three phyla; most phyla with cultured species are sparsely sampled; lineages with no cultured taxa are even more poorly sampled.
Same tree (as of 2002; based on Hugenholtz, 2002): at least 40 phyla of bacteria; genome sequences are mostly from three phyla; some other phyla are only sparsely sampled; same trend in Viruses.
Same tree (as of 2002; based on Hugenholtz, 2002): at least 40 phyla of bacteria; genome sequences are mostly from three phyla; some other phyla are only sparsely sampled; same trend in Microbial Eukaryotes.
Same tree (scale bar 0.1; tree based on Hugenholtz (2002) with some modifications): need experimental studies from across the tree too.
 
MICROBES
A Happy Tree of Life
