1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life

Supratim Mukherjee, Rekha Seshadri, Neha J. Varghese, Emiley A. Eloe-Fadrosh, Jan P. Meier-Kolthoff, Markus Göker, R. Cameron Coates, Michalis Hadjithomas, Georgios A. Pavlopoulos, David Paez-Espino, Yasuo Yoshikuni, Axel Visel, William B. Whitman, George M. Garrity, Jonathan A. Eisen, Philip Hugenholtz, Amrita Pati, Natalia N. Ivanova, Tanja Woyke, Hans Peter Klenk, Nikos C. Kyrpides

Research output: Research - peer-review, Article

Abstract

We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.

Language: English (US)
Pages: 676-683
Number of pages: 8
Journal: Nature Biotechnology
Volume: 35
Issue number: 7
DOI: 10.1038/nbt.3886
State: Published - Jul 1 2017

Profile

Archaeal Genome
Bacterial Genomes
Genome
Genes
Encyclopedias
Archaea
Bacteria
Proteins
Metagenomics
phenazine
Direction compound

ASJC Scopus subject areas

  • Biotechnology
  • Bioengineering
  • Applied Microbiology and Biotechnology
  • Biomedical Engineering
  • Molecular Medicine

Cite this

Mukherjee, S., Seshadri, R., Varghese, N. J., Eloe-Fadrosh, E. A., Meier-Kolthoff, J. P., Göker, M., ... Kyrpides, N. C. (2017). 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life. Nature Biotechnology, 35(7), 676-683. DOI: 10.1038/nbt.3886

1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life. / Mukherjee, Supratim; Seshadri, Rekha; Varghese, Neha J.; Eloe-Fadrosh, Emiley A.; Meier-Kolthoff, Jan P.; Göker, Markus; Coates, R. Cameron; Hadjithomas, Michalis; Pavlopoulos, Georgios A.; Paez-Espino, David; Yoshikuni, Yasuo; Visel, Axel; Whitman, William B.; Garrity, George M.; Eisen, Jonathan A.; Hugenholtz, Philip; Pati, Amrita; Ivanova, Natalia N.; Woyke, Tanja; Klenk, Hans Peter; Kyrpides, Nikos C.

In: Nature Biotechnology, Vol. 35, No. 7, 01.07.2017, p. 676-683.

Mukherjee, S, Seshadri, R, Varghese, NJ, Eloe-Fadrosh, EA, Meier-Kolthoff, JP, Göker, M, Coates, RC, Hadjithomas, M, Pavlopoulos, GA, Paez-Espino, D, Yoshikuni, Y, Visel, A, Whitman, WB, Garrity, GM, Eisen, JA, Hugenholtz, P, Pati, A, Ivanova, NN, Woyke, T, Klenk, HP & Kyrpides, NC 2017, '1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life', Nature Biotechnology, vol. 35, no. 7, pp. 676-683. DOI: 10.1038/nbt.3886
Mukherjee S, Seshadri R, Varghese NJ, Eloe-Fadrosh EA, Meier-Kolthoff JP, Göker M et al. 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life. Nature Biotechnology. 2017 Jul 1;35(7):676-683. DOI: 10.1038/nbt.3886
Mukherjee, Supratim; Seshadri, Rekha; Varghese, Neha J.; Eloe-Fadrosh, Emiley A.; Meier-Kolthoff, Jan P.; Göker, Markus; Coates, R. Cameron; Hadjithomas, Michalis; Pavlopoulos, Georgios A.; Paez-Espino, David; Yoshikuni, Yasuo; Visel, Axel; Whitman, William B.; Garrity, George M.; Eisen, Jonathan A.; Hugenholtz, Philip; Pati, Amrita; Ivanova, Natalia N.; Woyke, Tanja; Klenk, Hans Peter; Kyrpides, Nikos C. / 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life. In: Nature Biotechnology. 2017; Vol. 35, No. 7. pp. 676-683.
@article{2c90eff8a8894a4a88d18a27ee5b81b2,
title = "1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life",
abstract = "We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.",
author = "Supratim Mukherjee and Rekha Seshadri and Varghese, {Neha J.} and Eloe-Fadrosh, {Emiley A.} and Meier-Kolthoff, {Jan P.} and Markus Göker and Coates, {R. Cameron} and Michalis Hadjithomas and Pavlopoulos, {Georgios A.} and David Paez-Espino and Yasuo Yoshikuni and Axel Visel and Whitman, {William B.} and Garrity, {George M.} and Eisen, {Jonathan A.} and Philip Hugenholtz and Amrita Pati and Ivanova, {Natalia N.} and Tanja Woyke and Klenk, {Hans Peter} and Kyrpides, {Nikos C.}",
year = "2017",
month = "7",
doi = "10.1038/nbt.3886",
volume = "35",
pages = "676--683",
journal = "Nature Biotechnology",
issn = "1087-0156",
publisher = "Nature Publishing Group",
number = "7",

}

TY - JOUR

T1 - 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life

AU - Mukherjee,Supratim

AU - Seshadri,Rekha

AU - Varghese,Neha J.

AU - Eloe-Fadrosh,Emiley A.

AU - Meier-Kolthoff,Jan P.

AU - Göker,Markus

AU - Coates,R. Cameron

AU - Hadjithomas,Michalis

AU - Pavlopoulos,Georgios A.

AU - Paez-Espino,David

AU - Yoshikuni,Yasuo

AU - Visel,Axel

AU - Whitman,William B.

AU - Garrity,George M.

AU - Eisen,Jonathan A.

AU - Hugenholtz,Philip

AU - Pati,Amrita

AU - Ivanova,Natalia N.

AU - Woyke,Tanja

AU - Klenk,Hans Peter

AU - Kyrpides,Nikos C.

PY - 2017/7/1

Y1 - 2017/7/1

AB - We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.

UR - http://www.scopus.com/inward/record.url?scp=85024405108&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85024405108&partnerID=8YFLogxK

U2 - 10.1038/nbt.3886

DO - 10.1038/nbt.3886

M3 - Article

VL - 35

SP - 676

EP - 683

JO - Nature Biotechnology

T2 - Nature Biotechnology

JF - Nature Biotechnology

SN - 1087-0156

IS - 7

ER -