Blaxter 2022-annotated
PERSPECTIVE
Why sequence all eukaryotes?
Mark Blaxter (a,1), John M. Archibald (b), Anna K. Childers (c), Jonathan A. Coddington (d),
Keith A. Crandall (e,f), Federica Di Palma (g), Richard Durbin (a,h), Scott V. Edwards (i,j),
Jennifer A. M. Graves (k,l), Kevin J. Hackett (m), Neil Hall (n), Erich D. Jarvis (o,p), Rebecca N. Johnson (q),
Elinor K. Karlsson (r,s), W. John Kress (t), Shigehiro Kuraku (u,v), Mara K. N. Lawniczak (a),
Kerstin Lindblad-Toh (s,w), Jose V. Lopez (x,y), Nancy A. Moran (z), Gene E. Robinson (aa,bb),
Oliver A. Ryder (cc,dd), Beth Shapiro (ee), Pamela S. Soltis (ff,gg), Tandy Warnow (hh), Guojie Zhang (ii,jj), and
Harris A. Lewin (kk,ll)
Edited by Joan Strassmann, Washington University in St. Louis, St. Louis, MO; received September 10, 2021; accepted November 1, 2021
Life on Earth has evolved from initial simplicity to the astounding complexity we experience today. Bacteria
and archaea have largely excelled in metabolic diversification, but eukaryotes additionally display abundant
morphological innovation. How have these innovations come about and what constraints are there on the
origins of novelty and the continuing maintenance of biodiversity on Earth? The history of life and the code
for the working parts of cells and systems are written in the genome. The Earth BioGenome Project has
proposed that the genomes of all extant, named eukaryotes (about 2 million species) should be sequenced to
high quality to produce a digital library of life on Earth, beginning with strategic phylogenetic, ecological,
and high-impact priorities. Here we discuss why we should sequence all eukaryotic species, not just a
representative few scattered across the many branches of the tree of life. We suggest that many questions of
evolutionary and ecological significance will only be addressable when whole-genome data representing
divergences at all of the branchings in the tree of life or all species in natural ecosystems are available. We
envisage that a genomic tree of life will foster understanding of the ongoing processes of speciation,
adaptation, and organismal dependencies within entire ecosystems. These explorations will resolve long-standing
problems in phylogenetics, evolution, ecology, conservation, agriculture, bioindustry, and medicine.
genome | diversity | ecology | evolution | conservation
(a) Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom;
(b) Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS B3H 4H7, Canada;
(c) Bee Research Laboratory, Agricultural Research Service, US Department of Agriculture (USDA), Beltsville, MD 20705;
(d) Global Genome Initiative, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560;
(e) Computational Biology Institute, Department of Biostatistics and Bioinformatics, George Washington University, Washington, DC 20052;
(f) Department of Invertebrate Zoology, Smithsonian Institution, Washington, DC 20013;
(g) School of Biological Sciences, University of East Anglia, Norwich NR4 7TJ, United Kingdom;
(h) Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom;
(i) Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138;
(j) Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138;
(k) School of Life Sciences, La Trobe University, Bundoora, VIC 3086, Australia;
(l) University of Canberra, Bruce, ACT 2617, Australia;
(m) Crop Production and Protection, Office of National Programs, Agricultural Research Service, USDA, Beltsville, MD 20705;
(n) Earlham Institute, Norwich, Norfolk NR4 7UZ, United Kingdom;
(o) Laboratory of the Neurogenetics of Language, The Rockefeller University, New York, NY 10065;
(p) Howard Hughes Medical Institute, Chevy Chase, MD 20815;
(q) National Museum of Natural History, Smithsonian Institution, Washington, DC 20560;
(r) Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605;
(s) Broad Institute of MIT and Harvard, Cambridge, MA 02142;
(t) Botany, National Museum of Natural History, Smithsonian Institution, Washington, DC 20013-7012;
(u) Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan;
(v) Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research, Kobe, Hyogo 650-0047, Japan;
(w) Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala 751 23, Sweden;
(x) Department of Biological Sciences, Halmos College of Arts and Sciences, Nova Southeastern University, Dania Beach, FL 33004;
(y) Guy Harvey Oceanographic Center, Dania Beach, FL 33004;
(z) Integrative Biology, University of Texas at Austin, Austin, TX 78712;
(aa) Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana–Champaign, Urbana, IL 61801;
(bb) Department of Entomology, University of Illinois at Urbana–Champaign, Urbana, IL 61801;
(cc) Conservation Genetics, Division of Biology, San Diego Zoo Wildlife Alliance, Escondido, CA 92027;
(dd) Department of Evolution, Behavior and Ecology, University of California, San Diego, La Jolla, CA 92039;
(ee) Department of Ecology and Evolutionary Biology, University of California,
Downloaded at UNIVERSIDADE FEDERAL DO RIO GRANDE DO NORTE on January 26, 2022
Santa Cruz, CA 95064;
(ff) Florida Museum of Natural History, University of Florida, Gainesville, FL 32611;
(gg) Biodiversity Institute, University of Florida, Gainesville, FL 32611;
(hh) Department of Computer Science, University of Illinois at Urbana–Champaign, Urbana, IL 61801;
(ii) Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen 2100, Denmark;
(jj) China National Genebank, Beijing Genomics Institute–Shenzhen, Shenzhen 518083, China;
(kk) Department of Evolution and Ecology, College of Biological Sciences, University of California, Davis, CA 95616; and
(ll) Department of Population Health and Reproduction, University of California, Davis, CA 95616
Author contributions: M.B., J.M.A., A.K.C., J.A.C., K.A.C., F.D.P., R.D., S.V.E., J.A.M.G., K.J.H., N.H., E.D.J., R.N.J., E.K.K., W.J.K., S.K.,
M.K.N.L., K.L.-T., J.V.L., N.A.M., G.E.R., O.A.R., B.S., P.S.S., T.W., G.Z., and H.A.L. wrote the paper.
The authors declare no competing interest.
This article is a PNAS Direct Submission.
This open access article is distributed under Creative Commons Attribution License 4.0 (CC BY).
1 To whom correspondence may be addressed. Email: mb35@sanger.ac.uk.
Published January 18, 2022.
Conservation: Focus of genomics expands to embrace all species, changing how we analyze their characteristics and interactions, and their capacity to adapt to rapid ecological change. Practical benefit: Protect biodiversity & mitigate effects of climate change.
Ecosystems: Ecological and genomic data are integrated for all species, uncovering coevolutionary relationships and interactions in natural and human-dominated ecosystems. Practical benefit: Support ecosystem services.
Symbiosis: By studying all participants in symbiotic relationships, unexpected connections between pests, parasites, and mutualist symbionts, including physiological dependencies, are discovered. Practical benefit: Build pandemic security and enhance human health.
Speciation: Completely sequenced lineages reveal diverse speciation processes, from complete separation to ongoing hybridization, and underlying genome variation. Practical benefit: Improve agricultural yields & facilitate climate adaptation.
Origins of novelty: Surveying little studied pockets of cellular diversity reveals how basic processes of life evolved, and uncovers genes, proteins, and metabolic pathways. Practical benefit: Enable synthetic biology and spur biotechnological innovation.
Fig. 1. Outcomes of the Earth BioGenome Project. Sequencing all eukaryotic life on Earth will transform our understanding of how life evolved,
help us build a sustainable future, provide biosecurity, and support an innovative bioeconomy. Here, we summarize the impact of these new
genomes on basic research, which is described in detail in the text. Some practical benefits of this project are captured in the gray circles.
“… every scrap of biological diversity is priceless, to be learned and cherished, and never to be surrendered without a struggle.”
E. O. Wilson (1)

Humans have classified the organisms of the natural world into groups by form and utility. In On the Parts of Animals, Aristotle (2) considered both form and function of animal organs and systems to ascertain deeper relationships between kinds.

DNA sequence data, have revealed the outline of the tree of life, but many of the details are yet to be discovered (6). Each species is a unique evolutionary experiment, the daughter of an unbroken lineage of successful experiments. To date, much of comparative genomics has centered on deep analysis of a short roster of species, focused around Homo sapiens. While this work has revealed many of the details of our and other species’ functioning and evolutionary history, each new genome has brought new insights and it is clear that our knowledge is lim-
Sexual reproduction is a deep common thread in eukaryotes. It likely evolved to permit efficient mixing of alleles at linked loci to colonize new niches and escape parasites and pathogens (61). Asexual lineages are threatened by fitness degradation through accumulation of deleterious mutations (Muller’s ratchet) and are short-lived in phylogenetic terms, and putatively ancient asexuals appear to undergo rare sexual reproduction (62, 63). Some protist groups, such as ciliates, have multiple equivalent mating types, but most multicellular organisms have two sexes. Production of differentiated haploid gametes (large eggs and

event. Even in the absence of hybridization, variation at some loci remains shared (incomplete lineage sorting) typically for thousands to millions of generations even after total separation (20). The genetics and genomics of speciation mechanisms range from single loci of large effect, through inversions that suppress recombination or generate Haldane’s rule effects on the heterogametic sex, to genomes that have fully diverged in allopatry. Specific questions on speciation that can be addressed by complete sequencing of eukaryotic lineages include:
preservation of species and interventions that maintain balance within ecosystems, and will drive effective, data-driven ecosystem conservation (82).

Inventing New Tools and Resources
Historically, genome sequencing and assembly have been skilled labors, each polished genome the product of years of human effort (83). This has to change, without compromising on quality. While already routine for small bacterial and viral genomes, it is only now becoming possible to generate near-complete and

including the complex reticulations that endosymbiosis, horizontal transfer, hybridization, and introgression have created. Complete genome assemblies enable a broader and more complete understanding of a species’ biology, contributing to a lessened risk of extinction. Within the unifying model of this phylogenetic network, the genomes and the genes they possess will enable understanding of regulatory networks and trait evolution, the dynamics of coevolution between genes and between species, the impact of changing environments on species and populations, the mechanistic link between genotypes
1 E. O. Wilson, Half-Earth: Our Planet’s Fight for Life (Liveright Publishing, 2016).
2 Aristotle, On the Parts of Animals, W. Ogle, Trans. (Kegan Paul, French, London, 1882).
3 C. Linnaeus, Systema Naturae (Holmiae Salvius, ed. 10, 1758).
4 C. Mora, D. P. Tittensor, S. Adl, A. G. B. Simpson, B. Worm, How many species are there on Earth and in the ocean? PLoS Biol. 9, e1001127 (2011).
5 C. Darwin, The Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life (Penguin Classics Reprint, 1985;
John Murray, ed. 1, 1859).
6 C. E. Hinchliff et al., Synthesis of phylogeny and taxonomy into a comprehensive tree of life. Proc. Natl. Acad. Sci. U.S.A. 112, 12764–12769 (2015).
7 S. Richards, It’s more than stamp collecting: How genome sequencing can unify biological research. Trends Genet. 31, 411–421 (2015).
8 H. A. Lewin et al., Earth BioGenome Project: Sequencing life for the future of life. Proc. Natl. Acad. Sci. U.S.A. 115, 4325–4333 (2018).
9 H. Lewin et al., The Earth BioGenome Project Working Group, The Earth BioGenome Project 2020: Starting the clock. Proc. Natl. Acad. Sci. U.S.A. 118,
e2115635118 (2021).
10 E. D. Jarvis, Perspectives from the Avian Phylogenomics Project: Questions that can be answered with sequencing all genomes of a vertebrate class. Annu.
Rev. Anim. Biosci. 4, 45–59 (2016).
11 S. Nurk et al., The complete sequence of a human genome. bioRxiv [Preprint] (2021). https://doi.org/10.1101/2021.05.26.445798 (Accessed 27 May 2021).
12 A. Rhie et al., Towards complete and error-free genome assemblies of all vertebrate species. Nature 592, 737–746 (2021).
13 The Darwin Tree of Life Project Consortium, Sequence locally, think globally: The Darwin Tree of Life Project. Proc. Natl. Acad. Sci. U.S.A. 118,
e2115642118 (2021).
14 A. Suh, L. Smeds, H. Ellegren, The dynamics of incomplete lineage sorting across the ancient adaptive radiation of neoavian birds. PLoS Biol. 13, e1002224 (2015).
15 C. Blais, J. M. Archibald, The past, present and future of the tree of life. Curr. Biol. 31, R314–R321 (2021).
16 J. Mallet, Hybrid speciation. Nature 446, 279–283 (2007).
17 L. H. Rieseberg, J. H. Willis, Plant speciation. Science 317, 910–914 (2007).
18 C. E. Lane, J. M. Archibald, The eukaryotic tree of life: Endosymbiosis takes its TOL. Trends Ecol. Evol. 23, 268–275 (2008).
19 L. A. Graham, P. L. Davies, Horizontal gene transfer in vertebrates: A fishy tale. Trends Genet. 37, 501–503 (2021).
20 S. V. Edwards, Is a new and general theory of molecular systematics emerging? Evolution 63, 1–19 (2009).
21 M. Arita, I. Karsch-Mizrachi, G. Cochrane, The International Nucleotide Sequence Database Collaboration. Nucleic Acids Res. 49, D121–D124 (2021).
22 Y. Li, X.-X. Shen, B. Evans, C. W. Dunn, A. Rokas, Rooting the animal tree of life. Mol. Biol. Evol. 38, 4322–4333 (2021).
23 T. Zhao et al., Whole-genome microsynteny-based phylogeny of angiosperms. Nat. Commun. 12, 3498 (2021).
24 F. Burki, A. J. Roger, M. W. Brown, A. G. B. Simpson, The new tree of eukaryotes. Trends Ecol. Evol. 35, 43–55 (2020).
25 S. Feng et al., Dense sampling of bird diversity increases power of comparative genomics. Nature 587, 252–257 (2020).
26 S. D. Smith, M. W. Pennell, C. W. Dunn, S. V. Edwards, Phylogenetics is the new genetics (for most of biodiversity). Trends Ecol. Evol. 35, 415–425 (2020).
27 K. More, C. M. Klinger, L. D. Barlow, J. B. Dacks, Evolution and natural history of membrane trafficking in eukaryotes. Curr. Biol. 30, R553–R564 (2020).
28 J. M. Archibald, Endosymbiosis and eukaryotic cell evolution. Curr. Biol. 25, R911–R921 (2015).
29 Y. Liu et al., Expanded diversity of Asgard archaea and their relationships with eukaryotes. Nature 593, 553–557 (2021).
30 L. Eme, A. Spang, J. Lombard, C. W. Stairs, T. J. G. Ettema, Archaea and the origin of eukaryotes. Nat. Rev. Microbiol. 15, 711–723 (2017).
31 P. López-García, D. Moreira, Open questions on the origin of eukaryotes. Trends Ecol. Evol. 30, 697–708 (2015).
32 P. J. Keeling, Diversity and evolutionary history of plastids and their hosts. Am. J. Bot. 91, 1481–1493 (2004).
33 P.-M. Delaux, S. Schornack, Plant evolution driven by interactions with symbiotic and pathogenic microbes. Science 371, eaba6605 (2021).
34 H. Feldhaar, Bacterial symbionts as mediators of ecologically important traits of insect hosts. Ecol. Entomol. 36, 533–543 (2011).
35 A. K. Hansen, N. A. Moran, The impact of microbial symbionts on host plant utilization by herbivorous insects. Mol. Ecol. 23, 1473–1496 (2014).
36 M. McFall-Ngai et al., Animals in a bacterial world, a new imperative for the life sciences. Proc. Natl. Acad. Sci. U.S.A. 110, 3229–3236 (2013).
37 T. de Meeûs, F. Renaud, Parasites within the new phylogeny of eukaryotes. Trends Parasitol. 18, 247–251 (2002).
38 F. Husnik, J. P. McCutcheon, Functional horizontal gene transfer from bacteria to eukaryotes. Nat. Rev. Microbiol. 16, 67–79 (2018).
39 C. Hoencamp et al., 3D genomics across the tree of life reveals condensin II as a determinant of architecture type. Science 372, 984–989 (2021).
40 E. Lieberman-Aiden et al., Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
41 J. Wang et al., Comprehensive chromosome end remodeling during programmed DNA elimination. Curr. Biol. 30, 3397–3413.e4 (2020).
42 J. J. Smith et al., The sea lamprey germline genome provides insights into programmed genome rearrangement and vertebrate evolution. Nat. Genet. 50,
270–277 (2018).
43 J. D. Podlevsky, C. J. Bley, R. V. Omana, X. Qi, J. J.-L. Chen, The Telomerase Database. Nucleic Acids Res. 36, D339–D343 (2008).
44 K. H. Miga, Centromere studies in the era of ‘telomere-to-telomere’ genomics. Exp. Cell Res. 394, 112127 (2020).
45 J. Damas, M. Corbo, H. A. Lewin, Vertebrate chromosome evolution. Annu. Rev. Anim. Biosci. 9, 1–27 (2021).
46 J. M. de Vos, H. Augustijnen, L. Bätscher, K. Lucek, Speciation through chromosomal fusion and fission in Lepidoptera. Philos. Trans. R. Soc. Lond. B Biol. Sci.
375, 20190539 (2020).