
Article https://doi.org/10.1038/s41467-023-43396-8

Diversification of flowering plants in space and time

Dimitar Dimitrov 1,2,3,4,15, Xiaoting Xu 1,3,5,15, Xiangyan Su 1,6, Nawal Shrestha 1,7, Yunpeng Liu 1, Jonathan D. Kennedy 3,8,9, Lisha Lyu 1,10, David Nogués-Bravo 3, James Rosindell 11, Yong Yang 12, Jon Fjeldså 3,4, Jianquan Liu 5, Bernhard Schmid 13, Jingyun Fang 1, Carsten Rahbek 3,8,14 & Zhiheng Wang 1,3,15

Received: 31 August 2021
Accepted: 8 November 2023

The rapid diversification and high species richness of flowering plants is regarded as ‘Darwin’s second abominable mystery’. Today the global spatiotemporal pattern of plant diversification remains elusive. Using a newly generated genus-level phylogeny and global distribution data for 14,244 flowering plant genera, we describe the diversification dynamics of angiosperms through space and time. Our analyses show that diversification rates increased throughout the early Cretaceous and then slightly decreased or remained mostly stable until the end of the Cretaceous–Paleogene mass extinction event 66 million years ago. After that, diversification rates increased again towards the present. Younger genera with high diversification rates dominate temperate and dryland regions, whereas old genera with low diversification dominate the tropics. This leads to a negative correlation between spatial patterns of diversification and genus diversity. Our findings suggest that global changes since the Cenozoic shaped the patterns of flowering plant diversity and support an emerging consensus that diversification rates are higher outside the tropics.

Flowering plants are a major component of the biosphere providing food and habitats for terrestrial animals1. They have adapted to and diversified in a wide variety of environments2,3. Understanding the evolutionary processes underlying the global spatiotemporal patterns of flowering plant diversity has intrigued ecologists and biogeographers since the time of von Humboldt4 and still represents an unresolved issue in biology. Previous studies have illustrated a pervasive latitudinal diversity gradient for flowering plants5,6. Among many

1 Institute of Ecology and Key Laboratory for Earth Surface Processes of the Ministry of Education, College of Urban and Environmental Sciences, Peking University, Beijing 100871, China.
2 Department of Natural History, University Museum of Bergen, University of Bergen, P.O. Box 7800, 5020 Bergen, Norway.
3 Center for Macroecology, Evolution and Climate, GLOBE Institute, University of Copenhagen, Universitetsparken 15, 2100 Copenhagen, Denmark.
4 Natural History Museum, University of Oslo, PO Box 1172 Blindern, NO-0318 Oslo, Norway.
5 Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065 Sichuan, China.
6 Land Consolidation and Rehabilitation Center, Ministry of Natural Resources, Beijing 100035, China.
7 State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou 730000 Gansu, China.
8 Natural History Museum of Denmark, University of Copenhagen, DK-2100 Copenhagen Ø, Denmark.
9 Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK.
10 School of Urban Planning and Design, Shenzhen Graduate School, Peking University, Shenzhen 518055 Shenzhen, China.
11 Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, Berkshire SL5 7PY, UK.
12 Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, Nanjing Forestry University, 159 Longpan Rd., Nanjing 210037, China.
13 Department of Geography, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland.
14 Danish Institute for Advanced Study, University of Southern Denmark, Odense, Denmark.
15 These authors contributed equally: Dimitar Dimitrov, Xiaoting Xu, Zhiheng Wang.
e-mail: zhiheng.wang@pku.edu.cn

Nature Communications | (2023)14:7609 1



factors hypothesized to explain the latitudinal diversity gradient, macroevolutionary processes, including variations in net diversification rates and the time available for speciation, have played a key role7,8.

Although previous attempts to understand the diversification of flowering plants have explored the influences of variation in global climate, geography, and ecological opportunities on diversification rates9–11, the temporal and spatial trends of species diversification at a global scale for flowering plants are yet to be established. Most previous analyses have either been based on family-level phylogenies9,12–14 (but see15) or have relied exclusively on the fragmented fossil records16,17. The lack of a comprehensive time-calibrated phylogeny with higher taxonomic resolution and of better-resolved distributional data has limited our understanding of the diversity patterns of flowering plants and the macroevolutionary mechanisms underlying them.

Here, we elucidate the spatiotemporal diversification dynamics of the flowering plants and their relationships with the global patterns of flowering plant diversity by integrating two global datasets: (1) a time-calibrated phylogeny containing 14,244 currently recognized genera (87.5% with DNA sequences based on sequences from 22,277 species) of flowering plants and (2) a dataset of the global distribution of 13,719 genera at a spatial resolution of ca. 329,670 km2 (mean area: 329,670 ± 198,191 km2) from >1100 data sources, which are mostly regional species lists and to a lesser extent species occurrence records due to the limited availability of the latter (Supplementary Data 1). The phylogeny of angiosperm genera is constructed using maximum likelihood (ML) with the divergence between orders constrained following the APG IV framework (see Methods), and is dated using 100 fossils10,13,14 (Supplementary Data 2) under three dating scenarios for the crown age of flowering plants: (1) 140–150 Ma16, (2) 140–210 Ma18, and (3) 149–256 Ma19 (see Methods). Although the crown age of angiosperms varies across the three dating scenarios, the estimated genus and family ages are, in general, consistent with recent estimations based on fossil and molecular13 evidence (Table 1), but see19. Our phylogeny provides a global overview of angiosperm genera relationships (Fig. 1), and significantly expands the coverage of angiosperm genera compared to available large-scale angiosperm phylogenies20,21.

Results and discussion
Temporal trends of flowering plant diversification
The analyses of speciation rates and net diversification rates through time demonstrate two diversification bursts in the evolution of flowering plants. The first one occurred between the Late Jurassic (ca. 150 Ma) and the mid-Cretaceous (ca. 100 Ma)3,16, and this time period roughly coincides with the time when flowering plants started to increase their abundance in terrestrial floras before rising to dominance towards the end of the Cretaceous22. This burst in flowering plant diversification is also corroborated by both fossil16 and pollen23 records. All our dating scenarios suggest that from the late Cretaceous to the Paleocene-Eocene Thermal Maximum (PETM), speciation rates and net diversification rates slowed down or remained mostly stable (Figs. 2 and S1–S4). The second diversification burst of flowering plants started after the PETM with overall speciation rates and net diversification rates continuously increasing towards the present. However, speciation and net diversification rates significantly vary across lineages (Fig. 1) and are relatively higher in temperate and dryland-adapted genera (Figs. 2, 3, S5, and S6).

Previous studies have interpreted the difference between the stem and crown ages of angiosperm families as evidence for a period of low diversification rates between the initial burst of angiosperm diversity in the first half of the Cretaceous and their fast diversification in the Cenozoic14. This period of low diversification coincides with the time when the extent of tropical-like habitats was reduced due to global climate cooling24,25 followed by a major turnover in plant communities across the globe and a decline in non-flowering plants16,26. These environmental changes might have required flowering plants to acquire novel adaptations. Studies have indeed suggested that the low and stable rates of angiosperm diversification between the mid-Cretaceous and the Cenozoic might be the effect of the time necessary for angiosperms to develop morphological and ecological innovations after the early split between major angiosperm lineages14. However, the evidence for quick diversification of angiosperms during periods characterized by abrupt environmental changes that we find (see below) suggests that other factors may have been involved. One example is competition with non-flowering seed plants and ferns, which remained a major component of terrestrial plant communities until the late Cretaceous, when they experienced a dramatic increase in extinction rates and a decrease in diversification rates16.

The speciation of flowering plants rapidly increased during the Cenozoic, especially in temperate and dryland-adapted genera (Figs. 2, S5, and S6). This may be partly due to increased ecological opportunities after the K-Pg mass extinction27 and the expansion of temperate and dryland habitats due to the generally continuous process of global cooling and aridification from the PETM towards the present28, especially in high latitudes29. For example, geological evidence shows a drastic decrease in temperature at the high latitudes of Eurasia from the late Eocene to the early Oligocene, leading to the expansion of temperate habitats30. As a result, the ancient angiosperm lineages31 that were likely adapted to cooler environments experienced quick diversification10,31. In addition, the retreat of the Paratethys Sea between 20 and 30 Ma generated a vast new terrestrial area in northern Africa and Southwest and Central Asia32 and also intensified aridity since the early Miocene, especially during the last 10 Ma, thereby creating large drylands around the globe33. This aridification likely accelerated the speciation rates of many dryland-adapted genera34 (Figs. 2 and S6). Studies on birds show a similar pattern of extensive speciation and high net diversification rates in the Cenozoic, which has been linked to the expansion of new habitats28.

Our genus-level angiosperm phylogenetic tree improves the phylogenetic resolution compared to family-level trees and hence provides finer temporal resolution on the diversification history of angiosperms compared to previous studies conducted at the family-

Table 1 | Median values of divergence time (age), speciation rates, and net diversification rates for flowering plant genera. Minimum and maximum values are shown in brackets.

Tree type            Dating constraints (Ma)  Genus age (Ma)      Speciation rates     Net diversification rates
Molecular phylogeny  140–150                  21.20 (0.005–150)   0.058 (0.022–0.524)  0.056 (−0.003–0.463)
Molecular phylogeny  140–210                  22.19 (0.005–210)   0.056 (0.024–0.468)  0.055 (−0.089–0.413)
Molecular phylogeny  149–256                  23.16 (0.005–256)   0.054 (0.019–0.424)  0.053 (−0.100–0.383)
Global phylogeny     140–150                  19.50 (0.005–150)   0.057 (0.028–0.527)  0.055 (0.001–0.4463)
Global phylogeny     140–210                  19.50 (0.005–210)   0.056 (0.024–0.470)  0.053 (−0.009–0.413)
Global phylogeny     149–256                  21.38 (0.005–256)   0.054 (0.020–0.424)  0.052 (−0.045–0.383)


[Figure 1: phylogeny of flowering plant genera colored by net diversification rate. Inset axes: number of lineages against age (Ma).]

Fig. 1 | Net diversification rates through time across all flowering plant genera. Warmer and colder colors denote faster and slower rates, respectively. The inset shows the lineage-through-time plot for all flowering plant genera based on the global (red line) and molecular (black line) phylogenies.
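
The lineage-through-time curve described in the Fig. 1 caption can be illustrated with a minimal sketch. It assumes a fully bifurcating dated tree summarized only by the ages of its internal nodes; the function name and the toy ages are hypothetical and are not taken from the study's phylogeny or code.

```python
from bisect import bisect_right


def lineages_through_time(node_ages, times):
    """Count lineages alive at each query time (Ma before present).

    node_ages: ages of the internal (branching) nodes of a bifurcating
    tree, root included. Each branching event adds one lineage.
    """
    ages = sorted(node_ages)
    counts = []
    for t in times:
        # Number of branching events strictly older than the query time.
        older = len(ages) - bisect_right(ages, t)
        # Before the root split there is no crown-group lineage to count.
        counts.append(1 + older if older else 0)
    return counts


# Toy tree (assumed values): root at 150 Ma, further splits at 100 and 60 Ma.
toy_counts = lineages_through_time([150, 100, 60], [200, 120, 80, 0])
```

Plotting such counts on a logarithmic axis against age gives the usual lineage-through-time view; extinct lineages are not represented in a tree of extant genera.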

level or on smaller species-level trees and confirms the generality of these overall patterns9,12–15. However, the lack of resolution at the species level in our study may have led to an underestimation of the most recent diversification trend, particularly for genera where different intrageneric lineages have highly divergent diversification histories (Figs. 1 and 2). This is likely the reason for the apparent decline of diversification rate in the last 15–20 Ma, when most extant genera originated.

Spatial patterns of flowering plant diversity, age, and diversification
The current genus diversity of flowering plants generally shows a significant latitudinal gradient, decreasing from the tropics towards the poles, with a notable outlier in the Fynbos of South Africa (Fig. 3). The Andes, Central America and Southeast Asia harbor the highest generic diversity (>2100 genera per ca. 4° × 4° geographical unit), while temperate regions and the continuous arid zone ranging from northern Africa through southwestern Asia to central Asia and Mongolia (Afro-Asian drylands hereafter, see Methods) have the lowest generic diversity (<1000 genera per geographical unit in Afro-Asian drylands and temperate Eurasia). This global pattern is not biased by the size of geographical units and the number of distribution data sources that we used to compile the distributional data (Fig. S7). However, in some regions such as the Indochina peninsula (Cambodia, Laos, Vietnam), species distribution data are relatively insufficient, which may lead to underestimation of genus richness.

We find pronounced latitudinal gradients of mean genus age, mean speciation rate, and mean net diversification rate per geographical unit. Specifically, the mean genus age per geographical unit is the oldest in the tropics and decreases with latitude in both hemispheres (Figs. 3 and S8–S10) and increases with mean annual temperature and precipitation (Fig. 4A, B). In contrast, the mean speciation and net diversification rates per geographical unit increase with latitude (Figs. 3 and S8–S10) and decrease with mean annual temperature and precipitation (Fig. 4C, D), reaching the highest values in temperate Eurasia and the Afro-Asian drylands (Figs. 3, S8, and S9). The latitudinal gradient in genus age, mean speciation and net diversification rates differed significantly from a null model assuming random spatial distributions of genera (Fig. S11), which suggests that the genera outside the tropics are not a random subset of angiosperms, but are significantly biased towards taxa with high speciation and net diversification rates and young ages. More interestingly, we find that the latitudinal gradients in mean speciation and net diversification rates have persisted since the Cenozoic (Figs. 2 and S5) to the present. The relationships of mean genus age and mean net diversification rate with mean annual temperature and precipitation remain unchanged when


[Figure 2: panel A (global) and panel B (Northern Hemisphere and Southern Hemisphere). Rows: speciation rate (top) and net diversification rate (bottom). x-axis: time before present (Ma), from 150 to 0.]

Fig. 2 | Variation in speciation and net diversification rates through time. Rates of speciation and net diversification through time of flowering plants estimated at a global scale (A) and for different latitude belts (B). Evolutionary rates are estimated using the global phylogeny with the crown of flowering plants constrained to 140–150 Ma. The shaded areas surrounding the solid lines represent the 95% confidence intervals of the mean rate estimates. In B, the evolutionary rates are estimated for latitudinal belts of 10 degrees across the two hemispheres. Darker colors indicate higher latitudes, and the red line indicates the Equator belt (−5° to 5°).
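
The grouping into 10-degree latitudinal belts mentioned in the Fig. 2 caption can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: belts are assumed to be centered on multiples of 10° (so the Equator belt spans −5° to 5°), and the interval is a plain normal approximation (mean ± 1.96 standard errors), which may differ from how the published confidence intervals were derived. All names and values are hypothetical.

```python
import math
from statistics import mean, stdev


def belt_center(latitude, width=10):
    """Center of the latitudinal belt containing `latitude` (degrees)."""
    return math.floor((latitude + width / 2) / width) * width


def mean_with_ci(values, z=1.96):
    """Mean and normal-approximation 95% interval (assumed method)."""
    m = mean(values)
    half = z * stdev(values) / math.sqrt(len(values))
    return m, m - half, m + half


def rates_by_belt(records):
    """records: iterable of (latitude, rate) pairs, one per geographic unit."""
    belts = {}
    for lat, rate in records:
        belts.setdefault(belt_center(lat), []).append(rate)
    # Belts with a single unit are skipped because stdev needs two values.
    return {b: mean_with_ci(v) for b, v in sorted(belts.items()) if len(v) > 1}


# Toy data (assumed): two units near the Equator, two near 50° N.
toy_belts = rates_by_belt([(3, 0.05), (-4, 0.07), (47, 0.10), (52, 0.12)])
```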

five quantiles (i.e., 5%, 25%, 50%, 75%, and 95%) instead of average values of mean annual temperature and precipitation within geographic units were used to represent the spatial variations in climate (Fig. S12). Furthermore, geographic variations in mean genus age and mean net diversification rate are not significantly correlated with climatic heterogeneity within geographic units (Fig. S13). These results suggest that climatic heterogeneity within geographic units does not bias our findings on the relationships between mean genus age/net diversification rate and climate.

When we classified all genera into four quartiles according to their ages, speciation, or net diversification rates respectively (see Methods), we found that the oldest genera (stem age >30.83 Ma) and those with the lowest speciation and net diversification rates have the highest dominance in the tropical and subtropical floras. Their proportion in floras decreases with latitude and is the lowest in temperate Eurasia and the Afro-Asian drylands (Figs. 5 and S14–S19). In contrast, the youngest genera (stem age <11.24 Ma) and those with the highest speciation and net diversification rates contribute much more to temperate and dryland floras than to floras in other regions (Figs. 5 and S20–S24).

Our results show markedly higher speciation and net diversification rates and younger ages of genera in drylands and temperate regions, despite low generic diversity in these regions (Figs. 3, S6, S8, and S9). Among all temperate and dryland regions, the floras of temperate Eurasia and the Afro-Asian drylands have genera with the youngest ages and the highest speciation and net diversification rates (Fig. 3). Of the 280 flowering plant families inhabiting these regions, including large families such as Fabaceae and Poaceae (Supplementary Data 3), ~54% have higher speciation and net diversification rates within these regions than in the rest of their distribution ranges. A further evaluation demonstrated that the endemic genera in these regions are significantly younger and diversify significantly faster than the endemic genera in tropical/subtropical regions (Wilcoxon rank-sum two-sided test, p = 6.048e–08, Fig. S6). In contrast to temperate and dryland regions, the diversity hotspots in tropical and subtropical regions are characterized by clades with relatively low speciation and net diversification rates, and old ages (>25 Ma) (Fig. 3). This suggests that tropical and subtropical regions may have accumulated species over longer periods of time in comparison to other regions of the world and might have served as a museum for flowering plants during long-term climate change (see also15,35).

Both geological and evolutionary processes may have contributed to the young age and high diversification rate of the floras in temperate Eurasia and the Afro-Asian drylands. For example, both regions experienced dramatic environmental changes in the Cenozoic, which may have provided new habitats for the radiation of flowering plants. A sharp decrease of temperature in temperate Eurasia during the early Oligocene might have also caused rapid expansion of temperate habitats in this region30. Furthermore, the retreat of the Paratethys Sea (20–30 Ma) created new terrestrial and arid areas in northern Africa as well as Southwest and Central Asia32 since the early Miocene33. These dramatic environmental changes have provided a large variety of new


Fig. 3 | Global patterns and latitudinal gradients of angiosperm generic diversity, mean genus age, mean speciation rates and mean net diversification rates. Generic diversity is a quadratic function of latitude (r2 = 0.42): diversity is the highest in tropical regions and decreases towards the poles. Mean genus age and mean evolutionary rates are estimated using the global phylogeny with angiosperms crown age constrained to 140–150 Ma. Regions with no distributional data are shown in white. Solid red lines on the scatter plots represent lowess regression with span of 0.5. The same results based on only monophyletic genera are shown in Fig. S9.
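
The Fig. 3 caption describes generic diversity as a quadratic function of latitude with an r2 value. A minimal sketch of such a fit is given below, using ordinary least squares solved through the normal equations. The function name and the toy data are assumptions for illustration only; the study's own fitting code and data are not reproduced here.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c, r2)."""
    n = len(x)
    # Power sums for the 3x3 normal equations (s[0] equals n).
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    mean_y = sum(y) / n
    ss_res = sum((yi - (a + b * xi + c * xi * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, c, 1.0 - ss_res / ss_tot


# Toy data (assumed): richness peaks at the Equator and declines polewards.
latitudes = list(range(-60, 61, 10))
richness = [2000 - 0.4 * lat ** 2 for lat in latitudes]
a, b, c, r2 = fit_quadratic(latitudes, richness)
```

With real data, a negative coefficient `c` corresponds to the hump-shaped pattern described in the caption, and `r2` is the share of variance in diversity explained by latitude.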

habitats for the rapid radiation of cold- and arid-adapted flowering plants. Moreover, the initiation of the east Asian monsoons in the early Miocene, and their later intensification in the mid-Miocene, may have led to more windy environments in temperate Eurasia36. This, in turn, may have enhanced the diversification of wind-pollinated families, such as Poaceae, which are common in these regions37.

Herbaceous and small shrub species usually have higher proportions in dryland and temperate floras than tree species. Previous studies demonstrate that, compared to tree species, herbaceous species have higher rates of molecular evolution, likely due to their shorter generation times38. In addition, herbaceous species and small shrubs tend to have higher ploidy levels39–41. Both of these factors may have also contributed to the younger age and higher speciation and net diversification rates in drylands and temperate regions (Fig. S25). In addition, a large component of the floras in temperate Eurasia and the Afro-Asian drylands are Crassulacean acid metabolism (CAM) plants42,43. These plants have experienced rapid diversification since the mid-Miocene in temperate and arid regions42,44,45. Indeed, we find that genera dominated by CAM species have higher diversification rates than those dominated by C3 plants (Fig. S25). The rapid expansion of CAM plants to newly formed habitats in temperate Eurasia and the Afro-Asian drylands since the mid-Miocene may have further contributed to high speciation and net diversification rates in these regions.

The role of evolutionary processes on the global patterns of flowering plant diversity
We find a negative correlation between genus diversity of flowering plants and the mean diversification rates per geographical unit (modified t-test: correlation coefficient = −0.352, Fstat = 0.141, degrees of freedom = 54.229, p = 0.009, effective sample size = 52.229), but a positive correlation between genus diversity and mean genus age per geographical unit (modified t-test: correlation coefficient = 0.386, Fstat = 0.175, degrees of freedom = 32.771, p = 0.021, effective sample size = 33.771) (Figs. 4E, F and S26, and Tables S1 and S2). Our null model results (Fig. S11B, C) suggest that the observed relationships between genus richness and mean net diversification rate/mean genus age are not due to random processes. The negative relationship between genus diversity and mean net diversification rate per geographic unit is in contrast with the long-standing diversification-rate hypothesis, which states that a decrease in net diversification rate from the tropics to the poles drives the latitudinal gradient in species diversity of flowering plants46. Instead, our results are in line with the predictions of the time-for-speciation hypothesis and suggest that longer time for speciation in the tropics compared with other areas47 may have played a critical role in shaping the global patterns of flowering plant diversity. Similar conclusions have also been reached in recent studies on other groups of organisms, e.g., birds, mammals, and turtles28,48,49.

Geological evidence indicates that the tropical climate was likely present in the equatorial area during the Cretaceous and the early Cenozoic and has persisted ever since50, while the temperate climates at high latitudes (such as in temperate Eurasia) and the arid climates in the Afro-Asian drylands may have arisen only since the mid-Cenozoic51,52. Therefore, these temperate and dryland regions may have had much less time for species accumulation than the tropics. Our analysis also reveals that genera originating before the mid-Cenozoic are mostly restricted to tropical and subtropical climates, while those radiating in the cold and arid regions have mostly originated after the mid-Cenozoic and, in general, have higher diversification rates (Figs. 3, 5, S8, S9, and S14–S24). Fossil records53 and spatial patterns in family crown ages of flowering plants14 show consistent patterns. Together these findings indicate that the current latitudinal gradient in species diversity may have been formed only in the last 30–40 Ma following the expansion of modern temperate climate and drylands, especially in the Northern Hemisphere54. In addition to this effect of time for speciation, other factors such as differences in the area of tropical and temperate regions through time55 or ecological constraints56 may have also played an important role in the establishment of the current latitudinal gradient in species diversity of flowering plants. Their relative effects compared with evolutionary processes should be tested in future studies.

Summary
Our results suggest that flowering plants have experienced two bursts of diversification, which agrees with paleontological data3. Extant flowering plant species are mainly derived from the second


[Figure 4, panels A–F, with the statistics printed on each panel:
A: mean age per geographical unit against mean annual temperature (r2 = 0.051, P = 5.13E-06)
B: mean age per geographical unit against mean annual precipitation (r2 = 0.224, P = 6.82E-24)
C: mean net diversification rate per geographical unit against mean annual temperature (r2 = 0.027, P = 9.39E-4)
D: mean net diversification rate per geographical unit against mean annual precipitation (r2 = 0.077, P = 1.43E-8)
E: log(genus richness) per geographical unit against mean age (y = −3.454 + 0.398x, r2 = 0.149, P = 9.49E-16)
F: log(genus richness) per geographical unit against mean net diversification rate (y = 20.133 − 197.064x, r2 = 0.124, P = 3.66E-13)]

Fig. 4 | Variation in genus age and net diversification rate as a function of annual temperature and precipitation, and in generic diversity as a function of genus age and net diversification. Variation in mean genus age (A, B) and mean net diversification rate (C, D) per geographic unit as a function of mean annual temperature and mean annual precipitation. Variation in the log-transformed generic diversity as a function of mean genus age (E) and mean net diversification rate (F) per geographical unit. Mean age and mean net diversification rate are based on the global phylogeny with angiosperms crown age constrained to 140–150 Ma. Solid red lines represent linear (A–D) and model II geometric mean regression lines (E, F). All p-values (P) and r2 were estimated by linear regression.
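
The model II geometric mean regression named in the Fig. 4 caption (also known as reduced major axis regression) can be sketched as below. In this formulation the slope is the ratio of the standard deviations of y and x, carrying the sign of their covariance, and the line passes through the two means. The function name and toy values are assumptions; this is not the study's code.

```python
import math


def geometric_mean_regression(x, y):
    """Model II (reduced major axis) fit; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # Slope magnitude is sd(y) / sd(x); its sign follows the covariance.
    # A covariance of exactly zero yields a positive slope by convention here.
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    intercept = my - slope * mx
    return intercept, slope


# Toy data (assumed values), one increasing and one decreasing relationship.
up = geometric_mean_regression([1, 2, 3, 4], [2, 4, 6, 8])
down = geometric_mean_regression([1, 2, 3, 4], [8, 6, 4, 2])
```

Unlike ordinary least squares, this fit treats both variables as measured with error, which is a common reason for choosing it when neither variable is a controlled predictor.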

diversification burst where intense global cooling and aridification induced a rapid diversification of species in newly emerged habitats. Across different biomes, the temperate and dryland regions in Eurasia and northern Africa host angiosperm genera with the youngest ages and the highest speciation and net diversification rates. Moreover, the global diversity pattern of angiosperms is negatively correlated with mean speciation and net diversification rates, suggesting that processes other than speciation and net diversification rates may have driven the global diversity patterns of flowering plants. Our study demonstrates the necessity of integrating species distributions with mega-phylogenies to understand the mechanisms underlying large-scale biodiversity patterns.

Methods
Phylogenetic reconstruction
Sequence downloading and quality screening. We downloaded all sequence data of seed plants available in GenBank (as of 19 May 2018) for the following genes commonly used in plant phylogenetic studies: 18S ribosomal DNA (18S rDNA), internal transcribed spacer region (ITS, including ITS1, 5.8S ribosomal DNA and ITS2), and 26S ribosomal DNA (26S rDNA) from the nuclear genome; ATPase β-subunit gene (atpB), Maturase K (matK), NADH dehydrogenase F (ndhF) and ribulose bisphosphate carboxylase large chain (rbcL) from the chloroplast genome; and Maturase R (matR) from the mitochondrial genome. The ITS1, 5.8S, and ITS2 were downloaded and treated as a single fragment referred to as ITS herein. This gene sample represents both quickly (e.g., ITS) and slowly (e.g., 18S rDNA, rbcL, matR) evolving genes. Sequence download and quality checks were managed using the NCBIminer v. 4.0 (ref. 57). In total, the raw data included 669,619 records of seed plant DNA sequences for 132,373 infrageneric taxa and 457 families.

We first filtered the raw data to species level following a few simple rules. 1) Sequences belonging to hybrids or from taxa that were not identified to the genus level according to the GenBank taxonomic


[Figure 5: global maps of the proportion of genera per geographical unit. Rows: 1st, 2nd, 3rd and 4th quartile. Columns: speciation rate and net diversification rate. Units without data are marked "No data".]

Fig. 5 | Geographical variation in the proportions of genera with different evolutionary rates. All genera are divided into four quartiles according to their species-level speciation and net diversification rates respectively. From the 1st to the 4th quartiles, the evolutionary rates increase. The proportion of each quartile in each geographical unit is calculated. All analyses are based on the global phylogeny with angiosperms crown age constrained to 140–150 Ma. The same results based on other phylogenies with different dating constraints are shown in Figs. S20–S24.
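
The quartile-based summary in the Fig. 5 caption can be sketched as follows: genera are split into four rate quartiles, and the share of each quartile is then computed per geographical unit. The genus names, rates and unit contents are hypothetical toy values, and the quartile cut points use Python's default `statistics.quantiles` method, which may differ from the method used in the study.

```python
from statistics import quantiles


def quartile_labels(rates):
    """Map each genus to a quartile 1..4 by its rate (1 = slowest)."""
    cuts = quantiles(rates.values(), n=4)  # three cut points
    return {g: 1 + sum(r > c for c in cuts) for g, r in rates.items()}


def quartile_proportions(unit_genera, labels):
    """Proportion of a unit's genera falling in each quartile."""
    total = len(unit_genera)
    return {q: sum(labels[g] == q for g in unit_genera) / total
            for q in (1, 2, 3, 4)}


# Toy rates (assumed): eight genera with evenly spaced rates.
toy_rates = {f"g{i}": i / 100 for i in range(1, 9)}
toy_labels = quartile_labels(toy_rates)
# A hypothetical geographical unit holding two slow genera and one fast genus.
toy_props = quartile_proportions(["g1", "g2", "g8"], toy_labels)
```

Mapping these proportions for every geographical unit, one map per quartile, gives the kind of layout the caption describes.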

database were removed. 2) Only the longest sequence of each species was kept for each genetic marker. When more than one sequence was found as longest (due to equal lengths), the most recently published sequence was kept. 3) Sequences used in published peer-reviewed papers were preferred to those not used in publications. The dataset resulting from this procedure included 260,477 sequences from 124,646 infrageneric taxa. To avoid errors due to mismatches between GenBank taxonomy and the taxonomy that we applied to the distributional data, we updated the taxonomy of these sequences using the same procedure as the one applied to the distributional data (see Species and genus names section).

To improve the coverage of genetic markers for each genus, we used congeneric sequences for some genera. Because genera may be non-monophyletic, we assessed the monophyly of each genus with sequence data using the large species-level phylogeny of ref. 20. The tree of ref. 20 is based on few genetic markers and contains a huge proportion of missing data; thus, it is not free of issues and is likely not able to provide a conclusive test of genera monophyly (e.g., genera may be inferred to be non-monophyletic due to lack of data). However, when taxa described in the same genus are found to form a monophyletic group, it provides evidence that these genera are likely monophyletic. A total of 593 genera (4.7%) were identified as non-


monophyletic. We carefully screened all non-monophyletic genera. 1) For the non-monophyletic genera caused by very few stochastic intruders from other genera, we removed the intruders’ sequences from our database. 2) For those non-monophyletic genera with several clades, we identified all the monophyletic clades and estimated the number of species included in the tree of ref. 20 for each clade. Then the largest clade of each non-monophyletic genus was used to represent the genus. 3) For polyphyletic genera, we only selected species from the core monophyletic clades. These steps ensure that we only combine sequences from species that form monophyletic groups in the cases where sequences in our final dataset were from multiple species.

To minimize the number of non-conspecific sequences representing monophyletic composite genera while maximizing the coverage of genetic markers for each genus, we developed the following complementary method for sequence filtering. (1) For a genus, we sorted the genetic markers in ascending order of the number of species per marker. (2) For the first marker (i.e., the marker with the smallest number of species), we selected the species covering the highest

composite terminals after data screening in our study should not influence higher-level relationships and might minimize the potential uncertainties on the estimate of the diversification history for flowering plants based on the available DNA data for a limited number of species.

Sequence alignment. To make sure that different accessions are oriented in the same direction we performed the following steps for each gene. 1) We selected the two longest sequences for every order and wrote these sequences into a single fasta file. 2) We aligned these representative sequences using MAFFT61 and the L-INS-i algorithm with the commands --localpair --maxiterate 1000 --adjustdirectionaccurately. This step generated an alignment of the longest sequences of all orders in the same direction. Let Alignment0 denote this alignment. 3) We separated all available sequences into different files, one file per order, and sorted them in order of decreasing sequence length. Therefore, the longest sequences of each order were always on top. 4) We replaced the longest two sequences in all files with the corresponding sequences from Alignment0 whose directions
number of markers and (when the number of markers was the same for had been adjusted. 5) Finally, we aligned the sequences of each order
more than one species) the highest total relative sequence length. In respectively using the same algorithm: --localpair -- maxiterate 1000
this way, selected species may also cover the remaining genetic mar- --adjustdirectionaccurately. These steps ensure that the sequences
kers. (3) We repeated the above procedure for all genetic markers until within each order and across orders are in the same direction.
the maximum number of genetic markers for each genus was achieved. Because the alignment of some gene regions (particularly the ITS1
4) As the above procedure might lead to the selection of multiple and ITS2) between very divergent groups is difficult and may lead to
species for each marker, we then selected the longest sequences for unwanted artefacts, we adopted an alignment strategy with the fol-
each marker to maximize the total number of base pairs for each genus lowing steps. 1) The sequences of each plant order were placed in a
in the final matrix. The relative sequence length was calculated as the separate matrix and were aligned using the L-INS-i strategy in MAFFT
number of base pairs of a genetic marker for a species divided by the with the following commands: --localpair -- maxiterate 1000 --adjust-
maximum number of base pairs of the marker for the specific genus. directionaccurately (step 5 above). 2) The order-level alignments of
Using the outlined procedure, we produced a final matrix for each gene were merged as a single fasta file. For each of these com-
12,539 seed plant genera representing ca. 87.5% of the known seed bined fasta files subMSAtable file with the information on which
plant genera (87.4% of the angiosperms and 100% of the gymnos- sequences correspond to individual order-level alignments was cre-
perms) based on sequences from 22,277 species. We downloaded ated using the makemergetable.rb script distributed with MAFFT. 3)
sequences for 9 species of ferns as outgroups. The list of accession The order-level alignments for each gene were then aligned to each
numbers and taxonomic information for all genera in the final dataset other in MAFFT using the --localpair --merge commands that allow
is given in Supplementary Data 4. alignment of multiple sequence alignments. 4) The resulting aligned
The outlined procedure has an important aspect that should be matrices of all genes were concatenated to each other to generate a
highlighted. The use of one representative sequence per genetic super-matrix of aligned sequences, which was then used in subsequent
marker per genus as placeholder of the genus could lead to composite phylogenetic analyses. All alignments were conducted at the High-
terminals for large genera (especially those with more than 3 species) performance Computing Platform of Peking University.
as different genetic markers may come from different species. Com- It is worth mentioning that the separate order-level alignments
posite terminals represented ca. 46% of the genera in the final mole- will, although indirectly, introduce to some extent a soft constraint on
cular dataset. About 52% of these composite terminals had sequence orders monophyly even if some orders were not explicitly constraint to
data from only 2 species, and 48% had sequence data from 3 or more be monophyletic (see next section). This approach is consistent with
species (Fig. S27, see Supplementary Data 4 for the species list used for the topological constraints that we used for the deeper nodes on the
each genus). This approach should not result in biases and artifacts in backbone topology in the following analysis and is better than other
the resulting phylogenies due to the following reasons. 1) We do not alternatives as it decreases alignment errors and ensures higher con-
intend to address any relationships within genera (i.e., at the species sistency with currently accepted taxonomy.
level). 2) We used only taxa that could be unambiguously assigned to
accepted genera (e.g., we did not use sequences that were ambigu- Phylogenetic analyses. Phylogenetic analyses were run using RAxML
ously assigned to genera following the GenBank sequence annota- v 8.0.2662 at the High-performance Computing Platform of Peking
tions). 3) Recent studies on mega-phylogenies indicate that most University and at the Abel cluster at the University of Oslo. Data was
accepted genera are recovered as monophyletic even when analyzing partitioned by gene and the GTRGAMMA model (with parameters
large and highly incomplete molecular super-matrices, suggesting that optimized independently for each partition) was used for evaluation of
the current taxonomy of flowering plant genera is overall well sup- the tree likelihood. To ensure better consistency between our analysis
ported by available molecular data10,20. In cases where genera were and the current knowledge on higher-level seed plant relationships,
found to be non-monophyletic we made sure that sequences of com- e.g., above family level, we constrained the phylogenetic analyses
posite terminals come only from one monophyletic lineage repre- using the currently accepted view on relationships among angiosperm
senting the bulk of species diversity in these taxa. Composite taxa orders and among eudicots, monocots and magnoliids63. This
approach has been proven advantageous as it not only helps to reduce approach is similar to the one undertaken in recent large-scale com-
the computational demands for data analyses but also significantly parative studies of angiosperms10,20 but differs in the level at which
increase phylogenetic accuracy by decreasing the amount of missing topological constraints were applied. We opted to apply the con-
data (i.e. gaps in the supermatrix of DNA data) in most cases58. This straints on the topology at the deeper nodes of the tree (i.e., at order
approach has been successfully used in recent studies59,60. Therefore, level), as these are more likely to present problems given the character

Nature Communications | (2023)14:7609 8


Article https://doi.org/10.1038/s41467-023-43396-8

sample for this study (for example, some of these relationships have Construction of the global genus-level phylogeny. To include all
been elucidated using morphological data that is not used in our currently described seed plant genera (see Supplementary Data 4 for
analyses). In contrast, in the analysis by ref. 10, a family-level topology the genera list) in our phylogeny, we added genera without sequence
was used as a backbone constraint for their phylogeny reconstruction. data into the dated phylogeny according to their family-level place-
The topology resulting from the RAxML analysis was then sub- ment or according to the order level placement if family placement
jected to molecular dating using penalized likelihood as imple- was uncertain. Specifically, all genera that were not represented in the
mented in the program treePL64. In our analyses, we used 100 fossil molecular dataset were added to the crown nodes of their corre-
calibration points described in recent refs. 10,13,14. Among these sponding families or orders as polytomies. To speed up the analyses
ref. 14. provides the most comprehensive list of curated fossil cali- the topology of each order containing taxa added following this pro-
brations (see the Data2b_CalibrationList.xls supplementary file of cedure was extracted and polytomies were resolved following the
ref. 14). A complete list of the publications describing these fossils is methods of69, with BEAST v1.8.070. The polytomy resolver method uses
provided in Supplementary Data 2. In previous studies using large the input topology and branch length information to generate an xml
plant taxon samples, the strict molecular clock has been rejected10. file that can be then directly analyzed in BEAST estimating both branch
Therefore, we did not repeat this test here. In the penalized like- lengths and phylogenetic placement of taxa for which character data is
lihood dating analysis, the selection of the smoothing value para- not available simultaneously. The polytomy resolver analysis was run
meter may influence age estimates due to its effect on substitution in BEAST until stationary was achieved in the MCMC chain. For the
rates estimates. With the increase of smoothing value, the rate het- BEAST analyses we used a birth-death model with uniform priors for
erogeneity decreases and more clock-like mode of rate evolution is the mean growth rate (λ − μ) and the relative death rate (μ/λ) para-
assumed65. To select appropriate smoothing value for our dating meters. We used this model set up as it is intended to be wildly
analyses, we performed a cross validation with smoothing parameter applicable when additional information on the model priors is not
set to 0.0001, 0.001, 0.1, and 100 (the default setting in treePL). Due available (as in our case)69. Each BEAST run was run for 11 million
to the large size of the dataset and the computational burden of full generations and posteriors were sampled every 1000 generations.
cross-validation, we did not test all potential values of the smoothing After discarding the first 10% of generations as a burn-in, we used
parameter. Based on the χ2 test, the result of these cross validation Tracer v1.6.071 to examine the effective sampling sizes (ESS) of all
runs suggested that the lowest smoothing value (0.0001) is pre- parameters, chain mixing, and convergence to a stationary distribu-
ferred. Using a smoothing value of 0.0001 in our dating runs under tion. The post burn-in sample showed convergence to stationary dis-
different constraints for the age of the crown angiosperms showed tribution, good mixing, and high ESS for all parameters (ESS > 200).
higher congruence in age estimates with the fossil record and pre- The maximum clade credibility tree for each order was extracted from
vious studies. Moreover, the smoothing value = 0.0001 did not gen- the corresponding post-burn-in sample. These complete order-level
erate bias towards much younger ages in some groups such as topologies were then used to replace the corresponding order level
Gymnosperms, while the use of higher smoothing values in alter- sub-trees in the dated molecular trees to produce the final ultrametric
native test runs did. Thus, based on the results from the random topologies that included all described angiosperm genera. This divide
cross valuation of the three different dating runs and to account for and conquer strategy follows the approach of28 and results in greatly
potentially very large differences in rates among lineages (particu- improved ESS and faster BEAST analyses when compared to analyses
larly between angiosperms and gymnosperms), we have chosen the of the tree as a whole.
setting with a very small smoothing value (0.0001) for the final To evaluate the consistency of the resulting diversification ana-
penalized likelihood analyses. The preference for such small lysis, we repeated all analyses using all global and all molecular trees.
smoothing values is consistent with previous studies of large clades These analyses produced highly similar patterns (See supplementary
with heterogeneous rates of molecular evolution where even lower figures for details). In addition, we compared the estimates of evolu-
settings for the smoothing parameter were found to be tionary rates using different age constraints for the crown of angios-
appropriate66. perms on the molecular and global phylogeny (Fig. S28). Overall,
Dating runs based on all combinations of the fossil constraints constraining the crown age of angiosperms to 140–210 Ma and
and the four levels of the smoothing values that we tested found that 149–256 Ma resulted in somewhat older age estimates for genera
the ages for the crown Angiosperms substantially exceeded the com- compared to that when the constraint was 140–150 Ma. These findings
monly referred age for this node, i.e. 140–150 Ma18. The time of origin apply both to the molecular tree and global tree.
of the crown Angiosperms is still debated and different studies have Our approach for building the phylogeny is similar to the one in a
provided estimates that varied from a maximum of about 280 Ma67 to a recent study of world floristic regionalization72. Compared with ref. 72,
minimum of 130 Ma68 with virtually all possible values in-between. As we used an updated and greatly expanded set of fossils in our dating
differences in age estimates may have significant impacts on the fol- analysis and our tree includes ca 2000 additional genera. The global
lowing diversification analyses, we used an additional constraint on the phylogeny with the angiosperm crown age constraint of 140–150 Ma
age of the crown Angiosperms to investigate the potential effects on along with associated metadata was then used to generate an inter-
diversification induced by the uncertainties associated with the dating active website that allows the user to explore the topology and
of this node. This additional constraint was designed based on the access information about taxa, divergence times, and diversification
overview of angiosperm dating studies and fossil evidence provided in rates. This website was generated using the OneZoom tool73 and is
refs. 18, 19. Three different settings were used: the first incorporated a available at https://en.geodata.pku.edu.cn/index.php?c=content&a=
wider temporal interval (min = 149 Ma, max = 256 Ma) based on the list&catid=200.
results of ref. 19 to accommodate the most common average ages for
this group; the second took a median estimate of the age of this group Global distributions of seed plant genera
(min = 140 Ma, max = 210 Ma); and the third was more restrictive Geographical standard units. The geographic standard units used in
encompassing the highest probability density for the average age of the database is an updated version of refs. 74,75, which uses the World
the crown Angiosperms at ca. 145 Ma (min = 140, max = 150 Ma) as Geographical Scheme for Recording Plant Distributions (WGSRPD)
suggested by ref. 16. A similar approach of using various dating stra- and administrative boundaries from the Global Administrative Areas
tegies to account for the uncertainties in the age estimates for crown (GADM) database version 1 (http://www.gadm.org/) as base maps.
angiosperms was recently adopted in a study showing global patterns WGSRPD was developed by the Biodiversity Information Standards,
of angiosperms families14. formerly Taxonomic Databases Working Group (http://www.kew.org/

Nature Communications | (2023)14:7609 9


Article https://doi.org/10.1038/s41467-023-43396-8

gis/tdwg/index.html). The aim of WGSRPD is to provide a standard Africa, the African Plant Database (http://www.ville-ge.ch/musinfo/bd/
database of geographic names so that the data could be exchanged cjb/africa/recherche.php) that includes information for over
efficiently across databases without any loss of information76. Cur- 70,000 species and distribution maps of over 55,000 species. We also
rently, WGSPRD were widely used to record species distribution, e.g. compiled distribution data from floras covering smaller scales, for
World Checklist of Selected Plant Families (WCSP, http://apps.kew. example, the local floras on Russian floras in different regions, and the
org/wcsp/home.do) and the database of Plants of the World Online floras published on eFloras (http://www.efloras.org/). These published
(http://www.plantsoftheworldonline.org/). GADM provides maps for databases and floras provided relatively reliable distribution data of
all countries and their subdivisions and offers the possibility to map species at different spatial scales.
species distribution according to the collection localities. However, We compiled the global distribution records of species (or gen-
the sizes of the geographical units in the WGSRPD and GADM vary era) from well recognized and authorized datasets at a global scale, e.g.
significantly across space. Therefore, we established our geographical World Checklist of Selected Plant Families (WCSP, http://apps.kew.
standard units (GSU) for the earth landmasses by (1) merging small org/wcsp/home.do), which collects global species distribution data
adjacent regions of WGSRPD and GADM into larger ones and (2) and at the time of our search included information for 173 plant
splitting the large units of WGSRPD to small ones based on GADM to families. We supplemented these records further by adding the dis-
reduce the effects of distribution data deficiency and area on the tributions of all legume species from the Internal Legume Database &
estimation of genus richness. The final GSUs classified the earth Information Service (ILDIS, http://ildis.org/). We also compiled species
landmasses (islands and Antarctica not included) into 403 geo- distribution data from Tropicos (http://www.tropicos.org/ProjectList.
graphical units. We then prepared a dictionary of geographical names aspx), and online databases and checklists published or maintained by
to link the names of administrative units at different levels (e.g. county, plant research institutes or governments, e.g. Royal Botanic Garden
province, country) within GADM and WGSRPD to the names of our Edinburgh, Smithsonian Tropical Research Institute, British Natural
GSUs. We standardized and georeferenced the recorded geographical History Museum, Kunming Institute of Botany Chinese Academy of
names from different data sources based on the global geographical Sciences and Institute of Botany Chinese Academy of Sciences. For
names database (GeoNames, http://www.geonames.org/). example, Bolivia Catalogue compiled by Tropicos includes over
The maps of our GSUs were prepared using Goode projection 10,000 species and over 50,000 state-level distribution records. These
(Land) in ArcGIS 10. The areas of all GSUs are roughly standardized databases have been regularly updated and maintained and contain
with area ranging from 37,923 km² to 2,151,791 km² (Fig. S7A). The the latest and relatively reliable data for spatial distributions of plant
mean area of all GSUs is 329,670 km² with a standard deviation of species.
198,191 km². Linear regression showed that the area of GSU has a non- In addition to these datasets, we also used occurrence data from
significant relationship with latitude and can capture the global lati- herbarium specimens, personal collections, and online checklists,
tudinal gradient of environmental conditions (Fig. S7B–D). Hence the some of which have not been scrutinized by taxonomic experts to
potential bias of GSU area on species richness is avoided. the same standards. Therefore, these records were used with cau-
tion. To improve the quality of species distribution data, we con-
Compilation of distributional data. We compiled distribution data for ducted a strict quality control process (see Quality control of the
global seed plant species from >1100 available data sources, including distributional data).
regional and local floras, online databases of specimens and species, Depending on the types of the raw data on species distributions,
and published checklists and papers in different regions following the we used different methods to reduce spatial conflicts and to improve
approach of ref. 72 as outlined below. See Supplementary Data 1 for a the accuracy of species distributions in the final dataset. We classified
detailed list of all data sources used for the compilation of species the raw distributional data into four types: coordinates, range maps,
distribution data. gridded distributions, and recorded localities. For species distribution
For different continents and large regions, we compiled data from data recorded as coordinates, we first removed the spurious records
published regional and continental databases and floras. For example, with latitudinal values outside the range of −90 to 90 and longitudinal
the distribution records from the former Soviet Union came from the values outside the range of −180 to 180. Then, we used the MATLAB
Flora of USSR, which includes distribution data for over 7000 native function ‘inpolygon’ to map the coordinates to GSUs and retained only
species. Distribution data for Chinese species were extracted from the those coordinates which were inside the GSUs. To improve the data
Flora of China (both Chinese and English versions) (http://www.efloras. accuracy, when the coordinates in the herbarium specimen conflicted
org/; http://frps.iplant.cn/), and Catalogue of Chinese Higher Plants, with the described localities, we used collection localities rather than
which include over 200,000 province-level distribution records for coordinates to map the taxa to GSUs. For species distributions recor-
over 34,000 species. For India, we used the data from the online ded as range maps, we manually extracted the range map of each
Angiosperm Flora of India (beta version, http://flora. taxon using ArcGIS 10 and used the boundaries of the original datasets
indianbiodiversity.org/), which includes the information of over wherever possible. For species distributions recorded as grid cells, we
20,000 species and distribution maps of over 12,000 species. For overlapped these grid cells with the GSUs. Only when the intersected
Europe, we used the data from Euro+Med PlantBase (http://www. area of a grid cell by GSUs was larger than half of its size, the record of
emplantbase.org/home.html) that contains the distribution records of this grid cell was kept. For species distribution data recorded as
over 95% of the European vascular plants. The distribution data for locality names, all locality names were first searched in the global
North America was compiled from the Plant Database of US Depart- geographical names service (http://www.geonames.org/) and then
ment of Agriculture (https://plants.usda.gov/java/) and the Database of were standardized by our geographical names dictionary to make it
Vascular Plants of Canada (VASCAN, http://data.canadensys.net/ consistent with the GSUs. When the boundary of a locality did not
vascan/search/), which contain over 300,000 state-level distribution completely overlap with the GSU boundaries, we intersected its
records for over 31,000 species in USA and Canada. Distribution data boundary with the GSUs and assigned the locality to the correspond-
for Australia was obtained from the Australian Plant Census (APC, ing GSU that covered at least 80% of its area.
https://www.anbg.gov.au/chah/apc/) and the Census of South Aus- The species distribution data in image format was digitized,
tralian Plants, Algae and Fungi (http://www.flora.sa.gov.au/census. georeferenced and converted into GIS shape files. All geographical
shtml). Species distribution data for Brazil was supplemented using operations were done in ArcGIS 10. We used MATLAB (2013b) to read
the Catalogue of Plants and Fungi of Brazil that includes ca. 35,000 and import data into the SQL server database (2008 R2) through the
higher plants and over 130,000 state-level distribution records. For SQL JDBC driver.
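The assignment rules described above for coordinates, grid cells, and localities can be illustrated with a minimal Python sketch. It is not the authors' MATLAB/ArcGIS workflow: the function names, the toy square "GSUs", and the use of planar ray casting on raw longitude/latitude values are all assumptions made for illustration, and the overlap areas are taken as given rather than computed from real boundaries.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]   # (longitude, latitude) in decimal degrees
Polygon = List[Point]         # ordered vertices; the ring is closed implicitly


def is_valid_coordinate(lon: float, lat: float) -> bool:
    """Reject spurious records outside the valid longitude/latitude ranges."""
    return -180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0


def point_in_polygon(pt: Point, poly: Polygon) -> bool:
    """Planar ray-casting test (a stand-in for MATLAB's inpolygon)."""
    x, y = pt
    inside = False
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_point_to_gsu(pt: Point, gsus: Dict[str, Polygon]) -> Optional[str]:
    """Return the name of the GSU containing the point, or None if dropped."""
    if not is_valid_coordinate(*pt):
        return None
    for name, poly in gsus.items():
        if point_in_polygon(pt, poly):
            return name
    return None


def keep_grid_cell(overlap_area: float, cell_area: float) -> bool:
    """Keep a grid-cell record only if more than half the cell lies in GSUs."""
    return overlap_area > 0.5 * cell_area


def assign_locality(overlap_by_gsu: Dict[str, float],
                    locality_area: float) -> Optional[str]:
    """Assign a locality to the GSU covering at least 80% of its area."""
    for name, area in overlap_by_gsu.items():
        if area >= 0.8 * locality_area:
            return name
    return None


# Hypothetical toy GSUs: two adjacent 10 x 10 degree squares.
TOY_GSUS: Dict[str, Polygon] = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
```

With these toy units, `assign_point_to_gsu((5, 5), TOY_GSUS)` falls in unit "A", while a record with longitude 200 is discarded before any polygon test. A production version would need projected or spherical geometry and real GSU boundaries.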


Quality control of the distributional data. To improve the quality of species distribution data, we conducted the following quality control process. We set a threshold for the number of data sources required to retain an occurrence record of a species in a GSU, which differed among regions. For European GSUs, occurrence data corroborated by at least 3 data sources were retained; for GSUs of Australia, China, Madagascar, and North America, the threshold was set to 2 data sources. All data were retained for Afghanistan, Central America, South America, Africa, Temperate Asia, and Tropical Asia because of the relative data deficiency in these regions. We did not include distributions of genera in the introduced parts of their range in our database. To identify the introduced genera in each geographical unit, we first collected the information for each species. When all species of a genus were introduced in the geographical unit, we deemed it an introduced genus. We also removed genus-level introduced records according to Plants of the World Online (POWO, http://www.plantsoftheworldonline.org/, accessed: Aug, 2023).

Finally, we manually checked the distribution maps of each genus. New records were added and dubious records were removed according to the genus descriptions from Flora of North America, Flora of China, Wikispecies (https://en.wikipedia.org/wiki/Wikispecies), and other sources. In total, our final species database includes 360,000 species and 15,500 genera with over 2,180,000 distribution records. Based on this species distribution database, we generated a genus distribution database, which includes 397,403 records for 14,976 angiosperm genera. Of these, 13,719 genera are included in our global phylogeny, and 11,798 are included in the phylogenetic tree for genera with molecular data. Differences in the number of genera in the distributional dataset and in the phylogenies are due to lack of detailed distributional data for some taxa. Spatial and latitudinal genus richness patterns are shown in Fig. 3. Figure S7E shows the quality of the available data in different geographical units.

Species and genus names. Distributional data sources were checked for nomenclatural issues independently of the molecular data and before the final distributional dataset was compiled. We compiled the distributional data for each seed plant genus by aggregating distribution data of all its species. The taxonomic status and the accepted names of species from all data sources were standardized using the recently updated databases Catalogue of Life (COL, http://www.catalogueoflife.org/col/, accessed: May, 2018) and The Plant List (TPL) available at http://www.theplantlist.org/ (accessed: Jan 3, 2015). We first matched all the species names with the accepted names of COL and TPL. The unmatched taxonomic names were mostly synonyms, unresolved names, and misspelt names. Therefore, we thoroughly rechecked species names and synonyms in TPL and replaced them with the corrected/accepted names. For taxonomic names that returned multiple matches from TPL, we selected accepted names with the highest confidence level. However, when they showed the same confidence level, we crosschecked the names manually in the World Checklist of Selected Plant Families (http://apps.kew.org/wcsp/) as well as Tropicos (http://www.tropicos.org/). The misspelt taxonomic names were corrected using the Taxonomic Name Resolution Service 4.0 (TNRS, http://tnrs.iplantcollaborative.org/TNRSapp.html, accessed: 18 May 2016). The taxonomic names including ‘aff.’, ‘cf.’, and ‘x’ (representing hybrids or taxa of uncertain identification) were not included. The final data set was then compared to the recently updated database of Plants of the World Online (POWO, http://plantsoftheworldonline.org/, accessed: Aug, 2023). When we found conflicts among these databases, we first followed COL, and then POWO. Taxonomic names that were identified as ‘unresolved’ in both COL and POWO were removed. Finally, we compiled 397,403 unique distributional records for 13,719 genera that were also included in our full genus-level flowering plant phylogeny (Supplementary Data 5).

We also collected CAM (refs. 42,77–79) and C4 (refs. 43,80) photosynthetic pathway data, and the proportion of woody species for each genus. Due to the difficulty in determining the CAM and C4 photosynthetic pathways for each species, we classified genera as CAM or C4 when some of their species were identified as CAM or C4 species. The proportion of woody species of each genus was estimated using a newly generated plant life form database, which includes life form information for each species based on online databases, published papers, and floras (refs. 10,81). We defined a genus as woody when the proportion of woody species within the genus was over 60%. Similarly, we defined a genus as herbaceous when the proportion of woody species was <40%. Evolutionary rates of different groups of taxa based on the traits outlined above were compared in the R (ref. 82) packages “multcompView” and “rcompanion” using one-tailed Wilcoxon tests or pairwise Mann–Whitney U tests (refs. 83,84).

Contemporary climate data
Mean annual temperature (MAT) and mean annual precipitation (MAP) data were downloaded from the WorldClim database (v2.0) at a spatial resolution of 2.5 arc minutes (ref. 85). These climate data are generated from terrestrial climate stations (34,542 stations for mean annual precipitation and 20,268 stations for mean annual temperature) using the thin-plate spline method, in which mean MODIS cloud cover, daytime land surface temperatures, distance to oceanic coast, and elevation are used as covariates (ref. 85). These data have been widely used in previous studies at different scales. Mean values of these two climate variables for each geographical unit were then calculated as the average of all cells of 2.5 × 2.5 arc minutes using Zonal Statistics in ArcGIS 10.1 (ESRI Inc.). Mean annual temperature/precipitation ranges were calculated as the difference between maximum and minimum values within each geographical unit. Mean and range of MAT and coefficient of variation of MAP were used to explore the relationships between the mean net diversification rate per geographical unit (calculated as the average of the current net diversification rates of all tips occurring in each geographical unit) and climate (Figs. 4A–D, S12, and S13).

Diversification analyses
Temporal patterns of seed plant diversification. The assumption that phylogenetic trees can be used to study diversification dynamics through time using stochastic birth-death models has recently been subjected to criticism (ref. 86). Ref. 86 shows that current methods to estimate historical fluctuations of evolutionary rates based on dated phylogenies cannot provide reliable estimates of past rates as, for a given tree, there is an infinite number of diversification scenarios that are equally likely. The authors suggest that the only metrics that current methods can estimate are the rates at present or “tip rates” and suggest using two new metrics, the pulled speciation rate and the pulled diversification rate. The identifiability issues raised by ref. 86 are important and have far-reaching implications as all widely used methods to study diversification (such as those used here) are affected. A recent evaluation of the relevance and potential impact of identifiability issues for studies of diversification based on dated trees of extant species shows that non-identifiability does not imply that current methods to study diversification cannot be used (ref. 87). Acknowledging the importance of identifiability problems, ref. 87 argues that using hypothesis-driven approaches, implementing priors in Bayesian frameworks, and penalizing model complexity limit their impact and allow current diversification methods to be used.

Here our main conclusions are mostly based on estimates of tip rates, which are suggested to be less affected by identifiability issues. To study rates through time, we use three different methods, two of which are Bayesian approaches (BAMM and diversification rate models implemented with RevBayes). These Bayesian approaches do not aim at selecting a single best-fit model that describes the evolutionary history of lineages but rather sample the posterior distribution of model space

and evaluate models based on their frequency in that posterior distribution.

Although estimates of diversification rates should be interpreted with caution, given our focus on tip rates, the use of a mixture of analytical approaches and the fact that despite some differences all results support our conclusions, we believe that our methodological approach can assess the patterns of diversification at the scale and with the precision that is necessary to support our conclusions.

To study the speciation and net diversification through time, we used the program Bayesian analyses of macroevolutionary mixtures v2.3.0 (BAMM; ref. 88). BAMM models the evolutionary dynamics of lineages through time by defining distinct macroevolutionary cohorts that share common rates of speciation and extinction and that are separated from other such cohorts by diversification rate shifts. BAMM can assess diversification rate heterogeneity on highly incomplete and phylogenetically non-random datasets and thus is appropriate for the analysis of our data. Although our molecular dataset has high coverage of seed plant genera, as described above we used the polytomy resolver method to include all genera of seed plants in our analyses. The resulting phylogeny covers all the flowering plant genera. However, at the species level, the tree is highly incomplete, as it only includes a single representative from each known genus. In addition, the level of sampling incompleteness is highly non-random, as one species represents a much higher proportion of the known diversity of species-poor genera, while this proportion is negligible for species-rich genera containing thousands of species (e.g., Astragalus).

To account for the diversity of species within genera, BAMM analyses were run for the full and molecular phylogenies separately. In these runs, we set the fraction of backbone completeness and the sampling fractions for each tip (genus) using the sampleProbsFilename parameter in BAMM. This parameter takes an input file in which the level of sampling incompleteness for each tip and for the backbone topology is specified. For analyses based on the molecular phylogeny, the fraction of backbone completeness was set to 0.725, assuming that the species that belong to the 1792 genera without DNA must have branched out from the backbone portion of the phylogeny. When using the global phylogeny, although we have included all described genera, we followed a conservative assumption and used a backbone sampling fraction of 0.97. The sampling fraction for each tip was set as the reciprocal of the known species diversity of each genus. Using clade-specific sampling fractions better accounts for potential biases introduced by non-random sampling strategies and the non-random distribution of missing tips89. These files are also available upon request from the authors.

Before all BAMM analyses, outgroup taxa were pruned and proper priors and MCMC chain settings were selected using the setBAMMpriors function in the package BAMMtools and the chainSwapPercent.R function, respectively89. All analyses were run on the Abel cluster at the University of Oslo and the High-performance Computing Platform of Peking University until satisfactory chain mixing and effective sample size (ESS) values of the log-likelihood were achieved. ESS was examined using the R library coda as recommended in ref. 90.

Outputs from the diversification analyses in BAMM were processed in the R package BAMMtools v 2.0.6 (ref. 89). We used the BAMMtools functions to evaluate the shifts in diversification regimes across the seed plant phylogeny (Fig. 1), to extract tip rates of diversification and to visualize the dynamics of evolutionary rates through time (Figs. 2, S1, and S4). To extract tip speciation and net diversification rates from the results of the BAMM analyses we used the getTipRates function. BAMM was run using a segLength setting of 0.02, so branches were split into fragments with a length of 2% of the total height of the topology. For each of these 2% fractions, constant rates are assumed. This discretization of the rates is used in BAMM to speed up computation. Here the 0.02 setting implies that branches are discretized in intervals ranging from 3 Ma (for the tree dated with the 140–150 Ma constraint for the crown angiosperms) up to about 5 Ma (for the tree dated with the 149–256 Ma constraint); thus, the tip rates that we estimate here depend on the recent evolutionary history of genera (in the last 3 Ma to 5 Ma).

While we were working on the present diversification analyses, the BAMM analytical approach was subjected to a critique by ref. 91. Using simulations, the authors of ref. 91 conclude that there are two major flaws with the BAMM approach. First, they conclude that the likelihood function used to estimate model parameters is incorrect; and second, they found the compound Poisson process prior model to be incoherent. We have carefully considered the arguments of ref. 91, as the core of our discussion is based on results from analyses in BAMM, and here we outline our arguments in favor of using BAMM in combination with other analytical approaches. Most of the points raised by ref. 91, and specifically the one concerning the Poisson prior, have been addressed by ref. 92 and in the BAMM online manual93, which includes a reanalysis of the data used by ref. 91. This rebuttal shows that there are some major flaws in the way the authors of ref. 91 used BAMM and demonstrates that the likelihood function of ref. 91 is incorrect. Another comment on the ability of the BAMM approach to estimate diversification rates correctly has also become available recently94; for a reply see ref. 95. Although refs. 91, 94 discuss important concerns that are relevant to the usage of BAMM, the rebuttals provided by refs. 92, 95 show that BAMM performs as intended.

To further ensure that the use of BAMM here does not lead to biased results, we used two alternative approaches to estimate diversification rates through time, RPANDA (ref. 96) and RevBayes 1.0.10 (ref. 97). In RPANDA, we fitted four different birth-death models (speciation and extinction constant; speciation constant, extinction variable; extinction constant, speciation variable; and both speciation and extinction variable) to our data, taking into account the fraction of species not present in our phylogeny. Of these four possibilities, the model where both speciation and extinction were allowed to vary best fit the data (based on AIC scores). Results from BAMM also favor a scenario where both speciation and extinction vary through time. Next, we used the fit_env function to investigate the diversification dynamics of angiosperms in relation to environmental variability (in this case, global temperature). For the temperature, we used the environmental data on average global temperatures through the Cenozoic that are provided by RPANDA (in the InfTemp dataset distributed with the package) and extended them to include the time interval back to the origin of crown angiosperms (150, 210, or 260 Ma, depending on the dating analysis). The global historical temperature data in the InfTemp dataset were reconstructed from δ18O measurements98,99 and are freely available. Additional historical temperature data for periods before the Cenozoic were extracted from ref. 24, and were estimated with δ18O measurements with corrections for water pH effects (see ref. 24 for more details). Although the temporal resolution and precision of deep-time temperature data are not as high as those for recent geological periods, they are indicative of the general trends in global temperature fluctuation. Results of these environment-dependent diversification analyses show general trends that are very similar to those obtained by BAMM and are presented in Fig. S2.

In RevBayes we set up analyses using the episodic birth-death model100 following the instructions in the RevBayes manual. This model allows rates to change between intervals in which they are treated as constant. RevBayes analyses were run until the ESS of estimated model parameters exceeded 200 after burn-in. Results were then processed with the R package RevGadgets v 1.0.0 (Fig. S3).

In addition, we compared the inferred diversification dynamics of Silvestro et al.16 based on fossils with our results. In their study, Silvestro et al.16 used a different methodological approach and an independent data set based on the fossil record and found patterns of diversification dynamics that are very consistent with our BAMM,
RPANDA and RevBayes results. The convergence of results from BAMM, RPANDA, RevBayes and fossil-based analyses is an additional indication that our results represent a real phenomenon and not an analytical artefact. Thus, we used BAMM to generate the results in the main text and figures, as it is more flexible than RPANDA and RevBayes.

We also compared the estimated evolutionary rates based on the three dating schemes, and we found that these are also very similar (Fig. S28).

Spatial patterns of generic diversity and evolutionary dynamics. To calculate spatial patterns of angiosperm generic diversity (Fig. 3), we summed the total number of genera in each GSU. We mapped the spatial patterns of angiosperm evolutionary dynamics in terms of average genus age, speciation, and net diversification of all genera within GSUs.

We calculated the mean values of genus age, speciation rate and net diversification rate for all genera in each GSU (Figs. 3, S8, and S9). To demonstrate the latitudinal gradients of these variables, we used the ‘lowess’ function to plot the generic diversity, mean genus age and mean evolutionary rates per geographical unit against the latitude of the geometric center of the geographical units (Figs. 3 and S10). The ‘lowess’ regression line was generated with the ‘smooth’ function implemented in MATLAB (2017a).

To visually check the spatial patterns of generic diversity according to their evolutionary history, we first assigned all genera to four quartiles by their speciation rate or net diversification rate, respectively. For each quartile, we then mapped the relative generic diversity within each geographical unit (Figs. 5 and S20–S24). Spatial variation of absolute and relative generic diversity (proportion) and mean evolutionary rates per geographical unit among different age quartiles was also mapped by classifying all genera into four quartiles by their age (Figs. S14–S19).

To explore the correlations between the spatial variation of climate and the spatial variation in mean evolutionary rates per geographical unit, we employed linear regression models using mean net diversification rate per geographical unit as the dependent variable and mean annual temperature or mean annual precipitation as predictors, respectively. Reduced major axis regression (i.e. type II regression) was used to explore the correlations between generic richness and mean genus age and mean evolutionary rate because of the potential errors in both dependent variables and predictors (Figs. 4E, F and S26). Linear regressions and reduced major axis regressions were performed with MATLAB v. 2017b using the functions ‘lm.fit’ and ‘lsqfitgm’ (https://www.mbari.org/summary-of-modifications/). We built a null model to test whether diversification rate was randomly distributed across space, assuming that species were randomly distributed across space according to genus richness patterns. We repeated the random process 999 times and used a t-test to assess whether the observed correlations between the spatial variation of climate and the mean evolutionary rates per geographical unit differed significantly from the null model.

Finally, to evaluate the spatial patterns of rates across latitudes, we investigated the evolutionary dynamics of present-day generic assemblages by dividing the world into 13 latitudinal belts at 10-degree intervals as follows: S5-N5 (the equator), S5-S15, S15-S25, S25-S35, S35-S45, S45-S55, N5-N15, N15-N25, N25-N35, N35-N45, N45-N55, N55-N65 and N65-N75. We then estimated the evolutionary rates through time for all the genera distributed in each belt separately (Figs. 2B, S5, and S10). Genera of each latitudinal belt were extracted by aggregating the genera of all GSUs within each latitudinal belt. A GSU was assigned to a latitudinal belt if >50% of its area fell within that belt. Genera that are found in more than one latitudinal belt were included in all respective subsamples. We used the Köppen climate classification system to categorize temperate regions and drylands. In our study, drylands correspond to the dry climate (B) and temperate regions correspond to the continental climate (D) in the Köppen climate classification system.

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
All information needed to evaluate the results and conclusions presented in this study is provided in the manuscript and/or supplementary materials. Phylogenies are publicly available at https://en.geodata.pku.edu.cn/index.php?c=content&a=list&catid=200. Distributional data and diversification rate estimates are provided in Supplementary Data 5. Distribution data were obtained from both online databases and directly from the literature, and the complete list of distributional data sources is provided in Supplementary Data 1. Species distribution data recorded as locality names were searched in the global geographical names service http://www.geonames.org. Global Administrative Areas boundaries were downloaded from http://www.gadm.org and were used as a base to develop the geographical units used in our spatial analyses. The shape file of the geographical units used in the analyses is included in Supplementary Data 5. Family-level evolutionary rate estimates are provided in Supplementary Data 3. All sequences used in the phylogenetic analyses are available in GenBank and accession numbers for all sequences used in the analyses are provided in Supplementary Data 4. Sampling fractions used in the BAMM analyses are available in Supplementary Data 4. The information on fossil calibrations used in the analyses, along with a complete list of the relevant references, is provided in Supplementary Data 2. The taxonomic status and accepted names of species were standardized using the Catalogue of Life (http://www.catalogueoflife.org/col/, accessed: May, 2018), The Plant List (TPL; http://www.theplantlist.org/), World Checklist of Selected Plant Families (http://apps.kew.org/wcsp/) and Tropicos (http://www.tropicos.org/). Misspelled taxonomic names were corrected using the Taxonomic Name Resolution Service 4.0 (TNRS, http://tnrs.iplantcollaborative.org/TNRSapp.html, accessed: May, 2018). The final data set was also compared to Plants of the World Online (POWO, http://plantsoftheworldonline.org/, accessed: August, 2023). Climate data were downloaded from the WorldClim database (v2.0) and CHELSA (v2.0) https://chelsa-climate.org/, and the climatic data used in the analyses are provided in Supplementary Data 5. Photosynthetic pathway data were collected directly from the literature77–80. Growth form data were obtained from the Plant Trait Database https://www.try-db.org/ and from published databases10,81.

Code availability
During the work on this manuscript, we did not develop custom code, novel computation algorithms, or mathematical approaches. We used publicly available software packages as described in the materials and methods.

References
1. Willis, K. & McElwain, J. The Evolution of Plants (OUP Oxford, 2013).
2. Shear, W. A. Shaking The Tree: Readings From Nature In The History Of Life (ed. Gee, H.) 169–179 (University of Chicago Press, 2000).
3. Crepet, W. L. & Niklas, K. J. Darwin’s second “abominable mystery”: why are there so many angiosperm species? Am. J. Bot. 96, 366–381 (2009).
4. Humboldt, A. von & Bonpland, A. Essai Sur La Géographie Des Plantes (Schoell et Cie, 1805).
5. Kreft, H. & Jetz, W. Global patterns and determinants of vascular plant diversity. Proc. Natl Acad. Sci. USA 104, 5925–5930 (2007).
6. Jansson, R. & Davies, T. J. Global variation in diversification rates of flowering plants: energy vs. climate change. Ecol. Lett. 11, 173–183 (2008).
7. Wiens, J. J. & Donoghue, M. J. Historical biogeography, ecology and species richness. Trends Ecol. Evol. 19, 639–644 (2004).
8. Jablonski, D., Roy, K. & Valentine, J. W. Out of the tropics: evolutionary dynamics of the latitudinal diversity gradient. Science 314, 102–106 (2006).
9. Davies, T. J. et al. Darwin’s abominable mystery: insights from a supertree of the angiosperms. Proc. Natl Acad. Sci. USA 101, 1904–1909 (2004).
10. Zanne, A. E. et al. Three keys to the radiation of angiosperms into freezing environments. Nature 506, 89–92 (2014).
11. Linder, H. P. Plant species radiations: where, when, why? Philos. Trans. R. Soc. B 363, 3097–3105 (2008).
12. Bell, C. D., Soltis, D. E. & Soltis, P. S. The age and diversification of the angiosperms re-revisited. Am. J. Bot. 97, 1296–1303 (2010).
13. Magallón, S., Gómez-Acevedo, S., Sánchez-Reyes, L. L. & Hernández-Hernández, T. A metacalibrated time-tree documents the early rise of flowering plant phylogenetic diversity. New Phytol. 207, 437–453 (2015).
14. Ramírez-Barahona, S., Sauquet, H. & Magallón, S. The delayed and geographically heterogeneous diversification of flowering plant families. Nat. Ecol. Evol. 4, 1232–1238 (2020).
15. Igea, J. & Tanentzap, A. J. Angiosperm speciation cools down in the tropics. Ecol. Lett. 23, 692–700 (2020).
16. Silvestro, D., Cascales-Miñana, B., Bacon, C. D. & Antonelli, A. Revisiting the origin and diversification of vascular plants through a comprehensive Bayesian analysis of the fossil record. New Phytol. 207, 425–436 (2015).
17. Cascales-Miñana, B. & Cleal, C. J. The plant fossil record reflects just two great extinction events. Terra Nova 26, 195–200 (2014).
18. Stevens, P. F. Angiosperm Phylogeny Website. Version 14. http://www.mobot.org/MOBOT/research/APweb/ (2017).
19. Barba-Montoya, J., dos Reis, M., Schneider, H., Donoghue, P. C. J. & Yang, Z. Constraining uncertainty in the timescale of angiosperm evolution and the veracity of a Cretaceous Terrestrial Revolution. New Phytol. 218, 819–834 (2018).
20. Smith, S. A. & Brown, J. W. Constructing a broadly inclusive seed plant phylogeny. Am. J. Bot. 105, 302–314 (2018).
21. Janssens, S. B. et al. A large-scale species level dated angiosperm phylogeny for evolutionary and ecological analyses. Biodivers. Data J. 8, e39677 (2020).
22. Coiffard, C., Gomez, B., Daviero-Gomez, V. & Dilcher, D. L. Rise to dominance of angiosperm pioneers in European Cretaceous environments. Proc. Natl Acad. Sci. USA 109, 20955–20959 (2012).
23. Crane, P. R., Friis, E. M. & Pedersen, K. R. The origin and early diversification of angiosperms. Nature 374, 27–33 (1995).
24. Royer, D. L., Berner, R. A., Montañez, I. P., Tabor, N. J. & Beerling, D. J. CO2 as a primary driver of Phanerozoic climate. GSA Today 14, 4 (2004).
25. Chaboureau, A.-C., Sepulchre, P., Donnadieu, Y. & Franc, A. Tectonic-driven climate change and the diversification of angiosperms. Proc. Natl Acad. Sci. USA 111, 14066–14070 (2014).
26. Wing, S. L. et al. Floral and environmental gradients on a Late Cretaceous landscape. Ecol. Monogr. 82, 23–47 (2012).
27. Krug, A. Z. & Jablonski, D. Long-term origination rates are reset only at mass extinctions. Geology 40, 731–734 (2012).
28. Jetz, W., Thomas, G. H., Joy, J. B., Hartmann, K. & Mooers, A. O. The global diversity of birds in space and time. Nature 491, 444–448 (2012).
29. Auderset, A. et al. Enhanced ocean oxygenation during Cenozoic warm periods. Nature 609, 77–82 (2022).
30. Eldrett, J. S., Greenwood, D. R., Harding, I. C. & Huber, M. Increased seasonality through the Eocene to Oligocene transition in northern high latitudes. Nature 459, 969–973 (2009).
31. Schubert, M., Marcussen, T., Meseguer, A. S. & Fjellheim, S. The grass subfamily Pooideae: Cretaceous–Palaeocene origin and climate-driven Cenozoic diversification. Glob. Ecol. Biogeogr. 28, 1168–1182 (2019).
32. Heine, C., Müller, R. D. & Gaina, C. Continent-Ocean Interactions Within East Asian Marginal Seas (eds. Clift, P., Kuhnt, W., Wang, P. & Hayes, D.) 37–54 (American Geophysical Union, 2004).
33. Herbert, T. D. et al. Late Miocene global cooling and the rise of modern ecosystems. Nat. Geosci. 9, 843–847 (2016).
34. Klak, C., Reeves, G. & Hedderson, T. Unmatched tempo of evolution in Southern African semi-desert ice plants. Nature 427, 63–65 (2004).
35. Cai, L. et al. Climatic stability and geological history shape global centers of neo- and paleoendemism in seed plants. Proc. Natl Acad. Sci. USA 120, e2300981120 (2023).
36. Jia, G., Peng, P., Zhao, Q. & Jian, Z. Changes in terrestrial ecosystem since 30 Ma in East Asia: Stable isotope evidence from black carbon in the South China Sea. Geology 31, 1093–1096 (2003).
37. Strömberg, C. A. Evolution of grasses and grassland ecosystems. Annu. Rev. Earth Planet. Sci. 39, 517–544 (2011).
38. Smith, S. A. & Donoghue, M. J. Rates of molecular evolution are linked to life history in flowering plants. Science 322, 86–89 (2008).
39. Rice, A. et al. The global biogeography of polyploid plants. Nat. Ecol. Evol. 3, 265–273 (2019).
40. Zhan, S. H., Drori, M., Goldberg, E. E., Otto, S. P. & Mayrose, I. Phylogenetic evidence for cladogenetic polyploidization in land plants. Am. J. Bot. 103, 1252–1258 (2016).
41. Brochmann, C. et al. Polyploidy in arctic plants. Biol. J. Linn. Soc. 82, 521–536 (2004).
42. Bone, R. E., Smith, J. A. C., Arrigo, N. & Buerki, S. A macro-ecological perspective on crassulacean acid metabolism (CAM) photosynthesis evolution in Afro-Madagascan drylands: Eulophiinae orchids as a case study. New Phytol. 208, 469–481 (2015).
43. Osborne, C. P. et al. A global database of C4 photosynthesis in grasses. New Phytol. 204, 441–446 (2014).
44. Spriggs, E. L., Christin, P.-A. & Edwards, E. J. C4 photosynthesis promoted species diversification during the miocene grassland expansion. PLoS ONE 9, e97722 (2014).
45. Osborne, C. P. & Freckleton, R. P. Ecological selection pressures for C4 photosynthesis in the grasses. Proc. R. Soc. B: Biol. Sci. 276, 1753–1760 (2009).
46. Wiens, J. J. The causes of species richness patterns across space, time, and clades and the role of “ecological limits”. Quart. Rev. Biol. 86, 75–96 (2011).
47. Stephens, P. R. & Wiens, J. J. Explaining species richness from continents to communities: the time-for-speciation effect in emydid turtles. Am. Nat. 161, 112–128 (2003).
48. Rabosky, D. L. et al. An inverse latitudinal gradient in speciation rate for marine fishes. Nature 559, 392–395 (2018).
49. Weir, J. T. & Schluter, D. The latitudinal gradient in recent speciation and extinction rates of birds and mammals. Science 315, 1574–1576 (2007).
50. Pearson, P. N. et al. Stable warm tropical climate through the Eocene Epoch. Geology 35, 211–214 (2007).
51. Zheng, H. et al. Late oligocene–early miocene birth of the Taklimakan Desert. Proc. Natl Acad. Sci. USA 112, 7662–7667 (2015).
52. Zhang, Z. et al. Aridification of the Sahara desert caused by Tethys Sea shrinkage during the Late Miocene. Nature 513, 401 (2014).
53. Coiro, M., Doyle, J. A. & Hilton, J. How deep is the conflict between molecular and fossil evidence on the age of angiosperms? New Phytol. 223, 83–99 (2019).
54. Mannion, P. D., Upchurch, P., Benson, R. B. J. & Goswami, A. The latitudinal biodiversity gradient through deep time. Trends Ecol. Evol. 29, 42–50 (2014).
55. Couvreur, T. L. P. Odd man out: why are there fewer plant species in African rain forests? Plant Syst. Evol. 301, 1299–1313 (2015).
56. Etienne, R. S. et al. A minimal model for the latitudinal diversity gradient suggests a dominant role for ecological limits. Am. Nat. 194, E122–E133 (2019).
57. Xu, X., Dimitrov, D., Rahbek, C. & Wang, Z. NCBIminer: sequences harvest from Genbank. Ecography 38, 426–430 (2015).
58. Campbell, V. & Lapointe, F.-J. The use and validity of composite taxa in phylogenetic analysis. Syst. Biol. 58, 560–572 (2009).
59. Pyron, R. A. et al. Genus-level phylogeny of snakes reveals the origins of species richness in Sri Lanka. Mol. Phylogenet. Evol. 66, 969–978 (2013).
60. Hinchliff, C. E. & Roalson, E. H. Using supermatrices for phylogenetic inquiry: an example using the sedges. Syst. Biol. 62, 205–219 (2013).
61. Katoh, K., Asimenos, G. & Toh, H. Multiple alignment of DNA sequences with MAFFT. Meth. Mol. Biol. 537, 39–64 (2009).
62. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
63. Soltis, D. E. et al. Angiosperm phylogeny: 17 genes, 640 taxa. Am. J. Bot. 98, 704–730 (2011).
64. Smith, S. A. & O’Meara, B. C. treePL: divergence time estimation using penalized likelihood for large phylogenies. Bioinformatics 28, 2689–2690 (2012).
65. Sanderson, M. J. Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Mol. Biol. Evol. 19, 101–109 (2002).
66. Rabosky, D. L. et al. Rates of speciation and morphological evolution are correlated across the largest vertebrate radiation. Nat. Commun. 4, 1958 (2013).
67. Magallón, S. Using fossils to break long branches in molecular dating: a comparison of relaxed clocks applied to the origin of angiosperms. Syst. Biol. 59, 384–399 (2010).
68. Wu, Z. et al. A precise chloroplast genome of Nelumbo nucifera (Nelumbonaceae) evaluated with Sanger, Illumina MiSeq, and PacBio RS II sequencing platforms: insight into the plastid evolution of basal eudicots. BMC Plant Biol. 14, 289 (2014).
69. Kuhn, T. S., Mooers, A. Ø. & Thomas, G. H. A simple polytomy resolver for dated phylogenies. Methods Ecol. Evol. 2, 427–436 (2011).
70. Drummond, A. J., Suchard, M. A., Xie, D. & Rambaut, A. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–1973 (2012).
71. Rambaut, A. & Drummond, A. J. Tracer v1.4. Available at http://beast.bio.ed.ac.uk/Tracer (2007).
72. Liu, Y. et al. An updated floristic map of the world. Nat. Commun. 14, 2990 (2023).
73. Rosindell, J. & Harmon, L. J. OneZoom: a fractal explorer for the tree of life. PLoS Biol. 10, e1001406 (2012).
74. Xu, X., Wang, Z., Rahbek, C., Lessard, J.-P. & Fang, J. Evolutionary history influences the effects of water–energy dynamics on oak diversity in Asia. J. Biogeogr. 40, 2146–2155 (2013).
75. Xu, X., Wang, Z., Rahbek, C., Sanders, N. J. & Fang, J. Geographical variation in the importance of water and energy for oak diversity. J. Biogeogr. 43, 279–288 (2016).
76. Brummitt, R. K. World Geographical Scheme For Recording Plant Distributions (Hunt Inst. for Botanical Documentation, 2001).
77. Smith, J. A. C. & Winter, K. Crassulacean Acid Metabolism: Biochemistry, Ecophysiology and Evolution (eds. Winter, K. & Smith, J. A. C.) 427–436 (Springer, 1996).
78. Silvera, K. et al. Evolution along the crassulacean acid metabolism continuum. Funct. Plant Biol. 37, 995–1010 (2010).
79. Winter, K., Holtum, J. A. M. & Smith, J. A. C. Crassulacean acid metabolism: a continuous or discrete trait? New Phytol. 208, 73–78 (2015).
80. Sage, R. F. A portrait of the C4 photosynthetic family on the 50th anniversary of its discovery: species number, evolutionary lineages, and Hall of Fame. J. Exp. Bot. 67, 4039–4056 (2016).
81. Engemann, K. et al. A plant growth form dataset for the New World. Ecology 97, 3243–3243 (2016).
82. R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2019).
83. Cohen, J. Statistical Power Analysis for the Behavioral Sciences, 2nd edn (Lawrence Erlbaum Associates, Publishers, 1988).
84. Vargha, A. & Delaney, H. D. A critique and improvement of the CL common language effect size statistics of McGraw and Wong. J. Educ. Behav. Stat. 25, 101–132 (2000).
85. Fick, S. E. & Hijmans, R. J. WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas. Int. J. Climatol. 37, 4302–4315 (2017).
86. Louca, S. & Pennell, M. W. Extant timetrees are consistent with a myriad of diversification histories. Nature 580, 502–505 (2020).
87. Morlon, H., Hartig, F. & Robin, S. Prior hypotheses or regularization allow inference of diversification histories from extant timetrees. bioRxiv https://doi.org/10.1101/2020.07.03.185074 (2020).
88. Rabosky, D. L. Automatic detection of key innovations, rate shifts, and diversity-dependence on phylogenetic trees. PLoS ONE 9, e89543 (2014).
89. Rabosky, D. L. et al. BAMMtools: an R package for the analysis of evolutionary dynamics on phylogenetic trees. Methods Ecol. Evol. 5, 701–707 (2014).
90. Plummer, M., Best, N., Cowles, K. & Vines, K. CODA: convergence diagnosis and output analysis for MCMC. R News 6, 7–11 (2006).
91. Moore, B. R., Höhna, S., May, M. R., Rannala, B. & Huelsenbeck, J. P. Critically evaluating the theory and performance of Bayesian analysis of macroevolutionary mixtures. Proc. Natl Acad. Sci. USA 113, 9569–9574 (2016).
92. Rabosky, D. L., Mitchell, J. S. & Chang, J. Is BAMM flawed? Theoretical and practical concerns in the analysis of multi-rate diversification models. Syst. Biol. 66, 477–498 (2017).
93. Rabosky, D. L. BAMM Documentation - bamm 2.5.0 documentation. http://bamm-project.org/documentation.html (2015).
94. Meyer, A. L. S. & Wiens, J. J. Estimating diversification rates for higher taxa: BAMM can give problematic estimates of rates and rate shifts. Evolution 72, 39–53 (2018).
95. Rabosky, D. L. BAMM at the court of false equivalency: a response to Meyer and Wiens. Evolution 72, 2246–2256 (2018).
96. Morlon, H., Condamine, F. L., Lewitus, E. & Manceau, M. RPANDA: an R package for macroevolutionary analyses on phylogenetic trees. Methods Ecol. Evol. 7, 589–597 (2015).
97. Höhna, S. et al. RevBayes: bayesian phylogenetic inference using graphical models and an interactive model-specification language. Syst. Biol. 65, 726–736 (2016).
98. Epstein, S., Buchsbaum, R., Lowenstam, H. A. & Urey, H. C. Revised carbonate-water isotopic temperature scale. Geol. Soc. Am. Bull. 64, 1315–1326 (1953).
99. Zachos, J. C., Dickens, G. R. & Zeebe, R. E. An early Cenozoic perspective on greenhouse warming and carbon-cycle dynamics. Nature 451, 279 (2008).
100. Höhna, S. The time-dependent reconstructed evolutionary process with a key-role for mass-extinction events. J. Theor. Biol. 380, 321–331 (2015).


Acknowledgements
We thank K. Katoh for his advice on MAFFT alignment strategies, G. Thomas for his advice on polytomy resolution, and D. Rabosky for advice on BAMM. This work was supported by the National Key Research and Development Program of China (#2022YFF0802300 to Z.W., #2017YFC0505203 to X.X. and J.L.), National Natural Science Foundation of China (#32125026 and #31988102 to Z.W., #31770566 to X.X., #32030006 to J.L.), Natural Science Foundation of Sichuan Province (2023NSFSC1280 to X.X.), Chinese Academy of Sciences-Peking University Pioneer Collaboration Team, and funds from the Danish National Research Foundation to the Center for Macroecology, Evolution and Climate (DNRF96). All computation was conducted in the High-performance Computing Platform of Peking University and the Abel cluster of the Norwegian Metacenter for Computational Science (NOTUR; project NN9601K to D.D.). J.D.K. was supported by an Individual Fellowship from Marie Sklodowska-Curie actions (MSCA-792534) and a Reintegration Fellowship from the Carlsberg Foundation (CF19-0334).

Author contributions
Z.W., C.R., D.D., and X.X. conceived the idea; X.X., Z.W., D.D., X.S., and Y.L. compiled the data; D.D., X.X., X.S., and Z.W. conducted the analyses; Z.W., D.D., X.X., and C.R. led the writing; J.R. produced the interactive tree visualization. D.D., X.X., X.S., N.S., Y.L., J.D.K., L.L., D.N-B., J.R., Y.Y., J.F., J.L., B.S., J.F., C.R., and Z.W. discussed the findings and contributed to the writing of the final version of the manuscript.

Competing interests
The authors declare no competing interests.

Additional information
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41467-023-43396-8.

Correspondence and requests for materials should be addressed to Zhiheng Wang.

Peer review information Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work.

Reprints and permissions information is available at http://www.nature.com/reprints

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2023
