Reconciling Hierarchical Taxonomy With Molecular Phylogenies
63(6):1010–1017, 2014
© The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved.
For Permissions, please email: journals.permissions@oup.com
DOI:10.1093/sysbio/syu061
Advance Access publication August 19, 2014
Received 26 February 2014; reviews returned 12 August 2014; accepted 13 August 2014
Associate Editor: Frank Anderson
quantify the full extent of such taxonomic inconsistencies need to be explored.

The recent emergence of complete dated species-level DNA phylogenies for large well-studied animal groups such as birds (Jetz et al. 2012) and mammals (Bininda-Emonds et al. 2007; Fritz et al. 2009) has greatly benefited the study of the natural world. However, these new hypotheses have not yet been used to systematically review the classifications that ornithologists and mammalogists use on a daily basis. Analyzing levels of temporal divergence within these large phylogenies allows temporal inconsistencies (i.e., inconsistent levels of phylogenetic divergence among taxonomic groups) to be fully quantified within major taxonomic ranks. Temporal divergence is often an important consideration

Taxonomic Clade Ages
The level of divergence (referred to as “clade age”) within taxonomic groups was calculated as the age, in millions of years ago (ma), of the oldest phylogenetic node connecting the immediately subordinate taxonomic groups within a clade (i.e., species within genera, genera within families, and families within orders). Monotypic taxonomic groups (with only a single immediately subordinate taxonomic group) were not assigned a clade age and are therefore not included within this specific aspect of the study. Note that this approach differs subtly from the approach known as “crown age,” which is calculated as the age of the node representing the most recent common
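As a minimal sketch of the clade-age definition given in this section, written in Python rather than the R used for the study, clade age can be read literally as the age of the oldest node that joins tips from two different immediately subordinate groups. The toy tree, node ages, and function names below are hypothetical and serve only as illustration; this is one possible reading, not the authors' code.

```python
from itertools import combinations

# Hypothetical toy tree: child -> parent, plus ages (Ma) of internal nodes.
PARENT = {"s1": "n1", "s2": "n1", "n1": "root",
          "s3": "n2", "s4": "n2", "n2": "root"}
NODE_AGE = {"root": 30.0, "n1": 8.0, "n2": 12.0}


def ancestors(node):
    """Return the ancestors of `node`, nearest first, ending at the root."""
    path = []
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def mrca(tip_a, tip_b):
    """Most recent common ancestor of two tips."""
    seen = set(ancestors(tip_a))
    for node in ancestors(tip_b):
        if node in seen:
            return node
    raise ValueError("tips are not in the same tree")


def clade_age(tip_to_subgroup):
    """Age of the oldest node connecting different subordinate groups.

    `tip_to_subgroup` maps each tip in the taxon to its immediately
    subordinate group (e.g. species -> genus when the taxon is a family).
    Returns None for a monotypic taxon (a single subordinate group).
    """
    ages = [
        NODE_AGE[mrca(a, b)]
        for a, b in combinations(tip_to_subgroup, 2)
        if tip_to_subgroup[a] != tip_to_subgroup[b]
    ]
    return max(ages) if ages else None


family = {"s1": "GenusA", "s2": "GenusA", "s3": "GenusB", "s4": "GenusB"}
print(clade_age(family))                             # node joining the two genera
print(clade_age({"s1": "GenusA", "s2": "GenusA"}))   # monotypic: no clade age
```

For a genus, each species is its own subordinate group, so the mapping would be of the form `{"s1": "s1", "s2": "s2"}`.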
These raw scores are not directly comparable because expected levels of error differ among study groups and among taxonomic ranks. “Standardized error scores” were therefore produced for each taxonomic rank within each taxonomic group. These scores compare the total amount of error to random expectations, resulting in a metric for which zero represents a perfect score (a completely consistent taxonomy), and one represents a score that is equal to the mean random expectation. Random expectations were generated by randomly dividing phylogenies into the required number of monophyletic groups (Fig. 1). For example, for random bird orders, all bird species were randomly divided into 40 monophyletic groups in accordance with the number of orders in the bird taxonomy. Species were divided into random clades by randomly splitting the phylogenetic nodes, starting with the root, until the required number of clades was produced. Nodes were randomly selected for splitting in a weighted manner that ensured that all possible random results were equally probable. This process was repeated 1000 times and the mean random score calculated. Standardized error scores were then calculated by dividing the total observed error by the mean total random error. All analyses were performed using the R statistical analysis software (R Core Team 2013), using the packages ape, foreach and picante (Paradis et al. 2004; Kembel et al. 2010; Analytics and Weston 2013).

Phylogenetic Bias in Taxonomic Inconsistencies
We investigated whether particular parts of phylogenies are more prone to inconsistent taxonomic treatment. Specifically, we ask whether temporal error is randomly distributed across phylogenetic trees or if it is clustered toward the base or the tips. We conducted Spearman correlation tests to consider potential correlations between taxonomic clade phylogenetic position and clade temporal error score, for all three hierarchical ranks considered and for both birds and mammals. Taxonomic clade phylogenetic position was quantified as root node distance, that is, the number of nodes between the root and ancestral node of the clade, with clades having low root node distances referred to as being more basal and those with higher scores referred to as being more distal. Clade temporal error scores were calculated as explained in the “Consistency among taxonomic schemes” section.

RESULTS
For birds, the new phylogenetically delineated orders, families, and genera were generated by cutting the phylogeny at 65, 37.5, and 11.405 ma, respectively (Supplementary Material S1, available from http://dx.doi.org/10.5061/dryad.qd3pd). Standardized error scores for the three bird taxonomic ranks show that the taxonomy is most consistent at the order level and least consistent at the genus level (Fig. 2). Overall, 695 of the 2325 taxonomic clades were completely consistent between the original and revised classifications, with roughly half of consistent clades being monotypic. Traditional hierarchical taxonomic clade age distributions overlapped considerably (Fig. 3a) and clades varied substantially in their amount of temporal error (Fig. 4a–c; Supplementary Material S2). The most extreme comparison was between the order Casuariiformes, “emus and cassowaries” (clade
[Figure 2 plot. Axes: percentage of identical groups (0 to 100) and standardized error score (0 = “Perfect” to 1.0 = “Mean Random”).]
FIGURE 2. Consistency of current higher taxonomic classifications for birds and mammals, for genera, families, and orders, as measured by the percentage of groups that are identical to a group produced by a temporal banding approach (bars) or by standardized error scores (points), for which zero = no error and one = mean random error. Standardized error scores are the most appropriate metric for comparing study groups or hierarchical ranks. See the “Methods” section for the analytical protocol for producing standardized error scores.
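The protocol behind the standardized error scores (random division of the phylogeny into a fixed number of monophyletic groups, followed by division of the observed error by the mean random error) can be sketched as follows. This is a Python illustration under stated assumptions, not the authors' R code: the toy tree and function names are hypothetical, the tree is assumed to be strictly bifurcating (no polytomies), and weighting each split by the number of partitions it leads to is only one way to make every random result equally probable. The error metric itself is not reproduced here, so the totals passed to `standardized_error` are placeholders.

```python
import random
from functools import lru_cache

# Hypothetical bifurcating toy tree: internal node -> (left child, right child).
# Names absent from the dict are tips (species).
CHILDREN = {
    "root": ("n1", "n2"),
    "n1": ("a", "b"),
    "n2": ("c", "n3"),
    "n3": ("d", "e"),
}


@lru_cache(maxsize=None)
def count_partitions(node, k):
    """Number of ways to divide the tips below `node` into k monophyletic groups."""
    if k == 1:
        return 1  # keep the whole clade as one group
    if node not in CHILDREN:
        return 0  # a single tip cannot be split further
    left, right = CHILDREN[node]
    return sum(
        count_partitions(left, i) * count_partitions(right, k - i)
        for i in range(1, k)
    )


def sample_partition(node, k, rng):
    """Draw k clade roots so that every valid partition is equally likely."""
    if count_partitions(node, k) == 0:
        raise ValueError("cannot form that many groups below this node")
    if k == 1:
        return [node]
    left, right = CHILDREN[node]
    # Weight each way of sharing the k groups between the two children by the
    # number of partitions it leads to.
    weights = [
        count_partitions(left, i) * count_partitions(right, k - i)
        for i in range(1, k)
    ]
    i = rng.choices(range(1, k), weights=weights)[0]
    return sample_partition(left, i, rng) + sample_partition(right, k - i, rng)


def standardized_error(observed_total, random_totals):
    """Observed total error divided by the mean total error of random taxonomies."""
    return observed_total / (sum(random_totals) / len(random_totals))


rng = random.Random(1)
print(sample_partition("root", 3, rng))             # three random monophyletic groups
print(standardized_error(5.0, [8.0, 12.0, 10.0]))   # 0.5
```

In a full analysis the list of random totals would hold one value per random taxonomy (1000 in the study), each computed with the same error metric as the observed taxonomy.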
age = 22.31 ma), and the genera Caprimulgus and Eurostopodus, both nightjars, with clade ages of 55.13 ma. Our two repeated versions of the bird analysis, one using the data-only phylogeny and the other using the IOC taxonomic classification, both produced highly similar results to the main analysis (Supplementary Material S3).

For mammals, the new phylogenetically based orders, families, and genera were generated by cutting the phylogeny at 75.7, 44.2, and 17.05 ma, respectively (Supplementary Material S1). Due to polytomies, it was not possible to find cut-off ages returning the same number of clades for families and genera as suggested by Wilson and Reeder (2005). The selected cut-off ages returned the closest possible number of clades, with 10 more genera (1198) and 2 more families (150). Mammalian standardized error scores again show that the taxonomy is most consistent at the order level and least consistent at the genus level (Fig. 2). Within hierarchical taxonomic ranks, the mammal taxonomy was more consistent than the bird taxonomy (Fig. 2). Overall, 533 of the 1365 original mammal clades were completely consistent with a clade in the temporal banding scheme, with more than half of consistent clades being monotypic. The traditional mammal taxonomic clade age distributions again overlapped strikingly across taxonomic ranks (Fig. 3b) and clades again varied substantially in their amount of temporal error (Fig. 4d–f; Supplementary Material S2). The most extreme comparison was between the order Dasyuromorphia: carnivorous marsupials (clade age = 30.5 ma), and the genus Abrocoma (clade age = 45.3 ma), which is within the chinchilla rat family, Abrocomidae.

For both birds and mammals, families and genera showed a significant negative correlation between root node distances and clade error scores (birds: rs = −0.54 and −0.46; mammals: rs = −0.49 and −0.43, both P < 0.001; Fig. 3b,c,e,f). This result suggests that more basal taxonomic clades are more likely to be split under temporal banding and more distal clades are more likely to be lumped. Orders did not show any significant correlation for either birds or mammals (birds: rs = −0.22, P = 0.18; mammals: rs = −0.07, P = 0.7; Fig. 3a,d). Analytical code is available as Supplementary Material S4.

DISCUSSION
Quantitative analyses presented here demonstrate that the current bird taxonomy is more inconsistent than that of mammals within the three taxonomic ranks considered. Within hierarchical taxonomic ranks, mammal groups are consistently older than bird groups. For both birds and mammals, the amount of temporal inconsistency among clades decreases as taxonomic hierarchical ranks increase. Bird genera are particularly inconsistent and only slightly better than a randomly delineated taxonomy. Analysing only the
FIGURE 4. Phylogenetic bias of taxonomic consistencies for bird and mammal orders, families, and genera. Dashed lines represent the cut-off
ages required to split the phylogenies into the existing number of groups for each taxonomic rank. If a clade is split too recently (i.e., “Should
be lumped”) the phylogenetic lineage is shaded red. If a taxonomic clade is split too long ago (i.e., “Should be split”) the phylogenetic lineage
is shaded blue. In both cases the intensity of color reflects the temporal distance from the dashed cut-off age. Lineages that are consistent with
the temporal banding approach are shaded gray. For birds and mammals, a significant negative correlation exists between the root node age of
clades and clade temporal error for families and genera (birds rs = −0.54 and −0.46; mammals: rs = −0.49 and −0.43, both P < 0.001). For orders
no significant correlation exists between these variables (birds: rs = −0.22, P = 0.18; mammals: rs = −0.07, P = 0.7).
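The lumped/split shading and the rank correlation described in this caption can be illustrated with a short Python sketch. Several choices here are assumptions rather than details from the text: a clade is treated as "should be lumped" when its stem age (the age at which its lineage originates) is younger than the cut-off, the temporal error is given a sign (positive for split, negative for lumped), Spearman's coefficient is computed in pure Python rather than with the R routines used in the study, and all clade values in the example are invented.

```python
def temporal_error(stem_age, clade_age, cutoff):
    """Signed distance (Ma) from the cut-off age, with a label.

    Assumed convention: positive = clade older than the cut-off ("should be
    split"); negative = lineage originating after the cut-off ("should be
    lumped"); zero = consistent with temporal banding.
    `clade_age` may be None for monotypic taxa.
    """
    if clade_age is not None and clade_age > cutoff:
        return clade_age - cutoff, "should be split"
    if stem_age < cutoff:
        return stem_age - cutoff, "should be lumped"
    return 0.0, "consistent"


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Invented clades: (root node distance, stem age, clade age), all ages in Ma.
clades = [(2, 60.0, 52.0), (5, 45.0, 30.0), (9, 30.0, 12.0), (14, 18.0, None)]
cutoff = 37.5  # bird family cut-off age quoted in the Results
errors = [temporal_error(stem, crown, cutoff)[0] for _, stem, crown in clades]
distances = [d for d, _, _ in clades]
print(errors)
print(spearman_rho(distances, errors))
```

A negative coefficient in such a calculation corresponds to the pattern reported for families and genera: clades near the root tend to be older than the cut-off, and clades near the tips tend to be younger.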
part of the bird phylogeny for which molecular data exist (6670 species) provides congruent results; this suggests that the temporal banding approach may even be applied to incomplete phylogenies and ultimately to the whole tree of life. Across current families and genera, basal taxonomic groups are more likely to be split under a temporal banding approach, with significant negative correlations between phylogenetic position and temporal error within these taxonomic ranks, for both study groups.

Our results have important implications for bird and mammal studies based on higher taxonomic ranks and, since our results are consistent across the two groups, there is also the potential that similar studies based on other groups could be influenced in a similar manner.

families were lumped. For birds, 12 of the most distal third of families (65 families) are split under temporal banding but many more of these families are lumped (47). The most extreme case of lumping for birds (in terms of number of old families lumped) is represented by the existing superfamily Passeroidea, which consists of 15 families, 14 of which are lumped into one single family by temporal banding, with the remaining family being the single-species family Urocynchramidae. In the corresponding example of extreme lumping under mammalian temporal banding, six current families are lumped into a single family that is currently recognized as the suborder Feliformia.

Taken together, it appears that the coarse delimitations of birds and mammals into orders have been carried
taxonomies. It is entirely feasible that alternative cut-off ages may be preferred, perhaps relying on other characters such as morphology and anatomy, to fine-tune the temporal cut-off ages. Unfortunately, it is only too easy to envision the levels of disagreement among expert taxonomists on such fine-tuned cut-off ages and, given the arbitrary nature of these ranks, the approach used for our analysis is more practical, consistent, and objective. Alternatively, existing or future objective analytical methods may offer an approach that returns nonarbitrary cut-off ages (Humphreys and Barraclough 2014).

The Linnean hierarchical taxonomic system has proven resilient and continues to have a strong influence on modern taxonomy. Despite the increasing availability

decisions can be made on a systematic, consistent basis. The temporal banding approach itself is not completely free from practical challenges. Without the possibility of molecular data, fossils must still be assigned to lineages through their morphological characteristics. Following this process, however, fossils could be assigned to higher groups defined by temporal banding ranks in much the same manner as they are currently.

Our study demonstrates that the temporal banding approach can produce consistently defined taxonomic groups. These groups therefore represent more meaningful units of comparison when discussing other aspects of interest, such as phenotypic divergence. Temporal banding hierarchies also provide more consistent units of study for formal analysis, however,
Avise J.C., Johns G.C. 1999. Proposal for a standardized temporal scheme of biological classification for extant species. Proc. Natl Acad. Sci. U. S. A. 96:7358–7363.
Avise J.C., Liu J.-X. 2011. On the temporal inconsistencies of Linnean taxonomic ranks. Biol. J. Linn. Soc. 102:707–714.
Bininda-Emonds O.R.P., Cardillo M., Jones K.E., MacPhee R.D.E., Beck R.M.D., Grenyer R., Price S.A., Vos R.A., Gittleman J.L., Purvis A. 2007. The delayed rise of present-day mammals. Nature 446:507–512.
Condamine F.L., Sperling F.A.H., Wahlberg N., Rasplus J.-Y., Kergoat G.J. 2012. What causes latitudinal gradients in species diversity? Evolutionary processes and ecological constraints on swallowtail biodiversity. Ecol. Lett. 15:267–277.
Coyne J.A., Orr H.A. 2004. Speciation. Sunderland (MA): Sinauer Associates, Incorporated Publishers.
Darwin C. 1859. On the origin of the species by natural selection. London, UK: Murray.
De Queiroz K. 1997. The Linnaean hierarchy and the evolutionization
Holt B.G., Lessard J.-P., Borregaard M.K., Fritz S.A., Araújo M.B., Dimitrov D., Fabre P.-H., Graham C.H., Graves G.R., Jønsson K.A., Nogués-Bravo D., Wang Z., Whittaker R.J., Fjeldså J., Rahbek C. 2013. An update of Wallace’s zoogeographic regions of the world. Science 339:74–78.
Humphreys A.M., Barraclough T.G. 2014. The evolutionary reality of higher taxa in mammals. Proc. Biol. Sci. 281:20132750.
Jetz W., Thomas G.H., Joy J.B., Hartmann K., Mooers A.O. 2012. The global diversity of birds in space and time. Nature 491:444–448.
Jønsson K.A., Bowie R.C.K., Moyle R.G., Christidis L., Norman J.A., Benz B.W., Fjeldså J. 2010. Historical biogeography of an Indo-Pacific passerine bird family (Pachycephalidae): different colonization patterns in the Indonesian and Melanesian archipelagos. J. Biogeogr. 37:245–257.
Kembel S.W., Cowan P.D., Helmus M.R., Cornwell W.K., Morlon H., Ackerly D.D., Blomberg S.P., Webb C.O. 2010. Picante: R tools for integrating phylogenies and ecology. Bioinformatics 26:1463–1464.