ITS as an environmental DNA barcode for fungi: an in silico approach reveals potential PCR biases


Bellemain et al.

BMC Microbiology 2010, 10:189


http://www.biomedcentral.com/1471-2180/10/189

RESEARCH ARTICLE Open Access


Eva Bellemain*1, Tor Carlsen2, Christian Brochmann1, Eric Coissac3, Pierre Taberlet3 and Håvard Kauserud2

Abstract
Background: During the last 15 years the internal transcribed spacer (ITS) of nuclear DNA has been used as a target for
analyzing fungal diversity in environmental samples, and has recently been selected as the standard marker for fungal
DNA barcoding. In this study we explored the potential amplification biases that various commonly utilized ITS primers
might introduce during amplification of different parts of the ITS region in samples containing mixed templates
('environmental barcoding'). We performed in silico PCR analyses with commonly used primer combinations using
various ITS datasets obtained from public databases as templates.
Results: Some of the ITS primers, such as ITS1-F, were hampered by a high proportion of mismatches relative to the
target sequences, and most of them appeared to introduce taxonomic biases during PCR. Some primers, e.g. ITS1-F,
ITS1 and ITS5, were biased towards amplification of basidiomycetes, whereas others, e.g. ITS2, ITS3 and ITS4, were
biased towards ascomycetes. The assumed basidiomycete-specific primer ITS4-B only amplified a minor proportion of
basidiomycete ITS sequences, even under relaxed PCR conditions. Due to systematic length differences in the ITS2
region as well as the entire ITS, we found that ascomycetes will more easily amplify than basidiomycetes using these
regions as targets. This bias can be avoided by using primers amplifying ITS1 only, but this would imply preferential
amplification of 'non-dikarya' fungi.
Conclusions: We conclude that ITS primers have to be selected carefully, especially when used for high-throughput
sequencing of environmental samples. We suggest that different primer combinations or different parts of the ITS
region should be analyzed in parallel, or that alternative ITS primers should be searched for.

* Correspondence: eva.bellemain@nhm.uio.no
1National Centre for Biosystematics, Natural History Museum, University of Oslo, P.O. Box 1172 Blindern, NO-0318 Oslo, Norway
Full list of author information is available at the end of the article
© 2010 Bellemain et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons
Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in
any medium, provided the original work is properly cited.

Background
Molecular identification through DNA barcoding of fungi has, during the last 15-20 years, become an integrated and essential part of fungal ecology research and has provided new insights into the diversity and ecology of many different groups of fungi (reviewed by [1-4]). Molecular identification has made it possible to study the ecology of fungi in their dominant but inconspicuous mycelial stage and not only by means of fruiting bodies. Interest in sequence-based analysis of environmental samples ('environmental barcoding') has increased in the past decade as it allows abundance and species richness of fungi to be studied at a high rate and more reliably than conventional biotic surveys (e.g. [5-10]). The internal transcribed spacer (ITS) of nuclear DNA (nrDNA) is the preferred DNA barcoding marker both for the identification of single taxa and mixed environmental templates ('environmental DNA barcoding'). It has recently been proposed as the official primary barcoding marker for fungi (Deliberation of 37 mycologists from 12 countries at the Smithsonian's Conservation and Research Centre, Front Royal, Virginia, May 2007). More than 100 000 fungal ITS sequences generated by conventional Sanger sequencing are deposited in the International Nucleotide Sequence Databases and/or other databases [11], providing a large reference material for identification of fungal taxa. However, these data are to some extent hampered by misidentifications or technical errors such as mixing of DNA templates or sequencing errors [12]. Furthermore, a large amount of partial ITS sequences generated by next-generation sequencing has recently been deposited in public sequence databases.
The ITS region includes the ITS1 and ITS2 regions, separated by the 5.8S gene, and is situated between the

18S (SSU) and 28S (LSU) genes in the nrDNA repeat unit (Figure 1). The large number of ITS copies per cell (up to 250; [13]) makes the region an appealing target for sequencing environmental substrates where the quantity of DNA present is low. The entire ITS region has commonly been targeted with traditional Sanger sequencing approaches and typically ranges between 450 and 700 bp. Either the ITS1 or the ITS2 region has been targeted in recent high-throughput sequencing studies [14-17], because the entire ITS region is still too long for 454 sequencing or other high-throughput sequencing methods. Using high-throughput sequencing, thousands of sequences can be analysed from a single environmental sample, enabling in-depth analysis of the fungal diversity.
Various primers are used for amplifying the entire or parts of the ITS region (Figure 1). The most commonly used primers were published early in the 1990s (e.g. [18,19]), when only a small fraction of the molecular variation in the nrDNA repeat across the fungal kingdom was known. Several other ITS primers have been published more recently [20] but have not been used extensively compared to the earlier published primers. However, little is actually known about the potential biases that commonly used ITS primers introduce during PCR amplification. Especially during high-throughput sequencing, where quantification (or semi-quantification) of species abundances is also possible to a certain degree (although hampered by factors like copy-number variation), primer mismatches might potentially introduce large biases in the results because some taxonomic groups are favoured during PCR. Our main focus in this study is on the two dominating taxonomic groups of fungi in the Dikarya, Ascomycota and Basidiomycota. Ascomycota represents the largest phylum of Fungi, with over 64,000 species, while Basidiomycota contains about 30,000 described species [21]. In total those two groups represent 79% of the described species of true Fungi.
The aim of this study was to analyse the biases commonly used ITS primers might introduce during PCR amplification. First, we addressed to what degree the various primers mismatch with the target sequence and whether the mismatches are more widespread in some taxonomic groups. Second, we considered the length variation in the amplified products, in relation to taxonomic group, to assess amplification biases during real (in vitro) PCR amplification, as shorter DNA fragments are preferentially amplified from environmental samples containing DNA from a mixture of different species [22]. Finally, we analyzed to what degree the various primers co-amplify plants, which often co-occur in environmental samples. For these purposes we performed in silico PCR using various primer combinations on target sequences retrieved from EMBL databases as well as subset databases using the bioinformatic tool EcoPCR [23]. In order to better simulate real PCR conditions, we allowed a maximum of 0 to 3 mismatches except for the 2 last bases of each primer and we assessed the melting temperature (Tm) for each primer in relation to primer mismatches.

Methods
Compilation of datasets
The EcoPCR package contains a set of bioinformatics tools developed at the Laboratoire d'Ecologie Alpine, Grenoble, France ([23], freely available at http://www.grenoble.prabi.fr/trac/ecoPCR). The package is composed of four pieces of software, namely 'ecoPCRFormat', 'ecoFind', 'ecoPCR' and 'ecoGrep'. Briefly, EcoPCR is based on the pattern matching algorithm agrep [24] and selects sequences from a database that match (exhibit similarity to) two PCR primers. The user can specify (1) which database the given primers should be tested against, and (2) the primer sequences. Different options allow specification of the minimum and maximum amplification length, the maximum count of mismatched positions between each primer and the target sequence (excluding the two bases on the 3' end of each primer), and restriction of the search to given taxonomic groups. The ecoPCR output contains, for each target sequence, amplification length, melting temperature (Tm), taxonomic information as well as the number of mismatched positions for each strand.

Figure 1 Commonly used primers for amplifying parts or the entirety of the ITS region. a) Relative position of the primers, design of the subsets and number of sequences in each subset. b) Primer sequences, references and position of the primer sequence according to a reference sequence of Serpula himantioides (AM946630) stretching the entire nrDNA repeat.

First, we retrieved from EMBL sequences from fungi in the following categories: 'standard', 'Genome sequence scan', 'High Throughput Genome sequencing', 'Whole Genome Sequence' from ftp://ftp.ebi.ac.uk/pub/databases/embl/release/ (release embl_102, January 2010) to create our initial database. It corresponds to 1,212,954 sequences including approximately 79,500 ITS sequences (estimated from EMBL SRS website requesting for fungi sequences annotated with 'ITS' or 'Internal Transcribed Spacer'). These ITS entries refer to more than 10,800 taxa. This database, hereafter referred to as the "fungi database", was compiled using EcoPCRFormat.
To assess the specificity of the primers to fungi, we used the plant database from EMBL (release embl_102, January 2010 from ftp://ftp.ebi.ac.uk/pub/databases/embl/release/) to run amplifications using the same primers as for fungi. This database, hereafter referred to as the "plant database", contained 1,253,565 sequences, including approximately 65,000 ITS sequences (estimated from EMBL SRS website requesting for viridiplantae sequences annotated with 'ITS' or 'Internal Transcribed Spacer'). These ITS entries refer to more than 6,100 taxa. This database was also compiled using EcoPCRFormat.
As there are relatively few sequences submitted to public databases covering the entire ITS region as well as the commonly used universal primer sites in the flanking SSU and LSU regions, we created three subset datasets covering either ITS1, ITS2 or the entire ITS region. From the initial fungi database, we compiled three subset databases (hereafter referred to as subset 1, 2, and 3) by in silico amplification (see below) of target sequences using the following primer pairs: NS7-ITS2 (dataset 1, focused on ITS1 region), ITS5-ITS4 (dataset 2, including both ITS1 and ITS2 regions) and ITS3-LR3 (dataset 3, focused on ITS2 region). To simulate relatively stringent PCR conditions, a single mismatch between each primer and the template was allowed except in the 2 bases of the 3' primer end. These three subsets were then compiled using EcoPCRFormat and included 1291, 5924 and 2459 partial nrDNA sequences, respectively.

In silico amplification and primer specificity to fungi
Using EcoPCR, we ran in silico amplifications from both the fungi and the plant databases using various commonly used primer combinations, to assess the number of amplifications and the specificity of the primers to fungi. For each amplification, we allowed from 0 to 3 mismatches between each primer and the template (excluding mismatches in the 2 bases of the 3' primer end) in order to simulate different stringency conditions of PCRs. Secondly, from the three subsets, we amplified sequences using different internal primer combinations in order to evaluate the various primers (Figure 1). From dataset 1 we used the primer combinations ITS1-F-ITS2, ITS5-ITS2 and ITS1-ITS2. From dataset 2 we used the combinations ITS1-ITS4 (amplifying both ITS1 and ITS2 introns), ITS3-ITS4 and ITS5-ITS2. From dataset 3 we used the combinations ITS3-ITS4 and ITS3-ITS4-B. During these virtual PCRs we also allowed from 0 to 3 mismatches between each primer and the template, except in the 2 bases of the 3' primer end.

Assessing the degree of primer mismatches and Tm
For all in silico internal amplifications from each subset, we assessed the proportion of sequences retrieved when allowing for 0 to 3 mismatches between each primer and the template. For the amplifications from each subset, we used an external primer (one of the primers used to create the subset) and an internal primer. Therefore, for each analysis, we assessed the proportion of sequences including mismatches for the internal primer only. The primer pair ITS5-ITS2 was evaluated both for subset 1 and subset 2, with the focus on ITS5 for subset 1 and on ITS2 for subset 2 (as those primers correspond to internal primers within their respective subsets). Similarly, the primer pair ITS3-ITS4 was evaluated both for subsets 2 and 3, with the focus on ITS3 in subset 2 and ITS4 in subset 3. The primer ITS1 was evaluated both for subset 1 (with the combination ITS1-ITS2) and for subset 2 (with the combination ITS1-ITS4) as ITS2 and ITS4 were used as external primers in subsets 1 and 2, respectively.
To assess whether certain taxonomic groups were more prone to mismatches, we assessed the proportion of sequences including one mismatch for each of the three taxonomic groups 'ascomycetes', 'basidiomycetes' and 'non-dikarya' (the latter is a highly polyphyletic group including e.g. Blastocladiomycota, Chytridiomycota, Glomeromycota and Zygomycota [25]). We also assessed the Tm for each primer based on the analyses from internal amplifications, allowing a single mismatch. The Tm is defined as the temperature at which half of the DNA strands are in the double-helical state and half are in the "random-coil" state. The strength of hybridization between the primers and the template affects Tm. It is therefore informative to assess how Tm decreases as the number of mismatches increases, i.e. with less stringent PCR conditions. Tm was calculated in ecoPCR based on a thermodynamic nearest neighbor model [26]. Exact computation was performed following [27].

Assessing bias in amplification length relative to taxonomic group
To further assess the taxonomic bias introduced by the use of the different primer pairs, we separated the amplified sequences from selected analyses into the groups 'ascomycetes', 'basidiomycetes' and 'non-dikarya' based on their taxonomic identification number, using the ecoGrep tool. These selected analyses were (1) the three subsets,

and (2) all internal amplifications within each subset with one mismatch allowed. The amplification length was reported for each analysis.

Results
Relative amplification of different primer combinations from the fungi and plant databases
The number of fungal versus plant sequences amplified in silico with various ITS primer combinations directly from the raw data downloaded from EMBL (Table 1) mainly reflected the number of sequences deposited. However, the number of amplified sequences varied considerably with varying stringency conditions (in this context allowing zero to three mismatches) across different primer combinations (see Table 1 for details). Only a few plant ITS sequences were amplified using the fungus-specific primer ITS1-F (ranging from 20 to 24 sequences under different stringency conditions). Assessing these sequences using Blast, 20 out of 24 were revealed to be fungal sequences erroneously deposited as algae from an unpublished study (six Liagora species, two Caulerpa species, Helminthocladia australis, and Ganonema farinosum). There was a sequence deposited as Chlorella matching a fungal sequence. The three others were Chlorarachniophyte species that did not match any known fungal sequence. Some of the other primer combinations, including ITS1-ITS2, amplified a high number of plant sequences from different orders. We also confirmed that the assumed basidiomycete-specific primer ITS4-B did not amplify any plant sequences even when allowing 3 mismatches.

Primer mismatches in sequence subsets
The selected ITS primers showed large variation in their ability to amplify fungal sequences from the three subsets when allowing different numbers of mismatches (Figure 2). All primer pairs amplified at least 90% of the sequences when allowing two or three mismatches, with the exception of ITS4-B (see below). It is noteworthy that the percentages of sequences were quite similar for two and three mismatches, indicating that rather few sequences included three mismatches. Under strict conditions (i.e. allowing no mismatches), the proportion of amplified sequences varied considerably between primer pairs, ranging from 36% for ITS1-F to 81% for ITS5 (Figure 2).
Allowing one mismatch increased the proportion of amplified sequences from 36% to 91.6% for the commonly used primer ITS1-F, implying that more than half of the amplified sequences included one mismatch. ITS5 amplified the highest proportion of the sequences when allowing for a single mismatch (97.5%), and less than 10% of the sequences in each taxonomic group included one mismatch. The primer ITS1, on the other hand, only amplified 56.8% and 65.9% of the sequences from subsets one and two, respectively, when allowing no mismatches. Allowing three mismatches, ITS1 was still only able to amplify 92% of the sequences in subsets one and two. Allowing no mismatches, the complementary primers ITS2 and ITS3 amplified 79.4% and 77.3% of all sequences respectively, in subset 2. Allowing one mismatch, these numbers increased to 87.5 and 90%, respectively. Primer ITS4 amplified 74.9% of all sequences in

Table 1: Number of plant and fungi ITS sequences amplified in silico from EMBL fungal and plant databases, using the
various primer combinations and allowing none to three mismatches.

Primer comb.              Fungal ITS sequences            Plant ITS sequences
Number of mismatches *    0      1      2      3          0     1      2      3

ITS5-ITS4                 5482   5924   6026   6123       500   514    5667   5996
NS7-ITS2                  1067   1291   1313   1320       23    190    231    403
ITS3-LR3                  2070   2459   2499   2548       51    168    248    300
ITS1-ITS2                 17545  19816  25223  25457      1107  17665  18755  19084
ITS1-F-ITS2               2112   4169   4592   4658       20    21     21     24
ITS5-ITS2                 7713   8993   9180   9293       94    703    11123  12100
ITS1-ITS4                 10013  10610  12488  12656      5783  6740   7500   7620
ITS3-ITS4                 18815  21195  21663  22078      415   7829   8583   8852
ITS3-ITS4-B               1269   1673   1811   1863       0     0      0      0
* The number of mismatches allowed between the primer and the DNA strand reflects the stringency level of the PCR, i.e. strict PCR conditions
such as annealing temperature close to or above the recommended Tm will not allow unspecific sequences (including one or more mismatches)
to be amplified.
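
The mismatch criterion used throughout (at most 0 to 3 mismatched positions per primer, none of them in the two bases at the 3' end) can be illustrated with a minimal sketch. This is not the ecoPCR implementation, which according to the Methods builds on agrep; it is a plain sliding-window comparison that assumes same-strand matching only and no IUPAC ambiguity codes. The function names `primer_matches` and `find_sites` and all sequences are hypothetical and invented for illustration.

```python
def primer_matches(primer, site, max_mismatches, protected_3prime=2):
    """Return True if `site` is an acceptable binding site for `primer`.

    The primer is written 5'->3', so its last `protected_3prime` bases
    are the 3' end, where no mismatch is tolerated.
    """
    if len(primer) != len(site):
        return False
    mismatched = [i for i, (p, s) in enumerate(zip(primer, site)) if p != s]
    first_protected = len(primer) - protected_3prime
    if any(i >= first_protected for i in mismatched):
        return False
    return len(mismatched) <= max_mismatches


def find_sites(primer, template, max_mismatches):
    """Start positions in `template` where the primer is allowed to anneal."""
    n = len(primer)
    return [
        start
        for start in range(len(template) - n + 1)
        if primer_matches(primer, template[start:start + n], max_mismatches)
    ]


# Toy sequences, invented for illustration only (not real ITS primers).
PRIMER = "ACGTACGT"
PERFECT = "TTACGTACGTTT"     # exact site starting at position 2
ONE_5PRIME = "TTTCGTACGTTT"  # single mismatch at the primer's 5' end
ONE_3PRIME = "TTACGTACGATT"  # single mismatch in the last 3' base

for label, template in [("perfect", PERFECT),
                        ("5' mismatch", ONE_5PRIME),
                        ("3' mismatch", ONE_3PRIME)]:
    hits = {k: find_sites(PRIMER, template, k) for k in range(4)}
    print(label, hits)
```

Under this rule a template with a single 5' mismatch is missed at zero mismatches but recovered as soon as one mismatch is allowed, whereas a mismatch in the protected 3' bases is rejected at every stringency level.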

subset 3 and this proportion only increased to 93.7% when allowing three mismatches. The assumed basidiomycete-specific primer ITS4-B amplified only 5.6% of the sequences in subset 3 under strict conditions (corresponding to 46% of the basidiomycetes sequences, see below) and up to 14.9% allowing 3 mismatches. However, about half of the sequences included a mismatch when a single mismatch was allowed.

Figure 2 Percentage of sequences amplified from each subset using different primer pairs allowing a maximum of 0, 1, 2, or 3 mismatches.

Taxonomic bias for different primers
The taxonomic composition in the three target sequence subsets (Figure 1) was compared with the taxonomic composition in the amplified datasets in order to reveal whether a taxonomic bias was introduced during the amplification process (Table 2). A single mismatch was allowed during these comparisons. The primers ITS1, ITS1-F and ITS5 amplified a notably higher proportion of basidiomycetes in subset 1. In contrast, primers ITS2, ITS3 and ITS4 (the two first being complementary) were biased towards ascomycetes when analysing subsets 2 and 3. The assumed basidiomycete-specific primer combination ITS3-ITS4-B only amplified 39.3% of the basidiomycete sequences. Primers ITS4 and ITS5 amplified the highest proportion of 'non-dikarya' sequences.
The number of mismatches allowed had a significant impact on the optimal annealing temperature to be used in the PCR reaction (Table 3). Optimal annealing temperatures decreased by approximately 6-8 degrees Celsius with each additional mismatch.

Taxonomic bias relative to length of the amplified region
We found considerable length variation among the amplified fragments both in the ITS1 and ITS2 regions, as well as in the entire ITS region (Figure 3). A taxonomic bias in relation to length was apparent but not consistent between the ITS regions. In the ITS1 region, the proportions of ascomycetes and basidiomycetes were quite similar across the size range (p = 0.2, two-tailed T-test), but 'non-dikarya' fungi had far more short fragments and differed significantly from the two other groups (p < 0.01 and p < 0.01, two-tailed T-tests). In contrast, in the ITS2 region, the proportions of ascomycetes and basidiomycetes were highly skewed across the size range, with basidiomycetes having significantly longer ITS2 fragments than ascomycetes (p < 0.01, two-tailed T-test; on average 95.2 bp longer fragments). Also for the entire ITS region (primer pair ITS1-ITS4), basidiomycetes had significantly longer fragments than ascomycetes (p < 0.01, two-tailed T-test), with average lengths of 634.9 versus 551.0 bp, respectively. The 'non-dikarya' fungi had significantly shorter ITS fragments than the basidiomycetes (p < 0.01, T-test), but did not differ significantly from the ascomycetes (p = 0.34, two-tailed T-test).

Table 2: Percentage of sequences amplified in silico, allowing one mismatch, from ascomycetes, basidiomycetes and 'non-
Dikarya' with different primer combinations and using the three sequence subsets 1-3 (see Material and Methods) as
templates.

Data subsets   Primer comb.    Ascomycetes   Basidiomycetes   'non-Dikarya'

Subset 1       ITS1*-ITS2      61.21         86.21            88.57
               ITS1-F*-ITS2    90.75         99.14            92.38
               ITS5*-ITS2      90.84         99.14            98.10
Subset 2       ITS1*-ITS4      61.91         82.00            84.86
               ITS3*-ITS4      98.39         73.91            91.04
               ITS5-ITS2*      94.89         72.10            92.63
Subset 3       ITS3-ITS4*      94.71         85.55            98.49
               ITS3-ITS4-B*    -             39.31            -
* primer evaluated for mismatches within each pair
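
One simple way to read the direction of the taxonomic bias off Table 2 is to subtract the ascomycete percentage from the basidiomycete percentage for each evaluated primer. The sketch below does only that; the difference is an illustrative summary chosen here, not a statistic used by the authors, and the names `TABLE2`, `dikarya_bias` and `favoured_group` are hypothetical. ITS4-B is left out because Table 2 gives no ascomycete value for it.

```python
# Percentages copied from Table 2 (one mismatch allowed).
# Key: (subset, evaluated primer); value: (ascomycetes, basidiomycetes, non-Dikarya)
TABLE2 = {
    ("subset 1", "ITS1"): (61.21, 86.21, 88.57),
    ("subset 1", "ITS1-F"): (90.75, 99.14, 92.38),
    ("subset 1", "ITS5"): (90.84, 99.14, 98.10),
    ("subset 2", "ITS1"): (61.91, 82.00, 84.86),
    ("subset 2", "ITS3"): (98.39, 73.91, 91.04),
    ("subset 2", "ITS2"): (94.89, 72.10, 92.63),
    ("subset 3", "ITS4"): (94.71, 85.55, 98.49),
}


def dikarya_bias(asco_pct, basidio_pct):
    """Percentage-point difference; positive values favour basidiomycetes."""
    return basidio_pct - asco_pct


def favoured_group(asco_pct, basidio_pct):
    """Name the Dikarya group with the higher amplified percentage."""
    return "basidiomycetes" if dikarya_bias(asco_pct, basidio_pct) > 0 else "ascomycetes"


for (subset, primer), (asco, basidio, _non_dikarya) in TABLE2.items():
    diff = dikarya_bias(asco, basidio)
    print(f"{subset:9s} {primer:7s} {diff:+6.2f} points -> {favoured_group(asco, basidio)}")
```

With the Table 2 values, ITS1, ITS1-F and ITS5 come out on the basidiomycete side and ITS2, ITS3 and ITS4 on the ascomycete side, which is the pattern described in the Results.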

Table 3: Melting temperature (Tm) of each primer according to the number of mismatches allowed between the primer
and the target sequence.

Primer        Number of mismatches allowed
              0       1*             2*             3*

ITS1(1) **    58.64   51.75+/-2.88   46.51+/-0.6    41.4+/-NA
ITS1(2) **    58.64   52.02+/-2.58   46.46+/-0.87   39.49+/-2.75
ITS1-F        51.04   42.31+/-1.2    38.91+/-2.62   31.64+/-0.67
ITS2          56.68   48.5+/-1.97    39.3+/-2.74    32.99+/-5.67
ITS5          51.64   41.8+/-1.69    36.6+/-3.93    NA
ITS3          56.68   50.6+/-1.15    44.3+/-3.65    39.93+/-7.25
ITS4          50.9    45.04+/-1.3    35.94+/-3.38   32.73+/-1.83
ITS4-B        59.33   54.49+/-2.39   46.6+/-3.06    37.72+/-7.38
* Mean Tm +/- SD is given for primers with 1 or more mismatches as the Tm depends on the type of mismatch.
** ITS1 is evaluated both with the first subset (1) and the second subset (2).
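
The decrease in Tm per additional mismatch can be estimated directly from the mean values in Table 3. The sketch below averages the drop between the zero-mismatch Tm and the last available value for each primer. It ignores the standard deviations, and the names `TM` and `mean_drop_per_mismatch` are hypothetical, so it is a rough illustration rather than the authors' calculation.

```python
# Mean Tm values (degrees Celsius) copied from Table 3, for 0, 1, 2 and 3
# mismatches. None marks the value reported as NA.
TM = {
    "ITS1 (subset 1)": [58.64, 51.75, 46.51, 41.40],
    "ITS1 (subset 2)": [58.64, 52.02, 46.46, 39.49],
    "ITS1-F": [51.04, 42.31, 38.91, 31.64],
    "ITS2": [56.68, 48.50, 39.30, 32.99],
    "ITS5": [51.64, 41.80, 36.60, None],
    "ITS3": [56.68, 50.60, 44.30, 39.93],
    "ITS4": [50.90, 45.04, 35.94, 32.73],
    "ITS4-B": [59.33, 54.49, 46.60, 37.72],
}


def mean_drop_per_mismatch(tms):
    """Average Tm decrease per extra mismatch, using the first and last known values."""
    known = [i for i, t in enumerate(tms) if t is not None]
    first, last = known[0], known[-1]
    return (tms[first] - tms[last]) / (last - first)


for primer, tms in TM.items():
    print(f"{primer:16s} {mean_drop_per_mismatch(tms):.2f} degrees per mismatch")
```

For ITS5 the calculation stops at two mismatches, since no value is reported for three.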

Discussion
Although the ITS region has been widely used as a genetic marker during the last 15 years for exploring fungal diversity in environmental samples (e.g. [7,8,10,28]), little effort has been invested to explore the potential biases that the most commonly used ITS primers may introduce during PCR. In this study we have documented how the most commonly used fungal ITS primers are hampered by different types of biases (length bias, taxonomic bias and primer mismatch bias). Hence, in environmental sequencing studies aiming at describing fungal diversity and community composition these primers should be used with caution. Our analyses were based on entries in the public sequence databases (GenBank, EMBL and DDBJ). A general but naive assumption in studies based on this type of data is that the sequences are reliable from a technical aspect and that the sequenced samples have been correctly identified taxonomically. However, these two assumptions are often violated. Given that the quality control of the raw data typically depends solely on the scientist depositing the sequences, a proportion of published sequences admittedly contains errors [29]. In addition, Nilsson et al. [12] showed that about 20% of the fungal DNA sequences from the public sequence databases may be identified to incorrect species, and that the majority of entries lack descriptive and up-to-date annotations. However, our analyses deal with taxonomic groups at the sub-kingdom/phylum level (basidiomycetes, ascomycetes and 'non-dikarya fungi') and it is unlikely that those classes suffer significantly from incorrect identifications (e.g. that ascomycetes have been accessioned as basidiomycetes). The fact that no ascomycete sequences were amplified using primer ITS4-B, even when allowing 3 mismatches (Table 1), also supports the reliability of the conclusions in this respect. All the investigated primers were hampered by some mismatches relative to the target sequences in subsets 1-3, and they also varied in their specificity to fungi versus plants. It is noteworthy that ITS1-F, which is frequently used in fungal environmental sequencing studies and assumed to be fungal specific [18], only amplified three plant sequences after removing the fungal sequences erroneously deposited as plants. Those three sequences deposited as plants most probably corresponded to errors as well. However, the ITS1-F primer is hampered by a high degree of mismatches. Our analysis indicates that it may be important to use this primer under relaxed PCR conditions when targeting all fungi in an environmental sample. We confirmed that the primer ITS4-B, which has also often been used in environmental sequencing studies (e.g. [8,28,30,31]), is very specific to basidiomycetes, as it did not amplify plant ITS even under relaxed PCR conditions. However, this primer is only able to target a small proportion of the basidiomycete diversity (Table 4). Mainly Boletales and a fraction of the Agaricales are amplified under strict conditions, while under relaxed conditions, Cantharellales, Hymenochaetales, Tremellomycetes, Polyporales and Russulales are amplified to a certain degree (from 28 to 94% depending on the group). Pucciniomycotina and Ustilaginomycotina are not amplified at all. Hence, our in silico analyses indicate that ITS4-B should be used with great caution or perhaps abandoned completely in environmental sequencing studies where the aim is to characterize the diversity of all basidiomycetes. Although not specific to fungi, the primer pairs ITS5-ITS2 and ITS3-ITS4 apparently have a better ability to amplify fungal ITS as the proportion of sequences amplified does not vary much between strict and relaxed

PCR conditions. Overall, the results indicate that it is important to assess the specificity of the amplification in relation to PCR stringency before interpreting the results from environmental samples in terms of abundance and diversity.
Our in silico analyses further indicate that most of the primers will introduce a taxonomic bias due to higher levels of mismatches in certain taxonomic groups. When allowing one mismatch (corresponding to rather stringent PCR conditions) we found that the primers ITS1-F, ITS1 and ITS5 preferentially amplified basidiomycetes whereas the primers ITS2, ITS3 and ITS4 preferentially amplified ascomycetes. This type of bias must also be considered before selecting primer pairs for a given study. Also in molecular surveys of protistan and prokaryotic diversity, it has been documented that different 16S primers target different parts of the diversity [32-34].
In addition, our results clearly demonstrate that basidiomycetes, on average, have significantly longer amplicon sequences than ascomycetes both for the whole ITS region and the ITS2 region. This fact probably also introduces taxonomic bias during PCR amplification of environmental samples, since shorter fragments are more readily amplified compared to longer ones. In several studies, it has been demonstrated that a greater proportion of the diversity can be detected with short target sequences compared to longer ones [35,36]. Hence, using the ITS2 region or the whole ITS region, a higher number of the ascomycetes will probably be targeted compared to basidiomycetes. This bias could be avoided by using primers amplifying ITS1 only, but this would imply a preferential amplification of the 'non-dikarya' fungi.

Figure 3 Box plots illustrating length differences between the amplicons obtained using different primer combinations for each of the three subsets. The plot in each subset represents the primer pair used to create the subset (*).

Conclusion
The in silico method used here allowed for the assessment of different parameters for commonly used ITS primers, including the length of the amplicons generated, taxonomic biases, and the consequences of primer mismatches. The results provide novel insights into the relative performance of commonly used ITS primer pairs. Our analyses suggest that studies using these ITS primers to retrieve the entire fungal diversity from environmental samples including mixed templates should use lower annealing temperatures than the recommended Tm to allow for primer mismatches. A high Tm has been used in most studies, which likely biases the inferred taxonomic composition and diversity. However, one has to find a balance between allowing some mismatches and avoiding non-specific binding in other genomic regions, which can also be a problem.
Considering the different types of biases (specificity to fungi; mismatches; length; taxonomy), we suggest that different primer combinations targeting different parts of the ITS region should be analyzed in parallel. When dealing with single culture isolates compared to environmental samples, the choice of a primer pair to amplify ITS is less problematic because there is no 'competition' between DNA fragments of different taxonomic groups/lengths, and the DNA quality is generally higher.
This study also illustrates potential benefits of using a bioinformatics approach before selecting primer pairs for

Table 4: Number (percentage) of sequences present and amplified in each of the most common basidiomycete groups, from the
original subset 3 and from the amplification with ITS3-ITS4-B from subset 3, allowing no or 3 mismatches.

                      subset3   ITS3-4B_3mis   ITS3-4B_0mis

Agaricales            361       269 (74.5)     118 (32.7)
Boletales             18        17 (94.4)      15 (83.3)
Cantharellales        33        31 (93.9)      0
Hymenochaetales       10        7 (70)         0
Polyporales           28        8 (28.6)       0
Russulales            97        64 (66.0)      0
Thelephorales         6         4 (66.7)       0
Dacrymycetes          1         0              0
Tremellomycetes       38        13 (34.2)      0
Pucciniomycotina      8         0              0
Ustilaginomycotina    21        0              0
Other categories *    71        21 (29.6)      3 (4.2)
* 'Other categories' represent smaller orders including Agaricomycetidae.
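
The counts in Table 4 come from in silico PCR, in which a sequence counts as amplified when both primers match it within a set number of mismatches. As a rough illustration of that idea (not a reimplementation of ecoPCR, whose matching algorithm is not described here), the following Python sketch counts plain substitutions over made-up toy primers and templates. The function names and all sequences are hypothetical.

```python
# Toy in silico PCR: a template "amplifies" if both primer sites are found
# within a maximum number of substitutions. All sequences are invented.

def mismatches(primer, site):
    """Count positions where primer and site differ (Hamming distance)."""
    return sum(1 for p, s in zip(primer, site) if p != s)


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_site(template, primer, max_mismatches):
    """Start of the first window within max_mismatches of primer, else None."""
    n = len(primer)
    for start in range(len(template) - n + 1):
        if mismatches(primer, template[start:start + n]) <= max_mismatches:
            return start
    return None


def amplicon_length(template, forward, reverse, max_mismatches):
    """Predicted amplicon length (primers included), or None if no product."""
    fwd_start = find_site(template, forward, max_mismatches)
    if fwd_start is None:
        return None
    # The reverse primer anneals to the opposite strand, so on this strand
    # we look for its reverse complement downstream of the forward primer.
    rev_site = reverse_complement(reverse)
    offset = fwd_start + len(forward)
    rev_start = find_site(template[offset:], rev_site, max_mismatches)
    if rev_start is None:
        return None
    return offset + rev_start + len(rev_site) - fwd_start


def percent_amplified(templates, forward, reverse, max_mismatches):
    """Percentage of templates that yield a product at this stringency."""
    hits = sum(
        amplicon_length(t, forward, reverse, max_mismatches) is not None
        for t in templates
    )
    return 100.0 * hits / len(templates)


# Hypothetical 6-base primers and three toy templates.
FORWARD = "ACGTAC"
REVERSE = "TTGCAG"  # its reverse complement is CTGCAA
PERFECT = "GG" + "ACGTAC" + "TTTTT" + "CTGCAA" + "GG"
ONE_MISMATCH = "GG" + "ACGAAC" + "TTTTTTTT" + "CTGCAA" + "GG"
NO_SITE = "G" * 20
TEMPLATES = [PERFECT, ONE_MISMATCH, NO_SITE]

if __name__ == "__main__":
    for k in (0, 1):
        pct = percent_amplified(TEMPLATES, FORWARD, REVERSE, k)
        print(f"max mismatches = {k}: {pct:.1f}% amplified")
```

Relaxing the threshold from 0 to 1 mismatch rescues the second toy template, which mirrors how the ITS3-ITS4-B yield in Table 4 rises when 3 mismatches are allowed. A realistic analysis would also need to handle degenerate bases, indels and the position of each mismatch, all of which this sketch ignores.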

a given study. We nevertheless emphasize that an in silico analysis does not necessarily reflect the performance of the primers in vitro, since there are many other PCR parameters such as ITS copy number, amplification program, and salt and primer concentration in the PCR mix that cannot easily be simulated. This study should therefore be followed up by in vitro PCR analyses of the fungal ITS primers where biases are measured based on sequence output, although it will be a huge task to control and check for all types of biases that might be involved. We are currently performing further bioinformatics analyses using the tool 'ecoPrimer' (http://www.grenoble.prabi.fr/trac/ecoPrimers; Riaz et al. unpublished) to identify the most appropriate barcoding primers within the ITS region and other regions, with the intent of determining whether new ITS primers, such as those recently published by Martin and Rygiewicz [20], should replace the currently used ones.

Authors' contributions
EB, TC and HK conceived of the study and participated in its design and coordination. EB carried out the bioinformatics analysis and drafted the manuscript. EC designed the bioinformatic tool used in this study (ecoPCR). All authors helped to draft the manuscript and approved the final manuscript.

Acknowledgements
Eva Bellemain was funded by the Natural History Museum, University of Oslo, and this work was initiated as part of the BarFrost project (Barcoding of permafrost samples). We are thankful to four anonymous reviewers for constructive comments and to Marie Davey for helping to improve the style of written English.

Author Details
1National Centre for Biosystematics, Natural History Museum, University of Oslo, P.O. Box 1172 Blindern, NO-0318 Oslo, Norway, 2Microbial Evolution Research Group (MERG), Department of Biology, University of Oslo, P.O. Box 1066 Blindern, N-0316 Oslo, Norway and 3Laboratoire d'Ecologie Alpine (LECA), CNRS UMR 5553, University Joseph Fourier, BP 53, 38041 Grenoble Cedex 9, France

Received: 9 April 2010 Accepted: 9 July 2010 Published: 9 July 2010

This article is available from: http://www.biomedcentral.com/1471-2180/10/189
© 2010 Bellemain et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

References
1. Anderson I, Cairney J: Diversity and ecology of soil fungal communities: increased understanding through the application of molecular techniques. Environmental Microbiology 2004, 6(8):769-779.
2. Chase M, Fay M: Barcoding of plants and fungi. Science 2009, 325:682-683.
3. Horton T, Bruns T: The molecular revolution in ectomycorrhizal ecology: Peeking into the black box. Molecular Ecology 2001, 10:1855-1871.
4. Seifert K: Progress toward DNA barcoding of fungi. Molecular Ecology Resources 2009, 9(Suppl 1):83-89.
5. Freeman K, Martin A, Karki D, Lynch R, Mitter M, Meyer A, Longcore J, Simmons D, Schmidt S: Evidence that chytrids dominate fungal communities in high-elevation soils. Proceedings of the National Academy of Sciences USA 2009, 106(43):18315-18320.
6. Frohlich-Nowoisky J, Pickersgill D, Despres V, Poschl U: High diversity of fungi in air particulate matter. Proceedings of the National Academy of Sciences USA 2009, 106:12814-12819.
7. Lindahl B, Ihrmark K, Boberg J, Trumbore S, Högberg P, Stenlid J, Finlay R: Spatial separation of litter decomposition and mycorrhizal nitrogen uptake in a boreal forest. New Phytologist 2007, 173:611-620.
8. O'Brien H, Parrent J, Jackson J, Moncalvo J, Vilgalys R: Fungal community analysis by large-scale sequencing of environmental samples. Applied and Environmental Microbiology 2005, 71(9):5544-5550.
9. Pickles B, Genney D, Potts J, Lennon J, Anderson I, Alexander I: Spatial and temporal ecology of Scots pine ectomycorrhizas. New Phytologist 2010. doi:10.1111/j.1469-8137.2010.03204.x
10. Zinger L, Coissac E, Choler P, Geremia R: Assessment of microbial communities by graph partitioning in a study of soil fungi in two alpine meadows. Applied and Environmental Microbiology 2009, 75:5863-5870.
11. Nilsson R, Ryberg M, Abarenkov K, Sjökvist E, Kristiansson E: The ITS region as a target for characterization of fungal communities using emerging sequencing technologies. FEMS Microbiology Letters 2009, 296:97-101.
12. Nilsson R, Ryberg M, Kristiansson E, Abarenkov K, Larsson K, Koljalg U: Taxonomic reliability of DNA sequences in public sequence databases: a fungal perspective. PLoS One 2006, 1(1):e59.
13. Vilgalys R, Gonzalez D: Organisation of ribosomal DNA in the basidiomycete Thanatephorus praticola. Current Genetics 1990, 18:277-280.
14. Buée M, Reich M, Murat C, Morin E, Nilsson R, Uroz S, Martin F: 454 Pyrosequencing analyses of forest soils reveal an unexpectedly high fungal diversity. New Phytologist 2009, 184(2):449-456.
15. Ghannoum M, Jurevic R, Mukherjee P, Cui F, Sikaroodi M, Naqvi A, Gillevet P: Characterization of the Oral Fungal Microbiome (Mycobiome) in Healthy Individuals. PLoS Pathogens 2010, 6(1):e1000713.
16. Jumpponen A, Jones K: Massively parallel 454-sequencing of Quercus macrocarpa phyllosphere fungal communities indicates reduced richness and diversity in urban environments. New Phytologist 2009, 184:438-448.
17. Jumpponen A, Jones K, Mattox J, Yeage C: Massively parallel 454-sequencing of Quercus spp. ectomycorrhizosphere indicates differences in fungal community composition, richness, and diversity among urban and rural environments. Molecular Ecology 2010, in press.
18. Gardes M, Bruns T: ITS primers with enhanced specificity for basidiomycetes - application to the identification of mycorrhizae and rusts. Molecular Ecology 1993, 2(2):113-118.
19. White T, Bruns T, Lee S, Taylor J: Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In PCR Protocols: A Guide to Methods and Applications. Edited by: Innis MA, Gelfand DH, Sninsky JJ, White TJ. San Diego: Academic Press; 1990:315-322.
20. Martin K, Rygiewicz P: Fungal-specific primers developed for analysis of the ITS region of environmental DNA extracts. BMC Microbiology 2005, 5:28.
21. Kirk PM, Cannon PF, David JC, Stalpers J: Ainsworth and Bisby's Dictionary of the Fungi. 9th edition. Wallingford UK: CAB International; 2001.
22. Deagle B, Eveson J, Jarman S: Quantification of damage in DNA recovered from highly degraded samples - a case study on DNA in faeces. Frontiers in Zoology 2006, 3:11.
23. Ficetola GF, Coissac E, Zundel S, Riaz T, Shehzad W, Bessière J, Taberlet P, Pompanon F: An in silico approach for the evaluation of DNA barcodes. BMC Genomics, in press.
24. Wu S, Manber U: Agrep - a fast approximate pattern matching tool. Proceedings of the Winter 1992 USENIX Conference, San Francisco, USA. Berkeley 1992:153-162.
25. James T, et al.: Reconstructing the early evolution of Fungi using a six-gene phylogeny. Nature 2006, 443:818-822.
26. SantaLucia JJ, Hicks D: The thermodynamics of DNA structural motifs. Annual Review of Biophysics and Biomolecular Structure 2004, 33:415-440.
27. Duitama J, Kumar D, Hemphill E, Khan M, Mandoiu I, Nelson C: PrimerHunter: a primer design tool for PCR-based virus subtype identification. Nucleic Acids Research 2009, 37(8):2483-2492.
28. Peay K, Kennedy P, Davies S, Tan S, Bruns T: Potential link between plant and fungal distributions in a dipterocarp rainforest: community and phylogenetic structure of tropical ectomycorrhizal fungi across a plant and soil ecotone. New Phytologist 2010, 185:529-542.
29. Harris D: Can you bank on GenBank? Trends in Ecology and Evolution 2003, 18(7):317-319.
30. Landeweert R, Leeflang P, Kuyper T, Hoffland E, Rosling A, Wernars K, Smit E: Molecular identification of ectomycorrhizal mycelium in soil horizons. Applied and Environmental Microbiology 2003, 69(1):327-333. doi:10.1128/AEM.69.1.327-333.2003
31. Robinson C, Szaro T, Izzo A, Anderson I, Parkin P, Bruns T: Spatial distribution of fungal communities in a coastal grassland soil. Soil Biology and Biochemistry 2009, 41:414-416.
32. Hong S, Bunge J, Leslin C, Jeon S, Epstein S: Polymerase chain reaction primers miss half of rRNA microbial diversity. The ISME Journal 2009, 3:1365-1373.
33. Jeon S, Bunge J, Leslin C, Stoeck T, Hong S, Epstein S: Environmental rRNA inventories miss over half of protistan diversity. BMC Microbiology 2008, 8:222.
34. Sipos R, Szekely A, Palatinszky M, Revesz M, Márialigeti K, Nikolausz M: Effect of primer mismatch, annealing temperature and PCR cycle number on 16S rRNA gene-targetting bacterial community analysis. FEMS Microbiology Ecology 2007, 60:341-350.
35. Engelbrektson A, Kunin V, Wrighton K, Zvenigorodsky N, Chen F, Ochman H, Hugenholtz P: Experimental factors affecting PCR-based estimates of microbial species richness and evenness. The ISME Journal 2010. doi:10.1038/ismej.2009.153
36. Huber J, Morrison H, Huse SM, Neal P, Sogin M, Welch D: Effect of PCR amplicon size on assessments of clone library microbial diversity and community structure. Environmental Microbiology 2009, 11(5):1292-1302.

doi: 10.1186/1471-2180-10-189
Cite this article as: Bellemain et al., ITS as an environmental DNA barcode for fungi: an in silico approach reveals potential PCR biases. BMC Microbiology 2010, 10:189
