El Sayed2005
bers CT005244 to CT005272. T. brucei genome accession numbers: Sequence data have been deposited
at DDBJ/EMBL/GenBank with consecutive accession
numbers CP000066 to CP000071 for chromosomes 3
to 8 and project accession numbers AAGZ00000000,
AAHA00000000, AAHB00000000 for the whole-chromosome shotgun projects of chromosomes 9
to 11. The versions of chromosomes 9 to 11 described in
this paper are the first versions, AAGZ01000000,
AAHA01000000, and AAHB01000000, and unassembled contigs have accession number CR940345. T. cruzi
genome accession numbers: This Whole-Genome
Shotgun project has been deposited at DDBJ/EMBL/
GenBank under the project accession AAHK00000000.
The version described in this paper is the first version,
AAHK01000000. All data sets and genome annotations
are also available through GeneDB at www.genedb.org.
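The consecutive accession range quoted above for T. brucei chromosomes 3 to 8 can be expanded programmatically. The following is a minimal sketch, not part of the deposited data: the helper name `expand_accessions` is hypothetical, and the one-to-one pairing of accessions with chromosomes in ascending order is an assumption drawn from the wording "consecutive accession numbers CP000066 to CP000071 for chromosomes 3 to 8".

```python
import re


def expand_accessions(first, last):
    """Expand a consecutive accession range, e.g. CP000066 to CP000071.

    Hypothetical helper: both accessions must share the same letter
    prefix and the same number of digits.
    """
    m_first = re.fullmatch(r"([A-Z]+)(\d+)", first)
    m_last = re.fullmatch(r"([A-Z]+)(\d+)", last)
    if m_first is None or m_last is None:
        raise ValueError("accessions must be letters followed by digits")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accessions must share a letter prefix")
    if len(m_first.group(2)) != len(m_last.group(2)):
        raise ValueError("accessions must have the same digit width")
    prefix = m_first.group(1)
    width = len(m_first.group(2))
    start, stop = int(m_first.group(2)), int(m_last.group(2))
    if stop < start:
        raise ValueError("range must be ascending")
    # Zero-pad each number back to the original digit width.
    return [f"{prefix}{n:0{width}d}" for n in range(start, stop + 1)]


# Assumption: accessions are assigned to chromosomes 3..8 in ascending order.
chromosome_accessions = dict(
    zip(range(3, 9), expand_accessions("CP000066", "CP000071"))
)

for chromosome, accession in chromosome_accessions.items():
    print(f"chromosome {chromosome}: {accession}")
```

The resulting identifiers could then be passed to whichever sequence database client is in use; no particular client is assumed here.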
Supporting Online Material
www.sciencemag.org/cgi/content/full/309/5733/404/DC1
Materials and Methods
Figs. S1 to S5
Tables S1 to S7
References
14 March 2005; accepted 21 June 2005
10.1126/science.1112181
RESEARCH ARTICLE
www.sciencemag.org
SCIENCE
VOL 309
15 JULY 2005
409
SPECIAL SECTION
Number [table column; row labels not recovered]: 60,372,297; 51; 838; 4,008; 58.9; 23,216; 22,570; 12,000; 3,590; 1,513; 1,152; 53.4; 385; 1,024; 47; 115; 219; 192; 19; 1,447; 2
Table 2. Large gene families in T. cruzi. Members are listed as total genes (pseudogenes in parentheses).

Gene product                                     Members       Tritryp orthologs
trans-Sialidase (TS)                             1430 (693)    Tb
MASP                                             1377 (433)    No
Mucin                                            863 (201)     No
Retrotransposon hot spot (RHS) protein           752 (557)     Tb
Dispersed gene family protein 1 (DGF-1)          565 (136)     No
Surface protease (gp63)                          425 (251)     Tb, Lm
Mucin-like protein                               123           No
Hypothetical                                     117           Lm, Tb
Hypothetical                                     93            Lm, Tb
Kinesin, putative                                79            Lm, Tb
Protein kinase (CMGC group)                      77            Lm, Tb
Protein kinase (several groups)                  79            Lm, Tb
Hypothetical protein                             42            No
Glycosyltransferase                              52            Lm, Tb
RNA helicase (eIF-4a)                            47            Lm, Tb
Protein kinase (NEK group)                       39            Lm, Tb
MASP-related                                     38            No
Glycosyltransferase                              36            Lm, Tb
Hypothetical                                     35            Lm, Tb
Amino acid permease                              28            Lm, Tb
AAA ATPase                                       33            Lm, Tb
Protein phosphatase                              30            Lm, Tb
Heat shock protein HSP70                         21            Lm, Tb
Protein kinase (STE group)                       25            Lm, Tb
RNA helicase                                     23            Lm, Tb
Phosphatidylinositol phosphate kinase-related    23            Lm, Tb
Hypothetical                                     24            Lm, Tb
Elongation factor 1-γ (EF-1-γ)                   22            Lm, Tb
DNA helicase (DNA repair)                        21            Lm, Tb
Actin-related                                    20            Lm, Tb
Cysteine peptidase                               20            Lm, Tb
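Because Table 2 reports each large family as total genes with pseudogenes in parentheses, the pseudogene fraction per family follows by simple division. The sketch below is illustrative only: it covers the six families that list a pseudogene count, the dictionary keys are shortened labels chosen here, and the pairing of names with counts assumes the rows of Table 2 are read in the order listed.

```python
# (total genes, pseudogenes) for the Table 2 families that report a
# pseudogene count. Keys are shortened labels, not the table's wording.
large_families = {
    "TS": (1430, 693),
    "MASP": (1377, 433),
    "Mucin": (863, 201),
    "RHS": (752, 557),
    "DGF-1": (565, 136),
    "gp63": (425, 251),
}


def pseudogene_fraction(total, pseudogenes):
    """Return the fraction of family members annotated as pseudogenes."""
    if total <= 0:
        raise ValueError("total must be positive")
    return pseudogenes / total


# Family with the largest pseudogene fraction among those listed.
most_degenerate = max(
    large_families, key=lambda name: pseudogene_fraction(*large_families[name])
)

for name, (total, pseudogenes) in large_families.items():
    fraction = pseudogene_fraction(total, pseudogenes)
    print(f"{name}: {pseudogenes}/{total} = {fraction:.1%}")
```

On these figures the RHS family carries the largest share of pseudogenes, and the mucin and DGF-1 families the smallest.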
Retrotransposons            Tb          Tc          Lm
LTR retrotransposons
VIPER (4.5 kb)              26 (0)      275 (0)     0
SIRE (0.43 kb)              10 (0)      480 (0)     0
Non-LTR retrotransposons
SLACS (6.3 kb)              4 (1)       0           0
CZAR (7.25 kb)              0           8 (*)       0
ingi (5.2 kb)               115 (3)     0           0
RIME (0.5 kb)               86 (0)      0           0
L1Tc (4.9 kb)               0           320 (15)    0
NARTc (0.25 kb)             0           133 (0)     0
DIRE (4 to 5 kb)            73 (0)      257 (0)     52 (0)

Fig. 1. Evolutionary analysis of trypanosomatid myosins, in comparison with myosins from Schizosaccharomyces pombe (Sp), C. elegans (Ce), and Homo sapiens (Hs). See supporting online material (18) for details.
[Tree labels: clades Myo I, Myo II, and Myo V; tips Tb Myo1, Tb Myo2, Tc Myo1 to Tc Myo8, Lm Myo2, Sp Myo2, Sp Myo51, Ce Myo2, and Hs Myo5A; node support values range from 82 to 100.]
Table 4. Comparison of Tritryp, yeast, and human kinome. The comparison is based on catalytic domains. Data for human (Hs) and yeast (Sc) were derived from Manning et al. (68).

PKs               Tb     Tc     Lm     Sc     Hs
Eukaryotic PKs
AGC               12     12     11     17     63
CAMK              13     13     16     21     74
CK1               5      6      7      4      12
CMGC              42     41*    47     21     61
NEK               20     23     22     1      15
STE               24     28     32     14     47
TK                0      0      0      0      90
TKL               0      0      0      0      43
Unique            21     25     27     4      7
Other†            19     19     18     33     61
Total             156    167    180    115    478
Atypical PKs
ABC               5      5      5      3      5
Alpha             2      1      5      0      6
Bud32             1      1      1      1      1
Cofilin           1      1      1      1      2
PDHK              3      3      3      2      5
PIKK              6      6      6      5      6
RIO               2      2      2      2      3
Other             0      0      0      1      12
Total             20     19     23     15     40
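The group counts in Table 4 can be checked against the listed totals with a few lines of arithmetic. The sketch below is a consistency check on the values as read from the table, not an analysis from the paper: the variable names are chosen here, the asterisk on the Tc CMGC entry (41*) is dropped, and the assignment of values to species columns is an assumption about how the table is laid out.

```python
# Group order: AGC, CAMK, CK1, CMGC, NEK, STE, TK, TKL, Unique, Other
epk_counts = {
    "Tb": [12, 13, 5, 42, 20, 24, 0, 0, 21, 19],
    "Tc": [12, 13, 6, 41, 23, 28, 0, 0, 25, 19],  # CMGC listed as 41*
    "Lm": [11, 16, 7, 47, 22, 32, 0, 0, 27, 18],
    "Sc": [17, 21, 4, 21, 1, 14, 0, 0, 4, 33],
    "Hs": [63, 74, 12, 61, 15, 47, 90, 43, 7, 61],
}
epk_totals = {"Tb": 156, "Tc": 167, "Lm": 180, "Sc": 115, "Hs": 478}

# Group order: ABC, Alpha, Bud32, Cofilin, PDHK, PIKK, RIO, Other
apk_counts = {
    "Tb": [5, 2, 1, 1, 3, 6, 2, 0],
    "Tc": [5, 1, 1, 1, 3, 6, 2, 0],
    "Lm": [5, 5, 1, 1, 3, 6, 2, 0],
    "Sc": [3, 0, 1, 1, 2, 5, 2, 1],
    "Hs": [5, 6, 1, 2, 5, 6, 3, 12],
}
apk_totals = {"Tb": 20, "Tc": 19, "Lm": 23, "Sc": 15, "Hs": 40}


def total_mismatches(counts, totals):
    """Return {species: listed total minus summed groups} where they differ."""
    mismatches = {}
    for species, values in counts.items():
        difference = totals[species] - sum(values)
        if difference != 0:
            mismatches[species] = difference
    return mismatches


print("ePK mismatches:", total_mismatches(epk_counts, epk_totals))
print("aPK mismatches:", total_mismatches(apk_counts, apk_totals))
```

With these transcribed values, every atypical PK column and the Tb, Tc, Lm, and Sc eukaryotic PK columns sum to their listed totals. The human eukaryotic PK groups sum to 473 against a listed total of 478; the table as reproduced here does not indicate the source of the difference.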
members of which are characterized by conserved N- and C-terminal domains that encode a signal peptide and a GPI anchor
addition site, respectively, which suggests a
surface location in the parasite. The central
region of these proteins is highly variable
(Fig. 2) and often contains repeated sequence.
Because most members of this family are
located downstream of TcMUC II mucins
(which they resemble structurally, if not at the
sequence level), we have named the family
mucin-associated surface proteins (MASPs).
Of the 1377 masp genes identified, 771 appear to be intact and encode both N- and
C-terminal conserved regions; 433 are pseudogenes. An interesting observation is the existence of chimeras (26) that contain the N- or
C-terminal conserved domain of MASP combined with the N- or C-terminal domain of
mucin or the C-terminal domain from the TS
superfamily. The mechanism for the generation of such chimeric masp genes is unknown,
although previous studies have described mosaic genes formed by group II and III members of the TS superfamily (65). Proteomic
data from four different T. cruzi developmental stages revealed at least four distinct masp
genes in trypomastigotes and another in epimastigotes (66). The low number of MASP
peptides detected by proteomic approaches
suggests that MASPs may contain extensive
posttranslational modifications. Alternatively,
masp genes may be expressed in intermediate
stages not represented in the proteome data
or may be expressed in a mutually exclusive
fashion, similar to the T. brucei variant surface
glycoproteins (VSGs).
The gp63 family of surface metalloproteases is found in the three trypanosomatids
and has been implicated in virulence, host cell
infection, and release of parasite surface
proteins (67). Although L. major has only
four gp63 genes and two gp63-like genes, and
T. brucei has only 13, T. cruzi contains more
than 420 genes and pseudogenes. These
appear to be dispersed throughout the genome, although they sometimes occur in
tandem clusters. The reason for this massive
expansion of the gp63 gene family in T. cruzi
is not yet apparent.
Several common themes emerge from
genomic examination of Tritryp surface proteins: Many are highly glycosylated, and the
proteins are members of large families containing highly variable central domains. The
genes in T. cruzi and T. brucei are often located in large haploid arrays. It is likely that
they have evolved to evade the host immune
response, and the presence of pseudogenes
may contribute to the diversity of the sequence
repertoire through recombination. Nevertheless, species-specific differences do occur, because T. brucei expresses only one VSG at
a time and has evolved a sophisticated system to constantly change the expressed copy,
whereas T. cruzi simultaneously expresses numerous copies of the TSs, mucins, and probably MASPs and gp63s.
Implications for novel therapies. The
elucidation of critical pathways in DNA
repair, DNA replication, and meiosis and the
identification of numerous protein kinases
and phosphatases afforded by analysis of the
Tritryp genomes promise to provide novel
drug targets. Differences from the typical eukaryotic machinery for nucleotide excision/repair,
initiation of DNA replication, and the
presence of additional bacteria-like DNA
polymerases used in replication of the mitochondrial genome all provide potential points
of attack against the parasites. In addition, the
presence of several PKs with little similarity
to those in other eukaryotes presents new possibilities for targeted drug development. The
surface TS activity, which is, in T. cruzi at
least, essential for incorporation of host sialic
acid into parasite glycoconjugates, is another
target for chemotherapeutic intervention, and
work is already well advanced in this area
(58). The elucidation of the complete repertoire of active T. cruzi TSs should help in
this endeavor.
References and Notes
1. WHO, The World Health Report, 2002 (World Health
Organization, Geneva, 2002).
2. Anonymous, Mem. Inst. Oswaldo Cruz 94, 429
(1999).
3. C. G. Clark, O. J. Pung, Mol. Biochem. Parasitol. 66,
175 (1994).
4. S. Brisse, C. Barnabe, M. Tibayrenc, Int. J. Parasitol.
30, 35 (2000).
5. M. R. Briones, R. P. Souto, B. S. Stolf, B. Zingales, Mol.
Biochem. Parasitol. 104, 219 (1999).
6. B. Zingales et al., Acta Trop. 68, 159 (1997).
7. N. R. Sturm, N. S. Vargas, S. J. Westenberger, B.
Zingales, D. A. Campbell, Int. J. Parasitol. 33, 269
(2003).
8. A. Pedroso, E. Cupolillo, B. Zingales, Mol. Biochem.
Parasitol. 129, 79 (2003).
9. S. Brisse et al., Mol. Biochem. Parasitol. 92, 253 (1998).
10. C. A. Machado, F. J. Ayala, Proc. Natl. Acad. Sci.
U.S.A. 98, 7396 (2001).
11. M. W. Gaunt et al., Nature 421, 936 (2003).
12. S. J. Westenberger, C. Barnabe, D. A. Campbell, N. R.
Sturm, Genetics, in press.
13. S. Brisse et al., Infect. Genet. Evol. 2, 173 (2003).
14. M. C. Elias et al., Mol. Biochem. Parasitol. 140, 221
(2005).
15. N. M. El-Sayed et al., Science 309, 404 (2005).
16. M. Berriman et al., Science 309, 416 (2005).
17. A. C. Ivens et al., Science 309, 436 (2005).
18. Materials and methods are available as supporting
material on Science Online.
19. M. I. Cano et al., Mol. Biochem. Parasitol. 71, 273
(1995).
20. W. De Souza, Curr. Pharm. Des. 8, 269 (2002).
21. M. S. Villanueva, S. P. Williams, C. B. Beard, F. F.
Richards, S. Aksoy, Mol. Cell. Biol. 11, 6139 (1991).
22. S. Aksoy, T. M. Lalor, J. Martin, L. H. Van der Ploeg,
F. F. Richards, EMBO J. 6, 3819 (1987).
23. B. E. Kimmel, O. K. ole-MoiYoi, J. R. Young, Mol. Cell.
Biol. 7, 1465 (1987).
24. M. Olivares et al., Electrophoresis 21, 2973 (2000).
25. F. Bringaud et al., Mol. Biol. Evol. 21, 520 (2004).
26. G. Hasan, M. J. Turner, J. S. Cordingley, Cell 37, 333
(1984).
27. F. Bringaud et al., Mol. Biochem. Parasitol. 124, 73
(2002).
28. E. Ghedin et al., Mol. Biochem. Parasitol. 134, 183
(2004).