Smallpox Dna Sequence


Journal of General Virology (1992), 73, 2887-2902.

Printed in Great Britain

Nucleotide sequence of 21.8 kbp of variola major virus strain Harvey and
comparison with vaccinia virus
Begoña Aguado,† Ian P. Selmes and Geoffrey L. Smith*

Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, U.K.

A 21.8 kbp region of the genome of variola major virus (strain Harvey), a virus that caused haemorrhagic-type smallpox, has been sequenced and shown to possess 96% nucleotide identity to the corresponding region of vaccinia virus, the smallpox vaccine. Overall the gene arrangement in the two viruses is highly similar and individual open reading frames (ORFs) display a high degree of amino acid identity; for instance, 26 of the 32 variola virus ORFs have ≥90% identity with their vaccinia virus counterparts. A remarkable difference is the disruption of seven vaccinia virus ORFs into small fragments in variola virus. These include the variola virus homologue of vaccinia virus SalF2R, which encodes a protein related to C-type animal lectins, and SalF7L, which encodes an active 3β-hydroxysteroid dehydrogenase enzyme that contributes to vaccinia virus virulence. Upstream of the variola virus haemagglutinin gene there is a deletion of 1910 bp so that the equivalent of vaccinia virus gene SalF17R is truncated, and SalF16R, which shows amino acid similarity to the tumour necrosis factor receptor, is absent. The region sequenced includes the genes for thymidylate kinase and DNA ligase, both of which are active in vaccinia virus and are highly conserved in variola virus. Other conserved ORFs with interesting homologies are those encoding profilin, superoxide dismutase and part of guanylate kinase. Two vaccinia virus genes encoding glycoproteins of the outer envelope of extracellular enveloped virus are also conserved in variola virus and this homology is likely to have contributed to the immunological protection which vaccinia virus evoked against smallpox. Lastly, there are multiple instances in which short oligonucleotide direct repeats flank a region absent from either variola or vaccinia virus.
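The identity figures quoted above are percentages of matching positions between aligned sequences. As a minimal sketch only, the arithmetic can be written as below. The function name `percent_identity` and the choice to skip gapped columns are assumptions made for illustration; the text does not state how gaps were treated, and the inputs shown are toy strings rather than data from the paper.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the ungapped columns of a pairwise alignment.

    Assumption (not from the paper): columns containing a gap in either
    sequence are excluded from both numerator and denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example: nine of ten aligned positions match.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

Other conventions, such as counting gaps as mismatches, would give lower values for the same alignment, so any comparison of identity figures should use one convention throughout.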

† Present address: MRC Immunochemistry Unit, Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, U.K.

Introduction

In 1798 Jenner introduced vaccination for the prevention of smallpox (Jenner, 1798) and predicted a few years later that this practice would result in the elimination of smallpox, 'the greatest scourge of the human species' (Jenner, 1801). This prophecy was fulfilled in 1977 and subsequently certified by the WHO (World Health Organization, 1980). Variola virus is now contained only in two high security laboratories in Moscow and Atlanta, and the WHO has proposed that once complete genomic sequences of representative strains of variola virus have been determined all remaining virus and DNA clones should be destroyed. The virus used for smallpox immunoprophylaxis in the modern era, vaccinia virus, has already been completely sequenced for strain Copenhagen (Goebel et al., 1990) and mostly sequenced for the commonly used laboratory strain Western Reserve (WR) (see G. L. Smith et al., 1991 and references therein), and the future prospect of comparing the sequences of the two viruses, one the pathogen and the other the vaccine, is of great interest.

A 42 kbp region of the vaccinia virus strain WR genome from near the right terminus has been sequenced in this laboratory (G. L. Smith et al., 1991) and shown to contain several genes that contribute to virus virulence in vivo but are not essential for virus replication in vitro. These include genes encoding the enzymes DNA ligase (Colinas et al., 1990; Kerr et al., 1991; Kerr & Smith, 1991), thymidylate kinase (TmpK) (Hughes et al., 1991) and 3β-hydroxysteroid dehydrogenase (3β-HSD), an enzyme that synthesizes steroid hormones (Moore & Smith, 1992). Other genes may interfere with the immune response to virus infection, including those encoding serine protease inhibitors (Kotwal & Moss, 1989; Smith et al., 1989b), soluble receptors for the cytokines tumour necrosis factor (TNF) (Howard et al., 1991; Upton et al., 1991) and interleukin-1 (Smith & Chan, 1991), and a membrane glycoprotein related to complement control factors (Takahashi-Nishimaki et al., 1991; Engelstad et al., 1992). Three vaccinia virus genes which have interesting homologies [superoxide dismutase (SOD), TNF receptor (TNFR) and guanylate

0001-1152 © 1992SGM Downloaded from www.microbiologyresearch.org by


IP: 27.6.198.93
On: Tue, 25 Jun 2019 17:07:52
2888 Variola virus nucleotide sequence

kinase (GmpK)] are all unlikely to encode proteins with these activities (G. L. Smith et al., 1991), yet if the proteins were active they might contribute to virus virulence. In the case of SOD, the protein might confer on the virus resistance to destruction by an oxidative burst in phagocytes. A soluble TNFR would sequester soluble TNF and obviate the antiviral activities of the cytokine, as has been shown for leporipoxviruses (C. A. Smith et al., 1991; Upton et al., 1991). GmpK might contribute to virus virulence by providing increased nucleotide pool sizes to aid virus replication. Another group of important genes from this region are those encoding glycoproteins that form part of the envelope of the extracellular enveloped virus (EEV) (Shida, 1986; Duncan & Smith, 1992a; Engelstad et al., 1992). For immunity against orthopoxviruses these antigens are especially important because they induce protective immunity, in contrast to inactivated intracellular naked virus (INV) (Appleyard et al., 1971; Boulter & Appleyard, 1973; Payne, 1980). EEV is also the form of the virus that mediates long-range virus dissemination in vitro (Appleyard et al., 1971; Boulter & Appleyard, 1973; Payne, 1980) and in vivo (Payne & Kristensson, 1985).

In view of the abundance of interesting genes from this region of vaccinia virus, we wished to determine whether similar genes existed in variola virus that might contribute to virus pathogenesis or explain the immunological cross-protection afforded by prior infection with either virus. Variola major virus was selected owing to the higher mortality (30%) caused by this virus than by variola minor virus (1 to 2%), and two cloned restriction fragments (Hamilton et al., 1985) containing 21.8 kbp were obtained and sequenced. This substantially increases the available variola virus sequence data, which had previously been restricted to the thymidine kinase (TK) gene (Esposito & Knight, 1984) and a 500 bp region near the left inverted terminal repeat (Cowley & Greenaway, 1990).

Methods

DNA sequencing and computer analysis. Plasmids containing the variola major virus (strain Harvey) 7.2 kbp HindIII I fragment and the 14.6 kbp SstI-HindIII fragment from the right end of the HindIII A fragment (Hamilton et al., 1985) were obtained from Drs R. Cowley and P. Greenaway, Porton Down, U.K. The poxvirus insert was excised from these plasmids, self-ligated, randomly fragmented by sonication and cloned into M13. Single-stranded DNAs from recombinant phages were sequenced by the dideoxynucleoside triphosphate chain termination method (Sanger et al., 1977) using [35S]dATP and buffer gradient gels (Biggin et al., 1983), and either the Klenow fragment of DNA polymerase I or Sequenase. Autoradiograms of sequencing gels were read by using a sonic digitizer and the sequence data assembled into contiguous sequence using the program SAP (R. Staden, MRC, Cambridge, U.K.). A consensus nucleotide sequence was translated into open reading frames (ORFs) representing ≥65 amino acids using the programs ORFFILE and DELIB (M. E. G. Boursnell, Institute of Animal Health, Houghton, U.K.), and these were screened against the SWISSPROT protein database using the program FASTA (Pearson & Lipman, 1988).

Results

The region of the variola major virus (strain Harvey) genome we have sequenced is shown in Fig. 1, and the nucleotide sequences of the 14.6 kbp SstI-HindIII and 7.2 kbp HindIII I fragments are shown in Fig. 2. The restriction map of the variola virus Harvey genome (Mackett & Archard, 1979; Esposito & Knight, 1985) predicts that these fragments should directly abut. However, after the ends of the sequences were compared against the sequences from vaccinia virus strains WR (G. L. Smith et al., 1991) and Copenhagen (Goebel et al., 1990), it was apparent that a second HindIII site lies within this region of the variola virus genome and that a 67 bp HindIII fragment between these sites had not been cloned. In an attempt to obtain the sequence of the missing fragment, the large overlapping SstI A fragment was used as a template for the polymerase chain reaction (PCR) with oligonucleotides (5' TGCCGCCTTATTGTCGC 3' and 5' CGCTACTCCAGAGAACGC 3') containing nucleotides 85 to 69 upstream and nucleotides 635 to 652 downstream of the unsequenced region. These attempts yielded PCR fragments which, when cloned and sequenced, did not match the predicted sequence in any way. Examination of the cloned SstI A fragment by restriction digestion and agarose gel electrophoresis revealed that it had a size and structure incompatible with the published data (data not shown). We conclude that this instability was due to the very large insert (50 kbp) and that the region sought had unfortunately been deleted. Since variola virus is no longer available within the U.K., the extra sequence could not be obtained and so the sequence is presented in two pieces which are predicted to be joined by an extra 67

[Fig. 1 diagram: HindIII fragment maps, panels (a) and (b)]

Fig. 1. HindIII restriction maps of the variola major virus (Harvey) (b) and vaccinia virus (WR) (a) genomes indicating the regions sequenced in this study and previously (G. L. Smith et al., 1991) (hatched areas). The 10 kbp inverted terminal repeats of vaccinia virus are shown as stippled boxes. Note the presence of two HindIII sites between the variola virus HindIII A and I restriction fragments. The missing 67 bp fragment might be called fragment Q when it is formally identified.
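The ORF screening step in Methods (translating a consensus sequence into ORFs of at least 65 amino acids) can be illustrated with a minimal six-frame scan. This is a sketch under stated assumptions and is not the ORFFILE or DELIB programs named in the text, whose behaviour is not described here. The names `find_orfs`, `translate` and `reverse_complement`, and the rule that an ORF starts at the first methionine after a stop codon and must end at a stop codon, are all hypothetical choices for illustration; the input is a toy string.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def find_orfs(dna, min_aa=65):
    """List (strand, frame, protein) for stop-terminated, Met-initiated ORFs.

    Assumption: an ORF runs from the first 'M' after a stop codon to the
    next stop codon, and is kept only if it has at least min_aa residues.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            # Segments between stops; the last piece has no stop, so drop it.
            for segment in protein.split("*")[:-1]:
                start = segment.find("M")
                if start != -1 and len(segment) - start >= min_aa:
                    orfs.append((strand, frame, segment[start:]))
    return orfs


# Toy example with a lowered threshold so the short ORF is reported.
toy = "ATG" + "GCT" * 4 + "TAA"
print(find_orfs(toy, min_aa=5))
```

With the default threshold of 65 residues the toy sequence yields no ORFs. A real analysis would also record coordinates and pass the predicted proteins to a database search, which this sketch omits.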

B. Aguado, I. P. Selmes and G. L. Smith 2889

(.)
TCCTAATGGAGATTTTCTATT~TCGTCCATTTTAGGATATG~TTTCATAAAGTCCCTAATAACTTCGTGAATAATGTTTCTATGTTTTCTACTGATC~TGTATTTC4~TTC~ATTTTTTT 120
G L P S K R N E D M K P Y A K M F D R I V E H I I N R H K R S I C T N A E I K K

AT~C~ATGTTT~AT~TATCATAGATTTAAA~GCAGTAATC-CTTGCAA~ATTAACATCTTGAAC~ATTGGTACAATTCCGTTCCATAAATTTATAATGTTCGC~a~ TATAACTCATTT 240


D W T E D I M S K F A T I S A V N V D Q V M P V I G N W L N I I N A M
• -J&26L 9.2X
TTTGAATATACTTTTAATTGAACA~GAGTTAAGTTACTCATATGGG~GC~GAC~AGTCTGAACATCAATCTTTTTAGCCAGAGATATCATAGCCGCTCTTAGAGTTTC~ 360
• E Y P R R G T Q V D I K K A L S I M A A R L T E A [~ N

T•CAACCTAAATAGAACGTCATCATTGCGTTTAcAAcACTTTTCTATTTGTTCAAACTTTGTTGTTACATTAGTAATTTTTTTTTCCAAATTAGTTAGCCGTTGTTTGAGAGTTTCCTcA 480
E L R F L V D D N R K C C K E I Q E F K T T V N T I K K E L N T L R Q K L T E E

TTAT CGTCTCCAT CGGCT T T ~ A C A A T T G C T T C GCGTTTAGCCT CTGGCTT T T T A G C A G C C T T T G T A G A G A ~ T T C A G T T G C ~ ~ T C A T ~ C A ~ ~ 600


N D D G D A K V I A E R K A E P K K A A K T S F F E T A P I A L D D D G P F L T

CCGTCCATTTA ~ A G T A C A G A T T T T / K A ~ C T G A C A C T C T A C G T T A T T T A T AT TTGGTGCAACACATGGATT A T A ~ A T A T C G A T G T T A A T A A C A T ~ T G T ~ G T C T A T A C A T ~ 720


G D M * L V S K L F Q C E V N N I N P A V C P N Y I D I N I V D S F T F D I C 0
+ - &27L 1 2 . 5 K
GCCATCGTGTT/t~ATTTTCT AATfiGAT CT AGT AT T A T T G G G T C C A A C T T C AGC C T G A A A T C C ~ T A~GC~ATAC ~ C C G ~ C C T G ~ T ~ A T ~ C C A ~ 840
A M T N F K R I S R T N N P G V E A Q F G F I S A S V F G N G P Y A V C R W K Q

T T T A C A T C A G A ~ A T T G T G T C A T TGACATC TTGAACT CTC CTATCTAATGC TGG TGTACCACCTATAGATTTTG AATAT TCGAATGCT GCATGAGTACCATT~u%ATTCCTTAATATTGCCA 960
K V D S I T D N V D Q V R R D L A P T G G I S K S Y E F A A H T A N F E K I N G

TRAT TCT CATATATT GAGT/u~CC CT G G A T A ~ G T A A A C A C A C T G C A G CC GTCGC T A C C A C A A T A ~ T T G A T A G A G A G T T


C A T ~ A TAATC T A T T A G ~ C ~ 1080
Y N E Y I S Y G Q I F L L C V A M A T A V V
* L R N S A L L I F F
I I S L'S N K
~-- A 2 8 L 1 6 . 2 K
TTACA•GCAT•AGACAATGCTTTAATAAATAGTTCAACATC•ACTTT•G•TATA••GAACCGATGATATGATT•TAACCTAGAATTACATCCGAAAAAGTTGACTATGTTCATTGTcATT 1200
K C A D S L A K I F L E V D V K T I D F R H Y S E L ~[ S N C G F F N V I N M T M

AAGTCATTAACGAACAATATGCCAGACTCTGGATTATAAGACGATACTGTTTCATCACAATCAC CTACCTTAATCATGTGATTATGAATATTGGCTATTAGAGCACCTTCTAAGAAATCT 1320


L D N V F L I G S E P N Y S S V T E D C D G V K I M H N H I N A I L A G E L F D

A T A A T A T C T T T G A A A C A C G A T T T A A A A T C A A A C C A C G A A T A T A C TTCTACGAAGAAAGT TAG TTTACCCATAG GAGAGATAAC TATAAAT G G A G A T C T A G A T A C A A A A T C C G G A T C TATG 1440


I I D K F C S K F D F W S Y V E V F F T L K G M P S I V I F P S R S. V F D P D I

ATAGTT~2.a~CATTATTATA'I~U2CTATT~TACCTCCACA T f f 2 . ~ A T G T T ~ T TI~T G ~ C T A T G TCTTCGTTTATTACCGTAC~!2~ C T . ~ T A T ~ ~ TATTG~ 1560


I T K V N N Y E R N F V E V D L F T L K S V I D E N I V T G S S F A I L E I T Q

fiAACTCTTT AAACGAT A T T C T T G A A A T AC ATGTAAC AAAGT TTCCT TT AACTCGGTCGGTTT AT CTACC AT AGTTACAGAATT TGTATCCTT AT CTATAATATAAT AATCAAAAT CGTAT 1680
S S K L R Y E Q F V H L L T E K L E T P K D V M T V S N T D K D I I Y Y D F D Y

AAAGTTATATAATTATCG C G T T C A G A T T G T G A T C T T TTCAAAT AGACT AAAAACCCCAT TTC TCTAGTAAGTATCTTATGTATATGTTTGTAAAATATCTTCATC-GTG GGAAT AT C-CTCT 1800


L T I Y N D R E S Q S R K L Y V L F G M E R T L I K H I H K Y F I K M T P I H E

A C C A C A G T T A G C C A T T C C T C A T T G A T A G C G G T AGATGTATTAG ACAAAACTATTCCAATGT T TAACAAGfiGCCATTTTACAAGATTATTAAATCCTTGTTTGATAAATGTAC-CTAATGAG 1920


V V T L W E E N I A T S T N S L V I G I N L L [ W K V L N N F G Q K I F T A L S

GGTTCGAGT TCAACGACGAT TGAAT TCTCTTCCCGCGGATGCC GCATGATGAACGACTGGATGT TGT TCGATTGATTTGGAATTCTTTT TCGAC T TTTTGT TTATATTAAATAT T T TAAA 2040
• R R N F E R G A S A A H H V V P H Q E I S K S N K K S K K N I N F I K F
P E L E V V I S N E E R P H R M
• -- A29L 3 5 . 3 K
ATTTATAGCTGATAGCAATTCATGTACTACGGATAATGTAGACGCGTATTGTGCATCGATATCTTTATTATTAGATAAATTTATCAATAAATGTGAGAAGTTTGCCTCGTTAAGGTCTTC 2160
N I A S L L E S V V S L T S A Y Q A D I D K N N S L N [ L L H S ~ N A E N L D E

CATTTAAATATTATATAAACATTTGTGTTTGTATCTTATTCGTCTTTTATGGAATAGTTTTTAAcTAGTAAACCTGTAATTACATAcTTTGTCCGTAAAACATAAATATAAACACCCC-CT 2280
M
4-- & 3 0 L 8.7K A31R "-~ / SalLIR 16.3K
M V S I L N T L R F L E K T S F Y N C N D S I T K E
TTTATCAAACGTTCCAAAAAGTCGTTAGTAGACATTTTTAACATGGTATCTATTTTAAATACACTTAGGTTTTTAGAAAAAACATCATTTTATAATTGTAACGATTCAATAACTAAAGAA 2400

K I K I K H K G M S F V F Y K P K H S T V V K Y L S G G C I Y H D D L V V L G K
AAGATTAAGATTAAACATAAGGGAATGTCATTTGTATTTTATAAC.CCAAAGCATTCTA•CGTTGTTAAATACTTATCTGGAGGATGTATATATCATGATGATTTGGTTGTATTGGGGAAG 2520

V T I N D L K M M L F Y M D L S Y H G V T S S G V I Y K L G S S I D R L S L N R
GTAACAATTAATGATCTAAAGATGATGCTATTTTACATGGATTTATCATATCATGGAGTGACAAGTAGTGGAGTAATTTACA~ATTGGGATCATCCATAGATAGACTTTCTCTA~ATAGG 2640

T I V T K V N N N Y N N Y N N Y N N Y Y N C Y N Y D D T F F D D D D *
ACTATTGTTACAAAAGTTAATAACAATTATAACAATTATAAcAATTATAACAATTATTATAATTGTTATAATTATGATGATACATTTTTTGACGATGATGATTGATCGCTATTACACAAT 2760
• S S V N K S S S S Q D S N C L

TTTG'I~Tr T GTACTTT CTi~TATAGTGTTTAGGT TCTTT TT CATATGAG-I~TA'I~GATI"TACT / ~ T A T C TATGTTT~ C ' I ~ TT GTTCTATi~CGTCCTTAT CC~CG~I'ATC~TACAT' 2880
K T K T S E L I T N L N K K M H S Y Q N V L I D I N L K Q E I V D K D A T D T C

AT AC GT AATTC ACCTT CACAAAAT ACGGAGTCTT CG AT AAT AAT AGCC AATCG AT T ATT GGATCTAACCGT CT GTATCAT A T T C A A C A T G T T T A A T AT ATCCTTTCGT TT ACCCTTT ACA 3000
I R L E G E C F V S D E I I I A L R N N S R V T Q I M N L M N L I D K R K G K V

GGCATCGATCGTAGCATATTTTCCGCGTCTGAGATGGAAATGTTAAAACTACAAAAATGCGTAATGTTAGCCCGTCCTAATATTGGTACATGCCTATAAGTTTGGCATAGTAGAATAATA 3120
P M S R L M N E A D S I S I N F S C F H T I N A R G L I P V H R Y T Q C L L I I

GACGTGTTCAAATGCCTTCCAAAGTTTAAGAATTCTATTAGAGTATTGCATTTTGATAGTTTATCACCTACATCATCAAAAATAAGTAAAAAGTGTGCTGATTTTTTATCATTTTGTGCG 3240
S T N L H [% G F N L F E I L T N C K S L K D G V D D F I L L F H ]% $ K K H N Q A

ACAGTAATACA T T T T T C T A T G T T A C T T T T A G T T C G T A T T A G A T T A T A T T C T A G A G A T T C C T G A C T A C T A A C C ~ T T A A T A T G A T T T G G C C A A A T G T A C C C A T C A T A A T C T C ~ T T A T A A 3360
V T I C K E I N S K T R I L N Y E L S E Q S S V F N I H N P W I Y G D Y D P N Y

ACGGGTGTAAACAAGAATATATGTTTATATTTTTTAACTAGTGTAGAAAACAGAGATAGTAAATAGATAGTTTTTCCAGATCCAGATCCTCCCGTTAAAACCATTCTAAACGGCATTTTT 3480
V P T F L F I H K Y K K V L T S F L S L L Y I T K G S G S G G T L V M R F P M K

AATAAATTTTCTcTTAAAAATTGTTTTTCTTGGAAACAATTcATAATTATATTTACAGTTAcTAAATTAATTTGATAATAAATCAAAATATGGAAAACTAAGGTCGTTAGTAGGGAGGAG 3600
L L N E R L F Q K E Q F C N M
(-- A 3 2 L / SalL2L 31.0K
A33R / S a l L 3 R 20.5K --)
M M T P E N D E E O T S V F S A T V Y G D K I Q G K
AACAAAGAAGGCACATCGTGATATAAATAATATTTGTTATcATGATGACACCAGAAAACGACGAAGAGCAAAcATCTGTGTTCTCCGCCACTGTTTAcGGAGACAAAATTCAGGGAAAAA 3720

N K R K R V I G I C I R I S M V I S L L S M I T M S A F L I V R L N Q C M S A N


A T A A A C G C A A A C G C G T A A T T G G T A T AT G T A T T AG A A T A T C T A T GGT T A T T T C A C T A C T A T C T A T G A T T A C C A T GTC C G C G T T T C T C A T A G T G C G T C T A A A T C A A T G C A ~ ~T~CG 3840

E A A I T D A T A V A A A L S T H R K V A S S T T Q Y K H Q E S C N G L Y Y Q G
AGGCTGCTATTP~C T G A C G C C A C T G C A G T T G C T G C T G C A T T A T C T A C T C AT A G A A A G G T T G C G T C T A G C A C T A C A C A A T A T A A A C A C C A A G A A A G C T G T A A T G G T T T A T A T T A C C A G G G T T 3960

S C Y I F H S D Y Q L F S D A K A N C A T E S S T L P N K S D V L T T W L I D Y
CTTGTTATATATTCCATTCAGACTACCAGTTATTCTCGGATGCTAAAGCAAATTGCGCCACAGAATCATCAACACTACCCAATAAATCTGATGTCTT~~A~ATG 4080

V E D T W G S D G N P I T K T T T D Y Q D S D V S Q E V R K Y F C V K T M N *
TT G A G G A T A C A T G G G G A T C T G A T G G T A A T C C A A T T A C A A A A A C T A C A A C C G A T T A T C A A G A T T C C G A T G T A T C A C A A G A A G T T A G A A A G T A T T T T T G T G ~ ~ T ~ T A ~ 4200
A34R / 8alL4R ig.sK -9
M K S L N R Q T V S R F K K ;. S V p A A I M M T L S T ~ I S c. T G T
TTTT~TACATTAATAAATG AAAT CGCTTAATAGAC AAACT GTAAGTAGGTT TAAGAAGTTGTC GGTGCCGGC CGCTATAATGAT GATACTC TCAACCATTATTAGCGGCATAGGAAC 4320

F L H Y K E E L M P S A C A N G W I Q Y D K H C Y L D T N I K M S T D N A V Y Q
A T TT C T G C A T T A C A A A G A A G A A C T G A T G C C T A G T G C T T G C G C C A A T G G A T G G A T A C A A T A C G A T A A A C A T T G T T A T T T A G A T A C T ~ C A T T ~ T ~ T A ~ T ~ T A ~ 4440
M E A N C L L V S A * H K R W H I S V I R Y V N N N L Y * C * F T * L Y H P K D

C R K L R A R L P R P D T R H L R V L F S I F Y K D Y W V S L K K T N N K W L D
GTGTCGTAAATTACGAGCCAGATTGCCTAGACCGGATACTAGACATCTAAGAGTATTGTT TAGTAT TTTTTATAAAGATTATTGGGTAAGTTTAAA~G~C~T~T~T~A~ 4560

I N N D K D I D I S K L T N F K Q L N S T T D A E A C Y I Y K S G K L V K T V C
T A T T A A T A A T G A T A A A G A T A T A G A T A T T A G T A A A T T AAC A A A T T T T A A A C A A C T A A A C A G T A C A A C G G A T G C T GAAGC GT GTT AT AT A T A C A A G T C T G G A A A A C T G G T T ~ A~ 4680
(A35R / SalLSR} -9
K S T O S V L C V K R F Y K * m d t t f v i t p m g
TAAAAGTACTCAATCTGTACTATGTGTTAAAAGATTCTACAAGTAAcAACAAAAAATAAAATAATAATAAGT~CTTAACGAAcGT~GATGGACAcCAcGTTTGTTATTACTCCAATGGGT 4800

m 1 t i t d t 1 y d d 1 d i s i m d f i g p y i i g n i k t v q i d v r d i k y
A T G C T G A C T AT A A C A G A T AC AT T A T A T G A T G A T C T C GAT AT CT C A A T A A T G G A C T T T A T A G G A C C A T A C A T T A T A G G T A A C A T A A A A A C T GT CC A A A T A G A T G TAC G G G A T A T A A A A T A T 4920

s d mY q k c f s n t i i
•k g k i v p * d s n d 1 v r f n i y s i c t a y r s k
TCCGACATGCAAAAATGCTACTTTAGCTAAGGGTAAAATAGTTCCTTAGGATTCTAATGATTTGGTTAGATTCAACATTTATAGCATTTGTACCGCATACAGATCAAAAATACCATCATC 5040

i a c d y d t m 1 d i e g k h q p f y 1 f t s i d v f n a t i i e a y n 1 y t a
ATAGCATGCGACTATGATACCATGTTAGATATAGAAGGTAAACATCAG•CATTTTATCTATTTACATCTATTGATGTTTTTAACGCTACAATCATAGAAGCGTATAACCTGTATACAGCT 5160

g d y h 1 i i n p s d n 1 k m k 1 * f n s s f C i s n g n g w i ~ i d g k c n s
GGAGATTATCATCTGATCATCAATCCTTCAGATAATCTGAAAATGAAATTGTAGTTTAATT~TTCATTCTGCATATCAAACGGCAATGGATGGATCATAATTGATGC~SAAATGCAATAGT 5280
A36R / SalL6R 24.41( -9
n f 1 s M I L V P L I T V T V V A
AATTTTTTATCATAAAAGTTGTAAAGTAAATAATAAAACAATAAATATTGAACTAGTAGTACGTATATTGAGCAATCAGAAATGATACTGGTACCTCTTATCAcAGTGACCGTAGTTGCG 5400

G T I L V C Y I L Y I C R K K I R T V Y N D N K I I M T K L K K [ K S S N S $
C~SAACAATATT AG T A T G T T A T A T A T T A T A T A T T T GT A G G A A A A A G A T A C G T A C T G T C T A T A A T G A C A A T A A A A T T A T T AT G A C A A A A T T A A A A A A G A T A A A G A G T T C T A A T T C C A G C A A A 5520

S S K S T D N E S D W E D H C S A M E Q N N D V D N I S R N E I L D D D S F A G
TCTAGTAAATCAACTGATAACGAATCAGACTGGGAGGATCACTGTAGTGCTATGGAACAAAATAATGAcGTGGATAATATTTCTAGGAATGAGATATTGGATGATGATAGCTTCGCTGGT 5640

S L I W D N E ~ N V I A P S T E H I Y D S V A G S T R L I N N D C N E Q T I Y Q
AGTTTAATATGGGATAACGAATCCAAT GTCATAGCGCCTAGCACAGAACACATTTACGATAGTGTTGCT GGAAGCACGCGGCTAATAAATAATGATTGTAATGAACAAACTATTTATCAG 5760

N T T V I N E T E T I E V L N E D T K Q N P S Y S S N P F V N Y N K T S I C S K
AACACTACAGTAATTAATGAGACAGAGACTATTGAAGTACTTAATGAAGATACCAAACAGAATCCTAGCTATTCGTCCAATCCTTTCGTAAATTATAATAAAACCAGTATTTGTAGCAAG 5880

S N P F I T E L N N .K F S E N N P F R R A H S D D Y L N K Q E H D D I E S S V V
TCAAATCCGTTCATTACAGAACTCAACAATAAATTTAGTGAGAATAATCCGTTTAGGAGAGCACATAGCGATGATTACCTAAATAAGCAAGAACATGATGATATAGAATCATCTGTTGTA 6900
(A37R / SalL7R)-9
S L V * m e i f p v f g i s k i s n f
TCATTAGTCTGATTAGTTTCC TTTTTATAAAATTGAAGTAATATTTAGTATTAATTGCTACCGTTACATTGTACAAATGGAGATATTCCCTGTATTTGGCATTTCTAAAATTAGCAATTT 6120
A37R / SalLTR 7.7K -9
i a n n d c r y y i d t e h h * k i i p n e I n r q M D E M v L L T N I L S V E
TATTGCTAATAATGACTGTAGATATTATATAGATACAGAACATCATTAAAAAATTATACCTAATGAGATCAATAGACAGATGGATGAAATGGTACTTCTTACCAACATCTTAAGCGTAGA 6240

V V N N N E M Y H L I P H R L S M I I L C I S S I G R C V I S I D N D V N N K N
AGTTGTAAATAACAATGAGATGTATCATCTTATTCC CCATAGACTATCGATGATTATACTCTGTATTAGTTCTATTGGAAGATGTGTTATCTCTATAGATAATGACGTTAATAACAAAAA 6360

p 1 s k c v v v s k g p t t i 1 v v k a d i p s k r h
I L T F P I D H A V I I S H *
TATTCTAACCTTTCCCATTGATCAT GCTGTAATCATATCCCACTAAGTAAATGTGTCGTAGT TAGTAAGGGTC CTACAACCATATTGGTTGTTAAAGCGGATATACCCAGCAAACGACAT 6480

i y v n n 1 s 1 i n y 1 p 1 s v f I i r r v t n y 1 d r h I c d q i f a n n k w
AcTATATGTAAATAATCTGTCACTGATTAATTATTTGC•GTTGTcTGTATTCATTATTAGACGAGTTA•AAACTATTTGGATAGAcAcATATGCGATCAGATATTTGCTAATAATAAGTG 6600

y s i I t i d d k q f p i p 1 n c i g m s s t k y i n s s i e q d t 1 i h v c n
GTATTCCATTATAACCATCGACGATAAGCAGTTTCCTATTCCATTAAACT GTATAGGTATGTCC TCTACCAAGTACATAAATTCTAGCATCGAGCAAGATACTTTAATCCATGTTTGTAA 6720

n s v p i k e q i 1 y g r i d n i n m s I s i s v
1 e h p f d s v y k
k m q s y
CCTCGAO~ATCCATTCGACTCAGTAT AcAAAAATGCAGTCGTACA~uATTCTGTACCTAT~AAGG.r~C~TATTGTACGGTAGAATTGAT~TAT~ATATGAc9~ATrAGTATTTCTGTG 6840

d *
G A T T A A T A G A T TTC TAGTATGC43ATC AT T A A T C ATC T C T ~ T C TC TP~.AT A C C TC A T A A ~ A C ~ . C ~ T A T T A TC~.~.TACT G T A C G GAATGGAT']'C AT TCT CT TCT CT T T T T A T ~ 6960

AC TC TG'I~I'GTATA TCTAC T G A T A A A A C TG G ~ G C ~ r ~ A A T CT GAT A G A A A G A A T A A G A A T A A G A T C,~.C~3ATTAT AT C9SA~%CAC G A T T A T T AT ~ T A A C A A T A G T TC C T G G T T C ~ C 7080

'I"TC C A C GTCTACTAC4~ TC GT C4~TATTATACC4ZAT GC CTAGT.~ATAG T C TC TTT C~ GT T G A T G G~,~AC~AGACT A G ~ T . ~ r ~ C A C ~ C T A ~ A T GT TC AC~CACCAT.r~TAGTI~CCCAACCC 7200


• Y Y D R ~l T S P F C V L F L L S F H E 5 V M I T G L G

A G A T A A T A A C A G A G T T C C A T C A A C A C A T T CCT TT A G A C T C A A T C C C A A T C C C A A A A C C G T T A A A A T G T A T C C G GCC A A T T G A T A G T A A A T A A T G A G G T G T A C A G C A C A T G A T A A T TT A C A 7320


S L L L T G D V C E K L S L G L G L V T L I Y G A L O Y Y I I L H V A C S L K C

CAGTAACCAAAATGAAAACACTTTAGTAATTATAAGAAATATAGACGGTAATGTCATCATTAACAATCCAATAATATGCCTGAGAGT AAACATTGACGGATAAAACAAAAATGCCCCGCA 7440


L L W F S F V K T I I L,F I S P L T M M L L G I T H R L T F M S P Y F L F A G C

TAACTCTATCATGGCAATAATACAACCAAACACTTGTAAAATTCCTAAATTAGTAGAAAATACAACTGATATCGATGTATAAGCGATTTCGAGGAATAATAAGAACAAAGTAATGCCTGT 7560
-L E T M A I I C G F V O L I G L N T ~ F V V S ~ S T Y A I E L F L L F L T I G T


/ ~ , A T RAACATC~.CATTGTATGRTGGTCATT~,CCAATTAGTATGACGTTG~CTAATTTCACAGTAGAT~ ATTCC~TG~ATC~ ~CAT~AT~ACC ~ T A T ATC 7680


F I F M n M T H H D N F W N T H R O V L K V T S K I G T N D E C T Y T G P L I D

TTTATACTTTATAATTAATGAAACATCGTTATCAGATAACGAATGGAGTTTGGCAcTAGTATGCCATTTACTTAATATGATCGT•TTGGAAGTTTTATTATAAGTTAAAATATTATGGTT 7800
K Y K I I L S V D N D S L S H L K A S T H W K S L I I T K S T K N Y T L I N H N

GT CCAATTTCC AT C T A A T A T A C T T T G T T G G A T T A T C T A T A G T A C A C G G AATAATGAT GGTAT CATT OCACGCC GTATACTCTATAGT C T T T G T A G A T G T T A T A A C C ~ C A ~ T G C A G A G 7920


D L K W R I Y K T P N D I T C P I I I T D N C A T Y E I T K T S T T V V F T C L
1 3 9 R / S & I L g R 8 . 1 K '-'+
M I P L L F I L F Y F A N G I E W H K F E T $ E K
GTAT kTC A A C ~ T A T T C T A A C T C ~ ~%CATTTTTATTT ATTT~d%AATGAT ACCTTT GTT ATTT ATTTT ATT ~ A ~ ~G~C~T kT C O ~ T ~ C A T ~ T ~ C G ~ ~ 8040
y I L L T R V R L M
4.- ~L38!". / 81tlLSL 3 1 . S K
I I S T Y L I D D V L Y T G V N G A V Y T F S N N K L N K T G L A N T N Y I T T
ATAATTTCTACTTACT TAAT AGATGAC GT ATT AT AC ACGGGTGTTAATGG GGC AG TATATACATTT T C A A A T A A T A A A C T A A A C A A A A C T G G T T T A G C T A A T A C T A A T T A T A T C A C A A C A 8160

S I K V
d t 1 V C g t n n @ n p k c w k i d g s k d p k h r g r g y a p y q
e d
TCTATAAAAGTGAGGATGTGATACATTAGTATGCGGAACCAATAACGGAAATcC~AAATGTTGGAAAATAGACGGTTCCAAAGATCCAAAAcATAGAGGTAGAGGATACGCTC~TATcA 8280
~ g R / Sa2.Lglit 14.31r. - ~
n s k v t i i a h n k c i 1 s n i n i s k e g i k r w r r f d g p c g M I Y L
AAAT AGTAAAGTAACGAT AATCAh-'TC A T A A C A A A T G T AT ACT ATCTAACATAAACAT ATCAAAAGAAGGAATT AAACG ATGGAGAAGATTTGAC ~ C AT~TATG A~TA~TAT 8400

¥ T A D N V I P K D G L Q G A F V D K D G T Y D K V Y I L F T V T I G S K R I V
A C A C G G C G G A T A A C G T A A T T C C A A A A G A T G G T TTACAAGGAGC ATT CG TCGATAAAGAC GGT AC TTATGAC AAAGT TT ACATTCTTTTCACT ~ T A ~ A T C ~ C ~ T ~ A 8520

K I P Y I A Q M C L N D E C" G P S S L S S H R W S T L L K V E L E C D I D G R S
AAATTC C G T A T A T ~ J C A C A A A T G T G C T T A ~ A C G A C G A A T GT GG TCC AT C~TCATTGTCTAGT CATAGATGG TCGAC GT T G ~ C ~ G
TCG ~ A G ~ T ~ GACAT C G ~ ~ 8640
/ ~.FIR 1 6 . 3 K --~ A39R
M N T I K Q
Y S Q I N H S K T I K Q I M I R Y Y M Y S L I V L F Q V R I M Y L F Y E Y H *
ATAGTCAAATTAATCATTCTAAAACTATAAAACAGATAATGATACGATACTATATGTATTCTTTGATAGTCCTTTTcCAAGTCCGC~TTATGTACCTATTCTATGAATAC~TT~C~ 8760

S F S T S N W E D I Q S N Y C L Q L L V Y V Y Q L E K V V P H N T F D V I E Q Y
TCTTTTTCTACGTCAAATTGGGAGGATATACAAAGCAATTACTGTCTCcAGCTTTTGGTATATGTCTAC~AGCTGGAAAAAGTTGTTCCACATAACACGTTTGACGTTATAGAACAATAT 8880

N V L D N I I K P L S N Q P I F K G P S D V K W F D I K E K E N E H R K Y R I Y
A A T G T A C T A G A T A A T A T T A T AAAGCCTTT ATCTAAC CAACC TATCT TCAAAGGACCGTC TGATGTTAAATGGTTCGAT AT AAAGGAG A A G G A A A A T G A A C A T C GGAAATATAGAATATAC 9000

F I K E N T I Y S F N T K K Q T R S S Q V D A Q L F S V M V T S K P L F S I A D
TTCATA~C4%A~u%TACT AT AT A T T C G T T C A A T A C ~ T CTAAAC AAAC TCGTAGC TC GCAAGTC GAT GCGCAACTATTTTC A G T ~ A T G G T A A ~ C G ~ C ~ T A ~ A T ~ T 9120
(A40R / S&IF2R) --)
I G I E V G M P R I K N T * m t m n k p k t n v a a v a e c v
ATAGG~ATAGAAGTAGC4%AT G C C A C G A A T A A ~ T ACT T ~ A A T G TAAT C T T A A T C G A G T A C G C C A T A T G A C A A T G ~ C AAACC T A A G A C A ~ T T A T G C T GG T T A T G C T T ~ T G C G T ~ 9240

y n k c i h 1 s t n q k t w e e g r n a c k a 1 n p n s d 1 i n i
c p t d * i s y
GTCCTACTGACTGAATAAGCTATATAATAAATGTATACATTTATCGACTAATCAAAAAACCTGGGA GAAGGACGTAATGCATGCAAAGCTCTAAATCCAAATTCGGATCTAATTAATAT 9480

e t 1 n e 1 s f 1 r s 1 r ~ ~ y w v g e f k i 1 n ~ t t t y n f i a k . v t k ~I
AGAGACTCTAAAC G A G T T A A G T T T T T T A A G A A G C C T T A G A A G A C G C T A T T GGG TAGGAGAATTC AAAATATTAAAC C A G A C A A C C ACGTATAATTT T A T A G C T A A A A A T G T C A C G A A G A A 9600

k y i c s t t n t p k 1 h s c y t i *
s t k k r
TG~AACTAAAAAACGAAATATATTTGTAGTACAACGAATACTCCCAAACTGcATTCATGTTAcACTATATAAcAATTACACTACA TTTTTATCATAACAcTACTTTGGTTAGATGTTTTA 9720

~ TATCGCCGTAcCGTTCTTGTTTTATAAAAATAAcAATTAACAATTATCAAAT TTTTTCTTTAATATTTTACGTGGTTGACCATT~GTGGTGGTAAAATAATCTCrTAGTGT 9840


• C N D F K K K L I K R P Q G N T T T F Y D R L T

TGGAATGGAATGCTGTTTAATGTTTCCACAcTCATCGTATATTTTGACGTATGCAGTTACAT•GTTTACGCAATAGT•AGACTGTAGTTCTATTATC.CTTCCTAcATTAGGAGGAACAGT 9960
P I S H Q K I N G C E D Y I K V Y A T V D N V C Y D S Q L E I I S G V N P P V T

TTTAAAGTCTCTTGGTTTTAATCTATTACCGTTAGTTTTTATGAAATCCTTTGTTTTATCCACTTCAcATTTTAAATAAATGTC•ACTATACATTCTTCTGTTAATTTTACTAGATCGTC 10080
K F D R P K L R N G N T K I F D K T K D V E C K L Y I D V I C E E T L K V L D D

A T A T G T C A T A G A A T T T A T A G G T T C C A T A G T C C ATGGATCCAAACTAGCAAACT TC GCGTATACAGTATCGCGATT A G T G T A T A C A C C A A C T G T A T G A A A A T T A A G A A A A C A G T T T A A T A G 10200


Y T M S N I P E M T W P D L S A F K A Y V T D R N T Y V G V T H F N L F C N L L

ATCAACAGAAATATTTAATCCTCCGTTTGATACAGATGCAcCATATTTATGGATTTCGGATTCACACGTTGTTTGTCTGAGGGGTTCGTCTAGCGTTGCTTCTACATAAACTTCTATTCC 10320
D V S I N L G G N S V S A G Y K H I E S E C T T Q R L P E D L T A E V Y V E I G

CATATATTC TTTATTATC A G A A T C G C A T A C C G A T TT ATCAT CATAC ACTGTTTGAAAAC TAAAT GGTATAC A C A T C A A A A T A A C A A A T A C TAAC G A G T A C A T T C T G C A A T ATT GTTATCG 10440
M Y E K N D S D C V S K D D Y V T Q F S F P I C M L I V F V L S Y M
4-- A 4 1 L / S a l F 3 L 24.9K
TAATTGGAAAATTAGTGTT~GGGTGAGTTGGATTATGTGAGTACTGGATTG~ATATTATATTTTATATTTTATATTTTATATTTTGTAATAAGAATA~T~T~C~TA~ 10560
A42R / SalF6R iS OK. ( p z o f i l i n )
M-" A E W H K I I E D I S K N N N F E D A A I V D Y K T
CAATAAATGACTTATTAAAAAACATATATAATAAATAACAATGGCTGAATGC.CATAAAATTATcGAGGATATTTCAAAAAATAATAATTTcGAGGATGCCGcCATCGTTGATTACAAGAC 10680

T K N V L A A I P N R T F A K I N P G E V I P L I T N H N I L K P L I G Q K F C
TACAAAGAATGTT CTAGCGGCT ATT CCTAACAGAAC ATTTGCAAAGATTAATC C ~ GG CGAAGTT ATTCCGC TC ATCACTAATC AT AATATTCT AAAACCTCTT ATT G G T C A G A A G T T ~ G 10800

I V Y T N S L M D E N T Y A M E L L T G Y A P V S P I V I A R T H T A L I F L M
TATTGTATATACTAACTCTCTAATGGATGAGAACACGTATGCTATGGAGTTGCTTACTGGGTACGCCCCTGTATCTCCGATCGTTATAGCGAGAACTCATACCGCACTTATATTTTTGAT 10920

--~A43R / S ~ I F S R 22.7K
M M ~ K W I [ S [ L T M S T M P V L V Y S s S I F R F R S E D V E L C Y G N L Y
"~GA T G A T A A A A T G G A T A A T A T C C A T A T T G A C GATGT C A A T A A T G C C G G T A T TGGTATACAGCTC ATCGATTT TTAGATTTCGTTCAGAGGATGTC43 A A T T A T G T T A T G G G A A T T T ~ A ~ 11160

F D R I Y N N V V N I K Y I P E H I P Y K Y N F I N R • F S V D E L D N N V F F
TTGAT AQGATCT AT AAT AAT GT AGT AAAT AT AAAAT AT ATT CCTGAGC AT ATT CC AT AT AAAT AT AATT TT ATT AATCGT ACGTT CT CCGT AGATG AACT AGACAAT AAT GTCTT T T ~ A 11280


T H G Y F L K H K Y G S L N P S L I V S L S G N L K Y N D I Q C S V N V S C L I
C A C A T G G T T AT TT T T T ~ % A C A C A A A T AT G G T T C A C T T A A T C C T A G T T T G A T T G T C T C A T T A T C A G G A A A C T T A A A A T A T A A T G A T A T A C A A T G C T C A G T A A A T G T A T C G T G C CT C A T T A 11400

K N L A T S I S T I L T S K H K T Y S L H R $ K C I T I I G Y D S I I W Y K D I
A A A A T T T G G C A A C G A G T A T A T C T A C T A T A T T A A C A T C T A A A C A T A A G A C G T A T T C T C T A C A T C G G T C C A A G T G T A T T A C T A T A A T A G G A T A T G A T T C T A T T A T A T G G T A T A A A G A T A T A A 11520
(SalF6R) →
m 1 1 e
N D K Y N D I Y D F T A I C M L I A S T L I V T I Y V F K K I K M N S
AT G A C A A G T A T A A T G A T A T C T A T G A T T T T A C C G C A A T A T G T A T G C T A A T A G C G T C T A C A T T G A T A G T G A C C A T A T A C G T G T T T A A A A A A A T A A A A A T ~ T T A ~ T A ~ 11640

e k d v 1 1 a
m d k i k i t v d s k i g n v v t i s y n 1 e k i t i n d t p k k k
A A T G G A T A A A A T C A A A A T T A C G G T T G A T T C A A A A A T T G G T A A T G T T G T T A C C A T A T C G T A T A A C TT G G A A A A A A T A A C T A T T A A C G A C A C A C C A A A A A A G A A A A G G A T G T A ~ A T T A ~ 11760

q s v a v e e a k d v k v e e k n i d i e d d n d d d m d v e s v
C A A T C A G T T G C T G T T G A A G A G G C A A A A G A T G T C A A G G T C~]AAG A A A A A A A T A T C G A T A T T G A A G A T G A C A A T G A C G A T G A T A T G G A T G T A G A A A G C G T G T A A T A C ~ A T ~ T A 11880

AGTATATAAATACTTTTTATTTAcGGTACTCTTGTAGTC~TAATACC~CTAcT~GATTATTTT~FFFrA~AAAAAATACTTATTCTGATTATTCTAGCCATTTCCGTGTTCGTTcAAATG 12000
e s * e 1 w k r t r e f a

CcAcATCAAAGATATGGGAGTAGTTGAAATCTAGTTCTGCATTGTTGG•G•GCCTCAAATGTAGTGTTGGATATCTT•AAcGTATAGTTGTTGAGTAGTGATGGTTTTCTAAATAGAATT 12120
v d f i h s y n f d 1 e a n n a r r
E F T T N S I K L T Y N N L L S P K R F L I

C T C T T C A T A T C A T T C T T G C A C G T G T A C A T T T T T A G C A T C C A T C T T G G A A T C C T A G A T C C T T G T T C T A T T C C C A A T G G T T T C A T C A A T A G A A G A T T A A A A T A T C G T A C G A A C A C G A T G G A 12240
R K M D N K C T ¥ M K L M W R P I R S G Q E I G L P K M L L L N F M D Y S C S P

G A G T A A C C G T A G C A A A A G T A A G C A T T T C C T T T A A T C T C A G A T C CCG G A T A C T G G A T A T A T T T T A C A G C C A A C A C A T G C A T C C A T G C A A C A T T TC CT A C A T A T A C C C G G C T A T G T A C C G C G 12360
S Y G Y C F Y A N G K I E S G P Y Q I Y K V ~. L V H M W .1%- V /~ G V Y V R S H V A

TCATTATCGACTGTACGATACATAATGTTACCGTGTTGTGTACATT GCTCGTAAAAGACTTTCGTTAATTTGTCTCCTCCTCCGTAAATTCCAGTGGGTCTTAGGCAACAAGTATACAAT 12480


D N D V T R Y M I N G H Q T C Q E Y F V K T L K D G G G Y I G T P R L C C T Y L

TT T G C A C C A T T C A T A A T T AC A G A A T T A T T G G C TT T C A T G A C C A G T T GC TC GAC C A T A C G T T T A C TT TTT GC GT A T A C A T G T C C G G G T G A T AT AT C G T A C A G G G T A T G C T C A T G A C ~ T G 12600


K A G N M I V S N N A K M V L Q E V M R K S K A Y V H G P S I D Y L T H E H G. I

AATGGATTACCGTGTTTATTTGGTCCTATTC~CTTcCATGCTACCTAATATAAATCAAATACTTGATTCCTAGGTCTACAGAAGCTGCCAATATAGTCTGTGTTACATAATAGTTTACTTT 12720
~&44L / S & I F 7 L 24.1K (3~-HSD) ~- y i 1 y k i g 1 d V s a a 1 i t q t v y y n v k
F P N G H K N P G I A E M s g 1
CATGATTTCATTATCGGTGTATTTTCCAAATACATCCACTAGAGCAACCGTATGAATAATCAGATTTACCCCATCTAGTGCTTCTTTCACCTTATTAAGTCGTTTATATcACATTGTATA 12840
m i e n d t y k g f v d v 1 a v t h i i 1 n v g d 1 a e k v k n 1
d n i d c q i

TAGTTTATAACTTTAACCTTCGATACGAGAGGTTGTGGATCTTcTACGACATTGATAACTCTGATTTCTTGAACATCATCTGCGCTAATTAAAAGTTTTACTATATACCTC~CCTAGAAAT 12960
y n i v k v k s v 1 p q p d e v v n i v r i e q v d d a s i 1 1 k v i y r g 1 f
A45R / SalF8R 13.7K (SOD) →
A V C I I D H D N I R G
TCGGCAcCAcCAGTAACCGCGTAcACGGTCATTGCTGCTGT•ACT•ATAAATAT•GGACTACTTATTCTATTTTACAAATAATGGCTGTTTGTATAATAGA•CACGATAATATCAGAGGA 13080
e a g g m t v a y v t
← (A44L / SalF7L 3β-HSD)
V I Y F E P V H G K D K V L G S V I G L K S G T Y N L I I H R Y G D I S R G C N
GTTATTTACTTTGAACCAGTCCATGGAAAAGATAAAGTTTTAGGATCAGTTATTGGATTAAAATCCGGAACGTATAATTTGATAATTCATCGTTACGGGGATATTA~CGA~AT~T 13200

S I G S P E I F I G N I F V N R Y G V A Y V Y L D T D V N I S T I I G K A L S I
T C C A T A G G C A G TC C A G A A A T A T T T A T C G G T A A C A T C T T T G T A A A C A G A T A T G G T G T A G C A T A T G T T TAT T T A G A T A C A G A T GT A A A T A T A T C T A C A A T T AT T G G A A A G G C G T T A T C T A T T 13320
A46R / SalF9R 27.6K →
M A F D I S V N A S K
S K N D Q R L A C G, V I G I S Y I N E K I I H F L T I N E N G V *
T C A A A A A A T G A T C A G A G A T T A G C G T G T G G A G T T A T T G G T A T T T C T T A C A T A A A T G A A A A G A T A A T A C A T T T T C T T A C A A T T A A C G A G A A T G G C G T T T G A T A T A T C A G T T A A T G C A T C T A A 13440

T I N A L V Y F S T Q Q N K L V I R N E V N D T H Y T V E F D R D K V V D T F I
A A C A A T A A A T G C A T T A G T T T A C T T T T C T A C T C A G C A A A A T A A A T T A G T C A T A C G T A A T G A A G T T A A T G A T A C A C A C T A T A C T G T C G A A T T T G A T A G G G A C A A A G T A G T T G A C A C G T T T A T 13560

S Y N R H N D S I E I R G V L P E E T N I G C T V N T P V S M T Y L Y N K Y S F
T T C A T A T A A T A G A C A T A A T G A C T C C A T A G A G A T A A G A G G G G T G C T T C C A G A G G A A A C T A A T A T C G G T T G C A C G G T T A A T A C G C C G G T T A G T A T G A C T T A C T T G T A T A A T A A G T A T A G T T T 13680

K L I L A E Y I R H R N T V S G N I Y S A L M T L D D L V I K Q Y G D I D L L F
T A A A C T G A T T T T A C 4 • A G A A T A T A T A A G A C A C A G A A A T A C T G T A T C C G G C A A T A T T T A T T C G G C A T T G A T G A C A C T A G A T G A T T T G G T T A T T A A A C A G T A T G G C G A C A T T G A T C T A T T A T T 13800

N E K L K V D S D S G L F D F V N F V K D I I C C D S R I V V A L S S L V S K H
T A A T G A G A A A C T T A A A G T G G A C T C C G A T T C G G G A T T A T T T G A C T T T G T C A A C T T T G T A A A G G A T A T T A T A T G T T G T G A T T C T A G A A T A G T A G T A G C T C T A T C T A G T C T A G T A T C T A A A C A 13920

W E L T N K K Y R C M A L A E H I A D S I P I S E L S R L R Y N L C K Y L R G H
T T G G G A A T T G A C A A A T A A A A A G T A T A G G T G T A T G G C A T T A G C C G A A C A T A T A G C T G A T A G T A T T C C A A T A T C T G A G C T A T C T A G A C T A C G A T A C A A T C T A T G T A A G T A T T T A C G A G G A C A 14040

T E S I E D E F D Y F E D D D S S T C S V V T D R E T D V *
CACTGAGAGCATAGAGGATGAATTTGATTATT TTGAAGACGATGATT CGTCGACATG TTCTGTCGTAACCGACAGGGAAACGGATGTAT AAT TTT TTATAGTGTGCAC~ATATG ~ 14160

A T A T A A T T G T T G T A T C C A T T C C C A T T C T A A T C A C A T T A T A T G A T T C T G T A A A A A A T T A T A C T G T A A c A C A A T G A A G T A G T T G C A T A G A T G T A T A T A G G T C A G A T A C T G G T T T G A T A A A C T 14280
• V T V C H L L Q M S T Y L- D S V P K I F K

T T T T A T T C C A C A T G A G T A T G T T T G A C T T T A T G G T T A G A C C C G C A T A C T T T A A C A A A T C A C T G A A A A T T G G A G T T A G G T A T T G A C A T C T C A G A A T C A G T T G C C G T T C T G G A A C A T T A A A T G 14400
K N W M L I N S K I T L G A Y K L L D S F I P T L Y Q C R L I L Q R E P V N F T

T A T T T T T T A T G A T A T A T T C C A A C G C A T T T A T G T G G G T A T A C A A C A A G T C A T T A C T A A T A G A G T A T T C C A A G A G T T T T A A T T G G C T A G T A T T T A A C A A G A G A A G A G A T T T C A A C A A A C T G T 14560
N K I I Y E L A N I H T Y L L D N S I S Y E L L K L Q S T N L L L L S K L L S N

TTATGAACTCGAATGCCGCCTTATTGTCGCTTATATTGATGATGTCGAATTCTCCCAATATCATCACTGATGAGTAGCTCATCTTGTTATCAGGATCCAAGCT 14623
I F E F A A K N D S I N I I D F E G L I M V S S Y S M K N D P D L S
← A47L / SalF10L ctd


AAGC TTAGTC CG TTTGTCCACATCTATAGACGATGATI~f C T G A A T T A T T G CATATATCTC TCTC T ~ A A C T C CAGGAACTTGTCAGGATGGTCCACTTTA~XZATG ~YFCTC GC C T A A G A G A 120


L K T R K D V D I S S S K Q I I A Y I E R K L E L F K D P H D V K V H E R R L S

T A A A A A T C T T T G G A T G G T T G CATG T G A C T T T T C T C T A A A T G A T G A T G TTG C C CAAGATC C TC T C T T A A A T G A A T C C A T C C TATC C T T G T A C A A G A T G GACAGTC T A T T T T C C T I A G A T G G 240


L F R Q I T A H S K E R F S S T A W S G R K F S D M R D K Y L I S L R N E K S P
A48R / SalF11R 23.3K (TmpK) →
M
~'ITAATATrTTTGTTAC CCATGATCTATAAAGGTAGACCTAATCGTCTTG~ATGACCATATA~TTA~"/¢IC CAGTT TTAT T A T A C G C A T ~ . ~ A T T G T A ~ T A ~ A ~ T ~ T 360
K I N K N G M
← A47L / SalF10L 11.0K
S R G A L I V F E G L D K S G K T T Q C M N I M E S I P T N T I K Y L N F P Q R
GTCTCGC G G G G C A T T A A T C G I - I - I ' I - ~ C ~ G A T T G G F ~ C A A A T C ~ 3 C ~ C A A C A C A A T G TATC~%ACATCATGGAATCTATACC G A C I ~ A ~ C ~ T ~ T A ~ ~ C G~G 480

S T V T G K M I D D Y L T R K K T Y N D H I V N L L F C A N R W E F A S F I Q E
ATCCACTGTC~CTGG~AGATGATAGATGACTATC T ~ C TCG T / L K A A ~ C C T A T A A T G A T C A T A T A G T T A A T C T A T T A T ~ T ~ A A A T A ~ T ~ G ~CATL-FI-I-rAT~ 600

Q L E Q G I T L I V D R Y A F S G V A Y A T A K G A S M T L S K S Y E S G L P K
AcAATTAGAACAGGGAATTACTTTAATAGTTGATAGATACGCGTTCTCTGGAGTAGCATATG•CACCGCTAAAGGCGCGTCAATGAcTCTCAGTAAGAGTTATGAATCTGGATTGCCTAA 720

P D L V I F L E S G S K E I N R N V G E E I Y E D V A F Q Q K V L Q E Y K K M I
~C CC GAC T T A G T T A T A T ~ %~fGGAATC TG G T A G C A A A G A ~ T T A A T A G A A A C G T C GG C G A G G A A A T T T A T G A A G A T G TAGCA~'f C ~ C ~ G G T A ~ A ~ T A T ~ T 840

E E G E D I H W Q I I S S E F E E D V K K E L I K N I V I E A I H T V T G P V G
TGAAGAAGGAGA~JGATATTCATTGGCAAATTATTTCTTC T G A A T T C G A G G A A G A T G T A A A G A A G G A G T T G A T T A A G A A T A T A G T T A T A G A G G CTATACATAC GGTTACTGGACCAGTGGG 960
A49R / SalF12R 18.7K →
Q L W M * M D E G ¥ Y S G N L E S V L G ¥ V S
GCAAC TG T G G A T G T A A T A A A G T G A A A T T A C A T T T T T A T A A A T A G A T G T T A G T G C A G T G T T A A A A A A T G G A T G A A G G A T A T T A C TC TGGCAAC ~TG GAATCFE TTC TC GGATACG TATCTG 1080

D M H T K L A S I T Q L V I A K I E T I" D N D I L N N D I V N F I M C R S N L N
ATATGCATACTAAACTCGCATCAATAACT•AATTAGTTATTG•CAAGATAGAAACTATAGATAATGATATATTAAAcAACGACATTGTAAATTTCATTATGTGTAGATCAAACTTAAATA 1200

N P F I S F L D T V Y T I I D Q E I Y Q N E L I N S L D D N K I I D C I V N K F
A T C C A T T T A T C T C T T T C C T A G A T A C TG TATATAC T A T T A T A G A T C A A G A G A T C T A T C A G A A C G A G T T G A T T A A T T C A T T A G A C G A C A A T A A A A T T A T C G A T T G T A T A G T T A A T A A G T T T A 1320

M S F Y K D N L E N I V D
I I T L K Y A
I M N N P D F K T T Y A E V L G S R I A
TGAGC %~fTTATAAG GATAAC C T A G A ~ T A T A G T A G A ~ C
TATC ATTAC T C T A A A A T A T A T A A T G A A T A A T C C A G A T T T T A A A A C T A C G T A T G C A G A A G T A C TC G GTTC C AGAATAGCGG 1440
A50R / SalF13R 63.3K (DNA ligase) →
D I D I K Q V I R E N I L Q L S N D I R E R Y L * M T S L R
ATATAGATATTAAACAAGTAATACGTGAGAATATACTACAATTGTCTAATGATATCCGCGAACGATATTTGTGAAAATA~[~AAAAAAAAATAC ACGTCTCTTCGT 1560

E F R K L C C A I Y H A S G Y K E K S K L I R D F I T D R D D K Y L I I K L L L
GAATTTAC~TTATGCTGTGCTATATATCACGCATCAGGATAT/~AGA~TCTAAA~AAT TAGAGAC I ~ T A T A A C A G A T A G G G A T G A T A A A T A T q ' T A A T C A T T ~ T A ~ 1680

P G L D D R I Y N M N D K Q I I K I Y S I I F K Q S Q K D M L Q D L G Y G Y I G
CC CG GATTAGAC GATAGAA~q~TATAA~AT G A A C G A T ~ A A C A A A T T A T A ~ T A T A T A G T A T A A T A T ~ T / ~ C A A T C l ~ A C 4 % A A G A T A T GCTAC A A G A T T T A G G A T A C G G A T A ~ T A ~ A 1800

D T I S T F F K E N T E I R P R N K S I L T L E D V D S F L T T L S S I T I( E S
GACAC T A T T A G T A C A T T C T T C A A A G A G A A C A C A G A A A T C C GTCC AAGAAATAAAAC~CATTTTAAC T T T A G A A G A C G T G G A T A G T T T C TTAAC T A C A T T A T CATC C A T A A C TAAAGAATC G 1920

H Q I K L L T D I A S V C T C N D L }( C V V M L I D K D L K I K A G P R Y V L N
C A T C A A A T A A A A T T A T T G A C T G A C A T C G C A T C C G T ~ f G T A C A T G T A A T G ATTTAAAATG TGTAG TCATGC T T A T T G A T A A A G A T C TAAAAAT TAAAGCGG GTCC TC G GTACGTAC TTAAC 2040

A I S P H A Y D V F R K S N N L K E I I E N E S K Q N L D S I S V S V M T P I N
GC TATCAGTC CTCAT GC C T A T G A T G T T ~ T A G A A A A T C T A A T A A C TT G A A A G A G A T A A T A G A A A A T G A A T C T A A A C A A A A T C TAGAC TC TATATC TG TTT C T G T T A T G A C TC CAATTAAT 2160

P M L A E S C D S V N K A F K K F P S G M F A E V K Y D G E R V Q V H K N N N E
CC CAT G T T A G C G C ~ T C G T G T G A T T C T G T C A A T ~ G G C A T T T A ~ T T T C CAT C A G G A A T G T T T G CT G ~ G T C A A A T A C G A T G G C GAGAGAG TACAAGTTCATAAAAATAAT/u~C GAG 2280

F A F F S R N M K P V L S Y K V D Y L K E Y I P K A F K K A T S I V L D S E I V
TTTGC CTTr TTTAG T A G A A A C A T G A A A C C A G T A C TCT C T T A T A A A G T G G A T T A T C T C A A A G A A T A C A T A C C GAAAGC A T T T A A A A A A G C TAC GT C TATC G T A T T G G A T T C T G A A A T T G T T 2400

L V D E H N V Q L P F G S L G I H K K K E Y K N S N M C L F V F D C L Y F D G F
CTTG TAGAC G A A C A T A A T G T A C A G C TC C C G Tr TG G A A G T T T A G G TATAC A C A A A A A G A A A G A A T A T A A A A A C TC T A A C A T G T G T T T G T r T G T G T T T G A C TGTITG T A C T T T G A T G GATTC 2520

D M T D I P L Y K R R S F L K D V M V E I P N R I V F S E L T N I S N E S Q L T
GATATGACG G A C A T T C C A T T G T A C A A A C G A A G A T C TTTT C TCAAAGATG TTATG G TC GAGATAC C C A A T A G A A T A G T A T T CTCAGAG ~VfGAC G A A T A T T A G T A A C GAGT C TCAGTTAAC T 2640

D V L D D A L T R K L E G L V L K D I N G V Y E P G K R R W L K I K R D Y L N E
GACGTATTG G A T G A T G C A C T A A C A A G A A A A T T A G A A G GATTGGTCI~f A A A A G A T A T T A A T G G A G TATAC GAG CC G G G A A A G A G A A G A T G G T T A A A A A T A A A G C G A G A C T A T T T G A A C GAG 2760

G S M A D S A D L V V L G A Y Y G K G A K G G I M A V F L M G C Y D D E S G K W
GGTT C C A T G G C A G A T T C T G C C G A T T T A G T A G T A C TAG GTGCTTAC TATGG TAAAG GAGC A A A G G G T G G T A T T A T G G C A G TCTTT C TAAT G GG TT G TTAC G A T G A T G A A T C CGGTAAATGG 2880

K T V T K C S G H D D N T L R V L Q D Q L T M V K I N K D P K K I P E W L V V N
AAGAC GG T A A C T A A A T G T T C A G G A C A T G A T G A T A A T A C G T T ~ A G G GTTTTGC AAG AC CAATTAAC GATG G T T / ~ T T ~ A C A A G G A T C C CA~I~f CCA~G ~T~TAG ~ T 3000

K I Y I P D F V V E D P K Q S Q I W E I S G A E F T S S K S H T A N G I S I R F
A A A A T C T A T A T T C C C fiATT T TGTAGTAGAGGATCCGAAACAATCTCAAATATGGGAAATTTCAGGAGCAGAG~CPTAC ATC T T C C A A G T C A C A T A C A G C G A A T G G A A T A T C G A ~ T A G A T T T 3120

P R F T R I R E D K T W K E S T H L N D L V N L T K S *
C C T A G A T T T A C T A G G A T T A G A G A A G A T A A A A C GT G GAAAGAATC TAC TCATC TAAAC GATTTAG T A A A T T T G A C TAAATC TTAATAG T T A C A T A C A A A C TGAAAA~'2AAAATAACAC TAT 3240
A51R / SalF14R 37.6K →
M D G V I V Y C L N A L V K H G E E I N H l K N D F M 1 K P C C E R V
TTAGTTGGTGATCG C CATG G A T G G T G T T A T T G T A T A C TG TCTAAACG CG T T A G T A A A A C A T G GTGAG G A A A T A A A T C A T A T A A A A A A T G A T T T C A T G A T T A A A C C A T G T T ~TGAAAGAGT 3360

C E K V K N V H I D G Q S K N N T V I A D L P Y L D N A V L D V C K S V Y K I( N
T T G T G A A A A A G T C A A G A A C G T A C A C A T C G A C G G A C A A T C T A A A A A C A A T A C A G T G A T T G C A G A T T T G C C A T A T C T G G A T A A T G C T G T A T T G G A T G T A T G C A A A T C A G T A T A T A A A A A G A A 3480

V S R I S R F A N L I K I D D D D K T P T G V Y N Y F K P K D A I S V I I S I G
T G T A T C A A G A A T A T C C A G A T ~ G C T A A T T T G A T A A A G A T A G A T G A T G A T G A C A A G A C TC C TACC G G T G T A T A T A A T T A T T T T A A A C C T A A A G A T G CTATTTC TG T T A T T A T A T C CATAGG 3600

K D K D V C E L L I A S D K A C A C I E L N S Y K V A I L P M N V S F F T K G N
AAAG G A T A A A G A T G T C T G T G A A C T A T T A A T C G CATCC GATAAAGC GTGTG CGTG T A T A G A G T T A A A T T C A T A T A A A G T A G CCATTCTTC C C A T G A A T G T T T C C T T C T T T A C C A A A G G A A A 3720


A S L I I L L F D F S I N A A P L L R S V T D N N V V I S R H K R L H G E I P S
TGCGTCA~'fGATTATTCTCCTGTTTGACTTCTCTATCAATGC GGCAC CTC TC T T A A G A A G T G T A A C C G A T A A T A A T G ~ G ~ A T A ~ T A G ~ C G ~ ~ G 3840

S N W F K F ~ Y I S I K S N Y C S I L Y M V V D G S V M Y A I A D N K T H T I I S
TTcCAATTGGTTCAAGT~fTATATA~TATAA~JGTCCAACTATTGTTCTATATTATATATGGTAGTTGATGGATCTGTGATGTATGCAATAGCTGATAATAAAACTCACACAATTATTAG 3960

K N I L D N T T I N D E C R C C Y F E P Q I K I L D R D E M L N G S S C D M N R
CAAAAATATATTAGACAATACT~AATTAA~GATGAGTGCAGATGCTGTTATTTTGA~CACAGATTAAGATTCTCGATAGAGATGAGAT~~ ~ ~ G 4080

H C I M M N L P D I G E F G S S I L G K Y E P D M I K I A L S V A G N L I R N Q
ACATTGTATTATGATGAATTTACCAGATATAGGAGAATTCGGTTCCAGTATATTGGGGAAATATGAACCTGACATGATTAAGATTGC TCTTTCAGTGGCTGGT~T~A 4200

D Y I P G R R G Y S Y ¥ V Y G I A S R *
AGACTACA_~TCC C G G G A G A C G C G G C T A T A G CTACTAC G T T T A C G G T A T A G C C T C T A G A T A A T T T T ~ T A A G CACAAAATAAAAAACATAA'±TI'±AA~TAGTC TATTTCATACTA'I'I'I-I~T 4320
→ (A52R / SalF15R)
m d i k i d i s i f g d k f t v t t r r e n e e r
k k y 1 p 1 q k e k f t
GTGATCAcCATGGACATAAAGATAGATATTAGTATTTTTGGTGATAAATTTAcGGTC~.CTACTAGGAGGGAAAA~GAAAAAATATCTAcCTCTCCAAAAA~CTA 4440
A52R / SalF15R 8.4K →
t d v i k p n y 1 e h d n 1 1 d r d e M S T I L E E Y F M Y R G L L G L R I K Y
CTGATGTTATCAAAC CTAATTATCTTGAGCAC GATAACTTATTAGATAGAGATGAGATGTCTAC TATTCTAGAGGAATATTTTAT~ACA~ ~ C ~ ~ T A ~ 4560

G R L F N E I R K F D N D A E E Q F G T I E E L K Q K L R L N A E E G A D N F I
GACGAcA-rlTA~ACGAAATTAGAAAATTC G A C A A T G A T G C G G A A G A A C A A T T c G G T A C T A T A G A A G A A C T C A A A C ~ % A A C T T A G A T T A A A T G C T G A A G A G G G A G C C G A T A A C T T T A T A G 4680

k q
i s m i g 1 c a c v v d v w r k e k 1 f s r w k y c 1 r a i k 1 f
D Y I K V Q
A T T A T A T A A A G G T A C A A A C A G G A T A T C T A T G A T A G G A T T G TGTGC GTGTG TG GTAGATG TTTGGAGAAAG GA A A A C T G T T T T C TAGATG G A A A T A T T G T T r A C G A G C T A T T A A A C T G T T 4800

i d d p i 1 d k i k s i 1 q n r 1 v y v e m 1 *
T A T T G A T G A T C C C A T A C T T G A T A A G A T A A A A T C T A T A C T G C A G A A T A G A C TAGTG TATG TGGAAATG TTATA A A A T T A A A A G T T A A T G A G A G C A A A A T A T A A T G T T G T A T T C TAATC C 4920
A55R / SalF17R 8.2K →

T~- r r y r c s f a v t v n i. i y M M G G Y D ~ Y P Y R S S K V I V Y N
C A T A T T T A T T A T T T T C A C G G A ~ A T A T A G G TGTAG TTTTG C A G T G A C C G T C A T A A T A T T A T C T A T A T G A T G G G TG G A T A T G A T C A G T A C C C G T A T A G A A G T T C ~ ~ATA~TA~ 5040

T C T N S W I Y D I P E L K Y P R S N C G G V A D D E Y I Y C I G D Q D S S L I
T A C A T G T A C ~ % A T T C ~'fGGATATATGATATAC C A G A G C T A A A A T A T C C T C G T T C T A A T T G T G G A G G A G T T G C T G A T G A T G A A T A C A T T T A T T G C A T A G G C ~ A ~ G ~ T 5160
A55R / SalF17R 19.9K →
S S I D R W K P S K P Y * M R E T K C D I G V A M L N G L I Y V I G G
ATCTAGTATTGATAGATGG~GCCA~KAACCATATTGATA~C GTATG CTAAAATGC G A G A G A C ~ T G T G A T A ~ f G G T G T A G C GATGq~fIJ%AC G G A ~ T A T A T G ~ T A ~ C G G 5280

V V K G D T C T D T L E S L S E D G W M M H Q R L P I N V Q Y V D D C S Y R Q N
AG T T G T W A A A G G T G A C A C A T G T A C C GACACACI~fGAGAGTTTATCAG A A G A T G G A T G G A T G A T G C A T C A A C G T C T T C C A A T A A A T G T C C A A T A T G T C G A C G A T T G ~ T ~ ~ G ~ 5400

L Y I R R L H N S S V V N G I S N L V L S Y N P I Y D E W T K L S S L N I P R I
T T T A T A T A T C A G G A G G C T A C A C A A T A G T A G TG T A G T T A A T G G A A T A T C A A A T C T A G T CC TTAGC TATAATCC G A T A T A T G A T G A A T G G A C C A A A T T A T C A T C A T T A A A T A T r C C TAGAAT 5520

N P A L W S V H N K V Y V G G I S D D I Q T N T S E T Y N K E K D R W T L D N S
TAATCCTGC TCTATGGTCAGTGCATAATAA~JSTATATGTAGGAG GAATATCTGATGATAT TCAAACTAATACATC TGAAACATACAACAAAGAAAAAGATCGTTGGACATTGGATAATAG 5640

H V L P R N Y I M Y K C E P I K H K Y P L E K H S T R M I F *
TCAC G T G T r A C C A C G C A A T T A T A T A A T G T A T A A A T GC GAACC G A T T A A A C A T A A A T A T C C A T T G G A A A A A C A C A G T A C A C G A A T G A T T T T C T A A A G T A C TTG GAAAG TTTTATATGT~J3T 5720
A56R / SalG1R 34.4K (HA) →
M T R T, S I I, r. ~, L ~ ~ L V Y S T P Y P O T
TGATAGAAC AAAATACATAITI-[TTG T A / ~ C A C q'FFTTATAC TAATATGACAC GATTG TC A A T A C T T ~ G TTAC T A A T A T C A T T A G T A T A C T C T A C A C C T T A T C CTCAGAC~JC 5840

O I S K K I G D D A T L S C S R N N I N D Y V V M S A W Y K E P N S I I L L A A
A G A T A T C T A A A A A A A T A G G T G A T G A T G C A A C T C T A T C A T G T A G T A G A A A T A A T A T A A A T G A T T A T G T T G T T A T G A G T G C TTG G T A T A A G G A G C C CAATrC C A T r A T T C T T T T A G C TGCCA 6000

K S D V L Y F D N Y T K D K I S Y D S P Y D D L V T T I T I K S L T A K D A G T
AAAG TGACG T C T T G T A T T T T G A T A A T T A T A C C A A G G A T A A A A T A T C A T A C GATrC TC CATAC GATGATC TAG TTACAAC T A T T A C A A T T A A A T C A T T G A C T G C T A A A G A T G C C G G T A C T T 6120

Y V C A F F M T S T T N D T N K V D Y E E Y S T E L I V N T D S E S T I D I I L
A T G T A T G T G C A T T C T T T A T G A C A T C A A C T A C A A A T G A T A C TAATAAAGTAGAI~fATGAAGAATAC TC TACAGAG T T G A T T G T A A A C A C A G A T A G TGAATC G A C T A T A G A C A T A A T ~ T A T 6240

S G S S H S P E T S S E K P D Y I N N F N C S L V F E I A T P G P I T D N V E N
CTGGATC T T C A C A T T C A C C G G A A A C T A G T T C T G A G A A A C C T G A T T A T A T A A A T A A T T T T A A T T G C TC GTTGG T A T I T G A A A T C G C GACTC C G G G A C C A A T T A C T G A T A A T G T A G A A A A T C 6360

H T D T V T Y T S D I I N T V S T S S G E S T T D K T S G P I T N K E D H T V T
A T A C A G A C A C TGTC A C A T A C A C TAG T G A T A T C A T T A A T A C A G T A A G T A C A T C A T C TG GAGAATC C A C A A C A G A C A A G A C G TC GGGAC CAATTAC T A A T A A A G A A G A T C A T A C A G T C A C A G 6480

D T V S Y T T V S T S S E I V T T K S T A N D A H N D N E P S T V S P T T V K N
ACAC T G T C T C A T A C A C T A C A G T A A G T A C A T C A T C TGAAATTG TCACTAC TAAATCAACC GCCAATGATG C G C A C A A T G A T A A T G A A C CAT CTAC TGTGTCAC C A A C A A C T G T A A A A A A C A 6600

IT K S I G K Y S T K D Y V K V F G T A A L I I L ~ A V A T F C I T Y Y I C N K
TCAC G A A A T C TATAG G T A A G T A T A G T A C T A A A G A C TATGT CAAAG TATTTGG TATTG CAG C A T T A A T T A T A T TGTC G GC C GTGG CAATTTTC TG TATTAC G T A T T A T A T A T G T A A T A A A C 6720
(A57R / SalG2R GmpK) →

r 1 w e y A57R / SalG2R (G~K) -9


I f g f V V S h t t r f p r p M E R E G V D Y H Y V N R E A I W K G I A
AGAC TATGGG~J~TATATTTGGATTTGTGGTGT CCCATAC CAC TAGATTTC CTCG TCC TATGGAAC GAGAAGG TG TTGATTAC C A T T A C G T T A A C A G A G A G G C A A T C T G G A ~ G G A A T ~ G C 6960

T G N F L E H T E F L G N I Y G T S K T V V N T A A I N N R I C V M D L N I D G
CAC C GGAAAC TTTC T A G A A C A T A C T G A G T T l~fTAG G A A A T A T T T A C G GAACTTC TAAAACAG TT G TGAATACAG C GG C T A T T A A T A A T C G T A T T T G T G T G A T G G A T T T A A A C A T C GACG G 7080

V R S L K N T Y L M P Y S V Y I E P T S L K M V E T K L
TG T T A G A A G T C T T A A A A A T A C T T A C C T A A T G C CTTAC TCG GT G T A T A T A A A A C C TAC CTC TC T T A A A A T G G T T G A G A C C A A G C T T 7165

Fig. 2. (a) The nucleotide sequence of the 14.6 kbp SstI-HindIII fragment from the right end of the HindIII A fragment. Nucleotide
sequence numbering starts from the left end of the fragment with the orientation shown in Fig. 1. Predicted amino acid sequences
(shown in upper case letters) of ORFs ≥65 residues in length and starting with a methionine are included above the nucleotide sequence
for ORFs transcribed from left to right, and below the nucleotide sequence for those ORFs running right to left. ORF names are as for
the vaccinia virus homologues from strains Copenhagen (Goebel et al., 1990) and WR (G. L. Smith et al., 1991), and the predicted sizes


[Fig. 3 schematic: aligned ORF maps for Var, Cop and WR. Labels above the maps: EEV gp, CD23, EEV gp; lectin, lectin, profilin, SOD, TmpK, DNA ligase, TNFR, HA, GmpK. Labels below the maps: 14K fusion protein, 3β-HSD, HindIII.]

Fig. 3. Schematic alignment of the ORFs from vaccinia virus strains WR and Copenhagen with those from variola virus identified in
this study. Note the presence of seven broken ORFs and the deletion of a 1.9 kbp region (bracketed) in variola virus. Principal
homologies or functions are indicated. EEV gp, EEV glycoprotein; Cop, vaccinia virus Copenhagen strain; Var, variola virus. HindIII
indicates the position in variola virus at which there are two adjacent HindIII sites. The ORFs separated by these sites may in future be
shown to be one contiguous protein-coding region.

nucleotides. In Fig. 2, variola virus ORFs that start with a methionine codon and are predicted to encode proteins of ≥65 amino acids in length are indicated in capital letters. For comparative reasons the ORFs are named according to their vaccinia virus (strains WR and Copenhagen) counterparts.

At the nucleotide level the sequences in Fig. 2 share 96% identity with the corresponding regions of vaccinia virus strain WR. Translation of the sequence shows that the majority of variola virus ORFs are extremely similar to the vaccinia virus homologue in overall arrangement, length and amino acid content (Fig. 3 and Table 1). However, there are three types of significant difference, and these are described before cataloguing the similarities. The first type of diversity is the fragmentation of seven genes which in vaccinia virus are present as a single ORF but in variola virus are broken into two or more separate pieces by termination codons or frameshifts. Only some of these derivative fragments contain a suitable initiation codon and encode proteins ≥65 amino acids in length. The other fragments that are clearly related to vaccinia virus ORFs, but which alone would not be classified as an ORF due to either their small size or lack of an initiating methionine codon, are represented in Fig. 2 in lower case letters. These broken genes correspond to vaccinia virus strain WR ORFs SalL5R, SalL7R, SalL9R, SalF1R, SalF2R, SalF6R, SalF7L and SalF15R, and to vaccinia virus strain Copenhagen ORFs A35R, A37R, A39R, A40R, A44L and A52R. Note that the strain Copenhagen A39R ORF is equivalent to strain WR ORFs SalL9R and SalF1R and to variola virus Harvey ORFs 13R, 14R and 15R (Table 1). No strain Copenhagen ORF equivalent to strain WR ORF SalF6R was described (Goebel et al., 1990) and variola virus Harvey also lacks a comparable ORF encoding ≥65 amino acids due to a frameshift. Some clustering of the fragmented genes is apparent because to the right of SalL6R the next three rightward transcribed genes (Copenhagen strain ORFs A37R, A39R and A40R or WR strain ORFs SalL7R, SalL9R, SalF1R and SalF2R) are all disrupted in variola virus. The arrangement of ORFs within the SalL9R/SalF1R (Copenhagen strain ORF A39R) region differs in all the orthopoxviruses for which sequences are available. Possible consequences of the presence of multiple broken ORFs in variola virus and not in vaccinia virus are considered in the discussion.

Another major difference between the variola virus and vaccinia virus ORFs is the absence from variola virus of 1910 bp corresponding to vaccinia virus WR nucleotides 17360 to 19270 (numbering according to G. L. Smith et al., 1991). Consequently variola virus Harvey lacks homologues of vaccinia virus ORF SalF16R (related to TNFR) and part of ORF SalF17R, at least from a comparable genomic position.

The third type of difference between variola virus and vaccinia virus is short deletions (mostly less than 20 nucleotides) in one virus or the other that occur between short direct repeats (Table 2 and Fig. 4). There is no obvious sequence specificity of the direct repeats and the sequences vary from three to nine nucleotides in length.
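The second and third classes of difference can be illustrated with a toy model. The sketch below is an illustration under stated assumptions, not the authors' analysis: the function name `delete_between_repeats` and the example string are hypothetical and unrelated to the sequences in Fig. 2. It removes one copy of a short direct repeat together with the intervening bases, which is the pattern described for the short deletions, and it checks the size of the large deletion from the WR coordinates quoted in the text, assuming the quoted size is the simple difference of the two coordinates.

```python
def delete_between_repeats(seq: str, repeat: str) -> str:
    """Remove the spacer and the second copy of `repeat` from `seq`.

    Models a deletion between two short direct repeats: one repeat copy
    is retained and everything up to the end of the next copy is lost.
    """
    first = seq.find(repeat)
    if first == -1:
        return seq  # repeat absent: nothing to delete
    second = seq.find(repeat, first + len(repeat))
    if second == -1:
        return seq  # only one copy: nothing to delete
    return seq[: first + len(repeat)] + seq[second + len(repeat):]


# Made-up example: GG + CATTA + CGCC + CATTA + AG
toy = "GGCATTACGCCCATTAAG"
shortened = delete_between_repeats(toy, "CATTA")
print(shortened, len(toy) - len(shortened))

# Size of the large deletion from the WR coordinates given in the text.
wr_start, wr_end = 17360, 19270
print(wr_end - wr_start)
```

In the toy case the number of bases lost equals the spacer length plus one repeat length, which is why such events leave a single repeat copy at the junction.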

of the polypeptides are indicated. Predicted amino acid sequences which show clear similarity to vaccinia virus ORFs but which are
either <65 residues in length or lack an initiating methionine are indicated in lower case letters. Underlined nucleotide sequences
represent potential early transcription termination signals (Yuen & Moss, 1987) or late initiation signals (Rosel et al., 1986). Underlined
amino acid sequences represent potential hydrophobic transmembrane-spanning regions or potential sites for attachment of N-linked
carbohydrate in those polypeptides having hydrophobic membrane sequences. (b) Nucleotide sequence of the 7.1 kbp HindIII l
fragment. The dagger symbols above nucleotides 4935 and 4936 indicate the positions between which 1910 nucleotides are missing in
comparison with vaccinia virus strains WR and Copenhagen. Other annotations are as in (a).
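The early transcription termination signal marked in the figure is the T5NT motif (TTTTTNT on the coding strand) discussed in the text. As a rough sketch of how such motifs could be located on both strands, the hypothetical helper `find_t5nt` below scans a sequence and its reverse complement with a regular expression. The function names and the short test strings are assumptions made for illustration; they are not taken from Fig. 2 and this is not the authors' procedure.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_T5NT = re.compile(r"(?=T{5}[ACGT]T)")  # lookahead so overlapping hits are reported
_MOTIF_LEN = 7


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_t5nt(seq: str) -> dict[str, list[int]]:
    """Return 0-based top-strand positions of T5NT motifs on each strand.

    For bottom-strand hits the position is the leftmost top-strand base
    covered by the motif.
    """
    top = seq.upper()
    bottom = reverse_complement(top)
    n = len(top)
    top_hits = [m.start() for m in _T5NT.finditer(top)]
    bottom_hits = sorted(n - m.start() - _MOTIF_LEN for m in _T5NT.finditer(bottom))
    return {"top": top_hits, "bottom": bottom_hits}


print(find_t5nt("ACGTTTTTATGC"))
print(find_t5nt("GCATAAAAACGT"))
```

Reporting both strands separately matters for the argument in the text, where signals on the same strand as the converging gene are the relevant ones.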


Table 1. Summary of ORFs in variola virus strain Harvey and vaccinia virus strains WR and Copenhagen

Variola virus Harvey ORF | Predicted size of protein (Mr × 10^-3) | Vaccinia virus ORF* (WR, Cop) | Number of amino acids (Harvey, WR, Cop) | Amino acid identity (%)† | Amino acid overlap‡ | Function or homology

1L 9.2 A26L 72 322 98.7 75
2L 12.5 A27L 110 110 96.4 14K fusion protein
3L 16.2 A28L 146 146 96.6
4L 35.5 A29L 305 305 97.7
5L 8.7 A30L 77 77 100
6R 16.3 SalL1R A31R 140 ≥130 124 92.4 119 RNA-binding motif
7L 31.0 SalL2L A32L 270 270 230 97.8
8R 20.5 SalL3R A33R 184 185 185 94.1
9R 19.5 SalL4R A34R 168 168 168 98.8 C-type lectin, EEV glycoprotein
SalL5R§ A35R 176 176
10R 24.4 SalL6R A36R 216 221 221 94.6
11R 7.7 SalL7R§ A37R 68 263 263 89.6 67
12L 31.5 SalL8L A38L 277 277 277 94.9
13R 8.1 SalL9R§ A39R 71 295 403 92.8 69
14R 14.3 SalL9R A39R 122 295 403 84.3 102
15R 16.3 SalF1R§ A39R 139 142 403 75.4 138
SalF2R§ A40R 159 168 C-type lectin
16L 24.9 SalF3L A41L 218 219 219 96.3 219
17R 15.0 SalF4R A42R 133 133 133 97.0 Profilin
18R 22.7 SalF5R A43R 195 194 194 92.3 195
SalF6R§ 78
19L 24.1 SalF7L§ A44L 210 346 346 95.6 197 3β-HSD
20R 13.7 SalF8R A45R 125 125 125 97.6 SOD
21R 27.6 SalF9R A46R 240 240 214 96.7 240
22L 15.5 SalF10L A47L 135 252 244 94.8 134
23L 11.0 SalF10L A47L 92 252 244 92.0 87
24R 23.3 SalF11R A48R 205 204 204 98.0 205 TmpK
25R 18.7 SalF12R A49R 162 162 162 95.0
26R 63.3 SalF13R A50R 552 552 552 97.8 DNA ligase
27R 37.6 SalF14R A51R 334 334 334 92.0
28R 8.4 SalF15R§ A52R 71 190 190 95.5 67
1.9 kb deletion
29R 8.2 SalF17R§ A55R 71 564 564 74
30R 19.9 SalF17R§ A55R 172 564 564 79.6 167
31R 34.4 SalG1R A56R 313 314 315 85.0 HA, EEV glycoprotein
32R 10.1 SalG2R A57R 88 151 151 96.6 89 GmpK

* Nomenclature for ORFs from vaccinia virus strains Copenhagen (Cop) and WR as described by Goebel et al. (1990) and G. L. Smith et al. (1991), respectively.
† Amino acid identity between variola virus strain Harvey and vaccinia virus strain WR, except where only the strain Copenhagen ORFs are shown.
‡ Where ORFs show incomplete overlap.
§ Vaccinia virus strain WR ORFs for which the variola virus homologue is fragmented.
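
The identity column of Table 1 can be tallied against thresholds to check the summary fractions quoted in the text (53% of ORFs at 95% identity or above, 81% at 90% or above). The following is a minimal sketch, assuming the values are transcribed correctly from the table; the dictionary and function names are illustrative and not part of the paper. ORF 29R is entered as `None` because its row does not list an unambiguous identity value; if its "74" is read as an identity it still falls below both thresholds, so the tallies are unaffected.

```python
# Amino acid identity (%) for each variola virus Harvey ORF, transcribed from Table 1.
# Assumption: 29R has no unambiguous identity value in its row, so it is stored as None.
IDENTITY = {
    "1L": 98.7, "2L": 96.4, "3L": 96.6, "4L": 97.7, "5L": 100.0,
    "6R": 92.4, "7L": 97.8, "8R": 94.1, "9R": 98.8, "10R": 94.6,
    "11R": 89.6, "12L": 94.9, "13R": 92.8, "14R": 84.3, "15R": 75.4,
    "16L": 96.3, "17R": 97.0, "18R": 92.3, "19L": 95.6, "20R": 97.6,
    "21R": 96.7, "22L": 94.8, "23L": 92.0, "24R": 98.0, "25R": 95.0,
    "26R": 97.8, "27R": 92.0, "28R": 95.5, "29R": None, "30R": 79.6,
    "31R": 85.0, "32R": 96.6,
}


def share_at_least(identities, threshold):
    """Return (count, percentage of all ORFs) with identity >= threshold."""
    total = len(identities)
    count = sum(1 for value in identities.values()
                if value is not None and value >= threshold)
    return count, 100.0 * count / total


if __name__ == "__main__":
    for threshold in (95.0, 90.0):
        count, pct = share_at_least(IDENTITY, threshold)
        print(f">= {threshold:.0f}%: {count} of {len(IDENTITY)} ORFs ({pct:.0f}%)")
```

Under these assumptions the tallies are 17 and 26 of 32 ORFs, which round to the 53% and 81% given in the text.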

Eleven of the deletions occur in variola virus and nine in strain WR, and these changes are predicted to cause a variety of changes in the ORFs or their regulation. Three of these deletions occur in intergenic regions (Table 2 and Fig. 4). Between SalL7R and SalL8L there is a loss of 24 nucleotides which also results in the loss of an early transcription termination motif downstream of SalL8L. However, as noted before (G. L. Smith et al., 1991), where two ORFs are predicted to be transcribed towards each other early during infection there are multiple T5NT motifs on both strands which may help to reduce transcriptional interference or double-stranded RNA formation. In the present example there are two other T5NT signals on the same strand in variola virus (Fig. 2) and consequently the loss of one extra signal downstream of SalL8L is probably less important. The second intergenic deletion occurs between genes (SalF3L and SalF4R) which are transcribed away from each other, so that the deletion is likely to affect RNA initiation rather than termination. Transcriptional mapping is available for SalF4R (homology to profilin) (Duncan & Smith, 1992b) and the RNA initiates late during infection from the TAAAT motif approximately 10 nucleotides upstream of the ATG codon. Since late promoters are functionally very short (Davison & Moss, 1989), and the deletion is present more than 70 nucleotides upstream of the TAAAT motif, it is unlikely to affect SalF4R transcription. Interestingly, there is also a difference between vaccinia virus strains WR and Copenhagen at this position (G. L. Smith et al., 1991), but variola virus and strain Copenhagen share identical sequences. The third intergenic deletion occurs in the


very short gap between ORFs SalF5R and SalF6R and results in the loss of five nucleotides which are the site of transcription initiation for SalF6R (Duncan & Smith, 1992b). This is unlikely to affect RNA initiation in variola virus because the nucleotide sequences of the two viruses are identical for more than 70 bases upstream. However, such an RNA might only allow initiation from a different AUG codon located downstream, and cause the loss of the four N-terminal amino acids.

All other deletions occur within ORFs, but several of these do not cause translational frameshifts. For instance, near the 3' terminus of SalL1R (A31R in strain Copenhagen) the variola virus DNA sequence contains a complex series of repeats, which includes four perfect tandem repeats of the sequence AACAATTAT, absent from vaccinia virus. This translates into a fourfold repetition of the tripeptide NNY before the exceptionally acidic C terminus. A more extreme example of peptide repetition has been described at the N terminus of the protein encoded by ORF B11R where strain Copenhagen contains nine copies of a dipeptide that is present only once in strain WR (G. L. Smith et al., 1991). Other in-frame deletions result in the loss of amino acids EQDH from near the C terminus of the protein encoded by variola virus ORF SalL6R, and the substitution of an E residue at the C terminus of the variola virus SalF10L-encoded protein with GMGIVKYSE in vaccinia virus WR (Table 2 and Fig. 4). This strain WR ORF is larger than the variola virus counterpart despite the loss of these 15 nucleotides because of the removal of the stop codon used in variola virus. Consequently in strain WR translation continues downstream to the next in-frame nonsense codon. Several other proteins have extra internal amino acids attributable to length differences between direct repeats, such as TmpK, SalF17R and the haemagglutinin (HA) (Table 2). For a list of all detected direct repeats see Table 2, and for comparisons of selected examples from the sequences in variola virus Harvey and vaccinia virus WR see Fig. 4.

Fig. 4. Short nucleotide deletions in either vaccinia virus strain WR or variola virus strain Harvey that lie between short oligonucleotide direct repeats. In each case the top line represents variola virus strain Harvey and the bottom line vaccinia virus strain WR. Numbers refer to nucleotide numbering of vaccinia virus strain WR (G. L. Smith et al., 1991), or from the left end of the variola virus SstI-HindIII fragment (a to h) or HindIII I fragment (i to k). Short direct repeats are boxed. Panels: (a) SalL1R; (b) SalL6R; (c) SalL7R; (d) intergenic region SalL7R/SalL8L; (e) SalL9R; (f) intergenic region SalF3L/SalF4R; (g) SalF6R; (h) SalF10L; (i) SalF15R; (j) SalF17R; (k) HA.

Despite these differences many of the proteins of variola virus and vaccinia virus are extremely similar in length and amino acid sequence (Table 1); indeed ORF A30L is identical between the two viruses. Of the ORFs, 53% have ≥95% identity and 81% show ≥90% identity. For the other ORFs the lower percentage identity is attributable to frameshift mutations that link blocks of unrelated amino acids to amino acid sequences that are clearly related in the two viruses. If the comparison is restricted only to the regions from the two viruses that are in the same frame then the amino acid identity is ≥90%, similar to the other ORFs. Many of the ORFs with interesting amino acid similarities are highly conserved, although there are exceptions.

Homologies to enzymes

Within the region of variola virus sequenced there are five ORFs with homology to enzymes (3β-HSD, SOD, TmpK, DNA ligase and GmpK), three of which have been shown to be enzymatically active in vaccinia virus (3β-HSD, TmpK and DNA ligase). DNA ligase is very highly conserved between the two viruses with only 12 conservative changes in 552 residues (97.8% identity). These changes are non-randomly distributed and occur only in the N-terminal 334 amino acids, whereas the remaining 218 residues are identical. This C-terminal region includes the basic motif conserved in DNA ligases


Table 2. Deletions between direct repeats in either variola virus (Harvey) or vaccinia virus WR

Virus with deletion | Region/strain WR ORF | Effect | Repeat sequence | Copies | Position in variola virus
WR | 3' terminus of SalL1R | Loss of 27 bp, loss of three copies of NNY | AACAATTAT | 4 | 2662*
 | | | TTATAATT | 2 | 2697*
Harvey | 3' terminus of SalL6R | Loss of 12 bp, EQDH | GAACA | 2 | 5970*
Harvey | Within SalL7R | Loss of 23 bp, frameshift | CAAACGA | 2 | 6471*
Harvey | Intergenic region SalL7R/SalL8L | Loss of 24 bp, loss of T5NT motif | GAAAAAAA | 2 | 6903*
Harvey | Within SalL9R | Loss of 7 bp, stop codon now in-frame in Harvey | AGGAT | 2 | 8174*
 | | | GGATA | 2 |
WR | Within SalL9R | Loss of 4 bp | ATTA | 2 | 8391*
WR | Within SalL9R | Loss of 12 bp | AGTCC | 2 | 8708*
WR | Intergenic region SalF3L/SalF4R | Loss of 14 bp | ATATTTT | 4 | 10498*
Harvey | SalF5R | Loss of 3 bp, residue M 1 | ATG | 4 | 11040*
WR | SalF5R | Loss of 3 bp, residue G 92 | ATA | 2 | 11173*
Harvey | Intergenic region SalF5R/SalF6R | Loss of 5 bp | TATGC | 2 | 11629*
Harvey | Within SalF6R | Loss of 7 bp, frameshift | AAAAAGA | 2 | 11735*
WR | Within SalF6R | Loss of 6 bp | ATGAC | 2 | 11825*
WR | 3' terminus of SalF10L | Loss of 15 bp including strain Harvey termination codon. Strain WR has extra C-terminal GMGIVKYSE | CTGTAA | 2 | 14205*
WR | Within TmpK gene | Loss of 3 bp, residue E 165 | AGA | 2 | 850†
Harvey | Within SalF15R | Loss of 29 bp, frameshift puts a strain Harvey stop codon in frame | ATATC | 2 | 4703†
Harvey | SalF17R | Loss of 9 bp, residues GIR 353-355 | ATAGGCG | 2 | 5135†
Harvey | SalF17R | Loss of 3 bp, residue G 497 | AGG | 3 | 5560†
WR | Within HA gene | Loss of 6 bp, residues QI 23-24 | CAGA | 2 | 5874†
Harvey | Within HA gene | Loss of 15 bp, residues DKYDT 246-250 | TGATG | 2 | 6545†

* Nucleotide sequence numbering is from the left end of the 14.6 kb SstI-HindIII fragment, or † the left end of HindIII I fragment.
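
Whether a deletion inside an ORF shifts the reading frame depends only on whether the number of bases lost is a multiple of three. The sketch below applies that rule to the intragenic deletion sizes listed in Table 2; the list and function names are illustrative, and the rule is a simplification that ignores effects such as the loss of a stop codon (as in the 15 bp SalF10L deletion).

```python
# (virus carrying the deletion, ORF, base pairs lost) for the deletions that
# Table 2 places inside ORFs; the three intergenic deletions are left out.
INTRAGENIC_DELETIONS = [
    ("WR", "SalL1R", 27),
    ("Harvey", "SalL6R", 12),
    ("Harvey", "SalL7R", 23),
    ("Harvey", "SalL9R", 7),
    ("WR", "SalL9R", 4),
    ("WR", "SalL9R", 12),
    ("Harvey", "SalF5R", 3),
    ("WR", "SalF5R", 3),
    ("Harvey", "SalF6R", 7),
    ("WR", "SalF6R", 6),
    ("WR", "SalF10L", 15),
    ("WR", "TmpK", 3),
    ("Harvey", "SalF15R", 29),
    ("Harvey", "SalF17R", 9),
    ("Harvey", "SalF17R", 3),
    ("WR", "HA", 6),
    ("Harvey", "HA", 15),
]


def shifts_frame(bp_lost):
    """A deletion keeps the reading frame only if it removes whole codons."""
    return bp_lost % 3 != 0


def frameshifting(deletions):
    """Return the deletions whose length is not a multiple of three."""
    return [entry for entry in deletions if shifts_frame(entry[2])]


if __name__ == "__main__":
    for virus, orf, bp in frameshifting(INTRAGENIC_DELETIONS):
        print(f"{orf}: {bp} bp lost in {virus} -> frame shifted relative to the other virus")
```

The deletions flagged this way (23, 7, 4, 7 and 29 bp) include all those that Table 2 annotates as frameshifts or as bringing a stop codon into frame.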

from yeast, vaccinia virus and humans (Smith et al., 1989a; Lasko et al., 1990) and which is immunologically cross-reactive between vaccinia virus and mammalian DNA ligase I (Kerr et al., 1991). The predicted active site lysine (Tomkinson et al., 1991) and the surrounding residues are also unaltered in variola virus.

Variola virus TmpK is also very similar to the vaccinia virus enzyme, with only three conservative changes (98.0% identity) and an extra residue at position 165. The third homology to an enzyme involved in nucleic acid metabolism is that of GmpK which, like TmpK and DNA ligase, is very similar between the two viruses. In three strains of vaccinia virus this enzyme is predicted to be active only if the body of the ORF is joined to a 5'-terminal region that contains the ATP-binding domain, possibly by a translational frameshift (G. L. Smith et al., 1991). The same situation is observed in variola virus GmpK in which the 5'-terminal region upstream of the main ORF has a nucleotide sequence identical to that of vaccinia virus WR. Although the ORF is incompletely sequenced in variola virus, because it extends beyond the right end of the HindIII I fragment, the available data show only three conservative changes in 89 amino acid residues (96.6% identity). If the complete protein is expressed by translational frameshifting the mechanism is likely to be conserved in all the poxviruses for which this gene has been sequenced.

Of the other two enzymes (3β-HSD and SOD), SOD is highly conserved, being the same length as in vaccinia virus and showing 97.6% amino acid identity. Like that of vaccinia virus, the predicted variola virus protein lacks the two loops that protrude from the β-barrel structure and which are necessary for ion binding. Consequently, the protein is unlikely to have SOD activity. The 3β-HSD gene is fragmented into four pieces in variola virus and is therefore the least conserved of the variola virus enzyme genes. Although the individual fragments share a high degree of amino acid identity with the corresponding region of vaccinia virus 3β-HSD, the region is most unlikely to encode an active enzyme, owing to this fragmentation. The fragmentation of the variola virus 3β-HSD gene was surprising given that the vaccinia virus gene encodes an active enzyme which functions as a virus virulence factor (Moore & Smith, 1992). It would seem more logical for the virulent pathogen (variola virus) to have an intact virulence factor and for the vaccine (vaccinia virus) to have lost this.
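
The identity percentages quoted for DNA ligase and GmpK follow directly from the counts of changed residues. As a small worked check, assuming identity is simply the fraction of unchanged residues over the compared length (the function name is illustrative):

```python
def percent_identity(compared_residues, changes):
    """Percentage of compared residues that are unchanged."""
    return 100.0 * (compared_residues - changes) / compared_residues


if __name__ == "__main__":
    # DNA ligase: 12 changes in 552 residues.
    print(f"DNA ligase: {percent_identity(552, 12):.1f}%")
    # GmpK (incompletely sequenced): 3 changes in 89 residues.
    print(f"GmpK: {percent_identity(89, 3):.1f}%")
```

Rounded to one decimal place these give 97.8% and 96.6%, matching the figures in the text.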


Structural proteins

Three genes from within the region sequenced have been shown, in vaccinia virus, to encode structural proteins of the virus particle. These are the 14K fusion protein, the 85K HA and a group of three glycoproteins that all derive from the same gene, SalL4R, and show homology to C-type animal lectins. The 14K protein plays important roles in entry and exit of virus particles from cells and the high degree of conservation of the ORF in variola virus (96.4% identity) is consistent with similar important functional roles in this virus. The other two genes both encode glycoproteins that form part of the EEV outer envelope. Conservation of the variola virus and vaccinia virus HAs is less than that of the majority of other proteins found in this region (85.1%), but the distribution of these changes is non-random with the exposed N terminus, including the single immunoglobulin domain, being highly conserved (97.5% identity) whereas the region near the C-terminal membrane anchor is quite divergent. The SalL4R gene-encoded protein is extremely similar with only two substitutions (98.8% identity). If the gene products are shown to have lectin activity in vaccinia virus it is very likely that the same will be true for variola virus.

Discussion

This paper provides the first sizeable amount of sequence from variola virus and enables comparison of the ORFs from this virulent pathogen with those of the smallpox vaccine, vaccinia virus. There is considerable similarity between these orthopoxviruses, as would be expected from their genomic restriction maps (Mackett & Archard, 1979; Esposito & Knight, 1985) and their immunological cross-protection. At the nucleotide level there is 96% identity between the two viruses and 26 of the 32 ORFs show greater than 90% amino acid identity. These degrees of conservation are very similar to those reported for sequences around the variola virus and vaccinia virus TK genes (1275 nucleotides), which share 96.7% and 95.4% nucleotide and amino acid identity, respectively (Esposito & Knight, 1984).

In terms of antigens that may have contributed to the protection vaccinia virus evoked against variola virus, there are three genes of interest from this region. The 14K protein is present on the surface of infected cells and the INV particle, induces cell fusion and is important for virus egress. It is also immunologically conserved in several orthopoxviruses, is a target for neutralizing antibodies and can induce immunological protection against challenge with INV (Rodriguez et al., 1985, 1987; Rodriguez & Esteban, 1987; Rodriguez & Smith, 1990; Lai et al., 1991). The other two ORFs encode glycoproteins of the outer envelope of EEV. One is the HA, an 85K surface protein that belongs to the immunoglobulin superfamily (Jin et al., 1989), and is non-essential for virus growth in culture (Shida et al., 1987). The second is a protein related to C-type lectins that is required for plaque formation and encodes a triplet of glycoproteins of 22K to 24K that are present on the outer envelope of EEV (Duncan & Smith, 1992a). Antibodies to the EEV outer envelope proteins prevent long-range spread of virus in vitro [the anti-comet test (Appleyard et al., 1971)], and confer much better protection than antibodies to INV proteins (Appleyard et al., 1971; Boulter & Appleyard, 1973; Payne, 1980). Thus, immunity to these proteins and the products of gene B5R, another gene encoding EEV glycoproteins (Engelstad et al., 1992) which is highly conserved in variola virus (B. Aguado, I. P. Selmes, M. Engelstad & G. L. Smith, unpublished data), is likely to have made the major contribution to the protection vaccinia virus induced against smallpox.

Three types of differences between vaccinia virus and variola virus are noted. At the nucleotide level there are 20 instances of short deletions that lie between short direct repeats. These changes are predicted to cause a variety of changes to the encoded proteins or to their transcriptional regulation. Similar instances of repeated nucleotide sequences flanking regions deleted in either vaccinia virus strain WR or Copenhagen have been described (G. L. Smith et al., 1991). It seems likely that these short repeats contribute to orthopoxvirus genome variation by promoting either recombination, or polymerase stuttering or jumping. Similarly, in African swine fever virus, a distantly related large cytoplasmic DNA virus (Viñuela, 1985), there are multiple short direct repeats downstream of the gene encoding the p12 attachment protein in some isolates, but not in others (Angulo et al., 1992), and there are several instances of tandem short direct repeats that can give rise to length heterogeneity (Dixon et al., 1990).

A second difference is a 1.9 kbp deletion in the variola virus compared to the vaccinia virus genome. Heterogeneity in this region in other variola virus strains has been shown by restriction endonuclease site mapping (Esposito & Knight, 1985) and, interestingly, a highly attenuated form of vaccinia virus strain Ankara derived by 572 passages in culture has a deletion corresponding to this region (Meyer et al., 1991). The loss of ORF SalF16R is noteworthy because the predicted encoded protein shows homology to TNFR, although in vaccinia virus strains WR and Copenhagen the ORF is fragmented into three pieces (Howard et al., 1991; Upton et al., 1991). In leporipoxviruses the gene is contiguous and encodes a secretory protein that binds TNF and increases virus virulence (C. A. Smith et al., 1991; Upton et al., 1991).

The third and most remarkable difference between


variola virus and vaccinia virus is the disruption of seven variola virus ORFs into small fragments. For the vaccinia virus counterparts transcriptional mapping is available for three of these ORFs, SalF6R (Duncan & Smith, 1992b), SalF2R (S. A. Duncan & G. L. Smith, unpublished data) and SalF7L (Moore & Smith, 1992). For two of these ORFs (SalF2R and SalF7L) the gene product has been detected, and in one case (SalF7L) the protein has been shown to be an active 3β-HSD enzyme that synthesizes steroid hormones and contributes to virus virulence (Moore & Smith, 1992). The presence of a virulence factor in vaccinia virus but not in variola virus is counter-intuitive and remarkable. It is possible that these genes are not broken in variola virus DNA isolated directly from virions and that the breaks have been introduced during cloning of the virus restriction fragments and propagation in Escherichia coli. Although more comparative sequence data from other variola virus strains are required to address this possibility, it seems unlikely that this is the mechanism for four reasons. (i) These changes are clustered within a few genes (several genes have multiple breaks) and are not randomly spaced throughout the region sequenced. (ii) Deletions between short direct repeats, which contribute to gene disruptions, seem to occur equally in vaccinia virus and variola virus, and not only in one virus. (iii) Comparisons of vaccinia virus strains WR and Copenhagen show that these genomes are very similar and therefore reasonably stable after cloning in E. coli, thus the same would be expected for the variola virus genome, with 96% nucleotide identity. (iv) A similar situation has been reported for the variola virus homologue of the vaccinia virus host range gene K1L, which is fragmented in multiple variola virus strains including Harvey (Cowley & Greenaway, 1990).

Can one then conclude that these genes are not essential for variola virus growth? For in vitro propagation of the virus this is very likely to be the case, because these gene disruptions have probably occurred within the virus and not after cloning of its genome. However, it is also possible that these genes are required for efficient replication and pathogenesis in vivo and were mutated subsequent to the isolation of the virus from a soldier named Harvey who returned by convoy from Gibraltar in 1944 and started an outbreak of haemorrhagic-type smallpox in Middlesex, U.K. (Bradley et al., 1946; Downie & Dumbell, 1947). Following its isolation the virus was passaged 36 times on the chorioallantoic membrane of chick embryos prior to cloning of the virus genome (K. R. Dumbell, personal communication). The degree of nucleotide and amino acid variation within the broken ORFs is comparable to the degree of variation seen for the complete ORFs, implying that the fragmentation of these genes is not ancient. More comparative sequence data are needed from other variola virus strains which have a shorter and better characterized passage history since clinical isolation. Nonetheless it would, in our view, be remarkable if all the broken genes identified in this region are intact in other variola virus strains.

Assuming that the clones sequenced here are representative of virulent variola virus strains, why are so many genes broken? Perhaps some of the ORFs that are broken in variola virus but complete in vaccinia virus play a role in moderating virus virulence, so that the virus possessing complete ORFs is attenuated compared to the virus with defective genes. Thus it may have been the functional disruption of these genes that allowed variola virus to become such a virulent pathogen. There is a precedent for the loss of an orthopoxvirus gene resulting in increased virulence: vaccinia virus WR ORF B15R encodes a secretory, high affinity receptor for interleukin-1β that is not essential for virus replication in vitro, but loss of the gene causes a more rapid onset of symptoms and death in intranasally infected mice (Alcamí & Smith, 1992). Whether this is more generally true for other genes cannot be predicted, but the observed disruption of variola virus genes has important implications for researchers deleting vaccinia virus ORFs either individually or in combination and studying the consequences for virus replication in vitro and in vivo. If the genes deleted are ones that are already defective in variola virus, then the recombinant vaccinia virus may have a genetic structure more akin to that of variola virus. Nevertheless, the construction of a 'variola virus' from vaccinia virus is impossible owing to the different terminal restriction profiles showing sequences unique to variola virus. Instead the information gained from this type of comparison is helpful in indicating that some combinations of genes should not be deleted from vaccinia virus.

A prediction at the outset of this sequencing was that the variola virus genes for TNFR, SOD and GmpK might be functional for these activities and contribute to the greater virulence of variola virus, in contrast to the situation in vaccinia virus in which the genes are not likely to encode proteins with these activities (see Introduction). Surprisingly, in each case the variola virus ORF is extremely similar to the vaccinia virus counterpart (SOD and GmpK) or is absent from the comparable region of the variola virus genome (TNFR). Hence comparison of these provides no explanation for the increased virulence of variola virus.

What are the implications of these comparisons for orthopoxvirus evolution? The high nucleotide identity (96%) shows that variola virus and vaccinia virus are very closely related despite having dramatically different pathogenesis. Nevertheless, the number of differences in the ORFs, the 1.9 kbp deletion in variola virus and the


documented differences in the restriction patterns of the genomic termini suggest that one of these viruses has not recently been derived from the other. It seems more likely that the two viruses have evolved from a common ancestral orthopoxvirus and that for variola virus the evolution from this ancestor was accompanied by the disruption of some genes, and possible acquisition of others, that may have contributed to the enhanced virulence in man. The origin of vaccinia virus remains a mystery but some of the possible sources of the virus, such as its derivation by a simple recombination between cowpox virus and variola virus in recent times, or its derivation from variola virus by passage in cows (Baxby, 1981; Fenner, 1992), are made increasingly unlikely by the data presented here. More extensive sequence data from several orthopoxviruses are required to compile convincing evolutionary trees for these viruses.

The WHO has proposed that all variola virus DNA should be destroyed after the sequencing of several variola virus strains is complete. This proposal seems illogical given that there is complete identity between vaccinia and variola virus over hundreds of nucleotides and 96% identity overall in the region studied. Are vaccinia virus oligonucleotides and cloned DNA fragments, and DNA from other orthopoxviruses, which may in the future be shown to be equally or more related to variola virus, also to be destroyed? It seems more logical that the destruction of variola virus material should be restricted to the infectious virus.

In conclusion, we have presented the sequence of approximately 12% of the variola virus genome, compared this to that of vaccinia virus and demonstrated a remarkable degree of similarity between the two viruses. For the virion envelope glycoproteins this similarity will have contributed to the effectiveness of vaccinia virus as a live vaccine that eradicated smallpox. The most profound differences between the viruses are the fragmentation of seven variola virus ORFs and the partial or complete deletion of two others from a total of 32. A possible explanation for this surprising observation is that the ancestors of these fragmented genes encoded factors that moderated virus virulence and that by their loss variola virus became more pathogenic.

We thank Antonio Alcamí for advice and critical reading of the manuscript. This work was supported by the Medical Research Council and the Medical Research Fund of the University of Oxford. G.L.S. is a Lister Institute-Jenner Research Fellow.

References

ALCAMÍ, A. & SMITH, G. L. (1992). A soluble receptor for interleukin-1β encoded by vaccinia virus: a novel mechanism of virus modulation of the host response to infection. Cell 71, 153-167.
ANGULO, A., VIÑUELA, E. & ALCAMÍ, A. (1992). Comparisons of the sequence of the gene encoding African swine fever virus attachment protein p12 from field virus isolates and viruses passaged in tissue culture. Journal of Virology 66, 3869-3872.
APPLEYARD, G., HAPEL, A. J. & BOULTER, E. A. (1971). An antigenic difference between intracellular and extracellular rabbitpox virus. Journal of General Virology 13, 9-17.
BAXBY, D. (1981). Jenner's Smallpox Vaccine. The Riddle of the Origin of Vaccinia Virus. London: Heinemann.
BIGGIN, M. D., GIBSON, T. J. & HONG, G. F. (1983). Buffer gradient gels and 35S label as an aid to rapid DNA sequence determination. Proceedings of the National Academy of Sciences, U.S.A. 80, 3693-3695.
BOULTER, E. A. & APPLEYARD, G. (1973). Differences between extracellular and intracellular forms of poxviruses and their implications. Progress in Medical Virology 16, 86-108.
BRADLEY, W. H., DAVIES, J. O. F. & DURANTE, J. A. (1946). The outbreak of smallpox in Middlesex, 1944. British Medical Journal ii, 194-196.
COLINAS, R. J., GOEBEL, S. J., DAVIS, S. W., JOHNSON, G., NORTON, E. K. & PAOLETTI, E. (1990). A DNA ligase gene in the Copenhagen strain of vaccinia virus is nonessential for virus replication and recombination. Virology 179, 267-275.
COWLEY, R. & GREENAWAY, P. J. (1990). Nucleotide sequence comparison of homologous genomic regions from variola, monkeypox, and vaccinia viruses. Journal of Medical Virology 31, 267-271.
DAVISON, A. J. & MOSS, B. (1989). Structure of vaccinia virus late promoters. Journal of Molecular Biology 210, 771-784.
DIXON, L. K., BRISTOW, C., WILKINSON, P. J. & SUMPTION, K. J. (1990). Identification of a variable region of the African swine fever virus genome that has undergone separate DNA rearrangements leading to expansion of minisatellite-like sequences. Journal of Molecular Biology 216, 677-688.
DOWNIE, A. W. & DUMBELL, K. R. (1947). The isolation and cultivation of variola virus on the chorio-allantois of chick embryos. Journal of Pathology and Bacteriology 59, 189-198.
DUNCAN, S. A. & SMITH, G. L. (1992a). Identification and characterization of an extracellular envelope glycoprotein affecting vaccinia virus egress. Journal of Virology 66, 1610-1621.
DUNCAN, S. A. & SMITH, G. L. (1992b). Vaccinia virus gene SalF5R is non-essential for virus replication in vitro and in vivo. Journal of General Virology 73, 1235-1242.
ENGELSTAD, M., HOWARD, S. T. & SMITH, G. L. (1992). A constitutively expressed vaccinia virus gene encodes a 42 kDa glycoprotein related to complement control factors that forms part of the extracellular envelope. Virology 188, 801-810.
ESPOSITO, J. J. & KNIGHT, J. C. (1984). Nucleotide sequence of the thymidine kinase gene region of monkeypox and variola viruses. Virology 135, 561-567.
ESPOSITO, J. J. & KNIGHT, J. C. (1985). Orthopoxvirus DNA: a comparison of restriction profiles and maps. Virology 143, 230-251.
FENNER, F. (1992). Vaccinia virus as a vaccine, and poxvirus pathogenesis. In Recombinant Poxviruses, pp. 1-43. Edited by M. M. Binns & G. L. Smith. Boca Raton: CRC Press.
GOEBEL, S. J., JOHNSON, G. P., PERKUS, M. E., DAVIS, S. W., WINSLOW, J. P. & PAOLETTI, E. (1990). The complete DNA sequence of vaccinia virus. Virology 179, 247-266.
HAMILTON, A., KINCHINGTON, D., GREENAWAY, P. & DUMBELL, K. (1985). Recombinant bacterial plasmids containing inserts of variola DNA. Lancet ii, 1356-1357.
HOWARD, S. T., CHAN, Y. S. & SMITH, G. L. (1991). Vaccinia virus homologues of the Shope fibroma virus inverted terminal repeat proteins and a discontinuous ORF related to the tumour necrosis factor receptor family. Virology 180, 633-647.
HUGHES, S. J., JOHNSTON, L. H., DE CARLOS, A. & SMITH, G. L. (1991). Vaccinia virus encodes an active thymidylate kinase that complements a cdc8 mutant of Saccharomyces cerevisiae. Journal of Biological Chemistry 266, 20103-20109.
JENNER, E. (1798). An Enquiry into the Causes and Effects of Variolae Vaccinae, a Disease Discovered in some Western Counties of England, Particularly Gloucestershire, and Known By the Name of Cow Pox. London: Reprinted by Cassell, 1896.


JENNER, E. (1801). The Origin of the Vaccine Inoculation. London: D. N. Shury.
JIN, D., LI, Z., JIN, Q., YUWEN, H. & HOE, Y. (1989). Vaccinia virus hemagglutinin. A novel member of the immunoglobulin superfamily. Journal of Experimental Medicine 170, 571-576.
KERR, S. M. & SMITH, G. L. (1991). Vaccinia virus DNA ligase is nonessential for virus replication: recovery of plasmids from virus-infected cells. Virology 180, 625-632.
KERR, S. M., JOHNSTON, L. H., ODELL, M., DUNCAN, S. A., LAW, K. M. & SMITH, G. L. (1991). Vaccinia DNA ligase complements S. cerevisiae cdc9, localizes in cytoplasmic factories, and affects virulence and virus sensitivity to DNA damaging agents. EMBO Journal 10, 4343-4350.
KOTWAL, G. J. & MOSS, B. (1989). Vaccinia virus encodes two proteins that are structurally related to members of the plasma serine protease inhibitor superfamily. Journal of Virology 63, 600-606. Erratum, Journal of Virology 64, 966.
LAI, C., GONG, S. & ESTEBAN, M. (1991). The purified 14-kilodalton envelope protein of vaccinia virus produced in Escherichia coli induces virus immunity in animals. Journal of Virology 65, 5631-5635.
LASKO, D. D., TOMKINSON, A. E. & LINDAHL, T. (1990). Mammalian DNA ligases: biosynthesis and intracellular location of DNA ligase I. Journal of Biological Chemistry 265, 12618-12622.
MACKETT, M. & ARCHARD, L. C. (1979). Conservation and variation in Orthopoxvirus genome structure. Journal of General Virology 45, 683-701.
MEYER, H., SUTTER, G. & MAYR, A. (1991). Mapping of deletions in the genome of the highly attenuated vaccinia virus MVA and their influence on virulence. Journal of General Virology 72, 1031-1038.
MOORE, J. B. & SMITH, G. L. (1992). Steroid hormone synthesis by a vaccinia enzyme: a new type of virus virulence factor. EMBO Journal 11, 1973-1980.
PAYNE, L. G. (1980). Significance of extracellular enveloped virus in the in vitro and in vivo dissemination of vaccinia virus. Journal of General Virology 50, 89-100.
PAYNE, L. G. & KRISTENSSON, K. (1985). Extracellular release of enveloped vaccinia virus from mouse nasal epithelial cells in vivo. Journal of General Virology 66, 643-646.
PEARSON, W. R. & LIPMAN, D. J. (1988). Improved tools for biological sequence comparison. Proceedings of the National Academy of Sciences, U.S.A. 85, 2444-2448.
RODRIGUEZ, J. F. & ESTEBAN, M. (1987). Mapping and nucleotide sequence of the vaccinia virus gene that encodes a 14-kilodalton fusion protein. Journal of Virology 61, 3550-3554.
RODRIGUEZ, J. F. & SMITH, G. L. (1990). IPTG-dependent vaccinia virus: identification of a virus protein enabling virion envelopment by Golgi membrane and egress. Nucleic Acids Research 18, 5347-5351.
RODRIGUEZ, J. F., JANEZCKO, R. & ESTEBAN, M. (1985). Isolation and characterization of neutralizing monoclonal antibodies to vaccinia virus. Journal of Virology 56, 482-488.
RODRIGUEZ, J. F., PAEZ, E. & ESTEBAN, M. (1987). A 14,000-Mr envelope protein of vaccinia virus is involved in cell fusion and forms covalently linked trimers. Journal of Virology 61, 395-404.
ROSEL, J. L., EARL, P. L., WEIR, J. P. & MOSS, B. (1986). Conserved TAAATG sequence at the transcriptional and translational initiation sites of vaccinia virus late genes deduced by structural and functional analysis of the HindIII H genome fragment. Journal of Virology 60, 436-449.
SANGER, F., NICKLEN, S. & COULSON, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, U.S.A. 74, 5463-5467.
SHIDA, H. (1986). Nucleotide sequence of the vaccinia virus hemagglutinin gene. Virology 150, 451-462.
SHIDA, H., TOCHIKURA, T., SATO, T., KONNO, T., HIRAYOSHI, K., SEKI, M., ITO, Y., HATANAKA, M., HINUMA, Y., SUGIMOTO, M., TAKAHASHI-NISHIMAKI, F., MARUYAMA, T., MIKI, K., SUZUKI, K., MORITA, M., SASHIYAMA, H. & HAYAMI, M. (1987). Effect of the recombinant vaccinia viruses that express HTLV-I envelope gene on HTLV-I infection. EMBO Journal 6, 3379-3384.
SMITH, C. A., DAVIS, T., WIGNALL, J. M., DIN, W. S., FARRAH, T., UPTON, C., MCFADDEN, G. & GOODWIN, R. G. (1991). T2 open reading frame from Shope fibroma virus encodes a soluble form of the TNF receptor. Biochemical and Biophysical Research Communications 176, 335-342.
SMITH, G. L. & CHAN, Y. S. (1991). Two vaccinia virus proteins structurally related to interleukin-1 receptor and the immunoglobulin superfamily. Journal of General Virology 72, 511-518.
SMITH, G. L., CHAN, Y. S. & KERR, S. M. (1989a). Transcriptional mapping and nucleotide sequence of a vaccinia virus gene with extensive homology to DNA ligases. Nucleic Acids Research 17, 9051-9061.
SMITH, G. L., HOWARD, S. T. & CHAN, Y. S. (1989b). Vaccinia virus encodes a family of genes with homology to serine proteinase inhibitors. Journal of General Virology 70, 2333-2343.
SMITH, G. L., CHAN, Y. S. & HOWARD, S. T. (1991). Nucleotide sequence of 42 kbp of vaccinia virus strain WR from near the right inverted terminal repeat. Journal of General Virology 72, 1349-1376.
TAKAHASHI-NISHIMAKI, F., FUNAHASHI, S.-I., MIKI, K., HASHIZUME, S. & SUGIMOTO, M. (1991). Regulation of plaque size and host range by a vaccinia virus gene related to complement system proteins. Virology 181, 158-164.
TOMKINSON, A., TOTTY, N. F., GINSBURG, M. & LINDAHL, T. (1991). Location of the active site for enzyme-adenylate formation in DNA ligases. Proceedings of the National Academy of Sciences, U.S.A. 88, 400-404.
UPTON, C., MACEN, J. L., SCHREIBER, M. & MCFADDEN, G. (1991). Myxoma virus expresses a secreted protein with homology to the tumor necrosis factor receptor gene family that contributes to viral virulence. Virology 184, 370-382.
VIÑUELA, E. (1985). African swine fever virus. Current Topics in Microbiology and Immunology 116, 151-170.
WORLD HEALTH ORGANIZATION (1980). The global eradication of smallpox. Final Report of the Global Commission for the Certification of Smallpox Eradication. In History of International Public Health. Geneva: World Health Organization.
YUEN, L. & MOSS, B. (1987). Oligonucleotide sequence signaling transcriptional termination of vaccinia virus early genes. Proceedings of the National Academy of Sciences, U.S.A. 84, 6417-6421.

(Received 9 June 1992; Accepted 16 July 1992)
