Phylogeny & Systematics
Phylogeny
Descent with modification: evolutionary tree of the elephant family, based on fossil evidence.
i. Fossils near the surface are relatively recent, while those that are deeper are relatively older.
ii. Geologists have established a geologic time scale that reflects a consistent sequence of historical periods.
Those periods are grouped into four eras: Precambrian, Paleozoic, Mesozoic, and Cenozoic.
b. Absolute dating: age is given in years, instead of relative terms (before/after, early/late).
i. Radiometric dating is the measurement of radioactive isotopes found in fossils and rocks to determine age.
The half-life of an isotope is the number of years it takes for 50% of the original sample to decay.
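The half-life definition above can be turned into a small age calculation. The following is a minimal sketch, not part of the original slides: the function name `radiometric_age` is hypothetical, and the carbon-14 half-life of 5730 years is used only as an illustrative input.

```python
import math


def radiometric_age(remaining_fraction, half_life_years):
    """Estimate age from the fraction of the original isotope still present.

    After each half-life, 50% of what was left has decayed, so
    remaining_fraction = 0.5 ** (age / half_life_years).
    """
    if not 0 < remaining_fraction <= 1:
        raise ValueError("remaining_fraction must be in (0, 1]")
    # Number of elapsed half-lives = log2(1 / remaining_fraction)
    return half_life_years * math.log2(1 / remaining_fraction)


# Illustrative input: a sample retaining 25% of its carbon-14
# (half-life about 5730 years) is two half-lives old.
age = radiometric_age(0.25, 5730)
print(round(age))
```

A sample with half of its isotope left is exactly one half-life old; a quarter left means two half-lives, and so on.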
3. The fossil record is substantial, but does not provide a complete evolutionary history.
a. The fossil record usually tells us about abundant, widespread organisms with hard shells or skeletons.
4. Phylogeny has a biogeographic basis in continental drift.
a. Moving continents isolate populations, allowing for evolution to occur.
b. 250 million years ago all continents were connected as Pangaea.
c. Pangaea broke apart about 180 million years ago.
b. Permian extinction
i. 90% of marine species went extinct.
ii. Pangaea formed and some species began competing with each other for the first time.
iii. Mass extinction was caused by volcanic eruptions and climate changes.
c. Cretaceous extinction
i. Dinosaurs went extinct.
ii. An asteroid (or comet) hit the Earth and created a cloud of debris that blocked out sunlight for months. Temperatures dropped and plants died.
Post-impact foraminifera from the Tertiary Period. Only tiny, less ornate foraminifera survived; a few new species evolved.
Tektites--glassy material condensed from the hot vapor cloud produced by the impact--rained down and accumulated in a distinctive layer within the core (SEM image). Pre-impact foraminifera from the Cretaceous Period. Large, ornate foraminifera flourished.
Pace (1997) described a tree of life based on small-subunit rRNA sequences.
Pace, N. R. (1997) Science 276, 734-740
This tree shows the main three branches described by Woese and colleagues.
Chlamydiae
Fig. 1. Phylogeny of chlamydiae. 16S rRNA-based neighbor-joining tree showing the affiliation of environmental and pathogenic chlamydiae with major bacterial phyla. Arrow, to outgroup. Scale bar, 10% estimated evolutionary distance.
Eukaryotes
(Baldauf et al., 2000)
[Figure: forces shaping a species genome. Labels: Expansion (genesis, duplication, HGT), Deletion (loss), Exchange (HGT), Phylogeny. Panels: original version, actual version.]
Hurles M (2004) Gene Duplication: The Genomic Trade in Spare Parts. PLoS Biol 2(7): e206.
Orthologs: A1 vs A2 and B1 vs B2
[Figure: genes A1 and B1 in Species-1; genes A2 and B2 in Species-2.]
Molecular evolution
DNA yields more phylogenetic information than proteins. The nucleotide sequences of a pair of homologous genes have a higher information content than the amino acid sequences of the corresponding proteins, because mutations that result in synonymous changes alter the DNA sequence but do not affect the amino acid sequence. (However, amino-acid sequences are aligned more efficiently.)
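A short sketch of why synonymous changes carry information only at the DNA level. The codon table below is deliberately partial (four codons of the standard genetic code) and the two sequences are invented for illustration; the helper names are hypothetical.

```python
# Partial codon table (standard genetic code), enough for this example only.
CODON_TABLE = {
    "CTG": "L", "TTA": "L",  # two of the leucine codons
    "AAA": "K", "AAG": "K",  # the two lysine codons
}


def translate(dna):
    """Translate a DNA string whose length is a multiple of 3."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def count_differences(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


gene_1 = "CTGAAA"  # invented sequence: Leu-Lys
gene_2 = "TTAAAG"  # invented sequence: Leu-Lys via synonymous codons

dna_diffs = count_differences(gene_1, gene_2)
protein_diffs = count_differences(translate(gene_1), translate(gene_2))
print(dna_diffs, protein_diffs)
```

The two genes differ at three nucleotide positions, yet their translations are identical, so the protein comparison sees no change at all.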
[Figure: tree terminology (branches, external nodes) and example rooted trees for taxa A, B, C and D.]
Rooted trees: in rooted trees a single node is designated as a common ancestor, and a unique path leads from it through evolutionary time to any other node.
Unrooted trees only specify the relationship between nodes and say nothing about the direction in which evolution occurred. Roots can usually be assigned to unrooted trees through the use of an outgroup.
Number of possible rooted (NR) and unrooted (NU) trees for n sequences:
n     2    3    4    5     10
NR    1    3    15   105   34459425
NU    1    1    3    15    2027025
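The counts in this table follow the standard formulas for bifurcating trees, NR = (2n-3)!! and NU = (2n-5)!!, where k!! is the product of the odd numbers up to k. The sketch below reproduces them; the function names are hypothetical.

```python
def odd_double_factorial(k):
    """Product of the odd integers 1 * 3 * 5 * ... * k (1 if k < 1)."""
    result = 1
    for value in range(3, k + 1, 2):
        result *= value
    return result


def num_rooted_trees(n):
    """Number of rooted bifurcating trees for n >= 2 sequences."""
    return odd_double_factorial(2 * n - 3)


def num_unrooted_trees(n):
    """Number of unrooted bifurcating trees for n >= 2 sequences."""
    return odd_double_factorial(2 * n - 5)


for n in (2, 3, 4, 5, 10):
    print(n, num_rooted_trees(n), num_unrooted_trees(n))
```

The rapid growth of these numbers is why exhaustive evaluation of all trees becomes impractical beyond a small number of sequences.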
[Figure: a gene tree (genes B, C, D, E) drawn inside a species tree (species A to E), with speciation events marked.]
The two events, mutation and speciation, are not expected to occur at the same time, so a gene tree does not necessarily represent the species tree.
Methodology:
1- Multiple alignment; 2- Bootstrapping; 3- Consensus tree construction and evaluation.
If errors in indel placement are made in a multiple alignment then the tree reconstructed by phylogenetic analysis is unlikely to be correct.
Steps in multiple alignment
A- Pairwise alignment. Example: 4 sequences, A, B, C, D. 6 pairwise comparisons, then cluster analysis on similarity.
B- Multiple alignment following the tree from A. Align the most similar pair first; gaps are inserted to optimise the alignment.
[Figure: guide tree with leaf order B, D, A, C.]
Procedure
An efficient procedure consists of aligning the amino-acid sequences and using the resulting alignment as a template for the corresponding nucleotide sequences. Alignment is then guaranteed at the codon level.
1. Align a family of protein sequences using ClustalW.
2. Align the corresponding DNA sequences, using as a template the amino-acid alignment obtained in step 1.
Note: remove from the multiple alignment any gap columns common to the majority of the sequences considered.
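Step 2 of the procedure above can be sketched as follows. This is an illustrative helper, not the tool used in the course: the name `codon_align` is hypothetical, and it assumes the unaligned DNA codes exactly for the ungapped protein (no stop codon, no frameshift).

```python
def codon_align(aligned_protein, dna):
    """Thread a coding DNA sequence onto its aligned protein sequence.

    Each residue takes the next codon; each gap '-' becomes '---',
    so gaps never split a codon.
    """
    residues = len(aligned_protein.replace("-", ""))
    if len(dna) != 3 * residues:
        raise ValueError("DNA length must be 3 x the number of residues")

    pieces = []
    position = 0  # index of the next unused nucleotide
    for symbol in aligned_protein:
        if symbol == "-":
            pieces.append("---")
        else:
            pieces.append(dna[position:position + 3])
            position += 3
    return "".join(pieces)


# Invented example: protein 'MK' aligned as 'M-K'
print(codon_align("M-K", "ATGAAA"))
```

Because every protein gap expands to three nucleotide gaps, all sequences threaded onto the same protein alignment stay in register at the codon level.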
Parsimony
The concept of parsimony is at the heart of all character-based methods of phylogenetic reconstruction.
2- the more unlikely events a model invokes, the less likely the model is to be correct. As a result, the relationship that requires the fewest mutations to explain the current state of the sequences being considered is the relationship that is most likely to be correct.
Parsimony
Informative and Uninformative Sites: for a parsimony approach, the positions of a multiple sequence alignment fall into two categories in terms of their information content: those that have information (are informative) and those that do not (are uninformative).
Example (four sequences, six positions):
Position   Seq 1   Seq 2   Seq 3   Seq 4
1          G       G       G       G
2          G       G       G       A
3          G       G       A       T
4          G       A       T       C
5          G       G       A       A
6          G       T       G       T
In general, for a position to be informative regardless of how many sequences are aligned, it has to have at least 2 different nucleotides, and each of these nucleotides has to be present at least twice.
Position 1 is said to be invariant and is therefore uninformative, because all trees invoke the same number of mutations (0). Position 2 is uninformative because 1 mutation occurs in all three possible trees. The same holds for position 3, where 2 mutations occur, and for position 4, which requires 3 mutations in all possible trees.
Positions 5 and 6 are informative, because one of the trees invokes only one mutation and the other 2 alternative trees both require 2 mutations.
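The rule stated above (at least two different nucleotides, each present at least twice) is easy to check mechanically. The sketch below applies it to the six example positions; the function name `is_informative` is hypothetical.

```python
from collections import Counter


def is_informative(column):
    """Return True if an alignment column is parsimony-informative.

    A column is informative when at least two different nucleotides
    each occur in at least two sequences.
    """
    counts = Counter(column)
    frequent = [base for base, count in counts.items() if count >= 2]
    return len(frequent) >= 2


# The six example positions, each written as one column (sequences 1 to 4).
columns = ["GGGG", "GGGA", "GGAT", "GATC", "GGAA", "GTGT"]

for position, column in enumerate(columns, start=1):
    print(position, column, is_informative(column))
```

Only positions 5 and 6 pass the test, in agreement with the mutation counts given for the three possible trees.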
Krane &
[Figure: the three possible unrooted trees for sequences 1 to 4 at each of the six positions, with the inferred nucleotide changes marked on the branches.]
Maximum likelihood
This approach is purely statistical. Probabilities are considered for every individual nucleotide substitution in a set of aligned sequences.
Example: since transitions (exchanging a purine for a purine or a pyrimidine for a pyrimidine) are observed roughly 3 times as often as transversions (exchanging a purine for a pyrimidine or vice versa), it can reasonably be argued that, at a given site, the sequences with C and T are more likely to be closely related to each other than either is to the sequence with G.
Still, objective criteria can be applied to calculating the probability for every site and for every possible tree that describes the relationships of the sequences in a multiple alignment.
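The transition/transversion distinction used in the example can be written down directly. This is only a classification sketch, not a likelihood calculation; the function name is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(base_a, base_b):
    """Classify a nucleotide pair as identical, transition or transversion."""
    if base_a == base_b:
        return "identical"
    both_purines = base_a in PURINES and base_b in PURINES
    both_pyrimidines = base_a in PYRIMIDINES and base_b in PYRIMIDINES
    if both_purines or both_pyrimidines:
        return "transition"
    return "transversion"


# The three nucleotides from the example above: C, T and G.
for pair in (("C", "T"), ("C", "G"), ("T", "G")):
    print(pair, substitution_type(*pair))
```

C to T is the only transition among the three pairs, which is why the sequences carrying C and T are the ones argued to be more closely related.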
Distance matrix methods (NJ, ...)
Convert sequence data into a set of discrete pairwise distance values, arranged into a matrix. Distance methods fit a tree to this matrix.
D_{i,j} = the distance between sequences i and j;
d_{i,j} = the sum of branch lengths on the tree path from i to j.
The phylogeny estimates the distance for each pair as the sum of branch lengths on the path from one sequence to the other through the tree. A measure of how close the tree is to D is given by the least-squares criterion:
\sum_{i,j} \frac{(D_{i,j} - d_{i,j})^2}{D_{i,j}^2}
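A minimal sketch of the least-squares criterion, assuming the observed distances D and the tree path lengths d are supplied as dictionaries keyed by sequence pair. The function name and the numeric values are invented for illustration.

```python
def least_squares_fit(observed, tree_path):
    """Sum over pairs of (D_ij - d_ij)^2 / D_ij^2.

    observed  -- dict mapping a pair (i, j) to the matrix distance D_ij
    tree_path -- dict mapping the same pairs to the tree path length d_ij
    """
    total = 0.0
    for pair, big_d in observed.items():
        small_d = tree_path[pair]
        total += (big_d - small_d) ** 2 / big_d ** 2
    return total


# Invented distances for three sequences A, B, C.
D = {("A", "B"): 2.0, ("A", "C"): 4.0, ("B", "C"): 4.0}
d = {("A", "B"): 1.0, ("A", "C"): 4.0, ("B", "C"): 4.0}

print(least_squares_fit(D, d))
```

A value of zero means the tree reproduces the distance matrix exactly; here only the A-B pair contributes, giving (2 - 1)^2 / 2^2.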
The tree topology is constructed using a cluster analysis method (such as the NJ method).
Advantages:
1. easy to perform;
2. fast calculation;
3. suited to sequences with high similarity scores.
Drawbacks:
1. all sites are generally treated equally (differences in substitution rates are not taken into account);
2. not applicable to distantly related sequences;
3. some of the information is lost, particularly that pertaining to the identities of the ancestral and derived nucleotides at each position in the
There is at present no statistical method allowing comparisons of trees obtained from different phylogenetic methods; nevertheless many attempts have been made to compare the relative consistency of the existing methods. The consistency depends on many factors, including the topology and branch lengths of the real tree, the transition/transversion rate and the variability of the substitution rates. In practice, one infers phylogeny between sequences which do not generally meet the specified hypothesis.
One expects that if sequences have strong phylogenetic relationships, different methods will result in the same phylogenetic tree.
At present, only sampling techniques allow the topology of a phylogenetic tree to be tested:
Bootstrapping It consists of drawing columns from a sample of aligned sequences, with replacement, until one gets a data set of the same size as the original one (usually some columns are sampled several times and others left out).
Bootstrapping
Constructs a new multiple alignment at random from the real alignment, with the same size. Note that the same column can be sampled more than once, and consequently some columns are not sampled.
[Figure: an example alignment and bootstrap alignments built from it by resampling columns.]
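The column resampling described above can be sketched in a few lines. The function name `bootstrap_alignment` is hypothetical and the three short sequences are used only as illustrative input; a fixed seed is passed so the run is repeatable.

```python
import random


def bootstrap_alignment(sequences, rng):
    """Build one bootstrap replicate of an alignment.

    Columns are drawn with replacement until the replicate has as many
    columns as the original, so some columns appear several times and
    others not at all.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")
    chosen = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[col] for col in chosen) for seq in sequences]


alignment = ["ATAGCCATA", "ATACCCATG", "ATACCCATA"]  # illustrative input
replicate = bootstrap_alignment(alignment, random.Random(0))
for row in replicate:
    print(row)
```

In practice many replicates (for example 100) are generated, a tree is built from each, and the frequency with which a grouping recurs is reported as its bootstrap support.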
Methodology
1. Consider the set of sequences to analyse; 2. Align these sequences "properly"; 3. Apply tree-building methods; 4. Statistically evaluate the resulting phylogenetic tree.
In practice: 1. Multiple alignment; 2. Bootstrapping (100 samples); 3. Apply tree-building methods; 4. Consensus tree construction and evaluation.
B. Systematics: connecting classification to phylogeny
Systematics: the study of biological diversity in an evolutionary context, including taxonomy and phylogenetics.
1. Taxonomy uses a hierarchical classification system
a. Review the Linnaean (binomial) system of classification: genus and species.
b. Review hierarchical classification: Kingdom, Phylum, Class, Order, Family, Genus, Species.
- A named taxonomic unit at any level is called a taxon.
c. Phylogenetic trees are used to bring different taxonomic schemes together and to show the connection between classification and phylogeny.
2. Modern phylogenetic systematics is based on cladistic analysis
a. A phylogenetic diagram (tree) is also called a cladogram.
b. Each branch in the tree is called a clade.
c. Monophyletic pertains to a taxon that is derived from a single ancestral species. This is the only legitimate cladogram type.
d. Polyphyletic pertains to a taxon whose members were derived from two or more ancestors not common to all members.
e. Paraphyletic pertains to a taxon that excludes some members that share a common ancestor with members included in the taxon.
3. Constructing cladograms
a. Identify homologies: shared characteristics derived from one ancestor.
NOTE: Analogous structures may look similar to one another, but are not derived from a common ancestor. These are in contrast to homologous structures.
[Figure: example of an analogous structure in two distantly related plants.]
When two organisms have analogous structures, this is an example of convergent evolution: the independent development of similarity between species due to similar selection pressures.
b. When constructing a cladogram, the greater the number of homologous parts between two organisms, the more closely related they are. c. The classification scheme must reflect these similarities.
1. Select the species for which you want to make a cladogram. These are called the ingroup. They have shared primitive and derived characters.
2. Select an outgroup: a species that is closely related to the species under study. The outgroup has a shared primitive character that is common to all species.
3. Construct a character table and tabulate the data. The more shared characters, the more closely related the species are.
4. Construct a cladogram based on the number of shared characters.
For example: Figure 25.11 (p. 497), Constructing a cladogram. The outgroup here, the lancelet, has a notochord, the shared primitive character. The ingroup is five vertebrates.
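Steps 3 and 4 can be sketched with a small character table. The taxa and characters below are illustrative values chosen for this sketch, not a copy of the table in Figure 25.11, and the function name is hypothetical. Characters present in the outgroup are treated as shared primitive and are not counted.

```python
# Illustrative character table: each taxon maps to the characters it has.
CHARACTER_TABLE = {
    "lancelet": {"notochord"},  # outgroup
    "lamprey": {"notochord", "vertebral column"},
    "tuna": {"notochord", "vertebral column", "hinged jaws"},
    "leopard": {"notochord", "vertebral column", "hinged jaws",
                "four walking legs", "amniotic egg", "hair"},
}
OUTGROUP = "lancelet"


def shared_derived_count(taxon_a, taxon_b):
    """Count characters shared by two taxa but absent from the outgroup."""
    shared = CHARACTER_TABLE[taxon_a] & CHARACTER_TABLE[taxon_b]
    return len(shared - CHARACTER_TABLE[OUTGROUP])


print(shared_derived_count("tuna", "leopard"))
print(shared_derived_count("lamprey", "leopard"))
```

The pair sharing more derived characters (tuna and leopard here) would be placed closer together on the cladogram than the pair sharing fewer.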
5. The principle of parsimony helps systematists reconstruct phylogeny
a. Phylogenies can be extremely complicated.
b. The principle of parsimony states that a theory about nature should be the simplest explanation that is consistent with the facts.
- Keep it simple.
- Sometimes called Occam's Razor.
c. A phylogenetic tree is a hypothesis. There may be many possible trees, but the simplest one is probably the most accurate.
[Figure: schematic structures of seven genes. Labels: HSE, TATA, +1, Transit Peptide, Met-rich, con I; segment lengths 100bp, 100bp, 390bp, 436bp, ?bp, 390bp, 421bp.]
[Figure: multiple alignment of the corresponding protein sequences, alignment positions 1 to 265, with a consensus line beneath each block.]