Globin Gene & Molecular Clock PDF
MOLECULAR EVIDENCES OF
EVOLUTION
Dr. Simran Luthra
EVOLUTIONARY BIOLOGY
CORE COURSE XIV THEORY (CREDITS 4)
Unit 1: 7
Life’s Beginnings: Chemogeny, RNA world, Biogeny, Origin of photosynthesis, Evolution of eukaryotes
Unit 2: 4
Historical review of evolutionary concept: Lamarckism, Darwinism, Neo-Darwinism
Unit 3: 10
Evidences of Evolution: Fossil record (types of fossils, transitional forms, geological time scale, evolution of horse); Molecular
(universality of genetic code and protein synthesising machinery, three domains of life, neutral theory of molecular evolution,
molecular clock, example of globin gene family, rRNA/cyt c)
Unit 4: 8
Sources of variations: Heritable variations and their role in evolution
Unit 5: 13
Population genetics: Hardy-Weinberg Law (statement and derivation of equation, application of law to human population); Evolutionary forces
upsetting H-W equilibrium.
Natural selection (concept of fitness, selection coefficient, derivation of one unit of selection for a dominant allele, genetic load, mechanism of
working, types of selection, density-dependent selection, heterozygous superiority, kin selection, adaptive resemblances, sexual selection).
Genetic drift (mechanism, founder’s effect, bottleneck phenomenon); Role of migration and mutation in changing allele frequencies
Unit 6: 7
Product of evolution: Micro evolutionary changes (inter-population variations, clines, races); Species concept, Isolating mechanisms, modes of
speciation (allopatric, sympatric); Adaptive radiation / macroevolution (exemplified by Galapagos finches)
Unit 7: 2
Extinctions: background and mass extinctions (causes and effects), detailed example of K-T extinction
Unit 8: 6
Origin and evolution of man, Unique hominin characteristics contrasted with primate characteristics, primate phylogeny from Dryopithecus
leading to Homo sapiens, molecular analysis of human origin
Unit 9: 2
Phylogenetic trees, Multiple sequence alignment, construction of phylogenetic trees, interpretation of trees
SUGGESTED BOOKS
1. Ridley, M. Evolution. III Edition.
2. Hall, B.K. and Hallgrimsson, B. Evolution. IV Edition.
3. Campbell, N.A. and Reece J.B. Biology. IX Edition
4. Douglas, J. Futuyma. Evolutionary Biology.
5. Pevsner, J. Bioinformatics and functional genomics. II Edition.
17-02-2020
Molecular evolution: changes in DNA (RNA) of chromosomes which take place over
the history of a species and distinguish the species from its ancestors.
Studied through evolutionary or molecular systematics or phylogenetics.
MULTIGENE FAMILIES
Two (or more) identical genes present on the same chromosome are described
as nonallelic copies.
Nonfunctional genes are defined as such by their inability to code for proteins; the
reasons for inactivity vary, and the deficiencies may be in transcription or translation (or
both). They are called pseudogenes and given the symbol Ψ.
A processed pseudogene is an inactive gene copy that lacks introns, contrasted with the
interrupted structure of the active gene. Such genes presumably originate by reverse
transcription of mRNA and insertion of a duplex copy into the genome.
An alternative fate for gene duplications is for both copies to remain functional, while diverging
in their sequence and pattern of expression and taking on different roles.
This process of “duplication and divergence” almost certainly explains the presence of large
families of genes with related functions in biologically complex organisms, and it is thought to
play a critical role in the evolution of increased biological complexity.
The unmistakable homologies in amino acid sequence and structure among the present-
day globins indicate that they all must derive from a common ancestral gene.
MOLECULAR CLOCK
• Molecules can estimate the date of common ancestors for which no fossils are known (filling in gaps
in the fossil record), for example in invertebrates, bacteria, and viruses.
• Molecules can estimate divergence dates when there is no obvious morphological change
(particularly important for microorganisms).
• It also allows us to estimate the timing of events that are too recent to be resolved by fossil
evidence, such as divergences among conspecific populations.
Zuckerkandl & Pauling (1962) noted that the number of amino acid
differences between animal haemoglobins was proportional to species
divergence time, as defined by the fossil record. Thus sequences seem to
provide an accurate evolutionary molecular clock.
Rates of amino acid substitution in three proteins: fibrinopeptides, hemoglobin, and cytochrome c. The number of amino acid differences (per
100 residues, and corrected for multiple changes at the same residue) is plotted for comparisons between various mammals, birds and reptiles,
mammals and reptiles, reptiles and fish, carp and lamprey, and vertebrates and insects, all lineages of organisms for which fossil data provided
estimates of the time since divergence. Note that while some comparisons of the three proteins are from the same pair of organisms and time
of divergence, the three proteins are evolving at very different rates (i.e., fibrinopeptide the fastest, and cytochrome c the slowest). In addition,
the rough linearity of the rate of accumulation of molecular divergence with time illustrates the molecular clock concept.
The EARLIER in the past an ancestral stock diverged into present-day
species, the MORE CHANGES have accumulated in the amino acid sequences of the
proteins of the two contemporary species.
MOLECULAR CLOCK
Highly conserved proteins (cytochrome c, haemoglobin) provide the best
molecular clocks for:
• determining the rate of evolution
• tracing evolutionary relationships between different groups.
The earlier in the past an ancestral stock diverged into present-day
species, the more changes have accumulated in the amino acid sequences of the
proteins of the two contemporary species.
Number of amino acid modifications in the line of descent can be used as a
measure of time of divergence of two species from a common ancestor.
E.g. cytochrome c from horses and other mammals differs in 5.1 amino acids.
Since horses diverged from other mammals about 90 million years ago, it
means on an average one amino acid substitution has occurred every 17.6
million years (90/5.1= 17.6).
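The arithmetic in the cytochrome c example can be checked with a short sketch. The function name `my_per_substitution` is only illustrative, and the inputs are the figures quoted above (90 million years, 5.1 amino acid differences).

```python
def my_per_substitution(divergence_time_my, n_differences):
    """Average time, in millions of years (My), per amino acid substitution."""
    if n_differences <= 0:
        raise ValueError("n_differences must be positive")
    return divergence_time_my / n_differences


# Figures quoted in the slide: 90 My of separation, 5.1 amino acid differences
print(round(my_per_substitution(90, 5.1), 1))  # 17.6
```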
Each protein has a distinct rate of evolution depending upon how important its function is.
Molecules under less functional constraint evolve faster, in terms of mutant substitutions,
than molecules subject to stronger constraint.
For example, histones bind DNA in chromosomes and regulate DNA activity. Thus, a histone's structure
is strictly defined because its ability to bind DNA depends upon its particular structure and shape. The
103 amino acids in this protein are identical for nearly all plants and animals.
Fibrinopeptides can perform their role in blood clotting with almost any amino acid change. The 20
amino acids in this protein differ by 86% between a horse and a human. Fibrinopeptides exhibit a very
fast rate of change because they are subject to less functional constraint.
Therefore, when doing a study, it is essential to select a molecule that is appropriate to the time span
of interest.
Rates differ even within a molecule:
16S rRNA has conserved and variable regions.
The nucleotide sequence of a coding region can be divided into potential replacement sites and silent
sites:
• At replacement sites, a mutation alters the amino acid that is coded. The effect of the
mutation (deleterious, neutral, or advantageous) depends on the result of the amino
acid replacement.
• At silent sites, mutation only substitutes one synonym codon for another, so there is no
change in the protein.
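The distinction between silent and replacement changes can be illustrated with a minimal sketch that looks up a codon before and after a single-base change in the standard genetic code. The names `CODON_TABLE` and `classify_substitution` are illustrative only, and the sketch assumes DNA codons written with the letters T, C, A and G.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, position, new_base):
    """Return 'silent' if the encoded amino acid is unchanged, else 'replacement'.

    position is 0, 1 or 2 (first, second or third base of the codon).
    """
    codon = codon.upper()
    mutant = codon[:position] + new_base.upper() + codon[position + 1:]
    if CODON_TABLE[codon] == CODON_TABLE[mutant]:
        return "silent"
    return "replacement"


print(classify_substitution("GGT", 2, "C"))  # silent: GGT and GGC both code for Gly
print(classify_substitution("GAG", 1, "T"))  # replacement: Glu (GAG) becomes Val (GTG)
```

In this sketch a change that creates a stop codon is also reported as a replacement, which is a simplification.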
The rate of divergence can be measured as the percent difference per million
years, or as its reciprocal, the unit evolutionary period (UEP), the time in
millions of years that it takes for 1% divergence to develop.
There is an average divergence of 10% in the replacement sites of either the α- or β-globin genes of
mammals that have been separated since the mammalian radiation occurred ~85 million years
ago. This corresponds to a replacement divergence rate of 0.12% per million years.
Average replacement divergence between corresponding mammalian and chicken globin genes is
23%. Relative to a separation ~270 million years ago, this gives a rate of 0.09% per million years.
The α- and β-globin genes themselves separated >500 million years ago. They have an average
replacement divergence of 50%, which gives a rate of 0.1% per million years.
From the slope of divergence against time of separation: average rate of ~0.096% per million years (or a UEP of 10.4).
The difference between the human β and δ genes is 3.7% for replacement sites. At a UEP of 10.4,
these genes must have diverged 10.4 × 3.7 ≈ 40 million years ago, about the time of the separation
of the lines leading to New World monkeys, Old World monkeys, great apes, and man. All of these
higher primates have both β and δ genes, which suggests that the gene divergence commenced just
before this point in evolution.
The divergence between the γ and ε genes is 10%, which corresponds to a time of separation ~100 million years ago.
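The globin figures above can be reproduced with a small sketch. The function names are illustrative, and the numbers are the ones quoted in the slide. The product 10.4 × 3.7 works out to about 38.5 My, which the slide rounds to roughly 40 My.

```python
def divergence_rate(percent_divergence, time_my):
    """Rate of divergence in percent per million years."""
    return percent_divergence / time_my


def unit_evolutionary_period(rate_percent_per_my):
    """UEP: millions of years needed for 1% divergence (reciprocal of the rate)."""
    return 1.0 / rate_percent_per_my


def divergence_time(percent_divergence, uep_my):
    """Estimated time since separation, in millions of years."""
    return percent_divergence * uep_my


# (percent replacement divergence, time of separation in My), as quoted above
comparisons = {
    "mammalian globins": (10, 85),
    "mammal vs. chicken": (23, 270),
    "alpha vs. beta": (50, 500),
}
for name, (percent, time_my) in comparisons.items():
    print(name, round(divergence_rate(percent, time_my), 2))

print(round(unit_evolutionary_period(0.096), 1))  # UEP for the rate read from the slope
print(round(divergence_time(3.7, 10.4), 1))       # beta vs. delta
print(round(divergence_time(10, 10.4), 1))        # gamma vs. epsilon
```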
Calculate the number of differences (e.g., in base pairs) that have accrued among pairs of
species since their common ancestor.
The average rate of base pair substitution in any lineage can be estimated if we have an
estimate of the absolute time of divergence.
• For example, the oldest fossils of cercopithecoid monkeys are dated at 25 million years
(my) ago, providing a minimal estimate of time since divergence between the rhesus
monkey and the hominoids.
• The number of substitutions per base pair per million years for the rhesus monkey lineage
is 457/10,000 base pairs sequenced/25 My = 1.83 × 10^-3 per My, or 1.83 × 10^-9 per year.
From this common ancestor to Homo, the average rate has been 310/10,000/25 = 1.24 ×
10^-3 per My.
• The average rate at which substitutions have occurred in each lineage is therefore (457 +
310)/2 = 383.5 substitutions/10,000 base pairs/25 My, or 1.534 × 10^-3 per My.
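As a check on these rates, here is a minimal sketch using the counts quoted above (457 and 310 substitutions, 10,000 base pairs, 25 My). The function name `substitution_rate` is illustrative.

```python
def substitution_rate(n_substitutions, n_sites, time_my):
    """Substitutions per base pair per million years along one lineage."""
    return n_substitutions / n_sites / time_my


rhesus_rate = substitution_rate(457, 10_000, 25)  # rhesus monkey lineage
homo_rate = substitution_rate(310, 10_000, 25)    # lineage leading to Homo
mean_rate = (rhesus_rate + homo_rate) / 2         # average per lineage

print(f"{rhesus_rate:.3e} per My")        # 1.828e-03
print(f"{homo_rate:.3e} per My")          # 1.240e-03
print(f"{mean_rate:.3e} per My")          # 1.534e-03
print(f"{rhesus_rate / 1e6:.3e} per year")  # per My converted to per year
```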
Suppose the proportion of base pairs that differ between the globin pseudogene sequences of
two primate species is 0.0256.
D= 2rt
where D is the proportion of base pairs that differ between the two sequences,
r is the rate of divergence per base pair per My,
t is the time (in My) since the species' common ancestor,
the factor 2 represents the two diverging lineages.
If D = 0.0256 and
r = 0.001534, as estimated from the earlier data,
then t = D/2r = 0.0256/(2 × 0.001534) = 8.3.
So 8.3 My is our best estimate of when the two species diverged from their common ancestor.
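The relation D = 2rt can be rearranged and evaluated in a few lines. This is a sketch with an illustrative function name, using the values of D and r given above.

```python
def time_since_divergence(d, r):
    """Solve D = 2rt for t.

    d: proportion of base pairs that differ between the two sequences
    r: rate of divergence per base pair per My along one lineage
    Returns t in millions of years.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    return d / (2 * r)


t = time_since_divergence(0.0256, 0.001534)
print(round(t, 1))  # 8.3
```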
1. Male-driven evolution: more cell divisions in the male germ line than in the female, leading to faster evolution of the Y chromosome than of the X chromosome (in mammals).
2. The mtDNA of warm-blooded animals seems to evolve faster than that of cold-blooded ones.
• Cytochrome c is an enzyme whose function depends most critically on a few amino acid residues
at the active site, which bind to its heme cofactor.
• Consequently, these active site residues rarely vary, even though amino acids around them
change.
• Of 104 residues, only Cys-17, His-18 and Met-80 are totally invariant.
• In other places variation is low; large, nonpolar, amino acid residues always fill positions 35 and
36.
• Although cytochrome c molecules may vary by as many as 88% of their residues, they retain the same 3-D
conformation.
NEUTRAL THEORY
In 1968, Motoo Kimura proposed the neutral theory of molecular evolution
(it applies at the molecular level, not the organism level: it pertains to nucleotide substitutions in DNA and amino acid substitutions in proteins).
The molecular clock hypothesis was based on empirical observations, but it
soon received theoretical backing when biologist Motoo Kimura developed the
neutral theory of molecular evolution in 1968.
Kimura suggested that a large fraction of new mutations do not have an effect
on evolutionary fitness, so natural selection would neither favor nor disfavor
them.
Eventually, each of these neutral mutations would either spread throughout a
population and become fixed in all of its members, or they would be lost
entirely in a stochastic process called genetic drift.
Neutral theory applies to molecules and not to the phenotypes of organisms.
Kimura observed that the total rate of nucleotide substitution in a mammalian genome far
exceeds the upper limit of adaptive evolution suggested by Haldane which led him to propose
that most genetic changes have been fixed by random drift rather than positive Darwinian
selection.
He further stated that the large amounts of fruit fly and human genetic polymorphism
discovered by using protein gel electrophoresis are consistent with the hypothesis that natural
polymorphisms are largely neutral.
King and Jukes also suggested that most evolutionary changes in DNA sequence are neutral.
2. Molecular clock
The neutral theory of molecular evolution suggests that most of the genetic variation in
populations is the result of mutation and genetic drift and not selection.
FATE OF MUTATIONS
The evolution of living organisms is the consequence of two processes:
• genetic variability generated by mutations, which continuously arise within populations.
• changes in the frequency of alleles within populations over time.
A. Natural selection: partly determines the fate of those mutations that affect the fitness of their carrier
1. New alleles that confer a higher fitness tend to increase in frequency over time until they reach fixation, thus
replacing the ancestral allele in the population. This evolutionary process is called positive or directional
selection.
2. Conversely, new mutations that decrease the carrier's fitness tend to disappear from populations through a
process known as negative or purifying selection.
3. Finally, it may happen that a mutation is advantageous only in heterozygotes but not in homozygotes. Such
alleles tend to be maintained at an intermediate frequency in populations by way of the process known as
balancing selection.
B. Genetic Drift: consider a theoretical population in which all individuals, or genotypes, have exactly the same
fitness. In this situation, natural selection does not operate, because all genotypes have the same chance to
contribute to the next generation. Given that populations do not grow infinitely and that each individual produces
many gametes, it follows that only a fraction of the gametes that are produced will succeed in developing into adults.
Thus, in each generation, allelic frequencies may change simply as a consequence of this random process of
gamete sampling. This process is called genetic drift.
The difference between genetic drift and natural selection is that changes in allele frequency caused by genetic
drift are random, rather than directional.
Ultimately, genetic drift leads to the fixation of some alleles and the loss of others.
Consider a DNA position at which all alleles are selectively equivalent, and where the rate of mutation per
generation is µ.
In a haploid population of size N, Nµ mutations occur at this site at each generation.
Given that there is no selection, all genotypes have the same probability to reach fixation. Under a
neutral model, the probability that an allele or mutation fixes is simply its relative frequency in the
population. For a new mutation in a haploid population, this relative frequency is 1/N; thus, the
probability that a new mutation reaches fixation is simply 1/N (the same reasoning also holds for
diploid species).
The rate of substitution per generation (K) is obtained simply by multiplying the number of
mutations that occur at each generation by their probability of fixation. Thus, for neutrally evolving
sites, the equation becomes the following:
K = Nµ × 1/N = µ
Of course, because of natural selection, advantageous mutations have a higher probability of
fixation than neutral mutations, and deleterious mutations have a lower probability of fixation.
1. It therefore follows that sequences subject to positive selection evolve faster than neutral sites (K > µ),
2. whereas sequences subject to negative selection evolve more slowly (K < µ).
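The claim that a new neutral mutation fixes with probability 1/N, and hence that K = µ, can be explored with a small simulation. This is a minimal sketch assuming a haploid Wright-Fisher model (constant population size, non-overlapping generations, each offspring choosing its parent at random). The function names, population size, trial count and seed are all illustrative choices, not values from the slides.

```python
import random


def new_mutation_fixes(pop_size, rng):
    """Follow one new neutral mutant copy until it is lost or fixed by drift."""
    copies = 1
    while 0 < copies < pop_size:
        freq = copies / pop_size
        # Random sampling of gametes: each offspring carries the mutant with prob. freq
        copies = sum(1 for _ in range(pop_size) if rng.random() < freq)
    return copies == pop_size


def estimate_fixation_probability(pop_size, trials, seed=0):
    """Fraction of independent new mutations that reach fixation."""
    rng = random.Random(seed)
    fixed = sum(new_mutation_fixes(pop_size, rng) for _ in range(trials))
    return fixed / trials


def neutral_substitution_rate(pop_size, mu):
    """K = (N * mu new mutations per generation) * (1/N fixation probability)."""
    return pop_size * mu * (1 / pop_size)


n = 10
print(estimate_fixation_probability(n, trials=5000), "vs. expected", 1 / n)
print(neutral_substitution_rate(n, 1e-8))  # equals mu (up to rounding), whatever N is
```

The estimate should lie close to 1/N, with the gap shrinking as the number of trials grows. The population size cancels out of K, which is why the neutral substitution rate depends only on the mutation rate.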
• Kimura suggested that a large fraction of new mutations do not have an effect on evolutionary fitness,
so natural selection would neither favor nor disfavor them. Eventually, each of these neutral mutations
would either spread throughout a population and become fixed in all of its members, or they would be
lost entirely in a stochastic process called genetic drift.
• Kimura then showed that the rate at which neutral mutations become fixed in a population (known as
the substitution rate) is equivalent to the rate of appearance of new mutations in each member of the
population (the mutation rate). Provided that the mutation rate is consistent across species, the
substitution rate would remain constant throughout the tree of life.
• At the molecular level most evolutionary changes and most of the variation within and between
species is not caused by natural selection but by random drift of mutant alleles that are neutral.
• A neutral mutation is one that does not affect an organism's ability to survive and reproduce.
• The neutral theory allows for the possibility that most mutations are deleterious, but holds that
because these are rapidly purged by natural selection, they do not make significant contributions to
variation within and between species at the molecular level. Mutations that are not deleterious are
assumed to be mostly neutral rather than beneficial.
• The theory applies only for evolution at the molecular level, and phenotypic evolution is controlled by
natural selection, as postulated by Charles Darwin.
The neutral theory of molecular evolution holds that although a small minority
of mutations in DNA or protein sequences are advantageous and are fixed by
natural selection, and although many mutations are disadvantageous and are
eliminated by natural selection, the great majority of those mutations that are
fixed are effectively neutral with respect to fitness and are fixed by genetic drift.
The difference between the two ideas can be understood in terms of the frequency
distribution for the selection coefficients of mutations, or genetic variants.
If the selection coefficient is positive, natural selection favours the variant;
if it is negative, the variant is eliminated.
Kimura’s neutral theory holds that effectively neutral mutations that rise to
fixation by drift vastly outnumber beneficial mutations that rise to fixation by
natural selection. Genetic drift, not natural selection, is thus the mechanism
responsible for most molecular evolution.
2. Silent sites change faster than replacement sites in most coding loci. Both kinds of
substitution accumulate in a linear, clocklike fashion, but the rate of evolution for silent
changes is much higher than the rate of evolution for replacement changes. Because the rate of
neutral substitution equals the rate of neutral mutation, neutral theory can explain the
molecular clock phenomenon if the neutral mutation rate is constant per year.
3. Variation among loci: evidence for functional constraints. Kimura and Ohta showed that the
functionally important and constrained histone H4 evolves much more slowly than the
functionally relatively unimportant and unconstrained fibrinopeptides, consistent with the
prediction of the neutral theory.
RATE OF DIVERGENCE
The rate of divergence can be measured as the percent difference per million years, or
as its reciprocal, the unit evolutionary period (UEP), the time in millions of years that it
takes for 1% divergence to develop.
Rate of divergence = percent difference / time of separation
1. The neutral theory of molecular evolution suggests that molecular evolution is mainly due to
neutral drift. Alternatively, molecular evolution may be mainly driven by natural selection.
2. Four main observations were originally interpreted in favor of the neutral theory: molecular
evolution has a rapid rate, its rate has a clock-like constancy, it is more rapid in functionally less
constrained parts of molecules, and natural populations are highly polymorphic.
3. Kimura argued that the high rate of evolution, and the high degree of variability of proteins,
would, if caused by natural selection, impose a high genetic load. Neutral drift, however, can drive
high rates of evolution, and maintain high levels of variability, without imposing a genetic load.
4. The constant rate of molecular evolution gives rise to a 'molecular clock'.
5. Neutral drift should drive evolution at a stochastically constant rate; Kimura pointed to the
contrast between uneven rates of morphological evolution and the constant rate of molecular
evolution and argued that natural selection would not drive molecular evolution at a constant
rate.
6. The molecular clock for proteins ticks over according to absolute time rather than generational
time. But for silent changes in DNA, lineages with shorter generation times probably evolve faster.
Neutral drift should cause the molecular clock to run according to generational, not absolute,
time.
7. Selection can operate without producing impossible genetic loads, and Kimura's original case for
the neutral theory is no longer convincing.
• The neutral theory explains the higher evolutionary rate of functionally less
constrained regions of proteins by the greater chance that a mutation there will be
neutral.
• Selectionists explain the higher evolutionary rate of functionally less constrained
regions of proteins by the greater chance that a mutation there will be a small,
rather than a large, change.
• Pseudogenes and silent changes in third codon positions may be relatively
functionally unconstrained. These parts of the DNA evolve faster than do the first
two positions in codons, and meaningful third base changes. Neutralists attribute
this high rate of evolution to enhanced neutral drift.
• For amino acids encoded by more than one codon, there are consistent biases in the
frequencies of the codons. Changes between the silent codons are therefore not
completely unconstrained.
• The neutral theory predicts a positive relation between the degree of variability of a
molecule and its rate of evolution.
Drawback
The test shows whether a molecule has evolved at the same rate in the two lineages connecting
two modern species with their common ancestor.
The test does not require knowing the absolute date of the common ancestor, so there is no
need for fossils.
• The relative rate test for constancy of the rate of molecular divergence. Sequences
are obtained for living species A and B and for outgroup species E. Y and X represent
ancestral species. Lowercase letters represent the number of character differences
(e.g., nucleotide changes) along each branch.
• The genetic distance between A and E is D_AE = a + c + d. That between B and E is
D_BE = b + c + d.
• If the rate of nucleotide substitution is constant, then a = b, so D_AE = D_BE. If rate
constancy holds throughout the tree, the distance between any pair of species that have
species X as a common ancestor will equal that between any other such pair of species.
The time that has elapsed from any common ancestor (i.e., any branch point on
a phylogenetic tree) to each of the living species derived from that ancestor is
exactly the same.
Therefore, if lineages have diverged at a constant rate, the number of
changes (sometimes called the genetic distance) along all paths of the
phylogenetic tree from one descendant species to another through their
common ancestor should be about the same.
In the hominoid example, the number of differences between the rhesus
monkey and the various hominoids ranges from 806 (to orangutan) to 767
(human). These numbers are so close that they indicate a fairly constant rate
of divergence, although the human lineage appears to have slowed down
somewhat.
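The comparison made by the relative rate test can be written as a short sketch. The function names are illustrative, and the branch lengths in the first example are made-up numbers. The 806 and 767 are the rhesus-to-hominoid differences quoted above. Because the shared branches c and d cancel, D_AE - D_BE equals a - b.

```python
def distance_to_outgroup(own_branch, c, d):
    """Distance from a living species to outgroup E, e.g. D_AE = a + c + d."""
    return own_branch + c + d


def relative_rate_difference(d_ae, d_be):
    """Return (D_AE - D_BE, that difference as a fraction of the larger distance)."""
    diff = d_ae - d_be
    return diff, abs(diff) / max(d_ae, d_be)


# Hypothetical branch lengths with equal rates in lineages A and B (a = b)
a, b, c, d = 12, 12, 30, 40
print(relative_rate_difference(distance_to_outgroup(a, c, d),
                               distance_to_outgroup(b, c, d)))  # (0, 0.0)

# Rhesus monkey to orangutan (806) vs. rhesus monkey to human (767)
diff, fraction = relative_rate_difference(806, 767)
print(diff, round(fraction, 3))  # 39 0.048
```

A difference of about 5% between the two paths is what the slide describes as a fairly constant rate with some slowdown in the human lineage.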
Drawbacks
• The test applies to closely related species but not to distantly related
species. Distantly related taxa often have rather different evolutionary
rates.
For example, the rate of sequence evolution in rodents is two to three
times greater than in primates.
• The test can only show that a molecule evolved at the same rate in two lineages;
this does not prove that molecules always have a constant rate.
The molecular clock is used to estimate the time of divergence where the fossil record is inadequate or
absent. For example, fungi do not fossilize well.
Different genes evolve at different rates, which gives us flexibility to date
events throughout the history of life.
Important genes evolve slowly, e.g., histones.
                        Histones          Fibrinopeptides
Species compared        Pea vs. cow       Horse vs. human
Amino acid change       1 amino acid      86%
Function                Bind to DNA       Role in blood clotting
Number of amino acids   103               20
Rate of evolution       Slow              Fast
2. Aneuploidy
3. Deletions
4. Duplications
5. Inversions
6. Translocations
2. Mutation
1. Point mutation
1. Transition
2. Transversion
1. Insertion
2. deletions
4. Transposons
B. ENVIRONMENTAL VARIATION
C. GENOTYPE-BY-ENVIRONMENT INTERACTION