Karthikeya XII-A Chemistry Project PDF
The structure of the DNA double helix. The atoms in the structure are
colour-coded by element and the detailed structures of two base pairs are
shown in the bottom right.
that these sections do not serve as patterns for protein
sequences. The two strands of DNA run in opposite directions to
each other and are thus antiparallel. Attached to each sugar is
one of four types of nucleobases (or bases). It is the sequence of
these four nucleobases along the backbone that encodes genetic
information. RNA strands are created using DNA strands as a
template in a process called transcription, where DNA bases are
exchanged for their corresponding bases except in the case of
thymine (T), for which RNA substitutes uracil (U). Under the
genetic code, these RNA strands specify the sequence of amino
acids within proteins in a process called translation.
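The base exchange described above can be sketched in a few lines of Python. This is only an illustration that treats strands as plain strings; the function names `transcribe` and `coding_to_rna` are made up for this example.

```python
# Illustrative sketch: transcription modelled as a string operation.
# Function names are hypothetical, chosen for this example only.

# Each DNA base on the template is paired with its RNA partner;
# adenine on the template pairs with uracil (U) instead of thymine.
DNA_TO_RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand: str) -> str:
    """Return the RNA (written 5'->3') made from a DNA template (written 5'->3')."""
    # The strands are antiparallel, so the template is read in reverse.
    return "".join(DNA_TO_RNA_PARTNER[base] for base in reversed(template_strand.upper()))


def coding_to_rna(coding_strand: str) -> str:
    """Shortcut: the RNA matches the non-template strand with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


print(transcribe("GGCCAT"))     # AUGGCC
print(coding_to_rna("ATGGCC"))  # AUGGCC
```

Both calls give the same RNA because GGCCAT and ATGGCC are the two complementary strands of one short duplex.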
DNA: PROPERTIES
DNA is a long polymer made from repeating units called
nucleotides, each of which is usually symbolised by a single letter:
either A, T, C, or G. The structure of DNA is dynamic along its
length, capable of coiling into tight loops and other shapes. In all
species it is composed of two helical chains bound to each other
by hydrogen bonds. Both chains are coiled around the same axis
and have the same pitch of 34 ångströms (3.4 nm). The pair of
chains has a radius of 10 Å (1.0 nm). According to another
study, when measured in a different solution, the DNA chain
measured 22–26 Å (2.2–2.6 nm) wide, and one nucleotide unit
measured 3.3 Å (0.33 nm) long. Although each individual
nucleotide is very small, a DNA polymer can be very long and
may contain hundreds of millions of nucleotides, such as in
chromosome 1. Chromosome 1 is the largest human
chromosome with approximately 220 million base pairs and would
be 85 mm long if straightened.
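The length of a straightened DNA molecule follows from multiplying the number of base pairs by the length of one nucleotide unit. The sketch below is a rough estimate only; it assumes the 3.3 Å figure quoted above applies uniformly along the chain, and the function name is invented for this example.

```python
# Rough estimate of the straightened (contour) length of a DNA molecule.
# Assumption: every base pair adds the same 0.33 nm (3.3 Å) to the length.

RISE_PER_BP_NM = 0.33  # length of one nucleotide unit, from the text


def contour_length_mm(base_pairs: float, rise_nm: float = RISE_PER_BP_NM) -> float:
    """Length in millimetres of a straightened double helix."""
    return base_pairs * rise_nm * 1e-6  # 1 nm = 1e-6 mm


length = contour_length_mm(220e6)
print(f"{length:.1f} mm")  # 72.6 mm
```

With these inputs the estimate comes out near 73 mm, somewhat below the quoted 85 mm. The quoted length would correspond to roughly 250 million base pairs at 3.4 Å per pair, so the two figures in the text probably come from different measurements.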
Chemical structure of DNA; hydrogen bonds shown as dotted lines. Each
end of the double helix has an exposed 5' phosphate on one strand and an
exposed 3' hydroxyl group (—OH) on the other.
The backbone of the DNA strand is made from alternating
phosphate and sugar groups. The sugar in DNA is 2-deoxyribose,
which is a pentose (five-carbon) sugar. The sugars are joined by
phosphate groups that form phosphodiester bonds between the
third and fifth carbon atoms of adjacent sugar rings. These are
known as the 3′ (three prime) and 5′ (five prime) carbons; the
prime symbol is used to distinguish these
carbon atoms from those of the base to which the deoxyribose
forms a glycosidic bond. Therefore, any DNA strand normally has
one end at which there is a phosphate group attached to the 5′
carbon of a deoxyribose (the 5′ phosphoryl) and another end at which
there is a free hydroxyl group attached to the 3′ carbon of a deoxyribose
(the 3′ hydroxyl). The orientation of the 3′ and 5′ carbons along
the sugar-phosphate backbone confers directionality (sometimes
called polarity) to each DNA strand. In a nucleic acid double helix,
the direction of the nucleotides in one strand is opposite to their
direction in the other strand: the strands are antiparallel. The
asymmetric ends of DNA strands are said to have a directionality
of five prime ends (5′ ), and three prime ends (3′), with the 5′ ends
having a terminal phosphate group and the 3′ ends a terminal
hydroxyl group. One major difference between DNA and RNA is
the sugar, with the 2-deoxyribose in DNA being replaced by the
related pentose sugar ribose in RNA.
The DNA double helix is stabilised primarily by two forces:
hydrogen bonds between nucleotides and base-stacking
interactions among aromatic nucleobases. The four bases found
in DNA are adenine (A), cytosine (C), guanine (G) and thymine
(T). These four bases are attached to the sugar-phosphate to
form the complete nucleotide, as shown for adenosine
monophosphate. Adenine pairs with thymine and guanine pairs
with cytosine, forming A-T and G-C base pairs.
Nucleobase classification
The nucleobases are classified into two types: the purines, A and
G, which are fused five- and six-membered heterocyclic
compounds, and the pyrimidines, the six-membered rings C and
T. A fifth pyrimidine nucleobase, uracil (U), usually takes the place
of thymine in RNA and differs from thymine by lacking a methyl
group on its ring. In addition to RNA and DNA, many artificial
nucleic acid analogues have been created to study the properties
of nucleic acids, or for use in biotechnology.
Non-canonical bases
Modified bases occur in DNA. The first of these recognized was
5-methylcytosine, which was found in the genome of
Mycobacterium tuberculosis in 1925. The reason for the presence
of noncanonical bases in bacterial viruses (bacteriophages)
is to avoid the restriction enzymes present in bacteria. This
enzyme system acts at least in part as a molecular immune
system protecting bacteria from infection by viruses. Modifications
of cytosine and adenine, the most commonly
modified DNA bases, play vital roles in the epigenetic control of
gene expression in plants and animals.
DNA major and minor grooves. The latter is a binding site for the Hoechst
stain dye 33258.
Grooves
Twin helical strands form the DNA backbone. Another double
helix may be found tracing the spaces, or grooves, between the
strands. These voids are adjacent to the base pairs and may
provide a binding site. As the strands are not symmetrically
located with respect to each other, the grooves are unequally
sized. The major groove is 22 ångströms (2.2 nm) wide, while the
minor groove is 12 Å (1.2 nm) in width. Due to the larger width of
the major groove, the edges of the bases are more accessible in
the major groove than in the minor groove. As a result, proteins
such as transcription factors that can bind to specific sequences
in double-stranded DNA usually make contact with the sides of
the bases exposed in the major groove. This situation varies in
unusual conformations of DNA within the cell (see below), but the
major and minor grooves are always named to reflect the
differences in width that would be seen if the DNA were twisted
back into the ordinary B form.
Base pairing
In a DNA double helix, each type of nucleobase on one strand
bonds with just one type of nucleobase on the other strand. This
is called complementary base pairing. Purines form hydrogen
bonds to pyrimidines, with adenine bonding only to thymine through two
hydrogen bonds, and cytosine bonding only to guanine through three
hydrogen bonds. This arrangement of two nucleotides binding
together across the double helix (from six-membered ring to
six-membered ring) is called a Watson-Crick base pair. DNA with high
GC content is more stable than DNA with low GC content. A
Hoogsteen base pair (hydrogen bonding the six-membered ring to the
five-membered ring) is a rare variation of base pairing. As hydrogen
bonds are not covalent, they can be broken and rejoined relatively
easily. The two strands of DNA in a double helix can thus be
pulled apart like a zipper, either by a mechanical force or high
temperature. As a result of this base pair complementarity, all the
information in the double-stranded sequence of a DNA helix is
duplicated on each strand, which is vital in DNA replication. This
reversible and specific interaction between complementary base
pairs is critical for all the functions of DNA in organisms.
Top, a GC base pair with three hydrogen bonds. Bottom, an AT base pair
with two hydrogen bonds. Non-covalent hydrogen bonds between the pairs
are shown as dashed lines.
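Because each base pairs with only one partner, one strand is enough to reconstruct the other. The following sketch illustrates this, along with the hydrogen-bond counts behind the stability of GC-rich DNA. It is a simplified model and the function names are invented for this example.

```python
# Sketch of Watson-Crick pairing rules; helper names are hypothetical.

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}  # bonds each base forms with its partner


def reverse_complement(strand: str) -> str:
    """Partner strand, written 5'->3' (the strands are antiparallel)."""
    return "".join(PAIR[base] for base in reversed(strand.upper()))


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


def gc_content(strand: str) -> float:
    """Fraction of bases that are G or C."""
    s = strand.upper()
    return (s.count("G") + s.count("C")) / len(s)


print(reverse_complement("ATGC"))  # GCAT
print(hydrogen_bonds("ATGC"))      # 10
print(gc_content("ATGC"))          # 0.5
```

Applying `reverse_complement` twice returns the original sequence, which is the sense in which the information is duplicated on each strand.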
molecules have no single common shape, but some
conformations are more stable than others.
Supercoiling
DNA can be twisted like a rope in a process called DNA
supercoiling. With DNA in its "relaxed" state, a strand usually
circles the axis of the double helix once every 10.4 base pairs, but
if the DNA is twisted the strands become more tightly or more
loosely wound. If the DNA is twisted in the direction of the helix,
this is positive supercoiling, and the bases are held more tightly
together. If they are twisted in the opposite direction, this is
negative supercoiling, and the bases come apart more easily. In
nature, most DNA has slight negative supercoiling that is
introduced by enzymes called topoisomerases. These enzymes
are also needed to relieve the twisting stresses introduced into
DNA strands during processes such as transcription and DNA
replication.
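The degree of supercoiling can be put into numbers. The sketch below uses two standard terms that do not appear in the text above: the linking number (how many times one strand winds around the other) and the superhelical density (the fractional deviation from the relaxed value). The 5200 base pair circular molecule and its linking numbers are hypothetical values chosen for illustration.

```python
# Sketch: quantifying supercoiling from the 10.4 bp-per-turn figure.
# "Linking number" and "superhelical density" are standard terms brought in
# for this example; the plasmid size and linking numbers are made up.

BP_PER_TURN = 10.4  # relaxed DNA: one helical turn every 10.4 base pairs


def relaxed_linking_number(n_bp: float) -> float:
    """Number of helical turns in a relaxed molecule of n_bp base pairs."""
    return n_bp / BP_PER_TURN


def superhelical_density(n_bp: float, linking_number: float) -> float:
    """Negative when underwound (negative supercoiling), positive when overwound."""
    lk0 = relaxed_linking_number(n_bp)
    return (linking_number - lk0) / lk0


print(round(relaxed_linking_number(5200)))         # 500
print(round(superhelical_density(5200, 470), 3))   # -0.06
```

A negative result corresponds to the slight negative supercoiling described above, where the strands come apart more easily.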
From left to right, the structures of A, B and Z DNA
Alternative DNA structures
DNA exists in many possible conformations that include A-DNA, B-
DNA, and Z-DNA forms, although only B-DNA and Z-DNA have
been directly observed in functional organisms. The conformation
that DNA adopts depends on the hydration level, DNA sequence,
the amount and direction of supercoiling, chemical modifications
of the bases, the type and concentration of metal ions, and the
presence of polyamines in the solution.
The first published reports of A-DNA X-ray diffraction patterns—
and also B-DNA—used analyses based on Patterson functions
that provided only a limited amount of structural information for
oriented fibres of DNA. An alternative analysis was proposed by
Wilkins et al. in 1953 for the in vivo B-DNA X-ray diffraction-
scattering patterns of highly hydrated DNA fibres in terms of
squares of Bessel functions. In the same journal, James Watson
and Francis Crick presented their molecular modelling analysis of
the DNA X-ray diffraction patterns to suggest that the structure
was a double helix.
can be recognized by specific Z-DNA binding proteins and may
be involved in the regulation of transcription.
Quadruplex structures
DNA quadruplex is formed by telomere repeats. The looped
conformation of the DNA backbone is very different from the
typical DNA helix. The green spheres in the centre represent
potassium ions.
At the ends of the linear chromosomes are specialized regions of
DNA called telomeres. The main function of these regions is to
allow the cell to replicate chromosome ends using the enzyme
telomerase, as the enzymes that normally replicate DNA cannot
copy the extreme 3′ ends of chromosomes. These specialized
chromosome caps also help protect the DNA ends, and stop the
DNA repair systems in the cell from treating them as damage to
be corrected. In human cells, telomeres are usually lengths of
single-stranded DNA containing several thousand repeats of a
simple TTAGGG sequence.
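A telomere-like stretch can be recognised in a sequence by looking for back-to-back copies of the repeat unit. The sketch below is a simple string search, not a real genome-analysis method; the function name and the test sequence are invented for this example.

```python
# Sketch: find the longest run of consecutive TTAGGG repeats in a sequence.
# The helper name and the example sequence are hypothetical.
import re

TELOMERE_UNIT = "TTAGGG"


def longest_repeat_run(sequence: str, unit: str = TELOMERE_UNIT) -> int:
    """Largest number of back-to-back copies of `unit` found in `sequence`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


example = "ACGT" + "TTAGGG" * 3 + "CCC" + "TTAGGG"
print(longest_repeat_run(example))  # 3
```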
These guanine-rich sequences may stabilize chromosome ends
by forming structures of stacked sets of four-base units, rather
than the usual base pairs found in other DNA molecules. Here,
four guanine bases, known as a guanine tetrad, form a flat plate.
These flat four-base units then stack on top of each other to form
a stable G-quadruplex structure. These structures are stabilized
by hydrogen bonding between the edges of the bases and the
chelation of a metal ion in the centre of each four-base unit. Other
structures can also be formed, with the central set of four bases
coming from either a single strand folded around the bases or
several different parallel strands, each contributing one base to
the central structure.
In addition to these stacked structures, telomeres also form large
loop structures called telomere loops or T-loops. Here, the single-
stranded DNA curls around in a long circle stabilized by telomere-
binding proteins. At the very end of the T-loop, the single-
stranded telomere DNA is held onto a region of double-stranded
DNA by the telomere strand disrupting the double-helical DNA
and base pairing to one of the two strands. This triple-stranded
structure is called a displacement loop or D-loop.
Branched DNA
In DNA, fraying occurs when non-complementary regions exist at
the end of an otherwise complementary double-strand of DNA.
However, branched DNA can occur if a third strand of DNA is
introduced and contains adjoining regions able to hybridize with
the frayed regions of the pre-existing double strand. Although the
simplest example of branched DNA involves only three strands of
DNA, complexes involving additional strands and multiple
branches are also possible. Branched DNA can be used in
nanotechnology to construct geometric shapes.
Artificial bases
Several artificial nucleobases have been synthesized, and
successfully incorporated into the eight-base DNA analogue
named Hachimoji DNA. Dubbed S, B, P, and Z, these artificial
bases can bond with each other in a predictable
way (S–B and P–Z), maintain the double helix structure of DNA,
and be transcribed to RNA. Their existence could be seen as an
indication that there is nothing special about the four natural
nucleobases that evolved on Earth. On the other hand, DNA is
closely related to RNA, which not only acts as a transcript of DNA
but also performs many tasks in cells as a molecular machine. For
this purpose, it has to fold into a structure. It has been shown that
at least four bases are required for the corresponding RNA to allow
the creation of all possible structures; a higher number is
also possible, but this would go against the natural principle of
least effort.
Acidity
The phosphate groups of DNA give it similar acidic properties to
phosphoric acid and it can be considered as a strong acid. It will
be fully ionized at a normal cellular pH, releasing protons which
leave behind negative charges on the phosphate groups. These
negative charges protect DNA from breakdown by hydrolysis by
repelling nucleophiles which could hydrolyze it.
Macroscopic appearance
Pure DNA extracted from cells forms white, stringy clumps.
DNA: INTERACTIONS WITH PROTEINS
All the functions of DNA depend on interactions with proteins.
These protein interactions can be non-specific, or the protein can
bind specifically to a single DNA sequence. Enzymes can also
bind to DNA and of these, the polymerases that copy the DNA
base sequence in transcription and DNA replication are
particularly important.
DNA-binding proteins
Interaction of DNA (in orange) with histones (in blue). These
proteins' basic amino acids bind to the acidic phosphate groups
on DNA.
Structural proteins that bind DNA are well-understood examples
of non-specific DNA-protein interactions. Within chromosomes,
DNA is held in complexes with structural proteins. These proteins
organize the DNA into a compact structure called chromatin. In
eukaryotes, this structure involves DNA binding to a complex of
small basic proteins called histones, while in prokaryotes multiple
types of proteins are involved. The histones form a disk-shaped
complex called a nucleosome, which contains two complete turns
of double-stranded DNA wrapped around its surface. These non-
specific interactions are formed through basic residues in the
histones, making ionic bonds to the acidic sugar-phosphate
backbone of the DNA, and are thus largely independent of the
base sequence. Chemical modifications of these basic amino acid
residues include methylation, phosphorylation, and acetylation.
These chemical changes alter the strength of the interaction
between the DNA and the histones, making the DNA more or less
accessible to transcription factors and changing the rate of
transcription. Other non-specific DNA-binding proteins in
chromatin include the high-mobility group proteins, which bind to
bent or distorted DNA. These proteins are important in bending
arrays of nucleosomes and arranging them into the larger
structures that make up chromosomes.
A distinct group of DNA-binding proteins
specifically binds single-stranded DNA. In humans,
replication protein A is the best-understood member of this family
and is used in processes where the double helix is separated,
including DNA replication, recombination, and DNA repair. These
binding proteins seem to stabilize single-stranded DNA and
protect it from forming stem loops or being degraded by
nucleases.
The lambda repressor helix-turn-helix transcription factor is bound
to its DNA target
In contrast, other proteins have evolved to bind to particular DNA
sequences. The most intensively studied of these are the various
transcription factors, which are proteins that regulate transcription.
Each transcription factor binds to one particular set of DNA
sequences and activates or inhibits the transcription of genes that
have these sequences close to their promoters. The transcription
factors do this in two ways. Firstly, they can bind the RNA
polymerase responsible for transcription, either directly or through
other mediator proteins; this locates the polymerase at the
promoter and allows it to begin transcription. Alternatively,
transcription factors can bind enzymes that modify the histones
at the promoter. This changes the accessibility of the DNA template
to the polymerase.
DNA-modifying enzymes
Nucleases and ligases
Nucleases are enzymes that cut DNA strands by catalyzing the
hydrolysis of the phosphodiester bonds. Nucleases that hydrolyse
nucleotides from the ends of DNA strands are called
exonucleases, while endonucleases cut within strands. The most
frequently used nucleases in molecular biology are the restriction
endonucleases, which cut DNA at specific sequences. For
instance, the EcoRV enzyme recognizes the
6-base sequence 5′-GATATC-3′ and cuts in the middle of it, between
GAT and ATC. In nature, these enzymes protect bacteria against phage
infection by digesting the phage DNA when it enters the bacterial
cell, acting as part of the restriction-modification system. In
technology, these sequence-specific nucleases are used in
molecular cloning and DNA fingerprinting.
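Cutting at a specific sequence can be simulated as a string search. The sketch below assumes EcoRV cuts in the middle of its GATATC site (between GAT and ATC, leaving blunt ends) and works on one strand only, which is sufficient here because the site reads the same on both strands. The function name and the test sequence are invented for this example.

```python
# Sketch: simulated digestion of a DNA sequence with EcoRV.
# Assumption: the cut falls after the third base of the site (GAT|ATC).

SITE = "GATATC"
CUT_OFFSET = 3  # number of site bases left on the upstream fragment


def digest(sequence: str, site: str = SITE, cut_offset: int = CUT_OFFSET) -> list[str]:
    """Split `sequence` at every occurrence of `site` and return the fragments."""
    sequence = sequence.upper()
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])  # whatever remains after the last cut
    return fragments


print(digest("AAGATATCTTGATATCGG"))  # ['AAGAT', 'ATCTTGAT', 'ATCGG']
```

Joining the fragments back together restores the original sequence, which is what a DNA ligase would do with the cut ends.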
The restriction enzyme EcoRV (green) in a complex with its substrate DNA
Enzymes called DNA ligases can rejoin cut or broken DNA
strands. Ligases are particularly important in lagging strand DNA
replication, as they join the short segments of DNA produced at
the replication fork into a complete copy of the DNA template.
They are also used in DNA repair and genetic recombination.
Topoisomerases and helicases
Topoisomerases are enzymes with both nuclease and ligase
activity. These proteins change the amount of supercoiling in
DNA. Some of these enzymes work by cutting the DNA helix and
allowing one section to rotate, thereby reducing its level of
supercoiling; the enzyme then seals the DNA break. Other types
of these enzymes are capable of cutting one DNA helix and then
passing a second strand of DNA through this break, before
rejoining the helix. Topoisomerases are required for many
processes involving DNA, such as DNA replication and
transcription.
In DNA replication, DNA-dependent DNA polymerases make
copies of DNA polynucleotide chains. To preserve biological
information, it is essential that the sequence of bases in each
copy is precisely complementary to the sequence of bases in
the template strand. Many DNA polymerases have a proofreading
activity. Here, the polymerase recognizes the occasional mistakes
in the synthesis reaction by the lack of base pairing between the
mismatched nucleotides. If a mismatch is detected, a 3′ to 5′
exonuclease activity is activated and the incorrect base is
removed. In most organisms, DNA polymerases function in a
large complex called the replisome that contains multiple
accessory subunits, such as the DNA clamp or helicases.
RNA: AN INTRODUCTION
Ribonucleic acid (RNA) is a polymeric molecule essential in
various biological roles in coding, decoding, regulation and
expression of genes. RNA and deoxyribonucleic acid (DNA) are
nucleic acids. Along with lipids, proteins, and carbohydrates,
nucleic acids constitute one of the four major macromolecules
essential for all known forms of life. Like DNA, RNA is assembled
as a chain of nucleotides, but unlike DNA, RNA is found in nature
as a single strand folded onto itself, rather than a paired double
strand. Cellular organisms use messenger RNA (mRNA) to
convey genetic information (using the nitrogenous bases of
guanine, uracil, adenine, and cytosine, denoted by the letters G,
U, A, and C) that directs synthesis of specific proteins. Many
viruses encode their genetic information using an RNA genome.
A hairpin loop from a pre-mRNA. Highlighted are the nucleobases (green)
and the ribose-phosphate backbone (blue). This is a single strand of RNA
that folds back upon itself.
RNA: COMPARISON WITH DNA
The chemical structure of RNA is very similar to that of DNA but
differs in three primary ways:
Three-dimensional representation of the 50S ribosomal subunit. Ribosomal
RNA is in ochre, and proteins are in blue. The active site is a small
segment of rRNA, indicated in red.
In this fashion, RNAs can achieve chemical catalysis (like
enzymes). For instance, the determination of the structure of the
ribosome—an RNA-protein complex that catalyzes peptide bond
formation—revealed that its active site is composed entirely of
RNA.
Watson-Crick base pairs in a siRNA (hydrogen atoms are not shown)
RNA: STRUCTURE
Each nucleotide in RNA contains a ribose sugar, with carbons
numbered 1' through 5'. A base is attached to the 1' position, in
general, adenine (A), cytosine (C), guanine (G), or uracil (U).
Adenine and guanine are purines, and cytosine and uracil are
pyrimidines. A phosphate group is attached to the 3' position of
one ribose and the 5' position of the next. The phosphate groups
have a negative charge each, making RNA a charged molecule
(polyanion). The bases form hydrogen bonds between cytosine
and guanine, between adenine and uracil and between guanine
and uracil. However, other interactions are possible, such as a
group of adenine bases binding to each other in a bulge, or the
GNRA tetraloop that has a guanine–adenine base pair.
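The three hydrogen-bonded pairs listed above (C–G, A–U and G–U) can be checked programmatically, for example to test whether the two ends of a single strand could fold back into a hairpin stem. This sketch ignores loop geometry and the other interactions mentioned; the function names and the example sequence are invented for this example.

```python
# Sketch: RNA base-pairing rules, including the G-U pair.
# Helper names and the example hairpin are hypothetical.

RNA_PAIRS = {frozenset("CG"), frozenset("AU"), frozenset("GU")}


def can_pair(first: str, second: str) -> bool:
    """True if the two RNA bases form one of the pairs listed in the text."""
    return frozenset((first.upper(), second.upper())) in RNA_PAIRS


def stem_is_paired(sequence: str, stem_length: int) -> bool:
    """Check whether the first and last `stem_length` bases can pair up,
    as they would in the stem of a hairpin formed by one folded strand."""
    s = sequence.upper()
    return all(can_pair(s[i], s[-1 - i]) for i in range(stem_length))


# Three stem bases on each side of a four-base GAAA loop.
print(stem_is_paired("GGCGAAAGUC", 3))  # True
print(can_pair("A", "A"))               # False
```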
Structure of a fragment of an RNA, showing a guanosyl subunit.
RNA is transcribed with only four bases (adenine, cytosine,
guanine and uracil), but these bases and attached sugars can be
modified in numerous ways as the RNAs mature. Pseudouridine
(Ψ), in which the linkage between uracil and ribose is changed
from a C–N bond to a C–C bond, and ribothymidine (T) are found
in various places (the most notable ones being in the TΨC loop of
tRNA). Another notable modified base is hypoxanthine, a
deaminated adenine base whose nucleoside is called inosine (I).
Inosine plays a key role in the wobble hypothesis of the genetic
code.
There are more than 100 other naturally occurring modified
nucleosides. The greatest structural diversity of modifications can
be found in tRNA, while pseudouridine and nucleosides with 2'-O-
methylribose often present in rRNA are the most common. The
specific roles of many of these modifications in RNA are not fully
understood. However, it is notable that, in ribosomal RNA, many
of the post-transcriptional modifications occur in highly functional
regions, such as the peptidyl transferase centre and the subunit
interface, implying that they are important for normal function.
Secondary structure of a telomerase RNA.
ribose. By the use of L-ribose or rather L-ribonucleotides, L-RNA
can be synthesized. L-RNA is much more stable against
degradation by RNase.
Like other structured biopolymers such as proteins, one can
define the topology of a folded RNA molecule. This is often done
based on the arrangement of intra-chain contacts within a folded
RNA, termed circuit topology.
RNA: UNIQUE NATURE AND CHARACTERISTIC FEATURES
RNA comes in different types, and each type serves a unique function.
Messenger RNA (mRNA) carries information about a protein
sequence to the ribosomes, the protein synthesis factories in the
cell. It is coded so that every three nucleotides (a codon)
corresponds to one amino acid. In eukaryotic cells, once
precursor mRNA (pre-mRNA) has been transcribed from DNA, it
is processed to mature mRNA. This removes its introns—non-
coding sections of the pre-mRNA. The mRNA is then exported
from the nucleus to the cytoplasm, where it is bound to ribosomes
and translated into its corresponding protein form with the help of
tRNA. In prokaryotic cells, which do not have nucleus and
cytoplasm compartments, mRNA can bind to ribosomes while it is
being transcribed from DNA. After a certain amount of time, the
message degrades into its component nucleotides with the
assistance of ribonucleases.
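Reading an mRNA three nucleotides at a time can be sketched as a table lookup. The example below builds the standard genetic code table with one-letter amino acid codes, where `*` marks a stop codon. It is a simplification: it starts at the first AUG it finds and ignores everything a real ribosome needs in order to begin translation. The function name and the example sequence are invented for this example.

```python
# Sketch: translating an mRNA codon by codon using the standard genetic code.
# Simplifying assumption: translation starts at the first AUG in the string.
from itertools import product

BASES = "UCAG"
# Amino acids (one-letter codes, '*' = stop) in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Return the protein sequence encoded from the first AUG to the first stop."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: release the chain
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGGCCUUUUAAGG"))  # MAF
```

Here AUG, GCC and UUU code for methionine (M), alanine (A) and phenylalanine (F), and UAA ends the chain.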
Transfer RNA (tRNA) is a small RNA chain of about 80
nucleotides that transfers a specific amino acid to a growing
polypeptide chain at the ribosomal site of protein synthesis during
translation. It has sites for amino acid attachment and an
anticodon region for codon recognition that binds to a specific
sequence on the messenger RNA chain through hydrogen
bonding.
Transfer-messenger RNA (tmRNA) is found in many bacteria and
plastids. It tags proteins encoded by mRNAs that lack stop
codons for degradation and prevents the ribosome from stalling.
CONCLUSION
DNA contains the genetic information that allows all forms of life
to function, grow and reproduce. However, it is unclear how long
in the 4-billion-year history of life DNA has performed this
function, as it has been proposed that the earliest forms of life
may have used RNA as their genetic material. RNA may have
acted as the central part of early cell metabolism as it can both
transmit genetic information and carry out catalysis as part of
ribozymes. This ancient RNA world where nucleic acid would
have been used for both catalysis and genetics may have
influenced the evolution of the current genetic code based on four
nucleotide bases. This would occur, since the number of different
bases in such an organism is a trade-off between a small number
of bases increasing replication accuracy and a large number of
bases increasing the catalytic efficiency of ribozymes. However,
there is no direct evidence of ancient genetic systems, as
recovery of DNA from most fossils is impossible because DNA
survives in the environment for less than one million years, and
slowly degrades into short fragments in solution. Claims for older
DNA have been made, most notably a report of the isolation of a
viable bacterium from a salt crystal 250 million years old, but
these claims are controversial.
Building blocks of DNA (adenine, guanine, and related organic
molecules) may have been formed extraterrestrially in outer
space. Complex DNA and RNA organic compounds of life,
including uracil, cytosine, and thymine, have also been formed in
the laboratory under conditions mimicking those found in outer
space, using starting chemicals, such as pyrimidine, found in
meteorites. Pyrimidine, like polycyclic aromatic hydrocarbons
(PAHs), the most carbon-rich chemical found in the universe, may
have been formed in red giants or in interstellar cosmic dust and
gas clouds.
In February 2021, scientists reported, for the first time, the
sequencing of DNA from animal remains more than a million years
old (a mammoth in this instance), the oldest DNA sequenced to
date.
BIBLIOGRAPHY
-"RNA: The Versatile Molecule". University of Utah. 2015.
-Purcell A. "DNA". Basic Biology. Archived from the original on 5
January 2017.
43