Genetics, Lecture 5, Transcription (Slides)
DNA RNA
Adenine Adenine
Cytosine Cytosine
Guanine Guanine
Thymine Uracil (U)
Tertiary structure
Classes of prokaryotic RNA
• ribosomal RNA (rRNA)
16S (small ribosomal subunit)
23S (large ribosomal subunit)
5S (large ribosomal subunit)
• transfer RNA (tRNA)
• messenger RNA (mRNA)
RNA polymerase
open promoter complex
initiation
elongation
termination
RNA product
Promoter structure in prokaryotes
mRNA
5’ PuPuPuPuPuPuPuPu AUG
-30 -10 +1
[ ]
Promoter
transcription start site
mRNA
5’
-30 region -10 region
TTGACA TATAAT
AACTGT ATATTA
-36 -31 -12 -7 +1 +20
Pribnow box
-30 region   T   T   G   A   C   A
             82  84  79  64  53  45 %
-10 region   T   A   T   A   A   T
             79  95  44  59  51  96 %
consensus sequences
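As a rough illustration of what "matching the consensus" means, the sketch below counts mismatches between a hexamer and the two consensus sequences shown on the slide, then slides a window along a sequence to find the closest match. The function names and the example sequence are hypothetical, made up for this illustration; real promoter prediction also weighs the spacing between the two boxes and the per-position frequencies.

```python
# Consensus hexamers as printed on the slide (non-template strand, 5'->3').
MINUS_30_BOX = "TTGACA"  # the slide's "-30 region"
MINUS_10_BOX = "TATAAT"  # Pribnow box


def mismatches(hexamer: str, consensus: str) -> int:
    """Count positions where a hexamer differs from the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(hexamer.upper(), consensus))


def best_match(dna: str, consensus: str) -> tuple[int, str, int]:
    """Return (index, window, mismatches) for the window closest to the consensus."""
    dna = dna.upper()
    k = len(consensus)
    windows = [(i, dna[i:i + k]) for i in range(len(dna) - k + 1)]
    index, window = min(windows, key=lambda w: mismatches(w[1], consensus))
    return index, window, mismatches(window, consensus)


# Hypothetical sequence, invented for illustration only.
example = "GGCTTGACATCGATCGATCGATCGGTTATAATGCGCA"
print(best_match(example, MINUS_30_BOX))
print(best_match(example, MINUS_10_BOX))
```

A real promoter rarely matches either hexamer perfectly, which is why the slide reports a percentage for each position rather than a fixed sequence.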
Prokaryotic RNA polymerase structure
Subunit                  Number   Function
α                        2        uncertain
β (rifampicin target)    1        forms phosphodiester bonds
β’                       1        binds DNA template
σ                        1        recognizes promoter and facilitates initiation
α2ββ’σ          α2ββ’           +  σ
holoenzyme      core polymerase    sigma factor
The function of sigma factor
• the sigma subunit of RNA polymerase is an “initiation factor”
• there are several different sigma factors in E. coli that are
specific for different sets of genes
• sigma factor functions to ensure that RNA polymerase binds
stably to DNA only at promoters
• sigma destabilizes nonspecific binding to non-promoter DNA
• sigma stabilizes specific binding to promoter DNA
• this accelerates the search for promoter DNA
Ka (M^-1)
Any DNA          Promoter DNA
(nonspecific)    (specific)
Core   2 × 10^11
• the sigma subunit binds to the -10 region
promoter regions because of sigma factor
• elongation takes place with
the core RNA polymerase
A=T A = T
U=A U=A
• RNA synthesis usually initiated with ATP or GTP (the first nucleotide)
• RNA chains are synthesized in a 5’ to 3’ direction
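The base-pairing and direction rules above can be sketched in a few lines. This is a minimal illustration, assuming the template strand is written 3’ to 5’ so that the RNA comes out 5’ to 3’ when read left to right; the function name and the example template are hypothetical.

```python
# Template DNA base -> RNA base added opposite it (A pairs with U in RNA).
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') copied from a template strand written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3to5.upper())


template = "TACGGATTC"      # hypothetical template strand, written 3'->5'
rna = transcribe(template)  # read left to right, this is 5'->3'
print(rna)
```

The RNA has the same sequence as the non-template (coding) DNA strand, with U in place of T.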
Eukaryotic Transcriptional Regulation
What are the enzymes responsible for the synthesis of these RNAs?
The human RNA polymerases
Polymerase Location Product
translated region
UGA
termination
3’ untranslated region
polyadenylation signal
AAUAAA (A)~200 3’
poly(A) tail
• all mRNAs have a 5’ cap and all mRNAs (with the exception
of the histone mRNAs) contain a poly(A) tail
• the 5’ cap and 3’ poly(A) tail prevent mRNA degradation
• loss of the cap and poly(A) tail results in mRNA degradation
Complexity(1) of mRNA classes in the mammalian cell(2)
Abundance       Abundance       Number of different   Total
class           (copies/cell)   mRNA species
high            12,000          9                     108,000
intermediate    300             700                   210,000
low (rare)      15              11,500                172,500
Total                           12,209                490,500
Based on these measurements, this cell type contains
• three abundance classes of mRNA
• ~ 12,209 different mRNA species
• ~490,500 total mRNA molecules
(1) determined in RNA-DNA hybridization experiments analogous to Cot curves
(2) mouse liver cytoplasmic poly(A)+ RNA
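The totals in the table follow from multiplying copies per cell by the number of species in each class. A short sketch of that arithmetic, using only the numbers from the table (the variable names are illustrative):

```python
# (abundance class, copies per cell, number of different mRNA species)
classes = [
    ("high", 12_000, 9),
    ("intermediate", 300, 700),
    ("low (rare)", 15, 11_500),
]

# Total molecules per class = copies per cell x number of species.
totals = {name: copies * species for name, copies, species in classes}
total_species = sum(species for _, _, species in classes)
total_molecules = sum(totals.values())

for name, molecules in totals.items():
    share = molecules / total_molecules
    print(f"{name:14s} {molecules:>8,d} molecules ({share:.0%} of total)")
print(f"{total_species:,d} species, {total_molecules:,d} molecules")
```

On these figures, the 9 high-abundance species contribute about 22% of all mRNA molecules, while the 11,500 rare species together contribute about 35%.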
• how are these mRNAs made and what determines their relative amounts?
• rate of synthesis vs. rate of turnover (degradation)
Transcription and promoter elements for RNA polymerase II
[Figure: a transcription element (TE) and the promoter (P) upstream of the transcription unit (exons); transcription starts at +1]
Promoter (DNA sequence upstream of a gene)
• determines start site (+1) for transcription initiation
• located immediately upstream of the start site
• allows basal (low level) transcription
Transcription element (DNA sequence that regulates the gene)
• determines frequency or efficiency of transcription
• located upstream, downstream, or within genes
• can be very close to or thousands of base pairs from a gene
• includes
enhancers (increase transcription rate)
silencers (decrease transcription rate)
response elements (target sequences for signaling molecules)
• genes can have numerous transcription elements
Transcription and promoter elements for RNA polymerase II
[Figure: arrangements of transcription elements (TE) relative to the promoter (P) and the exons of the transcription unit: upstream of the promoter, downstream of the transcription unit, and within the gene; the promoter complex is indicated]
The locus control region is a specialized transcription element
TE
gene A
P
gene B
TE P
[Figure: TFIID (TBP plus TAFs) at the TATA box near -25, with TFIIA, B, E, F, H, and J assembled around it; start site at +1]
• TFIID (a multisubunit protein) binds to the TATA box
to begin the assembly of the transcription apparatus
• the TATA binding protein (TBP) directly binds the TATA box
• TBP associated factors (TAFs) bind to TBP
• TFIIA, TFIIB, TFIIE, TFIIF, TFIIH, TFIIJ assemble with TFIID
Binding of RNA polymerase II
[Figure: RNA pol II joins TFIID (TBP) and TFIIA, B, E, F, H, and J at the promoter, positioned at the +1 start site]
• transcription factors binding to
other promoter elements and transcription elements interact
with proteins at the promoter and further stabilize (or inhibit)
formation of a functional preinitiation complex
• this process is called “transactivation”
Formation of a stable preinitiation complex
[Figure: preinitiation complex of TFIID (TBP), TFIIB, E, F, H, J, and RNA pol II at +1; at initiation, the CTD of RNA pol II carries phosphate groups (P)]
• RNA pol II is phosphorylated by TFIIH on the carboxy terminal
domain (CTD), releasing it from the preinitiation complex and
allowing it to initiate RNA synthesis and move down the gene
Transcription factors (partial list)
Factor Full name or function
CREB Cyclic AMP response element binding protein
CTF CAAT box transcription factor (=NF1) (binds GGCCAATCT)
NF1 Nuclear factor-1 (=CTF)
AP1 Activator protein-1 (dimer of the Fos-Jun proteins)
Sp1 Specificity factor-1 (binds CCGCCC)
OTF Octamer transcription factor (binds ATTTGCAT)
NF-κB Nuclear factor κB
HSTF Heat shock transcription factor
MTF Metal transcription factor
USF Upstream factor
ATF Activating transcription factor
HNF4 Hepatocyte nuclear factor-4 (nuclear receptor superfamily)
GR Glucocorticoid receptor (nuclear receptor superfamily)
AR Androgen receptor (nuclear receptor superfamily)
ER Estrogen receptor (nuclear receptor superfamily)
TR Thyroid hormone receptor (nuclear receptor superfamily)
C/EBP CAAT/enhancer binding protein
E2F E2 factor (named for the adenovirus E2 gene)
p53 p53 (tumor suppressor protein)
Myc Product of the c-myc protooncogene (dimerizes with Max)
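Three entries in the list give a binding sequence. As a minimal sketch of how such sites could be located in a DNA sequence, the code below searches both strands for exact matches to those three sequences. The function names and the example sequence are hypothetical, and exact string matching is a simplification: real factors tolerate variation around the consensus.

```python
# Binding sequences as given in the table above.
SITES = {"CTF/NF1": "GGCCAATCT", "Sp1": "CCGCCC", "OTF": "ATTTGCAT"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def find_sites(dna: str) -> list[tuple[str, str, int]]:
    """Return (factor, strand, start index) for every exact match on either strand."""
    dna = dna.upper()
    hits = []
    for factor, motif in SITES.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = dna.find(pattern)
            while start != -1:
                hits.append((factor, strand, start))
                start = dna.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical sequence, invented for illustration only.
example = "AAGGCCAATCTTTGGGCGGAAATTTGCATCC"
print(find_sites(example))
```

A "-" strand hit means the site is present on the complementary strand, which matters because transcription elements can function in either orientation relative to the gene.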
Basic region-leucine zipper (bZIP) transcription factors
Fos Jun
Basic regions
(DNA contact surfaces
that bind to the DNA)
N
DNA binding
• Dimerization via the leucine domains
C C
Gcn4 (Basic Region, Leucine Zipper) Complex With Ap-1 DNA
Structures generated using RasWin Molecular Graphics
Windows Version 2.6 and PDB ID# 1YSA
DNA binding
Leucine zipper
Binding of AP1 to DNA transactivates transcription
[Figure: preinitiation complex (TFIID with TBP, TFIIA, B, E, F, H, J, and RNA pol II at +1) downstream of an AP1 site, shown without and with the Fos-Jun dimer bound]
AP1 site:
5’-TGACTCA-3’
3’-ACTGAGT-5’
Model for binding of steroid receptor dimer to DNA
5’-AGGTCANNNTGACCT-3’
   :::::::::::::::
3’-TCCAGTNNNACTGGA-5’
Estrogen response element (ERE)
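The ERE reads the same on both strands when each is read 5’ to 3’, which fits a receptor dimer binding with one subunit on each half-site. A small sketch that checks this property (the helper names are illustrative; N is treated as its own complement):

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(dna: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return dna.upper() == reverse_complement(dna)


ERE = "AGGTCANNNTGACCT"
left_half, right_half = ERE[:6], ERE[9:]

print(is_palindromic(ERE))
print(reverse_complement(left_half) == right_half)
```

The two half-sites, AGGTCA and TGACCT, are reverse complements of each other, separated by three unspecified nucleotides.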
Steroid hormone action in target cells
mifepristone (RU486) is a
progesterone receptor antagonist
Mutations affecting promoters
The factor IX gene
• located on the X chromosome
• transcribed region >32,700 bp, with 8 exons
The factor IX gene promoter
• there are overlapping binding sites for AR and HNF4
AR HNF4
• AR = androgen receptor
• zinc finger nuclear receptor superfamily transcription factor
• binds androgen
• androgen levels increase at puberty
• HNF4 = hepatocyte nuclear factor-4
• zinc finger nuclear receptor superfamily transcription factor
• ligand unknown - therefore an “orphan” receptor
• HNF4 is expressed early in development and in adult liver
• mutation at -20 results in
Hemophilia B Leyden in which
the hemophilia improves at puberty
when levels of androgen increase
AR HNF4
[Figure: pre-mRNA processing. The 5’ end is capped (mGpppNmpNm); polyadenylation at the AAUAAA signal adds the poly(A) tail to the 3’ end, giving a capped, polyadenylated transcript]
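Pre-mRNA: 5’ exon 1 G-p-G-U ... intron 1 (with branch site adenosine, 2’OH-A) ... A-G-p-G exon 2 3’
Splicing intermediate: exon 1 ends in a free 3’ G-OH; the 5’ end of intron 1 (G-U) is joined to the branch site A by a 5’-p-2’ bond, while the intron is still attached to exon 2 through A-G-p-G
Spliced mRNA: 5’ exon 1 G-p-G exon 2 3’, plus the released intron 1 (U-G-5’-p-2’-A ... G-A 3’)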
Recognition of splice sites
• invariant GU and AG dinucleotides at intron ends
• donor (upstream) and acceptor (downstream) splice sites
are within conserved consensus sequences
donor (5’) splice site branch site acceptor (3’) splice site
G/GUAAGU..................…A.......…YYYYYNYAG/G
U1 U2
= hnRNP proteins
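The splice-site rules above can be written as simple pattern checks. This is a sketch only, assuming the consensus exactly as printed on the slide (GU at the 5’ end of the intron, YYYYYNYAG at its 3’ end); the function names and example introns are hypothetical, and real splice-site recognition is far more tolerant of deviations.

```python
import re

# Last nine intron bases of the acceptor consensus: YYYYYNYAG (Y = U or C).
ACCEPTOR = re.compile(r"[UC]{5}[ACGU][UC]AG$")


def has_invariant_ends(intron: str) -> bool:
    """True if the intron starts with GU and ends with AG."""
    rna = intron.upper()
    return rna.startswith("GU") and rna.endswith("AG")


def has_consensus_acceptor(intron: str) -> bool:
    """True if the 3' end of the intron matches YYYYYNYAG exactly."""
    return bool(ACCEPTOR.search(intron.upper()))


# Hypothetical introns, invented for illustration only.
good = "GUAAGUCCCCACUAACUUUCUUUCAG"
weak = "GUAAGUAAAAAAGAAG"
print(has_invariant_ends(good), has_consensus_acceptor(good))
print(has_invariant_ends(weak), has_consensus_acceptor(weak))
```

The second intron has the invariant GU and AG but no pyrimidine-rich run before the AG, so it passes the first check and fails the second.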
Spliceosome assembly
Step 1: binding of U1 and U2 snRNPs (U1 at the 5’ splice site, U2 at the branch site 2’OH-A)
Step 2: binding of U4, U5, U6
Step 3: U1 is released, then U4 is released
Step 4: U6 binds the 5’ splice site and the two splicing reactions occur, catalyzed by U2 and U6 snRNPs
Products: spliced mRNA (5’ exon 1 G-p-G exon 2 3’) and the excised intron 1 (U-G-5’-p-2’-A), shown with U2, U5, and U6
Frequency of bases in each position of the splice sites
Donor sequences
            exon        |             intron
%A    30  40  64   9  |   0   0  62  68   9  17  39  24
%U    20   7  13  12  |   0 100   6  12   5  63  22  26
%C    30  43  12   6  |   0   0   2   9   2  12  21  29
%G    19   9  12  73  | 100   0  29  12  84   9  18  20
               A   G  |   G   U   A   A   G   U
Acceptor sequences
                              intron                              |  exon
%A    15  10  10  15   6  15  11  19  12   3  10  25   4 100   0  |  22  17
%U    51  44  50  53  60  49  49  45  45  57  58  29  31   0   0  |   8  37
%C    19  25  31  21  24  30  33  28  36  36  28  22  65   0   0  |  18  22
%G    15  21  10  10  10   6   7   9   7   7   5  24   1   0 100  |  52  25
       Y   Y   Y   Y   Y   Y   Y   Y   Y   Y   Y   N   Y   A   G  |   G
Polypyrimidine tract (Y = U or C; N = any nucleotide)
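A consensus sequence is a summary of a frequency table like the ones above. The sketch below derives the donor consensus from the donor percentages by taking the most frequent base at each position and reporting it only if it reaches a cutoff. The 60% cutoff and the function name are assumptions chosen for this illustration, not a rule stated on the slide.

```python
# Donor-site base frequencies (%), copied from the donor table; 12 positions.
DONOR = {
    "A": [30, 40, 64, 9, 0, 0, 62, 68, 9, 17, 39, 24],
    "U": [20, 7, 13, 12, 0, 100, 6, 12, 5, 63, 22, 26],
    "C": [30, 43, 12, 6, 0, 0, 2, 9, 2, 12, 21, 29],
    "G": [19, 9, 12, 73, 100, 0, 29, 12, 84, 9, 18, 20],
}


def consensus(freqs: dict[str, list[int]], threshold: int = 60) -> str:
    """Most frequent base per position, or '.' if it falls below the threshold."""
    n_positions = len(next(iter(freqs.values())))
    letters = []
    for i in range(n_positions):
        base, pct = max(((b, col[i]) for b, col in freqs.items()), key=lambda x: x[1])
        letters.append(base if pct >= threshold else ".")
    return "".join(letters)


print(consensus(DONOR))
```

With this cutoff the eight informative positions give AGGUAAGU, matching the consensus row under the donor table, and the flanking positions are left undefined.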
Mutations that disrupt splicing
• β0-thalassemia - no β-chain synthesis
• β+-thalassemia - some β-chain synthesis
Intron 2 acceptor site β0 mutation: no use of mutant site; use of cryptic splice site in intron 2
Translation of the retained portion of intron 2 results in premature termination of translation due to a stop codon within the intron, 15 codons from the cryptic splice site
[Figure labels: Exon 1, Intron 1, Exon 2, Intron 2]
cryptic acceptor site: UUUCUUUCAG/G
Donor site: /GU    AG/: Normal acceptor site (used 10% of the time in β+ mutant)
Translation of the retained portion of intron 1 results in termination at a stop codon in intron 1
Exon 1 β+ (Hb E) mutation creates a new donor splice site: use of both sites
[Figure labels: Exon 2, Intron 2, Exon 3]
/GU: Normal donor site (used 60% of the time when exon 1 site is mutated)
The incorrect splicing results in a frameshift and translation terminates at a stop codon in exon 2
Patterns of alternative exon usage
• one gene can produce several (or numerous) different
but related protein species (isoforms)
Cassette
Mutually exclusive
Alternative promoters
The Troponin T (muscle protein) pre-mRNA
is alternatively spliced to give rise to
64 different isoforms of the protein
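One way a single pre-mRNA can reach 64 isoforms is by combining independent splicing choices. The sketch below enumerates the combinations under an assumption that is not stated on the slide: five cassette exons, each independently included or skipped, plus one pair of mutually exclusive exons. The exon counts and the function name are illustrative only.

```python
from itertools import product

N_CASSETTE = 5   # assumption: five cassette exons, each included or skipped
N_EXCLUSIVE = 2  # assumption: one pair of mutually exclusive exons


def enumerate_isoforms(n_cassette: int, n_exclusive: int) -> list[tuple]:
    """List every (cassette inclusion pattern, exclusive exon choice) combination."""
    cassette_patterns = product((False, True), repeat=n_cassette)
    return [(pattern, choice)
            for pattern in cassette_patterns
            for choice in range(n_exclusive)]


isoforms = enumerate_isoforms(N_CASSETTE, N_EXCLUSIVE)
print(len(isoforms))  # 2**5 patterns x 2 choices
```

Under these assumptions the count is 2^5 × 2 = 64; other combinations of cassette and mutually exclusive exons could give the same total.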