Lecture1 20060306 Kang
Lecture 1:
Overview
Introduction to Bioinformatics National Genome Information Center Spring 2006 (original author: Dr. Eric Rouchka)
Suggested Texts
Bioinformatics: Sequence and Genome
Analysis. David Mount. 2001. ISBN:
0-87969-608-7.
Biological Sequence Analysis: Probabilistic
models of proteins and nucleic acids. R.
Durbin, S. Eddy, A. Krogh and G. Mitchison.
1998. ISBN: 0-521-62971-3.
Other Reference Books
What is Bioinformatics/
Computational Biology?
• Bioinformatics: collection and storage of biological
information
What is Bioinformatics?
Source: http://ccb.wustl.edu/
Why should I care?
• SmartMoney ranks
Bioinformatics as #1 among
next HotJobs
• Important information
waiting to be decoded!
http://smartmoney.com/consumer/index.cfm?story=working-june02
Why is bioinformatics hot?
• Supply/demand: few people adequately
trained in both biology and computer science
What skills are needed?
• Well-grounded in one of the following
areas:
– Computer science
– Molecular biology
– Statistics
Where Can I Learn More?
• ISCB: http://www.iscb.org/
• NCBI: http://ncbi.nlm.nih.gov/
• http://www.bioinformatics.org/
• Journals
• Conferences (ISMB, RECOMB, PSB…)
Overview of Molecular Biology
• Cells
• Chromosomes
• DNA
• RNA
• Amino Acids
• Proteins
• Genome/Transcriptome/Proteome
Cells
• Complex system enclosed
in a membrane
• Humans:
– 60 trillion cells
– 320 cell types
Organisms
• Classified into two types:
Chromosomes
• In eukaryotes, the nucleus
contains one or several
double-stranded DNA
molecules organized as
chromosomes
• Humans:
– 22 Pairs of autosomes
– 1 pair sex chromosomes
Human Karyotype
http://avery.rutgers.edu/WSSP/StudentScholars/
Session8/Session8.html
Image source: www.biotec.or.th/Genome/whatGenome.html
What is DNA?
• DNA: Deoxyribonucleic Acid
• 4 different nucleotides:
– Adenine (A)
– Cytosine (C)
– Guanine (G)
– Thymine (T)
Nucleotide Bases
• Purines (A and G)
• Pyrimidines (C and T)
• Difference is in base structure
DNA
• Can be thought of as an alphabet with 4
characters
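Because DNA can be read as a string over a 4-letter alphabet, ordinary string handling is enough to work with it. The following is a minimal Python sketch; the names `is_dna` and `base_counts` and the example sequence are illustrative assumptions, not part of the lecture.

```python
from collections import Counter

# The four-letter DNA alphabet.
DNA_ALPHABET = set("ACGT")

def is_dna(seq):
    """Return True if every character of seq is one of A, C, G, T."""
    return set(seq.upper()) <= DNA_ALPHABET

def base_counts(seq):
    """Count how often each of the four bases occurs in seq."""
    counts = Counter(seq.upper())
    return {base: counts[base] for base in "ACGT"}

example = "GTAAAGTCCCGTTAGC"  # arbitrary example sequence
print(is_dna(example))
print(base_counts(example))
```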
DNA polynucleotides (oligomers)
• Different nucleotides are
strung together to form
polynucleotides
• Ends of the
polynucleotide are
different
• A directionality is
present
• Convention is to label
the coding strand from
5’ to 3’
http://www.emc.maricopa.edu/faculty/farabee/BIOBK/BioBookDNAMOLGEN.html
Single Strand Polynucleotide
Example polynucleotide:
5’ GTAAAGTCCCGTTAGC 3’
Double Stranded DNA
• DNA can be single-stranded or double-stranded
• Complementary bases:
– A, T
– C, G
Double Stranded Sequence
Example double stranded polynucleotide:
5’ GTAAAGTCCCGTTAGC 3’
   ||||||||||||||||
3’ CATTTCAGGGCAATCG 5’
http://www.emc.maricopa.edu/faculty/farabee/BIOBK/BioBookDNAMOLGEN.html
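The pairing rule (A with T, C with G) means one strand determines the other. A minimal Python sketch, assuming an uppercase or lowercase string of A, C, G and T; the function names are illustrative:

```python
# Map each base to its complementary base (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Complement each base, keeping the order (the partner strand read 3' to 5')."""
    return seq.upper().translate(COMPLEMENT)

def reverse_complement(seq):
    """The partner strand written in the conventional 5' to 3' direction."""
    return complement(seq)[::-1]

top = "GTAAAGTCCCGTTAGC"
print(complement(top))          # partner strand, read 3' to 5'
print(reverse_complement(top))  # partner strand, read 5' to 3'
```

Applied to the example strand, `complement` gives the lower strand as drawn (3’ to 5’), while `reverse_complement` gives the same strand in the 5’ to 3’ orientation that sequence databases normally use.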
Double Stranded DNA
Double Helix
• Two complementary DNA strands form a stable DNA
double helix
RNA
• Ribonucleic Acid
• Similar to DNA
RNA
• RNA is generally single stranded
RNA secondary
structure
• E. coli RNase P
RNA secondary
structure
mRNA
• Messenger RNA
mRNA processing
• Eukaryotic genes can be pieced together
– Exons: coding regions
– Introns: non-coding regions
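Splicing can be pictured as cutting the introns out of the primary transcript and joining the exons in order. A minimal Python sketch; the `splice` function, the toy transcript and the exon coordinates are all made up for illustration and do not come from a real gene:

```python
def splice(pre_mrna, exons):
    """Join the exon intervals (0-based, end-exclusive) of pre_mrna in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)

# Hypothetical transcript: exon 1 + intron + exon 2.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
exons = [(0, 6), (18, 24)]  # assumed exon positions for this toy example

print(splice(pre_mrna, exons))
```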
mRNA Processing
tRNA
• Transfer RNA
• Well-defined three-dimensional
structure
tRNA structure
tRNA
• Amino acid attached to each tRNA
Genetic Code
• 4 possible bases (A, C, G, U)
• 3 bases in the codon
• 4 * 4 * 4 = 64 possible codon sequences
• Start codon: AUG
• Stop codons: UAA, UAG, UGA
• 61 codons code for amino acids (including AUG,
which also codes for methionine)
• 20 amino acids – redundancy in genetic code
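The counts above can be checked by enumerating every triplet. A short Python sketch (variable names are illustrative):

```python
from itertools import product

BASES = "ACGU"
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Every ordered triplet of the four RNA bases.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Codons that specify an amino acid (everything except the three stops).
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons))            # 4 * 4 * 4
print(len(sense_codons))      # total minus the stop codons
print("AUG" in sense_codons)  # the start codon also codes for an amino acid
```

With 61 coding codons for only 20 amino acids, most amino acids are necessarily specified by more than one codon.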
20 Amino Acids
• Glycine (G, GLY)
• Alanine (A, ALA)
• Valine (V, VAL)
• Leucine (L, LEU)
• Isoleucine (I, ILE)
• Phenylalanine (F, PHE)
• Proline (P, PRO)
• Serine (S, SER)
• Threonine (T, THR)
• Cysteine (C, CYS)
• Methionine (M, MET)
• Tryptophan (W, TRP)
• Tyrosine (Y, TYR)
• Asparagine (N, ASN)
• Glutamine (Q, GLN)
• Aspartic acid (D, ASP)
• Glutamic Acid (E, GLU)
• Lysine (K, LYS)
• Arginine (R, ARG)
• Histidine (H, HIS)
• START: AUG
• STOP: UAA, UAG, UGA
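The one-letter codes make it easy to write a protein as a string. As a rough sketch, the Python below translates an mRNA using the standard genetic code, starting at the first AUG and stopping at the first in-frame stop codon. The compact table layout (bases in the order U, C, A, G, with `*` marking a stop) and the example mRNA are choices made for this illustration; input is assumed to be an uppercase string of A, C, G and U.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(mrna):
    """Translate from the first AUG up to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("GGAUGGCCUAUUGGUAAGG"))  # hypothetical mRNA
```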
Amino Acids
• building blocks for proteins (20 different)
• vary by side chain groups
Proteins
• Polypeptides having a three-dimensional structure.
• Primary: sequence of amino acids constituting the polypeptide
chain
• Secondary: local organization into secondary structures
such as helices and sheets
• Tertiary: three-dimensional arrangement of the amino acids
as they react to one another due to the polarity and resulting
interactions between their side chains
• Quaternary: number and relative positions of the protein
subunits
Protein Structure
Central Dogma
DNA → RNA → PROTEIN
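The first step of this flow, DNA to RNA, can be sketched in Python. The assumption here is that the input is the coding strand written 5’ to 3’, so the mRNA has the same sequence with U in place of T; the function name and example sequence are illustrative.

```python
def transcribe(coding_strand):
    """Return the mRNA matching the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")

print(transcribe("ATGGCCTAA"))  # hypothetical coding-strand fragment
```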
Central Dogma
What is a Gene?
• the physical and functional unit of
heredity that carries information from
one generation to the next
Genome
• chromosomal DNA of an organism
Genome Comparison
Transcriptome
• complete collection of all possible mRNAs
(including splice variants) of an organism.
Proteome
• the complete collection of proteins that
can be produced by an organism.