Databases in Bioinformatics - An Introduction

Uploaded by

Isha Chopra

3

Databases in Bioinformatics - An Introduction

'Knowledge is the eye of desire and can become the pilot of the soul.'
WILL DURANT

INTRODUCTION
The amount of biological information, and the rate at which it accumulates, is increasing
exponentially with the discovery of new and automated sequencing methods and the
development of powerful new technologies for acquiring large-scale genomic and
proteomic datasets. The exponential growth in molecular sequence data started in the
early 1980s, when the methods for DNA sequencing became widely available. The data
generated from various sequencing projects needed to be stored and analysed to
annotate the genes and their products and measure their dynamic interactions. As a
result came the concept of the 'biological database', which stores biological data in an
electronic format. Individual biological databases use standardized formats for their
entries. This tremendous growth in biological data has turned the biological sciences
into a data-rich science. Common examples of biological data are nucleotide sequences
(genes and genomes); protein sequences and motifs; macromolecular structural data
generated from X-ray crystallography and macromolecular NMR; metabolic pathways;
gene expression data (microarrays); protein-protein interactions; and many other types
of data related to biological function and processes. This explosion of biological
sequence data in the early 1980s paved the way for the development of three popular
nucleotide sequence databases: GenBank, maintained by NCBI (National Center for
Biotechnology Information); the EMBL database of the European Molecular Biology
Laboratory; and DDBJ (DNA Data Bank of Japan).

BIOLOGICAL DATABASES
We can define a biological database as a collection of data that is structured, searchable,
updated periodically, and cross-referenced. The database administrator updates these
data from time to time by editing existing data and adding new data. Biological
databases are developed to perform several functions. Some of the main purposes/
functions of biological databases are as follows:
• Databases aid in the systematization of results from biological experiments and
analysis. All the biological data obtained through experiments or analysis are
useful for future work. So databases help to organize and store all known data,
which prevents recomputing and duplication of experiments.
• Databases make biological data available to scientists at one place and help them
to obtain data for their research and cross-validation.
• Biological data in databases are available in computer-readable form and this
forms the first fundamental step of biological data analysis.
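To make 'computer-readable form' concrete, here is a minimal Python sketch that reads sequence records written in FASTA format, a plain-text layout in which each record starts with a '>' header line followed by one or more sequence lines. The function name parse_fasta and the two sample records are illustrative assumptions; the identifiers and sequences are made up and do not come from any real database.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return a mapping from record identifier to sequence."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: the identifier is the first word after '>'.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may be wrapped, so append them to the record.
            records[current_id] += line.upper()
    return records


# Hypothetical toy input (identifiers and sequences are made up).
example = """>seq1 toy DNA record
ATGGCC
TTTAA
>seq2 another toy record
ggcatg
""".splitlines()

records = parse_fasta(example)
```

Once records are held in a structure like this, later analysis steps (searching, comparison, annotation) can work on them directly.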

History of Biological Databases

Initial interest in bioinformatics was propelled by the necessity to create databases of
biological sequences. The first database was created immediately after the insulin
protein sequence was made available in 1956. Insulin was the first protein to be
sequenced and it consists of only 51 amino acid residues. Probably the first
published work on biological sequence databases was the 'Atlas of Protein Sequence
and Structure', 1965, by Margaret Dayhoff et al. It contained the protein sequences
determined at that time, and new editions of the book were published in the 1970s.
The data published in this book became the foundation for the PIR (Protein
Information Resource) database. Then, the first nucleic acid sequence, that of a yeast
tRNA with 77 bases, was also obtained. During this period, three-dimensional (3D)
structures of proteins were studied and the well-known PDB (Protein Data Bank)
was developed. This was the first protein structure database, with only 10 entries in
1972. It has now grown into a large database with over 43,755 entries (as of May
2007). While the initial databases of protein sequences were maintained at
individual laboratories, the development of a consolidated formal protein sequence
database known as Swiss-Prot was initiated in 1986. It now has about 269,293
protein sequences from more than 10,917 organisms (as of May 2007). This
huge selection of divergent database resources is now available for study and
research by both academic institutions and industries. The databases are made
available as public domain information in the larger interest of the research
community through the Internet and CD-ROMs, and they are constantly updated with
additional entries.
The first genome of a free-living organism (viruses aside) was that of Haemophilus
influenzae, published in 1995. A genome is defined as the completely (or almost com-
pletely) determined DNA sequence of the genetic material (chromosomes as well
as any plasmids, mitochondrial DNA, etc.) of an organism. The word is a bit of a
misnomer: a genome is not the same as 'all genes'; it is rather a 'sequence of all
DNA' wherein all genes can be found. Nowadays, scientists are showing more interest
in genome research and they consider the genome a basic requirement for working on a
specific organism. Why are complete genomes so interesting? Knowing the complete
genome of an organism is the first step in the complete mapping of the gene
constituents and processes of the organism. The complete genome is a necessary (but
not sufficient) requirement for understanding an organism.
The analysis of a genome covers many different aspects. A list of the most common
ones follows, but it is evident that entirely novel ways of analysing a complete genome
can be invented. There is great potential for interesting discoveries in the complete
genomes; we probably have just scratched the surface with the few possibilities as
follows:
1. Define the location of genes (coding sequences, regulatory regions); gene pre-
diction (identification).
2. Gene prediction ab initio using software based on rules and patterns. Find
Open Reading Frames (ORFs), with some additional criteria. Fairly simple
for bacteria, very difficult for eukaryotes.
3. Gene identification through alignment with known proteins and EST
(Expressed Sequence Tag) sequences.
4. Gene prediction through similarity with proteins or ESTs in other organisms.
5. Gene prediction through comparison with other genomes; conserved regions
may be coding or regulatory regions.
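The ORF search mentioned in item 2 can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a gene predictor. It scans only the forward strand, treats ATG as the only start codon, and ignores introns, which is one reason the approach is far less adequate for eukaryotes. The function name find_orfs, the min_codons threshold, and the toy sequence are assumptions made for this example.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[int, int, int]]:
    """Return (frame, start, end) for forward-strand ORFs.

    start is the index of the A of the ATG start codon and end is the index
    just past the stop codon. min_codons counts the start and stop codons too.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i + 3 - start) // 3
                if n_codons >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # close the ORF and look for the next start
    return orfs


# Toy sequence (made up): one ORF, ATG AAA TAG, in reading frame 2.
toy = "CCATGAAATAGGG"
orfs = find_orfs(toy)
```

The 'additional criteria' mentioned above would be added as further filters, for example a much larger minimum length or a check of the reverse-complement strand.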

Features of Biological Databases

The various features of biological databases are as follows:
1. Data heterogeneity: There is a great deal of diversity in the data types of a
biological database. Some of the biological data types are listed here.
a. Sequences: Sequence data type includes DNA, RNA, and amino-acid sequences
(proteins). These data have grown enormously due to the availability of automated
sequencers and large-scale sequencing projects such as human, mouse, dog, and
many more genomes.
b. Graphs: Biological data indicating relationships among themselves can be
captured as graphs or networks. Examples of these data are pathway data
(metabolic pathways, signaling pathways, and gene-regulatory networks), genetic
maps, and structured taxonomies.
c. High-dimensional data: High-dimensional data include the data generated
from micro-array experiments that involve thousands of genes, hundreds of
experimental conditions, clustering studies on genes, etc. These data are used for
comparing the behaviour of various biological units in different conditions.
d. Shapes: The data type 'Shapes' consists of 3D molecular structural data. For
example, docking data fall under this category, because the 'docking' behaviour
of molecules at a potential binding site depends on the 3D configuration of
the molecule and the site.
e. Temporal data: Temporal data are useful for studying the dynamics of any
biological system, e.g., electrophysiology recordings, development biology,
protein structure dynamics, cellular structure dynamics, etc.
f. Patterns: Within the genome, there are patterns that characterize biologically
interesting entities. For example, the genome contains patterns associated with
genes (i.e., sequences of particular genes) and with regulatory sequences (that
determine the extent of a particular gene's expression, such as promoters,
transcription factor binding sites, etc.).
g. Model data: Biological phenomena are represented as computational,
mathematical, and statistical models used for parameter estimation, testing, etc.
h. Scalar and vector fields: Charge distribution across the cell surface, calcium and
protein fluxes across the cell surface, etc., are included under this category.
i. Extracted features data: Numerical data extracted from one of the above data
types, or from a combination of them, are put under this category.
2. High-volume data: In addition to being highly heterogeneous, biological data are
voluminous, which supports comprehensive investigation in various fields and
directions.
3. Uncertainty: Biological data have a great deal of uncertainty as they represent
biological phenomena that are observed and assumed (based on some evidence) to
be true. The uncertainty must be modelled and recorded as a part of the data.
4. Data curation: Biological data are collected from various sources across
different structural and functional boundaries, so there is always a chance of
missing links and inconsistencies. Some of these inconsistencies are due to a
lack of knowledge in the field concerned. To reduce them, cross-correlation of
data and its analysis are essential. Automatic data curation is essential for
identifying the missing links because biological databases are growing quickly.
5. Large-scale data integration: Data collected from laboratories worldwide after
years of research, across different structural and functional scales, are integrated
together through a database and made available for use.
6. Data sharing: Biological data generated from an experiment needs to be cross-
verified by other scientists around the world to confirm its reproducibility.
Therefore, data are shared via databases for the scientific community's examina-
tion and inspection.
7. Dynamic and subject to continual change: New data are generated in plenty every
day in various laboratories and sometimes these new data obtained contradict the
old data. So there is a necessity for developing new organizational database
schemes to incorporate any new data.

CLASSIFICATION SCHEMA OF BIOLOGICAL DATABASES

The biological databases can be classified into various categories based on different
criteria such as data type, maintainer status, data access, data source, database design,
and type of organism. These classifications are explained as follows:
• Based on Data Type
1. Genome databases
Human, Mouse, Yeast, C. elegans, Flybase, and Parasites
2. Sequence database
a. Nucleotide databases: Alternative Splicing, EMBL-Bank, Ensembl,
Genomes Server, Genome, MOT, EMBL-Align, Simple Queries,
dbSTS Queries, Parasites, Mutations, and IMGT
b. Protein databases: Swiss-Prot, TrEMBL, InterPro, CluSTr, IPI, GOA,
GO, Proteome Analysis, HPI, IntEnz, TrEMBLnew, SP_ML, NEWT,
and PANDIT
3. Structure databases
PDB, MSD, NDB, FSSP, and DALI
4. Microarray database
ArrayExpress and MIAME
5. Chemical database
ChEBI
6. Pathway database
BRENDA, KEGG, and BioSilico
7. Enzyme database
EC Enzyme Database, Enzyme Nomenclature Database (ExPASy), and
REBASE
8. Disease database
OMIM and OMIA
9. Literature databases
MEDLINE, Software Biocatalog, and Flybase Archives
• Based on Maintainer Status
NCBI, EMBL, and SIB
• Based on Data Access
1. Publicly available
2. Available with copyright
3. Browsing only, accessible but not downloadable
4. Academic, but not freely available
5. Proprietary, commercial
6. Restricted SQL queries against underlying DBMS
• Based on Data Source
1. Primary database (archival)
a. Nucleotide: GenBank, EMBL, DDBJ
b. Protein: UniProt, TrEMBL
c. Structure: PDB
d. Literature: Medline (PubMed)
2. Secondary database (curated)
a. Genomic: RefSeq, TIGR gene indices of human
b. Proteomic: Prosite, Swiss-Prot
• Database Design
Relational and object-oriented
• Organism
Bacteria, virus, human, etc.

Databases Based on Data Type

Biological databases can be broadly classified into nine categories based on the
composition of their data types. These are illustrated in Figure 3.1.

Sequence Databases
Sequence databases are applicable to both nucleic acid sequences (GenBank,
EMBL-Bank, and DDBJ) and protein sequences (Entrez protein, Integr8, proteome FASTA,
and Swiss-Prot). Structure databases deal with macromolecular structure, mainly
proteins. Examples of structure databases are the Molecular Modelling Database
(MMDB), Protein Data Bank (PDB), Gene3D, EMBL-Macromolecular Structure
Database (E-MSD), Topology of Protein Structure (TOPS), etc. All these databases
are explained in detail in the subsequent chapters.

[Figure: database categories branching from 'Databases': sequence database
(nucleotide, protein); genome database; microarray database (transcriptome);
bibliographic database (literature); chemical database; metabolic database
(pathways and enzymes); structure database (3D structure of macromolecules);
disease database; enzyme database]
Figure 3.1 Databases based on different data types

Genome Databases
Genome databases are a repository of whole genome nucleotide sequences of various
organisms - prokaryotes, eukaryotes, and viruses. These databases also provide views
for a variety of genomes, sequence maps with contigs, and integrated genetic and
physical maps along with annotated gene information. For example, Entrez Genome of
NCBI has the genome sequence data for six major organism types: Archaea, Bacteria,
Eukaryotes, Viruses, Viroids, and Plasmids. Genome Information Broker (GIB) is
another database of complete genome sequence data (http://www.gib.genes.nig.ac.jp).

Bibliographic Databases
Bibliographic databases are scientific literature databases consisting of numerous
research papers and articles from various journals. PubMed, available at NCBI, is
the most widely used bibliographic database. PubMed is a special type of database that
helps users stay current with the literature of various subjects. PubMed is maintained by
the National Library of Medicine (NLM) and contains more than 12.8 million
abstracts from 4,400 biomedical and biochemical journals dating as far back as the
1970s. More recently, PubMed has become more fully integrated with NCBI's Entrez
cross-database search system (Wheeler et al. 2005) so that users can see more than
just journal abstracts and titles in response to their text queries.
MEDLINE is the NLM's premier bibliographic database covering the fields of
medicine, nursing, dentistry, veterinary medicine, the health-care system, and the pre-
clinical sciences. MEDLINE contains bibliographic citations and author abstracts
from more than 4,800 biomedical journals published in the United States and 70 other
countries. The database contains over 12 million citations dating back to the mid-
1960s. Coverage is worldwide, but most records are from English-language sources or
have English abstracts.

Microarray Databases
These databases contain data obtained from microarray-based experiments measuring
the abundance of mRNA, genomic DNA, and protein molecules, and also from
non-array-based technologies, such as SAGE and mass spectrometry peptide profiling.
These data are otherwise known as transcriptome data. Examples of such databases
are GEO (Gene Expression Omnibus), Gensat, ArrayExpress, Cancer Gene
Expression Database (CGED), Human Gene Expression Index (HuGE Index), etc.

Metabolic Databases
Metabolic databases contain data on biochemical pathways and enzymes in different
organisms. KEGG and MetaCyc are the noteworthy metabolic databases. Organism-
specific databases hold data on individual organisms; examples are EcoCyc, Flybase,
and CCDB. All these databases are elaborated in the subsequent chapters.

Chemical Databases
These databases store chemical information on various molecules. For example,
PubChem of NCBI contains substance descriptions of small molecules with fewer
than 1,000 atoms and 1,000 bonds.

Structure Databases
Structure databases include data on the 3D structure of nucleic acids and proteins. The
data types found in these databases are crystallographic or NMR coordinate data,
structure factors for the X-ray structures or constraint files for the NMR structures, and
information about the experiments used to determine the structures, such as crystal-
lization information, data collection, and refinement statistics. Examples of such
nucleic acid databases are NDB (Nucleic Acid Database) and SCOR (Structural
Classification of RNA). PDB (Protein Data Bank) is the most popular repository of 3D
structures of proteins obtained either by NMR or X-ray crystallography.

Disease Databases
These are dedicated sources of disease-related information. For example, OMIM
(Online Mendelian Inheritance in Man) provides data about human genes and genetic
disorders. Genetic Association Database is another popular disease database, contain-
ing data on human genetic association studies of complex diseases and disorders.
Databases Based on Data Source
There are two general classes of biological databases based on their sources
of biological data - (i) Archival or Primary database and (ii) Curated or Secondary
database.
Primary Databases
Primary or Archival databases accept or include original data from researchers with
relatively little checking or validation. They contain original submissions by research-
ers. Most of the archival databases are public and offer open access to the research
community for annotation purposes. GenBank and EMBL-Bank are examples of
primary nucleic acid databases. These are the nucleotide sequence databases of the
National Center for Biotechnology Information (NCBI) and the European Molecular
Biology Laboratory (EMBL), respectively (see Table 3.1). Primary protein sequence
databases are UniProt, PIR, Swiss-Prot, and TrEMBL; primary structure databases
include PDB and the Nucleic Acid Database (NDB).
Table 3.1 Primary databases, their descriptions and web interfaces

Database | Description | Web interface
GenBank | Nucleotide sequence database of NCBI | http://www.ncbi.nih.gov/Genbank/
EMBL | Database of European Molecular Biology Laboratory | http://www.ebi.ac.uk/embl
DDBJ | DNA Data Bank of Japan | http://www.ddbj.nig.ac.jp
UniProt | Universal Protein Resource | http://www.uniprot.org
PIR | Protein Information Resource | http://pir.georgetown.edu/
Swiss-Prot | Manually curated protein-only sequence database | http://www.expasy.ch
TrEMBL | Translated EMBL, a very large protein database in Swiss-Prot format |
PDB | Protein Data Bank - repository for 3D structure of macromolecules | http://www.rcsb.org/pdb
NDB | Nucleic Acid Database | http://www.ndbserver.ru...edu/

Secondary Databases
Secondary databases contain results from the analysis of entries in the primary
databases (some of the databases are listed in Table 3.2). These databases are either
manually curated or automatically generated. They contain information such as the
conserved sequence, signature sequence, and active-site residues of the protein families
arrived at by multiple sequence alignment of a set of related proteins.

Table 3.2 Secondary databases, their descriptions and web interfaces

Database | Description | Web interface
Prosite | Database of protein domains, families, and functional sites | http://www.expasy.org/prosite
Pfam | Protein family database | http://www.sanger.ac.uk/Software/Pfam/
Blocks | Conserved regions in proteins | http://www.blocks.fhcrc.org
Prints | Conserved motifs used in characterizing a protein family | http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/
SCOP | Structural Classification of Proteins | http://www.scop.mrc-lmb.cam.ac.uk/scop
CATH | Hierarchical classification database of protein domain structures clustered at four levels of Class (C), Architecture (A), Topology (T), and Homologous superfamily (H) | http://www.cathdb.info
ProDom | Database of protein domain families generated from Swiss-Prot and TrEMBL | http://www.prodom.prabi.fr/prodom/current/html/home.php
e-Motif | Database of highly specific and sensitive protein sequence motifs, representing conserved biochemical properties and biological functions | http://motif.stanford.edu/emotif/
OMIM | Online Mendelian Inheritance in Man | http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
TransFac | Database of eukaryotic transcription factors, their genomic binding sites, and DNA-binding profiles | http://www.gene-regulation.com/pub/databases.html#transfac
KEGG | Metabolic and regulatory pathways in complete genomes | http://www.genome.jp/kegg/pathway.html
MetaCyc | Metabolic pathways and enzymes from various organisms | http://www.metacyc.org

Curated databases are maintained by one or more curators who select, input, or invite
only the 'highest quality' data from the selected research centres and database communities.
The quality of the data is of utmost importance, whereas the quantity of data being
deposited is secondary. Swiss-Prot is an example of a curated sequence database.
58 Bioinformatics: Principles and Applications

Composite Database
A composite database combines different primary database sources (see Table 3.3),
which makes querying and searching multiple resources more efficient. Although these
databases are compiled from various primary databases, non-redundancy is maintained
by filtering multiple data from the different primary database sources. The best-known
examples of composite databases are OWL, Non-Redundant Database (NRDB), and
BioSilico. OWL is a non-redundant composite of four publicly available primary sources:
Swiss-Prot, PIR, GenBank (translation), and NRL-3D.
Table 3.3 Composite databases, their descriptions and web interfaces

Composite database | Description | Web interface
OWL | Composite of Swiss-Prot, PIR, GenBank (translation), and NRL-3D | http://www.bioinf.man.ac.uk/dbbrowser/OWL/
BioSilico | Integrated metabolic database consisting of LIGAND, ENZYME, EcoCyc, and MetaCyc | http://www.biosilico.kaist.ac.kr/
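The non-redundancy filtering described for composite databases can be illustrated with a minimal Python sketch. It assumes the simplest possible criterion, exact sequence identity after normalizing case and whitespace. The text does not specify the filtering rules used by OWL, NRDB, or BioSilico, so the function name build_composite, the record layout, and the entry identifiers below are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# (source database, entry identifier, sequence)
Record = Tuple[str, str, str]


def build_composite(records: Iterable[Record]) -> Dict[str, List[Tuple[str, str]]]:
    """Merge records from several sources, storing each distinct sequence once.

    The result maps a sequence to the list of (source, identifier) pairs
    that reported it, so provenance is kept while duplicates are collapsed.
    """
    composite: Dict[str, List[Tuple[str, str]]] = {}
    for source, entry_id, sequence in records:
        key = sequence.strip().upper()  # normalize before comparing
        composite.setdefault(key, []).append((source, entry_id))
    return composite


# Hypothetical entries: the first two carry the same protein sequence.
records = [
    ("Swiss-Prot", "SP_A", "MKTAYIAK"),
    ("PIR", "PIR_A", "mktayiak"),
    ("GenBank (translation)", "GB_B", "MSLLNQ"),
]
composite = build_composite(records)
```

Keeping the list of contributing sources for each sequence means that a single query against the composite can still point back to every primary entry.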
BIOLOGICAL DATABASE RETRIEVAL SYSTEMS
The amount of biological data is increasing rapidly day by day and so we should know
how to access and search this digitized information. The tools used to retrieve
biological data are termed biological data retrieval systems. These
systems allow text searching of multiple molecular biology databases and provide
links to relevant information for entries that match the search criteria. Here, we are
mainly concerned with two widely used and powerful data retrieval systems, viz.,
Sequence Retrieval System (SRS) and Entrez.
SRS (http://www.srs.ebi.ac.uk) is a homogeneous interface to over 80 biological
databases that provides data integration, analysis, and display tools for bioinformatics,
genomics, and related data. This system was developed in 1993 by Thure Etzold. SRS
is based on pre-made indexes of the items (words, entries, data fields, text, ...) found in
a set of documents (database files). It includes databases of sequences, metabolic
pathways, transcription factors, application results (such as BLAST, SSEARCH,
FASTA, etc.), protein 3D structures, genomes, mappings, mutations, and locus-
specific mutations. The SRS helps bioinformaticians and biotechnologists in various
ways, such as the following:
1. Fast access to diverse life science data - genetic, protein, cellular, molecular,
and clinical - for researchers and bioinformaticians.
2. Integration of public and proprietary data through one interface.
3. Unique ability to perform cross-database queries.
4. Rapid string search of large volumes of data.
5. Seamless integration of data and analysis tools.
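The idea of searching through pre-made indexes can be sketched with a small inverted index in Python. This is only an illustration of the general technique and is not SRS's actual implementation. The function names build_index and search and the three entries are assumptions made for this example.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def tokenize(text: str):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(entries: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map every word to the set of entry identifiers that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for entry_id, text in entries.items():
        for word in tokenize(text):
            index[word].add(entry_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return the entries that contain all words of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical entry descriptions keyed by made-up identifiers.
entries = {
    "E1": "Human insulin precursor",
    "E2": "Yeast alanine tRNA",
    "E3": "Bovine insulin",
}
index = build_index(entries)
```

Because the index is built once in advance, a query only looks up the sets for its words and intersects them instead of scanning every database file.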
Entrez (http://www.ncbi.nlm.nih.gov/Entrez) is a molecular biology database and
retrieval system developed by the National Center for Biotechnology Information
(NCBI) (http://www.ncbi.nlm.nih.gov). It is an entry point for exploring distinct but
integrated databases. Of these two text-based database retrieval systems, Entrez
is easier to use, but offers more limited information to search.
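Besides its web pages, Entrez can be queried by programs through NCBI's E-utilities web service, which this chapter does not otherwise cover. The Python sketch below only builds an ESearch request URL and does not contact the server. The helper name build_esearch_url and the search term are assumptions made for illustration, and a real client should follow NCBI's usage guidelines.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(database: str, term: str, max_results: int = 20) -> str:
    """Build a text-search URL for one Entrez database (no request is sent)."""
    params = {"db": database, "term": term, "retmax": max_results}
    return ESEARCH_URL + "?" + urlencode(params)


# Example: search PubMed for a free-text phrase, asking for five identifiers.
url = build_esearch_url("pubmed", "biological databases", max_results=5)
```

Changing the db parameter directs the same text query to another Entrez database, which reflects the 'distinct but integrated' design described above.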
SUMMARY
Biological research has become an interdisciplinary area applying physics, statistics,
information technology, and many more scientific fields to cultivate the maximum out
of it. Most significantly, it has become an information-driven science for any kind of
analysis. The vast amounts of biological sequence or structure data generated from
various molecular biology methods have become the cornerstone of bioinformatics.
Unlike in many other disciplines, these results are not only published in research
papers, but also in a structured form within several publicly accessible databases.
Although these data describe a unique array of information on biological entities such
as DNA, proteins, and small molecules, a large proportion of salient information still
remains hidden, which needs to be explored by accessing these data from the databases.
Hence, biological databases play a vital role in biological research from the
fundamental molecular biology to the applied areas of biology such as medicine,
disease biology, agriculture, biochemical industries, etc.
REVIEW QUESTIONS
1. What is a database? Why does one need a biological database?
2. What is the contribution of Margaret Dayhoff in biological database development?
3. Data heterogeneity is very common in bio-databases. Justify it.
4. How do you classify bio-databases based on their source of data?
5. What is SRS?
6. Define a composite database with an example.

No ratings yet
Stepwells - Reading 3: Web: Anhngulinhnam - Edu.vn// Fanpage: Ieltsthaytu
13 pages
Buildings - PPT Version 1
No ratings yet
Buildings - PPT Version 1
46 pages
Interactive Schematic: This Document Is Best Viewed at A Screen Resolution of 1024 X 768
100% (1)
Interactive Schematic: This Document Is Best Viewed at A Screen Resolution of 1024 X 768
39 pages
Chapter 4
No ratings yet
Chapter 4
46 pages
Citizen C700 Manual
No ratings yet
Citizen C700 Manual
25 pages
Channel Distribution of Lays Mk
No ratings yet
Channel Distribution of Lays Mk
50 pages
Dear Muhammad,: This Paper Argues That Lukacs Notion of Searching For Meaning Is Exemplified Inby Leo Tolstoy's Anna
No ratings yet
Dear Muhammad,: This Paper Argues That Lukacs Notion of Searching For Meaning Is Exemplified Inby Leo Tolstoy's Anna
5 pages
Seasonal Rainfall Trend Analysis Review
No ratings yet
Seasonal Rainfall Trend Analysis Review
9 pages
ES Unit - 7 Human Communities and Environment
No ratings yet
ES Unit - 7 Human Communities and Environment
4 pages
Customizing Mars Instructions v0.2
No ratings yet
Customizing Mars Instructions v0.2
22 pages
Review of Related Literature and Study
0% (2)
Review of Related Literature and Study
10 pages
Schacht - Nietzsche On Interpretation and Truth
No ratings yet
Schacht - Nietzsche On Interpretation and Truth
12 pages
Earth Movements. Monica
100% (1)
Earth Movements. Monica
4 pages
Search Agents Uninformed Search: Artificial Intelligence
No ratings yet
Search Agents Uninformed Search: Artificial Intelligence
48 pages
January - Hirschsprung's Disease in Africa 21 Century
No ratings yet
January - Hirschsprung's Disease in Africa 21 Century
27 pages
Eddy Current Testing
No ratings yet
Eddy Current Testing
39 pages
A Displacement-Based Adaptive Pushover For Assessment of Buildings and Bridges - Rui Pinho, Et Al, 2006
No ratings yet
A Displacement-Based Adaptive Pushover For Assessment of Buildings and Bridges - Rui Pinho, Et Al, 2006
16 pages
Foro
No ratings yet
Foro
12 pages
10 1109@blockchain 2019 00003
No ratings yet
10 1109@blockchain 2019 00003
5 pages
TorayTrak OneStage PTotal v3.1.5
No ratings yet
TorayTrak OneStage PTotal v3.1.5
77 pages
Aitken Et Al 2018 - A Role For Data Richness Mapping in Exploration Decision Making
No ratings yet
Aitken Et Al 2018 - A Role For Data Richness Mapping in Exploration Decision Making
13 pages
Operation & SCM Implementation of Supply Chain Management at Hindustan Unilever
No ratings yet
Operation & SCM Implementation of Supply Chain Management at Hindustan Unilever
28 pages
Tds CA1010 Mastinox PPG
No ratings yet
Tds CA1010 Mastinox PPG
2 pages

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.

History of Biological Databases

Features of Biological Databases

CLASSIFICATION SCHEMA OF BIOLOGICAL DATABASES

Databases Based on Data Type

Figure 3.1 Databases based on different data types

and Swiss-Prot). Structure database deals with macromolecular structure, mainly

abstracts from 4,400 biomedical

contain results from the analysis of entries

of protein domains,

manually curated or automatically generated. They contain information such as the

Composite Web interfaces
