Databases in Bioinformatics - An Introduction

Uploaded by

Isha Chopra

3

Databases in Bioinformatics -
An Introduction

'Knowledge is the eye of desire and can become the pilot of the soul.'
WILL DURANT

INTRODUCTION
The amount of biological information, and the rate at which it accumulates, is
increasing exponentially with the discovery of new and automated sequencing methods
and the development of powerful new technologies for acquiring large-scale genomic and
proteomic datasets. The exponential growth in molecular sequence data started in the
early 1980s, when methods for DNA sequencing became widely available. The data
generated from various sequencing projects needed to be stored and analysed to
annotate the genes and their products and to measure their dynamic interactions. As a
result came the concept of the 'biological database', which stores biological data in an
electronic format. All the biological databases use standardized formats. This
tremendous growth in biological data has turned the biological sciences into a data-rich
science. Common examples of biological data are nucleotide sequences (genes and
genomes); protein sequences and motifs; macromolecular structural data generated from
X-ray crystallography and macromolecular NMR; metabolic pathways; gene expression
data (microarrays); protein-protein interactions; and many other types of data related to
biological function and processes. This explosion of biological sequence data in the early
1980s paved the way for the development of three popular nucleotide sequence
databases: GenBank at NCBI (National Center for Biotechnology Information), the EMBL
(European Molecular Biology Laboratory) database, and DDBJ (DNA Data Bank of Japan).

BIOLOGICAL DATABASES
We can define a biological database as a collection of data that is structured, searchable,
updated periodically, and cross-referenced. The database administrator updates these
data from time to time by editing existing data and adding new data.
Bioinformatics: Principles and Applications

Biological databases are developed to perform several functions. Some of the main
purposes/functions of biological databases are as follows:
• Databases aid in the systematization of results from biological experiments and
analysis. All the biological data obtained through experiments or analysis are
useful for future work. So databases help to organize and store all known data,
which prevents recomputing and duplication of experiments.
• Databases make biological data available to scientists at one place and help them
to obtain data for their research and cross-validation.
• Biological data in databases are available in computer-readable form and this
forms the first fundamental step of biological data analysis.

History of Biological Databases

Initial interest in bioinformatics was propelled by the necessity to create databases of
biological sequences. The first database was created immediately after the insulin
protein sequence was made available in 1956. Insulin was the first protein to be
sequenced and this consisted of only 51 amino acid residues. Probably, the first
published work on the biological sequence databases was 'Atlas of Protein Sequence
and Structure', 1965, by Margaret Dayhoff et al. It contained the protein sequences
determined at that time, and new editions of the book were published in the 1970s.
The data published in this book became the foundation for the PIR (Protein
Information Resource) database. Then, the first nucleic acid sequence, that of yeast
tRNA with 77 bases, was also obtained. During this period, three-dimensional (3D)
structures of proteins were studied and the well-known PDB (Protein Data Bank)
was developed. This was the first protein structure database, with only 10 entries in
1972. This has now grown into a large database with over 43,755 entries (as on May
2007). While the initial databases of protein sequences were maintained at the
individual laboratories, the development of a consolidated formal protein sequence
database known as Swiss-Prot was initiated in 1986, which now has about 269,293
protein sequences from more than 10,917 model organisms (as on May 2007). This
huge selection of divergent database resources is now made available for study and
research by both academic institutions and industries. These are made available as
public domain information in the larger interest of the research community through
the Internet and CD-ROMs. These databases are constantly updated with additional
entries.
The first genome of a free-living organism (viruses aside) was that of Haemophilus
influenzae, published in 1995. A genome is defined as the completely (or almost
completely) determined DNA sequence of the genetic material (chromosomes as well
as any plasmids, mitochondrial DNA, etc.) of an organism. The word is a bit of a
misnomer: a genome is not the same as 'all genes'; it is rather a 'sequence of all
DNA' wherein all genes can be found. Nowadays, scientists are showing more interest
in genome research and they consider the genome a basic requirement for working on a
specific organism. Why are complete genomes so interesting? Knowing the complete
genome for an organism is the first step in the complete mapping of the gene
constituents and processes of the organism. The complete genome is a necessary (but
not sufficient) requirement for understanding an organism.
The analysis of a genome covers many different aspects. A list of the most common
ones follows, but it is evident that entirely novel ways of analysing a complete genome
can be invented. There is great potential for interesting discoveries in the complete
genomes; we probably have just scratched the surface with the few possibilities as
follows:
1. Define the location of genes (coding sequences, regulatory regions); gene
prediction (identification).
2. Gene prediction ab initio using software based on the rules and patterns. Find
Open Reading Frames (ORFs), with some additional criteria. Fairly simple
for bacteria, very difficult for eukaryotes.
3. Gene identification through alignment with known proteins and EST
(Expressed Sequence Tag) sequences.
4. Gene prediction through similarity with proteins or ESTs in other organisms.
5. Gene prediction through comparison with other genomes; conserved regions
may be coding or regulatory regions.
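
The ORF search mentioned in item 2 can be illustrated with a short sketch. It is not the method of any particular gene-prediction program: it assumes the standard start codon ATG and stop codons TAA, TAG and TGA, scans only the three forward-strand reading frames, and uses a hypothetical `min_codons` cut-off as the "additional criterion". Real tools also scan the reverse strand and, for eukaryotes, must model introns.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return sorted (start, end, frame) tuples for ATG...stop ORFs.

    Only the forward strand is scanned. `end` is exclusive and includes
    the stop codon; `min_codons` counts the start and stop codons too.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the current open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)
```

For example, `find_orfs("CCATGAAATAGGG")` reports a single ORF in frame 2 covering `ATGAAATAG`. Raising `min_codons` is the simplest way to suppress short, probably spurious ORFs in bacterial sequences.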

Features of Biological Databases


The various features of biological databases are as follows:
1. Data heterogeneity: There is a great deal of diversity in the data types of a
biological database. Some of the biological data types are listed here.
a. Sequences: Sequence data type includes DNA, RNA, and amino-acid sequences
(proteins). These data have grown enormously due to the availability of automated
sequencers and large-scale sequencing projects such as human, mouse, dog, and
many more genomes.
b. Graphs: Biological data indicating relationships among themselves can be
captured as graphs or networks. Examples of these data are pathway data
(metabolic pathways, signaling pathways, and gene-regulatory networks), genetic
maps, and structured taxonomies.
c. High-dimensional data: High-dimensional data include the data generated
from micro-array experiments that involve thousands of genes, hundreds of
experimental conditions, clustering studies on genes, etc. These data are used for
comparing the behaviour of various biological units in different conditions.
d. Shapes: The data type 'Shapes' consists of 3D molecular structural data. For
example, docking data fall under this category, because the 'docking' behaviour
of a molecule at a potential binding site depends on the 3D configuration of
the molecule and the site.
e. Temporal data: Temporal data are useful for studying the dynamics of any
biological system, e.g., electrophysiology recordings, development biology,
protein structure dynamics, cellular structure dynamics, etc.
f. Patterns: Within the genome, there are patterns that characterize biologically
interesting entities. For example, the genome contains patterns associated with
genes (i.e., sequences of particular genes) and with regulatory sequences (that
determine the extent of a particular gene's expression, such as promoters and
transcription factor binding sites).
g. Model data: The biological phenomena are represented as computational,
mathematical, and statistical models used for parameter estimation and testing.
h. Scalar and vector fields: Charge distribution across cell surface, calcium and
protein fluxes across cell surface, etc., are included under this category.
i. Extracted features data: Numerical data extracted from one of the above data
types, or from a combination of them, are put under this category of data.
2. High-volume data: In addition to being highly heterogeneous, biological data are
voluminous enough to support comprehensive investigation in various fields and
directions.
3. Uncertainty: Biological data have a great deal of uncertainty as they represent
biological phenomena that are observed and assumed (based on some evidence) to
be true. The uncertainty must be modelled and recorded as a part of the data.
4. Data curation: The biological data are collected from various sources across
different structural and functional boundaries and so there is always a chance of
many missing links and inconsistencies. Some of these inconsistencies are due to a
lack of knowledge in the desired field. To improve these data inconsistencies,
cross-correlation of data and its analysis are essential. Automatic data curation
is essential in understanding the missing links as the biological databases are
flourishing quickly.
5. Large-scale data integration: Data collected from laboratories worldwide after
years of research, across different structural and functional scales, are integrated
together through a database and made available for use.
6. Data sharing: Biological data generated from an experiment needs to be cross-
verified by other scientists around the world to confirm its reproducibility.
Therefore, data are shared via databases for the scientific community's examina-
tion and inspection.
7. Dynamic and subject to continual change: New data are generated in plenty every
day in various laboratories and sometimes these new data obtained contradict the
old data. So there is a necessity for developing new organizational database
schemes to incorporate any new data.
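
Features 3 and 7 above (uncertainty recorded as part of the data, and new data that may contradict old data) can be made concrete with a small sketch. The `Annotation` record, its field names, and the 0 to 1 confidence scale are all hypothetical choices made for illustration; they do not come from any specific database schema.

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    """One assertion about a biological entity, stored with its uncertainty."""
    entity_id: str
    statement: str
    evidence: str       # free-text evidence label (hypothetical vocabulary)
    confidence: float   # 0.0 (speculative) to 1.0 (well established)
    sources: list = field(default_factory=list)

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")

def best_supported(annotations):
    """For each entity, keep the annotation with the highest confidence."""
    best = {}
    for ann in annotations:
        current = best.get(ann.entity_id)
        if current is None or ann.confidence > current.confidence:
            best[ann.entity_id] = ann
    return best
```

Keeping every annotation and selecting a view at query time, rather than overwriting old entries, is one simple way to let contradictory results coexist until the evidence settles.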

CLASSIFICATION SCHEMA OF BIOLOGICAL DATABASES


The biological databases can be classified into various categories based on different
criteria such as data type, maintainer status, data access, data source, database design,
and type of organism. These classifications are explained as follows:
• Based on Data Type
1. Genome databases
Human, Mouse, Yeast, C. elegans, Flybase, and Parasites
2. Sequence database
a. Nucleotide databases: Alternative Splicing, EMBL-Bank, Ensembl,
Genomes Server, Genome, MOT, EMBL-Align, Simple Queries,
dbSTS Queries, Parasites, Mutations, and IMGT
b. Protein databases: Swiss-Prot, TrEMBL, InterPro, CluSTr, IPI, GOA,
GO, Proteome Analysis, HPI, IntEnz, TrEMBLnew, SP_ML, NEWT,
and PANDIT
3. Structure databases
PDB, MSD, NDB, FSSP, and DALI
4. Microarray database
ArrayExpress and MIAME
5. Chemical database
ChEBI
6. Pathway database
BRENDA, KEGG, and BioSilico
7. Enzyme database
EC Enzyme Database, Enzyme Nomenclature Database (ExPASy), and
REBASE
8. Disease database
OMIM and OMIA
9. Literature databases
MEDLINE, Software Biocatalog, and Flybase Archives
• Based on Maintainer Status
NCBI, EMBL, and SIB
• Based on Data Access
1. Publicly available
2. Available with copyright
3. Browsing only, accessible but not downloadable
4. Academic, but not freely available
5. Proprietary, commercial
6. Restricted SQL queries against underlying DBMS
• Based on Data Source
1. Primary database (archival)
a. Nucleotide: GenBank, EMBL, DDBJ
b. Protein: UniProt, TrEMBL
c. Structure: PDB
d. Literature: Medline (PubMed)
2. Secondary database (curated)
a. Genomic: RefSeq, TIGR gene indices of human
b. Proteomic: Prosite, Swiss-Prot
• Database Design
Relational and object-oriented
• Organism
Bacteria, virus, human, etc.

Databases Based on Data Type


Biological databases can be broadly classified into nine categories based on the
composition of their data types. These are illustrated in Figure 3.1.

Sequence Databases
Sequence databases are applicable to both nucleic acid sequences (GenBank, EMBL-Bank,
and DDBJ) and protein sequences (Entrez protein, Integr8, proteome FASTA,
and Swiss-Prot). Structure databases deal with macromolecular structure, mainly
proteins. The examples of structure databases are Molecular Modelling Database
(MMDB), Protein Data Bank (PDB), Gene3D, EMBL-Macromolecular Structure
Database (E-MSD), Topology of Protein Structure (TOPS), etc. All these databases
are explained in detail in the subsequent chapters.

Figure 3.1 Databases based on different data types
[Diagram: 'Databases' branches into sequence database (nucleotide (DNA) and protein),
genome database, microarray database (transcriptome), bibliographic database
(literature), chemical database, metabolic database (pathways and enzymes), structure
database (3D structure of macromolecules), disease database, and enzyme database.]

Genome Databases
Genome databases are a repository of whole genome nucleotide sequences of various
organisms - prokaryotes, eukaryotes, and viruses. These databases also provide views
for a variety of genomes, sequence maps with contigs, and integrated genetic and
physical maps along with annotated genes information. For example, Entrez Genome of
NCBI has the genome sequence data for six major organism types: Archaea, Bacteria,
Eukaryotes, Viruses, Viroids, and Plasmids. Genome Information Broker (GIB) is
another database of the complete genome sequence data
(http://www.gib.genes.nig.ac.jp).

Bibliographic Databases
Bibliographic databases are scientific literature databases consisting of numerous
research papers and articles from various journals. PubMed, available at NCBI, is
a widely used bibliographic database. PubMed is a special type of database that
helps to stay current with the literature of various subjects. PubMed is maintained by
the National Library of Medicine (NLM) and contains more than 12.8 million
abstracts from 4,400 biomedical and biochemical journals dating to as far back as the
1970s. More recently, PubMed has become more fully integrated with NCBI's Entrez
cross-database search system (Wheeler et al. 2005) so that the users can see more than
just journal abstracts and titles in response to their text queries.
MEDLINE is the NLM's premier bibliographic database covering the fields of
medicine, nursing, dentistry, veterinary medicine, the health-care system, and the
preclinical sciences. MEDLINE contains bibliographic citations and author abstracts
from more than 4,800 biomedical journals published in the United States and 70 other
countries. The database contains over 12 million citations dating back to the mid-
1960s. Coverage is worldwide, but most records are from English-language sources or
have English abstracts.

Microarray Databases
These databases contain data obtained from microarray-based experiments measuring
the abundance of mRNA, genomic DNA, and protein molecules, and also from
nonarray-based technologies, such as SAGE and mass spectrometry peptide profiling.
These data are otherwise known as transcriptome data. The examples of such
databases are GEO (Gene Expression Omnibus), Gensat, ArrayExpress, Cancer Gene
Expression Database (CGED), Human Gene Expression Index (HuGE Index), etc.

Metabolic Databases
Metabolic databases contain data on biochemical pathways and enzymes in different
organisms. KEGG and MetaCyc are the noteworthy metabolic databases.
Organism-specific databases, such as EcoCyc, Flybase, and CCDB, hold data on an
individual organism. All these databases are elaborated in the subsequent chapters.

Chemical Databases
These databases store chemical information on various molecules. For example,
PubChem of NCBI contains substance descriptions on small molecules with fewer
than 1,000 atoms and 1,000 bonds.

Structure Databases
Structure databases include data on the 3D structure of nucleic acids and proteins. The
data types found in these databases are crystallographic or NMR coordinate data,
structure factors for the X-ray structures or constraint files for the NMR structures, and
information about the experiments used to determine the structures, such as
crystallization information, data collection, and refinement statistics. The examples of
such nucleic acid databases are NDB (Nucleic Acid Database) and SCOR (Structural
Classification of RNA). PDB (Protein Data Bank) is the most popular repository of 3D
structure of proteins obtained either by NMR or X-ray crystallography.
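
To show what 'coordinate data' looks like in practice, here is a minimal sketch that reads one atom record. It assumes the legacy fixed-column PDB text format, in which the atom name, residue name, chain, residue number and x, y, z coordinates occupy fixed character positions of an ATOM or HETATM line; the `Atom` type and the function name are illustrative only, and production code would normally rely on an established parsing library instead.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    name: str
    residue: str
    chain: str
    residue_number: int
    x: float
    y: float
    z: float

def parse_atom_line(line):
    """Parse one fixed-width ATOM/HETATM record; return None for other records."""
    if not line.startswith(("ATOM", "HETATM")):
        return None
    return Atom(
        name=line[12:16].strip(),          # columns 13-16
        residue=line[17:20].strip(),       # columns 18-20
        chain=line[21],                    # column 22
        residue_number=int(line[22:26]),   # columns 23-26
        x=float(line[30:38]),              # columns 31-38
        y=float(line[38:46]),              # columns 39-46
        z=float(line[46:54]),              # columns 47-54
    )
```

Because the format is positional, the fields are sliced by column rather than split on whitespace; splitting fails when adjacent numeric fields run together.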

Disease Databases
These are the exclusive sources for disease-related information. For example, OMIM
(Online Mendelian Inheritance in Man) provides data about human genes and genetic
disorders. Genetic Association Database is another popular disease database,
containing data on human genetic association studies of complex diseases and disorders.
[...] etc. This database [...] involved. The [...] Nomenclature [...]
Databases Based on Data Source
There are two general classes of biological databases based on their source
of biological data - (i) Archival or Primary database and (ii) Curated or Secondary
database.
Primary Database
Primary or Archival databases accept or include original data from researchers with
relatively little checking or validation. They contain original submissions by
researchers. Most of the archival databases are public and offer open access to the
research community for annotation purposes. GenBank and EMBL-Bank are examples of
primary nucleic acid databases. These are the nucleotide sequence databases of the
National Center for Biotechnology Information (NCBI) and the European Molecular
Biology Laboratory (EMBL), respectively (see Table 3.1). Primary protein sequence
databases are UniProt, PIR, Swiss-Prot, and TrEMBL; primary structure databases include
PDB and Nucleic Acid Database (NDB).
Table 3.1 Primary databases, their descriptions and web interfaces

Database    Description                                 Web interface
GenBank     Nucleotide sequence database of NCBI        http://www.ncbi.nih.gov/Genbank/
EMBL        Database of European Molecular Biology      http://www.ebi.ac.uk/embl
            Laboratory
DDBJ        DNA Data Bank of Japan                      http://www.ddbj.nig.ac.jp
UniProt     Universal Protein Resource                  http://www.uniprot.org
PIR         Protein Information Resource                http://pir.georgetown.edu/
Swiss-Prot  Manually curated protein-only sequence      http://www.expasy.ch
            database
TrEMBL      Translated EMBL, a very large protein
            database in Swiss-Prot format
PDB         Protein Data Bank - repository for 3D       http://www.rcsb.org/pdb
            structure of macromolecules
NDB         Nucleic Acid Database                       http://www.ndbserver.ru edu/

Secondary Database
Secondary databases contain results from the analysis of entries in the primary
databases (the databases are listed in Table 3.2). These databases are
manually curated or automatically generated. They contain information such as the
conserved sequence, signature sequence, and active-site residues of the protein families
arrived at by multiple sequence alignment of a set of related proteins.

Table 3.2 Secondary databases, their descriptions and web interfaces

Database    Description                                 Web interface
Prosite     Database of protein domains, families,      http://www.expasy.org/prosite
            and functional sites
Pfam        Protein family database                     http://www.sanger.ac.uk/Software/Pfam/
Blocks      Conserved regions in proteins               http://www.blocks.fhcrc.org
Prints      Conserved motifs used in characterizing     http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/
            protein families
SCOP        Structural Classification of Proteins       http://www.scop.mrc-lmb.cam.ac.uk/scop
CATH        Hierarchical classification database of     http://www.cathdb.info
            protein domain structures clustered at four
            levels: Class (C), Architecture (A),
            Topology (T), and Homologous superfamily (H)
ProDom      Database of protein domain families         http://www.prodom.prabi.fr/prodom/current/html/home.php
            generated from Swiss-Prot and TrEMBL
e-Motif     Database of highly specific and sensitive   http://motif.stanford.edu/emotif/
            protein sequence motifs, representing
            conserved biochemical properties and
            biological functions
OMIM        Online Mendelian Inheritance in Man         http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
TransFac    Database of eukaryotic transcription        http://www.gene-regulation.com/pub/databases.html#transfac
            factors, their genomic binding sites, and
            DNA-binding profiles
KEGG        Metabolic and regulatory pathways in        http://www.genome.jp/kegg/pathway.html
            complete genomes
MetaCyc     Metabolic pathways and enzymes from         http://www.metacyc.org
            various organisms
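
The idea of deriving a conserved sequence from a multiple sequence alignment can be sketched as a column-wise majority vote. This is only an illustration under simple assumptions: the sequences are already aligned to equal length, '-' marks a gap, and the `threshold` and `unknown` parameters are hypothetical names chosen here. Secondary databases use considerably richer models (patterns, profiles, hidden Markov models).

```python
from collections import Counter

def consensus(alignment, threshold=0.5, unknown="x"):
    """Column-wise consensus of equal-length aligned sequences.

    A column contributes its most frequent residue if that residue is not
    a gap and occurs in at least `threshold` of the sequences; otherwise
    the column is written as `unknown`.
    """
    if not alignment:
        return ""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    result = []
    for column in zip(*alignment):
        residue, count = Counter(column).most_common(1)[0]
        if residue != "-" and count / len(column) >= threshold:
            result.append(residue)
        else:
            result.append(unknown)
    return "".join(result)
```

With `threshold=1.0` only fully conserved columns survive, which gives a crude picture of the positions a signature sequence would be built around.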
Curated databases are maintained by one or more curators who select, input, or invite
only the 'highest quality' data from the selected research centres and database communities.
The quality of the data is of utmost importance, whereas the quantity of data being
deposited is secondary. Swiss-Prot is an example of a curated sequence database.

Composite Database
A composite database combines different primary database sources (see Table 3.3),
which makes querying and searching multiple resources more efficient. Although these
are compiled from various primary databases, non-redundancy is maintained by filtering
multiple data from different primary database sources. The best-known examples of
composite databases are OWL, Non-Redundant Database (NRDB), and BioSilico.
OWL is a non-redundant composite of the four publicly-available primary sources:
Swiss-Prot, PIR, GenBank (translation), and NRL-3D.
Table 3.3 Composite databases, their descriptions and web interfaces

Database    Description                                 Web interface
OWL         Composite of Swiss-Prot, PIR, GenBank       http://www.bioinf.man.ac.uk/dbbrowser/OWL/
            (translation), and NRL-3D
BioSilico   Integrated metabolic database consisting    http://www.biosilico.kaist.ac.kr/
            of LIGAND, ENZYME, EcoCyc, and MetaCyc
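
How a composite database stays non-redundant while drawing on several primary sources can be sketched as follows. The sketch assumes that two entries are redundant exactly when their sequences are identical after normalising case and spaces; that rule, the function name, and the source names in the example are illustrative assumptions, not the actual filtering procedure of OWL or NRDB.

```python
def build_composite(sources):
    """Merge {source_name: {accession: sequence}} into a non-redundant set.

    Returns {sequence: [(source_name, accession), ...]} with one key per
    distinct sequence, so every original accession stays traceable.
    """
    composite = {}
    for source_name, records in sources.items():
        for accession, sequence in records.items():
            key = sequence.upper().replace(" ", "")
            composite.setdefault(key, []).append((source_name, accession))
    return composite
```

For instance, if a hypothetical `SourceA` holds `MKTAY` and `MGLSD` and `SourceB` holds `mktay`, the composite contains two sequences, and the `MKTAY` entry lists both accessions. Keeping the accession list is what allows the cross-references back to the primary databases.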
BIOLOGICAL DATABASE RETRIEVAL SYSTEMS
The amount of biological data is increasing rapidly day by day, and so we should know
how to access and search this digitized information. The tools used to retrieve
biological data are termed biological data retrieval systems. These
systems allow text searching of multiple molecular biology databases and provide
links to relevant information for entries that match the search criteria. Here, we are
mainly concerned with two widely used and powerful data retrieval systems, viz.,
Sequence Retrieval System (SRS) and Entrez.
SRS (http://www.srs.ebi.ac.uk) is a homogeneous interface to over 80 biological
databases, combining data integration, analysis, and display tools for bioinformatics,
genomics, and related data. This system was developed in 1993 by Thure Etzold. SRS
is based on pre-made indexes of the items (words, entries, data fields, text, ...) found in
a set of documents (database files). It includes databases of sequences, metabolic
pathways, transcription factors, application results (such as BLAST, SSEARCH,
FASTA, etc.), protein 3D structures, genomes, mappings, mutations, and locus-
specific mutations. The SRS helps bioinformaticians and biotechnologists in various
ways, such as the following:
1. Fast access to diverse life science data - genetic, protein, cellular, molecular,
and clinical - for researchers and bioinformaticians.
2. Integration of public and proprietary data through one interface.
3. Unique ability to perform cross-database queries.
4. Rapid string search of large volumes of data.
5. Seamless integration of data and analysis tools.
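
The 'pre-made indexes' on which SRS is described as being based can be illustrated with a tiny inverted index. This is a generic sketch of the indexing idea, not SRS's actual implementation; the function names, the entry IDs and the word-splitting rule are assumptions made for the example.

```python
import re
from collections import defaultdict

def build_index(entries):
    """Map each lower-cased word to the set of entry IDs containing it."""
    index = defaultdict(set)
    for entry_id, text in entries.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(entry_id)
    return index

def search(index, query):
    """Return IDs of entries containing every word of the query (AND search)."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

Because the index is built once in advance, a query only intersects a few sets instead of rescanning every database file, which is what makes rapid string search over large volumes of data feasible.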
Entrez (http://www.ncbi.nlm.nih.gov/Entrez) is a molecular biology database and
retrieval system developed by the National Center for Biotechnology Information
(NCBI) (http://www.ncbi.nlm.nih.gov). It is an entry point for exploring distinct but
integrated databases. Of these two text-based database retrieval systems, Entrez
is easier to use, but offers more limited information to search.
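
Entrez text searches can also be issued programmatically. The sketch below only builds the query URL and makes no network request; it assumes NCBI's E-utilities web service and its ESearch endpoint with the `db`, `term` and `retmax` parameters, none of which is described in this chapter, so the current NCBI documentation and usage policy should be checked before relying on it.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities service (an assumption of this sketch).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(db, term, retmax=20):
    """Build an ESearch URL for a text query against one Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"
```

Calling `esearch_url("pubmed", "biological databases", retmax=5)` yields a URL whose response lists matching PubMed identifiers; changing `db` points the same text query at a different Entrez database.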
SUMMARY
Biological research has become an interdisciplinary area applying physics, statistics,
information technology, and many more scientific fields to cultivate the maximum out
of it. Most significantly, it has become an information-driven science for any kind of
analysis. The vast amounts of biological sequence or structure data generated from
various molecular biology methods have become the cornerstone of bioinformatics.
Unlike in many other disciplines, these results are not only published in research
papers, but also in a structured form within several publicly accessible databases.
Although these data describe a unique array of information on biological entities such
as DNA, proteins, and small molecules, a large proportion of salient information still
remains hidden, which needs to be explored by accessing these data from the databases.
Hence, biological databases play a vital role in biological research from the
fundamental molecular biology to the applied areas of biology such as medicine, disease
biology, agriculture, biochemical industries, etc.
REVIEW QUESTIONS
1. What is a database? Why does one need a biological database?
2. What is the contribution of Margaret Dayhoff in biological database development?
3. Data heterogeneity is very common in bio-databases. Justify it.
4. How do you classify bio-databases based on their source of data?
5. What is SRS?
6. Define a composite database with an example.
SUGGESTED READING
Adams et al., 2000, 'From sequence to chromosome: The tip of the X chromosome of
D. melanogaster', Science, 287: 2220-2222.
Arabidopsis Genome Initiative, 2000, 'Analysis of the genome sequence of the
flowering plant Arabidopsis thaliana', Nature, 408(6814): 796-815.
The C. elegans Sequencing Consortium, 1998, 'Genome sequence of the nematode
C. elegans: A platform for investigating biology', Science, 282: 2012-2018.
Attwood T.K., Bradley P., Gaulton A., Maudling N., Mitchell A.L., and Moulton G.,
2004, 'The PRINTS protein fingerprint database: Functional and evolutionary
applications', Encyclopaedia of Genomics, Proteomics and Bioinformatics, Dunn M.,
Jorde L., Little P., and Subramaniam A., eds., John Wiley & Sons, New York.
Bairoch A. and Apweiler R., 1998, 'The Swiss-Prot protein sequence data bank and its
supplement TrEMBL', Nucleic Acids Res, 26: 38-42.
Bateman A., Coin L., Durbin R., Finn R.D., Hollich V., Griffiths-Jones S., Khanna A.,
Marshall M., Moxon S., Sonnhammer E.L.L., Studholme D.J., Yeats C., and Eddy S.R.,
2004, 'The Pfam protein families database', Nucleic Acids Res, 32: D138-141.
