
Chapter 7 Advanced Database Concepts

Chapter Seven covers advanced database concepts including database security, distributed database systems, PL/SQL, and non-relational databases. It emphasizes the importance of database security management, user authentication, and authorization, detailing various access control methods such as discretionary, mandatory, and role-based access control. The chapter also discusses the responsibilities of database administrators and the mechanisms for managing user privileges and ensuring data integrity and confidentiality.

CHAPTER SEVEN

Advanced Database Concepts

Advanced Database Concepts BY: Mekonnen K Page 0.1


Outline

7.1 Database Security

7.2 Distributed Database Systems

7.3 PL/SQL (Trigger, Stored Procedure and Function)

7.4 Non-Relational Database (NoSQL Database)



1. Introduction to Database Security Issues

▪ In today's society, some information is so important that it has to be protected. For example, disclosure or modification of military information could endanger national security. A good database security management system has to handle the possible database threats.

▪ A threat is any situation or event, whether intentional or accidental, that may adversely affect a system and consequently the organization.

▪ Threats to databases may result in the degradation of some or all of the following security goals:
✓ Loss of Integrity
▪ Only authorized users should be allowed to modify data.
▪ For example, students may be allowed to see their grades, but not allowed to modify them.
✓ Loss of Availability: the database is not available to users who have a legitimate right to use the data.
▪ Authorized users should not be denied access.
▪ For example, an instructor who wishes to change a grade should be allowed to do so.
✓ Loss of Confidentiality
▪ Information should not be disclosed to unauthorized users.
▪ For example, a student should not be allowed to examine other students' grades.

Authentication
Users of the database have different access levels and permissions for different data objects. Authentication is the process of verifying that a user is who he or she claims to be, so that the system can apply the access level assigned to that user.
Typically, the system checks the username and password supplied by the user who is trying to use the resource.
Authorization/Privilege
Authorization refers to the process that determines the mode in which a particular (previously authenticated) client is allowed to access a specific resource controlled by a server.
Any database access request has the following three major components:
1. Requested Operation: what kind of operation is requested by a specific query?
2. Requested Object: on which resource or data of the database is the operation to be applied?
3. Requesting User: who is the user requesting the operation on the specified object?
Forms of user authorization

There are different forms of user authorization on the resources of the database. These include:
1. Read Authorization: the user with this privilege is allowed only to read the content of the data object.
2. Insert Authorization: the user with this privilege is allowed only to insert new records or items into the data object.
3. Update Authorization: users with this privilege are allowed to modify the content of attributes but are not authorized to delete records.
4. Delete Authorization: users with this privilege are only allowed to delete a record and not anything else.
❑ Note: Depending on their role, different users can hold one or a combination of the above forms of authorization on different data objects.
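The three components of an access request and the four forms of authorization can be put together in a small check. The following is only a sketch: the `AccessRequest` class, the `GRANTS` table and the user and object names are hypothetical and are not part of any particular DBMS.

```python
from dataclasses import dataclass

# The four forms of authorization listed above.
FORMS = {"read", "insert", "update", "delete"}


@dataclass(frozen=True)
class AccessRequest:
    user: str       # requesting user
    operation: str  # requested operation
    obj: str        # requested object


# Hypothetical grants: (user, object) -> forms of authorization held.
GRANTS = {
    ("student", "grades"): {"read"},
    ("instructor", "grades"): {"read", "insert", "update"},
}


def is_authorized(request, grants=GRANTS):
    """Return True if the user holds the requested form on the object."""
    if request.operation not in FORMS:
        raise ValueError(f"unknown operation: {request.operation}")
    return request.operation in grants.get((request.user, request.obj), set())
```

With these sample grants, a student may read grades but not update them, which matches the integrity example given earlier.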

Database Security and the DBA

▪ The database administrator (DBA) is the central authority for managing a database system.
▪ The DBA's responsibilities include:
– Account creation
– Granting privileges to users who need to use the system
– Privilege revocation
– Classifying users and data in accordance with the policy of the organization
Access Protection, User Accounts, and Database Audits
▪ Whenever a person or group of persons needs to access a database system, the individual or group must first apply for a user account.
▪ The DBA will then create a new account id and password for the user if he/she believes there is a legitimate need to access the database.
▪ The user must log in to the DBMS by entering the account id and password whenever database access is needed.
▪ The database system must also keep track of all operations on the database that are applied by a certain user throughout each login session.
▪ If any tampering with the database is suspected, a database audit is performed.
▪ A database audit consists of reviewing the log to examine all accesses and operations applied to the database during a certain time period.
▪ A database log that is used mainly for security purposes is sometimes called an audit trail.

▪ To protect databases against the possible threats, two kinds of countermeasures can be implemented: access control and encryption.

2. Access Control (AC)
2.1 Discretionary Access Control (DAC)
▪ The typical method of enforcing discretionary access control in a database system is based on the granting and revoking of privileges.
▪ The granting and revoking of discretionary privileges can be described by the access matrix model, where
– The rows of a matrix M represent subjects (users, accounts, programs).
– The columns represent objects (relations, records, columns, views, operations).
– Each position M(i,j) in the matrix represents the types of privileges (read, write, update) that subject i holds on object j.
▪ To control the granting and revoking of relation privileges, each relation R in a database is assigned an owner account, which is typically the account that was used when the relation was created in the first place.
– The owner of a relation is given all privileges on that relation.
– The owner account holder can pass privileges on any of the owned relations to other users by granting privileges to their accounts.
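The access matrix model can be sketched as a dictionary keyed by (subject, object) pairs. This is a minimal illustration, assuming that only the owner may grant or revoke; the class and method names are hypothetical.

```python
class AccessMatrix:
    """Access matrix M: cell (subject, object) holds a set of privileges."""

    ALL_PRIVILEGES = frozenset({"read", "write", "update"})

    def __init__(self):
        self.cells = {}  # (subject, object) -> set of privileges, i.e. M(i, j)
        self.owner = {}  # object -> owner account

    def create_relation(self, account, relation):
        # The creating account becomes the owner and gets all privileges.
        self.owner[relation] = account
        self.cells[(account, relation)] = set(self.ALL_PRIVILEGES)

    def grant(self, grantor, grantee, relation, privilege):
        # Simplifying assumption: only the owner can pass privileges on.
        if self.owner.get(relation) != grantor:
            raise PermissionError(f"{grantor} does not own {relation}")
        self.cells.setdefault((grantee, relation), set()).add(privilege)

    def revoke(self, grantor, grantee, relation, privilege):
        if self.owner.get(relation) != grantor:
            raise PermissionError(f"{grantor} does not own {relation}")
        self.cells.get((grantee, relation), set()).discard(privilege)

    def holds(self, subject, relation, privilege):
        return privilege in self.cells.get((subject, relation), set())
```

A sparse dictionary is used instead of a two-dimensional array because most cells of a real access matrix are empty.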
▪ Privileges Using Views

The mechanism of views is an important discretionary authorization mechanism in its own right. For example, if the owner A of a relation R wants another account B to be able to retrieve only some fields of R, then A can create a view V of R that includes only those attributes and then grant SELECT on V to B.
▪ Revoking Privileges
▪ In some cases it is desirable to grant a privilege to a user temporarily.
▪ For example, the owner of a relation may want to grant the SELECT privilege to a user for a specific task and then revoke that privilege once the task is completed.
▪ Hence, a mechanism for revoking privileges is needed.
▪ In SQL, a REVOKE command is included for the purpose of canceling privileges.
▪ Propagation of Privileges using the GRANT OPTION

Whenever the owner A of a relation R grants a privilege on R to another account B, the privilege can be given to B with or without the GRANT OPTION.

If the GRANT OPTION is given, B can also grant that privilege on R to other accounts.
▪ Suppose that B is given the GRANT OPTION by A and that B then grants the privilege on R to a third account C, also with the GRANT OPTION. In this way, privileges on R can propagate to other accounts without the knowledge of the owner of R.
▪ If the owner account A now revokes the privilege granted to B, all the privileges that B propagated based on that privilege should automatically be revoked by the system.
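Propagation and cascading revocation can be sketched for a single privilege on a single relation. The `PrivilegeGraph` class below is hypothetical; it assumes that a grant stays valid only while its grantor can still be reached from the owner through grants made with the GRANT OPTION. Real systems track more detail (for example, a separate record per grantor).

```python
class PrivilegeGraph:
    """Tracks who granted one privilege on one relation to whom."""

    def __init__(self, owner):
        self.owner = owner
        self.grants = set()  # (grantor, grantee, with_grant_option)

    def _grantors(self):
        # Accounts allowed to grant: the owner plus every account reachable
        # from the owner through grants made WITH GRANT OPTION.
        allowed = {self.owner}
        frontier = [self.owner]
        while frontier:
            account = frontier.pop()
            for grantor, grantee, option in self.grants:
                if grantor == account and option and grantee not in allowed:
                    allowed.add(grantee)
                    frontier.append(grantee)
        return allowed

    def grant(self, grantor, grantee, with_grant_option=False):
        if grantor not in self._grantors():
            raise PermissionError(f"{grantor} cannot grant this privilege")
        self.grants.add((grantor, grantee, with_grant_option))

    def revoke(self, grantor, grantee):
        self.grants = {g for g in self.grants
                       if not (g[0] == grantor and g[1] == grantee)}
        # Cascade: drop every grant whose grantor lost the ability to grant.
        allowed = self._grantors()
        self.grants = {g for g in self.grants if g[0] in allowed}

    def holders(self):
        return {self.owner} | {grantee for _, grantee, _ in self.grants}
```

Recomputing reachability from the owner after each revoke also removes grants that survive only through a cycle of accounts granting to each other.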

Example 1
▪ Suppose that the DBA creates four accounts, A1, A2, A3, and A4, and wants only A1 to be able to create relations. Then the DBA must issue the following GRANT command in SQL:
GRANT CREATETAB TO A1;
Example 2
▪ Suppose that A1 creates the two base relations EMPLOYEE and DEPARTMENT.
▪ A1 is then the owner of these two relations and hence has all the relation privileges on each of them.
▪ Suppose that A1 wants to grant A2 the privilege to insert and delete rows in both of these relations, but A1 does not want A2 to be able to propagate these privileges to additional accounts:

GRANT INSERT, DELETE ON EMPLOYEE, DEPARTMENT TO A2;

Example 3
▪ Suppose that A1 wants to allow A3 to retrieve information from either of the two tables (DEPARTMENT or EMPLOYEE) and also to be able to propagate the SELECT privilege to other accounts.
▪ A1 can issue the command:
GRANT SELECT ON EMPLOYEE, DEPARTMENT
TO A3 WITH GRANT OPTION;
▪ A3 can grant the SELECT privilege on the EMPLOYEE relation to A4 by issuing:
GRANT SELECT ON EMPLOYEE TO A4;
▪ Notice that A4 cannot propagate the SELECT privilege because the GRANT OPTION was not given to A4.
Example 4
▪ Suppose that A1 decides to revoke the SELECT privilege on the EMPLOYEE relation from A3; A1 can issue:
REVOKE SELECT ON EMPLOYEE FROM A3;
▪ The DBMS must now automatically revoke the SELECT privilege on EMPLOYEE from A4, too, because A3 granted that privilege to A4 and A3 does not have the privilege any more.
Example 5
▪ Suppose that A1 wants to give back to A3 a limited capability to SELECT from the EMPLOYEE relation and wants to allow A3 to be able to propagate the privilege.
▪ The limitation is to retrieve only the NAME, BDATE, and ADDRESS attributes and only for the tuples with DNO = 5.
▪ A1 then creates the view:
CREATE VIEW A3EMPLOYEE AS
SELECT NAME, BDATE, ADDRESS FROM EMPLOYEE WHERE DNO = 5;
▪ After the view is created, A1 can grant SELECT on the view A3EMPLOYEE to A3 as follows:
GRANT SELECT ON A3EMPLOYEE TO A3 WITH GRANT OPTION;
Example 6
– Finally, suppose that A1 wants to allow A4 to update only the SALARY attribute of EMPLOYEE.
– A1 can issue:
GRANT UPDATE ON EMPLOYEE (SALARY) TO A4;
2.2 Mandatory Access Control

▪ DAC techniques are an all-or-nothing method: a user either has or does not have a certain privilege.
▪ In many applications, an additional security policy is needed that classifies data and users based on security classes.
▪ Typical security classes are top secret (TS), secret (S), confidential (C), and unclassified (U), where TS is the highest level and U the lowest: TS ≥ S ≥ C ≥ U
▪ The commonly used model for multilevel security, known as the Bell-LaPadula model, classifies each subject (user, account, program) and object (relation, tuple, column, view, operation) into one of the security classifications TS, S, C, or U.
▪ We refer to the clearance (classification) of a subject S as class(S) and to the classification of an object O as class(O).

▪ Two restrictions are enforced on data access based on the subject/object classifications:
✓ Simple security property: a subject S is not allowed read access to an object O unless class(S) ≥ class(O).
✓ Star property: a subject S is not allowed to write an object O unless class(S) ≤ class(O). This prevents information from flowing from a higher to a lower classification.
▪ To incorporate multilevel security notions into the relational database model, it is common to consider attribute values and rows as data objects.
▪ Hence, each attribute A is associated with a classification attribute C in the schema.
▪ In addition, in some models, a tuple classification attribute TC is added to the relation attributes to provide a classification for each tuple as a whole.
▪ Hence, a multilevel relation schema R with n attributes would be represented as
▪ R(A1, C1, A2, C2, …, An, Cn, TC)
▪ where each Ci represents the classification attribute associated with attribute Ai.
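The two access restrictions can be written as simple comparisons over the ordering TS ≥ S ≥ C ≥ U. The sketch below uses the standard Bell-LaPadula formulation, in which reading requires class(S) ≥ class(O) (no read-up) and writing requires class(S) ≤ class(O) (no write-down). The function names and the numeric encoding of the levels are assumptions made for illustration.

```python
# Numeric ranks for the security classes; higher means more sensitive.
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}


def can_read(subject_class, object_class):
    """Simple security property: no read-up."""
    return LEVELS[subject_class] >= LEVELS[object_class]


def can_write(subject_class, object_class):
    """Star property: no write-down."""
    return LEVELS[subject_class] <= LEVELS[object_class]
```

Taken together, the two rules mean that a subject can both read and write only objects at exactly its own level.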

▪ The value of the TC attribute in each tuple t, which is the highest of all attribute classification values within t, provides a general classification for the tuple itself,
▪ whereas each Ci provides a finer security classification for each attribute value within the tuple.
▪ A multilevel relation will appear to contain different data to subjects (users) with different clearance levels.
▪ In some cases, it is possible to store a single tuple in the relation at a higher classification level and produce the corresponding tuples at a lower classification level through a process known as filtering.
▪ In other cases, it is necessary to store two or more tuples at different classification levels with the same value for the apparent key.
▪ This leads to the concept of polyinstantiation, where several tuples can have the same apparent key value but have different attribute values for users at different classification levels.
▪ Example: consider the query SELECT * FROM employee
(a) The original employee table
(b) The employee table after filtering for classification C users
(c) The employee table after filtering for classification U users
(d) Polyinstantiation of the Smith row for C users who want to modify some value
▪ A user with security clearance S would see the same relation as in (a), since all row classifications are less than or equal to S.
▪ However, a user with security clearance C would not be allowed to see the salary of Brown and the job performance of Smith, since these values have a higher classification, as shown in (b).
▪ For a user with security clearance U, filtering introduces null values for attribute values whose security classification is higher than the user's security clearance, as shown in (c).
▪ A user with security clearance C may request an update of the job performance of Smith to 'Excellent', and the view will allow him or her to do so. However, the user should not be allowed to overwrite the existing value at the higher classification level.
▪ Solution: create a polyinstantiation of the Smith row at the lower classification level C, as shown in (d).
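Filtering can be sketched as a function that hides every attribute value classified above the user's clearance. The sample rows below are an assumption chosen to match the description above (Brown's salary and Smith's job performance are classified S); they are not copied from the original tables. The sketch also assumes that a row is dropped entirely when its apparent key is not visible.

```python
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}

# Each attribute is stored as (value, classification); "name" is the
# apparent key. Illustrative data only.
EMPLOYEE = [
    {"name": ("Smith", "U"), "salary": (40000, "C"),
     "job_performance": ("Fair", "S")},
    {"name": ("Brown", "C"), "salary": (80000, "S"),
     "job_performance": ("Good", "C")},
]


def tuple_class(row):
    """TC: the highest classification among the attribute values of a row."""
    return max((cls for _, cls in row.values()), key=LEVELS.__getitem__)


def filter_relation(rows, clearance, key="name"):
    """Return the rows as seen by a user with the given clearance."""
    visible = []
    for row in rows:
        if LEVELS[row[key][1]] > LEVELS[clearance]:
            continue  # the apparent key itself is hidden, so skip the row
        visible.append({
            attr: value if LEVELS[cls] <= LEVELS[clearance] else None
            for attr, (value, cls) in row.items()
        })
    return visible
```

Hidden values are returned as `None`, which plays the role of the null values introduced by filtering.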
❑ Comparing DAC and MAC
▪ DAC policies are characterized by a high degree of flexibility, which makes them suitable for a large variety of application domains.
▪ The main drawback of DAC models is their vulnerability to malicious attacks, such as Trojan horses embedded in application programs.
▪ By contrast, mandatory policies ensure a high degree of protection: they prevent any illegal flow of information.
▪ Mandatory policies have the drawback of being too rigid, and they are only applicable in limited environments.
▪ In many practical situations, discretionary policies are preferred because they offer a better trade-off between security and applicability.

2.3 Role-Based Access Control

▪ Its basic notion is that permissions are associated with roles, and users are assigned to appropriate roles.

▪ Roles can be created with the CREATE ROLE command and removed with the DROP ROLE command.
▪ The GRANT and REVOKE commands discussed under DAC can then be used to assign privileges to roles and revoke them from roles.
▪ RBAC appears to be a feasible alternative to discretionary and mandatory access controls.
▪ It ensures that only authorized users are given access to certain data or resources.
▪ Many DBMSs support the concept of roles, where privileges can be assigned to roles.
▪ A role hierarchy in RBAC is a natural way of organizing roles to reflect the organization's lines of authority and responsibility.
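The idea of roles, role hierarchies and user assignment can be sketched as follows. The `RBAC` class and its method names are hypothetical; the sketch assumes that a senior role inherits every permission of the junior roles it is built on.

```python
class RBAC:
    """Permissions attach to roles; users acquire them through roles."""

    def __init__(self):
        self.permissions = {}  # role -> set of (privilege, object)
        self.juniors = {}      # role -> roles whose permissions it inherits
        self.members = {}      # user -> set of roles

    def create_role(self, role, inherits=()):
        self.permissions[role] = set()
        self.juniors[role] = set(inherits)

    def drop_role(self, role):
        self.permissions.pop(role, None)
        self.juniors.pop(role, None)
        for inherited in self.juniors.values():
            inherited.discard(role)
        for roles in self.members.values():
            roles.discard(role)

    def grant(self, role, privilege, obj):
        self.permissions[role].add((privilege, obj))

    def revoke(self, role, privilege, obj):
        self.permissions[role].discard((privilege, obj))

    def assign(self, user, role):
        if role not in self.permissions:
            raise KeyError(role)
        self.members.setdefault(user, set()).add(role)

    def _closure(self, role, seen=None):
        # The role itself plus all roles it inherits from, transitively.
        seen = set() if seen is None else seen
        if role not in seen:
            seen.add(role)
            for junior in self.juniors.get(role, ()):
                self._closure(junior, seen)
        return seen

    def check(self, user, privilege, obj):
        return any(
            (privilege, obj) in self.permissions.get(r, ())
            for role in self.members.get(user, ())
            for r in self._closure(role)
        )
```

Because privileges are attached to roles, changing what a job function may do requires a single grant or revoke on the role rather than one per user.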
2.4 Introduction to Statistical Database Security

▪ Statistical databases are used mainly to produce statistics on various populations.
▪ The database may contain confidential data on individuals, which should be protected from user access.
▪ Users are permitted to retrieve statistical information on the populations, such as averages, sums, counts, maximums, minimums, and standard deviations.
▪ A population is a set of rows of a relation (table) that satisfy some selection condition.
▪ Statistical queries involve applying statistical functions to a population of rows.
▪ For example, we may want to retrieve the number of individuals in a population or the average income in the population. However, statistical users are not allowed to retrieve individual data, such as the income of a specific person.

▪ Statistical database security techniques must disallow the retrieval of individual data.
▪ This can be achieved by eliminating queries that retrieve attribute values and by allowing only queries that involve statistical aggregate functions such as COUNT, SUM, MIN, MAX, AVERAGE, and STANDARD DEVIATION.
▪ Such queries are sometimes called statistical queries.
▪ It is the DBMS's responsibility to ensure the confidentiality of information about individuals, while still providing useful statistical summaries of data about those individuals to users. Protecting the privacy of individuals in a statistical database is paramount.
▪ In some cases it is possible to infer the values of individual rows from a sequence of statistical queries.
▪ This is particularly true when the conditions result in a population consisting of a small number of rows.
▪ Solution:
– Do not allow a query if the number of rows in its population falls below a certain threshold.
– Forbid sequences of queries that refer repeatedly to the same population of rows.
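The threshold countermeasure can be sketched as a wrapper around aggregate functions. The threshold value, the function name and the sample rows below are assumptions for illustration only; the second countermeasure (tracking query sequences) is not covered by this sketch.

```python
from statistics import mean

MIN_POPULATION = 3  # hypothetical threshold chosen for this example


def statistical_query(rows, condition, attribute, aggregate):
    """Apply an aggregate to the rows matching a selection condition.

    The query is refused when the population is so small that the
    result could reveal an individual's value.
    """
    population = [row[attribute] for row in rows if condition(row)]
    if len(population) < MIN_POPULATION:
        raise PermissionError("population below the allowed threshold")
    return aggregate(population)


PERSON = [
    {"name": "P1", "city": "Adama", "income": 30000},
    {"name": "P2", "city": "Adama", "income": 50000},
    {"name": "P3", "city": "Adama", "income": 40000},
    {"name": "P4", "city": "Hawassa", "income": 90000},
]
```

With this data, the average income for Adama is answered, while the same query for Hawassa is refused because it would disclose the income of the single person living there.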
2.5 Encryption
✓ Authorization may not be sufficient to protect data in database systems, especially when data has to be moved from one location to another over a network.
✓ Encryption is used to protect information stored at a particular site or transmitted between sites from being accessed by unauthorized users.
✓ Encryption is the encoding of the data by a special algorithm that renders the data unreadable by any program without the decryption key.
✓ Encrypted data cannot be read unless the reader knows how to decipher (decrypt) it.
✓ If a database system holds particularly sensitive data, it may be deemed necessary to encode it as insurance against possible external threats or attempts to access it.

– The DBMS can access the data after decoding it, although there is a degradation in performance because of the time taken to decode it.
– Encryption also protects data transmitted over communication lines.
❑ Transmitting data securely over insecure networks requires the use of a cryptosystem, which includes:
1. An encryption key to encrypt the data (plaintext)
2. An encryption algorithm that, with the encryption key, transforms the plaintext into ciphertext
3. A decryption key to decrypt the ciphertext
4. A decryption algorithm that, with the decryption key, transforms the ciphertext back into plaintext
The Data Encryption Standard (DES) is an approach that performs both a substitution of characters and a rearrangement of their order based on an encryption key.
❑ Types of Cryptosystems

▪ Cryptosystems can be categorized into two types:
1. Symmetric encryption: uses the same key for both encryption and decryption and relies on safe communication lines for exchanging the key.
2. Asymmetric encryption: uses different keys for encryption and decryption.

▪ Generally, symmetric algorithms are much faster to execute on a computer than asymmetric ones. In contrast, asymmetric algorithms do not require the two parties to exchange a secret key over a safe communication line, which makes key distribution easier to secure.

▪ Public Key Encryption algorithm:

▪ This algorithm operates with modular arithmetic mod $n$, where $n$ is the product of two large prime numbers.
▪ Two keys, $d$ and $e$, are used for decryption and encryption, respectively.
▪ $n$ is chosen as a large integer that is a product of two large distinct prime numbers, $p$ and $q$.
▪ The encryption key $e$ is a randomly chosen number between 1 and $n$ that is relatively prime to $(p-1)(q-1)$.
▪ The plaintext $m$ is encrypted as $C = m^e \bmod n$.
▪ The decryption key $d$ is carefully chosen so that $C^d \bmod n = m$.
▪ The decryption key $d$ can be computed from the condition that $d \cdot e - 1$ is divisible by $(p-1)(q-1)$.
▪ Thus, the legitimate receiver who knows $d$ simply computes $C^d \bmod n = m$ and recovers $m$.
Simple Example: Asymmetric encryption

1. Select primes $p = 11$, $q = 3$.
2. $n = pq = 11 \cdot 3 = 33$
3. Find $\phi = (p-1)(q-1) = 10 \cdot 2 = 20$.
4. Choose $e = 3$ (with $1 < e < \phi$).
5. Check that $\gcd(e, \phi) = \gcd(e, (p-1)(q-1)) = \gcd(3, 20) = 1$.
6. Compute $d$ (with $1 < d < \phi$) such that $d \cdot e - 1$ is divisible by $\phi$. Simple testing ($d = 2, 3, \ldots$) gives $d = 7$.
7. Check: $ed - 1 = 3 \cdot 7 - 1 = 20$, which is divisible by $\phi = 20$.
This gives
Public key = $(n, e) = (33, 3)$
Private key = $(n, d) = (33, 7)$
▪ Now say we want to encrypt the message $m = 7$:
$c = m^e \bmod n = 7^3 \bmod 33 = 343 \bmod 33 = 13$
Hence the ciphertext is $c = 13$.
▪ To check decryption we compute
$m = c^d \bmod n = 13^7 \bmod 33 = 62{,}748{,}517 \bmod 33 = 7$
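The worked example can be reproduced with a few lines of Python. This is a toy sketch for checking the arithmetic, not a usable cryptosystem: the primes are far too small and no padding is applied. The function names are hypothetical, and `pow(e, -1, phi)` for the modular inverse assumes Python 3.8 or later.

```python
from math import gcd


def make_keys(p, q, e):
    """Build a toy RSA key pair from primes p and q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if not 1 < e < phi or gcd(e, phi) != 1:
        raise ValueError("e must satisfy 1 < e < phi and gcd(e, phi) == 1")
    d = pow(e, -1, phi)  # d * e - 1 is divisible by phi
    return (n, e), (n, d)


def encrypt(m, public_key):
    n, e = public_key
    return pow(m, e, n)  # c = m^e mod n


def decrypt(c, private_key):
    n, d = private_key
    return pow(c, d, n)  # m = c^d mod n


public_key, private_key = make_keys(11, 3, 3)
ciphertext = encrypt(7, public_key)
plaintext = decrypt(ciphertext, private_key)
```

The three-argument form of `pow` performs modular exponentiation without computing the full power, so the large intermediate value of 13 raised to 7 never has to be built.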
Digital Signatures
▪ A digital signature is an example of using encryption techniques to provide authentication services in e-commerce applications.
▪ A digital signature is a means of associating a mark unique to an individual with a body of text.
The mark should be unforgeable, meaning that others should be able to check that the signature does come from the originator.
Public key techniques are the means of creating digital signatures.
By combining a digital signature with public key encryption, it is possible to obtain both secrecy and verification of the sender.
Example: Abebe is the sender and Kebede is the receiver
Abebe signs his message with his private key.
Abebe encrypts the signed message with Kebede's public key and sends it to Kebede.
Kebede decrypts the message with his private key.
Kebede verifies the signature with Abebe's public key and recovers the message.
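The four steps of the Abebe and Kebede example can be traced with toy RSA numbers. In this sketch Abebe's key pair is (33, 3) and (33, 7), and Kebede's key pair, (55, 3) and (55, 27), is a hypothetical one built from the primes 5 and 11. Practical schemes sign a hash of the message and use padding; here the message is a small integer that is signed directly.

```python
# Toy key pairs as (n, exponent). Kebede's modulus is larger than
# Abebe's, so any signature value (below 33) fits under it.
ABEBE_PUBLIC, ABEBE_PRIVATE = (33, 3), (33, 7)
KEBEDE_PUBLIC, KEBEDE_PRIVATE = (55, 3), (55, 27)


def apply_key(value, key):
    """Raise value to the key's exponent modulo the key's modulus."""
    n, exponent = key
    return pow(value, exponent, n)


def send(message):
    """Abebe signs with his private key, then encrypts for Kebede."""
    signed = apply_key(message, ABEBE_PRIVATE)
    return apply_key(signed, KEBEDE_PUBLIC)


def receive(ciphertext):
    """Kebede decrypts with his private key, then verifies the signature
    with Abebe's public key, which recovers the message."""
    signed = apply_key(ciphertext, KEBEDE_PRIVATE)
    return apply_key(signed, ABEBE_PUBLIC)


recovered = receive(send(7))
```

Only Kebede can remove the outer encryption, and only a value produced with Abebe's private key yields the original message when Abebe's public key is applied.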
▪ In SQL the following types of privileges can be granted on each individual relation R:
– SELECT (retrieval or read) privilege on R: this gives the account the privilege to use the SELECT statement to retrieve tuples from R.
– MODIFY privileges on R: this is further divided into UPDATE, DELETE, and INSERT privileges on R. In addition, both the INSERT and UPDATE privileges can specify that only certain attributes can be inserted or updated by the account.
▪ Notice that to create a view, the account must have the SELECT privilege on all relations involved in the view definition.

▪ A related requirement is support for content-based access control.
▪ Another requirement is related to the heterogeneity of subjects, which requires access control policies based on user characteristics and qualifications.
– A possible solution, to better take into account user profiles in the formulation of access control policies, is to support the notion of credentials.
– A credential is a set of properties concerning a user that are relevant for security purposes, for example age or position within an organization.
– It is believed that the XML language can play a key role in access control for e-commerce applications.

5. Introduction to Flow Control

▪ Flow control regulates the distribution or flow of information among accessible objects.
▪ A flow between object X and object Y occurs when a program reads values from X and writes values into Y.
Flow controls check that information contained in some objects does not flow explicitly or implicitly into less protected objects.
▪ A flow policy specifies the channels along which information is allowed to move.
The simplest flow policy specifies just two classes of information, confidential (C) and nonconfidential (N), and allows all flows except those from class C to class N.
▪ Covert Channels
A covert channel allows a transfer of information that violates the security policy.
A covert channel allows information to pass from a higher classification level to a lower classification level through improper means.
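The simplest flow policy, with the two classes C and N, can be sketched as a check performed before a value is copied from one object to another. The function and variable names are hypothetical, and only explicit flows are covered; implicit flows and covert channels are outside this sketch.

```python
def flow_allowed(source_class, target_class):
    """Allow every flow except confidential (C) to nonconfidential (N)."""
    return not (source_class == "C" and target_class == "N")


def copy_value(values, classes, source, target):
    """Read from the source object and write into the target object,
    but only if the flow policy permits it."""
    if not flow_allowed(classes[source], classes[target]):
        raise PermissionError(f"flow from {source} to {target} is not allowed")
    values[target] = values[source]


values = {"salary": 5000, "public_note": "", "archive": None}
classes = {"salary": "C", "public_note": "N", "archive": "C"}
copy_value(values, classes, "salary", "archive")  # C to C is permitted
```

Copying `salary` into `public_note` would raise `PermissionError`, because it moves confidential information into a less protected object.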
2.5 Encryption and Public Key Infrastructures
▪ Encryption is a means of maintaining secure data in an insecure environment.
▪ Encryption consists of applying an encryption algorithm to data using some prespecified encryption key.
▪ The resulting data has to be decrypted using a decryption key to recover the original data.
▪ Two well-known schemes for doing this are DES and RSA.
▪ Data Encryption Standard (DES)
▪ It is a system which can provide end-to-end encryption on the channel between the sender A and the receiver B.
▪ The DES algorithm is a careful and complex combination of two of the fundamental building blocks of encryption: substitution and permutation (transposition).
▪ The DES algorithm derives its strength from repeated application of these two techniques for a total of 16 cycles.
▪ Plaintext (the original form of the message) is encrypted as blocks of 64 bits.

▪ RSA Public Key Encryption
▪ Public key encryption algorithms are based on mathematical functions rather than operations on bit patterns.
▪ They incorporate results from number theory, such as the difficulty of determining the large prime factors of a large number.
✓ They also involve the use of two separate keys, in contrast to conventional encryption, which uses only one key.
✓ The use of two keys can have profound consequences in the areas of confidentiality, key distribution, and authentication.
▪ The two keys used for public key encryption are referred to as the public key and the private key.

▪ A public key encryption scheme, or infrastructure, has six ingredients (the public key and the private key count as two):
i. Plaintext: This is the data or readable message that is fed into the algorithm as input.
ii. Encryption algorithm: The encryption algorithm performs various transformations on the plaintext.
iii. Public and private keys: These are a pair of keys that have been selected so that if one is used for encryption, the other is used for decryption.
▪ The exact transformations performed by the encryption algorithm depend on the public or private key that is provided as input.
iv. Ciphertext:
▪ This is the scrambled message produced as output. It depends on the plaintext and the key.
▪ For a given message, two different keys will produce two different ciphertexts.
v. Decryption algorithm:
▪ This algorithm accepts the ciphertext and the matching key and produces the original plaintext.
▪ The public key is made public, and the private key is known only to its owner.
▪ The essential steps are as follows:
i. Each user generates a pair of keys to be used for the encryption and decryption of messages.
ii. Each user places one of the two keys in a public register or other accessible file. This is the public key. The accompanying key is kept private (private key).
iii. If a sender wishes to send a private message to a receiver, the sender encrypts the message using the receiver's public key.
iv. When the receiver receives the message, he or she decrypts it using the receiver's private key.
No other recipient can decrypt the message because only the receiver knows his or her private key.
Distributed Database System

• A distributed database is a collection of databases which are distributed over different computers of a computer network.

• A distributed database is not limited to one system; it is spread over different sites, i.e., on multiple computers or over a network of computers. A distributed database system is located on various sites that do not share physical components.
• Each site has autonomous processing capability and can perform local applications.

• Each site also participates in the execution of at least one global application, which requires accessing data at several sites.
Distributed Database System (1)

• This may be required when a particular database needs to be accessed by various users globally. It needs to be managed such that for the users it looks like one single database.

Figure: Shared-Nothing System (labels: local application, global application, no local applications)

Why Distributed Databases?

1. Local Autonomy: permits setting and enforcing local policies regarding the use of local data (suitable for organizations that are inherently decentralized).
2. Improved Performance: frequently used data is kept close to its users, and distributed systems are inherently parallel.
3. Improved Reliability/Availability:
❑ Data replication can be used to obtain higher reliability and availability.
❑ The autonomous processing capability of the different sites ensures graceful degradation.
4. Incremental Growth: supports smooth incremental growth with a minimum degree of impact on the already existing sites.
5. Shareability: allows preexisting sites to share data.
6. Reduced Communication Overhead: the fact that many applications are local clearly reduces the communication overhead with respect to centralized databases.

Disadvantages of DDBSs

• Cost: replication of effort (manpower).
• Security: more difficult to control.
• Complexity:
• Data may be duplicated across sites, mainly for reliability and efficiency reasons. Data redundancy, however, complicates update operations.
• If some sites fail while an update is being executed, the system must make sure that the effects will be reflected on the data residing at the failing sites as soon as the system can recover from the failure.
• The synchronization of transactions on multiple sites is considerably harder than for a centralized system.

Distributed DBMS Architecture

ANSI/SPARC Architecture
External Schema: external views (one per user or user group), e.g. CREATE VIEW …
Conceptual Schema: conceptual view, e.g. CREATE TABLE …
Internal Schema: internal view, e.g. B+-tree index …

Internal view: deals with the physical definition and organization of data.
Conceptual view: abstract definition of the database. It is the "real world" view of the enterprise being modeled in the database.
External view: individual user's view of the database.
Distributed Data Systems (2)
A distributed database can be defined as
• a logically integrated collection of shared data which is
• physically distributed across the nodes of a computer network.

[Figure labels: simplify software development; improve system performance; physically distributed; logically integrated]



A Taxonomy of Distributed Data Systems

Distributed data systems
  • Homogeneous (no local user)
  • Heterogeneous (Multidatabase)
      • Loosely coupled (interoperable DB systems using export schema)
      • Tightly coupled (with global schema)



Homogeneous vs. Heterogeneous
Homogeneous DDBMS
o In a homogeneous database, all sites store the database
identically. The operating system, the database management system,
and the data structures used are the same at all sites.
o Hence, they’re easy to manage.
• No local users
• Most systems do not have local schemas (i.e., every user uses the
same schema)

[Figure: global users, a homogeneous database management system, four DBMSs, and Database 1 to Database 4]



Architecture of a Homogeneous DDBMS
A homogeneous DDBMS resembles a centralized DB, but instead of storing all
the data at one site, the data is distributed across a number of
sites in a network.

[Figure: schema layers, top to bottom]
Global user view 1 ... Global user view n
Global Schema
Fragmentation Schema
Allocation Schema
Local conceptual schema 1 ... Local conceptual schema n
Local internal schema 1 ... Local internal schema n
Local DB 1 ... Local DB n
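
The slide only names the layers. As a hedged sketch of how the middle two might be used, assume the fragmentation schema records how a global relation is split into fragments and the allocation schema records which sites store each fragment. The catalog contents and the function `locate` below are invented for illustration.

```python
# Hypothetical catalog contents, invented for illustration.
# Fragmentation schema: global relation -> its fragments.
FRAGMENTATION_SCHEMA = {
    "EMPLOYEE": ["EMP_NORTH", "EMP_SOUTH"],
}

# Allocation schema: fragment -> sites that store a copy of it.
ALLOCATION_SCHEMA = {
    "EMP_NORTH": ["site_1"],
    "EMP_SOUTH": ["site_2", "site_3"],
}


def locate(relation):
    """Map a global relation to {fragment: [sites]} using both schemas."""
    return {
        fragment: ALLOCATION_SCHEMA[fragment]
        for fragment in FRAGMENTATION_SCHEMA[relation]
    }


placement = locate("EMPLOYEE")
```

A global user only names `EMPLOYEE`; the lookup through the two schemas is what lets the system find the local databases involved.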
Homogeneous vs. Heterogeneous (1)
Heterogeneous DDBMS
o In a heterogeneous distributed database, different sites can use
different schemas and software, which can lead to problems in query
processing and transactions. Also, a particular site might be
completely unaware of the other sites.
o Different computers may use different operating systems and
different database applications. They may even use different data
models for the database.

[Figure: global user, multidatabase management system, local users, four DBMSs, and Database 1 to Database 4]



Homogeneous vs. Heterogeneous (2)
Heterogeneous DDBMS
o Hence, translations are required for different sites to
communicate.
o There are both local and global users.
o Multidatabase systems are split into:
• Tightly Coupled Systems: have a global schema
• Loosely Coupled Systems: do not have a global schema.

[Figure: global user, multidatabase management system, local users, four DBMSs, and Database 1 to Database 4]



Schema Architecture of a Tightly-Coupled System

An individual node’s participation in the MDB is defined by means of a
participation schema.

[Figure: schema layers, top to bottom]
Global user view 1 ... Global user view n
Global Conceptual Schema
Local Participation Schema 1 ... Local Participation Schema n (with auxiliary schemas alongside)
Local Conceptual Schema 1 ... Local Conceptual Schema n (with local user views 1 and 2 alongside)
Local Internal Schema 1 ... Local Internal Schema n
Local DB 1 ... Local DB n
Replication & Fragmentation
o There are two ways in which data can be stored on
different sites. These are:
o Replication - In this approach, the entire relation is stored
redundantly at two or more sites. If the entire database is available at
all sites, it is a fully redundant database. Hence, in replication,
systems maintain copies of data.
o This is advantageous as it increases the availability of data at
different sites. Also, query requests can now be processed in
parallel.
o However, it has certain disadvantages as well. Data needs to be
constantly updated. Any change made at one site needs to be
recorded at every site where that relation is stored, or else it may lead to
inconsistency. This is a lot of overhead.
o Also, concurrency control becomes much more complex, as concurrent
access now needs to be checked over a number of sites.
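
A minimal sketch of the trade-off described above, assuming a toy in-memory model: reads can be served by any site, but a write that does not reach every copy leaves the copies inconsistent. The class `ReplicatedRelation` and its method names are hypothetical.

```python
class ReplicatedRelation:
    """Toy model of one relation stored redundantly at several sites."""

    def __init__(self, sites):
        # Each site keeps its own copy, modeled as {key: row}.
        self.copies = {site: {} for site in sites}

    def write_all(self, key, row):
        """Record the change at every site that stores the relation."""
        for copy in self.copies.values():
            copy[key] = row

    def write_one(self, site, key, row):
        """Record the change at a single site only (risks inconsistency)."""
        self.copies[site][key] = row

    def read(self, site, key):
        """Any site can answer a read from its local copy."""
        return self.copies[site].get(key)

    def is_consistent(self):
        """True when all copies hold exactly the same rows."""
        copies = list(self.copies.values())
        return all(copy == copies[0] for copy in copies)


relation = ReplicatedRelation(["site_a", "site_b", "site_c"])
relation.write_all(1, "Abebe")             # every copy updated
consistent_before = relation.is_consistent()
relation.write_one("site_a", 1, "Sara")    # only one copy updated
consistent_after = relation.is_consistent()
```

The cost of `write_all` grows with the number of copies, which is the update overhead the slide refers to.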



Replication & Fragmentation (1)
o There are two ways in which data can be stored on
different sites. These are: (Cont’d)
o Fragmentation - In this approach, the relations are fragmented
(i.e., they’re divided into smaller parts) and each of the fragments is
stored at the sites where it is required.
o It must be ensured that the fragments can be
used to reconstruct the original relation (i.e., there isn’t any loss of
data).
o Fragmentation is advantageous because it doesn’t create copies of data,
so consistency is not a problem.



Replication & Fragmentation (2)
o Fragmentation of relations can be done in two ways:
• Horizontal fragmentation – Splitting by rows
  • The relation is fragmented into groups of tuples so that each tuple is
    assigned to at least one fragment.
• Vertical fragmentation – Splitting by columns
  • The schema of the relation is divided into smaller schemas. Each
    fragment must contain a common candidate key so as to ensure a
    lossless join.
• In certain cases, a hybrid of fragmentation and
  replication is used.
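
The two kinds of fragmentation, and the requirement that the original relation can be reconstructed, can be sketched as follows. The sample relation `EMPLOYEE`, its columns, and all helper function names are assumptions made for this example; `id` plays the role of the common candidate key.

```python
# Hypothetical sample relation; "id" is the candidate key.
EMPLOYEE = [
    {"id": 1, "name": "Abebe", "city": "Addis Ababa", "salary": 5000},
    {"id": 2, "name": "Sara", "city": "Adama", "salary": 7000},
    {"id": 3, "name": "Kebede", "city": "Addis Ababa", "salary": 6000},
]


def horizontal_fragment(rows, predicate):
    """Split by rows: keep the tuples that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, key, columns):
    """Split by columns: keep the key plus the chosen columns."""
    return [{key: row[key], **{c: row[c] for c in columns}} for row in rows]


def union(key, *fragments):
    """Rebuild a relation from horizontal fragments (duplicates merged by key)."""
    by_key = {}
    for fragment in fragments:
        for row in fragment:
            by_key[row[key]] = row
    return sorted(by_key.values(), key=lambda row: row[key])


def join_on_key(left, right, key):
    """Rebuild a relation from two vertical fragments by joining on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal fragments: every tuple lands in at least one fragment.
addis = horizontal_fragment(EMPLOYEE, lambda r: r["city"] == "Addis Ababa")
others = horizontal_fragment(EMPLOYEE, lambda r: r["city"] != "Addis Ababa")

# Vertical fragments: both carry the key so the join is lossless.
personal = vertical_fragment(EMPLOYEE, "id", ["name", "city"])
payroll = vertical_fragment(EMPLOYEE, "id", ["salary"])

rebuilt_from_rows = union("id", addis, others)
rebuilt_from_columns = join_on_key(personal, payroll, "id")
```

If `payroll` were built without the `id` column, there would be no way to match salaries back to employees, which is why each vertical fragment must include the key.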



Replication & Fragmentation (3)
The user is unaware of the replication of fragments.
Queries are specified on the relations (rather than the fragments).

[Figure: Relation R is split into data fragments R1, R2, R3, and R4.
Site A holds copy 1 of R1 and copy 1 of R2; Site B holds copy 2 of R1;
Site C holds copy 2 of R2. Applications see only the relations and
do not need to know this.]



Distributed Database Systems (4)
o There are several different architectures for distributed database
systems, including:
o Client-server architecture: In this architecture, clients connect to
a central server, which manages the distributed database system.
The server is responsible for coordinating transactions, managing
data storage, and providing access control.
o Peer-to-peer architecture: In this architecture, each site in the
distributed database system is connected to all other sites. Each
site is responsible for managing its own data and coordinating
transactions with other sites.
o Federated architecture: In this architecture, each site in the
distributed database system maintains its own independent
database, but the databases are integrated through a middleware
layer that provides a common interface for accessing and querying
the data.
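
As one possible reading of the federated architecture, the sketch below keeps two independent SQLite databases and puts a thin layer in front of them that offers a single query interface. The class `FederationLayer`, the helper `make_site`, and the `customer` table are hypothetical; a real middleware layer would also have to translate between differing schemas and query languages.

```python
import sqlite3


def make_site(rows):
    """Create one independent in-memory database with its own customer table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
    return conn


class FederationLayer:
    """Hypothetical middleware: one interface over several autonomous sites."""

    def __init__(self, sites):
        self.sites = sites  # {site_name: sqlite3.Connection}

    def query_all(self, sql, params=()):
        """Run the same query at every site and tag each row with its site."""
        results = []
        for name, conn in self.sites.items():
            for row in conn.execute(sql, params):
                results.append((name, *row))
        return results


federation = FederationLayer({
    "site_a": make_site([(1, "Abebe")]),
    "site_b": make_site([(2, "Sara")]),
})
rows = federation.query_all("SELECT id, name FROM customer ORDER BY id")
```

Each site stays usable on its own through its local connection; the layer only adds a common entry point.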
Advantages and Disadvantages of DDB Systems

Advantages of Distributed Database Systems:
1) Data processing is fast, as several sites participate in request
processing.
2) Reliability and availability of the system are high.
3) Operating costs are reduced.
4) It is easier to expand the system by adding more sites.
5) It has improved sharing ability and local autonomy.
Disadvantages of Distributed Database Systems:
1) The system becomes complex to manage and control.
2) Security issues must be carefully managed.
3) The system requires deadlock handling during transaction processing;
otherwise, the entire system may be left in an inconsistent state.
4) Some standardization is needed for the processing of distributed
database systems.
END
