Database Management Systems Unit-Wise

Uploaded by

satya.G
Copyright © All Rights Reserved

DATABASE MANAGEMENT SYSTEMS (MCA2101)

UNIT I: Introduction to Databases: Introduction, An Example, Characteristics of the Database Approach, Actors on Scene, Workers
behind the scene, Advantages of Using the DBMS Approach, A Brief History of Database Applications, When Not to Use a DBMS
[Text book-3] Overview of Database Languages and Architectures: Data Models, Schemas and Instances, Three Schema
Architecture and Data Independence, Database Languages and Interfaces, The Database System Environment, Centralized and
Client/Server Architecture for DBMSs, Classification of Database Management Systems [Text book-3]

UNIT II: Introduction to Database Design: Database Design and ER Diagrams, Entities, Attributes and Entity Sets, Relationships
and Relationship Sets, Additional Features of the ER Model, Conceptual Design with the ER Model, Conceptual Design for Large
Enterprises. Relational Model: Introduction to the Relational Model, Integrity Constraints over Relations, Enforcing Integrity
Constraints, Querying Relational Data, Logical Database Design: ER to Relational, Introduction to Views, Destroying/Altering Tables
and Views

UNIT III: Relational Algebra: Selection and Projection, Set Operations, Renaming, Joins, Division, More Examples of Algebra
Queries. SQL: Queries, Constraints, Triggers: The Form of a Basic SQL Query, UNION, INTERSECT and EXCEPT, Nested
Queries, Aggregate Operators, Null Values, Complex Integrity Constraints in SQL, Triggers and Active Databases, Designing Active
Databases.

UNIT IV: Introduction to Normalization Using Functional and Multivalued Dependencies: Informal Design Guidelines for
Relation Schema, Functional Dependencies, Normal Forms Based on Primary Keys, General Definitions of Second and Third Normal
Forms, Boyce-Codd Normal Form, Multivalued Dependency and Fourth Normal Form, Join Dependencies and Fifth Normal Form.

UNIT V: Transaction Management and Concurrency Control: Transaction Concept, A Simple Transaction Model, Storage
Structure, ACID Properties, Serializability, Transaction Isolation Levels, Concurrency Control, Lock-Based Protocols, Validation-
Based Protocols [Text Book-2]

UNIT I
1.1 Introduction
Importance: Database systems have become an essential component of life in modern society, in that many frequently occurring
events trigger the accessing of at least one database: bibliographic library searches, bank transactions, hotel/airline reservations,
grocery store purchases, online (Web) purchases, etc.
Also, database search techniques are applied by some Web search engines.
Definitions
The term database is often used, rather loosely, to refer to just about any collection of related data. Elmasri and Navathe (E&N) say
that, in addition to being a collection of related data, a database must have the following properties:
• It represents some aspect of the real (or an imagined) world, called the miniworld or universe of discourse. Changes to the
miniworld are reflected in the database. Imagine, for example, a UNIVERSITY miniworld concerned with students,
courses, course sections, grades, and course prerequisites.
• It is a logically coherent collection of data, to which some meaning can be attached. (Logical coherency requires, in part, that
the database not be self-contradictory.)
• It has a purpose: there is an intended group of users and some preconceived applications that the users are interested in
employing.
To summarize: a database has some source (i.e., the miniworld) from which (logically consistent) data are derived, some degree of
interaction with events in the represented miniworld (at least insofar as the data is updated in response to changes in the state of the
miniworld), and an audience that is interested in using it.
An Aside: data vs. information vs. knowledge: Data is the representation of "facts" or "observations" whereas information refers to
the meaning thereof (according to some interpretation). Knowledge, on the other hand, refers to the ability to use information to
achieve intended ends.
Computerized vs. manual: Not surprisingly (this being a CS course), our concern will be with computerized database systems, as
opposed to manual ones, such as the card catalog-based systems that were used in libraries in ancient times (i.e., before the year 2000).
(Some authors wouldn't even recognize a non-computerized collection of data as a database, but E&N do.)
Size/Complexity: Databases range from small/simple (e.g., one person's recipe database) to huge/complex (e.g.,
Amazon's database that keeps track of all its products, customers, and suppliers).
Definition: A database management system (DBMS) is a collection of programs enabling users to create and maintain a database.
DB Functionalities: More specifically, a DBMS is a general-purpose software system facilitating each of the following (with respect
to a database):
• definition: specifying data types (and other constraints to which the data must conform) and data organization
• construction: the process of storing the data on some medium (e.g., magnetic disk) that is controlled by the DBMS
• manipulation: querying, updating, report generation
• sharing: allowing multiple users and programs to access the database "simultaneously"
• system protection: preventing the database from becoming corrupted when hardware or software failures occur
• security protection: preventing unauthorized or malicious access to the database.
Given all its responsibilities, it is not surprising that a typical DBMS is a complex piece of software.
A database together with the DBMS software is referred to as a database system. (See Figure 1.1, page 7.)
1.2: An Example:
Consider the UNIVERSITY database in Figure 1.2. Notice that it is relational!
Among the main ideas illustrated in this example is that each file/relation/table has a set of named fields/attributes/columns, each of
which is specified to be of some data type. (In addition to a data type, we might put further restrictions upon a field, e.g.,
the Grade field in the GRADE_REPORT table must have a value from the set {'A', 'B', ..., 'F'}.)
The idea is that, of course, each table will be populated with data in the form of records/tuples/rows, each of which represents some
entity (in the miniworld) or some relationship between entities.
For example, each record in the STUDENT table represents a —surprise!— student. Similarly for
the COURSE and SECTION tables.
On the other hand, each record in GRADE_REPORT represents a relationship between a student and a section of a course. And each
record in PREREQUISITE represents a relationship between two courses.
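The idea of named, typed fields with an extra restriction on Grade can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module, not the definition from Figure 1.2: the column names, the sample rows, and the choice of 'A', 'B', 'C', 'D', 'F' as the allowed grades are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Column names are assumed for illustration; Figure 1.2 is not reproduced in these notes.
conn.executescript("""
CREATE TABLE STUDENT (
    Student_number INTEGER PRIMARY KEY,
    Name           TEXT NOT NULL,
    Class          INTEGER,
    Major          TEXT
);
CREATE TABLE COURSE (
    Course_number TEXT PRIMARY KEY,
    Course_name   TEXT NOT NULL
);
CREATE TABLE SECTION (
    Section_identifier INTEGER PRIMARY KEY,
    Course_number      TEXT NOT NULL REFERENCES COURSE(Course_number),
    Semester           TEXT,
    Year               INTEGER
);
-- Each GRADE_REPORT row relates a student to a section of a course.
CREATE TABLE GRADE_REPORT (
    Student_number     INTEGER NOT NULL REFERENCES STUDENT(Student_number),
    Section_identifier INTEGER NOT NULL REFERENCES SECTION(Section_identifier),
    Grade              TEXT CHECK (Grade IN ('A', 'B', 'C', 'D', 'F'))
);
-- Each PREREQUISITE row relates two courses.
CREATE TABLE PREREQUISITE (
    Course_number       TEXT NOT NULL REFERENCES COURSE(Course_number),
    Prerequisite_number TEXT NOT NULL REFERENCES COURSE(Course_number)
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO STUDENT VALUES (17, 'Smith', 1, 'CS')")
conn.execute("INSERT INTO COURSE VALUES ('CS3380', 'Database')")
conn.execute("INSERT INTO SECTION VALUES (135, 'CS3380', 'Fall', 2008)")


def try_grade(grade):
    """Return True if the DBMS accepts the grade, False if the CHECK rejects it."""
    try:
        conn.execute("INSERT INTO GRADE_REPORT VALUES (17, 135, ?)", (grade,))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_grade("B")   # a value inside the allowed set
rejected = try_grade("Z")   # a value outside the allowed set
```

Here the restriction lives in the table definition, so every program that inserts a grade is subject to it without repeating the check itself.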
Database manipulation involves querying and updating.
Examples of (informal) queries:
• Retrieve the transcript(s) of student(s) named 'Smith'.
• List the names of students who were enrolled in a section of the 'Database' course in Spring 2006, as well as their grades in
that course section.
• List all prerequisites of the 'Database' course.
Examples of (informal) updates:
• Change the CLASS value of 'Smith' to sophomore (i.e., 2).
• Insert a record for a section of 'File Processing' for this semester.
• Remove from the prerequisites of course 'CMPS 340' the course 'CMPS 144'.
A query/update must be conveyed to the DBMS in a precise way (via the query language of the DBMS) in order to be processed.
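As a rough sketch of what "a precise way" looks like, here are one of the informal queries and two of the informal updates written in SQL and run through Python's sqlite3 module. The table layout and all sample rows are invented for the example (in particular, treating 'CMPS 340' as the 'Database' course is an assumption).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down schema and data.
conn.executescript("""
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Class INTEGER);
CREATE TABLE COURSE (Course_number TEXT PRIMARY KEY, Course_name TEXT);
CREATE TABLE PREREQUISITE (Course_number TEXT, Prerequisite_number TEXT);
INSERT INTO STUDENT VALUES (17, 'Smith', 1);
INSERT INTO COURSE VALUES ('CMPS 340', 'Database'),
                          ('CMPS 144', 'Sample Course A'),
                          ('CMPS 240', 'Sample Course B');
INSERT INTO PREREQUISITE VALUES ('CMPS 340', 'CMPS 144'), ('CMPS 340', 'CMPS 240');
""")


def prerequisites_of(course_name):
    """Query: list all prerequisites of the course with the given name."""
    rows = conn.execute(
        """SELECT p.Prerequisite_number
           FROM PREREQUISITE AS p
           JOIN COURSE AS c ON c.Course_number = p.Course_number
           WHERE c.Course_name = ?
           ORDER BY p.Prerequisite_number""",
        (course_name,),
    ).fetchall()
    return [row[0] for row in rows]


before = prerequisites_of("Database")

# Update: change the Class value of 'Smith' to sophomore (i.e., 2).
conn.execute("UPDATE STUDENT SET Class = 2 WHERE Name = ?", ("Smith",))

# Update: remove 'CMPS 144' from the prerequisites of 'CMPS 340'.
conn.execute(
    "DELETE FROM PREREQUISITE WHERE Course_number = ? AND Prerequisite_number = ?",
    ("CMPS 340", "CMPS 144"),
)

after = prerequisites_of("Database")
smith_class = conn.execute(
    "SELECT Class FROM STUDENT WHERE Name = ?", ("Smith",)
).fetchone()[0]
```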
As with software in general, developing a new database (or a new application for an existing database) proceeds in phases,
including requirements analysis and various levels of design (conceptual (e.g., Entity-Relationship Modeling), logical (e.g.,
relational), and physical (file structures)).
1.3: Characteristics of the Database Approach:
Database approach vs. File Processing approach: Consider an organization/enterprise that is organized as a collection of
departments/offices. Each department has certain data processing "needs", many of which are unique to it. In the file processing
approach, each department would control a collection of relevant data files and software applications to manipulate that data.
The main drawback of this approach is data redundancy, which not only wastes storage space but also makes it more difficult to
keep changing data items consistent with one another, as a change to one copy of a data item must be made to all of them.
Inconsistency results when one (or more) copies of a datum are changed but not others. (E.g., If you change your address, informing
the Registrar's Office should suffice to ensure that your grades are sent to the right place, but does not guarantee that your next bill
will be, as the copy of your address maintained by the Bursar's Office might not have been changed.)
In the database approach, a single repository of data is maintained that is used by all the departments in the organization.
(Note that "single repository" is used in the logical sense. In physical terms, the data may be distributed among various sites, and
possibly mirrored.)

Main Characteristics of database approach:


1. Self-Description: In addition to the stored data that is of relevance to the organization, a database system includes a
complete definition/description of the database's structure and constraints. This meta-data (i.e., data about data) is stored in
the so-called system catalog, which contains a description of the structure of each file, the type and storage format of each
field, and the various constraints on the data. The system catalog is used not only by users (e.g., who need to know the names
of tables and attributes, and sometimes data type information and other things), but also by the DBMS software, which
certainly needs to "know" how the data is structured/organized in order to interpret it in a manner consistent with that
structure. Recall that a DBMS is general purpose, as opposed to being a specific database application. Hence, the structure of
the data cannot be "hard-coded" in its programs (such as is the case in typical file processing approaches), but rather must be
treated as a "parameter" in some sense.
2. Insulation between Programs and Data; Data Abstraction:
Program-Data Independence: In traditional file processing, the structure of the data files accessed by an application is
"hard-coded" in its source code. (E.g., Consider a file descriptor in a COBOL program: it gives a detailed description of the
layout of the records in a file by describing, for each field, how many bytes it occupies.)
If, for some reason, we decide to change the structure of the data (e.g., by adding the first two digits to the YEAR field, in
order to make the program Y2K compliant!), every application in which a description of that file's structure is hard-coded
must be changed!
In contrast, DBMS access programs, in most cases, do not require such changes, because the structure of the data is described
(in the system catalog) separately from the programs that access it and those programs consult the catalog in order to
ascertain the structure of the data (i.e., providing a means by which to determine boundaries between records and between
fields within records) so that they interpret that data properly.
3. Multiple Views of Data: Different users (e.g., in different departments of an organization) have different "views" or
perspectives on the database. For example, from the point of view of a Bursar's Office employee, student data does not
include anything about which courses were taken or which grades were earned. (This is an example of a subset view.)
As another example, a Registrar's Office employee might think that GPA is a field of data in each student's record. In reality,
the underlying database might calculate that value each time it is needed. This is called virtual (or derived) data.
A view designed for an academic advisor might give the appearance that the data is structured to point out the prerequisites
of each course.
4. Data Sharing and Multi-user Transaction Processing: As you learned about (or will) in the OS course, the simultaneous
access of computer resources by multiple users/processes is a major source of complexity. The same is true for multi-user
DBMSs.
Arising from this is the need for concurrency control, which is supposed to ensure that several users trying to update the
same data do so in a "controlled" manner so that the results of the updates are as though they were done in some sequential
order. This gives rise to the concept of a transaction, which is a process that makes one or more accesses to a database and
which must have the appearance of executing in isolation from all other transactions (even ones that access the same data at
the "same time") and of being atomic (in the sense that, if the system crashes in the middle of its execution, the database
contents must be as though it did not execute at all).
Applications such as airline reservation systems are known as online transaction processing applications.
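The atomicity requirement described under characteristic 4 can be sketched with a two-step transfer between accounts. This is a minimal single-user illustration using Python's sqlite3 module; the ACCOUNT table, the balances, and the use of a CHECK violation to stand in for a mid-transaction failure are all assumptions of the example, and it does not demonstrate isolation between concurrent users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ACCOUNT (Id INTEGER PRIMARY KEY, Balance INTEGER CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO ACCOUNT VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits if the block finishes, rolls back if an exception escapes
            conn.execute(
                "UPDATE ACCOUNT SET Balance = Balance + ? WHERE Id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE ACCOUNT SET Balance = Balance - ? WHERE Id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Current balance of every account, keyed by account Id."""
    return dict(conn.execute("SELECT Id, Balance FROM ACCOUNT ORDER BY Id").fetchall())


ok = transfer(1, 2, 30)        # both updates succeed and are committed together
after_ok = balances()
failed = transfer(1, 2, 500)   # the credit is applied, then the debit violates the CHECK
after_failed = balances()      # the credit has been rolled back as well
```

After the failed transfer the balances are exactly what they were before it started, which is the "as though it did not execute at all" behavior.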
1.4: Actors on the Scene
These apply to "large" databases, not "personal" databases that are defined, constructed, and used by a single person via, say,
Microsoft Access.
1. Database Administrator (DBA): This is the chief administrator, who oversees and manages the database system (including
the data and software). Duties include authorizing users to access the database, coordinating/monitoring its use, acquiring
hardware/software for upgrades, etc. In large organizations, the DBA might have a support staff.
2. Database Designers: They are responsible for identifying the data to be stored and for choosing an appropriate way to
organize it. They also define views for different categories of users. The final design must be able to support the requirements
of all the user sub-groups.
3. End Users: These are persons who access the database for querying, updating, and report generation. They are the main
reason for the database's existence!
o Casual end users: use database occasionally, needing different information each time; use query language to
specify their requests; typically middle- or high-level managers.
o Naive/Parametric end users: Typically the biggest group of users; frequently query/update the database using
standard canned transactions that have been carefully programmed and tested in advance. Examples:
• bank tellers check account balances and post withdrawals/deposits.
• reservation clerks for airlines, hotels, etc., check availability of seats/rooms and make reservations.
• shipping clerks (e.g., at UPS) use buttons, bar code scanners, etc., to update the status of in-transit
packages.
o Sophisticated end users: engineers, scientists, business analysts who implement their own applications to meet
their complex needs.
o Stand-alone users: Use "personal" databases, possibly employing a special-purpose (e.g., financial) software
package.
4. System Analysts, Application Programmers, Software Engineers:
o System Analysts: determine needs of end users, especially naive and parametric users, and develop specifications
for canned transactions that meet these needs.
o Application Programmers: Implement, test, document, and maintain programs that satisfy the specifications
mentioned above.
1.5: Workers Behind the Scene
• DBMS system designers/implementors: provide the DBMS software that is at the foundation of all this!
• Tool developers: design and implement software tools facilitating database system design, performance monitoring, creation
of graphical user interfaces, prototyping, etc.
• Operators and maintenance personnel: responsible for the day-to-day operation of the system.
1.6: Capabilities/Advantages of DBMS's
1. Controlling Redundancy: Data redundancy (such as tends to occur in the "file processing" approach) leads to wasted
storage space, duplication of effort (when multiple copies of a datum need to be updated), and a higher likelihood of the
introduction of inconsistency.
On the other hand, redundancy can be used to improve performance of queries. Indexes, for example, are entirely redundant,
but help the DBMS in processing queries more quickly.
Another example of using redundancy to improve performance is to store an "extra" field in order to avoid the need to access
other tables (as when doing a JOIN, for example). See Figure 1.6 (page 18): the Student_name and Course_number fields need
not be there.
A DBMS should provide the capability to automatically enforce the rule that no inconsistencies are introduced when data is
updated. (Figure 1.6 again, in which Student_name does not match Student_number.)
2. Restricting Unauthorized Access: A DBMS should provide a security and authorization subsystem, which is used for
specifying restrictions on user accounts. Common kinds of restrictions are to allow read-only access (no updating), or access
only to a subset of the data (e.g., recall the Bursar's and Registrar's office examples from above).
3. Providing Persistent Storage for Program Objects: Object-oriented database systems make it easier for complex runtime
objects (e.g., lists, trees) to be saved in secondary storage so as to survive beyond program termination and to be retrievable
at a later time.
4. Providing Storage Structures for Efficient Query Processing: The DBMS maintains indexes (typically in the form of trees
and/or hash tables) that are utilized to improve the execution time of queries and updates. (The choice of which indexes to
create and maintain is part of physical database design and tuning (see Chapter 16) and is the responsibility of the DBA.)
The query processing and optimization module is responsible for choosing an efficient query execution plan for each query
submitted to the system. (See Chapter 15.)
5. Providing Backup and Recovery: The subsystem having this responsibility ensures that recovery is possible in the case of a
system crash during execution of one or more transactions.
6. Providing Multiple User Interfaces: For example, query languages for casual users, programming language interfaces for
application programmers, forms and/or command codes for parametric users, menu-driven interfaces for stand-alone users.
7. Representing Complex Relationships Among Data: A DBMS should have the capability to represent such relationships
and to retrieve related data quickly.
8. Enforcing Integrity Constraints: Most database applications are such that the semantics (i.e., meaning) of the data require
that it satisfy certain restrictions in order to make sense. Perhaps the most fundamental constraint on a data item is its data
type, which specifies the universe of values from which its value may be drawn. (E.g., a Grade field could be defined to be of
type Grade_Type, which, say, we have defined as including precisely the values in the set { "A", "A-", "B+", ..., "F" }.)
Another kind of constraint is referential integrity, which says that if the database includes an entity that refers to another one,
the latter entity must exist in the database. For example, if (R56547, CIL102) is a tuple in the Enrolled_In relation, indicating
that a student with ID R56547 is taking a course with ID CIL102, there must be tuples in the Student and Course relations,
respectively, that describe a student and a course with those IDs.
9. Permitting Inferencing and Actions Via Rules: In a deductive database system, one may specify declarative rules that
allow the database to infer new data! E.g., Figure out which students are on academic probation. Such capabilities would take
the place of application programs that would be used to ascertain such information otherwise.
Active database systems go one step further by allowing "active rules" that can be used to initiate actions automatically.
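The referential integrity example from item 8 can be tried out directly. The sketch below uses Python's sqlite3 module; the column names (Id, Student_id, Course_id) and the second student ID are made up for the illustration, and SQLite specifically needs foreign-key checking switched on per connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

# Column names are assumptions; only the relation names come from the notes.
conn.executescript("""
CREATE TABLE Student (Id TEXT PRIMARY KEY);
CREATE TABLE Course  (Id TEXT PRIMARY KEY);
CREATE TABLE Enrolled_In (
    Student_id TEXT NOT NULL REFERENCES Student(Id),
    Course_id  TEXT NOT NULL REFERENCES Course(Id)
);
INSERT INTO Student VALUES ('R56547');
INSERT INTO Course  VALUES ('CIL102');
""")


def enroll(student_id, course_id):
    """Insert an Enrolled_In tuple; return False if the DBMS rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Enrolled_In VALUES (?, ?)", (student_id, course_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False


valid = enroll("R56547", "CIL102")     # both referenced tuples exist
dangling = enroll("R00000", "CIL102")  # hypothetical ID with no Student tuple
enrolled_count = conn.execute("SELECT COUNT(*) FROM Enrolled_In").fetchone()[0]
```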

A Brief History of Database Applications

1. Early Database Applications Using Hierarchical and Network Systems: Many early database applications maintained
records in large organizations, such as corporations, universities, hospitals, and banks. In many of these applications, there
were large numbers of records of similar structure. One of the main problems with early database systems was the
intermixing of conceptual relationships with the physical storage and placement of records on disk. Another shortcoming of
early systems was that they provided only programming language interfaces. This made it time-consuming and expensive to
implement new queries and transactions, since new programs had to be written, tested, and debugged.
2. Providing Application Flexibility with Relational Databases: Relational databases were originally proposed to separate
the physical storage of data from its conceptual representation and to provide a mathematical foundation for databases. The
relational data model also introduced high-level query languages that provided an alternative to programming language
interfaces; hence, it was much quicker to write new queries. Eventually, relational databases became the dominant type of
database system for traditional database applications. Relational databases now exist on almost all types of computers, from
small personal computers to large servers.
3. Object-Oriented Applications and the Need for More Complex Databases: The emergence of object-oriented
programming languages in the 1980s and the need to store and share complex-structured objects led to the development of
object-oriented databases. Initially, they were considered a competitor to relational databases, since they provided more
general data structures. They also incorporated many of the useful object-oriented paradigms, such as abstract data types,
encapsulation of operations, inheritance, and object identity. However, the complexity of the model and the lack of an early
standard contributed to their limited use. They are now mainly used in specialized applications, such as engineering design,
multimedia publishing, and manufacturing systems.
4. Interchanging Data on the Web for E-Commerce: The World Wide Web provides a large network of interconnected
computers. Users can create documents using a Web publishing language, such as HTML (HyperText Markup Language),
and store these documents on Web servers where other users (clients) can access them. Documents can be linked together
through hyperlinks, which are pointers to other documents. A variety of techniques were developed to allow the interchange
of data on the Web. Currently, XML (eXtensible Markup Language) is considered to be the primary standard for
interchanging data among various types of databases and Web pages. XML combines concepts from the models used in
document systems with database modeling concepts.
5. Extending Database Capabilities for New Applications: The success of database systems in traditional applications
encouraged developers of other types of applications to attempt to use them. Such applications traditionally used their own
specialized file and data structures.
★ Scientific applications → store large amounts of data from scientific experiments
★ Storage & retrieval of images
★ Storage & retrieval of videos
★ Data mining applications → analyzing large amounts of data
★ Spatial applications → e.g., weather information
★ Time series applications → e.g., daily sales information
When Not to Use a DBMS:
Overhead costs of using a DBMS:
★ High initial investment
★ Overhead for providing security, concurrency control, recovery
A DBMS may be unnecessary when:
★ Database & applications → simple, well defined, and no changes expected.
★ Multiple-user access → not required.

Overview of Database Languages and Architectures:


Data Models: A data model is a collection of concepts that can be used to describe the structure of a database. It provides the
necessary means to achieve data abstraction, i.e., hiding details of data storage that most users do not need to see.
Categories of Data Models: Many data models have been proposed, and we can categorize them according to the types of concepts
they use to describe the database structure.
High-level or conceptual data models provide concepts that are close to the way many users perceive data. Conceptual data models
use concepts such as entities, attributes, and relationships. An entity represents a real-world object or concept, such as an employee or
a project, that is described in the database. An attribute represents some property of interest that further describes an entity, such as the
employee's name or salary. A relationship among two or more entities represents an interaction among the entities.
Low-level or physical data models provide concepts that describe the details of how data is stored in the computer. They
describe how data is stored as files in the computer by representing information such as record formats, record orderings, and access
paths. An access path is a search structure, such as an index or a hash structure, that makes the search for particular database records
efficient.
Representational (or implementation) data models lie between these two extremes: they provide concepts that may be understood by
end users but that are not too far removed from the way data is organized within the computer. Representational data models hide
some details of data storage but can be implemented on a computer system in a direct way. Representational data models represent
data by using record structures and hence are sometimes called record-based data models.

Schemas, Instances, and Database State: In a data model, it is important to distinguish between the description of the database and the
database itself. The description of a database is called the database schema, which is specified during database design and is not
expected to change frequently. Most data models have certain conventions for displaying schemas as diagrams; a displayed schema
is called a schema diagram.

A schema diagram displays the structure of each record type but not the actual instances of records. We call each object in the schema,
such as STUDENT or COURSE, a schema construct. The data in the database at a particular moment in time is called a database
state or snapshot. It is also called the current set of occurrences or instances in the database.
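The distinction between schema and state can be seen by reading both from a live database. The following is a small sketch with Python's sqlite3 module, where `PRAGMA table_info` is SQLite's way of exposing catalog information about a table; the STUDENT columns and sample rows are assumptions of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Class INTEGER)"
)


def schema():
    """Schema: column names and declared types, read from SQLite's catalog."""
    # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(STUDENT)")]


def state():
    """State (snapshot): the rows stored at this moment."""
    return conn.execute("SELECT * FROM STUDENT ORDER BY Student_number").fetchall()


schema_before, state_before = schema(), state()

# Hypothetical sample rows: inserting them changes the state, not the schema.
conn.execute("INSERT INTO STUDENT VALUES (17, 'Smith', 1)")
conn.execute("INSERT INTO STUDENT VALUES (8, 'Brown', 2)")

schema_after, state_after = schema(), state()
```

The schema is identical before and after the inserts, while the state goes from empty to two rows.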

Three Schema Architecture and Data Independence:


The Three-Schema Architecture: The goal of the three-schema architecture is to separate the user applications from the physical
database. In this architecture, schemas can be defined at the following three levels:
1. The internal level has an internal schema, which describes the physical storage structure of the database. The internal schema uses a
physical data model and describes the complete details of data storage and access paths for the database.
2. The conceptual level has a conceptual schema, which describes the structure of the whole database for a community of users. The
conceptual schema hides the details of physical storage structures and concentrates on describing entities, data types, relationships,
user operations, and constraints. Usually, a representational data model is used to describe the conceptual schema when a database
system is implemented. This implementation conceptual schema is often based on a conceptual schema design in a high-level data
model.
3. The external or view level includes a number of external schemas or user views. Each external schema describes the part of the
database that a particular user group is interested in and hides the rest of the database from that user group. As in the previous level,
each external schema is typically implemented using a representational data model, possibly based on an external schema design in a
high-level conceptual data model.
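In a relational DBMS, external schemas are commonly realized as views over the conceptual-level tables. The sketch below (Python's sqlite3 module) revisits the Bursar's and Registrar's perspectives from Section 1.3. Table names, columns, sample rows, and the unweighted average used for GPA are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: base tables (names and columns are assumed).
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Address TEXT);
CREATE TABLE GRADE_REPORT (Student_number INTEGER, Course_number TEXT, Grade_points REAL);

-- External level: a subset view that exposes no course or grade data.
CREATE VIEW BURSAR_STUDENT AS
    SELECT Student_number, Name, Address FROM STUDENT;

-- External level: GPA looks like a stored field but is computed on demand.
CREATE VIEW REGISTRAR_STUDENT AS
    SELECT s.Student_number, s.Name, AVG(g.Grade_points) AS GPA
    FROM STUDENT AS s
    LEFT JOIN GRADE_REPORT AS g ON g.Student_number = s.Student_number
    GROUP BY s.Student_number, s.Name;

-- Hypothetical sample rows.
INSERT INTO STUDENT VALUES (17, 'Smith', '1 Main St');
INSERT INTO GRADE_REPORT VALUES (17, 'CS1310', 4.0), (17, 'MATH2410', 3.0);
""")

# Column names visible to a Bursar's Office user.
bursar_columns = [
    col[0] for col in conn.execute("SELECT * FROM BURSAR_STUDENT").description
]

# Derived (virtual) data as seen by a Registrar's Office user.
gpa = conn.execute(
    "SELECT GPA FROM REGISTRAR_STUDENT WHERE Student_number = ?", (17,)
).fetchone()[0]
```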
Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence is the ability to modify the schema at one level of the database system without
altering the schema at the next higher level.
There are two types of data independence:
1. Logical Data Independence
o Logical data independence is the ability to change the conceptual schema without having to change the
external schemas.
o Logical data independence separates the external level from the conceptual level.
o If we change the conceptual schema (e.g., by adding a data item), the user views of the data are not affected.
o Logical data independence occurs at the user interface level.
2. Physical Data Independence
o Physical data independence is the capacity to change the internal schema without having to change the
conceptual schema.
o If we change how the data is stored (e.g., the storage size or device used by the database server), the conceptual structure of
the database is not affected.
o Physical data independence separates the conceptual level from the internal level.
o Physical data independence occurs at the logical interface level.
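Both kinds of data independence can be observed in a small experiment. In this sketch (Python's sqlite3 module, with an assumed STUDENT table, view, and index name), a query written against a view keeps returning the same answer after a conceptual-level change (adding a column) and after an internal-level change (adding an index). It illustrates the idea on one system rather than proving anything general.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Class INTEGER);
CREATE VIEW STUDENT_NAMES AS SELECT Student_number, Name FROM STUDENT;
INSERT INTO STUDENT VALUES (17, 'Smith', 1), (8, 'Brown', 2);
""")


def external_query():
    """A user-level query written against the view only."""
    return conn.execute(
        "SELECT Name FROM STUDENT_NAMES WHERE Name = ? ORDER BY Student_number",
        ("Smith",),
    ).fetchall()


def plan():
    """SQLite's description of how it would execute the same query."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT Name FROM STUDENT_NAMES WHERE Name = ?",
        ("Smith",),
    ).fetchall()
    return " ".join(str(row[-1]) for row in rows)


baseline = external_query()

# Conceptual-schema change: add a data item. The view and the query are untouched.
conn.execute("ALTER TABLE STUDENT ADD COLUMN Major TEXT")
after_logical_change = external_query()

# Internal-schema change: add an access path. No table or view definition changes.
conn.execute("CREATE INDEX idx_student_name ON STUDENT (Name)")
after_physical_change = external_query()
plan_after = plan()  # the exact wording of the plan text varies by SQLite version
```

Printing `plan_after` will typically show that the new index is now used, even though the text of the query never changed.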

Database Languages and Interfaces:


DB LANGUAGES:
In DBMSs where a clear separation is maintained between the conceptual and internal levels, the data definition language (DDL) is
used to specify the conceptual schema only. Another language, the storage definition language (SDL), is used to specify the internal
schema. The
mappings between the two schemas may be specified in either one of these languages. For a true three-schema architecture, we would
need a third language, the view definition language (VDL), to specify user views and their mappings to the conceptual schema, but in
most DBMSs the DDL is used to define both conceptual and external schemas.
Once the database schemas are compiled and the database is populated with data, users must have some means to manipulate
the database. Typical manipulations include retrieval, insertion, deletion, and modification of the data. The DBMS provides a data
manipulation language (DML) for these purposes. There are two main types of DMLs. A high-level or nonprocedural DML can be
used on its own to specify complex database operations in a concise manner. Many DBMSs allow high-level DML statements either
to be entered interactively from a terminal (or monitor) or to be embedded in a general-purpose programming language. In the latter
case, DML statements must be identified within the program so that they can be extracted by a pre-compiler and processed by the
DBMS.
A low-level or procedural DML must be embedded in a general-purpose programming language. This type of DML typically
retrieves individual records or objects from the database and processes each separately. Hence, it needs to use programming language
constructs, such as looping, to retrieve and process each record from a set of records. Low-level DMLs are also called record-at-a-time
DMLs because of this property.
High-level DMLs, such as SQL, can specify and retrieve many records in a single DML statement and are hence called
set-at-a-time or set-oriented DMLs. A query in a high-level DML often specifies which data to retrieve rather than how to retrieve it;
hence, such languages are also called declarative.
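The contrast between the two styles can be imitated in a few lines. In the sketch below (Python's sqlite3 module, with an assumed STUDENT table and sample rows), the first function loops over records in the host language and updates them one by one, while the second issues a single set-oriented statement. Note that SQLite itself offers only SQL, so the loop merely mimics the record-at-a-time style rather than using a genuine low-level DML.

```python
import sqlite3


def make_db():
    """Create a fresh in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Class INTEGER)"
    )
    conn.executemany(
        "INSERT INTO STUDENT VALUES (?, ?, ?)",
        [(17, "Smith", 1), (8, "Brown", 2), (5, "Jones", 1)],
    )
    return conn


def promote_record_at_a_time(conn):
    """Procedural style: fetch each record, test it, and update it individually."""
    rows = conn.execute("SELECT Student_number, Class FROM STUDENT").fetchall()
    for student_number, student_class in rows:  # explicit loop in the host language
        if student_class == 1:
            conn.execute(
                "UPDATE STUDENT SET Class = 2 WHERE Student_number = ?",
                (student_number,),
            )


def promote_set_at_a_time(conn):
    """Declarative style: one statement says what to change, not how to find it."""
    conn.execute("UPDATE STUDENT SET Class = 2 WHERE Class = 1")


def classes(conn):
    """Return (Student_number, Class) pairs in a fixed order for comparison."""
    return conn.execute(
        "SELECT Student_number, Class FROM STUDENT ORDER BY Student_number"
    ).fetchall()


db_loop, db_set = make_db(), make_db()
promote_record_at_a_time(db_loop)
promote_set_at_a_time(db_set)
result_loop, result_set = classes(db_loop), classes(db_set)
```

Both versions leave the table in the same state; the set-oriented one leaves the choice of access strategy to the DBMS.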

DB INTERFACES:
A database management system (DBMS) interface is a user interface that lets users submit requests to a database without
writing the query language directly. User-friendly interfaces provided by a DBMS may include the following:
• Menu-Based Interfaces
• Forms-Based Interfaces
• Graphical User Interfaces
• Natural Language Interfaces
• Interfaces for Parametric Users
• Interfaces for the Database Administrator (DBA)
Menu-Based Interfaces
These interfaces present the user with lists of options (called menus) that lead the user through the formation of a request. The basic
advantage of using menus is that they relieve the user of having to memorize the specific commands and syntax of a query language.
The query is composed step by step by picking options from a menu that is shown by the system. Pull-down menus
are a very popular technique in Web-based interfaces. They are also often used in browsing interfaces, which allow a user to look
through the contents of a database in an exploratory and unstructured manner.
Forms-Based Interfaces
A forms-based interface displays a form to each user. Users can fill out all of the form entries to insert new data, or they can fill out
only certain entries, in which case the DBMS will retrieve matching data for the remaining entries. These forms are
usually designed and programmed for naive users who need no knowledge of the underlying system. Many DBMSs have form
specification languages, which are special languages that help specify such forms.
Example: SQL Forms is a form-based language that specifies queries using a form designed in conjunction with the relational
database schema.
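The idea that partially filled form entries drive a retrieval can be sketched as follows. This is a minimal illustration, not how SQL Forms or any particular product works: the student table, the FORM_FIELDS tuple and the query_by_form function are all hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical table and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT, branch TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Kiran", "MCA"), (2, "Latha", "MBA"), (3, "Suresh", "MCA")],
)

FORM_FIELDS = ("roll_no", "name", "branch")  # fixed by the form designer

def query_by_form(conn, form):
    # Only entries the user filled in (non-empty) become search conditions.
    filled = [(f, form[f]) for f in FORM_FIELDS if form.get(f) not in (None, "")]
    sql = "SELECT roll_no, name, branch FROM student"
    if filled:
        sql += " WHERE " + " AND ".join(f"{field} = ?" for field, _ in filled)
    sql += " ORDER BY roll_no"
    # Values are passed as parameters; field names come from FORM_FIELDS only.
    return conn.execute(sql, [value for _, value in filled]).fetchall()

print(query_by_form(conn, {"branch": "MCA"}))
```

The user never sees the SQL text: filling in only the branch entry is enough to retrieve the remaining entries of every matching record.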
Graphical User Interfaces
A GUI typically displays a schema to the user in diagrammatic form. The user can then specify a query by manipulating the diagram.
In many cases, GUIs utilize both menus and forms. Most GUIs use a pointing device, such as a mouse, to pick certain parts of the
displayed schema diagram.
Natural Language Interfaces
These interfaces accept requests written in English or some other language and attempt to understand them. A natural language
interface has its own schema, which is similar to the database conceptual schema, as well as a dictionary of important words.
The natural language interface refers to the words in its schema, as well as to the set of standard words in its dictionary, to interpret the
request. If the interpretation is successful, the interface generates a high-level query corresponding to the natural language request and submits
it to the DBMS for processing; otherwise, a dialogue is started with the user to clarify the request. The main
disadvantage is that the capabilities of this type of interface are still limited.
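A toy sketch of this interpret-or-clarify behaviour is given below. It is far simpler than any real natural language interface: the two word dictionaries and the interpret function are hypothetical, and the "understanding" is plain keyword lookup.

```python
# Toy vocabulary: words the interface "knows", mapped to schema elements.
# All names here are hypothetical and exist only for this illustration.
TABLE_WORDS = {"students": "student", "employees": "employee"}
COLUMN_WORDS = {"names": "name", "salaries": "salary", "branches": "branch"}

def interpret(request):
    """Return ("query", sql) on success or ("clarify", question) otherwise."""
    words = request.lower().replace("?", "").split()
    tables = [TABLE_WORDS[w] for w in words if w in TABLE_WORDS]
    columns = [COLUMN_WORDS[w] for w in words if w in COLUMN_WORDS]
    if len(tables) != 1:
        # Interpretation failed: start a dialogue instead of guessing.
        return ("clarify", "Which table do you mean?")
    select_list = ", ".join(columns) if columns else "*"
    return ("query", f"SELECT {select_list} FROM {tables[0]}")

print(interpret("Show the names of all students"))
print(interpret("Show everything"))
```

The first request maps onto known schema words and yields a high-level query; the second does not, so the interface asks the user for clarification.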
Interfaces for Parametric Users
Interfaces for parametric users provide a small set of commands that can be issued with a minimum of keystrokes. Such interfaces are
typical in banking, for example for transferring money, where the same operations are performed repeatedly.
Interfaces for Database Administrators (DBA)
Most database systems contain privileged commands that can be used only by the DBA staff. These include commands for creating
accounts, setting system parameters, granting account authorization, changing a schema, and reorganizing the storage structures of
databases.
The Database System Environment:
A DBMS is a complex software system. In this section we discuss the types of software components that constitute a DBMS and the
types of computer system software with which the DBMS interacts.
DBMS Component Modules: The typical DBMS components can be divided into two parts. The top part of the figure refers to the various
users of the database environment and their interfaces. The lower part shows the internal modules of the DBMS responsible for
storage of data and processing of transactions.

The database and the DBMS catalog are usually stored on disk. Access to the disk is controlled primarily by the operating system
(OS), which schedules disk read/write. Many DBMSs have their own buffer management module to schedule disk read/write, because
management of buffer storage has a considerable effect on performance. Reducing disk read/write improves performance
considerably. A higher-level stored data manager module of the DBMS controls access to DBMS information that is stored on disk,
whether it is part of the database or the catalog.
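Why buffer management reduces disk reads can be seen in a small simulation. This is a sketch under stated assumptions: the BufferPool class is hypothetical, a dictionary stands in for the disk, and least-recently-used (LRU) replacement is just one common policy, not necessarily the one a given DBMS uses.

```python
from collections import OrderedDict

class BufferPool:
    """Toy LRU buffer: keeps at most `capacity` disk pages in memory."""

    def __init__(self, disk, capacity):
        self.disk = disk              # dict standing in for pages on disk
        self.capacity = capacity
        self.frames = OrderedDict()   # page_id -> contents, least recent first
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.frames:
            self.frames.move_to_end(page_id)   # hit: mark as most recently used
            return self.frames[page_id]
        self.disk_reads += 1                   # miss: simulated disk read
        page = self.disk[page_id]
        self.frames[page_id] = page
        if len(self.frames) > self.capacity:
            self.frames.popitem(last=False)    # evict least recently used page
        return page

disk = {n: f"page-{n}" for n in range(5)}
pool = BufferPool(disk, capacity=2)
for page_id in [0, 1, 0, 0, 2, 0]:
    pool.get_page(page_id)
print(pool.disk_reads)  # 3 disk reads for 6 page requests
```

Half of the six page requests are served from the buffer, which is the effect the buffer management module aims for.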
Consider the top part of the figure first. It shows interfaces for the DBA staff, casual users who work with interactive interfaces to formulate queries,
application programmers who create programs using some host programming languages, and parametric users who do data entry
work by supplying parameters to predefined transactions. The DBA staff works on defining the database and tuning it by making
changes to its definition using the DDL and other privileged commands. The DDL compiler processes schema definitions, specified
in the DDL, and stores descriptions of the schemas (meta-data) in the DBMS catalog. The catalog includes information such as the
names and sizes of files, names and data types of data items, storage details of each file, mapping information among schemas, and
constraints. In addition, the catalog stores many other types of information that are needed by the DBMS modules, which can then
look up the catalog information as needed.
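As a concrete illustration, SQLite stores schema descriptions in a catalog table called sqlite_master, and the PRAGMA table_info command reports the names and data types of a table's columns. The department table below is a hypothetical example; other DBMSs expose their catalogs through different tables or views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A DDL statement: the DBMS processes it and records the definition in its catalog.
conn.execute("CREATE TABLE department (dno INTEGER PRIMARY KEY, dname TEXT NOT NULL)")

# SQLite keeps schema descriptions (meta-data) in the catalog table sqlite_master.
catalog_rows = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE name = 'department'"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(department)")]

print(catalog_rows)
print(columns)
```

The program never stored these descriptions explicitly; they were written to the catalog when the DDL statement was processed and can be looked up whenever needed.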
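Application programs written in a host language are submitted to a precompiler, which extracts the DML commands and sends them to the DML compiler for compilation into object code. The rest of the program is sent to the host language compiler. The object codes for the DML commands and the rest of the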
program are linked, forming a canned transaction whose executable code includes calls to the runtime database processor. It is also
becoming increasingly common to use scripting languages such as PHP and Python to write database programs. Canned transactions
are executed repeatedly by parametric users via PCs or mobile apps; these users simply supply the parameters to the transactions. Each
execution is considered to be a separate transaction.
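A canned transaction written in a scripting language might look like the sketch below. The account table, the transfer function and the account numbers are hypothetical; the point is only that the SQL is fixed in advance and the parametric user supplies nothing but the parameters.

```python
import sqlite3

# Hypothetical accounts, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(101, 500.0), (102, 200.0)])
conn.commit()

def transfer(conn, from_acc, to_acc, amount):
    """Canned transaction: the SQL is fixed; callers supply only the parameters."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE account SET balance = balance - ? WHERE acc_no = ? AND balance >= ?",
            (amount, from_acc, amount),
        )
        if cur.rowcount != 1:
            raise ValueError("unknown account or insufficient funds")
        cur = conn.execute(
            "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
            (amount, to_acc),
        )
        if cur.rowcount != 1:
            raise ValueError("unknown destination account")

def balance(conn, acc_no):
    row = conn.execute("SELECT balance FROM account WHERE acc_no = ?", (acc_no,)).fetchone()
    return row[0]

transfer(conn, 101, 102, 150.0)
print(balance(conn, 101), balance(conn, 102))
```

Each call to transfer runs as a separate transaction: either both updates take effect or, if an error is raised, neither does.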
Common utilities have the following types of functions:
■ Loading. A loading utility is used to load existing data files—such as text files or sequential files—into the database.
Usually, the current (source) format of the data file and the desired (target) database file structure are specified to the utility, which
then automatically reformats the data and stores it in the database. With the proliferation of DBMSs, transferring data from one DBMS
to another is becoming common in many organizations. Some vendors offer conversion tools that generate the appropriate loading
programs, given the existing source and target database storage descriptions (internal schemas).
■ Backup. A backup utility creates a backup copy of the database, usually by dumping the entire database onto tape or other
mass storage medium. The backup copy can be used to restore the database in case of catastrophic disk failure. Incremental backups
are also often used, where only changes since the previous backup are recorded. Incremental backup is more complex, but saves
storage space.
■ Database storage reorganization. This utility can be used to reorganize a set of database files into different file
organizations and create new access paths to improve performance.
■ Performance monitoring. Such a utility monitors database usage and provides statistics to the DBA. The DBA uses the
statistics in making decisions such as whether or not to reorganize files or whether to add or drop indexes to improve performance.
Other utilities may be available for sorting files, handling data compression, monitoring access by users, interfacing with the network,
and performing other functions.
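The loading and backup functions listed above can be sketched with Python's standard csv and sqlite3 modules. The source text, the student table and both function names are hypothetical; Connection.backup is SQLite's own copy mechanism, and a production backup would normally write to a file or other mass storage rather than to a second in-memory database.

```python
import csv
import io
import sqlite3

# Hypothetical source file contents; a real loader would open a file on disk.
SOURCE_TEXT = "roll_no,name\n1,Kiran\n2,Latha\n"

def load_csv(conn, text):
    """Loading: reformat text records and store them in the target table."""
    reader = csv.DictReader(io.StringIO(text))
    rows = [(int(r["roll_no"]), r["name"]) for r in reader]
    with conn:
        conn.execute("CREATE TABLE IF NOT EXISTS student (roll_no INTEGER, name TEXT)")
        conn.executemany("INSERT INTO student VALUES (?, ?)", rows)
    return len(rows)

def backup_database(source_conn):
    """Backup: copy the whole database into a second (here in-memory) database."""
    target = sqlite3.connect(":memory:")
    source_conn.backup(target)
    return target

conn = sqlite3.connect(":memory:")
loaded = load_csv(conn, SOURCE_TEXT)
copy = backup_database(conn)
print(loaded, copy.execute("SELECT COUNT(*) FROM student").fetchone()[0])
```

The loader converts the source format (comma-separated text) into the target structure (typed table rows), and the backup copy can be queried independently of the original.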

Centralized and Client/Server Architecture for DBMSs:

Centralized DBMSs Architecture: Architectures for DBMSs have followed trends similar to those for general computer system
architectures. Older architectures used mainframe computers to provide the main processing for all system functions, including user
application programs and user interface programs, as well as all the DBMS functionality. The reason was that in older systems, most
users accessed the DBMS via computer terminals that did not have processing power and only provided display capabilities.
Therefore, all processing was performed remotely on the computer system housing the DBMS, and only display information
and controls were sent from the computer to the display terminals, which were connected to the central computer via various types of
communications networks. As prices of hardware declined, most users replaced their terminals with PCs and workstations, and more
recently with mobile devices.
At first, database systems used these computers similarly to how they had used display terminals, so that the DBMS itself
was still a centralized DBMS in which all the DBMS functionality, application program execution, and user interface processing were
carried out on one machine. The figure illustrates the physical components in a centralized architecture. Gradually, DBMSs started
to exploit the available processing power at the user side, which led to client/server DBMS architectures.
Basic Client/Server Architectures: First, we discuss client/server architecture in general; then we discuss how it is applied to
DBMSs. The client/server architecture was developed to deal with computing environments in which a large number of PCs,
workstations, file servers, printers, database servers, Web servers, e-mail servers, and other software and equipment are connected via
a network. The idea is to define specialized servers with specific functionalities. For example, it is possible to connect a number of
PCs or small workstations as clients to a file server that maintains the files of the client machines. Another machine can be designated
as a printer server by being connected to various printers; all print requests by the clients are forwarded to this machine. Web servers
or e-mail servers also fall into the specialized server category.

The resources provided by specialized servers can be accessed by many client machines. The client machines provide the
user with the appropriate interfaces to utilize these servers, as well as with local processing power to run local applications. This
concept can be carried over to other software packages, with specialized programs, such as a CAD (computer-aided design) package,
being stored on specific server machines and being made accessible to multiple clients. The figure illustrates client/server architecture at
the logical level.
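The division of work between a client and a specialized server can be sketched in a single Python program. This is only an illustration: socket.socketpair() stands in for the network, the server answers exactly one request, the course table is hypothetical, and each message is assumed small enough to arrive in one recv call.

```python
import json
import socket
import sqlite3
import threading

def serve_one_request(server_sock):
    """Server side: owns the database and answers one query sent by a client."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE course (code TEXT, title TEXT)")
    db.execute("INSERT INTO course VALUES ('MCA2101', 'Database Management Systems')")
    sql = server_sock.recv(4096).decode("utf-8")
    rows = db.execute(sql).fetchall()
    server_sock.sendall(json.dumps(rows).encode("utf-8"))
    server_sock.close()
    db.close()

def run_query(client_sock, sql):
    """Client side: sends the request and receives the reply; it holds no data."""
    client_sock.sendall(sql.encode("utf-8"))
    reply = client_sock.recv(4096).decode("utf-8")
    return json.loads(reply)

# socketpair() gives two connected sockets, standing in for a network link.
client_sock, server_sock = socket.socketpair()
server = threading.Thread(target=serve_one_request, args=(server_sock,))
server.start()
result = run_query(client_sock, "SELECT title FROM course WHERE code = 'MCA2101'")
server.join()
client_sock.close()
print(result)
```

The client provides the interface and local processing, while all database functionality stays on the server side of the connection.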

Classification of Database Management Systems:


Several criteria can be used to classify DBMSs.

The first is the data model on which the DBMS is based.

Relational DBMS (RDBMS): Organizes data into tables of rows and columns, with relationships established between tables. SQL
(Structured Query Language) is typically used to manipulate and query data.
Hierarchical DBMS: Represents data in a tree-like structure, with each record (other than the root) having exactly one parent record and possibly multiple child records.
Commonly used in older systems and in environments like mainframes.
Network DBMS: Extends the hierarchical model by allowing each record to have multiple parent and child records, forming a
network-like structure. Often used in specialized applications.
Object-Oriented DBMS (OODBMS): Stores data as objects, encapsulating data and behavior. Objects can contain attributes and
methods, offering more flexibility and support for complex data structures.
The main data model used in many current commercial DBMSs is the relational data model, and the systems based on this model are
known as SQL systems. The object data model has been implemented in some commercial systems but has not had widespread use.
Recently, so-called big data systems, also known as key-value storage systems and NOSQL systems, use various data models:
document-based, graph-based, column-based, and key-value data models. Many legacy applications still run on database systems
based on the hierarchical and network data models. The relational DBMSs are evolving continuously, and, in particular, have been
incorporating many of the concepts that were developed in object databases. This has led to a new class of DBMSs called object-
relational DBMSs.

The second criterion used to classify DBMSs is the number of users supported by the system. Single-user systems support only one
user at a time and are mostly used with PCs. Multiuser systems, which include the majority of DBMSs, support concurrent multiple
users.

The third criterion is the number of sites over which the database is distributed. A DBMS is centralized if the data is stored at a
single computer site. A centralized DBMS can support multiple users, but the DBMS and the database reside totally at a single
computer site. A distributed DBMS (DDBMS) can have the actual database and DBMS software distributed over many sites
connected by a computer network. Big data systems are often massively distributed, with hundreds of sites. The data is often
replicated on multiple sites so that failure of a site will not make some data unavailable.
Homogeneous DDBMSs use the same DBMS software at all the sites, whereas heterogeneous DDBMSs can use different DBMS
software at each site. It is also possible to develop middleware software to access several autonomous preexisting databases stored
under heterogeneous DDBMSs. This leads to a federated DBMS (or multidatabase system), in which the participating DBMSs are
loosely coupled and have a degree of local autonomy. Many DDBMSs also use a client/server architecture.
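How replication keeps data available after a site failure can be sketched with a toy in-memory model. The Site and ReplicatedStore classes are hypothetical, and the sketch ignores everything a real DDBMS must handle, such as networks, consistency between replicas and recovery of failed sites.

```python
class Site:
    """One computer site holding a copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True

class ReplicatedStore:
    """Toy distributed store: every write goes to all reachable sites."""

    def __init__(self, sites):
        self.sites = sites

    def put(self, key, value):
        for site in self.sites:
            if site.up:
                site.data[key] = value

    def get(self, key):
        # Read from the first available site that holds the key.
        for site in self.sites:
            if site.up and key in site.data:
                return site.data[key]
        raise KeyError(key)

sites = [Site("site_a"), Site("site_b"), Site("site_c")]
store = ReplicatedStore(sites)
store.put("emp:1", "Asha")
sites[0].up = False            # simulate failure of one site
value = store.get("emp:1")
print(value)
```

Because the value was replicated on three sites, the failure of one site does not make it unavailable; the read is simply served by another site.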

The fourth criterion is cost. It is difficult to propose a classification of DBMSs based on cost. Today we have open source (free)
DBMS products like MySQL and PostgreSQL that are supported by third-party vendors with additional services. The main RDBMS
products are available as free 30-day evaluation copies as well as personal versions, which may cost under $100 and allow a
fair amount of functionality. The giant systems are being sold in modular form with components to handle distribution, replication,
parallel processing, mobile capability, and so on, and with a large number of parameters that must be defined for the configuration.
Furthermore, they are sold in the form of licenses—site licenses allow unlimited use of the database system with any number of copies
running at the customer site. Another type of license limits the number of concurrent users or the number of user seats at a location.
Standalone single-user versions of some systems like Microsoft Access are sold per copy or included in the overall configuration of a
desktop or laptop. In addition, data warehousing and mining features, as well as support for additional data types, are made available
at extra cost. It is possible to pay millions of dollars annually for the installation and maintenance of large database systems.

We can also classify a DBMS on the basis of the types of access path options for storing files. One well-known family of DBMSs is based
on inverted file structures.

Finally, a DBMS can be general purpose or special purpose. When performance is a primary
consideration, a special-purpose DBMS can be designed and built for a specific application; such a system cannot be used for other
applications without major changes. Many airline reservations and telephone directory systems developed in the past are
special-purpose DBMSs. These fall into the category of online transaction processing (OLTP) systems, which must support a large number of
concurrent transactions without imposing excessive delays.
