Unit 1 and 2 It03
3. Object-Oriented DBMS (OODBMS)
OODBMS integrates object-oriented programming concepts into the database environment,
allowing data to be stored as objects. This approach supports complex data types and
relationships, making it ideal for applications requiring advanced data modeling and real-world
simulations.
Examples: ObjectDB, db4o.
A file system and a DBMS are two kinds of data management systems that serve different purposes
and have different characteristics. A file system organizes files into directories (folders) and
stores them on a storage device. It provides the medium that holds the data and lets users
perform operations such as reading, writing, and deleting files.
A DBMS, on the other hand, is a more elaborate software application dedicated to managing large
amounts of structured data. It provides functionality such as querying, indexing, transaction
management, and data integrity enforcement. A file system serves well for applications that only
need simple storage without much organization, whereas a DBMS is more appropriate for
applications whose data must be structured, secured, and optimized for efficient access.
File System
The file system is basically a way of arranging the files in a storage medium like a hard disk.
The file system organizes the files and helps in the retrieval of files when they are required. File
systems consist of different files which are grouped into directories. The directories further
contain other folders and files. The file system performs basic operations like management, file
naming, giving access rules, etc.
Example: NTFS (New Technology File System), EXT (Extended File System).
DBMS
Example: Oracle, MySQL, MS SQL Server.
Difference Between File System and DBMS
Basics | File System | DBMS
User Access | Only one user can access data at a time. | Multiple users can access data at a time.
Meaning | The users are not required to write procedures. | The user has to write procedures for managing databases.
Key Concepts:
Data Models:
Define how data is organized and represented within the database, influencing how it's stored
and retrieved. Examples include relational, entity-relationship, and object-oriented models.
Schemas:
Describe the structure and constraints of a database, including tables, columns, and
relationships.
Database Management System (DBMS):
Software that manages the database, handling data access, storage, and manipulation
DBMS Architecture 1-level, 2-Level, 3-Level
A database stores a lot of critical information that must be accessed quickly and securely, so it
is important to select the right architecture for efficient data management and system
performance. The DBMS architecture determines how users' requests reach the database and are
served. It covers how the database is designed, built and maintained, and so shapes how users
access and interact with it. This section explains different DBMS architectures, such as
client/server systems.
Types of DBMS Architecture
Several types of DBMS architecture are used, depending on the requirements of the application:
1-Tier Architecture
2-Tier Architecture
3-Tier Architecture
1-Tier Architecture
In 1-Tier Architecture the database is directly available to the user: the user works directly on
the DBMS, and the client, server, and database are all present on the same machine.
This setup is simple and is often used in personal or standalone applications where the user
interacts directly with the database.
For Example: A Microsoft Excel spreadsheet is a great example of one-tier architecture.
Everything (the user interface, application logic and data) is handled on a single system.
The user directly interacts with the application, performs operations like calculations or data
entry and stores data locally on the same machine.
This architecture is simple and works well for personal, standalone applications where no
external server or network connection is needed.
DBMS 1-Tier Architecture
Advantages of 1-Tier Architecture
The advantages of 1-Tier Architecture are listed below.
Simple Architecture: 1-Tier Architecture is the simplest architecture to set up, as only a
single machine is required to maintain it.
Cost-Effective: No additional hardware is required for implementing 1-Tier Architecture,
which makes it cost-effective.
Easy to Implement: 1-Tier Architecture can be easily deployed, and hence it is mostly used
in small projects.
2-Tier Architecture
The 2-tier architecture is similar to a basic client-server model. The application at the client end
directly communicates with the database on the server side. APIs like ODBC and JDBC are used
for this interaction. The server side is responsible for providing query processing and transaction
management functionalities. On the client side, the user interfaces and application programs are
run. The application on the client side establishes a connection with the server side to
communicate with the DBMS.
For Example: A Library Management System used in schools or small organizations is a
classic example of two-tier architecture.
1. Client Layer (Tier 1): This is the user interface that library staff or users interact with. For
example they might use a desktop application to search for books, issue them, or check due
dates.
2. Database Layer (Tier 2): The database server stores all the library records such as book
details, user information, and transaction logs.
The client layer sends a request (like searching for a book) to the database layer which processes
it and sends back the result. This separation allows the client to focus on the user interface, while
the server handles data storage and retrieval.
DBMS 2-Tier Architecture
Disadvantages of 3-Tier Architecture
More Complex: 3-Tier Architecture is more complex than 2-Tier Architecture, and the number of
communication points is doubled.
Difficult to Interact: Direct interaction between the client and the database becomes difficult
because every request has to pass through the middle layer.
A data model is a conceptual framework for organizing and defining the structure,
operations, and constraints of data in a database.
The Hierarchical Data Model is the data model that organizes data in a tree-like structure,
where each record (or node) has a single parent, but can have multiple children.
Example: A company database where each department is a parent and employees are
children of the department.
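The single-parent rule of the hierarchical model can be sketched in a few lines of Python. This is only an illustration, not how a hierarchical DBMS stores records; the employee and department names are made up. Mapping each child to its parent in a dictionary means a child can never have two parents, while a parent can still have many children.

```python
# Each employee (child record) maps to exactly one department (parent record).
# A dict key can hold only one value, which mirrors the single-parent rule.
parent_of = {
    "Asha": "Engineering",
    "Ravi": "Engineering",
    "Meena": "Sales",
}

def children_of(parent):
    """Return all child records stored under the given parent record."""
    return sorted(child for child, p in parent_of.items() if p == parent)

engineering_staff = children_of("Engineering")
print(engineering_staff)
```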
The Network Data Model is an extension of the hierarchical model that allows many-to-
many relationships.
In this model, data is represented using records (nodes) and sets (edges or links) to
connect them.
Each record can have multiple parents and multiple children, allowing for more complex
relationships between records.
Example: A database of employees and projects where employees can work on multiple
projects, and each project can involve multiple employees.
3.) Relational Data Model
The Relational Data Model organizes data in two-dimensional tables (relations), consisting
of rows (tuples) and columns (attributes). Each table represents a different entity, and
relationships between tables are maintained through the use of foreign keys.
The relational model provides a highly flexible way to handle data and is the foundation
of modern database systems like MySQL, PostgreSQL, and Oracle.
Example: A student database with tables for students, courses, and enrollments where
relationships are defined through keys.
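The student database example can be tried out with Python's built-in sqlite3 module. The sketch below is a minimal illustration under assumptions: the table names, column names and sample rows are invented, and SQLite stands in for systems such as MySQL or PostgreSQL. The enrollments table links the other two tables through foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE courses  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE enrollments (
    student_id INTEGER NOT NULL REFERENCES students(student_id),
    course_id  INTEGER NOT NULL REFERENCES courses(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO students VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO courses  VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
""")

# Follow the foreign keys to list which student takes which course.
rows = conn.execute("""
    SELECT s.name, c.title
    FROM enrollments e
    JOIN students s ON s.student_id = e.student_id
    JOIN courses  c ON c.course_id  = e.course_id
    ORDER BY s.name, c.title
""").fetchall()
print(rows)
```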
In the Object-Oriented Data Model, data is stored in objects, which are instances of classes, and
can contain both data (attributes) and methods (operations).
This model allows for better representation of real-world entities by encapsulating both
the state and behavior of an entity.
Example: A product inventory system where each product is an object with properties like
name, price, and methods to update or calculate discounts.
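The product example can be sketched as a Python class. This shows only the object-oriented idea of keeping state and behavior together, not the API of any particular OODBMS; the class name, method names and values are hypothetical.

```python
class Product:
    """An object that bundles data (attributes) with operations (methods)."""

    def __init__(self, name, price):
        self.name = name      # attribute
        self.price = price    # attribute

    def update_price(self, new_price):
        # Behavior that guards the object's own state
        if new_price < 0:
            raise ValueError("price cannot be negative")
        self.price = new_price

    def discounted_price(self, percent):
        # Behavior that computes a value from the object's state
        return round(self.price * (1 - percent / 100), 2)

pen = Product("Pen", 20.0)
pen.update_price(25.0)
print(pen.discounted_price(10))
```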
The Entity-Relationship (ER) Model is a high-level conceptual model that defines the
structure of data by describing entities (objects), attributes (properties of entities), and
relationships between entities.
Example: A university database where students, courses, and instructors are entities, and
the relationships between them (e.g., a student enrolls in a course) are defined.
Schemas
A schema is the overall logical structure of a database that defines how the data is
organized and how relationships among data are maintained.
It can be viewed as a blueprint or architecture of the database that defines tables, fields,
data types, and relationships.
Types of Schemas:
The physical schema describes how the data is actually stored on the storage media. It
includes details about the physical storage of data, such as file structures, indexing methods,
and storage allocations.
The logical schema is an abstract representation of the database’s structure, capturing the
logical relationships between data elements without concern for the physical implementation
details.
The external schema defines how individual users or user groups interact with the database.
It provides a customized view of the database tailored to the needs of different users or
applications.
Instances
An instance refers to the actual data stored in a database at a particular moment in time. It
is the snapshot of the database content.
Explanation:
Instance vs. Schema: While a schema defines the structure of the database (tables,
columns, etc.), an instance represents the actual content within that structure at any given
point. The schema remains relatively static, while instances can change frequently as data is
inserted, updated, or deleted.
Data Model: Provides the abstract framework and rules for how data can be stored and
manipulated (e.g., relational model, ER model).
Schema: Implements the data model for a specific database, defining its structure
(tables, fields, relationships).
Instance: Represents the actual data that populates the schema at any given time.
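The difference between schema and instance can be observed directly. In the sketch below (an illustration using Python's sqlite3 module; the table and sample rows are assumptions), the schema is read with SQLite's `PRAGMA table_info` and the instance with a plain SELECT. Inserting rows changes the instance but leaves the schema untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

def schema(conn):
    # Column names and declared types, as recorded by SQLite
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

def instance(conn):
    # The rows stored in the table at this moment
    return conn.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()

schema_before, instance_before = schema(conn), instance(conn)

conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO student VALUES (2, 'Ravi')")

schema_after, instance_after = schema(conn), instance(conn)
print(schema_after)
print(instance_before, instance_after)
```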
Data independence is a property of a database management system by which we can change the
database schema at one level of the database system without changing the database schema at the
next higher level. This section explains data independence and its types.
What is Data Independence in DBMS?
In the context of a database management system, data independence is the feature that allows the
schema of one level of the database system to be changed without any impact on the schema of
the next higher level. Through data independence, we can build an environment in which data is
independent of the programs that use it, and the three-schema architecture makes data
independence easier to understand.
It can be summed up as a kind of immunity of user applications to changes in how data is defined
and organized. Separate applications should not be forced to deal with the specifics of data
representation and storage, because this decreases quality and flexibility. The DBMS instead
provides a generalized, abstract view of the data. In short, the ability to change the structure
of a lower-level schema without affecting the upper-level schema is called data independence.
Types of Data Independence
There are two types of data independence.
Logical data independence
Physical data independence
Physical Data Independence | Logical Data Independence
It mainly concerns how the data is stored in the system. | It mainly concerns changes to the structure or data definition.
It tells about the internal schema. | It tells about the conceptual schema.
There may or may not be a need for changes to be made at the internal level to improve the structure. | Whenever the logical structure of the database has to be changed, the changes made at the logical level are important.
Database Languages
Database languages are used to read, store and update the data in the database. Specific
languages are used to perform various operations of the database.
Data Definition Language (DDL) is used for describing the structures in a database and the
relationships between them. It is used to define the database schema, tables, indexes,
constraints, etc. It also records metadata such as the number of tables and the names of tables,
columns and indexes. The commands only affect the database structure and not the data.
The commands used in DDL are:
Create: It is used to create a database or table.
Alter: It is used to make a change in the structure of a database.
Drop: It is used to completely delete a table from the database.
Rename: It is used to rename a table.
Truncate: It is used to delete all the rows inside the table while keeping the structure of the table.
Comment: It is used to add comments to the data dictionary.
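A few of these DDL commands can be tried with Python's sqlite3 module. This is a sketch under assumptions: the table and column names are hypothetical, and SQLite is used only because it ships with Python. SQLite has no TRUNCATE or COMMENT statement, so only Create, Alter, Rename and Drop are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def tables(conn):
    # Names of all tables currently defined in the database
    return [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

def columns(conn, table):
    # Column names of one table, read from the schema
    return [r[1] for r in conn.execute(f"PRAGMA table_info({table})")]

conn.execute("CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT)")  # Create
conn.execute("ALTER TABLE book ADD COLUMN author TEXT")                       # Alter
conn.execute("ALTER TABLE book RENAME TO library_book")                       # Rename
cols = columns(conn, "library_book")
after_rename = tables(conn)

conn.execute("DROP TABLE library_book")                                       # Drop
after_drop = tables(conn)
print(cols, after_rename, after_drop)
```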
DML is used to manipulate the data present in the table or database. We can easily perform
operations such as store, modify, update, and delete on the database.
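The commands used in DML are:
Select: It is used to retrieve data from the database. It can be used with a WHERE clause to
get the particular record.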
Insert: It allows users to insert data into the database or tables.
Update: It is used to update or modify the existing data in database tables.
Delete: It is used to delete records from the database tables. Also, it can be used with a WHERE
clause to delete a particular row from the table.
Merge: It combines the insert and update operations (UPSERT).
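The Insert, Update and Delete commands can be sketched with Python's sqlite3 module. The table and rows are assumptions made for illustration. MERGE is left out because SQLite does not provide that statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# Insert: add new rows
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Asha", 19), (2, "Ravi", 20), (3, "Meena", 21)])

# Update: modify an existing row selected by a WHERE clause
conn.execute("UPDATE student SET age = 22 WHERE roll_no = 3")

# Delete: remove one particular row selected by a WHERE clause
conn.execute("DELETE FROM student WHERE roll_no = 2")

# Select: read back what is left
remaining = conn.execute(
    "SELECT roll_no, name, age FROM student ORDER BY roll_no").fetchall()
print(remaining)
```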
DCL (Data Control Language) consists of the SQL commands that control which users may access,
modify and work on a database. It is used to control access to stored data: it gives access,
revokes access, and changes permissions as required by the owner of the database.
The commands used in DCL are:
Grant: It is used to give security privileges to a specific database user.
Revoke: It is used to take back privileges that were given to a user by the Grant
command.
TCL (Transaction Control Language) groups the changes made by DML commands into logical
transactions and controls whether those changes are applied to the database.
Commit: It is used to save the changes made by a transaction to the database.
Rollback: It is used to restore the database to its state at the last commit.
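Commit and Rollback can be observed with Python's sqlite3 module. This is a minimal sketch; the account table and the amounts are hypothetical. It relies on the module's default behavior of opening a transaction implicitly before an INSERT or UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()      # Commit: the inserted row is now saved

conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
in_transaction = conn.execute(
    "SELECT balance FROM account WHERE id = 1").fetchone()[0]

conn.rollback()    # Rollback: undo everything since the last commit
after_rollback = conn.execute(
    "SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(in_transaction, after_rollback)
```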
Interface
An interface is a program that allows users to input queries into a database without writing the
code in the query language. An interface can be used to manipulate the database for adding,
deleting, updating, or viewing the data.
Form-based Interface
A form is displayed to each user by the form-based interface. The user fills in the details and
submits the form to make a new entry into the database. Alternatively, the user can fill in only
some details, and the system retrieves the rest of the details from the database. The form-based
interface is built for the naive (inexperienced) user, who performs a limited number of
operations. Many DBMSs have a specification language that helps the programmer define such
forms.
Example
A student enters their roll number and branch in the form to get the grade card.
In a menu-based interface, the user is provided with a list of options (called a menu) through
which the user forms a request. The user doesn't need to memorize the commands and syntax; the
query is composed step by step by picking options from a menu. Pull-down menu interfaces are
mostly used in web-based user interfaces and are often used in browsing interfaces through which
the database content can be looked through.
Example
In a shopping website, categories are selected from the menu, brands are selected from the menu
of brands, and budget ranges are applied from the menu of budget range.
In a graphical user interface (GUI), the schema is presented to users in diagrammatic form, and a
query is specified by manipulating the diagram. A GUI uses both menus and forms in several cases.
Specific parts of the schema diagram are selected using a pointing device.
Example
You liked a video on Instagram by tapping with your finger, and the color changes to red. The
visual graphic gets changed due to user action.
A natural language interface has its own schema, similar to the high-level conceptual
schema, as well as a dictionary of important words. It generates a query based on its
interpretation of the important words in the user's input and, if the interpretation is
successful, displays the result to the user.
Example
A user searches Google for the fastest car in India; the natural language interface looks for the
important words, i.e. fastest, car, and India, and shows the result accordingly.
In a speech-based interface, users query the interface with speech and get the answer in speech.
The input is detected using predefined words, and the output is converted into speech. Such
interfaces have become very common.
Example
OK Google, Siri on Apple devices, and Alexa are used through speech.
Interfaces for the DBA provide privileged commands that only DBA staff can use, for example to
create accounts, grant account authorization, change a schema, and reorganize storage structures.
DDL stands for Data Definition Language. These are the commands that are used to change the
structure of a database and database objects. For example, DDL commands can be used to add,
remove, or modify tables within a database.
What is DDL?
DDL stands for Data Definition Language, a set of commands used to create and maintain the
structure of databases. It includes the CREATE, ALTER, DROP, TRUNCATE, and RENAME statements
for creating, changing, and dropping structures in the database, such as tables. DDL deals with
how the data is structured and not with the data itself.
DML
Elaboration:
Purpose:
DML statements are used to perform operations on the data stored in a database, such as
retrieving, adding, modifying, and deleting records.
Key Commands:
SELECT: Retrieves data from a database.
INSERT: Adds new data to a database.
UPDATE: Modifies existing data in a database.
DELETE: Deletes data from a database
A database structure is the way data is organized and stored in a database system. This structure
includes how different data elements relate to each other, the types of data stored, and the
relationships between those data elements. The overall design of a database is also known as
the database schema.
Here's a more detailed breakdown:
1. Database Schema:
Logical Structure:
The schema defines the logical organization of the data, including tables, fields, relationships,
and rules.
Relationships:
It outlines the relationships between entities, such as primary and foreign keys.
Data Organization:
The schema helps resolve issues with unstructured data by organizing it in a clear, structured
way.
2. Key Components of a Database Structure:
Tables: Data is organized into tables with rows and columns.
Fields (Attributes): Columns within a table that define the type of data stored.
Records (Rows): A collection of fields (columns) that represents a single entity.
Data Dictionary: A centralized repository that stores metadata, including data types,
relationships, and constraints.
Indexes: Used for faster retrieval of data.
Constraints: Rules that ensure data integrity, such as primary keys, foreign keys, and data type
constraints.
Internal Level:
Deals with the physical storage of data, including disk storage, data compression, and
indexing.
Conceptual Level:
Represents the logical organization of the data, including tables, attributes, and relationships,
independent of any specific DBMS.
External Level:
Defines how users interact with the database, providing customized views and interfaces.
4. Types of Database Structures:
The Entity-Relationship Model (ER Model) is a conceptual model for designing databases. This
model represents the logical structure of a database, including entities, their attributes and
relationships between them.
Entity: An object that is stored as data, such as Student, Course or Company.
Attribute: A property that describes an entity, such as StudentID, CourseName,
or EmployeeEmail.
Relationship: A connection between entities such as "a Student enrolls in a Course".
Components of ER Diagram
ER diagrams represent the E-R model in a database, making them easy to convert into
relations (tables).
These diagrams model real-world objects, which makes them
intuitively useful.
Unlike technical schemas, ER diagrams require no technical knowledge of the underlying
DBMS used.
They visually model data and its relationships, making complex systems easier to
understand.
Symbols Used in ER Model
ER Model is used to model the logical view of the system from a data perspective which
consists of these symbols:
Rectangles: Rectangles represent entities in the ER Model.
Ellipses: Ellipses represent attributes in the ER Model.
Diamond: Diamonds represent relationships among Entities.
Lines: Lines link attributes to entities and entity sets to relationship types.
Double Ellipse: Double ellipses represent multi-valued Attributes, such as a student's
multiple phone numbers.
Double Rectangle: Represents weak entities, which depend on other entities for
identification.
What is an Entity?
An Entity represents a real-world object, concept or thing about which data is stored in a
database. It acts as a building block of a database. Tables in a relational database represent
these entities.
Example of entities:
Real-World Objects: Person, Car, Employee etc.
Concepts: Course, Event, Reservation etc.
Things: Product, Document, Device etc.
The entity type defines the structure of an entity, while individual instances of that type
represent specific entities.
What is an Entity Set?
An entity refers to an individual object of an entity type, and the collection of all entities of a
particular type is called an entity set. For example, E1 is an entity that belongs to the entity
type "Student," and the group of all students forms the entity set.
In the ER diagram below, the entity type is represented as:
Entity Set
We can represent the entity sets in an ER Diagram but we can't represent individual entities
because an entity is like a row in a table, and an ER diagram shows the structure and
relationships of data, not specific data entries (like rows and columns). An ER diagram is a
visual representation of the data model, not the actual data itself.
Types of Entity
There are two main types of entities:
1. Strong Entity
A Strong Entity is a type of entity that has a key Attribute that can uniquely identify each
instance of the entity. A Strong Entity does not depend on any other Entity in the Schema for
its identification. It has a primary key that ensures its uniqueness and is represented by a
rectangle in an ER diagram.
2. Weak Entity
A Weak Entity cannot be uniquely identified by its own attributes alone. It depends on a strong
entity to be identified. A weak entity is associated with an identifying entity (strong entity),
which helps in its identification. A weak entity is represented by a double rectangle. The
participation of weak entity types is always total. The relationship between the weak entity
type and its identifying strong entity type is called identifying relationship and it is represented
by a double diamond.
Example:
A company may store the information of dependents (Parents, Children, Spouse) of an
Employee. But the dependents can't exist without the employee. So dependent will be a Weak
Entity Type and Employee will be identifying entity type for dependent, which means it is
Strong Entity Type.
Strong Entity and Weak Entity
Attributes in ER Model
Attributes are the properties that define the entity type. For example, for a Student entity
Roll_No, Name, DOB, Age, Address, and Mobile_No are the attributes that define entity type
Student. In ER diagram, the attribute is represented by an oval.
Attribute
Types of Attributes
1. Key Attribute
The attribute which uniquely identifies each entity in the entity set is called the key attribute.
For example, Roll_No will be unique for each student. In ER diagram, the key attribute is
represented by an oval with an underline.
Key Attribute
2. Composite Attribute
An attribute composed of many other attributes is called a composite attribute. For example,
the Address attribute of the student Entity type consists of Street, City, State, and Country. In
ER diagram, the composite attribute is represented by an oval connected to the ovals of its component attributes.
Composite Attribute
3. Multivalued Attribute
An attribute that can hold more than one value for a given entity is called a multivalued
attribute. For example, Phone_No (can be more than one for a given student). In ER diagram, a
multivalued attribute is represented by a double oval.
Multivalued Attribute
4. Derived Attribute
An attribute that can be derived from other attributes of the entity type is known as a derived
attribute. For example, Age can be derived from DOB. In ER diagram, the derived attribute is
represented by a dashed oval.
Derived Attribute
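The four attribute types can be mirrored in a small Python sketch. This is an illustration only: the class layout, field names and sample values are assumptions, and the reference date is passed in explicitly so that the result does not depend on when the code runs. The key point is that Age is computed from DOB on demand and never stored.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Student:
    roll_no: int          # key attribute
    name: str
    dob: date
    address: dict         # composite attribute: street, city, state, country
    phone_nos: list = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob, never stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)

s = Student(1, "Asha", date(2005, 8, 15),
            {"street": "MG Road", "city": "Pune",
             "state": "Maharashtra", "country": "India"},
            ["9000000001", "9000000002"])
print(s.age(date(2025, 8, 14)), s.age(date(2025, 8, 15)))
```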
The Complete Entity Type Student with its Attributes can be represented as:
Entity and Attributes
A set of relationships of the same type is known as a relationship set. The following
relationship set depicts S1 as enrolled in C2, S2 as enrolled in C1, and S3 as registered in C3.
Relationship Set
2. Binary Relationship: When there are two entity sets participating in a relationship, the
relationship is called a binary relationship. For example, a Student is enrolled in a Course.
Binary Relationship
3. Ternary Relationship: When there are three entity sets participating in a relationship, the
relationship is called a ternary relationship.
4. N-ary Relationship: When there are n entity sets participating in a relationship, the
relationship is called an n-ary relationship.
Cardinality in ER Model
The maximum number of times an entity of an entity set participates in a relationship set is
known as cardinality.
Cardinality can be of different types:
1. One-to-One
When each entity in each entity set can take part only once in the relationship, the cardinality is
one-to-one. Let us assume that a male can marry one female and a female can marry one male.
So the relationship will be one-to-one.
2. One-to-Many
In one-to-many mapping, each entity in one entity set can be related to more than one entity in
the other entity set. Let us assume that one surgery department can accommodate many doctors.
So the cardinality will be 1 to M. It means one department has many doctors.
one to many cardinality
3. Many-to-One
When entities in one entity set can take part only once in the relationship set and entities in
other entity sets can take part more than once in the relationship set, cardinality is many to one.
Let us assume that a student can take only one course but one course can be taken by many
students. So the cardinality will be n to 1. It means that for one course there can be n students
but for one student, there will be only one course.
many to one cardinality
4. Many-to-Many
When entities in all entity sets can take part more than once in the relationship, the cardinality is
many to many. Let us assume that a student can take more than one course and one course can
be taken by many students. So the relationship will be many to many.
many to many cardinality
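The four cardinality types can be told apart by counting how often each entity appears in a relationship set. The function below is a hypothetical helper written for illustration. It classifies the pairs it is given, whereas cardinality in an ER model is a design rule about the maximum participation allowed, so the result only describes the sample data.

```python
from collections import Counter

def classify_cardinality(pairs):
    """Classify a binary relationship set given as (left, right) pairs."""
    pairs = set(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    left_repeats = any(n > 1 for n in left.values())    # a left entity has many partners
    right_repeats = any(n > 1 for n in right.values())  # a right entity has many partners
    if left_repeats and right_repeats:
        return "many-to-many"
    if left_repeats:
        return "one-to-many"
    if right_repeats:
        return "many-to-one"
    return "one-to-one"

# (department, doctor): one department has many doctors
print(classify_cardinality([("Surgery", "Dr A"), ("Surgery", "Dr B")]))
# (student, course): many students take the same single course
print(classify_cardinality([("S1", "C1"), ("S2", "C1")]))
# (student, course): students take several courses, courses have several students
print(classify_cardinality([("S1", "C1"), ("S1", "C2"), ("S2", "C1")]))
```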
Participation Constraint
Participation Constraint is applied to the entity participating in the relationship set.
1. Total Participation: Each entity in the entity set must participate in the relationship. If each
student must enroll in a course, the participation of students will be total. Total participation is
shown by a double line in the ER diagram.
2. Partial Participation: The entity in the entity set may or may NOT participate in the
relationship. If some courses are not enrolled by any of the students, the participation in the
course will be partial.
UNIT 2
Relational Model in DBMS
The Relational Model represents data and their relationships through a collection of tables. Each
table also known as a relation consists of rows and columns. Every column has a unique name
and corresponds to a specific attribute, while each row contains a set of related data values
representing a real-world entity or relationship. This model is part of the record-based models
which structure data in fixed-format records each belonging to a particular type with a defined
set of attributes.
E.F. Codd introduced the Relational Model to organize data as relations or tables. After creating
the conceptual design of a database using an ER diagram, this design must be transformed into a
relational model which can then be implemented using relational database systems like Oracle
SQL or MySQL.
The relational model represents how data is stored and managed in Relational Databases. Data
is organized into tables, each known as a relation, consisting of rows (tuples)
and columns (attributes). Each row represents an entity or record, and each column represents a
particular attribute of that entity. A relational database consists of a collection of tables each of
which is assigned a unique name.
Example:
Consider a relation STUDENT with attributes ROLL_NO, NAME, ADDRESS,
PHONE, and AGE shown in the table.
Example: The STUDENT relation defined above has cardinality 4.
7. Column: The column represents the set of values for a particular attribute.
Example: The column ROLL_NO is extracted from the relation STUDENT.
8. NULL Values: The value which is not known or unavailable is called a NULL value. It is
represented by NULL.
Example: PHONE of STUDENT having ROLL_NO 4 is NULL.
1. Primary Key:
A Primary Key uniquely identifies each tuple in a relation. It must contain unique values and cannot
have NULL values. Example: ROLL_NO in the STUDENT table is the primary key.
2. Candidate Key
A Candidate Key is a set of attributes that can uniquely identify a tuple in a relation. There can be
multiple candidate keys, and one of them is chosen as the primary key.
3. Super Key
A Super Key is a set of attributes that can uniquely identify a tuple. It may contain extra attributes
that are not necessary for uniqueness.
4. Foreign Key
A Foreign Key is an attribute in one relation that refers to the primary key of another relation. It
establishes relationships between tables. Example: BRANCH_CODE in the STUDENT table is
a foreign key that refers to the primary key BRANCH_CODE in the BRANCH table.
5. Composite Key
A Composite Key is formed by combining two or more attributes to uniquely identify a tuple.
Example: A combination of FIRST_NAME and LAST_NAME could be a composite key if no
one in the database shares the same full name.
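The key types can be tried out with Python's sqlite3 module. This is a sketch under assumptions: the MEMBER table, the extra columns and the sample rows are hypothetical, while STUDENT, BRANCH, ROLL_NO and BRANCH_CODE follow the examples in the text. The `rejected` helper is also invented here; it simply reports whether the DBMS refused a statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BRANCH (BRANCH_CODE TEXT PRIMARY KEY, BRANCH_NAME TEXT);
CREATE TABLE STUDENT (
    ROLL_NO     INTEGER PRIMARY KEY,                    -- primary key
    NAME        TEXT,
    BRANCH_CODE TEXT REFERENCES BRANCH(BRANCH_CODE)     -- foreign key
);
CREATE TABLE MEMBER (
    FIRST_NAME TEXT,
    LAST_NAME  TEXT,
    PRIMARY KEY (FIRST_NAME, LAST_NAME)                 -- composite key
);
INSERT INTO BRANCH VALUES ('CS', 'Computer Science');
INSERT INTO STUDENT VALUES (1, 'Asha', 'CS');
INSERT INTO MEMBER VALUES ('Asha', 'Rao');
""")

def rejected(sql):
    """Return True if the DBMS refuses the statement because of a key violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

dup_primary = rejected("INSERT INTO STUDENT VALUES (1, 'Ravi', 'CS')")  # same ROLL_NO
dup_composite = rejected("INSERT INTO MEMBER VALUES ('Asha', 'Rao')")   # same full name
same_first_name = rejected("INSERT INTO MEMBER VALUES ('Asha', 'Iyer')")  # differs in LAST_NAME
print(dup_primary, dup_composite, same_first_name)
```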
Characteristics of the Relational Model
1. Data Representation: Data is organized in tables (relations), with rows (tuples) representing
records and columns (attributes) representing data fields.
2. Atomic Values: Each attribute in a table contains atomic values, meaning no multi-valued or
nested data is allowed in a single cell.
3. Unique Keys: Every table has a primary key to uniquely identify each record, ensuring no
duplicate rows.
4. Attribute Domain: Each attribute has a defined domain, specifying the valid data types and
constraints for the values it can hold.
5. Tuples as Rows: Rows in a table, called tuples, represent individual records or instances of
real-world entities or relationships.
6. Relation Schema: A table’s structure is defined by its schema, which specifies the table
name, attributes, and their domains.
7. Data Independence: The model ensures logical and physical data independence, allowing
changes in the database schema without affecting the application layer.
8. Integrity Constraints: The model enforces rules such as:
   a. Domain constraints: Attribute values must match the specified domain.
   b. Entity integrity: No primary key can have NULL values.
   c. Referential integrity: Foreign keys must match primary keys in the referenced table or be
   NULL.
9. Relational Operations: Supports operations like selection, projection, join, union, and
intersection, enabling powerful data retrieval and manipulation.
10. Data Consistency: Ensures data consistency through constraints, reducing redundancy and
anomalies.
11. Set-Based Representation: Tables in the relational model are treated as sets, and operations
follow mathematical set theory principles.
While designing the relational model, we define conditions that must hold for the data present
in the database; these conditions are called constraints. Constraints are checked before any
operation (insertion, deletion, or update) is performed on the database. If an operation would
violate any of the constraints, it fails.
1. Domain Constraints
Domain Constraints ensure that the value of each attribute A in a tuple must be an atomic
value derived from its specified domain, dom(A). Domains are defined by the data types
associated with the attributes. Common data types include:
Numeric types: Includes integers (short, regular, and long) for whole numbers and real
numbers (float, double-precision) for fractional values, which floating-point types store approximately.
Character types: Consists of fixed-length (CHAR) and variable-length
(VARCHAR, TEXT) strings for storing text data of various sizes.
Boolean values: Stores true or false values, often used for flags or conditional checks in
databases.
Specialized types: Includes types for date (DATE), time (TIME), timestamp (TIMESTAMP), and
money (MONEY), used for precise handling of time-related and financial data.
2. Key Integrity
Every relation in the database should have at least one set of attributes that uniquely identifies
each tuple. Such a set of attributes is called a key. For example, ROLL_NO in STUDENT is a key: no two
students can have the same roll number. A key therefore has two properties:
It should be unique for all tuples.
It can’t have NULL values.
3. Referential Integrity Constraints
When an attribute of a relation can only take values from another attribute of the same relation or
of another relation, the rule is called referential integrity. Suppose we have two relations:
Table STUDENT
ROLL_NO | NAME   | ADDRESS | PHONE | AGE | BRANCH_CODE
4       | SURESH | DELHI   |       | 18  | IT
Table BRANCH
BRANCH_CODE | BRANCH_NAME
CS          | COMPUTER SCIENCE
IT          | INFORMATION TECHNOLOGY
CV          | CIVIL ENGINEERING
Explanation: BRANCH_CODE of STUDENT can only take the values which are present in
BRANCH_CODE of BRANCH which is called referential integrity constraint. The relation
which is referencing another relation is called REFERENCING RELATION (STUDENT in this
case) and the relation to which other relations refer is called REFERENCED RELATION
(BRANCH in this case).
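The referential integrity rule above can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as the engine (the notes do not name one) and reduces STUDENT and BRANCH to the columns needed here. The student RAMESH and the branch code 'ME' are hypothetical values added for illustration. Note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# BRANCH is the referenced relation, STUDENT the referencing relation.
conn.execute("CREATE TABLE BRANCH (BRANCH_CODE TEXT PRIMARY KEY, BRANCH_NAME TEXT)")
conn.execute("""
    CREATE TABLE STUDENT (
        ROLL_NO     INTEGER PRIMARY KEY,
        NAME        TEXT,
        BRANCH_CODE TEXT REFERENCES BRANCH (BRANCH_CODE)
    )
""")
conn.executemany(
    "INSERT INTO BRANCH VALUES (?, ?)",
    [("CS", "COMPUTER SCIENCE"), ("IT", "INFORMATION TECHNOLOGY"), ("CV", "CIVIL ENGINEERING")],
)

# 'IT' is present in BRANCH, so this insert is accepted.
conn.execute("INSERT INTO STUDENT VALUES (4, 'SURESH', 'IT')")

# 'ME' (hypothetical) is not present in BRANCH, so this insert is rejected.
try:
    conn.execute("INSERT INTO STUDENT VALUES (5, 'RAMESH', 'ME')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM STUDENT").fetchone()[0]
print(rejected, student_count)
```

Only the row whose BRANCH_CODE exists in BRANCH is stored; the other insert fails with an integrity error.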
An anomaly is an irregularity or something which deviates from the expected or normal state.
When designing databases, we identify three types of anomalies: Insert, Update, and Delete.
3. On Delete Cascade
It will delete the tuples from REFERENCING RELATION if the value used by REFERENCING
ATTRIBUTE is deleted from REFERENCED RELATION. e.g.; if we delete a row from
BRANCH with BRANCH_CODE ‘CS’, the rows in STUDENT relation with BRANCH_CODE
CS (ROLL_NO 1 and 2 in this case) will be deleted.
4. On Update Cascade
It will update the REFERENCING ATTRIBUTE in REFERENCING RELATION if the
attribute value used by REFERENCING ATTRIBUTE is updated in REFERENCED
RELATION. e.g., if we update a row from BRANCH with BRANCH_CODE ‘CS’ to ‘CSE’, the
rows in STUDENT relation with BRANCH_CODE CS (ROLL_NO 1 and 2 in this case) will be
updated with BRANCH_CODE ‘CSE’.
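Both cascade rules can be sketched together. The example below again assumes Python's sqlite3 module; the names given to ROLL_NO 1 and 2 are hypothetical, since the notes mention those roll numbers without listing the rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to apply cascade actions
conn.execute("CREATE TABLE BRANCH (BRANCH_CODE TEXT PRIMARY KEY, BRANCH_NAME TEXT)")
conn.execute("""
    CREATE TABLE STUDENT (
        ROLL_NO     INTEGER PRIMARY KEY,
        NAME        TEXT,
        BRANCH_CODE TEXT REFERENCES BRANCH (BRANCH_CODE)
                    ON DELETE CASCADE ON UPDATE CASCADE
    )
""")
conn.executemany(
    "INSERT INTO BRANCH VALUES (?, ?)",
    [("CS", "COMPUTER SCIENCE"), ("IT", "INFORMATION TECHNOLOGY")],
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [(1, "RAM", "CS"), (2, "RAMESH", "CS"), (4, "SURESH", "IT")],  # names for 1 and 2 are hypothetical
)

query = "SELECT ROLL_NO, BRANCH_CODE FROM STUDENT ORDER BY ROLL_NO"

# ON UPDATE CASCADE: renaming 'CS' to 'CSE' in BRANCH rewrites the referencing rows.
conn.execute("UPDATE BRANCH SET BRANCH_CODE = 'CSE' WHERE BRANCH_CODE = 'CS'")
after_update = conn.execute(query).fetchall()

# ON DELETE CASCADE: deleting the branch deletes the students that reference it.
conn.execute("DELETE FROM BRANCH WHERE BRANCH_CODE = 'CSE'")
after_delete = conn.execute(query).fetchall()

print(after_update)
print(after_delete)
```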
5. Super Keys
Any set of attributes that allows us to identify unique rows (tuples) in a given relation is
known as a super key. A super key from which no attribute can be removed without losing
uniqueness (a minimal super key) is known as a candidate key, and one of the candidate keys is
chosen as the primary key. If a key is a combination of two or more attributes, we call it a
composite key.
1. Performance: The relational model can experience performance issues with very large
databases.
2. Complexity for Complex Data: The model struggles with hierarchical or complex data
relationships, which might be better handled with other models like the Graph or Document
model.
3. Normalization Overhead: Extensive use of normalization can result in complex queries and
slower performance.
Domain Constraints
Domain constraints, in the context of databases and data integrity, define the acceptable values
for an attribute (column). They specify the data type and any additional restrictions, ensuring
data accuracy and consistency. These constraints act as rules, preventing invalid data from being
entered into the database.
Purpose:
Domain constraints ensure that the values stored in a database column are valid and within a
specific range or domain.
Data Type:
They specify the data type of the attribute, such as integer, string, date, etc.
Restrictions:
They can include additional restrictions, such as allowed ranges, formats, or patterns.
Examples:
A "NOT NULL" constraint prevents a column from accepting null values.
A "UNIQUE" constraint ensures that all values in a column are different.
A "CHECK" constraint can enforce specific criteria or conditions on the values.
Importance:
Domain constraints are crucial for maintaining data integrity and preventing errors.
Application:
They are used in various contexts, including:
Database design
Data validation
Data quality
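The NOT NULL, UNIQUE, and CHECK examples listed above can be seen rejecting bad rows in a short sketch. It assumes Python's sqlite3 module; the EMAIL column, the age range of 16 to 60, and all row values are hypothetical and chosen only to trigger each constraint once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE STUDENT (
        ROLL_NO INTEGER PRIMARY KEY,
        NAME    TEXT NOT NULL,                        -- value must be present
        EMAIL   TEXT UNIQUE,                          -- no two rows may share a value
        AGE     INTEGER CHECK (AGE BETWEEN 16 AND 60) -- value must lie in the allowed range
    )
""")
conn.execute("INSERT INTO STUDENT VALUES (1, 'RAM', 'ram@example.com', 18)")

# Each row below breaks exactly one constraint.
bad_rows = {
    "NOT NULL": (2, None, "x@example.com", 19),
    "UNIQUE": (3, "SURESH", "ram@example.com", 20),
    "CHECK": (4, "SUJIT", "sujit@example.com", 5),
}

violations = []
for constraint, row in bad_rows.items():
    try:
        conn.execute("INSERT INTO STUDENT VALUES (?, ?, ?, ?)", row)
    except sqlite3.IntegrityError:
        violations.append(constraint)

row_count = conn.execute("SELECT COUNT(*) FROM STUDENT").fetchone()[0]
print(violations, row_count)
```

Every invalid row is refused before it reaches the table, so only the first, valid row remains.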
Relational Algebra
Introduction of Relational Algebra in DBMS
Relational Algebra is a formal language used to query and manipulate relational databases,
consisting of a set of operations like selection, projection, union, and join. It provides a
mathematical framework for expressing queries over relations. Relational algebra serves as the
mathematical foundation for query languages such as SQL.
Relational algebra simplifies the process of querying databases and makes it easier to understand
and optimize query execution for better performance. It is essential for learning SQL because
SQL queries are based on relational algebra operations, enabling users to retrieve data
effectively.
Key Concepts in Relational Algebra
Before explaining relational algebra operations, let's define some fundamental concepts:
1. Relations: In relational algebra, a relation is a table that consists of rows and columns,
representing data in a structured format. Each relation has a unique name and is made up of
tuples.
2. Tuples: A tuple is a single row in a relation, which contains a set of values for each attribute.
It represents a single data entry or record in a relational table.
3. Attributes: Attributes are the columns in a relation, each representing a specific characteristic
or property of the data. For example, in a "Students" relation, attributes could be "Name", "Age",
and "Grade".
4. Domains: A domain is the set of possible values that an attribute can have. It defines the type
of data that can be stored in each column of a relation, such as integers, strings, or dates.
Basic Operators in Relational Algebra
Relational algebra consists of various basic operators that help us fetch and manipulate data
from relational tables in the database. The basic operators are selection (σ), projection (π),
union (∪), set difference (−), Cartesian product (×), and rename (ρ).
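The basic operators can be sketched in a few lines of Python. This is an illustrative model, not how a DBMS implements them: a relation is assumed to be a pair of attribute names and a set of tuples, and the rows of the "Students" relation are hypothetical.

```python
from itertools import product

# A relation is modelled as (attribute names, set of tuples); a set drops duplicates automatically.

def select(rel, predicate):            # σ: keep the rows that satisfy a condition
    attrs, rows = rel
    return attrs, {r for r in rows if predicate(dict(zip(attrs, r)))}

def project(rel, keep):                # π: keep only the named columns
    attrs, rows = rel
    idx = [attrs.index(a) for a in keep]
    return tuple(keep), {tuple(r[i] for i in idx) for r in rows}

def union(r1, r2):                     # ∪: rows in either relation (same attributes assumed)
    return r1[0], r1[1] | r2[1]

def difference(r1, r2):                # −: rows in r1 that are not in r2
    return r1[0], r1[1] - r2[1]

def cartesian(r1, r2):                 # ×: every row of r1 paired with every row of r2
    return r1[0] + r2[0], {a + b for a, b in product(r1[1], r2[1])}

def rename(rel, mapping):              # ρ: rename attributes, rows unchanged
    attrs, rows = rel
    return tuple(mapping.get(a, a) for a in attrs), rows

students = (
    ("Name", "Age", "Grade"),
    {("Asha", 20, "A"), ("Ravi", 22, "B"), ("Meena", 20, "A")},
)

older = select(students, lambda t: t["Age"] > 20)
grades = project(students, ["Grade"])
pairs = cartesian(grades, rename(grades, {"Grade": "Grade2"}))
others = difference(students, older)

print(older[1])
print(grades[1])
print(len(pairs[1]), len(others[1]))
```

Projection onto Grade returns two rows rather than three because a relation is a set and the duplicate grade "A" collapses.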
Relational Calculus
Relational calculus is a declarative query language in database theory that allows users to specify
what data they want to retrieve, rather than how to retrieve it. It's a non-procedural language,
meaning it focuses on the "what" instead of the "how," unlike relational algebra which is
procedural. Relational calculus is based on predicate calculus, a part of symbolic logic. There are
two main types: Tuple Relational Calculus (TRC) and Domain Relational Calculus (DRC).
Key Concepts:
Declarative: Relational calculus describes the desired result without specifying the steps to
obtain it.
Non-procedural: It focuses on what needs to be retrieved, not how.
Based on Predicate Calculus: Relational calculus builds upon the concepts of predicate
calculus, a branch of logic.
Two Main Types: TRC and DRC offer different ways of expressing
queries.
Tuple Relational Calculus (TRC):
TRC focuses on tuples (rows) of relations and uses predicates to define conditions.
It describes the desired tuples based on conditions (predicates).
For example, you can write a TRC query to retrieve all employees who earn more than a certain
salary.
TRC is used as a theoretical foundation for optimizing queries in relational databases, according
to Naukri.com.
Domain Relational Calculus (DRC):
DRC deals with individual values (domain values) of attributes rather than entire tuples.
It expresses queries in terms of the values of attributes.
While DRC can be powerful, it can also be more difficult to express complex queries compared
to TRC.
Relationship to Relational Algebra:
Both relational algebra and relational calculus are used in database management systems
(DBMS).
Relational algebra is procedural, specifying a sequence of operations to retrieve data.
Relational calculus is declarative, specifying what data to retrieve without specifying the steps.
Both are considered equivalent in expressive power, meaning they can express the same queries.
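The contrast between the declarative and procedural styles can be sketched with Python set comprehensions, which read much like calculus expressions. The employee rows and the salary threshold are hypothetical. In calculus notation the two queries would be written roughly as { t | Employee(t) AND t.salary > 50000 } for TRC and { <n> | there exists s such that <n, s> is in Employee AND s > 50000 } for DRC.

```python
from collections import namedtuple

Employee = namedtuple("Employee", ["name", "salary"])
employees = {Employee("Asha", 60000), Employee("Ravi", 45000), Employee("Meena", 52000)}
threshold = 50000

# Tuple relational calculus style: the variable t ranges over whole tuples.
trc = {t for t in employees if t.salary > threshold}

# Domain relational calculus style: the variables n and s range over attribute values.
drc = {n for (n, s) in employees if s > threshold}

# Relational algebra style: an explicit sequence of operations.
selected = set(filter(lambda t: t.salary > threshold, employees))  # σ salary > threshold
projected = {t.name for t in selected}                             # π name

print(sorted(drc), sorted(projected))
```

All three formulations return the same names, which illustrates (but does not prove) the equivalence in expressive power stated above.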
UNIT-3
What is SQL?
Data is at the core of every application, and SQL (Structured Query Language) manages and interacts
with this data. Whether we’re handling a small user database or analyzing terabytes of sales
records, SQL allows efficient querying, updating, and management of relational databases.
When data needs to be retrieved from a database, SQL is used to construct and send the request.
The Database Management System (DBMS) processes the SQL query, retrieves the requested
data, and returns it to the user or application. Instead of specifying step-by-step procedures, SQL
statements describe what data should be retrieved, organized, or modified, allowing the DBMS to
handle how the operations are executed efficiently.
Characteristics of SQL
User-Friendly and Accessible: SQL is designed for a broad range of users, including those with
minimal programming experience, making it approachable for non-technical individuals.
Declarative Language: As a non-procedural language, SQL allows users to specify what data is needed
rather than how to retrieve it, focusing on the desired results rather than the retrieval process.
Efficient Database Management: SQL enables the creation, modification, and management of
databases efficiently, saving time and simplifying complex database operations.
Standardized Language: Based on ANSI (American National Standards Institute) and ISO
(International Organization for Standardization) standards, SQL ensures consistency and stability
across various database management systems (DBMS).
Command Structure: SQL does not require a continuation character for multi-line queries, allowing
flexibility in writing commands across one or multiple lines.
Execution Mechanism: Queries are executed using a termination character (e.g., a semicolon ;),
enabling immediate and accurate command processing.
Built-in Functionality: SQL includes a rich set of built-in functions for data manipulation,
aggregation, and formatting, empowering users to handle diverse data-processing needs
effectively.
Advantages of SQL
Faster Query Processing: Large amounts of data can be retrieved quickly and efficiently. Operations
such as insertion, deletion, and modification of data are also typically fast.
No Coding Skills: Data retrieval does not require many lines of code. Basic keywords
such as SELECT, INSERT INTO, and UPDATE are used, and the syntactical rules are not
complex, which makes SQL a user-friendly language.
Standardized Language: Thanks to its documentation and long establishment over the years, it provides
a uniform platform worldwide to all its users.
Portable: It can be used in programs on PCs, servers, and laptops, independent of the platform
(operating system, etc.). It can also be embedded in other applications as required.
Interactive Language : Easy to learn and understand; answers to complex queries can often be received
in seconds.
Multiple data views : One of the advantages of SQL is its ability to provide multiple data views . This
means that SQL allows users to create different views or perspectives of the data stored in a
database, depending on their needs and permissions.
Scalability : SQL databases can handle large volumes of data and can be scaled up or down as per the
requirements of the application.
Security : SQL databases have built-in security features that help protect data from unauthorized
access, such as user authentication, encryption, and access control.
Data Integrity : SQL databases enforce data integrity by enforcing constraints such as unique keys,
primary keys, and foreign keys, which help prevent data duplication and maintain data accuracy.
Backup and Recovery : SQL databases have built-in backup and recovery tools that help recover data
in case of system failures, crashes, or other disasters.
Data Consistency: SQL databases ensure consistency of data across multiple tables through the use of
transactions, which ensure that changes made to one table are reflected in all related tables.
Disadvantages of SQL :
Although SQL has many advantages, still there are a few disadvantages.
Various Disadvantages of SQL are as follows:
Complex Interface : SQL has a difficult interface that makes some users uncomfortable while dealing
with the database.
Cost : Some versions are costly, which puts them out of reach for some programmers.
Partial Control : Due to hidden business rules, complete control is not given to the database.
Limited Flexibility: SQL databases are less flexible than NoSQL databases when it comes to handling
unstructured or semi-structured data, as they require data to be structured into tables and columns.
Lack of Real-Time Analytics: Many SQL databases are tuned for transactional or batch workloads, so
applications that require real-time data processing may need additional tooling or a specialized
engine.
Limited Query Performance: SQL databases may have limited query performance when dealing with
large datasets, as queries may take longer to process than in-memory databases.
Complexity: SQL databases can be complex to set up and manage, requiring skilled database
administrators to ensure optimal performance and maintain data integrity.
What are SQL Data Types?
Data Type | Description                                            | Range
BIGINT    | Large integer numbers                                  | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
INT       | Standard integer values                                | -2,147,483,648 to 2,147,483,647
SMALLINT  | Small integers                                         | -32,768 to 32,767
TINYINT   | Very small integers                                    | 0 to 255
DECIMAL   | Exact fixed-point numbers (e.g., for financial values) | -10^38 + 1 to 10^38 - 1
NUMERIC   | Similar to DECIMAL, used for precision data            | -10^38 + 1 to 10^38 - 1
MONEY     | For storing monetary values                            | -922,337,203,685,477.5808 to 922,337,203,685,477.5807
Approximate Numeric Datatype
These types are used to store approximate values, such as scientific
measurements or large ranges of data that don't need exact precision.
Data Type | Description               | Storage Size
Binary    | Fixed-length binary data. | 8000 bytes
SQL Literals
There are four kinds of literal values supported in SQL. They are : Character string, Bit string, Exact numeric,
and Approximate numeric. These are explained as following below.
1. Character string : Character strings are written as a sequence of characters enclosed in single quotes.
A single quote character inside a character string is represented by two single quotes. Some examples
of character strings are :
'My String'
'I love GeeksForGeeks'
'16378'
2. Bit string : A bit string is written either as a sequence of 0s and 1s enclosed in single quotes and
preceded by the letter ‘B’, or as a sequence of hexadecimal digits enclosed in single
quotes and preceded by the letter ‘X’. Some examples are given below :
B'10001011'
B'1'
B'0'
X'C5'
X'0'
3. Exact numeric : These literals are written as a signed or unsigned decimal number, possibly with a
decimal point. Examples of exact numeric literals are given below :
8
80
80.00
0.8
+88.88
-88.88
4. Approximate numeric : Approximate numeric literals are written as exact numeric literals followed by
the letter ‘E’, followed by a signed or unsigned integer. Some examples are :
6E6
66.6E6
+66E-6
0.66E-6
-6.66E-8
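Several of these literal forms can be checked quickly in a sketch that assumes Python's sqlite3 module. SQLite is only one dialect: it accepts the quoted, hexadecimal (X'...'), and numeric forms used here but has no B'...' bit-string literal, so that form is left out. The string 'O''Brien' is a hypothetical example of the doubled-quote rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute("""
    SELECT 'O''Brien',   -- character string; two single quotes stand for one
           hex(X'C5'),   -- hexadecimal string literal, shown back as hex digits
           80,           -- exact numeric literal
           6E6           -- approximate numeric literal: 6 x 10^6
""").fetchone()

print(row)
```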
Command | Description                                                                                    | Syntax
CREATE  | Create a database or its objects (table, index, function, view, stored procedure, and trigger) | CREATE TABLE table_name (column1 data_type, column2 data_type, ...);
DROP    | Delete objects from the database                                                               | DROP TABLE table_name;
…       | … all spaces allocated for the records are removed                                             | …
COMMENT | Add comments to the data dictionary                                                            | COMMENT ON TABLE table_name IS 'comment_text';
Example:
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
first_name VARCHAR(50),
last_name VARCHAR(50),
department VARCHAR(50),
hire_date DATE
);
In this example, a new table called employees is created with columns for employee ID, first
name, last name, department, and hire date.
2. DQL - Data Query Language
DQL statements are used for performing queries on the data within schema objects. The
purpose of a DQL command is to return a relation (a result set) based on the query passed to
it, so that data can be read out of the database and processed further. When
a SELECT is run against one or more tables, the result is compiled into a temporary table,
which is displayed or received by the calling program.
DQL Command
Command | Description                                    | Syntax
SELECT  | It is used to retrieve data from the database  | SELECT column1, column2, ... FROM table_name WHERE condition;
Example:
SELECT first_name, last_name, hire_date
FROM employees
WHERE department = 'Sales'
ORDER BY hire_date DESC;
This query retrieves employees' first and last names, along with their hire dates, from the
employees table, specifically for those in the 'Sales' department, sorted by hire date.
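The CREATE and SELECT statements above can be run end to end in a short sketch, assuming Python's sqlite3 module as the engine. The table is assumed to include a department column so that the WHERE clause has something to filter on, and the three sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the table (department is assumed here because the query filters on it).
conn.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        first_name  VARCHAR(50),
        last_name   VARCHAR(50),
        department  VARCHAR(50),
        hire_date   DATE
    )
""")
conn.executemany(
    "INSERT INTO employees (first_name, last_name, department, hire_date) VALUES (?, ?, ?, ?)",
    [
        ("Jane", "Smith", "HR", "2021-03-01"),
        ("Amit", "Rao", "Sales", "2019-07-15"),
        ("Lena", "Park", "Sales", "2022-01-10"),
    ],
)

# DQL: the same query as in the text, newest hire first.
rows = conn.execute("""
    SELECT first_name, last_name, hire_date
    FROM employees
    WHERE department = 'Sales'
    ORDER BY hire_date DESC
""").fetchall()

print(rows)
```

Dates are stored as ISO-8601 text in this sketch, which sorts correctly as plain strings.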
3. DML - Data Manipulation Language
The SQL commands that deal with the manipulation of data present in the database belong
to DML or Data Manipulation Language and this includes most of the SQL statements. It is the
component of the SQL statement that controls access to data and to the database. Basically, DCL
statements are grouped with DML statements.
Common DML Commands
Command      | Description                          | Syntax
INSERT       | Insert data into a table             | INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);
UPDATE       | Update existing data within a table  | UPDATE table_name SET column1 = value1, column2 = value2 WHERE condition;
DELETE       | Delete records from a database table | DELETE FROM table_name WHERE condition;
LOCK         | Control table concurrency            | LOCK TABLE table_name IN lock_mode;
CALL         | Call a PL/SQL or JAVA subprogram     | CALL procedure_name(arguments);
EXPLAIN PLAN | Describe the access path to data     | EXPLAIN PLAN FOR SELECT * FROM table_name;
Example:
INSERT INTO employees (first_name, last_name, department)
VALUES ('Jane', 'Smith', 'HR');
This query inserts a new record into the employees table with the first name 'Jane', last name
'Smith', and department 'HR'.
4. DCL - Data Control Language
DCL (Data Control Language) includes commands such as GRANT and REVOKE which
mainly deal with the rights, permissions, and other controls of the database system. These
commands are used to control access to data in the database by granting or revoking
permissions.
Common DCL Commands
Command | Description | Syntax
GRANT   | Assigns new privileges to a user account, allowing access to specific database objects, actions, or functions. | GRANT privilege_type [(column_list)] ON [object_type] object_name TO user [WITH GRANT OPTION];
REVOKE  | Removes previously granted privileges from a user account, taking away their access to certain database objects or actions. | REVOKE [GRANT OPTION FOR] privilege_type [(column_list)] ON [object_type] object_name FROM user [CASCADE];
Example of DCL
GRANT SELECT, UPDATE ON employees TO user_name;
This command grants the user user_name the permissions to select and update records in the
employees table.
5. TCL - Transaction Control Language
Transactions group a set of tasks into a single execution unit. Each transaction begins with a
specific task and ends when all the tasks in the group are successfully completed. If any of
the tasks fail, the transaction fails. Therefore, a transaction has only two
results: success or failure.
Common TCL Commands
Command           | Description                                         | Syntax
BEGIN TRANSACTION | Starts a new transaction                            | BEGIN TRANSACTION [transaction_name];
…                 | … during the transaction                            | …
SAVEPOINT         | Creates a savepoint within the current transaction  | SAVEPOINT savepoint_name;
Example:
BEGIN TRANSACTION;
UPDATE employees SET department = 'Marketing' WHERE department = 'Sales';
SAVEPOINT before_update;
UPDATE employees SET department = 'IT' WHERE department = 'HR';
ROLLBACK TO SAVEPOINT before_update;
COMMIT;
In this example, a transaction is started, the first update is applied, and a savepoint is set. The
second update is then undone by rolling back to the savepoint, and the remaining change is committed.
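The effect of the savepoint can be confirmed with a small sketch that assumes Python's sqlite3 module. The connection is opened with isolation_level=None so that the module does not open transactions on its own and the TCL statements run exactly as written. The two employee rows are hypothetical.

```python
import sqlite3

# isolation_level=None: transactions are controlled only by the explicit statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE employees (first_name TEXT, department TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)", [("Amit", "Sales"), ("Jane", "HR")])

conn.execute("BEGIN TRANSACTION")
conn.execute("UPDATE employees SET department = 'Marketing' WHERE department = 'Sales'")
conn.execute("SAVEPOINT before_update")
conn.execute("UPDATE employees SET department = 'IT' WHERE department = 'HR'")
conn.execute("ROLLBACK TO SAVEPOINT before_update")  # undoes only the HR -> IT change
conn.execute("COMMIT")                               # keeps the Sales -> Marketing change

final = conn.execute(
    "SELECT first_name, department FROM employees ORDER BY first_name"
).fetchall()
print(final)
```

Only the work done after the savepoint is discarded; everything before it survives the COMMIT.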
Most Important SQL Commands
There are also a few other SQL Commands we often rely on when writing powerful queries.
While they don’t fit neatly into the five main categories, they’re absolutely essential for working
with data effectively.
Command             | Description
TRUNCATE TABLE      | Removes all rows from a table but keeps its structure intact.
IN / BETWEEN / LIKE | Used for advanced filtering conditions.
SQL operators are important in database management systems (DBMS) as they allow us to
manipulate and retrieve data efficiently. Operators in SQL perform arithmetic, logical,
comparison, bitwise, and other operations to work with database values. Understanding SQL
operators is crucial for performing complex data manipulations, calculations, and filtering
operations in queries.
Operator         | Description
/                | Division operator: divides one operand by another.
=                | Equal to.
AND              | Logical AND compares two Boolean expressions and returns true when both expressions are true.
OR               | Logical OR compares two Boolean expressions and returns true when at least one of the expressions is true.
NOT              | NOT takes a single Boolean as an argument and changes its value from false to true or from true to false.
| (vertical bar) | Bitwise OR operator.
…                | … returns no rows.
Simplify Complex Queries: Encapsulate complex joins and conditions into a single object.
Enhance Security: Restrict access to specific columns or rows.
Present Data Flexibly: Provide tailored data views for different users.
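The row and column restriction described above can be sketched with a view, assuming Python's sqlite3 module. The employees table, its salary column, and the view name sales_staff are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Amit", "Sales", 50000), ("Jane", "HR", 60000), ("Lena", "Sales", 55000)],
)

# The view exposes only Sales rows and leaves out the salary column.
conn.execute("""
    CREATE VIEW sales_staff AS
    SELECT first_name, department
    FROM employees
    WHERE department = 'Sales'
""")

cursor = conn.execute("SELECT * FROM sales_staff ORDER BY first_name")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)
print(view_rows)
```

A user who is given access to sales_staff but not to employees sees neither the HR row nor any salary.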
What Are Indexes in SQL?
An index in SQL is a schema object that improves the speed of data retrieval operations on a
table. It works by creating a separate data structure that provides pointers to the rows in a table,
making it faster to look up rows based on specific column values. Indexes act as a table of
contents for a database, allowing the server to locate data quickly and efficiently, reducing disk
I/O operations.
Benefits of Indexes:
Faster Queries: Speeds up SELECT and JOIN operations.
Lower Disk I/O: Reduces the load on your database by limiting the amount of data scanned.
Better Performance on Large Tables: Essential when working with millions of records.
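One way to see an index change how a query is answered is to compare query plans before and after creating it. The sketch below assumes Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the table contents and the index name idx_employees_department are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department TEXT)")
conn.executemany(
    "INSERT INTO employees (department) VALUES (?)",
    [("Sales",), ("HR",), ("IT",)] * 100,
)

query = "SELECT employee_id FROM employees WHERE department = 'Sales'"

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description of one step.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(query)   # without an index the whole table is scanned
conn.execute("CREATE INDEX idx_employees_department ON employees (department)")
plan_after = plan(query)    # with the index the matching rows are looked up directly

print(plan_before)
print(plan_after)
```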
TABLE IN SQL
A relational database system contains one or more objects called tables. The data or information for
the database is stored in these tables. Tables are uniquely identified by their names and are
comprised of columns and rows. Columns contain the column name, data type, and any other
attributes for the column. Rows contain the records or data for the columns.
SQL Aggregate Functions are used to perform calculations on a set of rows and return a single
value. These functions are particularly useful when we need to summarize, analyze, or group
large datasets in SQL databases. Whether you're working with sales data, employee records, or
product inventories, aggregate functions help us derive meaningful insights.
Commonly used aggregate functions include COUNT(), SUM(), AVG(), MIN(), and MAX().
Key Features of SQL Aggregate Functions:
Operate on groups of rows: They work on a set of rows and return a single value.
Ignore NULLs: Most aggregate functions ignore NULL values, except for COUNT(*).
Used with GROUP BY: To perform calculations on grouped data, you often use
aggregate functions with GROUP BY.
Can be combined with other SQL clauses: Aggregate functions can be used alongside
HAVING, ORDER BY, and other SQL clauses to filter or sort results.
COUNT()
The COUNT() function returns the number of rows that match a given condition or are present in
a column.
SUM()
The SUM() function calculates the total sum of a numeric column.
AVG()
The AVG() function calculates the average of a numeric column. It divides the sum of the
column by the number of non-NULL rows.
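The aggregate functions and their handling of NULL can be sketched as follows, assuming Python's sqlite3 module. The sales table and its four rows are hypothetical; one amount is deliberately NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 100), ("North", 300), ("South", 200), ("South", None)],
)

# COUNT(*) counts every row; the other aggregates skip the NULL amount.
totals = conn.execute("""
    SELECT COUNT(*), COUNT(amount), SUM(amount), AVG(amount), MIN(amount), MAX(amount)
    FROM sales
""").fetchone()

# GROUP BY produces one result per region; HAVING filters the groups afterwards.
by_region = conn.execute("""
    SELECT region, SUM(amount), AVG(amount)
    FROM sales
    GROUP BY region
    HAVING SUM(amount) > 250
    ORDER BY region
""").fetchall()

print(totals)
print(by_region)
```

AVG divides 600 by the three non-NULL amounts, not by the four rows.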
Joins
Joins are used to combine rows from two or more tables based on related columns between them.
The most common type is the INNER JOIN which returns only matching rows from both tables.
LEFT JOIN returns all rows from the left table with matching rows from the right (or NULL if
no match). RIGHT JOIN does the opposite, returning all rows from the right table. FULL JOIN
returns all rows from both tables, matching where possible. CROSS JOIN produces a Cartesian
product, combining every row from the first table with every row from the second table, which is
rarely used in practice but important to understand.
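The difference between the join types shows up clearly on a tiny data set. The sketch below assumes Python's sqlite3 module and reuses the STUDENT and BRANCH names; the students RAM and SUJIT are hypothetical rows, with SUJIT given no branch on purpose. RIGHT JOIN and FULL JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BRANCH (BRANCH_CODE TEXT, BRANCH_NAME TEXT)")
conn.execute("CREATE TABLE STUDENT (ROLL_NO INTEGER, NAME TEXT, BRANCH_CODE TEXT)")
conn.executemany(
    "INSERT INTO BRANCH VALUES (?, ?)",
    [("CS", "COMPUTER SCIENCE"), ("IT", "INFORMATION TECHNOLOGY")],
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [(1, "RAM", "CS"), (4, "SURESH", "IT"), (5, "SUJIT", None)],
)

# INNER JOIN: only students whose BRANCH_CODE matches a branch.
inner = conn.execute("""
    SELECT s.NAME, b.BRANCH_NAME
    FROM STUDENT s INNER JOIN BRANCH b ON s.BRANCH_CODE = b.BRANCH_CODE
    ORDER BY s.ROLL_NO
""").fetchall()

# LEFT JOIN: every student, with NULL (None) where no branch matches.
left = conn.execute("""
    SELECT s.NAME, b.BRANCH_NAME
    FROM STUDENT s LEFT JOIN BRANCH b ON s.BRANCH_CODE = b.BRANCH_CODE
    ORDER BY s.ROLL_NO
""").fetchall()

# CROSS JOIN: every student paired with every branch (3 x 2 rows).
cross_count = conn.execute("SELECT COUNT(*) FROM STUDENT CROSS JOIN BRANCH").fetchone()[0]

print(inner)
print(left)
print(cross_count)
```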
Unions
UNION combines the result sets of two or more SELECT statements, removing duplicates. The
number and order of columns must match in all queries, and data types must be compatible.
UNION ALL is similar but retains duplicates and is more efficient since it doesn't need to check
for duplicates. These operations are vertical combinations, stacking result sets on top of each
other rather than joining them side-by-side.
Intersection
INTERSECT returns only the rows that appear in both result sets of two SELECT statements.
Like UNION, the queries must have the same number of columns with compatible data types.
This operation is useful for finding common elements between two datasets. Not all database
systems support INTERSECT directly, sometimes requiring alternative approaches using
EXISTS or IN clauses.
Minus (EXCEPT)
MINUS (called EXCEPT in some databases) returns rows from the first query that aren't present
in the second query's results. It essentially performs set subtraction. The operation requires the
same number of columns with compatible types in both queries. This is particularly useful for
finding differences between datasets or excluding specific records from a result set.
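The set operators from the last three sections can be compared side by side in one sketch, assuming Python's sqlite3 module. The tables a and b and their values are hypothetical. SQLite spells set subtraction EXCEPT; MINUS is the Oracle spelling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (x INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,), (4,)])

def run(operator):
    # Both SELECTs return one column of the same type, as the set operators require.
    sql = f"SELECT x FROM a {operator} SELECT x FROM b ORDER BY x"
    return [row[0] for row in conn.execute(sql)]

results = {op: run(op) for op in ("UNION", "UNION ALL", "INTERSECT", "EXCEPT")}

for op, values in results.items():
    print(op, values)
```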
Cursors
Cursors are database objects used to retrieve, manipulate, and navigate through result sets row by
row. They provide more control than standard result sets, allowing procedural processing of data.
There are implicit cursors (automatically created for SQL statements) and explicit cursors
(defined by programmers). Cursors are essential in PL/SQL for row-by-row processing, though
overuse can impact performance.
Triggers
Triggers are stored programs that automatically execute ("fire") in response to specific database
events (INSERT, UPDATE, DELETE) on particular tables or views. They can run before or
after the triggering event and are useful for enforcing business rules, maintaining audit trails, or
keeping derived data consistent. Triggers operate transparently to applications, executing
whenever the defined event occurs regardless of what caused the event.
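An audit trail, one of the uses mentioned above, can be sketched with a single trigger. The example assumes Python's sqlite3 module and SQLite's trigger syntax, which differs in detail from PL/SQL; the table names, the trigger name log_department_change, and the row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department TEXT)")
conn.execute(
    "CREATE TABLE audit_log (employee_id INTEGER, old_department TEXT, new_department TEXT)"
)

# The trigger fires after every update of the department column, once per changed row.
conn.execute("""
    CREATE TRIGGER log_department_change
    AFTER UPDATE OF department ON employees
    BEGIN
        INSERT INTO audit_log VALUES (OLD.employee_id, OLD.department, NEW.department);
    END
""")

conn.execute("INSERT INTO employees VALUES (1, 'Sales')")
# The application only updates employees; the audit row is written by the trigger.
conn.execute("UPDATE employees SET department = 'Marketing' WHERE employee_id = 1")

audit = conn.execute("SELECT * FROM audit_log").fetchall()
print(audit)
```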
Procedures
Procedures are named PL/SQL blocks that perform specific tasks, optionally accepting
parameters and returning values. They promote code reusability and modularity, encapsulating
complex operations that can be called from applications or other procedures. Unlike functions,
procedures don't have to return values and are called as standalone statements. They're stored in
the database and can include transaction control statements, exception handling, and all PL/SQL
features.
Normalization is an important process in database design that helps improve the database's
efficiency, consistency, and accuracy. It makes it easier to manage and maintain the data and
ensures that the database is adaptable to changing business needs.
Database normalization is the process of organizing the attributes of the database to reduce or
eliminate data redundancy (having the same data but at different places).
Normalization generally involves splitting a table into multiple ones which must be linked each
time a query is made requiring data from the split tables.
A functional dependency occurs when one attribute uniquely determines another attribute within
a relation. It is a constraint that describes how attributes in a table relate to each other. If attribute
A functionally determines attribute B, we write this as A → B.
Functional dependencies are used to mathematically express relations among database entities
and are very important to understanding advanced concepts in Relational Database Systems.
Non-trivial Functional Dependency
In Non-trivial functional dependency, the dependent is strictly not a subset of the determinant.
i.e. If X → Y and Y is not a subset of X, then it is called Non-trivial functional dependency.
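Whether a functional dependency holds in a given set of rows, and whether it is trivial, can be checked mechanically. The sketch below is a minimal illustration in Python; the function names holds and is_trivial and the enrolment rows are hypothetical. A check on sample data can only refute a dependency, since a dependency is a statement about every valid state of the relation.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True

def is_trivial(determinant, dependent):
    """X -> Y is trivial when Y is a subset of X."""
    return set(dependent) <= set(determinant)

enrolments = [
    {"StudentID": 1, "CourseID": "DB", "StudentName": "Asha"},
    {"StudentID": 1, "CourseID": "OS", "StudentName": "Asha"},
    {"StudentID": 2, "CourseID": "DB", "StudentName": "Ravi"},
]

print(holds(enrolments, ["StudentID"], ["StudentName"]))      # StudentID -> StudentName
print(holds(enrolments, ["CourseID"], ["StudentName"]))       # CourseID -> StudentName
print(is_trivial(["StudentID", "CourseID"], ["StudentID"]))   # dependent is part of the determinant
print(is_trivial(["StudentID"], ["StudentName"]))             # non-trivial
```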
Normal Forms in DBMS
Normalization is a technique used in database design to reduce redundancy and improve data
integrity by organizing data into tables and ensuring proper relationships. Normal Forms are
different stages of normalization, and each stage imposes certain rules to improve the structure
and performance of a database. Let's break down the various normal forms step-by-step to
understand the conditions that need to be satisfied at each level:
Example: For a composite key (StudentID, CourseID), if the StudentName depends only on
StudentID and not on the entire key, it violates 2NF. To normalize, move StudentName into a
separate table where it depends only on StudentID.
2. Improved data consistency: Normalization ensures that data is stored in a consistent and
organized manner, reducing the risk of data inconsistencies and errors.
3. Simplified database design: Normalization provides guidelines for organizing tables and data
relationships, making it easier to design and maintain a database.
4. Improved query performance: Normalized tables are typically easier to search and retrieve
data from, resulting in faster query performance.
Complex Queries: Too many tables may result in multiple joins, making queries slow and
difficult to manage.
Performance Overhead: Additional processing required for joins in overly normalized databases
may hurt performance, especially in large-scale systems.
UNIT 5
This architecture allows for efficient data management and resource allocation by
centralizing critical functions on the server, which can handle complex processing
and large-scale data storage.
Clients manage user interactions and send specific requests to the server, which
processes these requests and sends back appropriate responses.
The client-server architecture is highly scalable, as it can accommodate more
clients by scaling the server's capabilities or adding additional servers.
This design is prevalent in various applications, including web services, database
management, and email systems, providing a robust framework for developing and
managing complex, distributed systems efficiently.
A centralized architecture for DBMS is one in which all data is stored on a single
server, and all clients connect to that server in order to access and manipulate the
data. This type of architecture is also known as a monolithic architecture. One of
the main advantages of a centralized architecture is its simplicity - there is only one
server to manage, and all clients use the same data.
Client-server
Application Layering
A distributed database is a database that is not limited to one system; it is
spread over different sites, i.e., on multiple computers or over a network of
computers. A distributed database system is located on various sites that don't
share physical components. This may be required when a particular database needs
to be accessed by various users globally. It needs to be managed such that, to
the users, it looks like one single database.
Objects. The basic building block and an instance of a class. The type is either
built-in or user-defined.
Classes. A schema or blueprint that defines object structure and behavior.
Methods. The operations that define the behavior of a class and its objects.
Pointers. An entity that helps access elements of an object database. They also help
establish relationships between objects.
1. Object Structure:
The structure of an object refers to the properties that an object is made up of.
These properties are referred to as attributes. Thus, an object is a real-world
entity with certain attributes that make up the object structure. Also, an object
encapsulates data and code into a single unit, which in turn provides data
abstraction by hiding the implementation details from the user.
Messages -
A message provides an interface or acts as a communication medium between an
object and the outside world. A message can be of two types:
Read-only message: If the invoked method does not change the value of a variable,
then the invoking message is said to be a read-only message.
Update message: If the invoked method changes the value of a variable, then the
invoking message is said to be an update message.
Methods -
When a message is passed then the body of code that is executed is known as a
method. Whenever a method is executed, it returns a value as output. A method
can be of two types:
Read-only method: When the value of a variable is not affected by a method, then
it is known as the read-only method.
Update method: When the value of a variable is changed by a method, then it is
known as an update method.
Variables -
Variables store the data of an object. The data stored in the variables makes
one object distinguishable from another.
2. Object Classes:
An object which is a real-world entity is an instance of a class. Hence first we need
to define a class and then the objects are made which differ in the values they store
but share the same class definition. The objects in turn correspond to various
messages and variables stored in them.
Example -
#include <string>

class CLERK
{
    // variables
    std::string name;
    std::string address;
    int id;
    int salary;

public:
    // messages
    std::string get_name();
    std::string get_address();
    int annual_salary();
};
In the above example, we can see, CLERK is a class that holds the object variables
and messages.
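The difference between read-only and update methods can be sketched in Python. The class below loosely mirrors CLERK; the method bodies, the hypothetical raise_salary update method, and the assumption that salary is a monthly amount are illustrative only.

```python
class Clerk:
    def __init__(self, name, address, clerk_id, salary):
        # Variables: the data that distinguishes one object from another.
        self.name = name
        self.address = address
        self.id = clerk_id
        self.salary = salary  # assumed to be a monthly amount

    # Read-only methods: return a value without changing any variable.
    def get_name(self):
        return self.name

    def annual_salary(self):
        return self.salary * 12

    # Update method (hypothetical): changes the value of a variable.
    def raise_salary(self, amount):
        self.salary += amount
        return self.salary


clerk = Clerk("Meera", "12 Park Road", 101, 3000)
before = clerk.annual_salary()   # read-only message
clerk.raise_salary(500)          # update message
after = clerk.annual_salary()
print(before, after)
```

Two Clerk objects share the same class definition but differ in the values they store.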
Thus, OODBMS provides numerous facilities to its users, both built-in and
user-defined. It combines the properties of an object-oriented data model with a
database management system, and supports object-oriented programming concepts
such as classes and objects, along with encapsulation, inheritance, and
user-defined ADTs (abstract data types).
An ODBMS stores and manages data as objects, and provides mechanisms for
querying, manipulating, and retrieving the data. In an ODBMS, the data is
typically stored in the form of classes and objects, which can be related to each
other using inheritance and association relationships.
ODBMS have several advantages over traditional relational databases. One of the
main advantages is that they provide a natural way to represent complex data
structures and relationships. Since the data is represented using objects, it can be
easier to model real-world entities in the database. Additionally, ODBMS can
provide better performance and scalability for applications that require a large
number of small, complex transactions.
However, there are also some disadvantages to using an ODBMS. One of the main
disadvantages is that they can be more complex and harder to use than traditional
relational databases. Additionally, ODBMS may not be as widely used and
supported as traditional relational databases, which can make it harder to find
expertise and support. Finally, some applications may not require the advanced
features and performance provided by an ODBMS, and may be better suited for a
simpler database solution.
Features of ODBMS:
Object-oriented data model: ODBMS uses an object-oriented data model to store
and manage data. This allows developers to work with data in a more natural way,
as objects are similar to the objects in the programming language they are using.
Complex data types: ODBMS supports complex data types such as arrays, lists,
sets, and graphs, allowing developers to store and manage complex data structures
in the database.
Data integrity: ODBMS provides strong data integrity, as the relationships between
objects are maintained by the database. This ensures that data remains consistent
and correct, even in complex applications.
Scalability: ODBMS can scale horizontally by adding more servers to the database
cluster, allowing it to handle large volumes of data.
Advantages:
Supports Complex Data Structures: ODBMS is designed to handle complex data
structures and supports object-oriented features such as inheritance,
polymorphism, and encapsulation. This makes it easier to work with complex data
models in an object-oriented programming environment.
Supports Rich Data Types: ODBMS supports rich data types, such as audio, video,
images, and spatial data, which can be challenging to store and retrieve in
traditional relational databases.
Scalability: ODBMS can scale horizontally and vertically, which means it can
handle larger volumes of data and can support more users.
Disadvantages:
Limited Adoption: ODBMS is not as widely adopted as traditional relational
databases, which means it may be more challenging to find developers with
experience working with ODBMS.
Cost: ODBMS can be more expensive than traditional relational databases, since it
may require specialized software and hardware.
Characteristics of a Decision Support System
Interactive Interface: The graphical user interface is user friendly, so users
can easily interact with the DSS to input their data and get the desired output.
Data Integration: It draws information from various sources such as DBMS, data
marts, data warehouses, and even data feeds, so that complete data is available
for processing.
Support for Semi-structured and Unstructured Decisions: Unlike traditional
management information systems, a DSS is intended for cases where the
decision-making process is not highly routinized.
Analytical Models and Tools: DSS also has tools for analysing data and making
recommendations; these range from statistical analysis and forecasting to
optimization and simulation models.
Flexibility and Adaptability: The system can be applied to any type of
decision-making environment and can be modified to match the needs of the users
or the organization.
What-if Analysis: It supports what-if analysis, where the assumptions or the
values of the input variables can be varied to determine the impact of the change
on the result.
Timely and Relevant Information: DSS supplies timely and relevant information
that decision-makers can use to respond appropriately to demanding and volatile
environments.
Support for Group Decision Making: Most available DSS include group support
systems for cases where more than one person is involved in the decision-making
process.
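What-if analysis can be illustrated with a small Python sketch. The profit model, the input names, and the figures below are all assumptions chosen for illustration; a real DSS would use its own decision models.

```python
def projected_profit(units_sold, unit_price, unit_cost, fixed_cost):
    """Simple profit model standing in for the decision model."""
    return units_sold * (unit_price - unit_cost) - fixed_cost


# Hypothetical baseline assumptions.
baseline = {
    "units_sold": 1000,
    "unit_price": 20.0,
    "unit_cost": 12.0,
    "fixed_cost": 5000.0,
}


def what_if(base_inputs, variable, values):
    """Vary one input while holding the others fixed; return {value: profit}."""
    results = {}
    for value in values:
        scenario = dict(base_inputs, **{variable: value})
        results[value] = projected_profit(**scenario)
    return results


price_scenarios = what_if(baseline, "unit_price", [18.0, 20.0, 22.0])
for price, profit in price_scenarios.items():
    print(price, profit)
```

Changing only one input at a time makes the impact of each assumption on the result visible.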
Purpose of a Decision Support System
Improving Decision Quality: DSS improves the quality of decisions by basing them
on information and analysis that is accurate, comprehensive, pertinent, and
timely.
Handling Complex Problems: By applying analytical and modelling tools, DSS helps
with structured and unstructured problems that other approaches may not handle
efficiently.
Facilitating Rapid Decision Making: DSS speeds up decision making by automating
the process of data collection and analysis.
Supporting Strategic Planning: Organisations need to make long-run forecasts and
plans, and DSS supports this with tools in the form of scenarios, forecasts, and
simulations.
Enhancing Efficiency: DSS reduces the time and effort needed to amass decision
information, assemble data, and analyze it, thus enhancing organizational
productivity at the operational level.
Encouraging Collaboration: Many DSSs support collaborative decision making, which
allows many people to share information and arrive at agreed decisions.
Data Analysis Process
Define Objectives : Clearly define the goals of the analysis and the specific
questions you aim to answer. Establish a clear understanding of what insights or
decisions the analyzed data should inform.
Data Collection: Gather relevant data from various sources. Ensure data integrity,
quality, and completeness. Organize the data in a format suitable for analysis.
There are two types of data: qualitative and quantitative.
Data Cleaning and Preprocessing: Address missing values, handle outliers, and
transform the data into a usable format. Cleaning and preprocessing steps are
crucial for ensuring the accuracy and reliability of the analysis.
Exploratory Data Analysis (EDA): Conduct exploratory analysis to understand the
characteristics of the data. Visualize distributions, identify patterns, and calculate
summary statistics. EDA helps in formulating hypotheses and refining the analysis
approach.
Statistical Analysis : Apply appropriate statistical methods or modeling techniques
to answer the defined questions. This step involves testing hypotheses, building
predictive models, or performing any analysis required to derive meaningful
insights from the data.
Visualization and Communication: Interpret the results in the context of the
original objectives. Communicate findings through reports, visualizations, or
presentations. Clearly articulate insights, conclusions, and recommendations based
on the analysis to support informed decision-making.
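The cleaning and exploratory steps above can be sketched with Python's standard statistics module. The sales figures are made up, and the outlier rule (drop values more than five median absolute deviations from the median) is just one possible choice, assumed here for illustration.

```python
import statistics

# Hypothetical raw daily sales figures; None marks a missing value.
raw_sales = [120, 135, None, 128, 131, 900, 125, None, 130]

# Data cleaning: drop missing values.
present = [x for x in raw_sales if x is not None]

# Outlier handling: keep values close to the median, measured in
# median absolute deviations (MAD). The factor 5 is an assumed threshold.
median = statistics.median(present)
mad = statistics.median(abs(x - median) for x in present)
cleaned = [x for x in present if abs(x - median) <= 5 * mad]

# Exploratory data analysis: summary statistics of the cleaned data.
summary = {
    "count": len(cleaned),
    "mean": statistics.mean(cleaned),
    "median": statistics.median(cleaned),
    "min": min(cleaned),
    "max": max(cleaned),
}
print(cleaned)
print(summary)
```

Here the value 900 is treated as an outlier and removed before the summary statistics are calculated.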
Types of Data Analysis
1. Descriptive Analysis
Descriptive analysis helps us understand what happened in the past. It looks at
historical data and summarizes it in a way that makes sense. For example, a
company might use descriptive analysis to see how much they sold last year or to
find out which product was most popular.
2. Diagnostic Analysis
Diagnostic analysis works hand in hand with descriptive analysis. While
descriptive analysis finds out what happened in the past, diagnostic analysis
finds out why it happened, what measures were taken at that time, or how
frequently it has happened. It helps businesses figure out the reasons behind
certain outcomes.
3. Predictive Analysis
By forecasting future trends based on historical data, predictive analysis
enables organizations to prepare for upcoming opportunities and challenges. For
example, a store might use predictive analysis to figure out what products will
be popular in the upcoming season. It helps businesses prepare for future events
and make plans.
4. Prescriptive Analysis
Prescriptive Analysis is an advanced method that takes Predictive Analysis insights
and gives suggestions on the best actions to take. For example, if predictive
analysis shows that a certain product will be popular, prescriptive analysis might
suggest how much stock to buy or what marketing strategies to use. It’s about
giving businesses clear advice on how to act.
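The descriptive, predictive, and prescriptive steps can be illustrated on a tiny data set. The monthly sales figures, the straight-line trend model, and the 10% safety margin are all assumptions made for this sketch.

```python
# Hypothetical monthly unit sales for one product.
sales = [100, 110, 120, 130]
months = list(range(len(sales)))  # 0, 1, 2, 3

# Descriptive analysis: summarize what happened.
total = sum(sales)
average = total / len(sales)

# Predictive analysis: fit a straight line by least squares
# and extrapolate it one month ahead.
mean_x = sum(months) / len(months)
mean_y = average
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(months, sales))
    / sum((x - mean_x) ** 2 for x in months)
)
intercept = mean_y - slope * mean_x
forecast = intercept + slope * len(sales)

# Prescriptive analysis: turn the forecast into a suggested action.
safety_margin = 1.1  # assumed 10% buffer on top of the forecast
suggested_stock = round(forecast * safety_margin)

print(total, average, forecast, suggested_stock)
```

The forecast answers what is likely to happen, while the suggested stock level answers what to do about it.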
A mobile database is a database that can be connected to a mobile computing
device over a mobile network (or wireless network). Here the client and the server
have wireless connections. Mobile computing is growing very rapidly, and it has
huge potential in the field of databases. Mobile databases can be used on a
variety of devices, such as Android-based and iOS-based devices. Common examples
of mobile databases are Couchbase Lite, ObjectBox, etc.
A cache is maintained to hold frequently used data and transactions so that they
are not lost due to connection failure.
Mobile databases reside on mobile devices such as laptops, mobile phones, and
PDAs, whose use is increasing.
Mobile databases are physically separate from the central database server.
Mobile databases are capable of communicating with a central database server or
other mobile clients from remote sites.
With the help of a mobile database, mobile users must be able to work while
disconnected, that is, when the wireless connection is poor or non-existent.
A mobile database is used to analyze and manipulate data on mobile devices.
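The idea of caching transactions while disconnected and replaying them later can be sketched in Python. The class names CentralServer and MobileDatabase, and the simple key-value data model, are hypothetical; real mobile databases use more elaborate synchronization and conflict handling.

```python
class CentralServer:
    """Stand-in for the central database server (hypothetical)."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class MobileDatabase:
    """Local store that queues writes while the device is disconnected."""

    def __init__(self, server):
        self.server = server
        self.local = {}        # data held on the device
        self.pending = []      # cached transactions awaiting sync
        self.connected = False

    def write(self, key, value):
        self.local[key] = value            # works even when offline
        self.pending.append((key, value))
        if self.connected:
            self.sync()

    def read(self, key):
        return self.local.get(key)

    def sync(self):
        """Replay cached transactions on the server, oldest first."""
        while self.pending:
            key, value = self.pending.pop(0)
            self.server.apply(key, value)


server = CentralServer()
db = MobileDatabase(server)
db.write("order:1", "2 x tea")             # offline: stored locally and queued
offline_server_state = dict(server.data)   # server has not seen the write yet
db.connected = True
db.sync()                                  # connection restored: replay queue
print(offline_server_state, server.data)
```

The user can keep reading and writing locally, and nothing is lost when the connection drops.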
A mobile database typically involves three parties:
Fixed Hosts -
These perform the transactions and data management functions with the help of
database servers.
Mobile Units -
These are portable computers that move around a geographical region that includes
the cellular network these units use to communicate with base stations.
Base Stations -
These are two-way radio installations in fixed locations that pass communication
between the mobile units and the fixed hosts.
Limitations :
Here, we will discuss the limitation of mobile databases as follows.
Data transfers are either done on a per object basis or on a per page (normally 4K)
basis.
XML Database
An XML database is used to store large amounts of information in the XML format.
As the use of XML is increasing in every field, a secure place is required to
store XML documents. The data stored in the database can be queried using XQuery,
serialized, and exported into a desired format.
XML Database Types
There are two major types of XML databases −
XML-enabled
Native XML (NXD)
XML-Enabled Database
An XML-enabled database is a relational database with an extension that handles
the conversion of XML documents. Data is stored in tables consisting of rows and
columns. The tables contain sets of records, which in turn consist of fields.
Native XML Database
A native XML database is based on containers rather than a table format. It can
store large amounts of XML documents and data. A native XML database is queried
using XPath expressions.
A native XML database has an advantage over an XML-enabled database: it is better
able to store, query, and maintain XML documents.
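To give a feel for XPath-style querying, the sketch below uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath and is not itself an XML database. The library document and its element names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document.
xml_document = """
<library>
    <book id="b1"><title>Database Systems</title><year>2019</year></book>
    <book id="b2"><title>XML Basics</title><year>2021</year></book>
</library>
"""

root = ET.fromstring(xml_document)

# Select the title of every book.
all_titles = [t.text for t in root.findall("./book/title")]

# Select the title of the book whose id attribute is "b2".
b2_title = root.find("./book[@id='b2']/title").text

# Select the ids of the books whose <year> child contains 2021.
books_2021 = [b.get("id") for b in root.findall("./book[year='2021']")]

print(all_titles, b2_title, books_2021)
```

A native XML database evaluates path expressions like these directly against its stored documents.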
A multimedia database is a collection of interrelated multimedia data that
includes text, graphics (sketches, drawings), images, animations, video, audio,
etc., and holds vast amounts of multisource multimedia data. The framework that
manages different types of multimedia data, which can be stored, delivered, and
utilized in different ways, is known as a multimedia database management system.
There are three classes of the multimedia database: static media, dynamic media,
and dimensional media.
Documents and record management : Industries and businesses that keep detailed
records and variety of documents. Example: Insurance claim record.
Knowledge dissemination : Multimedia database is a very effective tool for
knowledge dissemination in terms of providing several resources. Example:
Electronic books.
Education and training : Computer-aided learning materials can be designed using
multimedia sources which are nowadays very popular sources of learning.
Example: Digital libraries.
Marketing, advertising, retailing, entertainment and travel. Example: a virtual tour
of cities.
Web Database
A web database is a system for storing and displaying information that is
accessible from the Internet. It is a type of web application designed to be managed
and accessed through the Internet.
Web databases are ideal for situations where the information should be shared or
when it must be accessed from various locations or from different devices, such
as tablets, computers, and cell phones. Web databases can be used for a range of
different purposes, including membership databases, client lists, inventory
databases, and more.
In a web database, each field in a table has to have a defined data type, such as
numbers, strings, and dates. Proper database design involves choosing the correct
data type for each field to reduce memory consumption and improve performance.
Web databases can be accessed from anywhere by authorised users, allowing for
sharing and collaboration. Examples of web database software include Microsoft
Office Access, OpenOffice Base, Webex WebOffice database, FormLogix Web
database, and MySQL, which is a relational database management system often
used with web hosting for managing either personal or business website databases.
Spatial data support in a database is important for efficiently storing, indexing,
and querying data on the basis of spatial location. For example, suppose that we
want to store a set of polygons in a database and to query the database to find all
polygons that intersect a given polygon. We cannot use standard index structures,
such as B-trees or hash indices, to answer such a query efficiently. Efficient
processing of the above query would require special-purpose index structures, such
as R-trees for the task.
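An R-tree organizes objects by their bounding rectangles. The sketch below does not build an R-tree; it only shows, with a plain linear scan, the bounding-box test that such an index is designed to speed up. The polygon names and coordinates are made up for illustration.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def boxes_intersect(a, b):
    """True if two axis-aligned rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Hypothetical stored polygons.
polygons = {
    "park":   [(0, 0), (4, 0), (4, 3), (0, 3)],
    "lake":   [(3, 2), (6, 2), (6, 5), (3, 5)],
    "school": [(10, 10), (12, 10), (12, 12), (10, 12)],
}

query = [(2, 1), (5, 1), (5, 4), (2, 4)]
query_box = bounding_box(query)

# Filter step: keep only polygons whose bounding box meets the query's box.
candidates = sorted(
    name for name, poly in polygons.items()
    if boxes_intersect(bounding_box(poly), query_box)
)
print(candidates)
```

Overlapping bounding boxes are a necessary but not a sufficient condition, so the candidates would still need an exact polygon intersection test. An R-tree avoids scanning every polygon by grouping nearby bounding boxes hierarchically.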
Geographic data include road maps, land-usage maps, topographic elevation maps,
political maps showing boundaries, land-ownership maps, and so on. Geographic
information systems are special-purpose databases for storing geographic data.
Geographic data differ from design data in certain ways. Maps and satellite
images are typical examples of geographic data. Maps may provide not only
location information but also information associated with locations, such as
elevation, soil type, land type, and annual rainfall.
Raster data
Vector data
1. Raster data: Raster data consist of pixels, also known as grid cells, in two or
more dimensions. Examples include satellite images, digital pictures, and scanned
maps.
2. Vector data: Vector data consist of triangles, lines, and various other
geometrical objects in two dimensions, and cylinders, cuboids, and other
polyhedrons in three dimensions. Examples include building boundaries and roads.
Microsoft SQL Server: Spatial data has been supported since the 2008 version of
Microsoft SQL Server.
CouchDB: This is a document-based database in which spatial data is enabled by a
plugin called GeoCouch.
Neo4j database.
Features of Active Database:
1. It possesses all the concepts of a conventional database, i.e. data modelling facilities, query
language etc.
2. It supports all the functions of a traditional database like data definition, data
manipulation, storage management etc.
3. It supports definition and management of ECA (Event-Condition-Action) rules.
4. It detects event occurrence.
5. It must be able to evaluate conditions and to execute actions.
6. In other words, it has to implement rule execution.
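A trigger is one common way to express an event, a condition, and an action together. The sketch below uses an SQLite trigger through Python's sqlite3 module; the stock and reorder_log tables and the threshold of 10 are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (item TEXT PRIMARY KEY, quantity INTEGER);
    CREATE TABLE reorder_log (item TEXT, quantity INTEGER);

    -- Event: an UPDATE of quantity. Condition: the WHEN clause.
    -- Action: the INSERT inside BEGIN ... END.
    CREATE TRIGGER low_stock
    AFTER UPDATE OF quantity ON stock
    WHEN NEW.quantity < 10
    BEGIN
        INSERT INTO reorder_log VALUES (NEW.item, NEW.quantity);
    END;

    INSERT INTO stock VALUES ('pen', 50);
""")

# Event occurs, condition is false: no action is executed.
conn.execute("UPDATE stock SET quantity = 30 WHERE item = 'pen'")
# Event occurs, condition is true: the action is executed.
conn.execute("UPDATE stock SET quantity = 5 WHERE item = 'pen'")

log = conn.execute("SELECT item, quantity FROM reorder_log").fetchall()
print(log)
```

The application only updates the stock table; the database itself detects the event, evaluates the condition, and executes the action.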
Examples of Active Databases:
1. Real-time Databases
2. In-Memory Databases
3. Transactional Databases
4. Time-series Databases
1.Real-time Databases:
Oracle TimesTen: A relational database that runs in memory and is intended for real-
time applications that need response times of less than one millisecond.
VoltDB: A lightning-fast in-memory database for instantaneous analytics and data
processing.
2.In-Memory Databases:
SAP HANA: A column-oriented, in-memory relational database management system for
processing large amounts of data and real-time analytics.
MemSQL: Uses in-memory processing for real-time data insights, combining analytics
and transactions on a single platform.
3.Transactional Databases:
MySQL Cluster: Offers automatic sharding and synchronous replication for high
availability and real-time data access.
Microsoft SQL Server with Always On: High availability and disaster recovery are
provided by Microsoft SQL Server with Always On, which enables real-time read access
to replicated databases.
4.Time-series Databases:
InfluxDB: For time-stamped data, InfluxDB is designed to withstand heavy write and
query loads. It is frequently utilized in IoT and monitoring applications.
Prometheus: A toolkit for alerting and monitoring that keeps track of time series data
and is used to analyze and monitor systems in real time.
These databases and platforms support a variety of real-time data handling
requirements, including high-throughput stream processing, low-latency
transaction processing, and event-driven architectures.
Advantages :
1. Enhances traditional database functionalities with powerful rule processing capabilities.
2. Enables a uniform and centralized description of the business rules relevant to the
information system.
3. Avoids redundancy of checking and repair operations.
4. Suitable platform for building large and efficient knowledge base and expert systems.
UNIT- 4
A transaction refers to a sequence of one or more operations (such as read, write, update, or
delete) performed on the database as a single logical unit of work. A transaction ensures that
either all the operations are successfully executed (committed) or none of them take effect
(rolled back). Transactions are designed to maintain the integrity, consistency and reliability of
the database, even in the case of system failures or concurrent access.
All database access operations that occur between the begin-transaction and end-transaction
statements are considered a single logical transaction. During the transaction the database may be
temporarily inconsistent. Only once the transaction is committed does the database move from one
consistent state to another.
Properties of Transaction
As transactions deal with accessing and modifying the contents of the database, they must have
some basic properties which help maintain the consistency and integrity of the database before
and after the transaction. Transactions follow 4 properties, namely, Atomicity, Consistency,
Isolation, and Durability.
Generally, these are referred to as ACID properties of transactions in DBMS. ACID is the
acronym used for transaction properties. A brief description of each property of the transaction is
as follows.
Atomicity
Atomicity is achieved through commit and rollback operations, i.e. changes are made to the
database only if all operations related to a transaction are completed, and if it gets interrupted,
any changes made are rolled back using rollback operation to bring the database to its last saved
state.
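Commit and rollback can be sketched with Python's built-in sqlite3 module. The account table, the balances, and the transfer function are assumptions made for illustration; the CHECK constraint is used here only to force a failure partway through a transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount between accounts; commit both updates or neither."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # undo the credit that had already succeeded
        return False


ok = transfer(conn, "A", "B", 30)        # both updates succeed and are committed
failed = transfer(conn, "A", "B", 500)   # debit violates CHECK, so all is undone
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to B had already been applied, but the rollback removes it, so the database returns to the state left by the last commit.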
Consistency
This property of a transaction keeps the database consistent before and after a transaction is
completed.
Execution of any transaction must ensure that after its execution, the database is either in its prior
stable state or a new stable state.
In other words, the result of a transaction should be the transformation of a database from one
consistent state to another consistent state.
Consistency, here means, that the changes made in the database are a result of logical operations
only which the user desired to perform and there is not any ambiguity.
Isolation
This property states that two transactions must not interfere with each other, i.e. if some data is
used by a transaction for its execution, then any other transaction can not concurrently access
that data until the first transaction has completed. It ensures that the integrity of the database is
maintained and we don't get any ambiguous values. Thus, any two transactions are isolated from
each other.
Example:
Transaction 1: Withdraw $100 from account A.
Durability
This property ensures that the changes made to the database after a transaction is completely
executed, are durable.
It indicates that permanent changes are made by the successful execution of a transaction.
In the event of any system failures or crashes, the consistent state achieved after the completion
of a transaction remains intact. The recovery subsystem of DBMS is responsible for enforcing
this property.