DBMS Expt 4

DBMS Module 4

Syllabus
Introduction to database Systems, advantages of database system over traditional
file system, Basic concepts & Definitions, Database users, Database Language,
Database System Architecture, Schemas, Sub Schemas, & Instances, database
constraints, 3-level database architecture, Data Abstraction, Data Independence,
Mappings, Structure, Components & functions of DBMS, Data models.

Entity relationship model, Components of ER model, Mapping E-R model to
Relational schema, Network and Object Oriented Data models. Relational Algebra,
Tuple & Domain Relational Calculus, Relational Query Languages: SQL and QBE.
Query processing and optimization: Evaluation of Relational Algebra Expressions,
Query optimization, Query cost estimation.

Database Design: Database development life cycle (DDLC), Automated design tools,
Functional dependency and Decomposition, Join strategies, Dependency
Preservation & lossless Design, Normalization, Normal forms: 1NF, 2NF, 3NF, and
BCNF, Multi-valued Dependencies, 4NF & 5NF.

Transaction processing and concurrency control: Transaction concepts, properties of
transaction, concurrency control, locking and Timestamp methods for concurrency
control schemes. Database Recovery System, Types of Data Base failure & Types of
Database Recovery, Recovery techniques. Fundamental concepts of advanced
databases.

Storage Strategies: Detailed Storage Architecture, RAID

Transaction Processing and Concurrency Control:


1. Transaction Concepts:

Definition: A transaction is a sequence of one or more operations that are
executed as a single unit of work. Transactions ensure data integrity and
consistency in a database.

ACID Properties:

Atomicity: Transactions are treated as a single, indivisible unit. Either all
operations are executed, or none are.

Consistency: Transactions take the database from one consistent state to
another.

Isolation: Concurrently executing transactions do not interfere with each
other; each behaves as if it were running alone.

Durability: Once a transaction is committed, its effects persist even in the
face of system failures.
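
Atomicity can be illustrated with a small sketch using Python's built-in sqlite3 module. The `accounts` table, the `transfer` helper, and the account ids are assumptions made up for this example; they are not part of the notes. Using the connection as a context manager commits when the block succeeds and rolls back when an exception escapes it.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic unit of work (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)).fetchone()
            if balance < 0:
                # Consistency rule for this example: no negative balances.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current balances as a dict of id -> balance."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))

# Illustrative in-memory database with two accounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()
```

A transfer that would overdraw account A leaves both balances untouched, because the debit and the credit are rolled back together.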

2. Properties of Transactions:

Consistency: Ensures that the database remains in a consistent state before
and after the transaction.

Isolation: Transactions should be isolated from each other, preventing
interference between concurrently executing transactions.

Durability: Changes made by committed transactions should be permanent and
survive system failures.

3. Concurrency Control:

Definition: Concurrency control is the process of managing simultaneous
execution of transactions in a multi-user database environment to maintain
consistency and isolation.

Challenges:

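Lost Update: Two transactions read the same value and both write it back, so
one update overwrites the other.

Inconsistent Retrievals: A transaction reads several values while another
transaction is updating them, so it sees a mix of old and new data.

Uncommitted Data (Dirty Read): Accessing data modified by a transaction that
has not yet committed and may still roll back.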
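
The lost update problem is easiest to see with a fixed interleaving. The sketch below is a hand-written simulation, not a real DBMS: the `balance` item and the two function names are assumptions for illustration. Two transactions each adjust the same balance, once interleaved with stale reads and once serially.

```python
def run_interleaved(initial):
    """T1 adds 50 and T2 subtracts 30, both working from a stale read."""
    db = {"balance": initial}
    t1_read = db["balance"]         # T1 reads the balance
    t2_read = db["balance"]         # T2 reads the same value before T1 writes
    db["balance"] = t1_read + 50    # T1 writes its result
    db["balance"] = t2_read - 30    # T2 overwrites it: T1's update is lost
    return db["balance"]

def run_serial(initial):
    """The same two transactions executed one after the other."""
    db = {"balance": initial}
    db["balance"] = db["balance"] + 50   # T1 runs to completion
    db["balance"] = db["balance"] - 30   # T2 then sees T1's result
    return db["balance"]
```

Starting from 100, the serial run ends at 120 while the interleaved run ends at 70, which is the kind of anomaly concurrency control is meant to prevent.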

4. Locking for Concurrency Control:

Definition: Locking is a method to control access to shared resources by
acquiring locks on those resources.

Types of Locks:

Shared Lock: Allows multiple transactions to read a resource but prevents
any of them from writing to it.

Exclusive Lock: Permits only the transaction holding the lock to read or write
the resource; all other transactions are blocked.
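
The compatibility rules for the two lock types can be sketched as a tiny lock table. This is a simplified, single-threaded illustration: the `LockTable` class and the mode names `"S"` and `"X"` are assumptions, and a real lock manager would queue waiting transactions instead of returning False.

```python
class LockTable:
    """Grants shared ("S") and exclusive ("X") locks on named resources."""

    def __init__(self):
        self._locks = {}  # resource -> (mode, set of holding transactions)

    def acquire(self, txn, resource, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        if mode not in ("S", "X"):
            raise ValueError("mode must be 'S' or 'X'")
        held = self._locks.get(resource)
        if held is None:
            self._locks[resource] = (mode, {txn})
            return True
        held_mode, holders = held
        if holders == {txn}:
            # The sole holder may keep its lock or upgrade from S to X.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self._locks[resource] = (new_mode, holders)
            return True
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False          # any combination involving X conflicts

    def release(self, txn, resource):
        """Drop txn's lock on resource, if it holds one."""
        held = self._locks.get(resource)
        if held and txn in held[1]:
            held[1].discard(txn)
            if not held[1]:
                del self._locks[resource]
```

Several readers can hold the resource at once, but a writer is only admitted when no other transaction holds any lock on it.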

Drawbacks:

Deadlocks: A situation where two or more transactions are unable to
proceed because each is waiting for the other to release a lock.
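
One common way to detect a deadlock is to look for a cycle in a wait-for graph, where an edge from T1 to T2 means T1 is waiting for a lock held by T2. The sketch below is a minimal depth-first search under that assumption; the function name and the dictionary representation are illustrative, not taken from the notes.

```python
def find_deadlock(wait_for):
    """Return a list of transactions forming a cycle, or None if there is none.

    wait_for maps each transaction to the set of transactions it waits on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(txn):
        color[txn] = GREY
        path.append(txn)
        for other in sorted(wait_for.get(txn, ())):
            state = color.get(other, WHITE)
            if state == GREY:
                # Reached a transaction already on the path: that is a cycle.
                return path[path.index(other):]
            if state == WHITE:
                cycle = visit(other)
                if cycle:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return None

    for txn in sorted(wait_for):
        if color.get(txn, WHITE) == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None
```

Once a cycle is found, a system would typically abort one transaction in it (the victim) so the others can proceed.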

5. Timestamp Methods for Concurrency Control:

Definition: Timestamp-based concurrency control assigns a unique timestamp
to each transaction and records read and write timestamps on each data item.
It uses these timestamps to determine the order of transactions.

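Two-Phase Locking (2PL), a lock-based protocol listed here for comparison:

Growing Phase: The transaction acquires locks and releases none.

Shrinking Phase: The transaction releases locks and acquires none.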
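
The two-phase rule can be checked mechanically: once a transaction has released any lock, it must not acquire another. The sketch below assumes a schedule written as `(transaction, action, item)` tuples; that format and the function name are made up for this example.

```python
def obeys_two_phase_locking(ops):
    """Return True if no transaction locks an item after it has unlocked one.

    ops is a list of (txn, action, item) tuples with action "lock" or "unlock".
    """
    shrinking = set()  # transactions that have entered their shrinking phase
    for txn, action, _item in ops:
        if action == "unlock":
            shrinking.add(txn)
        elif action == "lock":
            if txn in shrinking:
                return False  # a lock request after a release violates 2PL
        else:
            raise ValueError(f"unknown action: {action}")
    return True
```

For example, lock A, lock B, unlock A, unlock B is two-phase, whereas lock A, unlock A, lock B is not.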

Timestamp Ordering:

Transactions are ordered based on their timestamps.

Older transactions have priority over newer transactions.

Concurrency Control Protocols:

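Timestamp-Ordering Protocol: Ensures that conflicting operations are executed
in timestamp order.

Thomas Write Rule: If the timestamp of the writing transaction is smaller than
the write timestamp of the data item, the obsolete write is skipped instead of
rolling the transaction back. If it is smaller than the read timestamp of the
data item, the write is rejected and the transaction is rolled back.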
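
A minimal sketch of basic timestamp ordering, with the Thomas Write Rule as an option, might look as follows. It assumes each data item keeps a read timestamp and a write timestamp; the `Item` class, the string results, and the `thomas` flag are illustrative choices, not a standard API.

```python
class Item:
    """A data item with the largest read and write timestamps seen so far."""

    def __init__(self, value=None):
        self.value = value
        self.read_ts = 0
        self.write_ts = 0

def read(item, ts):
    """Attempt a read by the transaction with timestamp ts."""
    if ts < item.write_ts:
        return "abort"   # a younger transaction has already overwritten the item
    item.read_ts = max(item.read_ts, ts)
    return "ok"

def write(item, ts, value, thomas=True):
    """Attempt a write by the transaction with timestamp ts."""
    if ts < item.read_ts:
        return "abort"   # a younger transaction has already read the old value
    if ts < item.write_ts:
        # Obsolete write: skipped under the Thomas Write Rule,
        # rejected under basic timestamp ordering.
        return "skip" if thomas else "abort"
    item.value = value
    item.write_ts = ts
    return "ok"
```

The only difference between the two variants is the outdated-write case: the Thomas Write Rule ignores the write and lets the transaction continue.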

6. Optimistic Concurrency Control:

Definition: Assumes that conflicts between transactions are rare. Transactions
are executed without locks, and conflicts are checked at the end.

Validation Phase:

Transactions are executed without acquiring locks.

A validation phase checks if any conflicts occurred.

If conflicts are found, transactions are rolled back and re-executed.
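
The validate-at-commit idea can be sketched with per-item version numbers. This is one simple way to implement validation, assumed here for illustration; the `OptimisticStore` class and its method names are hypothetical. Each transaction remembers the version of every item it read and commits only if none of them has changed.

```python
class OptimisticStore:
    """Key-value store with version-based validation at commit time."""

    def __init__(self, data):
        self._data = dict(data)
        self._version = {key: 0 for key in data}

    def begin(self):
        """Start a transaction: a private read set and write set."""
        return {"reads": {}, "writes": {}}

    def read(self, txn, key):
        if key in txn["writes"]:
            return txn["writes"][key]          # read your own pending write
        txn["reads"].setdefault(key, self._version[key])
        return self._data[key]

    def write(self, txn, key, value):
        txn["writes"][key] = value             # buffered until commit

    def commit(self, txn):
        """Validate the read set, then apply the writes. Return False on conflict."""
        for key, seen_version in txn["reads"].items():
            if self._version[key] != seen_version:
                return False                   # someone committed a newer version
        for key, value in txn["writes"].items():
            self._data[key] = value
            self._version[key] = self._version.get(key, 0) + 1
        return True
```

When `commit` returns False, the caller discards the transaction and re-executes it from `begin`, which corresponds to the rollback and re-execution step above.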

7. Multi-Version Concurrency Control (MVCC):

Definition: Allows multiple versions of a data item to coexist in the database.

Read Transactions:

Read the latest version of the data item that was committed at or before their
timestamp.

Write Transactions:

Create a new version of the data item.

Existing transactions continue to read the old version.
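
A toy version store shows how readers and writers avoid blocking each other under MVCC. The sketch assumes a single logical clock and treats every write as immediately committed; the `MVCCStore` class and its methods are hypothetical names used only for this illustration.

```python
import bisect

class MVCCStore:
    """Keeps every committed version of each key, tagged with a commit timestamp."""

    def __init__(self):
        self._versions = {}   # key -> list of (commit_ts, value), in ts order
        self._clock = 0

    def begin(self):
        """Return a snapshot timestamp for a new reading transaction."""
        return self._clock

    def write(self, key, value):
        """Append a new version instead of overwriting the old one."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before snapshot_ts."""
        versions = self._versions.get(key, [])
        timestamps = [ts for ts, _ in versions]
        i = bisect.bisect_right(timestamps, snapshot_ts)
        return versions[i - 1][1] if i else None
```

A reader that began before a later write keeps seeing the older version, while a transaction that begins afterwards sees the new one.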

Concurrency control methods, such as locking and timestamp-based approaches,
are crucial for managing concurrent transactions in a database system, ensuring
data consistency and preventing conflicts. Each method has its strengths and
weaknesses, and the choice depends on the specific requirements and
characteristics of the application.

Database Recovery System:

1. Need for Recovery:


Database Consistency: Ensures the database remains in a consistent and valid
state despite system failures.

Transaction Atomicity: Guarantees that all changes by a transaction are
applied or none in the case of errors.

Data Durability: Ensures committed changes persist even if the system
crashes.

2. Redo and Undo Operations:


Redo Operation:

Definition: Reapplies changes from the log to the database during recovery.

Purpose: Ensures committed changes, not yet written to the database, are
reinstated after a crash.

Undo Operation:

Definition: Reverts changes made by an incomplete or aborted transaction.

Purpose: Restores the database to a consistent state by undoing
uncommitted transactions.

3. Types of Database Failures:


Transaction Failures:

Abort (Logical Error): Rolling back a transaction due to a logical mistake.

System Crash: The entire system stops (hardware, software, or power failure)
and the contents of main memory are lost.

Storage Failures:

Media Failure: Physical damage to storage media like a disk failure.

Concurrency Control Failures:

Deadlock: Transactions are stuck waiting for each other.

4. Types of Database Recovery:


Forward Recovery (Roll-Forward):

Definition: Restores the database by reapplying committed transactions from
a log.

Usage: After a crash, brings the database forward to a consistent state.

Backward Recovery (Rollback):

Definition: Reverts the database back to a consistent state by undoing
uncommitted transactions.

Usage: After an abort or logical error, maintains consistency.

5. Recovery Techniques:
Shadow Paging:

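Definition: Maintains two page tables, a current one and a shadow copy that is
left unchanged during the transaction.

Process: Changes are written to new pages through the current page table,
which replaces the shadow copy at commit.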

Write-Ahead Logging (WAL):

Definition: Logs changes before applying them to the database.

Process: Log records are written before changes are made, crucial for
recovery.

Checkpointing:

Definition: Periodically saves the database state.

Process: Reduces recovery work by providing a known consistent state.

Immediate Update:

Definition: Applies changes to the database as they occur, before the
transaction commits.

Process: Changes are immediately reflected, so recovery must be able to undo
the changes of transactions that did not commit.

Deferred Update:

Definition: Delays applying changes; stores them in a transaction log.

Process: Changes are stored in the log and applied to the database only after
the transaction commits, so no undo is needed.

ARIES (Algorithm for Recovery and Isolation Exploiting Semantics):

Definition: Combines write-ahead logging and phases for crash recovery.

Phases: Analysis, Redo, Undo.

6. Log-Based Recovery Steps (WAL):


Analysis Phase:

Identifies the most recent checkpoint.

Scans the log from the checkpoint to determine which transactions committed
and which were still active at the time of the crash.

Redo Phase:

Reapplies changes made by committed transactions since the last
checkpoint.

Undo Phase:

Undoes changes made by incomplete transactions.
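
The three phases can be sketched over a simple in-memory log. This is a simplified illustration, not ARIES: it assumes a made-up log format of `("update", txn, key, old, new)` and `("commit", txn)` records, scans the whole log instead of starting from a checkpoint, and assumes no transaction overwrote another transaction's uncommitted data.

```python
def recover(log, disk):
    """Return the database state after crash recovery.

    log  -- list of ("update", txn, key, old_value, new_value) and ("commit", txn)
    disk -- dict holding whatever state had reached disk when the crash happened
    """
    db = dict(disk)

    # Analysis: decide which transactions committed before the crash.
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Redo: reapply committed changes in log order (forward scan).
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _txn, key, _old, new = rec
            db[key] = new

    # Undo: revert changes of incomplete transactions (backward scan).
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _txn, key, old, _new = rec
            db[key] = old

    return db
```

In the usage below, T1 committed but its change never reached disk, while T2's uncommitted change did; recovery reinstates the first and removes the second.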

7. Optimistic Concurrency Control:


Definition: Assumes conflicts between transactions are rare; checks conflicts at
the end.

Validation Phase: Checks for conflicts after transaction execution. Rolls back and
re-executes if conflicts exist.

8. Multi-Version Concurrency Control (MVCC):


Definition: Allows multiple versions of a data item in the database.

Read Transactions: Read the latest version committed at or before their
timestamp.

Write Transactions: Create a new version; existing transactions continue reading
the old one.

Fundamental Concepts of Advanced Databases:

1. Distributed Databases:

Definition: A database distributed over multiple locations, allowing data to
be stored and processed on different computers.

Key Concepts:

Data Distribution

Replication

Transaction Management in a Distributed Environment

2. NoSQL Databases:

Definition: Non-relational databases that provide a flexible schema and are
designed for distributed data.

Key Concepts:

Document-oriented (e.g., MongoDB)

Key-Value Stores (e.g., Redis)

Column-Family Stores (e.g., Apache Cassandra)

Graph Databases (e.g., Neo4j)

3. Big Data:

Definition: Handling and analyzing large and complex datasets that exceed
the capabilities of traditional databases.

Key Concepts:

Volume, Velocity, Variety, Veracity, Value

Distributed Computing (e.g., Hadoop)

Data Warehousing

4. Data Warehousing:

Definition: Centralized repository for storing and managing large volumes of
data from various sources.

Key Concepts:

ETL (Extract, Transform, Load) Processes

Data Marts

OLAP (Online Analytical Processing)

5. Data Mining:

Definition: Extracting patterns and knowledge from large datasets.

Key Concepts:

Clustering

Classification

Association Rule Mining

6. Blockchain Databases:

Definition: A decentralized and distributed ledger that records transactions
across multiple computers.

Key Concepts:

Cryptocurrency (e.g., Bitcoin)

Smart Contracts

Immutability

7. In-Memory Databases:

Definition: Data stored in the system's main memory (RAM) for faster
access and retrieval.

Key Concepts:

Reduced Latency

Increased Throughput

Real-Time Analytics

8. Temporal Databases:

Definition: Databases that support the storage and retrieval of historical
data, including data at different points in time.

Key Concepts:

Valid Time vs. Transaction Time

Temporal Queries

Temporal Constraints

9. Spatial Databases:

Definition: Databases that manage spatial data and support spatial queries.

Key Concepts:

Geometric Data Types (e.g., Points, Lines, Polygons)

Spatial Indexing

Spatial Operations

10. Data Encryption and Security:

Definition: Implementing measures to secure data and protect it from
unauthorized access.

Key Concepts:

Encryption Algorithms

Access Control

Auditing and Logging

11. Advanced Query Optimization:

Definition: Techniques for optimizing the performance of database queries.

Key Concepts:

Query Rewriting

Cost-Based Optimization

Parallel Query Execution

12. Machine Learning in Databases:

Definition: Integrating machine learning algorithms and models within
databases for data analysis and decision-making.

Key Concepts:

Predictive Modeling

Anomaly Detection

Recommendation Systems

Storage Strategies: Detailed Storage Architecture and RAID

1. Detailed Storage Architecture:
A. Storage Hierarchy:

Registers: Fastest and smallest storage directly accessible by the CPU.

Cache Memory: Provides high-speed access to frequently used data.

Primary Memory (RAM): Stores data and programs currently in use.

Secondary Storage (HDDs, SSDs): Non-volatile storage for long-term data
retention.

B. Disk Storage Architecture:

Platters: Circular disks coated with a magnetic material.

Read/Write Heads: Move over platters to read and write data.

Cylinders, Tracks, Sectors: Disk subdivision for data organization.

Controller: Manages data transfer between the disk and the computer.

C. File Systems:

Definition: A method for organizing and storing data on storage devices.

Key Concepts:

File Allocation Table (FAT)

Master File Table (MFT)

Inodes (used in UNIX-like systems)

D. Virtual Memory:

Definition: An extension of the computer's physical memory into secondary
storage.

Paging and Swapping: Techniques to manage virtual memory.

2. RAID (Redundant Array of Independent Disks):


A. RAID Levels:

RAID 0 (Striping):

Description: Data is divided into blocks and written across multiple disks.

Advantages: Improved performance due to parallel access.

Drawbacks: No data redundancy; if one disk fails, data is lost.

RAID 1 (Mirroring):

Description: Data is mirrored on two or more disks.

Advantages: Redundancy; if one disk fails, data is still available.

Drawbacks: Costlier as it requires double the storage capacity.

RAID 5 (Striping with Parity):

Description: Data and parity information are striped across multiple disks.

Advantages: Balances performance and redundancy; allows for a single
disk failure.

Drawbacks: Lower write performance due to parity calculation.

RAID 6 (Striping with Dual Parity):

Description: Similar to RAID 5, but with dual parity for enhanced fault
tolerance.

Advantages: Can tolerate the failure of two disks.

Drawbacks: Requires more storage capacity and has higher write overhead.

RAID 10 (Combination of RAID 1 and RAID 0):

Description: Mirroring combined with striping.

Advantages: Excellent performance and redundancy.

Drawbacks: Requires more disks and storage space.
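
The parity used by RAID 5 is a bytewise XOR of the data blocks in a stripe, which is why any single missing block can be recomputed from the rest. The sketch below is a simplified illustration: the function names are made up, and parity is always kept in the last position, whereas real RAID 5 rotates it across the disks.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the bytewise XOR of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)
```

The same `rebuild` call works whether the lost block held data or parity, since XOR-ing all remaining blocks always yields the missing one.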

B. RAID Controllers:

Definition: Hardware or software that manages the RAID array.

Functions:

Disk Striping

Parity Calculation

Fault Detection and Correction

C. Hot Spare:

Definition: An extra disk kept in the RAID array that automatically replaces a
failed disk.

Advantages: Minimizes downtime and reduces the risk of data loss.

D. RAID Considerations:

Performance: RAID 0 and RAID 10 offer high performance.

Redundancy: RAID 1, RAID 5, and RAID 6 provide varying levels of data
redundancy.

Capacity Utilization: RAID 5 and RAID 6 use parity for efficient capacity
utilization.
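
The capacity trade-off between the levels can be worked out with a short calculation. The sketch below assumes identical disks, treats RAID 1 as every disk holding the same copy, and treats RAID 10 as striped two-way mirrors; the function name and the string level labels are illustrative.

```python
def usable_capacity(level, disks, disk_size):
    """Return usable capacity, in the same unit as disk_size, for identical disks."""
    rules = {
        # level: (minimum disks, number of disks' worth of usable data)
        "0": (2, lambda n: n),        # striping only, no redundancy
        "1": (2, lambda n: 1),        # every disk holds the same copy
        "5": (3, lambda n: n - 1),    # one disk's worth of parity
        "6": (4, lambda n: n - 2),    # two disks' worth of parity
        "10": (4, lambda n: n // 2),  # striped two-way mirrors
    }
    if level not in rules:
        raise ValueError(f"unsupported RAID level: {level}")
    min_disks, data_disks = rules[level]
    if disks < min_disks:
        raise ValueError(f"RAID {level} needs at least {min_disks} disks")
    if level == "10" and disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return data_disks(disks) * disk_size
```

With four 2 TB disks, this gives 8 TB for RAID 0, 6 TB for RAID 5, and 4 TB for both RAID 6 and RAID 10.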

E. RAID for SSDs:

Trim Support: Important for maintaining SSD performance.

Wear Leveling: Distributes write and erase cycles evenly across SSD cells.
