Database Notes

The document outlines the history and evolution of database systems, highlighting key developments from personal computers to the internet and the importance of databases in technology. It discusses the limitations of file-based systems, the advantages of Database Management Systems (DBMS), and the characteristics of databases including their structure and the role of metadata. Additionally, it covers the components of a database system, security measures, and the relational model introduced by E.F. Codd.

Uploaded by

hananikawaii27
Copyright
© All Rights Reserved

CHAPTER 1

History:​

How did we get here in the internet world?


Personal Computers
- 1977: Apple II
- 1981: IBM PC

Local Area Networks
- Ethernet networking technology
- Early 1970s: Xerox Palo Alto Research Center
- 1983: U.S. national standard

The Importance of Database in the Technology world?

Internet
- 1969: Advanced Research Projects Agency Network (ARPANET)

World Wide Web (WWW)
- 1993: First widely used web browser (Mosaic) available; Netscape Navigator followed in 1994
- Mid 1990s: Online retail sites
- 1995: Amazon, Best Buy

Early 2000s: Web 2.0
- Tablets
- Apps
- Internet of Things (IoT)

Mid 1970s: Mobile Phone
- 2007: Apple iPhone
- 2008: Google Android operating system

- Databases are important because they are used daily: on Instagram (posts, likes), Twitter (tweets), and online shopping sites (Amazon).
History of Database

1.​ File-Based Systems

• Collection of application programs that perform services for the end users (e.g. reports).
• Each program defines and manages its own data.
Limitations of File-Based Approach

1. Separation and isolation of data

• Each program maintains its own set of data.
• Users of one program may be unaware of potentially useful data held by other programs.

2. Duplication of data

• Same data is held by different programs.
• Wasted space and potentially different values and/or different formats for the same item.

3. Data dependence

• File structure is defined in the program code.

4. Incompatible file formats

• Programs are written in different languages, and so cannot easily access each other’s files.

5. Fixed Queries/Proliferation of application programs

• Programs are written to satisfy particular functions.
• Any new requirement needs a new program.

Why did the database approach arise?

1. Descriptions of data (metadata) were embedded in application programs, rather than being stored separately & independently.
2. There was no control over access to & manipulation of data beyond that imposed by application programs.
The Result: Database Management System (DBMS)
Why Database?

★ Allow the storage of data in one place and eliminate duplications.
★ Allow the sharing of data.
★ Data is stored in tables, which have rows and columns like a spreadsheet. A database may have multiple tables, where each table stores data about a different thing.
★ Each row in a table stores data about an occurrence or instance of the thing of interest.
★ A database stores data and relationships.

What is a Database?

● A database is a self-describing collection of integrated tables.
● The tables are called integrated because they store data about the relationships between the rows of data.
● A database is called self-describing because it stores a description of itself.
● The self-describing data is called metadata (also called the data dictionary or system catalog). Metadata is data about data: it describes the structure of the data held in the database.
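
As a small illustration of self-describing data, the sketch below asks a DBMS for its own metadata. It uses Python's built-in sqlite3 module only because it runs without a server; the Student and Class tables are made-up examples, and other DBMS products expose their system catalog under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two hypothetical tables, invented for this sketch.
conn.execute("CREATE TABLE Student (studentNo INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE Class (classNo INTEGER PRIMARY KEY, title TEXT)")

# sqlite_master is SQLite's system catalog: its rows describe the tables themselves.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, dflt_value, pk).
student_columns = [row[1] for row in conn.execute("PRAGMA table_info(Student)")]

print(tables)
print(student_columns)
```

The point of the sketch is that the description of the tables is itself stored as data and can be queried like any other table.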

The Characteristics of Databases

● The purpose of a database is to help people track things of interest to them.
● Data is stored in tables, which have rows and columns like a spreadsheet. A database may have multiple tables, where each table stores data about a different thing.
● Each row in a table stores data about an occurrence or instance of the thing of interest.
● A database stores data and relationships.

Typical Metadata Tables

Databases Create Information

Data: recorded facts & figures
Information: knowledge derived from data

- Databases record data, but they do so in such a way that we can produce information from the data.
- The data on STUDENTs, CLASSes, GRADEs could produce information about each student’s GPA.
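
A minimal sketch of turning data into information, assuming a simplified, unweighted GPA (the plain average of grade points). The column names and sample rows are invented for illustration, and the query uses a join and GROUP BY, which go slightly beyond the simple queries covered in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (studentNo INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE GRADE (studentNo INTEGER, classNo TEXT, gradePoints REAL);
INSERT INTO STUDENT VALUES (1, 'Ann'), (2, 'Ben');
INSERT INTO GRADE VALUES (1, 'DB101', 4.0), (1, 'CS102', 3.0), (2, 'DB101', 2.0);
""")

# The stored rows are data; the per-student average derived from them is information.
gpa = conn.execute("""
    SELECT s.name, AVG(g.gradePoints) AS gpa
    FROM STUDENT s JOIN GRADE g ON g.studentNo = s.studentNo
    GROUP BY s.studentNo, s.name
    ORDER BY s.name
""").fetchall()

print(gpa)
```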
The Range of Databases Systems

The Components of a Database System

….With SQL

Applications, DBMS, SQL

● Applications are the computer programs that users work with. An application interacts with the database by issuing an appropriate request (an SQL statement) to the DBMS. Example: an online shop application sends a SELECT statement to the DBMS to list the products in a category.
● The Database Management System (DBMS) is a software system that enables users to create, process, and administer databases.
● Structured Query Language (SQL) is an internationally recognized standard database language that is used by all commercial DBMSs.

Prominent DBMS products


DBMS Power vs. Ease of Use

DBMS Disadvantages
Complexity
• Failure to understand DBMS functionality can lead to bad database design decisions.

Size
• A DBMS can occupy megabytes of disk space and requires a substantial amount of memory.

Cost of DBMS
• Cost varies from hundreds to millions.

Performance
• A DBMS caters for many applications, so some applications may not run as fast as they used to.

Higher impact of a failure
• Because resources are centralized, the failure of certain DBMS components can delay business operations.

DBMS Advantages

Improved data integrity
• Integrity constraints are rules that the database is not permitted to violate (e.g. salary cannot be greater than 40,000).

Improved security
• Allows access only to authorized users.

Increased concurrency
• Allows two or more users to access the same file simultaneously without interfering with each other.

Improved backup & recovery services
• Provides facilities to protect data from failures of the computer system by performing backup and recovery.
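
The salary rule mentioned under data integrity can be sketched as a CHECK constraint. This is only an illustration using Python's sqlite3 module; the Staff table layout and the sample values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Staff (
        staffNo TEXT PRIMARY KEY,
        salary  REAL CHECK (salary <= 40000)  -- the integrity rule lives in the database
    )
""")
conn.execute("INSERT INTO Staff VALUES ('S1', 30000)")  # satisfies the rule

try:
    conn.execute("INSERT INTO Staff VALUES ('S2', 50000)")  # violates the rule
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Staff").fetchone()[0]
print(rejected, row_count)
```

Because the DBMS enforces the rule, every application that writes to the table is held to it without repeating the check in application code.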
History of Database Systems

1st generation
- Hierarchical & Network

2nd generation
- Relational

3rd generation
- Object-Relational
- Object-Oriented

Types of Databases

Database Design

- (as a process) is the creation of the proper structure of database tables, the proper relationships between tables, appropriate data constraints, and other structural components of the database

Types of Database Design


CHAPTER 2: DATABASE ENVIRONMENT
A database system refers to an organization of components that define & regulate the collection, storage, management, and use of data within a database environment.

1. Hardware

- Can range from a PC to a network of computers
- All of the system's physical devices, including computers (PCs, tablets, workstations, servers, and supercomputers), storage devices, printers, network devices (hubs, switches, routers, fiber optics), and other devices (automated teller machines, ID readers, and so on)

2. Software

- Collection of programs used within the database system
- Includes the operating system, DBMS software, application programs & utilities
- The operating system manages all hardware components and makes it possible for all other software to run on the computers. e.g: UNIX, Windows.
- DBMS software manages the database within the database system. e.g: Oracle Corporation's ORACLE, IBM's DB2, MySQL (formerly owned by Sun, now by Oracle), Microsoft's MS Access and SQL Server.
- Application programs and utilities are used to access and manipulate the data in the database and to manage the operating environment of the database.

3. Data

- Acts as a bridge between the machine components and the human components
- Covers the collection of facts stored in the database
- The database contains both the operational data & the metadata - the “data about data”

4. Procedure

- Instructions & rules that govern the design & the use of the database
- These may consist of instructions on how to:
● Log on to the DBMS.
● Use a particular DBMS facility or application program.
● Start and stop the DBMS.
● Make backup copies of the database.
● Handle hardware or software failures.

5. People

- All users associated with the database system.
- Five types of users in a database system:

● System Administrators - oversee the database system's general operations.
● Database Administrators - physically implement the database according to the logical design and perform the maintenance of the database system.
● Database Designers/Database Architects - prepare the conceptual design from the requirements.
● System Analysts and Programmers/Application Developers - design and implement the application programs by creating the input screens, reports, and procedures through which end users access and manipulate the database.
● End Users - the people who use the application. e.g: in a banking system, the employees and the customers using an ATM or online banking facility are end users.

3-level ANSI-SPARC Architecture

- Defines three levels of abstraction at which a data item can be described.
- This design standard for a Database Management System (DBMS) was first proposed in 1975.
- The objective of the three-level architecture is to separate each user’s view of the database from the way the database is physically represented.
- The levels form a three-level architecture comprising an external, a conceptual, and an internal level.
Database Administration

- All large & small databases need database administration
- Data administration: refers to a function concerning all of an organization’s data assets
- Database administrator (DBA): refers to a person or office specific to a single database and its applications [provide backup & recovery || maintain documentation]
DBA Responsibilities

Database Security
- Ensures that only authorized users can perform authorized activities at authorized times

Developing database security:

- Determine users’ processing rights and responsibilities.
- Enforce security requirements using security features from both DBMS and application programs.
- Protect the confidentiality, integrity, and availability [C.I.A] of data:
- Confidentiality: data is disclosed only to authorized users
- Integrity: the original data is protected from unauthorized change
- Availability: data is always accessible to authorized users

DBMS Security

- DBMS products provide security facilities.
- They limit certain actions on certain objects to certain users or groups (also called roles).
- Almost all DBMS products use some form of username and password security [authenticates credentials]
DBMS Security Guidelines

1. Run DBMS behind a firewall, but plan as though the firewall has been breached
2. Apply the latest operating system and DBMS service packs and fixes
3. Use the least functionality possible
• Support the fewest network protocols possible
• Delete unnecessary or unused system stored procedures
• Disable default logins and guest users, if possible
• Unless required, never allow all users to log on to the DBMS interactively
4. Protect the computer that runs the DBMS
• No user allowed to work at the computer that runs the DBMS
• DBMS computer physically secured behind locked doors
• Access to the room containing the DBMS computer should be recorded in a log
5. Manage accounts and passwords
• Use a low privilege user account for the DBMS service
• Protect database accounts with strong passwords
• Monitor failed login attempts
• Frequently check group and role memberships
• Audit accounts with null passwords
• Assign accounts the lowest privileges possible
• Limit DBA account privileges
6. Develop a security plan
• Plan for preventing and detecting security problems
• Create procedures for security emergencies and practice them

Application Security

• If DBMS security features are inadequate, additional security code could be written in the application program.
• Application security in Internet applications is often provided on the Web server computer.
• However, you should use the DBMS security features first.
• The closer the security enforcement is to the data, the less chance there is for infiltration.
• DBMS security features are faster, cheaper, and probably result in higher quality results than developing your own.
CHAPTER 3: THE RELATIONAL MODEL

- Introduced in 1970
- Created by E.F. Codd
➢ An IBM engineer
➢ The model used mathematics known as ‘relational algebra’
➢ Now the standard model for commercial DBMS products

Entity

Is an identifiable thing that users want to track:

-​ Customers
-​ Products
-​ Sales
-​ Transactions
Relation

- Relational DBMS products store data about entities in relations, which are a special type of table.
- A relation is a two-dimensional table that has the following characteristics:
• Rows contain data about an entity.
• Columns contain data about attributes of the entity.
• All entries in a column are of the same kind.
• Each column has a unique name.
• Cells of the table hold a single value.
• The order of the columns is unimportant.
• The order of the rows is unimportant.
• No two rows may be identical.

The Domain Integrity Constraint

• The requirement that all of the values in a column are of the same kind is known as the domain integrity constraint.
• The term domain means a grouping of data that meets a specific type definition.
– FirstName could have a domain of names such as Albert, Bruce, Cathy, David, Edith, and so forth.
– All values of FirstName must come from the names in that domain.
• Columns in different relations may have the same name.

Employee
To Key Or Not to KEY?

In a relation as defined by Codd:

- The rows of a relation must be unique.
- There is no requirement for a designated primary key.
- The requirement for unique rows implies that a primary key can be designated.
- In the “real world,” every relation has a primary key.
- When do we designate a primary key?
What makes Determinant Values unique?

• A determinant is unique in a relation if and only if it determines every other column in the relation.
• You cannot find the determinants of all functional dependencies simply by looking for unique values in one column:
• Data set limitations: a sample of data may show a pattern only by coincidence
• Must be logically a determinant
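
The "data set limitations" point can be sketched in a few lines of Python. The helper `determines` and the sample rows are hypothetical, written only for this illustration: the helper checks whether each value of one column maps to a single value of another within the rows it is given, which is all that data alone can show.

```python
def determines(rows, determinant, dependent):
    """Return True if, in these rows, each determinant value maps to one dependent value."""
    seen = {}
    for row in rows:
        key, value = row[determinant], row[dependent]
        # setdefault stores the first value seen for a key and returns the stored one.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows, invented for this sketch.
sample = [
    {"studentNo": 1, "name": "Ann", "city": "Leeds"},
    {"studentNo": 2, "name": "Ben", "city": "York"},
]

key_holds = determines(sample, "studentNo", "name")  # holds logically
accidental = determines(sample, "city", "name")      # holds only by coincidence

# One more row is enough to break the accidental pattern.
bigger = sample + [{"studentNo": 3, "name": "Cat", "city": "Leeds"}]
still_holds = determines(bigger, "city", "name")

print(key_holds, accidental, still_holds)
```

In the small sample, city appears to determine name, yet nothing about cities logically fixes a person's name, so the dependency has to be justified by the meaning of the data and not by the sample.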

Keys

A key is a combination of one or more columns that is used to identify rows in a relation.

Composite
- a key that consists of two or more columns.

Candidate
- a key that determines all of the other columns in a relation.

Primary
- is a candidate key selected as the primary means of identifying rows in a relation.
- There is only one primary key per relation.
- may be a composite key.
- The ideal primary key is short, numeric, and never changes.

Surrogate
- is an artificial column added to a relation to serve as a primary key.
• DBMS supplied
• Short, numeric, and never changes, which makes it an ideal primary key
• Has artificial values that are meaningless to users
• Normally hidden in forms and reports

Foreign
- A foreign key is the primary key of one relation that is placed in another relation to form a link between the relations.
- A foreign key can be a single column or a composite key.
- The term refers to the fact that key values are foreign to the relation in which they appear as foreign key values.
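
The key types above can be sketched with two small tables. This is an illustration under assumptions: the Department and Employee tables and their columns are invented, and SQLite (via Python's sqlite3 module) enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Department (
    deptId   INTEGER PRIMARY KEY,   -- surrogate key: DBMS supplied, meaningless to users
    deptName TEXT UNIQUE NOT NULL   -- candidate key that was not chosen as primary
);
CREATE TABLE Employee (
    empId  INTEGER PRIMARY KEY,
    name   TEXT,
    deptId INTEGER REFERENCES Department(deptId)  -- foreign key linking the relations
);
INSERT INTO Department (deptName) VALUES ('Sales');
INSERT INTO Employee (name, deptId) VALUES ('Ann', 1);
""")

try:
    # No Department row has deptId 99, so the link would point nowhere.
    conn.execute("INSERT INTO Employee (name, deptId) VALUES ('Ben', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

linked = conn.execute("""
    SELECT e.name, d.deptName
    FROM Employee e JOIN Department d ON d.deptId = e.deptId
""").fetchall()

print(fk_rejected, linked)
```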
CHAPTER 4: ERD

The Data Model

- Is a plan or a blueprint for a database design
- Is more generalized & abstract than a database design
- It is easier to change a data model than it is to change a database design, so it is the appropriate place to work through conceptual database problems.

Data Model = Conceptual Design

1. Conceptual Design (conceptual schema)
2. Logical Design (logical schema)
3. Physical Design (physical schema)

1. Conceptual Model

identifies the highest-level relationships between the different entities.
Features:
• Includes the important entities and the relationships among them.
• No attribute is specified.
• No primary key is specified.
2.​ Logical Model

A logical data model describes the data in as much detail as possible, without regard to how they will be physically implemented in the database.

Features:
• Includes all entities and relationships among them.
• All attributes for each entity are specified.
• The primary key for each entity is specified.
• Foreign keys (keys identifying the relationship between
different entities) are specified.
• Normalization occurs at this level.

The steps for designing the logical data model are as follows:
• Specify primary keys for all entities.
• Find the relationships between different entities.
• Find all attributes for each entity.
• Resolve many-to-many relationships.
• Normalization.

3.​ Physical Model

represents how the model will be built in the database.

• A physical database model shows all table structures, including column name, column data type, column constraints, primary key, foreign key, and relationships between tables.

Features:
• Specification of all tables and columns.
• Foreign keys are used to identify relationships between tables.
• De-normalization may occur based on user requirements.
• Physical considerations may cause the physical data model to be quite different from the logical data model.

The steps for physical data model design are as follows:

• Convert entities into tables.
• Convert relationships into foreign keys.
• Convert attributes into columns.
• Modify the physical data model based on physical constraints / requirements.

The Entity-Relationship (E-R) Model

is a set of concepts and graphical symbols that can be used to create conceptual schemas.

Versions:
– Original E-R model: by Peter Chen (1976)
– Extended E-R model: later extensions to the Chen model, which included subtypes
– The version in use today is referred to as the extended E-R model, and it is what this book means by the term E-R model

ENTITIES
Something that can be readily identified and that users want to track:
– Entity class: a collection of entities of a given type
– Entity instance: the occurrence of a particular entity
• There are usually many instances of an entity in an entity class.
Attributes
describe an entity’s characteristics.
• All entity instances of a given entity class have the same attributes, but vary in the values of those attributes.
• They were originally shown in data models as ellipses.
• Data modeling products today commonly show attributes in rectangular form.
Identifiers

are attributes that name, or identify, entity instances.

• The identifier of an entity instance consists of one or more of the entity’s attributes.
• Composite identifiers are identifiers that consist of two or more attributes.
• Identifiers in data models become keys in database designs.
– Entities have identifiers.
– Tables (or relations) have keys.

How to Draw an Entity

Entity Attributes Display


Relationships

Entities can be associated with one another in relationships:

Relationship Classes: associations among entity classes
Relationship Instances: associations among entity instances

• In the original E-R model, relationships could have attributes, but today this is no longer done.
• A relationship class can involve two or more entity classes.

Degree of the Relationships

is the number of entity classes in the relationship:

– Two entities have a binary relationship of degree two
– Three entities have a ternary relationship of degree three

Entities & Tables

• The principal difference between an entity and a table (relation) is that you can express a relationship between entities without using foreign keys.
• This makes it easier to work with entities in the early design process, where the very existence of entities and the relationships between them is uncertain.
Cardinality

means “count,” and is expressed as a number.

Maximum Cardinality
is the maximum number of relationship instances in which an entity can participate.
- One-to-One [1:1]
- One-to-Many [1:N]
- Many-to-Many [N:M]

Minimum Cardinality
is the minimum number of relationship instances in which an entity must participate.

Parent-Child Entities

• In a one-to-many relationship:
– The entity on the one side of the relationship is called the
parent entity or just the parent.
– The entity on the many side of the relationship is called the
child entity or just the child.
• The relationships we have been discussing are known as
HAS-A relationships:
– Each entity instance has a relationship with another entity
instance.
• An EMPLOYEE has one or more COMPUTERs.
• A COMPUTER has one assigned EMPLOYEE.

Minimum Cardinality

is the minimum number of entity instances that must participate in a relationship.

• Minimums are generally stated as either zero or one:
– IF zero [0] THEN participation in the relationship by the entity is optional, and no entity instance must participate in the relationship.
– IF one [1] THEN participation in the relationship by the entity is mandatory, and at least one entity instance must participate in the relationship.

Indicating it:

➔ A minimum cardinality of zero [0] (optional participation) is indicated by placing a circle next to the optional entity.
➔ A minimum cardinality of one [1] (mandatory, required participation) is indicated by placing a vertical hash mark next to the required entity.

Reading it:

➔ IF you see a circle THEN that entity is optional (minimum cardinality of zero [0]).
➔ IF you see a vertical hash mark THEN that entity is mandatory (required) (minimum cardinality of one [1]).

4 Types of Minimum Cardinality
Data Modeling Notation: IE Crow’s Foot 1:N

Data Modeling Notation: IE Crow’s Foot N:M


Example 1

Example 2
Strong vs. Weak Entities

STRONG ENTITY
an entity that represents something that can exist on its own.
• Examples (PERSON, AUTOMOBILE, BUILDING)

WEAK ENTITY
an entity whose existence depends on the presence of another entity.
• Example (APARTMENT – depends on BUILDING)

Logical Data Model

Build & Validate Logical Data Model

• A logical data model represents the data requirements of one or more, but not all, user views.
• To translate the conceptual data model into a logical data model.
• To validate this model to ensure that it is structurally correct (using normalization) and supports the required transactions.

Step 1. Derive Relations For Logical Data Model

To create relations for the logical data model to represent the entities, relationships, and attributes that have been identified.
• Describe the composition of each relation using a database design language (DBDL) for relational databases.
• Identify the primary key and any alternate and/or foreign key(s) of the relation.

1.​ Strong entity types

• For each strong entity in the data model, create a relation that includes all the simple attributes of that entity. For composite attributes, include only the constituent simple attributes.

2.​ Weak entity types


• For each weak entity in the data model, create a relation that
includes all the simple attributes of that entity.
• The primary key of a weak entity is partially or fully derived
from each owner entity and so the identification of the primary
key of a weak entity cannot be made until after all the
relationships with the owner entities have been mapped.

3.​ One-to-many (1:*) binary relationship types

• For each 1:* binary relationship, the entity on the ‘one side’ of
the relationship is designated as the parent entity and the
entity on the ‘many side’ is designated as the child entity.
• To represent this relationship, post a copy of the primary key attribute(s) of the parent entity into the relation representing the child entity, to act as a foreign key.

4.​One-to-one (1:1) binary relationship types

• Creating relations to represent a 1:1 relationship is more complex, as the cardinality cannot be used to identify the parent and child entities.
• Instead, the participation constraints are used to decide whether it is best to represent the relationship by combining the entities involved into one relation, or by creating two relations and posting a copy of the primary key from one relation to the other.
• Consider the following:
a) mandatory participation on both sides of 1:1 relationship;
b) mandatory participation on one side of 1:1 relationship;
c) optional participation on both sides of 1:1 relationship.

(a) Mandatory participation on both sides of 1:1 relationship

- Combine the entities involved into one relation and choose one of the primary keys of the original entities to be the primary key of the new relation, while the other (if one exists) is used as an alternate key.

(b) Mandatory participation on one side of a 1:1 relationship

- Identify the parent and child entities using the participation constraints. The entity with optional participation is designated as the parent entity, and the entity with mandatory participation is designated as the child.
- A copy of the primary key of the parent entity is placed in the relation representing the child entity.

(c) Optional participation on both sides of a 1:1 relationship

- The designation of the parent and child entities is arbitrary unless we can find out more about the relationship that can help make the decision.

5.​One-to-one (1:1) recursive relationships

• Mandatory participation on both sides: represent the recursive relationship as a single relation with two copies of the primary key.
• Mandatory participation on only one side: either create a single relation with two copies of the primary key, or create a new relation to represent the relationship.
• Optional participation on both sides: again create a new relation to represent the relationship.

6.​ Many-to-many (*:*) binary relationship types

• Create a relation to represent the relationship and include any attributes that are part of the relationship.
• Post a copy of the primary key attribute(s) of the entities that participate in the relationship into the new relation, to act as foreign keys.
• These foreign keys will also form the primary key of the new relation, possibly in combination with some of the attributes of the relationship.
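
The many-to-many mapping can be sketched as follows. The Student, Class, and Enrollment tables are assumptions made for illustration (Enrollment is the new relation, and grade stands in for an attribute of the relationship); the sketch uses Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (studentNo INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Class   (classNo   INTEGER PRIMARY KEY, title TEXT);
-- New relation for the *:* relationship: one foreign key per participating entity.
CREATE TABLE Enrollment (
    studentNo INTEGER REFERENCES Student(studentNo),
    classNo   INTEGER REFERENCES Class(classNo),
    grade     TEXT,                      -- attribute of the relationship itself
    PRIMARY KEY (studentNo, classNo)     -- the foreign keys together form the primary key
);
INSERT INTO Student VALUES (1, 'Ann'), (2, 'Ben');
INSERT INTO Class VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO Enrollment VALUES (1, 10, 'A'), (1, 20, 'B'), (2, 10, 'C');
""")

pairs = conn.execute("""
    SELECT s.name, c.title
    FROM Enrollment e
    JOIN Student s ON s.studentNo = e.studentNo
    JOIN Class c ON c.classNo = e.classNo
    ORDER BY s.name, c.title
""").fetchall()

try:
    # The same student/class pair again would duplicate the composite primary key.
    conn.execute("INSERT INTO Enrollment VALUES (1, 10, 'F')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(pairs, duplicate_rejected)
```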
Step 2. Validate Relations Using Normalization

Step 3. Validate Relations Against User Transactions


-​ To ensure that the relations in the logical data model
support the required transactions.

Step 4. Check Integrity Constraints

This includes identifying:


• Required data
• Attribute domain constraints
• Multiplicity
• Entity integrity
• Referential integrity
• General constraints
CHAPTER 5: SQL, DATA MANIPULATION
●​ Data Definition Language (DDL) statements create the
database structure and control access
●​ Data Manipulation Language (DML) statements add and
query data.

Writing SQL commands

- An SQL statement uses reserved words (fixed words with specific meanings that must be spelled exactly) and user-defined words (names created by users for database objects like tables and columns).
- SQL statements follow syntax rules, and many versions require a semicolon (;) to end each statement.
- Most parts of SQL are not case-sensitive, except for literal text data, which must match exactly (e.g., 'SMITH' is different from 'Smith').
- Although SQL can be written freely, using indentation and line breaks makes statements easier to read.
- For example, start each clause on a new line and indent parts of clauses to show their structure.
We use a special notation called Backus-Naur Form (BNF) to define SQL syntax:
- Uppercase letters are reserved words,
- Lowercase letters are user-defined words,
- A vertical bar (|) means "or,"
- Curly braces ({}) mean required elements,
- Square brackets ([]) mean optional elements,
- Ellipsis (...) means something can repeat zero or more times.
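Simple Queries

- The SELECT statement is used to retrieve and display data from one or more tables in a database.
- It is the most commonly used SQL command and can perform several powerful operations (filtering rows, choosing specific columns, and combining tables) in a single statement.

The sequence of processing in a SELECT statement is:
- FROM: specifies the table or tables to be used
- WHERE: filters the rows subject to some condition
- GROUP BY: forms groups of rows with the same column value
- HAVING: filters the groups subject to some condition
- SELECT: specifies which columns are to appear in the output
- ORDER BY: specifies the order of the output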

Example: Retrieve all columns, all rows

- List full details of all staff.
- Because there are no restrictions specified in this query, the WHERE clause is unnecessary and all columns are required.
- We write this query as:
SELECT staffNo, fName, lName, position, sex, DOB, salary, branchNo
FROM Staff;
- Because many SQL retrievals require all columns of a table, there is a quick way of expressing “all columns” in SQL, using an asterisk (*) in place of the column names.
- The following statement is an equivalent and shorter way of expressing this query:
SELECT * FROM Staff;
- Produce a list of salaries for all staff, showing only the staff number, the first and last names, and the salary details.
SELECT staffNo, fName, lName, salary FROM Staff;
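
To try these queries, the following sketch builds a throwaway Staff table with Python's sqlite3 module. The column types and the two sample rows are invented for illustration, and the last-name column is written lName.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Staff (staffNo TEXT, fName TEXT, lName TEXT, position TEXT,
                        sex TEXT, DOB TEXT, salary REAL, branchNo TEXT)
""")
# Made-up sample rows.
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?, ?, ?, ?, ?, ?)", [
    ("S1", "Ann", "Lee", "Manager", "F", "1980-01-15", 30000, "B1"),
    ("S2", "Ben", "Ray", "Assistant", "M", "1990-06-30", 12000, "B2"),
])

# SELECT * returns every column, in the order the table defines them.
cursor = conn.execute("SELECT * FROM Staff")
all_column_names = [d[0] for d in cursor.description]

# Naming the columns returns only those columns.
salary_list = conn.execute(
    "SELECT staffNo, fName, lName, salary FROM Staff").fetchall()

print(all_column_names)
print(salary_list)
```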
Example: Use of DISTINCT

- List the property numbers of all properties that have been viewed.
SELECT propertyNo FROM Viewing;
- Notice that the result can contain several duplicates: SELECT does not eliminate duplicates when it projects over one or more columns.
- To eliminate the duplicates, use the DISTINCT keyword.
- Rewriting the query as:
SELECT DISTINCT propertyNo FROM Viewing;
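
A short sketch of the difference, using Python's sqlite3 module and a made-up Viewing table in which two properties were each viewed twice:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Viewing (clientNo TEXT, propertyNo TEXT)")
# Invented sample rows: PA14 and PG4 each appear twice.
conn.executemany("INSERT INTO Viewing VALUES (?, ?)", [
    ("C1", "PA14"), ("C2", "PG4"), ("C3", "PG4"), ("C4", "PA14"), ("C5", "PG36"),
])

with_duplicates = [r[0] for r in conn.execute("SELECT propertyNo FROM Viewing")]
without_duplicates = [r[0] for r in conn.execute(
    "SELECT DISTINCT propertyNo FROM Viewing")]

print(len(with_duplicates), len(without_duplicates))
```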

Example: Calculated Fields

- Produce a list of monthly salaries for all staff, showing the staff number, the first and last names, and the salary details.
SELECT staffNo, fName, lName, salary/12 FROM Staff;
- Using an AS clause to name the calculated column:
SELECT staffNo, fName, lName, salary/12 AS monthlySalary
FROM Staff;
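
The calculated field can be checked with a small sketch (Python's sqlite3 module; the Staff rows are invented, and the last-name column is written lName). The salary column is declared REAL here on purpose: in SQLite, dividing one integer by another discards the fractional part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT, fName TEXT, lName TEXT, salary REAL)")
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?, ?)", [
    ("S1", "Ann", "Lee", 24000),
    ("S2", "Ben", "Ray", 30000),
])

cursor = conn.execute("""
    SELECT staffNo, fName, lName, salary/12 AS monthlySalary
    FROM Staff
    ORDER BY staffNo
""")
calculated_name = cursor.description[3][0]  # the AS clause names the fourth column
monthly = [(row[0], row[3]) for row in cursor]

print(calculated_name, monthly)
```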

ROW SELECTION (WHERE CLAUSE)

- The previous examples show the use of the SELECT statement to retrieve all rows from a table. However, we often need to restrict the rows that are retrieved.
- This can be achieved with the WHERE clause, which consists of the keyword WHERE followed by a search condition that specifies the rows to be retrieved.
- The five basic search conditions (or predicates, using the ISO terminology) are as follows:
❑ Comparison: compare the value of one expression to the value of another expression.
❑ Range: test whether the value of an expression falls within a specified range of values.
❑ Set membership: test whether the value of an expression equals one of a set of values.
❑ Pattern match: test whether a string matches a specified pattern.
❑ Null: test whether a column has a null (unknown) value.

Example Comparison Search Condition

Comparison Operators
❑ = equals
❑ <> is not equal to (ISO standard)
❑ != is not equal to (allowed in some dialects)
❑ < is less than
❑ <= is less than or equal to
❑ > is greater than
❑ >= is greater than or equal to

Example Compound Comparison Search Condition

The logical operator OR is used in the WHERE clause to find the branches in London (city = 'London') or in Glasgow (city = 'Glasgow').
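
Both kinds of condition can be sketched as below, using Python's sqlite3 module. The Staff and Branch rows and the 10000 salary threshold are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT, salary REAL)")
conn.execute("CREATE TABLE Branch (branchNo TEXT, city TEXT)")
conn.executemany("INSERT INTO Staff VALUES (?, ?)",
                 [("S1", 9000), ("S2", 12000), ("S3", 30000)])
conn.executemany("INSERT INTO Branch VALUES (?, ?)",
                 [("B1", "London"), ("B2", "Bristol"), ("B3", "Glasgow")])

# Comparison: keep only rows whose salary exceeds the threshold.
well_paid = [r[0] for r in conn.execute(
    "SELECT staffNo FROM Staff WHERE salary > 10000 ORDER BY staffNo")]

# Compound comparison: a row qualifies if either condition is true.
london_or_glasgow = [r[0] for r in conn.execute(
    "SELECT branchNo FROM Branch "
    "WHERE city = 'London' OR city = 'Glasgow' ORDER BY branchNo")]

print(well_paid, london_or_glasgow)
```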

Example Range Search Condition
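
A range test is usually written with BETWEEN, and the closely related set membership test with IN. The sketch below (Python's sqlite3 module, with invented Staff rows and limits) shows both; note that BETWEEN includes its two endpoints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT, position TEXT, salary REAL)")
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?)", [
    ("S1", "Manager", 30000),
    ("S2", "Assistant", 12000),
    ("S3", "Supervisor", 20000),
    ("S4", "Assistant", 9000),
    ("S5", "Manager", 35000),
])

# Range: salaries from 20000 to 30000, endpoints included.
in_range = [r[0] for r in conn.execute(
    "SELECT staffNo FROM Staff "
    "WHERE salary BETWEEN 20000 AND 30000 ORDER BY staffNo")]

# Set membership: position equals one of the listed values.
in_set = [r[0] for r in conn.execute(
    "SELECT staffNo FROM Staff "
    "WHERE position IN ('Manager', 'Supervisor') ORDER BY staffNo")]

print(in_range, in_set)
```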


Example Pattern Match Search Condition (LIKE/NOT LIKE)

Find all owners with the string ‘Glasgow’ in their address.

-​ For this query, we must search for the string ‘Glasgow’


appearing somewhere within the address column of the
PrivateOwner table.
-​ SQL has two special pattern-matching symbols:
❑The % percent character represents any sequence of zero or
more characters (wildcard).
❑The _ underscore character represents any single character.
-​ All other characters in the pattern represent themselves.
For example:
❑address LIKE 'H%' means the first character must be H, but the rest of the string can be anything.
❑address LIKE 'H___' means that there must be exactly four characters in the string, the first of which must be an H.
❑address LIKE '%e' means any sequence of characters, of length at least 1, with the last character an e.
❑address LIKE '%Glasgow%' means a sequence of characters of any length containing Glasgow.
❑address NOT LIKE 'H%' means the first character cannot be an H.
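
The pattern rules above can be checked with a short sketch using Python's sqlite3 module. The PrivateOwner rows and the `matches` helper are hypothetical, introduced only for illustration. One assumption to keep in mind: SQLite's LIKE is case-insensitive for ASCII letters by default, which may differ from other DBMSs.

```python
import sqlite3

# Illustrative in-memory table; the owner rows are assumed sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PrivateOwner (ownerNo TEXT PRIMARY KEY, address TEXT)")
conn.executemany(
    "INSERT INTO PrivateOwner VALUES (?, ?)",
    [
        ("CO46", "2 Fergus Dr, Aberdeen AB2 7SX"),
        ("CO87", "6 Achray St, Glasgow G32 9DX"),
        ("CO40", "63 Well St, Glasgow G42"),
        ("CO93", "12 Park Pl, Glasgow G4 0QR"),
    ],
)

# Find all owners with the string 'Glasgow' somewhere in their address.
glasgow_owners = [
    row[0]
    for row in conn.execute(
        "SELECT ownerNo FROM PrivateOwner "
        "WHERE address LIKE '%Glasgow%' ORDER BY ownerNo"
    )
]

def matches(value, pattern):
    """Hypothetical helper: True if `value LIKE pattern` holds in SQLite."""
    return bool(conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()[0])

print(glasgow_owners)            # ['CO40', 'CO87', 'CO93']
print(matches("Hall", "H%"))     # True: starts with H
print(matches("Hall", "H___"))   # True: exactly four characters, first is H
print(matches("House", "H___"))  # False: five characters
print(matches("House", "%e"))    # True: ends with e
```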
Example NULL search condition (IS NULL/IS NOT NULL)
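
A minimal sketch of the null test follows, assuming a hypothetical Viewing table in which some viewings have no comment recorded. It also shows why the ordinary comparison `comment = NULL` cannot be used: a comparison with null is never true, so it selects no rows.

```python
import sqlite3

# Illustrative in-memory table; Python None is stored as SQL NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Viewing (clientNo TEXT, propertyNo TEXT, comment TEXT)")
conn.executemany(
    "INSERT INTO Viewing VALUES (?, ?, ?)",
    [
        ("CR56", "PA14", "too small"),
        ("CR56", "PG4", None),
        ("CR56", "PG36", None),
        ("CR62", "PA14", "no dining room"),
    ],
)

# Null search condition: viewings with no comment supplied.
no_comment = conn.execute(
    "SELECT propertyNo FROM Viewing WHERE comment IS NULL ORDER BY propertyNo"
).fetchall()

# Negated form: viewings that do have a comment.
with_comment = conn.execute(
    "SELECT COUNT(*) FROM Viewing WHERE comment IS NOT NULL"
).fetchone()[0]

# '= NULL' is not a valid null test: the comparison is never true.
equals_null = conn.execute(
    "SELECT propertyNo FROM Viewing WHERE comment = NULL"
).fetchall()

print(no_comment)    # [('PG36',), ('PG4',)]
print(with_comment)  # 2
print(equals_null)   # []
```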

Sorting Results (ORDER BY CLAUSE)

- By default, the rows returned from an SQL query are not sorted in any specific order.
- To sort the results, you use the ORDER BY clause at the end of a SELECT statement, listing one or more columns to sort by, either by name or position number.
- You can choose ascending (ASC) or descending (DESC) order, and sort by multiple columns if needed.
- Most parts of SQL are not case-sensitive.
- The ORDER BY clause must always be the last clause of the SELECT statement.

- More than one column can be used in the ORDER BY clause to sort results.
- The first column is the major sort key, which sets the main order.
- If some rows have the same value in the major key, a second column, called the minor sort key, can be used to sort those rows further.
- Example Multiple column ordering: Produce an abbreviated list of properties arranged in order of property type.
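
The major and minor sort keys described above can be sketched as follows. The PropertyForRent rows are assumed sample data, and the choice of rent (descending) as the minor sort key is an assumption made for this illustration.

```python
import sqlite3

# Illustrative in-memory table; the property rows are assumed sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PropertyForRent "
    "(propertyNo TEXT PRIMARY KEY, type TEXT, rooms INTEGER, rent INTEGER)"
)
conn.executemany(
    "INSERT INTO PropertyForRent VALUES (?, ?, ?, ?)",
    [
        ("PA14", "House", 6, 650),
        ("PL94", "Flat", 4, 400),
        ("PG4", "Flat", 3, 350),
        ("PG36", "Flat", 3, 375),
        ("PG21", "House", 5, 600),
        ("PG16", "Flat", 4, 450),
    ],
)

# Major sort key: type (ascending by default).
# Minor sort key: rent, descending, applied within each type.
ordered = conn.execute(
    "SELECT propertyNo, type, rooms, rent FROM PropertyForRent "
    "ORDER BY type, rent DESC"
).fetchall()

for row in ordered:
    print(row)

property_order = [row[0] for row in ordered]
```

With only `ORDER BY type`, the order of the rows inside each type would be left to the DBMS; the minor key makes it explicit.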

Using the SQL Aggregate Functions

Besides getting rows and columns from the database, we often want to calculate totals or summaries of the data, like the totals you see at the end of a report.

The ISO standard defines five aggregate functions:


• COUNT – returns the number of values in a specified column
• SUM – returns the sum of the values in a specified column
• AVG – returns the average of the values in a specified column
• MIN – returns the smallest value in a specified column
• MAX – returns the largest value in a specified column

These functions work on one column and return a single value. COUNT, MIN, and MAX can be used on any type of data, while SUM and AVG only work with numbers.
- Except for COUNT(*), these functions ignore null values.
- COUNT(*) counts all rows, including duplicates and nulls.

To remove duplicates before applying the function, use DISTINCT before the column name.
- By default, duplicates are included unless you specify DISTINCT.
- DISTINCT doesn’t affect MIN or MAX but can change the results of SUM or AVG.
Aggregate functions can only be used in the SELECT list or HAVING clause.
- If the SELECT list contains an aggregate function and there is no GROUP BY clause, the SELECT list cannot include columns unless they are inside an aggregate function.

Example Use of MIN, MAX, AVG
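
The sketch below applies the five aggregate functions to an assumed Staff table. The rows are illustrative, and one hypothetical member of staff (SX1) has a null salary so the difference between COUNT(*) and COUNT(salary) is visible.

```python
import sqlite3

# Illustrative in-memory table; SX1 is a hypothetical row with a NULL salary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary INTEGER)")
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?)",
    [
        ("SL21", 30000),
        ("SG37", 12000),
        ("SG14", 18000),
        ("SA9", 9000),
        ("SG5", 24000),
        ("SL41", 9000),
        ("SX1", None),
    ],
)

summary = conn.execute(
    "SELECT COUNT(*), "               # every row, nulls included
    "       COUNT(salary), "          # non-null salaries only
    "       COUNT(DISTINCT salary), " # duplicates (9000 twice) removed
    "       MIN(salary), MAX(salary), SUM(salary), AVG(salary) "
    "FROM Staff"
).fetchone()

print(summary)  # (7, 6, 5, 9000, 30000, 102000, 17000.0)
```

AVG divides by the six non-null salaries, not by the seven rows, which is the practical effect of aggregate functions ignoring nulls.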

GROUPING RESULTS (GROUP BY CLAUSE)

The GROUP BY clause in a SELECT statement groups data and creates a summary row for each group. The columns listed in GROUP BY are called grouping columns. When using GROUP BY, every column in the SELECT list must have one value for each group.
In addition, the SELECT clause may contain only:
• column names;
• aggregate functions;
• constants;
• an expression involving combinations of these elements.

- All column names in the SELECT list must appear in the GROUP BY clause unless the name is used only in an aggregate function.
- The contrary is not true: there may be column names in the GROUP BY clause that do not appear in the SELECT list.
- When the WHERE clause is used with GROUP BY, the WHERE clause is applied first, then groups are formed from the remaining rows that satisfy the search condition.

Example Use of GROUP BY
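
A minimal sketch of grouping, assuming an illustrative Staff table with a branchNo column: the first query produces one summary row per branch, and the second shows that WHERE filters rows before the groups are formed.

```python
import sqlite3

# Illustrative in-memory table; the rows are assumed sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary INTEGER, branchNo TEXT)")
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?)",
    [
        ("SL21", 30000, "B005"),
        ("SG37", 12000, "B003"),
        ("SG14", 18000, "B003"),
        ("SA9", 9000, "B007"),
        ("SG5", 24000, "B003"),
        ("SL41", 9000, "B005"),
    ],
)

# One summary row per branch: number of staff and total salary.
per_branch = conn.execute(
    "SELECT branchNo, COUNT(staffNo), SUM(salary) "
    "FROM Staff GROUP BY branchNo ORDER BY branchNo"
).fetchall()

# WHERE is applied first, so low-paid staff never reach the groups
# and branch B007 disappears from the result entirely.
per_branch_filtered = conn.execute(
    "SELECT branchNo, COUNT(staffNo), SUM(salary) "
    "FROM Staff WHERE salary > 10000 "
    "GROUP BY branchNo ORDER BY branchNo"
).fetchall()

print(per_branch)           # [('B003', 3, 54000), ('B005', 2, 39000), ('B007', 1, 9000)]
print(per_branch_filtered)  # [('B003', 3, 54000), ('B005', 1, 30000)]
```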

Subqueries

A subquery is a SELECT statement inside another SELECT statement, used to help determine the final result. Subqueries can appear in WHERE and HAVING clauses (as well as in INSERT, UPDATE, and DELETE statements), and there are three main types:
❑Scalar subquery: returns a single value (one row, one column), used where a single value is needed.
❑Row subquery: returns a single row with multiple columns, used when you need to compare a full row.
❑Table subquery: returns multiple rows and columns, used as a temporary table within a query.
Using an aggregate function:​

The following rules apply to subqueries:


❑The ORDER BY clause may not be used in a subquery
(although it may be used in the outermost SELECT statement).
❑The subquery SELECT list must consist of a single column
name or expression, except for subqueries that use the keyword
EXISTS.
❑By default, column names in a subquery refer to the table
name in the FROM clause of the subquery.
❑It is possible to refer to a table in a FROM clause of an outer
query by qualifying the column name.
❑When a subquery is one of the two operands involved in a
comparison, the subquery must appear on the right-hand side
of the comparison
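
The rules above can be illustrated with two scalar subqueries, one of them using an aggregate function. The Staff and Branch tables and their rows are assumed sample data for this sketch. In both queries the subquery sits on the right-hand side of the comparison and its SELECT list is a single column or expression.

```python
import sqlite3

# Illustrative in-memory tables; names and rows are assumed sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, street TEXT)")
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary INTEGER, branchNo TEXT)")
conn.executemany(
    "INSERT INTO Branch VALUES (?, ?)",
    [("B005", "22 Deer Rd"), ("B003", "163 Main St"), ("B007", "16 Argyll St")],
)
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?)",
    [
        ("SL21", 30000, "B005"),
        ("SG37", 12000, "B003"),
        ("SG14", 18000, "B003"),
        ("SA9", 9000, "B007"),
        ("SG5", 24000, "B003"),
        ("SL41", 9000, "B005"),
    ],
)

# Scalar subquery with equality: staff at the branch on '163 Main St'.
at_main_st = conn.execute(
    "SELECT staffNo FROM Staff "
    "WHERE branchNo = (SELECT branchNo FROM Branch WHERE street = '163 Main St') "
    "ORDER BY staffNo"
).fetchall()

# Scalar subquery using an aggregate function: staff paid above the average.
above_average = conn.execute(
    "SELECT staffNo FROM Staff "
    "WHERE salary > (SELECT AVG(salary) FROM Staff) "
    "ORDER BY staffNo"
).fetchall()

print(at_main_st)     # [('SG14',), ('SG37',), ('SG5',)]
print(above_average)  # [('SG14',), ('SG5',), ('SL21',)]
```

The aggregate has to go inside a subquery here because an aggregate function cannot appear directly in a WHERE clause.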

DATABASE UPDATE

❑SQL is a complete data manipulation language that can be used for modifying the data in the database as well as querying the database.
❑The commands for modifying the database are not as complex as the SELECT statement.
❑Three SQL statements are available to modify the contents of the tables in the database:
❖ INSERT – adds new rows of data to a table
❖ UPDATE – modifies existing data in a table
❖ DELETE – removes rows of data from a table

In an INSERT statement of the form INSERT INTO TableName [(columnList)] VALUES (dataValueList), the dataValueList must match the columnList as follows:

• The number of items in each list must be the same.


• There must be a direct correspondence in the position of
items in the two lists, so that the first item in dataValueList
applies to the first item in columnList, the second item in
dataValueList applies to the second item in columnList, and so
on.
• The data type of each item in dataValueList must be
compatible with the data type of the corresponding column.
Examples
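
A minimal sketch of INSERT with and without a column list, assuming an illustrative Staff table (the column names and the inserted values are hypothetical). The last statement deliberately breaks the rule that both lists must have the same number of items, to show that the DBMS rejects it.

```python
import sqlite3

# Illustrative in-memory table; column names are assumed for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, fName TEXT, lName TEXT, "
    "position TEXT, salary INTEGER, branchNo TEXT)"
)

# No columnList: a value must be supplied for every column, in table order.
conn.execute(
    "INSERT INTO Staff VALUES ('SG16', 'Alan', 'Brown', 'Assistant', 8300, 'B003')"
)

# With a columnList: values correspond by position; omitted columns become NULL.
conn.execute(
    "INSERT INTO Staff (staffNo, fName, lName, branchNo) "
    "VALUES ('SG44', 'Anne', 'Jones', 'B003')"
)

# Two columns but only one value: the lists do not match, so this fails.
try:
    conn.execute("INSERT INTO Staff (staffNo, fName) VALUES ('SG99')")
    mismatch_rejected = False
except sqlite3.Error:
    mismatch_rejected = True

rows = conn.execute(
    "SELECT staffNo, position, salary FROM Staff ORDER BY staffNo"
).fetchall()

print(rows)               # [('SG16', 'Assistant', 8300), ('SG44', None, None)]
print(mismatch_rejected)  # True
```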

UPDATING DATA

UPDATE ALL ROWS


UPDATE SPECIFIC ROWS

UPDATE MULTIPLE COLUMNS
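
The three kinds of update named in the headings above can be sketched as follows. The Staff table, its rows, and the salary amounts are assumed for illustration; flat increments are used instead of percentages so the results stay whole numbers.

```python
import sqlite3

# Illustrative in-memory table; the rows are assumed sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, position TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?)",
    [
        ("SL21", "Manager", 30000),
        ("SG37", "Assistant", 12000),
        ("SG14", "Supervisor", 18000),
    ],
)

# UPDATE all rows: no WHERE clause, so every row is changed.
conn.execute("UPDATE Staff SET salary = salary + 1000")

# UPDATE specific rows: the WHERE clause restricts which rows change.
conn.execute("UPDATE Staff SET salary = salary + 500 WHERE position = 'Manager'")

# UPDATE multiple columns: list several column = value pairs after SET.
conn.execute(
    "UPDATE Staff SET position = 'Manager', salary = 20000 WHERE staffNo = 'SG14'"
)

after_updates = conn.execute(
    "SELECT staffNo, position, salary FROM Staff ORDER BY staffNo"
).fetchall()

for row in after_updates:
    print(row)
```

SG14 is promoted only after the manager raise has run, so it does not receive the extra 500.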

DELETE DATA FROM DATABASE

DELETE SPECIFIC ROWS

DELETE ALL ROWS
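
A minimal sketch of the two forms of DELETE, assuming a hypothetical Viewing table. With a WHERE clause only the matching rows are removed; without one, every row is removed but the table itself still exists.

```python
import sqlite3

# Illustrative in-memory table; the rows are assumed sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Viewing (clientNo TEXT, propertyNo TEXT, comment TEXT)")
conn.executemany(
    "INSERT INTO Viewing VALUES (?, ?, ?)",
    [
        ("CR56", "PA14", "too small"),
        ("CR76", "PG4", "too remote"),
        ("CR56", "PG36", None),
        ("CR62", "PA14", "no dining room"),
    ],
)

def row_count():
    """Number of rows currently in Viewing."""
    return conn.execute("SELECT COUNT(*) FROM Viewing").fetchone()[0]

# DELETE specific rows: only viewings of property PG4 are removed.
conn.execute("DELETE FROM Viewing WHERE propertyNo = 'PG4'")
after_specific = row_count()

# DELETE all rows: no WHERE clause. The empty table remains in the database.
conn.execute("DELETE FROM Viewing")
after_all = row_count()

print(after_specific)  # 3
print(after_all)       # 0
```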


CHAPTER 6: NORMALIZATION

Normalization is the process of analyzing a given relation schema based on its functional dependencies and primary key in order to minimize redundancy and to minimize insertion, deletion, and modification anomalies.
- Proposed by Codd; it requires relation schemas to go through a series of normal form tests.

Normal Form

FIRST NORMAL FORM (1NF)

The relation is in 1NF if every tuple in that relation has only atomic values for each attribute.
How to improve this situation?

• Remove the non-atomic attribute that violates 1NF and place it in a separate relation.

Second Normal Form (2NF)

The relation is in 2NF if every non-key (non-prime) attribute is fully functionally dependent on the primary key (PK).
Functional dependencies:
Emp_No, Proj_No → Hours
Emp_No → Emp_Name
Proj_No → Proj_Name, Proj_Loc

How to improve this situation?

• Decompose the relation so that each non-key attribute is fully functionally dependent on the PK of the relation it belongs to.
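
The decomposition can be sketched with the functional dependencies listed above. The relation names Emp_Proj, Employee, Project, and Works_On, and all of the rows, are assumptions made for this illustration. Each new relation holds the attributes that depend on one determinant, and joining them back together reproduces the original rows, so no information is lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized-to-2NF starting point: PK is (Emp_No, Proj_No), but
# Emp_Name depends on Emp_No alone and Proj_Name/Proj_Loc on Proj_No alone.
conn.execute(
    "CREATE TABLE Emp_Proj (Emp_No INTEGER, Proj_No INTEGER, Hours REAL, "
    "Emp_Name TEXT, Proj_Name TEXT, Proj_Loc TEXT, PRIMARY KEY (Emp_No, Proj_No))"
)
original_rows = [  # assumed sample data
    (1, 10, 32.5, "Smith", "ProductX", "Bellaire"),
    (1, 20, 7.5, "Smith", "ProductY", "Sugarland"),
    (2, 10, 20.0, "Wong", "ProductX", "Bellaire"),
]
conn.executemany("INSERT INTO Emp_Proj VALUES (?, ?, ?, ?, ?, ?)", original_rows)

# One relation per determinant, populated from the original table.
conn.executescript(
    """
    CREATE TABLE Employee (Emp_No INTEGER PRIMARY KEY, Emp_Name TEXT);
    CREATE TABLE Project (Proj_No INTEGER PRIMARY KEY, Proj_Name TEXT, Proj_Loc TEXT);
    CREATE TABLE Works_On (Emp_No INTEGER, Proj_No INTEGER, Hours REAL,
                           PRIMARY KEY (Emp_No, Proj_No));
    INSERT INTO Employee SELECT DISTINCT Emp_No, Emp_Name FROM Emp_Proj;
    INSERT INTO Project SELECT DISTINCT Proj_No, Proj_Name, Proj_Loc FROM Emp_Proj;
    INSERT INTO Works_On SELECT Emp_No, Proj_No, Hours FROM Emp_Proj;
    """
)

employee_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
project_count = conn.execute("SELECT COUNT(*) FROM Project").fetchone()[0]

# Joining the three relations rebuilds the original rows.
rejoined = conn.execute(
    "SELECT w.Emp_No, w.Proj_No, w.Hours, e.Emp_Name, p.Proj_Name, p.Proj_Loc "
    "FROM Works_On w "
    "JOIN Employee e ON e.Emp_No = w.Emp_No "
    "JOIN Project p ON p.Proj_No = w.Proj_No"
).fetchall()

print(employee_count, project_count)            # 2 2
print(sorted(rejoined) == sorted(original_rows))  # True
```

After the decomposition, each employee name and each project name and location is stored once instead of once per assignment.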

Third Normal Form (3NF)

The relation is in 3NF if it is in 2NF and there are no transitive dependencies.

• A transitive dependency is a dependency between two or more non-key attributes.

How to improve this situation?


• Decompose the relation into smaller relations so that there are no transitive dependencies.
CHAPTER 7: INTRODUCTION TO NO-SQL DATABASE

Big Data

• Current term for the enormous datasets generated by Web applications such as:
i. Search tools
• Google and Bing
ii. Web 2.0 social networks
• Facebook, LinkedIn, and Twitter

Big Data & NoSQL

- The NoSQL movement, better described as the Not only SQL movement, is a movement toward using non-relational databases.
- These databases are often distributed, replicated databases.
- Used in many widely recognized Web applications
– Used for Facebook
– Used for Twitter

Type/Categories of NoSQL Databases

– key-value—examples are DynamoDB, MemcacheDB, and Redis
– document—examples are Couchbase, Azure Cosmos DB, and MongoDB
– column family—examples are Apache Cassandra, Vertica, and HBase
– graph—examples are Neo4j, AllegroGraph, and Titan
A Column Family DB
Software: HADOOP

• Hadoop Distributed File System (HDFS)
– Provides standard file services to clustered servers so their file systems can function as one distributed file system.
• The Hadoop family includes a full set of applications including:
– HBase: a nonrelational data store.
– Pig: a query language.
