AMD - Lecture 01
Advanced Management of Data
Contact
• Email: fsei@cs.tu-chemnitz.de
Contact
• Daniel Richter
• Email: daniel.richter@informatik.tu-chemnitz.de
Documents
• You can find the slides of the lecture and practice lessons on the
website of the chair Datenverwaltungssysteme
https://www.tu-chemnitz.de/informatik/DVS/lehre/vorlesungen.php
Exam & Study Regulations
Exam
• written, 90 minutes
Study Regulations
• for all other students the module „Datenbanken und Objektorientierung“ will be recognized
for „Advanced Management of Data“
• you have to apply individually after the exam by filling in the form „Antrag auf
Fachsemestereinstufung/Anrechnung von Prüfungsleistungen“ (Student Service Point):
https://www.tu-chemnitz.de/studentenservice/stusek/formulare.php
Requirements
You are expected to have a background in fundamental database concepts and technology,
including
• Normalisation
• Query Processing
We offer special slides for self-study, which cover the most important aspects.
Importance of Databases
• the database industry is valued at more than €50 billion annually
• hardware capability
• e-Commerce
Lecture
Objective
• most commercially relevant database systems are based on the relational paradigm, which
often does not adequately address many of these new challenges
Contents
• Review of Fundamental Database Concepts and Terminology
• Extensions of SQL
Literature
• Ramez Elmasri & Shamkant B. Navathe: „Database Systems“. Seventh Edition, Pearson.
Advanced Management of Data
Foundations of Databases
Conceptual Database Design
Why do we use databases?
Limitations of a file-based approach
• duplication of data
• data dependence
• inflexible queries
Reason
• The definition of the data is embedded in the application programs, rather than being stored
separately and independently.
• There is no control over the access and manipulation of data beyond that imposed by the
application programs.
Database
• a shared collection of logically related data, designed to meet the information needs of an
organization
• the database also holds a description of this data, which is known by any of the following names:
• system catalog
• data dictionary
• metadata
DBMS
The Database Management System (DBMS) is a software system that enables users to
define, create, maintain, and control access to the database.
Facilities
• insertion, modification and deletion of data from the database through a Data Manipulation
Language (DML)
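A minimal sketch of these facilities, assuming Python's built-in sqlite3 module plays the role of the DBMS. The Student table, its columns, and the sample rows are hypothetical and not taken from the lecture. The sketch first defines the data and then inserts, modifies, deletes, and retrieves it:

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Data definition: the structure is stored by the DBMS,
# not hard-coded in the application program.
conn.execute("CREATE TABLE Student (matNo INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Data manipulation: insertion, modification, and deletion.
conn.execute("INSERT INTO Student (matNo, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO Student (matNo, name) VALUES (2, 'Edgar')")
conn.execute("UPDATE Student SET name = 'Ada L.' WHERE matNo = 1")
conn.execute("DELETE FROM Student WHERE matNo = 2")

# Data manipulation also covers retrieval.
rows = conn.execute("SELECT matNo, name FROM Student ORDER BY matNo").fetchall()
print(rows)
```

Only the row with matNo 1 remains, carrying the modified name.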
Attention
In common usage, the term „database“ refers to the database together with its DBMS.
Application Programs
An Application Program (in the context of a database) is a computer program that interacts with
the database by issuing an appropriate request to the DBMS.
Views
• allow each user to have an individual view of the database and include only relevant
information
• provide a level of security by excluding data that some users should not see
• can present a consistent, unchanging picture of the database, even if the underlying
structure is changed
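The security aspect of views can be sketched as follows, assuming SQLite via Python's sqlite3 module. The Staff table, the StaffPublic view, and the sample row are hypothetical names chosen for illustration. The view exposes only the columns that a user is allowed to see:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Base table with a sensitive column (salary).
conn.execute("CREATE TABLE Staff (staffNo INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("INSERT INTO Staff VALUES (1, 'Codd', 5000.0)")

# The view includes only the relevant, non-sensitive columns.
conn.execute("CREATE VIEW StaffPublic AS SELECT staffNo, name FROM Staff")

cursor = conn.execute("SELECT * FROM StaffPublic")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)
print(view_rows)
```

A user who is only granted access to StaffPublic never sees the salary column. Note that SQLite itself has no user accounts or GRANT statements, so actual access control would need a DBMS that provides them.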
Advantages of DBMSs (1)
• data consistency
• sharing of data
• improved security
• enforcement of standards
Advantages of DBMSs (2)
• economy of scale
• increased productivity
• increased concurrency
Disadvantages of DBMSs
• Complexity
• Size
• Cost
• Cost of conversions
• Performance
• Impact of failures
Three-Level Architecture
External Level
Conceptual Level
• describes what data is stored in the database and the relationships among the data
Internal Level
Database Schema
Example
[Figure: example of the three-level architecture. External level (external schemas): Students
Office, Central Examination Office, Staff Department, Room Planning. Schema elements shown:
Student, examine, Staff, Room. Mappings connect the levels.]
Data Independence
A major objective for the three-level architecture is to provide data independence, which means
that upper levels are unaffected by changes to lower levels.
• Logical Data Independence: The immunity of the external schemas to changes in the
conceptual schema
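Logical data independence can be illustrated with a small sketch, assuming SQLite via Python's sqlite3 module. Here a view stands in for an external schema and the base table for the conceptual schema; the names Staff, StaffNames, and office are hypothetical. The external schema returns the same result before and after the conceptual schema is extended:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema (simplified to a single table).
conn.execute("CREATE TABLE Staff (staffNo INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO Staff VALUES (1, 'Codd')")

# External schema: what one user group sees.
conn.execute("CREATE VIEW StaffNames AS SELECT staffNo, name FROM Staff")


def read_external_schema():
    """Query the database only through the external schema."""
    return conn.execute(
        "SELECT staffNo, name FROM StaffNames ORDER BY staffNo"
    ).fetchall()


before = read_external_schema()

# Change to the conceptual schema: a new attribute is added.
conn.execute("ALTER TABLE Staff ADD COLUMN office TEXT")

after = read_external_schema()
print(before == after)
```

Programs that read through StaffNames do not need to be changed when the office column is added.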
Data Independence
Data sublanguages do not include constructs for all computing needs, such as conditional or
iterative statements; those are provided by high-level programming languages instead:
• the Data Manipulation Language (DML) is used to both read and update the database
Host Language
Many DBMSs have a facility for embedding the sublanguage in a high-level programming
language such as C, C#, or Java, which is sometimes referred to as the host language.
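The division of labour between sublanguage and host language can be sketched as follows. This is an illustration under stated assumptions: Python is used as the host language (the slide names C, C#, and Java), the SQL statements are passed through the sqlite3 call interface rather than a precompiler-based embedding, and the Account table and the bonus rule are hypothetical. Iteration and the conditional live in the host language, while data access is expressed in SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO Account VALUES (?, ?)",
    [(1, 50), (2, 500), (3, 5000)],
)

# The sublanguage retrieves the data ...
accounts = conn.execute("SELECT id, balance FROM Account ORDER BY id").fetchall()

# ... and the host language supplies the loop and the condition.
for account_id, balance in accounts:
    if balance >= 500:
        conn.execute(
            "UPDATE Account SET balance = balance + ? WHERE id = ?",
            (10, account_id),
        )

balances = conn.execute("SELECT id, balance FROM Account ORDER BY id").fetchall()
print(balances)
```

Accounts 2 and 3 receive the bonus of 10; account 1 stays unchanged.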
Data Model
Definition
A Data Model is an integrated collection of concepts for describing and manipulating data,
relationships between data, and constraints on the data in an organization. Usually, it consists
of three components:
• a manipulative part, defining the types of operation that are allowed on the data
Phases of Database Design
Database design is made up of three main phases:
Conceptual Design
The process of constructing a model of the data used in an enterprise, independent of all
physical considerations.
Logical Design
The process of constructing a model of the data used in an enterprise based on a specific
data model, but independent of a particular DBMS and other physical considerations.
Physical Design
Conceptual Data Modeling
To get a precise understanding of the nature of the data and how it is used by the enterprise, we
need a model for communication that is nontechnical and free of ambiguities.
• although there is general agreement about what each ER-concept means, there are a
number of different notations that can be used to represent these concepts diagrammatically
• we use the Unified Modeling Language (UML) to visualize the concepts of ER. UML is
currently recognized as the de facto industry standard modeling language for object-oriented
software engineering projects
• we use the UML notation for drawing ER models, but continue to describe the concepts of ER
models using traditional database terminology
The ER Model (1)
Entity Type
Entity Occurrence
Each entity type is shown as a rectangle, labeled with the name of the entity, which is
normally a singular noun. In UML, the first letter of each word in the entity name is uppercase.
The ER Model (2)
Relationship Type
A uniquely identifiable association that includes one occurrence from each participating
entity type.
• Each relationship type is shown as a line connecting the associated entity types and labeled
with the name of the relationship.
• A relationship is named using a verb or a short phrase including a verb. The first letter of
each word in the relationship name is shown in uppercase. Whenever possible, a relationship
name should be unique for a given ER model.
• A relationship is only labeled in one direction by placing an arrow symbol beside the name
indicating the correct direction for a reader to interpret the relationship name.
The ER Model (3)
Degree of a relationship type
A relationship with degree higher than two is called complex and represented by a diamond.
The name of the relationship is displayed inside the diamond, and in this case, the directional
arrow normally associated with the name is omitted.
Example:
Example - Quaternary Relationship Arranges
The ER Model (4)
Recursive Relationship
A relationship type in which the same entity type participates more than once in different
roles.
Role name
Relationships may be given role names to indicate the purpose that each participating entity
type plays in a relationship.
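A recursive relationship with role names can be sketched in relational terms, assuming SQLite via Python's sqlite3 module. The relationship "Staff Supervises Staff", the roles supervisor and supervisee, and all sample data are hypothetical; mapping the relationship to a self-referencing foreign key is one possible design, not the only one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

# The same entity type (Staff) participates twice: supervisorNo refers back to Staff.
conn.execute(
    """
    CREATE TABLE Staff (
        staffNo      INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        supervisorNo INTEGER REFERENCES Staff(staffNo)
    )
    """
)
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?)",
    [(1, "Ann", None), (2, "Bob", 1), (3, "Cy", 1)],
)

# The table aliases act as role names: each alias marks the role Staff plays.
pairs = conn.execute(
    """
    SELECT supervisee.name, supervisor.name
    FROM Staff AS supervisee
    JOIN Staff AS supervisor ON supervisee.supervisorNo = supervisor.staffNo
    ORDER BY supervisee.staffNo
    """
).fetchall()
print(pairs)
```

Without the two aliases the query could not distinguish which occurrence of Staff supervises and which is supervised, which is exactly the ambiguity that role names resolve in the diagram.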
Role Names
Role names may also be used when two entities are associated through more than one
relationship.
The ER Model (5)
Attribute
Attribute Domain
Classification of attributes
• simple or composite
• single-valued or multi-valued
• derived
Classification of Attributes
Simple Attribute
Composite Attribute
Single-Valued Attribute
Multi-Valued Attribute
Derived Attribute
• represents a value that is derivable from the value of a related attribute or set of attributes, not
necessarily in the same entity type
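A derived attribute can be sketched in Python as a value that is computed on demand instead of being stored. The Staff class, its attributes, and the sample dates are hypothetical; age is derived from the stored attribute date_of_birth and a reference date:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Staff:
    name: str
    date_of_birth: date  # stored attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


staff = Staff("Ann", date(1990, 6, 15))
age_before_birthday = staff.age(date(2024, 6, 14))
age_on_birthday = staff.age(date(2024, 6, 15))
print(age_before_birthday, age_on_birthday)
```

Because the value is recomputed each time, it cannot become inconsistent with the attribute it is derived from.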
The ER Model (6)
Candidate Key
• A minimal set of attributes that uniquely identifies each occurrence of an entity type.
Primary Key
• The candidate key that is selected to uniquely identify each occurrence of an entity type.
Composite Key
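The notions of uniqueness and minimality can be made concrete with a small sketch. The function names and the sample occurrences are hypothetical. Note the important caveat that a sample can only rule attribute sets out: a key is a statement about all possible occurrences, so the result lists attribute sets that are consistent with being candidate keys, not proven ones.

```python
from itertools import combinations


def unique_on(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, attributes):
    """Minimal attribute sets that are unique within the given sample."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A superset of an already found key is unique but not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if unique_on(rows, attrs):
                keys.append(attrs)
    return keys


rows = [
    {"staffNo": 1, "name": "Ann", "branchNo": "B1", "deskNo": 1},
    {"staffNo": 2, "name": "Bob", "branchNo": "B1", "deskNo": 2},
    {"staffNo": 3, "name": "Ann", "branchNo": "B2", "deskNo": 1},
]
keys = candidate_keys(rows, ["staffNo", "name", "branchNo", "deskNo"])
print(keys)
```

In this sample, staffNo alone qualifies, and so do the composite sets (name, branchNo) and (branchNo, deskNo). Whether (name, branchNo) is really a key depends on the enterprise: two staff members with the same name in one branch would invalidate it, which the designer has to decide, not the data.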
The ER Model (7)
Diagrammatic representation of attributes
If an entity type is to be displayed with its attributes, we divide the rectangle representing the
entity in two. The upper part of the rectangle displays the name of the entity and the lower
part lists the names of the attributes.
The first attribute(s) to be listed is the primary key for the entity type (if known).
The name(s) of the primary key attribute(s) can be labeled with the tag {PK}.
In UML, the name of an attribute is displayed with the first letter in lowercase. If the name has
more than one word, the first letter of each subsequent word has to be in uppercase.
Additional tags that can be used include partial primary key {PPK} when an attribute forms
part of a composite primary key, and alternate key {AK}.
Example - Attributes
The ER Model (8)
The ER Model (9)
Diagrammatic representation of attributes on relationships
Attributes associated with a relationship type are represented using the same symbol as an
entity type.
The ER Model (10)
Multiplicity
The number (or range) of possible occurrences of an entity type that may relate to a single
occurrence of an associated entity type through a particular relationship.
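The definition can be illustrated by counting, for a sample of relationship occurrences, how many occurrences of one entity type relate to each single occurrence of the other. The function name, the relationship "Branch Has Staff", and the sample data are hypothetical. Multiplicity itself is a constraint on all possible occurrences, so the observed range below is only an illustration of how the range is read:

```python
from collections import Counter


def observed_multiplicity(pairs, left_occurrences):
    """Return (min, max) number of right-hand occurrences per left-hand occurrence."""
    counts = Counter(left for left, _ in pairs)
    per_left = [counts.get(left, 0) for left in left_occurrences]
    return min(per_left), max(per_left)


# Relationship occurrences of "Branch Has Staff" as (branch, staff) pairs.
has = [("B1", "S1"), ("B1", "S2"), ("B2", "S3")]
branches = ["B1", "B2", "B3"]  # B3 currently has no staff
staff = ["S1", "S2", "S3"]

# How many Staff occurrences relate to a single Branch?
staff_per_branch = observed_multiplicity(has, branches)

# How many Branch occurrences relate to a single Staff member?
branch_per_staff = observed_multiplicity([(s, b) for b, s in has], staff)

print(staff_per_branch, branch_per_staff)
```

In the sample, a branch relates to between 0 and 2 staff members, and each staff member relates to exactly one branch. A designer might generalise this to 0..* on the Staff side and 1..1 on the Branch side, that is, a 1:* relationship.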
The most common degree for relationships is binary. Binary relationships are generally
referred to as being one-to-one (1:1), one-to-many (1:*), or many-to-many (*:*).
Attention
1:1 Relationship
1:* Relationship
*:* Relationship
The ER Model (11)
The ER Model (12)
The ER Model (13)
Cardinality
Participation