AMD - Lecture 01

Download as pdf or txt
Download as pdf or txt
You are on page 1of 47

Advanced

Management of Data
Contact

• Dr. Frank Seifert

• Email: fsei@cs.tu-chemnitz.de

• Office 1/336a (Consultation only by appointment)


Practice Lessons

Contact

• Daniel Richter

• Email: daniel.richter@informatik.tu-chemnitz.de

• Office 1/336d (Consultation only by appointment)

2
Documents

• You can find the slides of the lecture and practice lessons on the
website of the chair Datenverwaltungssysteme

https://www.tu-chemnitz.de/informatik/DVS/lehre/vorlesungen.php

• slides may be updated during semester

3
Exam & Study Regulations
Exam

• written, 90 minutes

Study Regulations

• at the moment „Advanced Management of Data“ is only contained in study regulations of


course „Automotive Software Engineering 2016“

• for all other students the module „Datenbanken und Objektorientierung“ will be recognized
for „Advanced Management of Data“

• you have to apply individually after the exam by filling in the form „Antrag auf
Fachsemestereinstufung/Anrechnung von Prüfungsleistungen“ (Student Service Point):
https://www.tu-chemnitz.de/studentenservice/stusek/formulare.php

4
Requirements
Actually, you should have a background in fundamental database concepts and technology,
including

• Semantic data modelling

• the relational data model and languages (relational Algebra, SQL)

• we offer special slides for self-studying, which cover the most important aspects

• Normalisation

• Concurrency and Transaction Management

• Query Processing

• Physical data modeling and indexing

5
Importance of Databases
• database industry is valued >€50 billion annually

• databases are basis of information systems

• databases changed the way of operating many businesses fundamentally

• growing importance caused by significant developments in

• hardware capability

• the emergence of the Internet and mobile communications

• e-Commerce

• Business Intelligence / Big Data

6
Lecture
Objective

• most commercially relevant database systems are based on the relational paradigm, which
does often not adequately address many of these new challenges

• we will name the problems of relational systems

• we will get to know

• special applications and extensions of relational database systems

• learn different and new paradigms of data management

7
Contents
• Repetition of fundamental Database Concepts and Terminology

• Semantic Data Modeling using the Entity-Relationship Model and UML

• Logical Data Modeling using the Relational Model (by self-studying)

• Extensions of SQL

• Object-Relational Database Concepts

• Distributed Database Concepts

• NoSQL and Big Data Storage Systems

8
Literature

• Thomas Connolly, Carolyn Begg: „Database Systems - A Practical Approach to Design,


Implementation, and Management“. Sixth Edition, Pearson.

• Ramez Elmasri & Shamkant B. Navathe: „Database Systems“. Seventh Edition, Pearson.

• Further Literature will be added later

9
Advanced
Management of Data
Foundations of Databases
Conceptual Database Design
Why do we use databases?
Limitations of a file-based approach

• separation and isolation of data

• duplication of data

• data dependence

• incompatible file formats

• Inflexible queries

Reason

• The definition of the data is embedded in the application programs, rather than being stored
separately and independently.

• There is no control over the access and manipulation of data beyond that imposed by the
application programs.

11
Database

Database

• a shared collection of logically related data, designed to meet the information needs of an
organization

• the database holds also a description of this data, which is called either

• system catalog

• data dictionary

• metadata

12
DBMS
DBMS

The Database Management System (DBMS) is a software system that enables users to
define, create, maintain, and control access to the database.

Facilities

• definition of the database through a Data Definition Language (DDL)

• insertion, modification and deletion of data from the database through a Data Manipulation
Language (DML)

• retrieval of the data through a Query Language (QL)

• providing controlled access to the database

Attention

Usually, the notion „database“ means the whole of database and DBMS.

13
Application Programs
An Application Program (in the context of a database) is a computer program that interacts with
the database by issuing an appropriate request to the DBMS.

Database processing example

[Connolly & Begg]

14
Views

• are subsets of the database

• allow each user to have an individual view of the database and include only relevant
information

• provide a mechanism to customize the appearance of the database

• provide a level of security by excluding data that some users should not see

• can present a consistent, unchanging picture of the database, even if the underlying
structure is changed

15
Advantages of DBMSs (1)

• Control of data redundancy

• Data consistency

• More information from the same amount of data

• Sharing of data

• improved data integrity

• improved security

• enforcement of standards

16
Advantages of DBMSs (2)

• economy of scale

• balance of conflicting requirements

• improved data accessibility and responsiveness

• increased productivity

• improved maintenance through data independence

• increased concurrency

• improved backup and recovery services

17
Disadvantages of DBMSs

• Complexity

• Size

• Cost

• additional hardware costs

• Cost of conversions

• Performance

• Impact of failures

18
Three-Level Architecture
External Level

• The users’ view of the database

• describes that part of the database that is relevant to each user

Conceptual Level

• The logical view of the database

• describes what data is stored in the database and the relationships among the data

Internal Level

• The physical representation of the database on the computer

• describes how the data is stored in the database

19
Database Schema
Database Schema

• overall description of the database

• according to the levels of abstraction there


are three schemas

• multiple external schemas

• one conceptual schema

• one internal schema

• the DBMS is responsible for mapping


between these three types of schema and
checks for consistency
[Connolly & Begg]

20
Example
Students Central Examination Room
Staff Department
external level Office Office planning
(external schemas)
Student Student examine Student Staff Room

Mapping

Corporate Model of TU Chemnitz

Student Lecture Lecturer


conceptual level
(conceptual schema)

listen Room exam

Mapping

Physical MatrNo: int (4 Byte) on disc 1


internal level
Management
(internal schema)
of Data Course: char (20 Bytes) on disc 2

21
Data Independence

A major objective for the three-level architecture is to provide data independence, which means
that upper levels are unaffected by changes to lower levels.

There are two kinds of data independence:

• Logical Data Independence: The immunity of the external schemas to changes in the
conceptual schema

• Physical Data Independence: The immunity of the conceptual schema to changes in


the internal schema

22
Data Independence

[Connolly & Begg]


Database Languages
Data Sublanguages

Languages that do not include constructs for all computing needs, such as conditional or
iterative statements, which are in contrast provided by high-level programming languages:

• Data Definition Language (DDL) is used to specify the database schema

• Data Manipulation Language is used to both read and update the database

Host Language

Many DBMSs have a facility for embedding the sublanguage in a high-level programming
language such as C, C#, or Java, which is sometimes referred to as the host language.

24
Data Model
Definition

A Data Model is an integrated collection of concepts for describing and manipulating data,
relationships between data, and constraints on the data in an organization. Usually, it consists
of three components:

• a structural part, consisting of a set of rules according to which databases can be


constructed

• a manipulative part, defining the types of operation that are allowed on the data

• a set of integrity constraints, which ensures that the data is accurate

Categories of data models

• object-based (e.g. Entity-Relationship, Semantic, Object-Oriented)

• record-based (e.g. relational, network, hierarchical)

• physical (e.g. record structures and orderings)

25
Phases of Database Design
Database design is made up of three main phases:

Conceptual Design

The process of constructing a model of the data used in an enterprise, independent of all
physical considerations.

Logical Design

The process of constructing a model of the data used in an enterprise based on a specific
data model, but independent of a particular DBMS and other physical considerations.

Physical Design

The process of producing a description of the implementation of the database on secondary


storage; it describes the base relations, file organizations, and indexes used to achieve
efficient access to the data, and any associated integrity constraints and security measures.

26
Conceptual Data Modeling
To get a precise understanding of the nature of the data and how it is used by the enterprise, we
need a model for communication that is nontechnical and free of ambiguities.

Entity-Relationship (ER) Model

• is the most widely used in the conceptual database design process

• although there is general agreement about what each ER-concept means, there are a
number of different notations that can be used to represent these concepts diagrammatically

• we use the Unified Modeling Language (UML) to visualize the concepts of ER. UML is
currently recognized as the de facto industry standard modeling language for object-oriented
software engineering projects

• we use the UML notation for drawing ER models, but continue to describe the concepts of ER
models using traditional database terminology

27
The ER Model (1)
Entity Type

A group of objects with the same properties, which are


identified by the enterprise as having an independent
existence (physically or conceptually).

Entity Occurence

A uniquely identifiable object of an


entity type. [Connolly & Begg]

Diagrammatic representation of Entity Types

Each entity type is shown as a rectangle, labeled with the name of the entity, which is
normally a singular noun. In UML, the first letter of each word in the entity name is uppercase.

28
The ER Model (2)
Relationship Type

A set of meaningful associations among entity types.

Relationship Occurrence [Connolly & Begg]

A uniquely identifiable association that includes one occurrence from each participating
entity type.

Diagrammatic representation of relationship types

• Each relationship type is shown as a line connecting the associated entity types and labeled
with the name of the relationship.

• A relationship is named using a verb or a short phrase including a verb. The first letter of
each word in the relationship name is shown in uppercase. Whenever possible, a relationship
name should be unique for a given ER model.

• A relationship is only labeled in one direction by placing an arrow symbol beside the name
indicating the correct direction for a reader to interpret the relationship name.

29
The ER Model (3)
Degree of a relationship type

Describes the number of participating entity types in a relationship. A relationship of degree


two is called binary.

Complex Relationships and diagrammatic representation

A relationship with degree higher than two is called complex and represented by a diamond.

The name of the relationship is displayed inside the diamond, and in this case, the directional
arrow normally associated with the name is omitted.

Example:

Ternary Relationship Registers

30
Example - Quarternary
Relationship Arranges

[Connolly & Begg]

31
The ER Model (4)
Recursive Relationship

A relationship type in which the same entity type participates more than once in different
roles.

Role name

Relationships may be given role names to indicate the purpose that each participating entity
type plays in a relationship.

[Connolly & Begg]

32
Role Names
Role names may also be used when two entities are associated through more than one
relationship.

[Connolly & Begg]

33
The ER Model (5)
Attribute

A property of an entity or a relationship type.

Attribute Domain

The set of allowable values for one or more attributes.

Classification of attributes

• simple or composite

• single-valued or multi-valued

• derived

34
Classification of Attributes
Simple Attribute

• composed of a single component with an independent existence

Composite Attribute

• composed of multiple components, each with an independent existence

Single-Valued-Attribute

• holds a single value for each occurrence of an entity type

Multi-Valued Attribute

• holds multiple values for each occurrence of an entity type

Derived Attribute

• represents a value that is derivable from the value of a related attribute or set of attributes, not
necessarily in the same entity type

35
The ER-Model (6)
Candidate Key

• The minimal set of attributes that uniquely identifies each occurrence of an entity type.

Primary Key

• The candidate key that is selected to uniquely identify each occurrence of an entity type.

• set of allowable values for one or more attributes

Composite Key

• a candidate key that consists of two or more attributes

36
The ER Model (7)
Diagrammatic representation of attributes

If an entity type is to be displayed with its attributes, we divide the rectangle representing the
entity in two. The upper part of the rectangle displays the name of the entity and the lower
part lists the names of the attributes.

The first attribute(s) to be listed is the primary key for the entity type (if known).

The name(s) of the primary key attribute(s) can be labeled with the tag {PK}.

In UML, the name of an attribute is displayed with the first letter in lowercase. If the name has
more than one word, the first letter of each subsequent word has to be in uppercase.

Additional tags that can be used include partial primary key {PPK} when an attribute forms
part of a composite primary key, and alternate key {AK}.

37
Example - Attributes

[Connolly & Begg]

38
The ER Model (8)

Strong Entity Type

An entity type that is not


existence-dependent on
some other entity type.

Weak Entity Type

An entity type that is


existence-dependent on
some other entity type.

[Connolly & Begg]

39
The ER Model (9)
Diagrammatic representation of attributes on relationships

Attributes associated with a relationship type are represented using the same symbol as an
entity type.

To distinguish between a relationship with an attribute and an entity, the rectangle


representing the attribute(s) is associated with the relationship using a dashed line.

[Connolly & Begg]

40
The ER Model (10)
Multiplicity

The number (or range) of possible occurrences of an entity type that may relate to a single
occurrence of an associated entity type through a particular relationship.

The most common degree for relationships is binary. Binary relationships are generally
referred to as being

• one-to-one (1:1), e.g. a member of staff manages a branch

• one-to-many (1:*), e.g. a member of staff oversees properties for rent

• many-to-many (*:*), e.g. a newspapers advertise properties for rent

Attention

Not all integrity constraints can be represented in an ER model.

41
1:1 Relationship

[Connolly & Begg]

42
The 1:* Relationship

[Connolly & Begg]

43
*:* Relationship

[Connolly & Begg]

44
The ER Model (11)

Multiplicity (of complex relationships)

The number (or range) of possible


occurrences of an entity type in an
n-ary relationship when the other
(n–1) values are fixed.

[Connolly & Begg]

45
The ER Model (12)

[Connolly & Begg]

46
The ER Model (13)
Cardinality

Describes the maximum


number of possible relationship
occurrences for an entity
participating in a given
relationship type.

Participation

Determines whether all or only


some entity occurrences
participate in a relationship.

[Connolly & Begg]

47

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy