Chapter 1


OLTP and OLAP

DATABASE DESIGN

Lis Sulmont
Curriculum Manager
Our motivating question:

How should we organize and manage data?

Schemas: How should my data be logically organized?

Normalization: Should my data have minimal dependency and redundancy?

Views: What joins will be done most often?

Access control: Should all users of the data have the same level of access?

DBMS: How do I pick between all the SQL and NoSQL options?

and more!

It depends on the intended use of the data.

Approaches to processing data

OLTP: Online Transaction Processing
OLAP: Online Analytical Processing

Some concrete examples

OLTP tasks
Find the price of a book
Update latest customer transaction
Keep track of employee hours

OLAP tasks
Calculate books with best profit margin
Find most loyal customers
Decide employee of the month

OLAP vs. OLTP

          OLTP                                    OLAP
Purpose   support daily transactions              report and analyze data
Design    application-oriented                    subject-oriented
Data      up-to-date, operational                 consolidated, historical
Size      snapshot, gigabytes                     archive, terabytes
Queries   simple transactions & frequent updates  complex, aggregate queries & limited updates
Users     thousands                               hundreds
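
To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The books table, its columns, and its rows are hypothetical and exist only for illustration. The first query is OLTP-style (touch one row, update it), the second is OLAP-style (scan all rows and rank them by a derived measure).

```python
import sqlite3

# Hypothetical bookstore table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (
        book_id INTEGER PRIMARY KEY,
        title   TEXT NOT NULL,
        price   REAL NOT NULL,
        cost    REAL NOT NULL
    );
    INSERT INTO books VALUES
        (1, 'SQL Basics',   30.0, 12.0),
        (2, 'Data Lakes',   45.0, 36.0),
        (3, 'Star Schemas', 20.0,  5.0);
""")

# OLTP-style: look up the price of one book, then apply a small update.
price = conn.execute(
    "SELECT price FROM books WHERE book_id = ?", (1,)
).fetchone()[0]
conn.execute("UPDATE books SET price = ? WHERE book_id = ?", (32.0, 1))
conn.commit()

# OLAP-style: scan every row and rank the books by profit margin.
by_margin = conn.execute("""
    SELECT title, (price - cost) / price AS margin
    FROM books
    ORDER BY margin DESC
""").fetchall()
best_title = by_margin[0][0]
```

In a real system the two workloads would usually run against different stores; a single in-memory database is used here only to keep the sketch self-contained.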

Working together

Takeaways
Step back and figure out business requirements

Difference between OLAP and OLTP

OLAP? OLTP? Or something else?

Let's practice!
Storing data

Structuring data
1. Structured data

Follows a schema

Defined data types & relationships

e.g., SQL, tables in a relational database

2. Unstructured data

Schemaless

Makes up most of data in the world

e.g., photos, chat logs, MP3

3. Semi-structured data

Does not follow larger schema

Self-describing structure

e.g., NoSQL, XML, JSON

# Example of a JSON file

"user": {
    "profile_use_background_image": true,
    "statuses_count": 31,
    "profile_background_color": "C0DEED",
    "followers_count": 3066,
    ...
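
As a small illustration of what "self-describing" means, the sketch below parses a JSON document with Python's standard json module and recovers field names and value types from the data itself, with no schema declared up front. The four fields are the ones shown in the slide fragment; the enclosing braces are added here as an assumption so that the truncated fragment parses.

```python
import json

# Only the four fields visible in the slide are used. The outer and closing
# braces are an assumption added so the truncated fragment is valid JSON.
raw = """
{
  "user": {
    "profile_use_background_image": true,
    "statuses_count": 31,
    "profile_background_color": "C0DEED",
    "followers_count": 3066
  }
}
"""

doc = json.loads(raw)
user = doc["user"]

# Self-describing: the field names and value types travel with the data.
field_types = {name: type(value).__name__ for name, value in user.items()}
```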

Structuring data

1 Flower by Sam Oth and Database Diagram by Nick Jenkins via Wikimedia Commons
https://commons.wikimedia.org/wiki/File:Languages_xml.png

Storing data beyond traditional databases
Traditional databases
For storing real-time relational structured data → OLTP

Data warehouses
For analyzing archived structured data → OLAP

Data lakes
For storing data of all structures = flexibility and scalability

For analyzing big data

Data warehouses
Optimized for analytics - OLAP
Organized for reading/aggregating data

Usually read-only

Contains data from multiple sources

Massively Parallel Processing (MPP)

Typically uses a denormalized schema and dimensional modeling

Data marts

Subset of data warehouses

Dedicated to a specific topic

Data lakes
Store all types of data at a lower cost:
e.g., raw, operational databases, IoT device logs, real-time, relational and non-relational

Retains all data and can take up petabytes

Schema-on-read as opposed to schema-on-write

Need to catalog data; otherwise it becomes a data swamp

Run big data analytics using services such as Apache Spark and Hadoop
Useful for deep learning and data discovery because these activities require so much data
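
The difference between schema-on-write and schema-on-read can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular warehouse or lake product works: the sensor records, the readings table, and the read_temperatures helper are all hypothetical, and an in-memory SQLite table and a plain Python list stand in for the warehouse and the lake.

```python
import json
import sqlite3

# Hypothetical raw device records; the second one has a malformed reading.
records = [
    '{"device": "sensor-1", "temp_c": 21.5}',
    '{"device": "sensor-2", "temp_c": "n/a", "firmware": "1.2"}',
]

# Schema-on-write: the table definition is enforced when data is inserted.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE readings ("
    " device TEXT NOT NULL,"
    " temp_c REAL NOT NULL CHECK (typeof(temp_c) = 'real'))"
)
rejected = 0
for line in records:
    row = json.loads(line)
    try:
        warehouse.execute(
            "INSERT INTO readings (device, temp_c) VALUES (?, ?)",
            (row["device"], row["temp_c"]),
        )
    except sqlite3.IntegrityError:
        rejected += 1  # the malformed record never reaches the table

# Schema-on-read: the lake keeps every raw record as-is; structure is
# applied only when a reader asks a specific question.
lake = list(records)

def read_temperatures(raw_lines):
    """Apply a reader-side schema: keep rows with a numeric temp_c."""
    out = []
    for line in raw_lines:
        row = json.loads(line)
        if isinstance(row.get("temp_c"), (int, float)):
            out.append((row["device"], float(row["temp_c"])))
    return out

temps = read_temperatures(lake)
```

Note that the lake still holds the malformed record and the extra firmware field, which a later reader with a different schema could use.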

ETL: Extract, Transform, Load

ELT: Extract, Load, Transform
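
The two acronyms differ only in where the transform step happens. The sketch below shows the ordering for ETL (extract, transform, load) and ELT (extract, load, transform) with Python's sqlite3 module. The source rows, table names, and the etl and elt functions are hypothetical, and an in-memory database stands in for the target store.

```python
import sqlite3

# Hypothetical raw export: amounts arrive as text.
source_rows = [("2024-01-05", "19.99"), ("2024-01-06", "5.01")]

def etl(rows):
    """ETL: transform in the pipeline, then load only the cleaned result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (day TEXT, amount REAL)")
    cleaned = [(day, float(amount)) for day, amount in rows]  # transform
    conn.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)  # load
    return conn

def elt(rows):
    """ELT: load the raw data first, transform later inside the store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_sales (day TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)  # load
    conn.execute(
        "CREATE TABLE sales AS "
        "SELECT day, CAST(amount AS REAL) AS amount FROM raw_sales"
    )  # transform
    return conn

etl_conn = etl(source_rows)
elt_conn = elt(source_rows)
etl_total = etl_conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
elt_total = elt_conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
```

Both paths end with the same sales table; the ELT path additionally keeps the untouched raw_sales table, so a different transformation can be applied later.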

Let's practice!
Database design

What is database design?
Determines how data is logically stored
How is data going to be read and updated?

Uses database models: high-level specifications for database structure

Most popular: relational model

Some other options: NoSQL models, object-oriented model, network model

Uses schemas: blueprint of the database

Defines tables, fields, relationships, indexes, and views

When inserting data in relational databases, schemas must be respected
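
A short sketch of what "schemas must be respected" looks like in practice, using Python's sqlite3 module. The authors and books tables and the try_insert helper are hypothetical. Rows that violate the declared schema (a missing required field, a reference to a row that does not exist) are rejected at insert time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE authors (
        author_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE books (
        book_id   INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(author_id)
    );
""")
conn.execute("INSERT INTO authors VALUES (1, 'Hypothetical Author')")
conn.execute("INSERT INTO books VALUES (1, 'Valid Book', 1)")

def try_insert(sql, params):
    """Return None on success, or the constraint error message."""
    try:
        conn.execute(sql, params)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

# Violates NOT NULL on books.title.
missing_title = try_insert("INSERT INTO books VALUES (?, ?, ?)", (2, None, 1))
# Violates the foreign key: there is no author with id 99.
unknown_author = try_insert("INSERT INTO books VALUES (?, ?, ?)", (3, "Orphan", 99))

book_count = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
```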

Data modeling
Process of creating a data model for the data to be stored

1. Conceptual data model: describes entities, relationships, and attributes

Tools: data structure diagrams, e.g., entity-relationship diagrams and UML diagrams

2. Logical data model: defines tables, columns, relationships

Tools: database models and schemas, e.g., relational model and star schema

3. Physical data model: describes physical storage

Tools: partitions, CPUs, indexes, backup systems and tablespaces

1 https://en.wikipedia.org/wiki/Data_model

Conceptual - ER diagram
Entities, relationships, and attributes

Logical - schema
Fastest conversion: entities become the tables
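
The "entities become the tables" conversion can be sketched mechanically. In the example below, the author and book entities, the single one-to-many relationship, and the to_ddl function are all hypothetical; every non-key attribute is typed as TEXT purely to keep the sketch short. It is one simple way to do the conversion, not a general-purpose mapping.

```python
import sqlite3

# Hypothetical conceptual model: entity name -> attributes (first one is the key).
entities = {
    "author": ["author_id", "name"],
    "book": ["book_id", "title"],
}
# Hypothetical one-to-many relationship: each book is written by one author.
relationships = [("book", "author")]  # (child, parent)

def to_ddl(entities, relationships):
    """Each entity becomes a table; each relationship becomes a foreign key."""
    statements = []
    for name, attrs in entities.items():
        cols = [f"{attrs[0]} INTEGER PRIMARY KEY"]
        cols += [f"{attr} TEXT" for attr in attrs[1:]]
        for child, parent in relationships:
            if child == name:
                key = entities[parent][0]
                cols.append(f"{key} INTEGER REFERENCES {parent}({key})")
        statements.append(f"CREATE TABLE {name} ({', '.join(cols)})")
    return statements

ddl = to_ddl(entities, relationships)

# Run the generated statements to confirm they form a valid logical schema.
conn = sqlite3.connect(":memory:")
for statement in ddl:
    conn.execute(statement)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]
```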

Other database design options

Determining tables

Beyond the relational model
Dimensional modeling
Adaptation of the relational model for data warehouse design

Optimized for OLAP queries: aggregate data, not updating (OLTP)

Built using the star schema

Easy to interpret and extend schema

Elements of dimensional modeling
Fact tables

Decided by business use-case

Holds records of a metric

Changes regularly

Connects to dimensions via foreign keys

Dimension tables

Holds descriptions of attributes

Does not change as often

Organize by:

What is being analyzed?

How often do entities change?
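
A minimal star-schema sketch using Python's sqlite3 module. The fact_sales, dim_book, and dim_store tables and their rows are hypothetical. The fact table holds the metric (amount) and foreign keys; the dimension tables hold the descriptive attributes that an OLAP query groups by.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes that change rarely.
    CREATE TABLE dim_book  (book_id  INTEGER PRIMARY KEY, title TEXT, genre TEXT);
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city  TEXT);

    -- Fact table: one row per sale, holding the metric and foreign keys.
    CREATE TABLE fact_sales (
        sale_id  INTEGER PRIMARY KEY,
        book_id  INTEGER REFERENCES dim_book(book_id),
        store_id INTEGER REFERENCES dim_store(store_id),
        amount   REAL
    );

    INSERT INTO dim_book  VALUES (1, 'SQL Basics', 'Tech'), (2, 'A Novel', 'Fiction');
    INSERT INTO dim_store VALUES (1, 'Lisbon'), (2, 'Oslo');
    INSERT INTO fact_sales VALUES
        (1, 1, 1, 30.0), (2, 1, 2, 30.0), (3, 2, 1, 15.0);
""")

# OLAP-style query: aggregate the metric, grouped by a dimension attribute.
sales_by_genre = conn.execute("""
    SELECT b.genre, SUM(f.amount) AS total
    FROM fact_sales AS f
    JOIN dim_book AS b ON b.book_id = f.book_id
    GROUP BY b.genre
    ORDER BY total DESC
""").fetchall()
```

Grouping by city instead only requires joining dim_store, which is part of why the star layout is easy to interpret and extend.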

Let's practice!