Dimensional Modeling

What is a Data Model?

A data model is a conceptual representation of the data
structures (tables) required for a database. It is a powerful
way to express and communicate business requirements.
A data model helps the functional and technical teams design
the database.
Data modeling tools: Erwin, Oracle Designer, PowerDesigner.
The two phases of data modeling are:
1) Logical modeling
2) Physical modeling

Logical Modeling
Includes entities (tables), attributes (columns/fields) and
relationships (keys).
Uses business names for entities & attributes
Is independent of technology (platform, DBMS)
Is normalized to fourth normal form (4NF)

Physical Modeling
Includes tables, columns, keys, data types, validation rules,
database triggers, stored procedures, domains, and
access constraints
Uses more specific, less generic names for tables and
columns, such as abbreviated column names, constrained by
the database management system (DBMS) and any
company-defined standards
Includes primary keys and indices for fast data access.

Logical vs. Physical

Logical                               Physical
Represents business information       Represents the physical implementation
and defines business rules            of the model in a database
Entity                                Table
Attribute                             Column
Primary Key                           Primary Key Constraint
Alternate Key                         Unique Constraint or Unique Index
Rule                                  Check Constraint, Default Value
Relationship                          Foreign Key
Definition                            Comment
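
The logical-to-physical mapping can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed design: the department and employee tables, their columns, and the try_insert helper are all hypothetical. SQLite has no COMMENT ON statement, so the "Definition -> Comment" row is represented here only by inline SQL comments.

```python
import sqlite3

# Hypothetical tables; each SQL comment names the logical concept being implemented.
DDL = """
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,                 -- primary key -> primary key constraint
    dept_name TEXT NOT NULL UNIQUE                 -- alternate key -> unique constraint
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    email   TEXT NOT NULL UNIQUE,                  -- alternate key -> unique constraint
    salary  REAL NOT NULL CHECK (salary >= 0),     -- rule -> check constraint
    status  TEXT NOT NULL DEFAULT 'ACTIVE',        -- rule -> default value
    dept_id INTEGER NOT NULL
            REFERENCES department (dept_id)        -- relationship -> foreign key
);
"""


def build_physical_model() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn


def try_insert(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this sketch, a duplicate email, a negative salary, or an unknown dept_id is rejected by the database itself rather than by application code.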

What is ER Modeling?
Entity-Relationship (ER) data modeling is used in OLTP systems,
which are transaction oriented.
Focus of OLTP design
Individual data elements
Data relationships

Design goals
Accurately model the business
Remove redundancy (normalize)

ER Modeling Shortcomings:
Complex
Unfamiliar to business people
Incomplete history
Slow query performance

Dimensional Modeling
Definition
A logical data model used to represent the measures and
dimensions that pertain to one or more business subject
areas
Dimensional Model = Star Schema
Can easily be translated into a multi-dimensional database
design if required
Overcomes the ER design shortcomings

DM Advantages:
Understandable
Systematically represents history
Reliable join paths
High-performance queries
Enterprise scalability

ER vs. DM

ER                                    DM
Tables are units of storage           Cubes are units of storage

Data is normalized and used for       Data is denormalized and used in
OLTP                                  data warehouses and data marts

Several tables and chains of          Few tables; fact tables are connected
relationships among them              to dimension tables

Detailed level of transactional       Summary of bulky transactional data
data                                  (aggregates and measures) used in
                                      business decisions

Normal reports                        User-friendly, interactive, drag-and-drop
                                      multidimensional OLAP reports

Dimension tables
A dimension table describes the business entities of an
enterprise, represented as hierarchical, categorical
information such as time, departments, locations, and
products. Dimension tables are sometimes called lookup or
reference tables.
Textual content (character data)

Dimension tables: Characteristics
Hold the dimensional attributes
Usually have a large number of attributes (wide)
Include flags and indicators that make specific types of
reports easier to produce
Have a small number of rows in comparison to fact tables
(most of the time)

Surrogate Key
A unique identifier (primary key) generated by the RDBMS that
is not derived from any data in the database and whose only
significance is to act as the primary key. A surrogate key is
frequently a sequential number.
Each dimension table is assigned a unique primary key, generated
specifically for the data warehouse
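
As a rough sketch of surrogate key assignment, the snippet below lets SQLite generate a sequential key while the source system's identifier is kept as an ordinary column. The customer_dim table, its columns, and the sample identifiers are assumptions made for illustration.

```python
import sqlite3


def load_customers(rows):
    """Load (natural_key, name) pairs and let the database assign surrogate keys."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE customer_dim (
            customer_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
            customer_id  TEXT NOT NULL,                      -- natural key from the source system
            name         TEXT NOT NULL
        )
        """
    )
    conn.executemany(
        "INSERT INTO customer_dim (customer_id, name) VALUES (?, ?)", rows
    )
    conn.commit()
    return conn


conn = load_customers([("CUST-0042", "Ann"), ("CUST-0007", "Bob")])
keys = conn.execute(
    "SELECT customer_key, customer_id FROM customer_dim ORDER BY customer_key"
).fetchall()
print(keys)  # [(1, 'CUST-0042'), (2, 'CUST-0007')]
```

The surrogate key carries no business meaning: it reflects only load order, so fact tables can reference it even if the source system later changes or reuses its own identifiers.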

Dimension tables (contd.)
Example of EMP dimension (figure omitted)

Dimension tables (contd.)
Example of dimension tables:

Time          Model         Dealer
time_key      model_key     dealer_key
year          brand         region
quarter       category      state
month         line          city
date          model         dealer

Slowly Changing Dimensions
Dimension source data may change over time
Relative to fact tables, dimension records change slowly
A slowly changing dimension can hold multiple "profiles" over
time to maintain history
Each profile is a separate record in the dimension table

Slowly Changing Dimensions: Example
Example: A woman gets married
Possible changes to the customer dimension:
1) Last name
2) Marital status
3) Address
4) Household income
Existing facts need to remain associated with her
single profile
New facts need to be associated with her
married profile

Slowly Changing Dimensions: Types
Three types of slowly changing dimensions:
Type 1
Updates the existing record with modifications
Does not maintain history
Type 2
Adds a new record
Maintains the old record
Maintains history
Type 3
Keeps old and new values in the existing row
Requires a design change (an additional column for the previous value)
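
The three types can be sketched on a small in-memory customer dimension. This is only an illustration under stated assumptions: the row layout, the valid_from / valid_to / is_current housekeeping columns, the previous_last_name column and the function names are hypothetical, and a real warehouse would apply the same logic in its ETL tool or in SQL.

```python
from datetime import date


def new_dim():
    """One customer row with hypothetical housekeeping columns."""
    return [{"customer_key": 1, "customer_id": "C1", "last_name": "Smith",
             "valid_from": date(2020, 1, 1), "valid_to": None, "is_current": True}]


def apply_type1(dim, customer_id, last_name):
    """Type 1: overwrite in place; the old value is lost."""
    for row in dim:
        if row["customer_id"] == customer_id:
            row["last_name"] = last_name


def apply_type2(dim, customer_id, last_name, changed_on):
    """Type 2: expire the current row and append a new row with a new surrogate key."""
    current = next(r for r in dim
                   if r["customer_id"] == customer_id and r["is_current"])
    current["is_current"] = False
    current["valid_to"] = changed_on
    new_row = dict(current,
                   customer_key=max(r["customer_key"] for r in dim) + 1,
                   last_name=last_name,
                   valid_from=changed_on, valid_to=None, is_current=True)
    dim.append(new_row)
    return new_row["customer_key"]


def apply_type3(dim, customer_id, last_name):
    """Type 3: keep the prior value in an extra column on the same row."""
    for row in dim:
        if row["customer_id"] == customer_id:
            row["previous_last_name"] = row["last_name"]
            row["last_name"] = last_name
```

With Type 2, facts recorded before the change keep pointing at customer_key 1 (the single profile), while new facts use the key returned by apply_type2 (the married profile).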

Degenerate Dimensions
A degenerate dimension is a dimension that is derived
from the fact table and does not have its own dimension
table.
Stored in the fact table
Common examples include invoice numbers or order
numbers
Use: degenerate dimensions are often motivated by the desire
to provide a direct reference back to a transactional system
without the overhead of maintaining a separate dimension
table.
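
A small sketch of a degenerate dimension, assuming a hypothetical order_line_fact table: the order number lives directly in the fact table, there is no order dimension to join to, and it can still be used for grouping. Table names, columns and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE order_line_fact (
    order_number TEXT NOT NULL,     -- degenerate dimension: no order_dim table exists
    product_key  INTEGER NOT NULL REFERENCES product_dim (product_key),
    quantity     INTEGER NOT NULL,
    amount       REAL NOT NULL
);
INSERT INTO product_dim VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO order_line_fact VALUES
    ('SO-1001', 1, 2, 20.0),
    ('SO-1001', 2, 1, 15.0),
    ('SO-1002', 1, 5, 50.0);
""")


def order_totals(conn):
    """Group fact rows by the degenerate dimension to rebuild each order."""
    return conn.execute(
        "SELECT order_number, COUNT(*), SUM(amount) "
        "FROM order_line_fact GROUP BY order_number ORDER BY order_number"
    ).fetchall()


print(order_totals(conn))  # [('SO-1001', 2, 35.0), ('SO-1002', 1, 50.0)]
```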

Conformed Dimensions
A dimension that has exactly the same meaning and
content when referenced from different fact tables.
Example: Cube-1 contains F1, D1, D2, D3 and Cube-2
contains F2, D1, D2, D4, where F1 and F2 are facts and
D1 to D4 are dimensions. D1 and D2 are the conformed dimensions.
E.g., the Time dimension
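
One practical consequence of a conformed dimension is that results from different fact tables can be lined up on it. The sketch below assumes two hypothetical fact tables, sales_fact and shipment_fact, that share a single time_dim. Each fact is aggregated separately by the shared attributes and the two results are then joined; all names and sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE sales_fact (
    time_key INTEGER REFERENCES time_dim (time_key), revenue REAL);
CREATE TABLE shipment_fact (
    time_key INTEGER REFERENCES time_dim (time_key), units_shipped INTEGER);
INSERT INTO time_dim VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO sales_fact VALUES (1, 100.0), (1, 50.0), (2, 70.0);
INSERT INTO shipment_fact VALUES (1, 8), (2, 3), (2, 4);
""")

# Aggregate each fact table on its own, then join on the conformed attributes.
DRILL_ACROSS = """
WITH s AS (
    SELECT t.year, t.month, SUM(f.revenue) AS revenue
    FROM sales_fact f JOIN time_dim t ON t.time_key = f.time_key
    GROUP BY t.year, t.month
),
h AS (
    SELECT t.year, t.month, SUM(f.units_shipped) AS units
    FROM shipment_fact f JOIN time_dim t ON t.time_key = f.time_key
    GROUP BY t.year, t.month
)
SELECT s.year, s.month, s.revenue, h.units
FROM s JOIN h ON s.year = h.year AND s.month = h.month
ORDER BY s.year, s.month
"""

report = conn.execute(DRILL_ACROSS).fetchall()
print(report)  # [(2024, 1, 150.0, 8), (2024, 2, 70.0, 7)]
```

Aggregating before joining avoids multiplying rows when both fact tables have several rows for the same month.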

Fact table
A fact table consists of the measurements, metrics or facts
of a business process.
Fact tables are often defined by their grain.
Grain
The level of detail represented by a row in the fact table
Must be identified early

Example of Fact table


Sales Facts
model_key
dealer_key
time_key
revenue
quantity
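
The Sales Facts table and the Time, Model and Dealer dimensions can be sketched in SQLite as follows. The column names come from the example tables; the table names (time_dim, model_dim, dealer_dim, sales_facts), the data types, the grain of one row per model, dealer and day, and every sample row are assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE time_dim   (time_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER,
                         month INTEGER, date TEXT);
CREATE TABLE model_dim  (model_key INTEGER PRIMARY KEY, brand TEXT, category TEXT,
                         line TEXT, model TEXT);
CREATE TABLE dealer_dim (dealer_key INTEGER PRIMARY KEY, region TEXT, state TEXT,
                         city TEXT, dealer TEXT);
CREATE TABLE sales_facts (
    model_key  INTEGER NOT NULL REFERENCES model_dim (model_key),
    dealer_key INTEGER NOT NULL REFERENCES dealer_dim (dealer_key),
    time_key   INTEGER NOT NULL REFERENCES time_dim (time_key),
    revenue    REAL NOT NULL,
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (model_key, dealer_key, time_key)  -- assumed grain: model, dealer, day
);
INSERT INTO time_dim VALUES (1, 2024, 1, 1, '2024-01-15'), (2, 2024, 2, 4, '2024-04-10');
INSERT INTO model_dim VALUES (1, 'Acme', 'SUV', 'Trail', 'T1'),
                             (2, 'Acme', 'Sedan', 'City', 'C1');
INSERT INTO dealer_dim VALUES (1, 'West', 'CA', 'Fresno', 'Valley Motors'),
                              (2, 'East', 'NY', 'Albany', 'Capital Cars');
INSERT INTO sales_facts VALUES
    (1, 1, 1, 30000.0, 1),
    (2, 1, 1, 40000.0, 2),
    (1, 2, 2, 60000.0, 2),
    (2, 2, 1, 20000.0, 1);
"""

# A typical star query: join the fact table to the dimensions it needs,
# group by dimension attributes, and aggregate the measures.
REVENUE_BY_REGION_AND_QUARTER = """
SELECT d.region, t.quarter, SUM(f.revenue), SUM(f.quantity)
FROM sales_facts f
JOIN dealer_dim d ON d.dealer_key = f.dealer_key
JOIN time_dim   t ON t.time_key   = f.time_key
GROUP BY d.region, t.quarter
ORDER BY d.region, t.quarter
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
result = conn.execute(REVENUE_BY_REGION_AND_QUARTER).fetchall()
print(result)
# [('East', 1, 20000.0, 1), ('East', 2, 60000.0, 2), ('West', 1, 70000.0, 3)]
```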

Facts
Fully additive
Can be summed across any and all dimensions
Stored in the fact table
Examples: revenue, quantity, sales_amount

Semi-additive
Semi-additive facts are facts that can be
summed up for some of the dimensions in the
fact table, but not the others.
Example: an account balance can be summed across
accounts, but not across time periods.

Non-additive
Non-additive facts are facts that cannot be
summed up for any of the dimensions present in
the fact table.
All ratios are non-additive
Examples: age, weather
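
The difference between the three kinds of fact can be shown with a few lines of arithmetic. The snapshot and sales rows below are invented for illustration, and the function names are hypothetical.

```python
# Hypothetical daily snapshot rows: (day, account, closing_balance, deposits)
SNAPSHOTS = [
    ("Mon", "A", 100.0, 100.0),
    ("Mon", "B", 50.0, 50.0),
    ("Tue", "A", 120.0, 20.0),
    ("Tue", "B", 50.0, 0.0),
]
# Hypothetical sales rows: (revenue, cost)
SALES = [(100.0, 80.0), (300.0, 150.0)]


def total_deposits(rows):
    """Fully additive: deposits can be summed over every day and account."""
    return sum(deposits for _, _, _, deposits in rows)


def balance_on(rows, day):
    """Semi-additive: balances sum across accounts, but only within one day."""
    return sum(balance for d, _, balance, _ in rows if d == day)


def overall_margin(sales):
    """Non-additive: recompute the ratio from its additive parts."""
    revenue = sum(r for r, _ in sales)
    cost = sum(c for _, c in sales)
    return (revenue - cost) / revenue


print(total_deposits(SNAPSHOTS))     # 170.0
print(balance_on(SNAPSHOTS, "Tue"))  # 170.0
print(overall_margin(SALES))         # 0.425
```

Summing the balance column over both days would give 320.0, which is not the balance at any point in time, and adding the two per-row margins (0.2 and 0.5) would give 0.7 rather than the true overall margin of 0.425. A common approach is therefore to store the additive components (revenue and cost) and compute ratios at query time.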

Schemas in Data Warehouses


A schema is a collection of database objects, including
tables, views, indexes, and synonyms.
There is a variety of ways of arranging schema objects
in the schema models designed for data warehousing.
- Star schema
- Snowflake schema

STAR Schema
The star schema (also called star-join schema or multi-dimensional
schema) is the simplest style of data warehouse schema. The star
schema consists of one or more fact tables referencing any number
of dimension tables.
The main advantages of star schemas are that they:
- Provide highly optimized performance for typical star queries.
- Are widely supported by a large number of business intelligence tools.

Snowflake Schema
The snowflake schema is similar to the star schema. However, in the
snowflake schema, dimensions are normalized into multiple related
tables, whereas the star schema's dimensions are denormalized with
each dimension represented by a single table.
Advantages of using the snowflake schema:
- Is easier to maintain.
- Increases flexibility.
Disadvantages of using the snowflake schema:
- Increases the number of tables an end user must work with.
- Makes queries more difficult to create because more tables
need to be joined.
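
The extra join can be seen by answering the same question, revenue by brand, against both layouts. In this sketch the model dimension is kept once as a single denormalized table (star) and once split into model and brand tables (snowflake). All table names, columns and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: one denormalized table per dimension.
CREATE TABLE model_star (model_key INTEGER PRIMARY KEY, model TEXT, brand TEXT);
-- Snowflake: the brand attribute is normalized into its own table.
CREATE TABLE brand_snow (brand_key INTEGER PRIMARY KEY, brand TEXT);
CREATE TABLE model_snow (model_key INTEGER PRIMARY KEY, model TEXT,
                         brand_key INTEGER REFERENCES brand_snow (brand_key));
CREATE TABLE sales_facts (model_key INTEGER, revenue REAL);

INSERT INTO model_star VALUES (1, 'T1', 'Acme'), (2, 'C1', 'Acme'), (3, 'Z9', 'Zeta');
INSERT INTO brand_snow VALUES (1, 'Acme'), (2, 'Zeta');
INSERT INTO model_snow VALUES (1, 'T1', 1), (2, 'C1', 1), (3, 'Z9', 2);
INSERT INTO sales_facts VALUES (1, 10.0), (2, 20.0), (3, 5.0);
""")

# Star: one join from the fact table to the dimension.
STAR_QUERY = """
SELECT m.brand, SUM(f.revenue)
FROM sales_facts f
JOIN model_star m ON m.model_key = f.model_key
GROUP BY m.brand ORDER BY m.brand
"""

# Snowflake: a second join is needed to reach the brand name.
SNOWFLAKE_QUERY = """
SELECT b.brand, SUM(f.revenue)
FROM sales_facts f
JOIN model_snow m ON m.model_key = f.model_key
JOIN brand_snow b ON b.brand_key = m.brand_key
GROUP BY b.brand ORDER BY b.brand
"""

star = conn.execute(STAR_QUERY).fetchall()
snow = conn.execute(SNOWFLAKE_QUERY).fetchall()
print(star)  # [('Acme', 30.0), ('Zeta', 5.0)]
print(snow)  # [('Acme', 30.0), ('Zeta', 5.0)]
```

Both layouts return the same answer. The snowflake version stores each brand name once, which is the maintenance advantage, at the cost of one more table and one more join per query.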

Designing a Star Schema

Five initial design steps
Based on Kimball's six steps
Start designing in order
Revisit and adjust over the project's life

Step One
1. Identify the fact table
Start by naming the fact table with the
name of the business subject area

Step Two
2. Identify the fact table grain
Describe what a row in the fact table
represents, in business terms

Step Three
3. Identify dimensions

Step Four
4. Select facts

Step Five
5. Identify dimensional attributes
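
One way to keep the five decisions together is a small design record that can report unfinished steps and emit draft table definitions. This is only a sketch: the StarSchemaDesign class, its field names, the _dim / _key naming convention and the TEXT / REAL column types are all assumptions, and the sample values reuse the sales example.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class StarSchemaDesign:
    fact_table: str                                   # step 1
    grain: str                                        # step 2, in business terms
    dimensions: dict = field(default_factory=dict)    # steps 3 and 5: name -> attributes
    facts: list = field(default_factory=list)         # step 4

    def missing_steps(self):
        """List the design steps that have not been filled in yet."""
        missing = []
        if not self.fact_table:
            missing.append("1. fact table")
        if not self.grain:
            missing.append("2. grain")
        if not self.dimensions:
            missing.append("3. dimensions")
        if not self.facts:
            missing.append("4. facts")
        if any(not attrs for attrs in self.dimensions.values()):
            missing.append("5. dimensional attributes")
        return missing

    def to_ddl(self):
        """Draft CREATE TABLE statements: one per dimension, then the fact table."""
        stmts = []
        for dim, attrs in self.dimensions.items():
            cols = [f"{dim}_key INTEGER PRIMARY KEY"] + [f"{a} TEXT" for a in attrs]
            stmts.append(f"CREATE TABLE {dim}_dim ({', '.join(cols)})")
        cols = [f"{dim}_key INTEGER REFERENCES {dim}_dim ({dim}_key)"
                for dim in self.dimensions]
        cols += [f"{fact} REAL" for fact in self.facts]
        stmts.append(f"CREATE TABLE {self.fact_table} ({', '.join(cols)})")
        return stmts


design = StarSchemaDesign(
    fact_table="sales_facts",
    grain="one row per model, dealer and day",
    dimensions={
        "time": ["year", "quarter", "month", "date"],
        "model": ["brand", "category", "line", "model"],
        "dealer": ["region", "state", "city", "dealer"],
    },
    facts=["revenue", "quantity"],
)

conn = sqlite3.connect(":memory:")
for stmt in design.to_ddl():
    conn.execute(stmt)
```

The grain is stored as plain text because it is a business statement rather than something the database can check directly. It still determines which dimensions and facts belong in the table.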
