What Are The Dimensions in Data Warehouse
A dimension table holds the attributes that describe the facts. Dimensions store the textual descriptions of
business entities. Without dimensions, the facts cannot be interpreted; they are just unordered
numbers. In a business, Customer, Product, and Buyer information can each be a separate dimension.
Conformed Dimensions
Junk Dimensions
Role-playing Dimensions
Slowly Changing Dimensions
Degenerate Dimensions
Conformed Dimensions
A dimension that is used in multiple places is called a conformed dimension. A conformed dimension may
be used with multiple fact tables in a single database, or across multiple data marts or data warehouses.
For example, Customer and Product dimensions are conformed dimensions when they are connected to the
Shipment fact table, the Sales Order fact table, and the Service Request fact table.
Junk Dimensions
A junk dimension is a collection of miscellaneous transaction codes, flags, and/or text attributes that are
unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient
place to store these attributes.
For example, assume that we have a gender dimension and a marital status dimension. The fact table would need
to maintain two keys referring to these dimensions. Instead, create a junk dimension that holds all the
combinations of gender and marital status (cross join the gender and marital status tables to create the junk
table). The fact table then needs to maintain only one key.
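The cross join described above can be sketched in a few lines of Python. This is only an illustration: the
attribute values, the junk_key column name, and the sample fact row are assumptions made for the example, not
part of any particular warehouse.

```python
from itertools import product

# Hypothetical attribute values; a real warehouse would read these from source data.
genders = ["Male", "Female"]
marital_statuses = ["Single", "Married", "Divorced"]

# Cross join: one junk-dimension row per combination, each with its own surrogate key.
junk_dimension = [
    {"junk_key": key, "gender": gender, "marital_status": status}
    for key, (gender, status) in enumerate(product(genders, marital_statuses), start=1)
]

# Lookup used while loading facts: (gender, marital_status) -> junk_key.
junk_key_lookup = {
    (row["gender"], row["marital_status"]): row["junk_key"] for row in junk_dimension
}

# The fact row carries a single junk key instead of two separate dimension keys.
fact_row = {
    "customer_id": 42,  # hypothetical value
    "junk_key": junk_key_lookup[("Female", "Married")],
    "amount": 120.0,
}
```

With two genders and three marital statuses the junk table has six rows, so it stays small even though it
covers every combination.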
Role-playing Dimensions
Role-playing dimensions are dimensions that are used for multiple purposes within the same
database. Here the same dimension key is associated with more than one foreign key in the fact table,
each serving a different purpose.
For example, in a Date dimension, [FullDateAlternateKey] is associated with [Orderdate key], [Duedate key], and
[Shipdate key] in the fact table, each serving a different purpose in the data warehouse.
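As a rough sketch of the idea, the Python below resolves three foreign keys of one fact row against a single
date dimension. The integer YYYYMMDD keys, the column names, and the resolve_roles helper are all assumptions
made for illustration.

```python
from datetime import date

# One date dimension, keyed by a hypothetical integer surrogate key (YYYYMMDD).
date_dimension = {
    20240105: {"full_date": date(2024, 1, 5), "month_name": "January"},
    20240112: {"full_date": date(2024, 1, 12), "month_name": "January"},
    20240201: {"full_date": date(2024, 2, 1), "month_name": "February"},
}

# One fact row with three foreign keys that all point at the same dimension.
sales_fact = {
    "order_date_key": 20240105,
    "ship_date_key": 20240112,
    "due_date_key": 20240201,
    "amount": 250.0,
}

def resolve_roles(fact, dimension):
    """Look up the same dimension once per role (order, ship, due)."""
    roles = ("order", "ship", "due")
    return {role: dimension[fact[f"{role}_date_key"]]["full_date"] for role in roles}

resolved = resolve_roles(sales_fact, date_dimension)
```

Only one physical date table exists; each role is just a different foreign key into it.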
Slowly Changing Dimensions
This is a widely used dimension type. It covers dimensions whose attribute values change over time. There
are various types of Slowly Changing Dimensions (SCD), based on how the business manages these changes.
Types of SCD
TYPE 0: Attribute values are never changed. This type is rarely
used. For example, Employee birth date.
TYPE 1: The old value of the attribute is overwritten by the new value, and no history is kept.
For example, Customer City, where the company has decided to show only the current city.
In this case the previous city name, London, is replaced by the new city name, Edinburgh.
TYPE 2: Historical data is tracked by creating multiple records for a given natural key (business
key) in the dimension table, each with a separate surrogate key and/or a different version number. Unlimited
history is preserved, because every change inserts a new row.
For example, for Customer City, if the company decides to keep historical data, we add an extra row for each
change, with start and end date columns that identify whether the row holds the current or a historical value.
TYPE 3: Changes are tracked using separate columns, which preserves limited history. The history is limited
by how many columns we are willing to add to the dimension table.
For example, Customer City, where new columns Previous City and Current City are added.
TYPE 4: All or some of the historical data is kept in a separate table, and the current data stays in the main
dimension table. Both the historical and the current dimension tables are joined to the fact table using the
same surrogate key, which can improve query performance. This type is used very rarely.
For example, we create a historical table that stores the previous and current Customer City with a
created date, and keep only the current Customer City in the current dimension table.
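The difference between Types 1, 2, and 3 is easiest to see side by side. The sketch below applies the same
London to Edinburgh change in each style. The column names (customer_key, customer_id, start_date, end_date,
previous_city, current_city), the customer id C001, and the dates are assumptions chosen for illustration.

```python
from datetime import date

def apply_type1(row, new_city):
    """Type 1: overwrite in place; no history is kept."""
    row["city"] = new_city

def apply_type2(rows, customer_id, new_city, change_date):
    """Type 2: close the current row and append a new version."""
    current = next(
        r for r in rows if r["customer_id"] == customer_id and r["end_date"] is None
    )
    current["end_date"] = change_date
    rows.append({
        "customer_key": max(r["customer_key"] for r in rows) + 1,  # new surrogate key
        "customer_id": customer_id,  # natural key stays the same
        "city": new_city,
        "start_date": change_date,
        "end_date": None,  # None marks the current version
    })

def apply_type3(row, new_city):
    """Type 3: keep exactly one prior value in a dedicated column."""
    row["previous_city"] = row["current_city"]
    row["current_city"] = new_city

# Type 1: one row, London is gone after the update.
type1_row = {"customer_key": 1, "customer_id": "C001", "city": "London"}
apply_type1(type1_row, "Edinburgh")

# Type 2: two rows, the London row is closed and an Edinburgh row is opened.
type2_rows = [{"customer_key": 1, "customer_id": "C001", "city": "London",
               "start_date": date(2020, 1, 1), "end_date": None}]
apply_type2(type2_rows, "C001", "Edinburgh", date(2023, 6, 1))

# Type 3: one row, London moves to the previous_city column.
type3_row = {"customer_key": 1, "customer_id": "C001",
             "current_city": "London", "previous_city": None}
apply_type3(type3_row, "Edinburgh")
```

A second city change under Type 3 would push London out of the row entirely, which is the "limited history"
trade-off, while Type 2 would simply add a third row.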
Degenerate Dimensions
A degenerate dimension is a dimension that is derived from the fact table and does not have its own
dimension table.
In a data warehouse, this kind of dimension is often used to provide drill-through capability, so that a
report can show how an aggregated number was arrived at.
For example, the invoice number can be stored in the fact table and then used as a separate dimension for
drill-through, to find out which invoices make up the total buying cost in a report.
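A minimal sketch of that drill-through, assuming a hypothetical purchase fact table with an invoice_no column
and made-up cost values:

```python
from collections import defaultdict

# Hypothetical fact rows; invoice_no lives in the fact table with no dimension table behind it.
purchase_fact = [
    {"invoice_no": "INV-1001", "product_key": 1, "cost": 40.0},
    {"invoice_no": "INV-1001", "product_key": 2, "cost": 60.0},
    {"invoice_no": "INV-1002", "product_key": 1, "cost": 25.0},
]

# The aggregated number shown in the report.
total_cost = sum(row["cost"] for row in purchase_fact)

# Drill-through: break the total down by the degenerate dimension.
cost_by_invoice = defaultdict(float)
for row in purchase_fact:
    cost_by_invoice[row["invoice_no"]] += row["cost"]
```

Grouping by invoice_no needs no join, because the attribute is already on every fact row.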
So dimensions are one of the pillars of the data warehouse. Choosing the right one can define the future of
the data warehouse, so it is always good to use the right type.
I hope this post gives some insight and information about data warehouse design.
Additive:
Additive facts are facts that can be summed up across all of the dimensions in the fact table. A sales
amount is a good example of an additive fact.
Semi-Additive:
Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not
the others.
E.g.: A daily balance fact can be summed up across the customer dimension but not across the time
dimension.
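The sketch below illustrates this with made-up balances. Summing across customers for one day gives a
meaningful total, while across time the usual alternative (an assumption here, not something the text
prescribes) is to take the latest balance rather than a sum.

```python
# Hypothetical daily balance fact rows.
daily_balance_fact = [
    {"customer": "A", "day": 1, "balance": 100.0},
    {"customer": "B", "day": 1, "balance": 50.0},
    {"customer": "A", "day": 2, "balance": 120.0},
    {"customer": "B", "day": 2, "balance": 30.0},
]

def total_balance_on(day, rows):
    """Summing across customers for a single day is meaningful."""
    return sum(r["balance"] for r in rows if r["day"] == day)

def closing_balance(customer, rows):
    """Across time, take the latest balance instead of summing."""
    latest = max((r for r in rows if r["customer"] == customer), key=lambda r: r["day"])
    return latest["balance"]

day1_total = total_balance_on(1, daily_balance_fact)
customer_a_closing = closing_balance("A", daily_balance_fact)
```

Adding customer A's balances for day 1 and day 2 would give 220, a figure that corresponds to no real amount
of money.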
Non-Additive:
Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
E.g.: Facts that store calculated percentages or ratios.
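A small worked example, using made-up revenue and profit figures, shows why a stored ratio cannot simply be
combined across rows. A common workaround, assumed here, is to keep the additive components and compute the
ratio after aggregating.

```python
# Hypothetical fact rows; margin would be profit / revenue for each row.
sales_fact = [
    {"region": "North", "revenue": 200.0, "profit": 20.0},  # margin 10%
    {"region": "South", "revenue": 100.0, "profit": 40.0},  # margin 40%
]

# Misleading: averaging the per-row percentages ignores how much revenue each row has.
naive_margin = sum(r["profit"] / r["revenue"] for r in sales_fact) / len(sales_fact)

# Safer: sum the additive components first, then compute the ratio once.
overall_margin = (
    sum(r["profit"] for r in sales_fact) / sum(r["revenue"] for r in sales_fact)
)
```

Here the averaged percentages give 25%, while the margin on the combined figures is 60 / 300 = 20%.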
In the real world, it is possible to have a fact table that contains no measures or facts. These tables are
called "Factless Fact tables".
E.g.: A fact table that has only a product key and a date key is a factless fact table. There are no measures
in this table, but you can still get the number of products sold over a period of time by counting rows.
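Counting rows in such a table might look like the sketch below. The integer YYYYMMDD date keys, the sample
rows, and the count_sold helper are assumptions for illustration.

```python
# Hypothetical factless fact table: only keys, no measures.
factless_sales = [
    {"product_key": 1, "date_key": 20240105},
    {"product_key": 2, "date_key": 20240105},
    {"product_key": 1, "date_key": 20240220},
]

def count_sold(rows, start_key, end_key):
    """Count rows whose date key falls inside the period (inclusive)."""
    return sum(1 for r in rows if start_key <= r["date_key"] <= end_key)

january_count = count_sold(factless_sales, 20240101, 20240131)
```

Each row records that an event happened, so the row count itself plays the role of the measure.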
Fact tables that contain aggregated facts are often called summary tables.
Star Schema:
A star schema is one in which a central fact table is surrounded by denormalized dimension tables.
A star schema can be simple or complex. A simple star schema consists of one fact table, whereas a
complex star schema has more than one fact table.
Snowflake Schema:
A snowflake schema extends the star schema by normalizing dimensions into additional dimension tables.
Snowflake schemas are useful when there are low-cardinality attributes in the dimensions.
Galaxy Schema:
Galaxy schema contains many fact tables with some common dimensions (conformed dimensions). This
schema is a combination of many data marts.
In a constellation schema, the dimensions are segregated into independent dimensions based on the levels of
hierarchy. For example, if geography has five levels of hierarchy, such as territory, region, country, state,
and city, a constellation schema would have five dimensions instead of one.