DWM Exp1 C49
PART A
(PART A: TO BE REFERRED BY STUDENTS)
Experiment No.01
A.1 Aim:
Write a detailed problem statement and create a dimensional model (star
and snowflake schemas) for it.
A.2 Prerequisite:
DBMS concept and ER diagram.
A.3 Outcome:
After successful completion of this experiment, students will be able
to describe the need for and design of a data warehouse.
A.4 Theory:
Example:(Sales)
Data Warehouse & Mining Lab
A.5 Algorithm:
Students need to select a problem statement and create a star schema
and a snowflake schema for the selected problem statement.
Star schema:
Snowflake schema:
PART B
(PART B: TO BE COMPLETED BY STUDENTS)
Input:
SQL commands/script that satisfy the two different outcomes mentioned in
the problem statement.
Output:
1. Dimension tables created after running the above SQL commands.
2. The output that satisfies the two different outcomes mentioned in the problem
statement.
B.4 Conclusion:
(Students must write the conclusion as per the attainment of individual
outcome listed above and learning/observation noted in section B.3)
Using the data warehouse, we can manage data from the Sales Management
Information System for sales of computers and extract useful information from it.
Thus, we implemented the star schema for sales management and learned the data
warehouse concept.
B.5 Question of Curiosity
(To be answered by student based on the practical performed and
learning/observations)
1. What is Dimensional Modelling?
Ans:
Dimensional Modelling:
Dimensional Modelling (DM) is a data structure technique optimized for data
storage in a data warehouse. The purpose of dimensional modelling is to
optimize the database for faster retrieval of data. The concept of dimensional
modelling was developed by Ralph Kimball; a dimensional model consists of “fact”
and “dimension” tables.
A dimensional model is designed for reading, summarizing, and analyzing numeric
information such as values, balances, counts, and weights in a data warehouse.
Dimensional modelling is a database design technique that helps business
users query data in the data warehouse system. It is oriented towards
improving query performance and ease of use.
It is important to note that dimensional modelling does not necessarily depend on
relational databases. At the logical level, the dimensional modelling approach
can be applied to any physical form, such as relational and multidimensional
databases.
In dimensional modelling, there are two important concepts: facts and
dimensions.
1. Facts are business measurements: Facts are normally, but not
always, numeric values that can be aggregated, for example, the number of
products sold per quarter.
2. Dimensions provide context: Dimensions are business
descriptors that qualify the facts, for example, product name, brand,
and quarter.
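The relationship between facts and dimensions can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_time), the column names, and the sample rows are all assumptions made up for this illustration and are not taken from the experiment's problem statement. The fact (units_sold) is aggregated, while the dimension attribute (quarter) supplies the context for the aggregation.

```python
import sqlite3

# In-memory database; all table names, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    brand        TEXT
);
CREATE TABLE dim_time (
    time_key INTEGER PRIMARY KEY,
    quarter  TEXT,
    year     INTEGER
);
-- The fact table holds the numeric measure plus keys into the dimensions.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    time_key    INTEGER REFERENCES dim_time(time_key),
    units_sold  INTEGER
);
INSERT INTO dim_product VALUES (1, 'Laptop', 'BrandA'), (2, 'Monitor', 'BrandB');
INSERT INTO dim_time VALUES (1, 'Q1', 2023), (2, 'Q2', 2023);
INSERT INTO fact_sales VALUES (1, 1, 10), (2, 1, 5), (1, 2, 7);
""")

# Aggregate the fact (units_sold) by a dimension attribute (quarter).
units_per_quarter = conn.execute("""
    SELECT t.quarter, SUM(f.units_sold)
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_key = f.time_key
    GROUP BY t.quarter
    ORDER BY t.quarter
""").fetchall()

print(units_per_quarter)  # [('Q1', 15), ('Q2', 7)]
```

Grouping by brand instead of quarter would only require joining dim_product rather than dim_time; the fact table itself stays unchanged.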
2. Explain Star Schema and Snowflake schema with an example.
Ans:
Star Schema:
A star schema is a type of multidimensional model used for data
warehouses. It contains a central fact table and the dimension tables linked to it.
Because each dimension is held in a single table, queries need fewer foreign-key
joins than in a snowflake schema. The fact table at the centre, surrounded by the
dimension tables, gives the schema its star shape.
Note: Each dimension has only one dimension table, and each table holds a set of
attributes. For example, the location dimension table contains the attribute set
{location_key, street, city, province_or_state, country}. This constraint may cause data
redundancy. For example, the cities "Vancouver" and "Victoria" are both in the Canadian
province of British Columbia, so their entries repeat the same values for the
attributes province_or_state and country.
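The location example above can be sketched as a small star schema with Python's built-in sqlite3 module. The location attributes follow the attribute set given in the note; the sales fact table, the street values, and the unit counts are assumptions added only for illustration. The sketch shows both the single join needed to reach any location attribute and the repeated province and country values.

```python
import sqlite3

# In-memory database; the sales table and all sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One denormalized table for the whole location dimension.
CREATE TABLE location (
    location_key      INTEGER PRIMARY KEY,
    street            TEXT,
    city              TEXT,
    province_or_state TEXT,
    country           TEXT
);
CREATE TABLE sales (
    location_key INTEGER REFERENCES location(location_key),
    units_sold   INTEGER
);
INSERT INTO location VALUES
    (1, '1 Example St',  'Vancouver', 'British Columbia', 'Canada'),
    (2, '2 Example Ave', 'Victoria',  'British Columbia', 'Canada');
INSERT INTO sales VALUES (1, 4), (2, 6), (1, 3);
""")

# A single join reaches every attribute of the location dimension.
units_by_city = conn.execute("""
    SELECT l.city, SUM(s.units_sold)
    FROM sales AS s
    JOIN location AS l ON l.location_key = s.location_key
    GROUP BY l.city
    ORDER BY l.city
""").fetchall()

# The same province and country values are stored once per city row.
repeated = conn.execute("""
    SELECT province_or_state, country, COUNT(*)
    FROM location
    GROUP BY province_or_state, country
""").fetchall()

print(units_by_city)  # [('Vancouver', 7), ('Victoria', 6)]
print(repeated)       # [('British Columbia', 'Canada', 2)]
```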
Snowflake Schema:
A snowflake schema is also a type of multidimensional model used for
data warehouses. It contains fact tables, dimension tables, and sub-dimension
tables. The dimension tables are normalized into sub-dimension tables, and this
branching around the fact table gives the schema its snowflake shape.
For example, the item dimension table contains the attributes item_key, item_name, type,
brand, and supplier_key.
The supplier_key is linked to the supplier dimension table, which contains the
attributes supplier_key and supplier_type.
Note: Due to normalization in the snowflake schema, redundancy is reduced;
the schema therefore becomes easier to maintain and saves storage space.
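The item and supplier example can be sketched in the same way with Python's built-in sqlite3 module. The item and supplier attributes follow the ones listed in the answer; the sales fact table and all sample rows are assumptions added only for illustration. Because the supplier details live in a sub-dimension table, reaching supplier_type from the fact table takes two joins instead of one.

```python
import sqlite3

# In-memory database; the sales table and all sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Sub-dimension table: each supplier is stored once.
CREATE TABLE supplier (
    supplier_key  INTEGER PRIMARY KEY,
    supplier_type TEXT
);
-- Dimension table: refers to the supplier by key instead of repeating its details.
CREATE TABLE item (
    item_key     INTEGER PRIMARY KEY,
    item_name    TEXT,
    type         TEXT,
    brand        TEXT,
    supplier_key INTEGER REFERENCES supplier(supplier_key)
);
CREATE TABLE sales (
    item_key   INTEGER REFERENCES item(item_key),
    units_sold INTEGER
);
INSERT INTO supplier VALUES (1, 'wholesale'), (2, 'retail');
INSERT INTO item VALUES
    (1, 'Laptop',  'computer',  'BrandA', 1),
    (2, 'Monitor', 'display',   'BrandB', 1),
    (3, 'Mouse',   'accessory', 'BrandC', 2);
INSERT INTO sales VALUES (1, 10), (2, 5), (3, 20), (1, 2);
""")

# Two joins are needed: fact -> item -> supplier.
units_by_supplier_type = conn.execute("""
    SELECT sp.supplier_type, SUM(s.units_sold)
    FROM sales AS s
    JOIN item AS i      ON i.item_key = s.item_key
    JOIN supplier AS sp ON sp.supplier_key = i.supplier_key
    GROUP BY sp.supplier_type
    ORDER BY sp.supplier_type
""").fetchall()

print(units_by_supplier_type)  # [('retail', 20), ('wholesale', 17)]
```

The extra join is the cost of normalization: supplier_type is stored once per supplier rather than once per item.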