Atharva College of Engineering
Experiment – 1
Title: Build Data Warehouse/Data Mart for a given problem statement (auto sales analysis)
i) Identifying the source tables and populating sample data
ii) Design dimensional data model i.e. Star schema, Snowflake schema and Fact Constellation
schema (if applicable)
Notes:
Design a data cube that contains one fact table, and design the item, time, supplier,
supplier location and customer dimension tables; also identify the measures for sales.
Insert a minimum of 4 items such as bikes, small cars, mid-segment cars, car consumables,
etc. Also enter a minimum of 10 to 12 region/location records, covering at least 2 states
with at least 2 cities from each state. Keep track of sales quarter-wise.
Implement the above fact and dimension tables in Oracle 10g, where they are created in the
same way as ordinary relational tables, and analyze them with the help of an SQL tool.
You have to use the concepts of OLAP operations such as slice, dice, roll-up, drill-down, etc.
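The task above can be sketched in a reduced form as follows. This is only an illustration under stated assumptions: SQLite (through Python's built-in sqlite3 module) stands in for Oracle 10g, only three of the five dimensions are shown, and every table name, column name and figure is made up for the example. The four queries illustrate roll-up, drill-down, slice and dice on the star join.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle 10g.
# All table/column names and sample values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item_dim     (item_id INTEGER PRIMARY KEY, item_name TEXT, category TEXT);
CREATE TABLE time_dim     (time_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE location_dim (loc_id  INTEGER PRIMARY KEY, city TEXT, state TEXT);
CREATE TABLE sales_fact (
    item_id    INTEGER REFERENCES item_dim(item_id),
    time_id    INTEGER REFERENCES time_dim(time_id),
    loc_id     INTEGER REFERENCES location_dim(loc_id),
    units_sold INTEGER,   -- measure
    amount     INTEGER,   -- measure (made-up currency units)
    PRIMARY KEY (item_id, time_id, loc_id)
);
""")

conn.executemany("INSERT INTO item_dim VALUES (?, ?, ?)", [
    (1, "Bike", "Two-wheeler"), (2, "Small car", "Car"),
    (3, "Mid-segment car", "Car"), (4, "Engine oil", "Consumable")])
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?)", [
    (1, 2020, "Q1"), (2, 2020, "Q2")])
conn.executemany("INSERT INTO location_dim VALUES (?, ?, ?)", [
    (1, "Mumbai", "Maharashtra"), (2, "Pune", "Maharashtra"),
    (3, "Ahmedabad", "Gujarat"), (4, "Surat", "Gujarat")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 10, 700), (2, 1, 1, 3, 1500), (1, 1, 3, 5, 350),
    (1, 2, 2, 8, 560), (3, 2, 3, 2, 1800), (4, 2, 4, 20, 100)])

# The star join: the fact table joined to each dimension on its key.
STAR_JOIN = """FROM sales_fact f
JOIN item_dim i     ON f.item_id = i.item_id
JOIN time_dim t     ON f.time_id = t.time_id
JOIN location_dim l ON f.loc_id  = l.loc_id"""

def run(sql):
    """Run one query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Roll-up: aggregate sales from city level up to state level.
roll_up = run(f"SELECT l.state, SUM(f.amount) {STAR_JOIN} "
              "GROUP BY l.state ORDER BY l.state")

# Drill-down: go back from state level to the finer city level.
drill_down = run(f"SELECT l.state, l.city, SUM(f.amount) {STAR_JOIN} "
                 "GROUP BY l.state, l.city ORDER BY l.state, l.city")

# Slice: fix one dimension (quarter = Q1) and view sales per item.
slice_q1 = run(f"SELECT i.item_name, SUM(f.amount) {STAR_JOIN} "
               "WHERE t.quarter = 'Q1' GROUP BY i.item_name ORDER BY i.item_name")

# Dice: restrict two dimensions at once (quarter and state).
dice = run(f"SELECT i.item_name, SUM(f.amount) {STAR_JOIN} "
           "WHERE t.quarter = 'Q2' AND l.state = 'Gujarat' "
           "GROUP BY i.item_name ORDER BY i.item_name")

for name, rows in [("roll-up", roll_up), ("drill-down", drill_down),
                   ("slice", slice_q1), ("dice", dice)]:
    print(name, rows)
```

The supplier and customer dimensions would be added in the same way, each contributing one more key column to sales_fact. In Oracle the same SELECT statements should apply with little change, since they use only standard joins and GROUP BY.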
Objective:
• To learn the fundamentals of data warehousing
• To learn the concepts of dimensional modeling
• To learn the star, snowflake and galaxy (fact constellation) schemas
ATHARVA EDUCATIONAL TRUST'S
ATHARVA COLLEGE OF ENGINEERING
(Approved by AICTE, Recognized by the Government of Maharashtra
& Affiliated to University of Mumbai )
Department of Computer Engineering
Academic Year 2020-21
Reference:
• SQL, PL/SQL by Ivan Bayross
• Data Mining: Concepts and Techniques by Han & Kamber
• Data Warehousing Fundamentals by Paulraj
• Data Warehousing & Mining by Reema Thareja
Pre-requisite:
• Fundamental Knowledge of Database Management
• Fundamental Knowledge of SQL
Theory:
Fact table
The fact table is not a typical relational database table, as it is de-normalized on purpose
to enhance query response times. The fact table typically contains records that are ready
to explore, usually with ad hoc queries. Records in the fact table are often referred to as
events, due to the time-variant nature of a data warehouse environment. The primary key
of the fact table is a composite of all the columns except the numeric values/scores (like
QUANTITY, TURNOVER, exact invoice date and time). Typical fact tables in a global
enterprise data warehouse are listed below (there may usually be additional company- or
business-specific fact tables):
Sales fact table: contains all details regarding sales.
Orders fact table: in some cases the table can be split into open orders and historical orders.
Sometimes the values for historical orders are stored in a sales fact table.
Budget fact table: usually grouped by month and loaded once at the end of a year.
Forecast fact table: usually grouped by month and loaded daily, weekly or monthly.
Inventory fact table: reports stocks, usually refreshed daily.
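The composite primary key described above can be illustrated with a small sketch. It assumes SQLite (via Python's sqlite3 module) rather than Oracle 10g, and the sales_fact table, its columns and its values are hypothetical. The key is built from the dimension key columns only; the measures (quantity, turnover) stay outside it, so a second row for the same item, time and customer is rejected.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle 10g; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE sales_fact (
    item_id     INTEGER NOT NULL,   -- key of the item dimension
    time_id     INTEGER NOT NULL,   -- key of the time dimension
    customer_id INTEGER NOT NULL,   -- key of the customer dimension
    quantity    INTEGER,            -- measure, not part of the key
    turnover    INTEGER,            -- measure, not part of the key
    PRIMARY KEY (item_id, time_id, customer_id)
)""")

conn.execute("INSERT INTO sales_fact VALUES (1, 1, 1, 2, 900)")
# Same measures but a different customer: a different event, so it is accepted.
conn.execute("INSERT INTO sales_fact VALUES (1, 1, 2, 2, 900)")

try:
    # Same (item, time, customer) combination again: violates the composite key.
    conn.execute("INSERT INTO sales_fact VALUES (1, 1, 1, 5, 2000)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
print(duplicate_rejected, row_count)
```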
Dimension table
Nearly all of the information in a typical fact table is also present in one or more
dimension tables. The main purpose of maintaining dimension tables is to allow
browsing the categories quickly and easily.
The primary keys of the dimension tables are linked together to form the
composite primary key of the fact table. In a star schema design, there is only one
de-normalized table for a given dimension.
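As a hedged illustration of "one de-normalized table for a given dimension", the sketch below stores a location hierarchy twice: once as a single star-style table and once split into two normalized tables in the snowflake style. It assumes SQLite through Python's sqlite3 module, and all table names, column names and rows are invented for the example. Browsing the cities of one state needs no join in the star form and one join in the snowflake form, while both return the same rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle 10g; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star style: one de-normalized table holds the whole city/state hierarchy.
CREATE TABLE location_dim (loc_id INTEGER PRIMARY KEY, city TEXT, state TEXT);
INSERT INTO location_dim VALUES
    (1, 'Mumbai', 'Maharashtra'), (2, 'Pune', 'Maharashtra'),
    (3, 'Ahmedabad', 'Gujarat'),  (4, 'Surat', 'Gujarat');

-- Snowflake style: the same hierarchy normalized into two tables.
CREATE TABLE state_dim (state_id INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE city_dim  (city_id INTEGER PRIMARY KEY, city TEXT,
                        state_id INTEGER REFERENCES state_dim(state_id));
INSERT INTO state_dim VALUES (1, 'Maharashtra'), (2, 'Gujarat');
INSERT INTO city_dim  VALUES (1, 'Mumbai', 1), (2, 'Pune', 1),
                             (3, 'Ahmedabad', 2), (4, 'Surat', 2);
""")

# Browsing a category in the star form: a single-table lookup.
star_browse = conn.execute(
    "SELECT city FROM location_dim WHERE state = 'Gujarat' ORDER BY city"
).fetchall()

# The same browse in the snowflake form needs a join between the two levels.
snowflake_browse = conn.execute(
    "SELECT c.city FROM city_dim c "
    "JOIN state_dim s ON c.state_id = s.state_id "
    "WHERE s.state = 'Gujarat' ORDER BY c.city"
).fetchall()

print(star_browse)
print(snowflake_browse)
```

The star form repeats the state name on every city row, which is the redundancy accepted in exchange for simpler queries.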
Typical dimension tables in a data warehouse are:
Time dimension table
Customers dimension table
Products dimension table
Key account managers (KAM) dimension table
Sales office dimension table
Star schema architecture
Star schema architecture is the simplest data warehouse design. The main feature of a star
schema is a table at the center, called the fact table, and the dimension tables, which allow
browsing of specific categories, summarizing, drill-downs and specifying criteria.
Typically, most of the fact tables in a star schema are in database third normal form,
while dimension tables are de-normalized (second normal form). Despite the fact that the
star schema is the simplest data warehouse architecture, it is the most commonly used in
data warehouse implementations across the world today (about 90-95% of cases).
Conclusion:
A schema is a logical description of a database in which fact and dimension tables are
joined in a logical manner. A data warehouse is maintained in the form of a Star,
Snowflake or Fact Constellation schema.