Unit 4

Dimensional modeling is a data representation technique used in OLAP that organizes data into fact and dimension tables, allowing for efficient querying and analysis. It has advantages such as ease of understanding for end-users and efficient query performance, but also faces challenges like data integrity maintenance and adaptability to business changes. The document also discusses various schemas including star and snowflake schemas, which structure data differently to optimize performance and reduce redundancy.

Uploaded by

priya

What is Dimensional Modeling?

• Dimensional modeling represents data as a cube, which gives a logical
data representation better suited to OLAP data management. The
concept of dimensional modeling was developed by Ralph Kimball, and a
dimensional model consists of "fact" and "dimension" tables.
• In dimensional modeling, the transaction record is divided into
either "facts," which are typically numerical transaction data,
or "dimensions," which are the reference information that gives context to
the facts. For example, a sale transaction can be broken down into facts such as
the number of products ordered and the price paid for the products, and
into dimensions such as order date, user name, product number, order
ship-to and bill-to locations, and the salesman responsible for receiving the
order.
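The split described above can be sketched in a few lines of Python. This is only an illustration: the field names, the sample values, and the choice of which fields count as facts are assumptions, not part of the original slides.

```python
# Hypothetical flat sale transaction; all field names and values are illustrative.
transaction = {
    "order_date": "2024-03-15",
    "user_name": "A. Rao",
    "product_number": "P-100",
    "ship_to": "Delhi",
    "bill_to": "Mumbai",
    "salesman": "S. Mehta",
    "quantity_ordered": 3,
    "price_paid": 1500.0,
}

# Assumed classification: the numeric measurements are the facts,
# everything else is context and therefore a dimension.
FACT_FIELDS = {"quantity_ordered", "price_paid"}


def split_transaction(record):
    """Return (facts, dimensions) for one flat transaction record."""
    facts = {k: v for k, v in record.items() if k in FACT_FIELDS}
    dimensions = {k: v for k, v in record.items() if k not in FACT_FIELDS}
    return facts, dimensions


facts, dimensions = split_transaction(transaction)
print(facts)       # the numeric measures
print(dimensions)  # the descriptive context
```

In a real warehouse the dimension values would be replaced by keys into separate dimension tables; the dictionary split only shows which part of the record goes where.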
Objectives of Dimensional Modeling

The purposes of dimensional modeling are:


• To produce a database architecture that is easy for end users to
understand and write queries against.
• To maximize the efficiency of queries. Dimensional modeling achieves these
goals by minimizing the number of tables and the relationships between them.
Advantages of Dimensional Modeling
Disadvantages of Dimensional Modeling
• Loading the data warehouse with records from various operational
systems is complicated, because the integrity of facts and dimensions
must be maintained.
• It is difficult to modify the data warehouse if the organization
adopting the dimensional technique changes the way in which it does
business.
Elements of Dimensional Modeling

Fact
• It is a collection of associated data items, consisting of measures and
context data. It typically represents business items or business transactions.
Dimensions
• It is a collection of data that describes one business dimension.
Dimensions provide the contextual background for the facts, and they are
the framework over which OLAP is performed.
Measure
• It is a numeric attribute of a fact, representing the performance or behavior
of the business relative to the dimensions.
Fact Table

• Fact tables are used to store facts or measures in the business. Facts a
Characteristics of the Fact table
• The fact table includes numerical values of what we measure. For
example, a fact value of 20 might mean that 20 widgets have been
sold.
• Each fact table includes the keys to associated dimension tables. These
are known as foreign keys in the fact table.
• Fact tables typically include a small number of columns.
• Compared to dimension tables, fact tables have a large
number of rows.
Dimension Table

• Dimension tables establish the context of the facts. Dimension tables store fields that
describe the facts.
Characteristics of the Dimension table
• Dimension tables contain the details about the facts. This enables
business analysts to better understand the data and their reports.
• The dimension tables include descriptive data about the numerical values in the fact table.
That is, they contain the attributes of the facts. For example, the dimension tables for a
marketing analysis function might include attributes such as time, marketing region, and
product type.
• Since the records in a dimension table are denormalized, the table usually has a large
number of columns. Dimension tables include significantly fewer rows than the fact
table.
• The attributes in a dimension table are used as row and column headings in a report or
query result display.
• Example: A store summary in a fact table can be viewed by city and state.
An item summary can be viewed by brand, color, etc. Customer
information can be viewed by name and address.
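As a rough illustration of viewing a store summary by city and state, the sketch below resolves each fact row's foreign key against a store dimension and totals a measure per attribute combination. The table contents, the column names, and the helper `summarize_by` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical store dimension: surrogate key -> descriptive attributes.
store_dim = {
    1: {"store_name": "Store A", "city": "Delhi", "state": "Delhi"},
    2: {"store_name": "Store B", "city": "Pune", "state": "Maharashtra"},
    3: {"store_name": "Store C", "city": "Pune", "state": "Maharashtra"},
}

# Hypothetical fact rows: a foreign key into store_dim plus one numeric measure.
sales_fact = [
    {"store_key": 1, "units_sold": 20},
    {"store_key": 2, "units_sold": 5},
    {"store_key": 3, "units_sold": 7},
    {"store_key": 1, "units_sold": 10},
]


def summarize_by(fact_rows, dim, attributes, measure):
    """Total `measure` for each combination of the given dimension attributes."""
    totals = defaultdict(int)
    for row in fact_rows:
        dim_row = dim[row["store_key"]]          # follow the foreign key
        group = tuple(dim_row[a] for a in attributes)
        totals[group] += row[measure]
    return dict(totals)


by_city_state = summarize_by(sales_fact, store_dim, ("city", "state"), "units_sold")
print(by_city_state)
# {('Delhi', 'Delhi'): 30, ('Pune', 'Maharashtra'): 12}
```

The dimension attributes end up as the headings of the summary, while the fact table contributes only the numbers being totalled.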
Hierarchy
• A hierarchy is a directed tree whose nodes are dimensional attributes
and whose arcs model many-to-one associations between dimensional
attributes. It contains a dimension, positioned at the tree's root,
and all of the dimensional attributes that define it.
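A many-to-one arc can be represented as a mapping from each child value to its single parent value. The sketch below assumes a hypothetical city → state → country hierarchy and made-up totals, and shows how such arcs let a measure be rolled up one level at a time.

```python
# Hypothetical location hierarchy: each arc maps a child value to exactly one parent.
city_to_state = {"Pune": "Maharashtra", "Mumbai": "Maharashtra", "Jaipur": "Rajasthan"}
state_to_country = {"Maharashtra": "India", "Rajasthan": "India"}

# Hypothetical totals at the lowest level of the hierarchy.
sales_by_city = {"Pune": 12, "Mumbai": 30, "Jaipur": 8}


def roll_up(totals, parent_of):
    """Aggregate totals one level up a many-to-one hierarchy arc."""
    rolled = {}
    for child, value in totals.items():
        parent = parent_of[child]
        rolled[parent] = rolled.get(parent, 0) + value
    return rolled


sales_by_state = roll_up(sales_by_city, city_to_state)
sales_by_country = roll_up(sales_by_state, state_to_country)
print(sales_by_state)    # {'Maharashtra': 42, 'Rajasthan': 8}
print(sales_by_country)  # {'India': 50}
```

Because every child has exactly one parent, each value is counted once at every level, which is what makes the roll-up well defined.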
What is Multi-Dimensional Data Model?
• A multidimensional model views data in the form of a data cube. A data cube enables data
to be modeled and viewed in multiple dimensions. It is defined by dimensions and facts.
• The dimensions are the perspectives or entities concerning which an organization keeps
records. For example, a shop may create a sales data warehouse to keep records of the
store's sales for the dimensions time, item, and location. These dimensions allow the store to
keep track of things, for example, monthly sales of items and the locations at which the
items were sold. Each dimension has a table related to it, called a dimension table, which
describes the dimension further. For example, a dimension table for an item may contain
the attributes item_name, brand, and type.
• A multidimensional data model is organized around a central theme, for example, sales. This
theme is represented by a fact table. Facts are numerical measures. The fact table contains
the names of the facts, or measures, and keys to each of the related dimension tables.
• Consider the data of a shop for items sold per quarter in the city of
Delhi. The data is shown in the table. In this 2-D representation, the
sales for Delhi are shown for the time dimension (organized in
quarters) and the item dimension (classified according to the types of
items sold). The fact or measure displayed is rupee_sold (in
thousands).
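A 2-D view of this kind can be sketched as a pivot of fact rows into a quarter-by-item-type grid. The item types and amounts below are made up for illustration and are not the values from the original table.

```python
# Hypothetical rows of (quarter, item_type, rupee_sold in thousands) for Delhi.
rows = [
    ("Q1", "Keyboard", 15),
    ("Q1", "Mouse", 8),
    ("Q2", "Keyboard", 12),
    ("Q2", "Mouse", 10),
    ("Q1", "Keyboard", 5),
]


def pivot(fact_rows):
    """Build a nested dict: time dimension -> item dimension -> summed measure."""
    table = {}
    for quarter, item_type, amount in fact_rows:
        by_item = table.setdefault(quarter, {})
        by_item[item_type] = by_item.get(item_type, 0) + amount
    return table


delhi_2d = pivot(rows)
print(delhi_2d)
# {'Q1': {'Keyboard': 20, 'Mouse': 8}, 'Q2': {'Keyboard': 12, 'Mouse': 10}}
```

Each outer key is a row heading (quarter) and each inner key is a column heading (item type), with the measure in the cell.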
What is Data Cube?

• Data is grouped or combined in multidimensional matrices called data
cubes. The data cube method has a few alternative names or variants,
such as "multidimensional databases," "materialized views," and "OLAP
(On-Line Analytical Processing)."
• The general idea of this approach is to materialize certain expensive
computations that are frequently queried.
• For example, a relation with the schema sales(part, supplier, customer,
sale-price) can be materialized into a set of eight views as shown in the
figure, where psc indicates a view consisting of aggregate function values
(such as total-sales) computed by grouping on the three attributes part,
supplier, and customer, p indicates a view composed of the corresponding
aggregate function values calculated by grouping on part alone, etc.
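The eight views correspond to the 2^3 subsets of {part, supplier, customer}. A minimal sketch, assuming made-up rows and total sale price as the aggregate function, enumerates the subsets with `itertools.combinations` and computes one grouped total per subset.

```python
from itertools import combinations

DIMENSIONS = ("part", "supplier", "customer")

# Hypothetical rows of sales(part, supplier, customer, sale_price).
sales = [
    {"part": "bolt", "supplier": "S1", "customer": "C1", "sale_price": 10},
    {"part": "bolt", "supplier": "S2", "customer": "C1", "sale_price": 20},
    {"part": "nut", "supplier": "S1", "customer": "C2", "sale_price": 5},
]


def materialize_views(rows, dimensions):
    """Compute total sale_price for every subset of the grouping attributes."""
    views = {}
    for size in range(len(dimensions) + 1):
        for group_by in combinations(dimensions, size):
            totals = {}
            for row in rows:
                key = tuple(row[d] for d in group_by)
                totals[key] = totals.get(key, 0) + row["sale_price"]
            views[group_by] = totals
    return views


views = materialize_views(sales, DIMENSIONS)
print(len(views))          # 8
print(views[("part",)])    # the "p" view: {('bolt',): 30, ('nut',): 5}
print(views[()])           # grand total with no grouping: {(): 35}
```

The view keyed by all three attributes plays the role of psc, and the empty grouping holds the overall total.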
• A data cube is created from a subset of attributes in the database. Specific attributes
are chosen to be measure attributes, i.e., the attributes whose values are of interest.
Other attributes are selected as dimensions or functional attributes. The measure
attributes are aggregated according to the dimensions.
For example, XYZ may create a sales data warehouse to keep records of the store's
sales for the dimensions time, item, branch, and location. These dimensions enable
the store to keep track of things like monthly sales of items, and the branches and
locations at which the items were sold. Each dimension may have a table associated
with it, known as a dimension table, which describes the dimension. For example,
a dimension table for items may contain the attributes item_name, brand, and type.
• The data cube method is an interesting technique with many applications. Data cubes
can be sparse in many cases because not every cell in each dimension has
corresponding data in the database.
• Techniques should be developed to handle sparse cubes efficiently.
• If a query contains constants at even lower levels than those provided in a
data cube, it is not clear how to make the best use of the precomputed
results stored in the data cube.
• The multidimensional model views data in the form of a data cube. OLAP tools
are based on the multidimensional data model. Data cubes usually model
n-dimensional data.
• A data cube enables data to be modeled and viewed in multiple dimensions.
A multidimensional data model is organized around a central theme, like
sales and transactions. A fact table represents this theme. Facts are
numerical measures. Thus, the fact table contains measures (such as Rs_sold)
and keys to each of the related dimension tables.
• Dimensions define a data cube. Facts are generally
quantities, which are used for analyzing the relationships between
dimensions.
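One simple way to cope with the sparsity mentioned earlier in this section is to store only the non-empty cells, keyed by their coordinates. The sketch below is one possible approach, not a technique prescribed by the slides; the dimension members and values are hypothetical.

```python
# Hypothetical dimension members.
times = ["Q1", "Q2", "Q3", "Q4"]
items = ["Keyboard", "Mouse", "Monitor"]
locations = ["Delhi", "Pune"]

# Only non-empty cells are stored, keyed by (time, item, location).
sparse_cube = {
    ("Q1", "Keyboard", "Delhi"): 20,
    ("Q2", "Mouse", "Pune"): 10,
    ("Q4", "Monitor", "Delhi"): 7,
}


def cell(cube, time, item, location):
    """Return the measure for one cell, treating missing cells as zero."""
    return cube.get((time, item, location), 0)


total_cells = len(times) * len(items) * len(locations)
density = len(sparse_cube) / total_cells

print(cell(sparse_cube, "Q1", "Keyboard", "Delhi"))  # 20
print(cell(sparse_cube, "Q3", "Mouse", "Pune"))      # 0 (no data for this cell)
print(total_cells, density)                          # 24 0.125
```

A dense three-dimensional array would allocate all 24 cells here, while the dictionary holds only the three that actually contain data.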
3-Dimensional Cuboids

• Suppose we would like to view the sales data with a third
dimension. For example, suppose we would like to view the data
according to time and item as well as location, for the cities Chicago,
New York, Toronto, and Vancouver. The measure displayed is dollars
sold (in thousands). These 3-D data are shown in the table. The 3-D
data of the table are represented as a series of 2-D tables.
• In data warehousing, the data cubes are n-dimensional. The cuboid
which holds the lowest level of summarization is called a base cuboid.
• For example, the 4-D cuboid in the figure is the base cuboid for the
given time, item, location, and supplier dimensions.
• A 4-D data cube representation of sales data, according to the
dimensions time, item, location, and supplier. The measure displayed
is dollars sold (in thousands).
• The topmost 0-D cuboid, which holds the highest level of
summarization, is known as the apex cuboid. In this example, this is
the total sales, or dollars sold, summarized over all four dimensions.
• The lattice of cuboids forms a data cube. The figure shows the lattice of
cuboids making up a 4-D data cube for the dimensions time, item,
location, and supplier. Each cuboid represents a different degree of
summarization.
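The lattice for the four dimensions above can be enumerated directly: each cuboid is a subset of the dimensions, so there are 2^4 = 16 cuboids in total. A minimal sketch (the function name `cuboid_lattice` is illustrative):

```python
from itertools import combinations

dimensions = ("time", "item", "location", "supplier")


def cuboid_lattice(dims):
    """Group every subset of dims by the number of dimensions it keeps."""
    return {k: list(combinations(dims, k)) for k in range(len(dims) + 1)}


lattice = cuboid_lattice(dimensions)
apex = lattice[0][0]                 # 0-D cuboid: no dimensions, the grand total
base = lattice[len(dimensions)][0]   # 4-D cuboid: all dimensions, lowest summarization
total = sum(len(level) for level in lattice.values())

for k, level in lattice.items():
    print(f"{k}-D cuboids: {len(level)}")
print(total)  # 16
```

The counts per level are 1, 4, 6, 4, and 1, running from the apex cuboid down to the base cuboid.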
What is Star Schema?

• A star schema is the elementary form of a dimensional model, in which
data are organized into facts and dimensions. A fact is an event that is
counted or measured, such as a sale or login. A dimension includes
reference data about the fact, such as date, item, or customer.
• A star schema is a relational schema whose design represents a
multidimensional data model. The star schema is the simplest data
warehouse schema. It is known as a star schema because the
entity-relationship diagram of this schema resembles a star, with points
diverging from a central table. The center of the schema consists of a large
fact table, and the points of the star are the dimension tables.
Fact Tables

• A fact table is a table in a star schema that contains facts and is
connected to dimensions. A fact table has two types of columns: those
that include facts and those that are foreign keys to the dimension
tables. The primary key of the fact table is generally a composite key
that is made up of all of its foreign keys.
• A fact table might contain either detail-level facts or facts that have
been aggregated (fact tables that include aggregated facts are often
called summary tables instead). A fact table generally contains facts
with the same level of aggregation.
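A small star schema can be sketched with Python's built-in sqlite3 module. The table names, column names, and rows below are hypothetical; the point is only to show a fact table whose primary key is the composite of its foreign keys, and a query that joins it to a dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Note: SQLite parses REFERENCES clauses but only enforces them
# when PRAGMA foreign_keys = ON is set; it is left off in this sketch.
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, quarter TEXT);
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
CREATE TABLE fact_sales (
    date_key   INTEGER REFERENCES dim_date(date_key),
    item_key   INTEGER REFERENCES dim_item(item_key),
    units_sold INTEGER,
    PRIMARY KEY (date_key, item_key)  -- composite key made of the foreign keys
);
INSERT INTO dim_date VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO dim_item VALUES (1, 'Keyboard', 'Acme'), (2, 'Mouse', 'Acme');
INSERT INTO fact_sales VALUES (1, 1, 20), (1, 2, 5), (2, 1, 12);
""")

# Join the fact table to one dimension and aggregate the measure.
rows = conn.execute("""
    SELECT d.quarter, SUM(f.units_sold)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY d.quarter
    ORDER BY d.quarter
""").fetchall()
print(rows)  # [('Q1', 25), ('Q2', 12)]

# A second row for the same (date_key, item_key) violates the composite key.
try:
    conn.execute("INSERT INTO fact_sales VALUES (1, 1, 99)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```

Every dimension is one join away from the fact table, which is what keeps star-schema queries short.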
What is Snowflake Schema?

• A snowflake schema is a variant of the star schema. "A schema is
known as a snowflake if one or more dimension tables do not connect
directly to the fact table but must join through other dimension tables."
• The snowflake schema is an expansion of the star schema where each
point of the star explodes into more points. It is called a snowflake schema
because the diagram of the schema resembles a
snowflake. Snowflaking is a method of normalizing the dimension
tables in a star schema. When we normalize all the dimension tables
entirely, the resultant structure resembles a snowflake with the fact
table in the middle.
• Snowflaking is used to improve the performance of certain queries. The
schema is diagrammed with each fact surrounded by its associated
dimensions, and those dimensions are related to other dimensions,
branching out into a snowflake pattern.
• In a snowflake schema, tables are normalized to remove redundancy:
dimension tables are broken down into multiple dimension
tables.
• The figure shows a simple star schema for sales in a manufacturing
company. The sales fact table includes quantity, price, and other
relevant metrics. SALESREP, CUSTOMER, PRODUCT, and TIME are the
dimension tables.
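To illustrate snowflaking, the sketch below normalizes a PRODUCT dimension by moving its category attribute into a separate table, so a query by category must join through PRODUCT to reach the fact table. The PRODUCT_CATEGORY table, all column names, and the rows are assumptions made for this example; they are not taken from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Outer (snowflaked) dimension: it is not connected to the fact table directly.
CREATE TABLE product_category (
    category_key  INTEGER PRIMARY KEY,
    category_name TEXT
);
-- Normalized product dimension: stores a key instead of repeating the category name.
CREATE TABLE product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category_key INTEGER REFERENCES product_category(category_key)
);
-- Fact table with only one dimension key, to keep the sketch short.
CREATE TABLE sales (
    product_key INTEGER REFERENCES product(product_key),
    quantity    INTEGER,
    price       REAL
);
INSERT INTO product_category VALUES (1, 'Fasteners'), (2, 'Tools');
INSERT INTO product VALUES (1, 'Bolt', 1), (2, 'Nut', 1), (3, 'Wrench', 2);
INSERT INTO sales VALUES (1, 10, 2.0), (2, 4, 1.0), (3, 1, 15.0);
""")

# Reaching the category requires two joins: fact -> product -> product_category.
rows = conn.execute("""
    SELECT c.category_name, SUM(s.quantity)
    FROM sales AS s
    JOIN product AS p ON p.product_key = s.product_key
    JOIN product_category AS c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
print(rows)  # [('Fasteners', 14), ('Tools', 1)]
```

Each category name is stored once rather than on every product row, at the cost of an extra join compared with the star layout.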
Difference between Star and Snowflake Schemas
