Unit 4
Fact
• It is a collection of associated data items, consisting of measures and
context data. It typically represents business items or business transactions.
Dimensions
• It is a collection of data that describes one business dimension.
Dimensions determine the contextual background for the facts, and they are
the framework over which OLAP is performed.
Measure
• It is a numeric attribute of a fact, representing the performance or behavior
of the business relative to the dimensions.
Fact Table
• Fact tables are used to store the facts or measures of the business.
Characteristics of the Fact table
• The fact table includes numerical values of what we measure. For
example, a fact value of 20 might mean that 20 widgets have been
sold.
• Each fact table includes the keys to associated dimension tables. These
are known as foreign keys in the fact table.
• Fact tables typically include a small number of columns.
• When it is compared to dimension tables, fact tables have a large
number of rows.
Dimension Table
• Dimension tables establish the context of the facts. Dimensional tables store fields that
describe the facts.
Characteristics of the Dimension table
• Dimension tables contain the details about the facts. These details help
business analysts understand the data and their reports better.
• The dimension tables include descriptive data about the numerical values in the fact table.
That is, they contain the attributes of the facts. For example, the dimension tables for a
marketing analysis function might include attributes such as time, marketing region, and
product type.
• Since the records in a dimension table are denormalized, the table usually has a large
number of columns. Dimension tables contain significantly fewer rows than the fact
table.
• The attributes in a dimension table are used as row and column headings in a document or
query results display.
• Example: A store summary in a fact table can be viewed by city and state.
An item summary can be viewed by brand, color, etc. Customer
information can be viewed by name and address.
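To make the relationship between the two kinds of table concrete, here is a minimal sketch in plain Python. All table names, keys, and values (store_dim, item_dim, sales_fact, units_sold, the brands) are invented for illustration. The fact table holds only foreign keys and a numeric measure, while the dimension tables supply the descriptive attributes used to group the measure.

```python
# Hypothetical dimension tables: surrogate key -> descriptive attributes.
store_dim = {
    1: {"city": "Delhi", "state": "Delhi"},
    2: {"city": "Mumbai", "state": "Maharashtra"},
}
item_dim = {
    10: {"item_name": "laptop", "brand": "Acme", "type": "computer"},
    11: {"item_name": "phone", "brand": "Globex", "type": "mobile"},
}

# Hypothetical fact table: foreign keys to the dimensions plus one measure.
sales_fact = [
    {"store_key": 1, "item_key": 10, "units_sold": 20},
    {"store_key": 1, "item_key": 11, "units_sold": 5},
    {"store_key": 2, "item_key": 10, "units_sold": 7},
    {"store_key": 2, "item_key": 11, "units_sold": 3},
]


def summarize(facts, dim, key_column, attribute, measure):
    """Sum a measure, grouped by one attribute of a dimension table."""
    totals = {}
    for row in facts:
        label = dim[row[key_column]][attribute]  # follow the foreign key
        totals[label] = totals.get(label, 0) + row[measure]
    return totals


# Store summary viewed by city, item summary viewed by brand.
units_by_city = summarize(sales_fact, store_dim, "store_key", "city", "units_sold")
units_by_brand = summarize(sales_fact, item_dim, "item_key", "brand", "units_sold")
print(units_by_city)
print(units_by_brand)
```

The dimension attributes (city, brand) end up as the headings of the summary, and the fact table contributes only the numbers.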
Hierarchy
• A hierarchy is a directed tree whose nodes are dimensional attributes
and whose arcs model many-to-one associations between pairs of
dimensional attributes. It contains a dimension, positioned at the tree's
root, and all of the dimensional attributes that describe it.
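As a rough illustration of the many-to-one arcs, the sketch below assumes a location hierarchy of city, state or province, and country. The mappings and the sales figures are invented for the example. Because each child value has exactly one parent, a measure can be rolled up one level at a time.

```python
# Many-to-one arcs: each child value maps to exactly one parent value.
city_to_state = {
    "Chicago": "Illinois",
    "New York": "New York",
    "Toronto": "Ontario",
    "Vancouver": "British Columbia",
}
state_to_country = {
    "Illinois": "USA",
    "New York": "USA",
    "Ontario": "Canada",
    "British Columbia": "Canada",
}


def roll_up(totals, parent_of):
    """Aggregate totals from one hierarchy level to its parent level."""
    rolled = {}
    for child, value in totals.items():
        parent = parent_of[child]
        rolled[parent] = rolled.get(parent, 0) + value
    return rolled


# Invented sales figures at the finest level (city).
sales_by_city = {"Chicago": 100, "New York": 250, "Toronto": 80, "Vancouver": 70}
sales_by_state = roll_up(sales_by_city, city_to_state)
sales_by_country = roll_up(sales_by_state, state_to_country)
print(sales_by_country)
```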
What is Multi-Dimensional Data Model?
• A multidimensional model views data in the form of a data-cube. A data cube enables data
to be modeled and viewed in multiple dimensions. It is defined by dimensions and facts.
• The dimensions are the perspectives or entities concerning which an organization keeps
records. For example, a shop may create a sales data warehouse to keep records of the
store's sales for the dimensions time, item, and location. These dimensions allow the store to
keep track of things, for example, monthly sales of items and the locations at which the
items were sold. Each dimension has a table related to it, called a dimensional table, which
describes the dimension further. For example, a dimensional table for an item may contain
the attributes item_name, brand, and type.
• A multidimensional data model is organized around a central theme, for example, sales. This
theme is represented by a fact table. Facts are numerical measures. The fact table contains
the names of the facts, or measures, as well as keys to each of the related dimensional tables.
• Consider the data of a shop for items sold per quarter in the city of
Delhi. The data is shown in the table. In this 2D representation, the
sales for Delhi are shown for the time dimension (organized in
quarters) and the item dimension (classified according to the types of
an item sold). The fact or measure displayed is rupee_sold (in
thousands).
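A 2-D view of this kind can be sketched as a small pivot in Python. The records below are invented placeholders, not the figures from the table. Each one carries a quarter, an item type, and a rupee_sold value in thousands for Delhi.

```python
from collections import defaultdict

# Invented records for Delhi: (quarter, item type, rupee_sold in thousands).
delhi_sales = [
    ("Q1", "computer", 120),
    ("Q1", "phone", 80),
    ("Q2", "computer", 150),
    ("Q2", "phone", 95),
    ("Q1", "computer", 30),  # a second Q1 computer record, summed into the same cell
]


def pivot(records):
    """Build a time-by-item table: rows are quarters, columns are item types."""
    table = defaultdict(lambda: defaultdict(int))
    for quarter, item, rupee_sold in records:
        table[quarter][item] += rupee_sold
    return {quarter: dict(items) for quarter, items in table.items()}


view = pivot(delhi_sales)
for quarter in sorted(view):
    print(quarter, view[quarter])
```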
What is Data Cube?
• Suppose we would like to view the sales data with a third
dimension. For example, suppose we would like to view the data
according to time and item as well as location, for the cities Chicago,
New York, Toronto, and Vancouver. The measure displayed is dollars
sold (in thousands). These 3-D data are shown in the table. The 3-D
data of the table are represented as a series of 2-D tables.
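The idea of presenting 3-D data as a series of 2-D tables can be sketched as follows. The cube is assumed to be a dictionary keyed by (quarter, item, city), and the cell values are invented. The helper produces one time-by-item table per city.

```python
# Invented 3-D cells: (quarter, item type, city) -> dollars sold in thousands.
cube = {
    ("Q1", "phone", "Chicago"): 40,
    ("Q1", "computer", "Chicago"): 90,
    ("Q1", "phone", "Toronto"): 25,
    ("Q2", "phone", "Toronto"): 30,
}


def slice_by_location(cells):
    """Split the 3-D cells into one 2-D (quarter x item) table per city."""
    tables = {}
    for (quarter, item, city), dollars_sold in cells.items():
        tables.setdefault(city, {}).setdefault(quarter, {})[item] = dollars_sold
    return tables


tables = slice_by_location(cube)
for city, table in tables.items():
    print(city, table)
```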
• In data warehousing, the data cubes are n-dimensional. The cuboid
which holds the lowest level of summarization is called a base cuboid.
• For example, the 4-D cuboid in the figure is the base cuboid for the
given time, item, location, and supplier dimensions.
• The figure shows a 4-D data cube representation of sales data, according
to the dimensions time, item, location, and supplier. The measure displayed
is dollars sold (in thousands).
• The topmost 0-D cuboid, which holds the highest level of
summarization, is known as the apex cuboid. In this example, this is
the total sales, or dollars sold, summarized over all four dimensions.
• The lattice of cuboids forms a data cube. The figure shows the lattice of
cuboids making up a 4-D data cube for the dimensions time, item,
location, and supplier. Each cuboid represents a different degree of
summarization.
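The lattice can be enumerated directly: every subset of the dimensions is one cuboid, so four dimensions give 2^4 = 16 cuboids. The sketch below lists them from the 0-D apex to the 4-D base and aggregates a few invented base cells (the supplier codes and values are placeholders) to any chosen cuboid.

```python
from itertools import combinations

dimensions = ("time", "item", "location", "supplier")


def cuboids(dims):
    """Return every subset of dims, ordered from the 0-D apex to the n-D base."""
    return [combo for k in range(len(dims) + 1) for combo in combinations(dims, k)]


def aggregate(base_cells, dims, group_by):
    """Summarize base-cuboid cells down to the cuboid named by group_by."""
    positions = [dims.index(d) for d in group_by]
    totals = {}
    for key, value in base_cells.items():
        cell = tuple(key[i] for i in positions)
        totals[cell] = totals.get(cell, 0) + value
    return totals


lattice = cuboids(dimensions)
apex, base = lattice[0], lattice[-1]  # () and all four dimensions

# Invented base-cuboid cells: (time, item, location, supplier) -> dollars sold.
base_cells = {
    ("Q1", "phone", "Toronto", "SUP1"): 5,
    ("Q1", "phone", "Toronto", "SUP2"): 7,
    ("Q2", "computer", "Chicago", "SUP1"): 11,
}

apex_total = aggregate(base_cells, dimensions, apex)      # total over all dimensions
by_time = aggregate(base_cells, dimensions, ("time",))    # a 1-D cuboid
print(len(lattice), apex_total, by_time)
```

Computing every cuboid from the base cells, as done here, is the simplest approach. A real warehouse would typically precompute or materialize only some of them.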
What is Star Schema?