Fact Tables

Fact table
From Wikipedia, the free encyclopedia

In data warehousing, a fact table consists of the measurements, metrics or facts of a business
process. It is often located at the centre of a star schema or a snowflake schema, surrounded by
dimension tables.
Fact tables provide the (usually) additive values that act as independent variables by which
dimensional attributes are analyzed. Fact tables are often defined by their grain. The grain of a
fact table represents the most atomic level by which the facts may be defined. The grain of a
SALES fact table might be stated as "Sales volume by Day by Product by Store". Each record in
this fact table is therefore uniquely defined by a day, product and store. Other dimensions might
be members of this fact table (such as location/region) but these add nothing to the uniqueness of
the fact records. These "affiliate dimensions" allow for additional slices of the independent facts
but generally provide insights at a higher level of aggregation (a region contains many stores).
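The grain statement above can be checked mechanically. The following is a minimal sketch in Python,
assuming a small in-memory list of hypothetical sales rows (all column names and values are invented
for illustration). It tests whether day, product and store identify each row uniquely, and shows
that region adds nothing to uniqueness while still allowing a higher-level slice.

```python
# Hypothetical SALES fact rows at the grain "day by product by store".
# All names and values are illustrative assumptions, not data from a real system.
sales_rows = [
    {"day": "2005-01-15", "product": "P1", "store": "NY", "region": "East", "sales": 700},
    {"day": "2005-01-15", "product": "P2", "store": "NY", "region": "East", "sales": 500},
    {"day": "2005-01-15", "product": "P1", "store": "LA", "region": "West", "sales": 900},
    {"day": "2005-01-16", "product": "P1", "store": "NY", "region": "East", "sales": 400},
]


def is_unique_key(rows, columns):
    """Return True if the given columns identify every row uniquely."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    return len(keys) == len(set(keys))


grain = ("day", "product", "store")
grain_ok = is_unique_key(sales_rows, grain)

# Region is determined by store, so adding it does not change uniqueness.
with_region_ok = is_unique_key(sales_rows, grain + ("region",))

# Region still allows a slice at a higher level of aggregation.
sales_by_region = {}
for row in sales_rows:
    sales_by_region[row["region"]] = sales_by_region.get(row["region"], 0) + row["sales"]
```

Dropping a column from the grain, for example checking only ("day", "store"), makes the check fail
for these rows, which is one way to confirm that the stated grain is not coarser than the data.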
Example
If the business process is SALES, then the corresponding fact table will typically contain columns
representing both raw facts and aggregations in rows such as:
* $12,000, being "sales for New York store for 15-Jan-2005"
* $34,000, being "sales for Los Angeles store for 15-Jan-2005"
* $22,000, being "sales for New York store for 16-Jan-2005"
* $50,000, being "sales for Los Angeles store for 16-Jan-2005"
* $21,000, being "average daily sales for Los Angeles Store for Jan-2005"
* $65,000, being "average daily sales for Los Angeles Store for Feb-2005"
* $33,000, being "average daily sales for Los Angeles Store for year 2005"
"Average daily sales" is a measurement that is stored in the fact table. The fact table also
contains foreign keys to the dimension tables, where time series (e.g. dates) and other
dimensions (e.g. store location, salesperson, product) are stored.
All foreign keys between fact and dimension tables should be surrogate keys, not reused keys
from operational data.
The central table in a star schema is called a fact table. A fact table typically has two types of
columns: those that contain facts and those that are foreign keys to dimension tables. The
primary key of a fact table is usually a composite key that is made up of all of its foreign keys.
Fact tables hold the content of the data warehouse and store different types of measures:
additive, non-additive, and semi-additive.
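The column layout described above can be sketched with Python's built-in sqlite3 module. The table
and column names (dim_date, dim_store, fact_sales, date_key, store_key, sales_amount) are
assumptions chosen for illustration; the amounts reuse the New York and Los Angeles figures from
the example. The fact table has one fact column, two surrogate foreign keys, and a composite
primary key made up of those foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE dim_date  (date_key  INTEGER PRIMARY KEY, calendar_date TEXT);
CREATE TABLE dim_store (store_key INTEGER PRIMARY KEY, store_name TEXT);
CREATE TABLE fact_sales (
    date_key     INTEGER NOT NULL REFERENCES dim_date(date_key),
    store_key    INTEGER NOT NULL REFERENCES dim_store(store_key),
    sales_amount INTEGER NOT NULL,            -- the fact (measure)
    PRIMARY KEY (date_key, store_key)         -- composite key of all foreign keys
);
INSERT INTO dim_date  VALUES (1, '2005-01-15'), (2, '2005-01-16');
INSERT INTO dim_store VALUES (1, 'New York'), (2, 'Los Angeles');
INSERT INTO fact_sales VALUES (1, 1, 12000), (1, 2, 34000), (2, 1, 22000), (2, 2, 50000);
""")

# Analyze the additive fact by a dimensional attribute.
totals = conn.execute("""
    SELECT s.store_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_key = f.store_key
    GROUP BY s.store_name
    ORDER BY s.store_name
""").fetchall()


def insert_is_rejected(sql):
    """Return True if the statement violates a key constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False
```

With this schema, a second row for the same date and store, or a row that points at a date key
missing from dim_date, is rejected by the database rather than silently stored.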
Measure types
* Additive - Measures that can be added across all dimensions.
* Non-additive - Measures that cannot be added across any dimension.
* Semi-additive - Measures that can be added across some dimensions but not across others.
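A semi-additive measure is easiest to see with a worked case. The sketch below assumes a
hypothetical set of daily account-balance snapshots (the account example and all values are
illustrative assumptions, not taken from the text above). Balances can be summed across accounts
on a single day, but summing one account's balance across days does not produce a balance.

```python
# Hypothetical daily balance snapshots: (day, account, balance).
balances = [
    ("2005-01-15", "A", 100),
    ("2005-01-15", "B", 250),
    ("2005-01-16", "A", 120),
    ("2005-01-16", "B", 200),
]

# Adding across the account dimension on one day is meaningful:
# it is the total held across all accounts on that day.
total_on_16th = sum(bal for day, _, bal in balances if day == "2005-01-16")

# Adding across the date dimension is not meaningful: account A never held this amount.
naive_sum_a = sum(bal for _, acct, bal in balances if acct == "A")

# Across time, a different rule is needed, such as taking the closing (latest) balance.
a_by_day = {day: bal for day, acct, bal in balances if acct == "A"}
closing_balance_a = a_by_day[max(a_by_day)]
```

Here the total on 16 January is 320, the naive sum for account A is 220, and its closing balance
is 120, so the choice of aggregation rule across the date dimension changes the answer.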
A fact table might contain either detail level facts or facts that have been aggregated (fact tables
that contain aggregated facts are often instead called summary tables).
Special care must be taken when handling ratios and percentages. One good design rule[1] is to
never store percentages or ratios in fact tables but to calculate them only in the data access
tool. Store only the numerator and denominator in the fact table; these can be aggregated, and the
aggregated values can then be used to calculate the ratio or percentage in the data access tool.
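The reason for this rule shows up as soon as rows are aggregated. The sketch below uses two
hypothetical stores with invented revenue and profit figures (the names and numbers are
assumptions for illustration) and compares averaging stored ratios with dividing the aggregated
numerator by the aggregated denominator.

```python
from fractions import Fraction

# Hypothetical fact rows storing only numerator and denominator: (store, profit, revenue).
rows = [
    ("NY", 200, 1000),  # margin 20%
    ("LA", 50, 100),    # margin 50%
]

# Wrong: averaging per-row ratios ignores how much revenue each row represents.
per_row_margins = [Fraction(profit, revenue) for _, profit, revenue in rows]
average_of_ratios = sum(per_row_margins) / len(per_row_margins)

# Right: aggregate numerator and denominator first, then divide once.
total_profit = sum(profit for _, profit, _ in rows)
total_revenue = sum(revenue for _, _, revenue in rows)
ratio_of_sums = Fraction(total_profit, total_revenue)
```

The averaged ratio is 7/20 (35%), while the overall margin is 5/22 (about 22.7%). Only the second
figure reflects total profit over total revenue, and it can only be computed if the numerator and
denominator were stored.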
In the real world, it is possible to have a fact table that contains no measures or facts. These
tables are called "factless fact tables" or "junction tables".
Factless fact tables can, for example, be used for modeling many-to-many relationships or for
capturing events.[1]
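As a minimal sketch of event capture, assume a hypothetical attendance table in which each row
records that a student attended a class on a date (the attendance scenario and the key values are
illustrative assumptions). Each row holds only dimension keys, so analysis is done by counting
rows rather than summing a measure.

```python
from collections import Counter

# Hypothetical factless fact table: (date_key, student_key, class_key).
# There is no measure column; the existence of a row is the fact.
attendance = [
    (1, 1, 10),
    (1, 2, 10),
    (1, 1, 11),
    (2, 2, 10),
]

# Count events per class, since there is nothing to sum.
attendance_per_class = Counter(class_key for _, _, class_key in attendance)

# The same rows also model the many-to-many relationship between students and classes.
classes_of_student_1 = sorted({c for _, s, c in attendance if s == 1})
```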
Types of fact tables
There are three fundamental measurement events, which characterize all fact tables.[2]
* Transactional
A transactional table is the most basic and fundamental. The grain associated with a
transactional fact table is usually specified as "one row per line in a transaction", e.g., every line
on a receipt. Typically a transactional fact table holds data of the most detailed level, causing it to
have a great number of dimensions associated with it.
* Periodic snapshots
The periodic snapshot, as the name implies, takes a "picture of the moment", where the
moment could be any defined period of time, e.g. a performance summary of a salesman over the
previous month. A periodic snapshot table is dependent on the transactional table, as it needs the
detailed data held in the transactional fact table in order to deliver the chosen performance
output.
* Accumulating snapshots
This type of fact table is used to show the activity of a process that has a well-defined
beginning and end, e.g., the processing of an order. An order moves through specific steps until it
is fully processed. As steps towards fulfilling the order are completed, the associated row in the
fact table is updated. An accumulating snapshot table often has multiple date columns, each
representing a milestone in the process. Therefore, it's important to have an entry in the
associated date dimension that represents an unknown date, as many of the milestone dates are
unknown at the time of the creation of the row.
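The accumulating snapshot can be sketched as a single row per order that is updated as milestones
are reached. The milestone names (ordered, shipped, delivered), the key values, and the helper
functions below are hypothetical and chosen only for illustration. Key 0 of the date dimension is
assumed to be the "unknown date" entry.

```python
UNKNOWN_DATE_KEY = 0  # assumed surrogate key of the "unknown date" dimension row

# Hypothetical date dimension, including the entry for an unknown date.
dim_date = {0: "Unknown", 1: "2005-01-15", 2: "2005-01-17", 3: "2005-01-20"}


def new_order_row(order_id, ordered_date_key):
    """Create the fact row; milestones not yet reached point at the unknown date."""
    return {
        "order_id": order_id,
        "ordered_date_key": ordered_date_key,
        "shipped_date_key": UNKNOWN_DATE_KEY,
        "delivered_date_key": UNKNOWN_DATE_KEY,
    }


def record_milestone(row, milestone, date_key):
    """Update the existing row in place when a milestone is completed."""
    column = milestone + "_date_key"
    if column not in row:
        raise KeyError("unknown milestone: " + milestone)
    if date_key not in dim_date:
        raise ValueError("date key not present in the date dimension")
    row[column] = date_key


order = new_order_row(1001, ordered_date_key=1)
status_at_creation = dim_date[order["shipped_date_key"]]  # joins cleanly to "Unknown"

record_milestone(order, "shipped", 2)
record_milestone(order, "delivered", 3)
```

Because every milestone column always holds a valid date key, a join to the date dimension never
drops the row, even while the order is still in progress.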
Steps in designing fact table
* Identify a business process for analysis (like sales).
* Identify measures or facts (sales dollar), by asking questions like what ‘number of’ XX are
relevant for the business process (replace the XX and test whether the question makes sense
business-wise).
* Identify dimensions for facts (product dimension, location dimension, time dimension,
organization dimension), by asking questions that make sense business-wise, like 'Analyse by'
XX, where XX is replaced with the subject to test.
* List the columns that describe each dimension (region name, branch name, business unit
name).
* Determine the lowest level (granularity) of summary in a fact table (e.g. sales dollar).
Alternatively, use the four-step design process described in Kimball.[1]
References
1. Kimball, Ralph; Ross, Margy (2002). The Data Warehouse Toolkit, 2nd ed. Wiley.
2. Kimball, Ralph (2008). The Data Warehouse Lifecycle Toolkit, 2nd ed. Wiley.
ISBN 978-0-470-14977-5.