DWM UNIT-I NOTES
● A data warehouse is a powerful database model that significantly enhances the user's
ability to quickly analyze large, multidimensional data sets.
● It cleanses and organizes data to allow users to make business decisions based on
facts. Hence, the data in the data warehouse must have strong analytical
characteristics.
● For data to be analytical, it must be subject-oriented, integrated, time-referenced,
and non-volatile.
● It is a multidimensional model.
Characteristics of Data Warehouse:
1. Subject-Oriented Data
2. Integrated Data
3. Time-Referenced Data
4. Non-Volatile Data
5. Granularity
Subject-Oriented Data:
● In a data warehouse environment, information used for analysis is organized around
subjects: employees, accounts, sales, products, and so on.
● This subject-specific design helps reduce query response time, because only a few
records need to be searched to answer the user’s question.
Integrated Data:
● Integrated data refers to de-duplicating information and merging it from many sources
into one consistent location.
● When shortlisting your top 20 customers, you must know that “HAL” and “Hindustan
Aeronautics Limited” are one and the same.
● Much of the transformation and loading work that goes into the data warehouse is
centred on integrating data and standardizing it.
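The integration step above can be sketched in a few lines. This is only an illustration, assuming a hand-maintained alias table; the names `ALIASES`, `canonical_name`, and `integrate`, and the sample amounts, are hypothetical and not part of any particular ETL tool.

```python
# Hypothetical alias table: maps known name variants to one canonical name.
ALIASES = {
    "hal": "Hindustan Aeronautics Limited",
    "hindustan aeronautics limited": "Hindustan Aeronautics Limited",
}

def canonical_name(raw: str) -> str:
    """Standardize a customer name: collapse whitespace, ignore case, apply aliases."""
    key = " ".join(raw.lower().split())
    return ALIASES.get(key, raw.strip())

def integrate(records):
    """Merge (customer, amount) rows from many sources into one total per customer."""
    totals = {}
    for name, amount in records:
        customer = canonical_name(name)
        totals[customer] = totals.get(customer, 0) + amount
    return totals

# Two source systems that spell the same customer differently (made-up data).
source_a = [("HAL", 120), ("Acme Corp", 40)]
source_b = [("Hindustan Aeronautics Limited", 80), ("Acme Corp", 10)]

totals = integrate(source_a + source_b)
print(totals)
```

Without the alias step, "HAL" and "Hindustan Aeronautics Limited" would be counted as two different customers and the top-customer ranking would be wrong.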
Time-Referenced Data:
Non-Volatile Data:
● The data warehouse is a physically separate data store, populated with data transformed
from the source operational RDBMS. Operational updates of data do not occur in the
data warehouse, i.e., transactional update, insert, and delete operations are not
performed on it; data is loaded and then read.
● The non-volatility of data, a characteristic of the data warehouse, enables users to dig
deep into history and arrive at specific business decisions based on facts.
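A minimal sketch of non-volatility, assuming data arrives only through periodic loads and that loaded rows are never changed afterwards. The class names `Snapshot` and `WarehouseTable` and the sample balances are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: a loaded row cannot be modified afterwards
class Snapshot:
    load_date: str  # ISO date of the load, e.g. "2024-01-31"
    account: str
    balance: int

class WarehouseTable:
    """Append-only table: rows are added by loads and never updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, snapshots):
        # A load only appends; earlier snapshots stay exactly as they were.
        self._rows.extend(snapshots)

    def history(self, account):
        # The full history remains available for analysis, oldest first.
        rows = [r for r in self._rows if r.account == account]
        return sorted(rows, key=lambda r: r.load_date)

table = WarehouseTable()
table.load([Snapshot("2024-01-31", "ACC-1", 100)])
table.load([Snapshot("2024-02-29", "ACC-1", 150)])

for row in table.history("ACC-1"):
    print(row.load_date, row.balance)
```

An operational system would overwrite the balance in place; here both month-end values are kept, which is what lets users look back into history.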
● If there is a single key to survival in the 1990s and beyond, it is being able to analyze,
plan, and react to changing business conditions in a much more rapid fashion.
● In order to do this, top managers, analysts, and knowledge workers in our enterprises
need more and better information.
● Every day, organizations large and small create billions of bytes of data about all
aspects of their business: millions of individual facts about their customers, products,
operations, and people.
● But for the most part, this data is locked up in a maze of computer systems and is
exceedingly difficult to get at. This phenomenon has been described as “data in jail”.
Data Warehousing:
● The idea of data warehousing emerged in the late 1980s, when IBM researchers Barry
Devlin and Paul Murphy developed the "business data warehouse."
● Data warehousing is a field that has grown from the integration of a number of
different technologies and experiences over the past two decades.
● These experiences have allowed the IT industry to identify the key problems that
need to be solved.
● Data warehousing is the process of developing, managing, and securing the electronic
storage of data by a business or organization in a digital data warehouse.
● Operational systems, as their name implies, are the systems that help the everyday
operation of the enterprise.
● These are the backbone systems of any enterprise, and include order entry, inventory,
manufacturing, payroll and accounting.
● Due to their importance to the organization, operational systems were almost always
the first parts of the enterprise to be computerized.
● Informational systems deal with analyzing data and making decisions, often major,
about how the enterprise will operate now, and in the future.
● Not only do informational systems have a different focus from operational ones, they
often have a different scope.
● Where operational data needs are normally focused upon a single area, informational
data needs often span a number of different areas and need large amounts of related
operational data.
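The difference in focus and scope can be illustrated with two queries. This is a sketch using Python's built-in `sqlite3` module; the tables `orders` and `inventory` and their rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT,
                     product TEXT, amount INTEGER);
CREATE TABLE inventory (product TEXT PRIMARY KEY, on_hand INTEGER);
INSERT INTO orders VALUES (1, 'A', 'widget', 30),
                          (2, 'B', 'widget', 20),
                          (3, 'A', 'gadget', 50);
INSERT INTO inventory VALUES ('widget', 5), ('gadget', 40);
""")

# Operational query: one area (order entry), one record.
order = conn.execute(
    "SELECT customer, product, amount FROM orders WHERE order_id = ?", (2,)
).fetchone()

# Informational query: spans two areas (orders and inventory) and
# summarizes many operational records to support a decision.
report = conn.execute("""
    SELECT o.product, SUM(o.amount) AS total_sales, i.on_hand
    FROM orders AS o
    JOIN inventory AS i ON i.product = o.product
    GROUP BY o.product, i.on_hand
    ORDER BY o.product
""").fetchall()

print(order)
print(report)
```

The first query answers "what is in order 2?"; the second answers "which products sell well compared with the stock we hold?", which needs related data from more than one operational area.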
DATA WAREHOUSE ADVANTAGES & DISADVANTAGES:
1. DWs make access to a wide variety of data easier for end users.
2. Provide key i/f for business decision making.
3. Improve the quality of decisions made.
4. Especially useful for the medium and long term.
5. Provide great information-processing power.
6. Facilitate decision making in business.
7. Companies get an increase in productivity.
8. Allow you to plan more effectively.
9. Reduce response times and operating costs.
10. Improve relationships with suppliers and customers.
There are four separate and distinct components to consider in the DW/BI
environment: operational source systems, the ETL system, the data presentation area,
and business intelligence applications.
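The four components can be pictured as stages of a small pipeline. This is a toy sketch under simplifying assumptions: in-memory lists stand in for real systems, and every name (`order_entry`, `billing`, `extract`, `transform`, `load`, `sales_by_customer`) is hypothetical.

```python
# 1. Operational source systems (made-up rows with inconsistent formatting).
order_entry = [{"customer": " hal ", "amount": "120"},
               {"customer": "Acme", "amount": "40"}]
billing = [{"customer": "HAL", "amount": "80"}]

# 2. ETL system: extract from the sources, transform, then load.
def extract(*sources):
    for source in sources:
        yield from source

def transform(rows):
    for row in rows:
        yield {"customer": row["customer"].strip().upper(),
               "amount": int(row["amount"])}

def load(rows, presentation_area):
    presentation_area.extend(rows)
    return presentation_area

# 3. Data presentation area: cleaned data, ready to be queried.
presentation_area = load(transform(extract(order_entry, billing)), [])

# 4. Business intelligence application: a simple report over the presentation area.
def sales_by_customer(rows):
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0) + row["amount"]
    return totals

bi_report = sales_by_customer(presentation_area)
print(bi_report)
```

The BI application never touches the operational sources directly; it reads only what the ETL stage has placed in the presentation area.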
● An independent data mart is created directly from external sources instead of from a
data warehouse. The data mart is created first by extracting data from external
sources, and the data warehouse is then created from the data present in the data mart.
The independent data mart follows the bottom-up approach to data warehouse
architecture. This model of data mart is used by small organizations and is
comparatively cost-effective.
● Independent data marts are not difficult to design and develop. They are beneficial for
achieving short-term goals but may become cumbersome to manage (each with its
own ETL tool and logic) as business needs expand and become more complex.
● An advantage of this model is that individual business units can run the data
mart that suits them best.
This type of data mart is created by extracting data from an operational source or
from a data warehouse. The first path reflects accessing data directly from external
sources, and the second path reflects the dependent data mart model.
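The two paths can be sketched as the same mart-building step fed from two different inputs. This is an illustration only; `build_mart`, the `subject` field, and the sample rows are assumptions made for the example.

```python
def build_mart(rows, subject):
    """Keep only the rows that belong to one subject area, e.g. 'sales'."""
    return [row for row in rows if row["subject"] == subject]

# Path 1: rows taken directly from an external or operational source.
external_source = [
    {"subject": "sales", "value": 10},
    {"subject": "payroll", "value": 7},
]

# Path 2: rows that have already been loaded into the data warehouse.
warehouse = [
    {"subject": "sales", "value": 25},
    {"subject": "inventory", "value": 3},
]

mart_from_source = build_mart(external_source, "sales")   # first path
mart_from_warehouse = build_mart(warehouse, "sales")      # second path (dependent)

print(mart_from_source)
print(mart_from_warehouse)
```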