Faculty of Engineering Data Mining & Warehousing Lecture-01 Mr. Dhirendra
MR. DHIRENDRA
ASSISTANT PROFESSOR
RAMA UNIVERSITY
OUTLINE
The term "Data Warehouse" was first coined by Bill Inmon in 1990.
A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data.
Data mining
• The data warehouse has now become an important platform for data analysis and online analytical processing (OLAP).
UNDERSTANDING A DATA WAREHOUSE
DATA WAREHOUSE
• helps executives to organize, understand, and use their data to take strategic decisions.
• possesses consolidated historical data, which helps the organization to analyze its business.
• An operational database is constructed for well-known tasks and workloads such as searching for particular
records, indexing, etc. In contrast, data warehouse queries are often complex and present a general form of
data.
• Operational databases support concurrent processing of multiple transactions. Concurrency control and
recovery mechanisms are required for operational databases to ensure robustness and consistency of the
database.
• An operational database query allows read and modify operations, while an OLAP query needs only read-only
access to stored data.
• An operational database maintains current data. On the other hand, a data warehouse maintains historical data.
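The contrast above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names (orders, sales_history) and all sample values are hypothetical and chosen only for illustration. A real warehouse would normally be a separate system, not a second table in the same database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Operational (OLTP-style) table: holds only the current state of each order.
cur.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT, amount REAL)")
cur.execute("INSERT INTO orders VALUES (1, 'pending', 120.0)")

# A typical operational query reads and modifies one well-known record.
cur.execute("UPDATE orders SET status = 'shipped' WHERE order_id = 1")
current_status = cur.execute(
    "SELECT status FROM orders WHERE order_id = 1"
).fetchone()[0]

# Warehouse-style table: keeps historical rows across several periods.
cur.execute("CREATE TABLE sales_history (year INTEGER, region TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO sales_history VALUES (?, ?, ?)",
    [
        (2022, "North", 100.0),
        (2022, "South", 150.0),
        (2023, "North", 200.0),
        (2023, "South", 50.0),
    ],
)

# An OLAP-style query only reads, and summarizes many rows into a general view.
yearly_totals = cur.execute(
    "SELECT year, SUM(amount) FROM sales_history GROUP BY year ORDER BY year"
).fetchall()

conn.close()
```

The operational query touches a single record and changes it, while the analytical query scans the history and returns one summary row per year without modifying anything.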
DATA WAREHOUSE FEATURES
•Subject Oriented −
•provides information around a subject rather than the organization's ongoing operations.
•Example: product, customers, suppliers, sales, revenue, etc.
•does not focus on the ongoing operations
•focuses on modelling and analysis of data for decision making.
•Integrated −
• integrating data from heterogeneous sources
• such as relational databases, flat files, etc.
• integration enhances the effective analysis of data.
•Time Variant −
•the data collected in a data warehouse is identified with a particular time period.
•provides information from the historical point of view.
•Non-volatile −
• previous data is not erased when new data is added to it.
• kept separate from the operational database
• frequent changes in the operational database are not reflected in the data warehouse.
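The integrated, time-variant, and non-volatile features can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the two sources, their field names (customer, amount, cust_name, total), and the load helper are all hypothetical, and a plain list stands in for a warehouse table.

```python
import csv
import io
from datetime import date

# Hypothetical heterogeneous sources: a flat file (CSV text) and
# rows as they might come from a relational database.
flat_file = "customer,amount\nAsha,100\nRavi,250\n"
db_rows = [{"cust_name": "Meena", "total": 75.0}]

warehouse = []  # append-only list standing in for a warehouse table


def load(records, snapshot_date):
    """Append records stamped with their time period; existing rows are never changed."""
    for customer, amount in records:
        warehouse.append(
            {"customer": customer, "amount": float(amount), "snapshot": snapshot_date}
        )


# Integrated: map both source formats onto one common (customer, amount) shape.
from_file = [(r["customer"], r["amount"]) for r in csv.DictReader(io.StringIO(flat_file))]
from_db = [(r["cust_name"], r["total"]) for r in db_rows]

load(from_file + from_db, date(2023, 1, 31))

# Time variant and non-volatile: the next period's load adds rows and keeps the old ones.
load([("Asha", 130)], date(2023, 2, 28))

# Historical view of one subject (a customer) across time periods.
asha_history = [
    (row["snapshot"].isoformat(), row["amount"])
    for row in warehouse
    if row["customer"] == "Asha"
]
```

Because every row carries its snapshot date and nothing is overwritten, the earlier value for a customer remains available after a later load.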
DATA WAREHOUSE APPLICATIONS
• helps business executives to organize, analyze, and use their data for decision making.
• serves as an integral part of a plan-execute-assess "closed-loop" feedback system for enterprise management.
• widely used in the following fields −
• Financial services
• Banking services
• Consumer goods
• Retail sectors
• Controlled manufacturing
DATA WAREHOUSE TYPES
• Information Processing −
• A data warehouse allows processing of the data stored in it.
• through querying, basic statistical analysis, and reporting using crosstabs, tables, charts, or graphs.
• Analytical Processing −
• supports analytical processing of the information stored in it.
• the information is analyzed by means of basic OLAP operations
• including slice-and-dice, drill down, drill up, and pivoting.
• Data Mining −
• supports knowledge discovery by finding hidden patterns and associations,
• constructing analytical models, performing classification and prediction.
• results are presented using visualization tools.
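The basic OLAP operations named above can be sketched on a tiny in-memory data set. This is a minimal illustration, not how an OLAP server is implemented: the fact rows, the dimension order (year, quarter, region, product), and the aggregate helper are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, product, sales).
facts = [
    (2023, "Q1", "North", "Pen", 10),
    (2023, "Q1", "South", "Pen", 20),
    (2023, "Q2", "North", "Book", 30),
    (2023, "Q2", "South", "Book", 40),
    (2024, "Q1", "North", "Pen", 50),
]


def aggregate(rows, key):
    """Sum the sales measure grouped by key(row)."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[4]
    return dict(totals)


# Slice: fix one dimension to a single value (year = 2023).
slice_2023 = [r for r in facts if r[0] == 2023]

# Dice: restrict two or more dimensions (region = North and product = Pen).
dice = [r for r in facts if r[2] == "North" and r[3] == "Pen"]

# Drill up (roll-up): summarize at the coarser year level.
by_year = aggregate(facts, lambda r: r[0])

# Drill down: move to the finer (year, quarter) level.
by_quarter = aggregate(facts, lambda r: (r[0], r[1]))

# Pivot: rotate the view so regions become rows and years become columns.
pivot = defaultdict(dict)
for (region, year), total in aggregate(facts, lambda r: (r[2], r[0])).items():
    pivot[region][year] = total
```

Drill up and drill down use the same aggregation and differ only in how fine the grouping key is, which is why one helper serves both.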
DIFFERENCE BETWEEN DATA WAREHOUSE AND OPERATIONAL DATABASE
2. Data warehouse (OLAP): OLAP systems are used by knowledge workers such as executives, managers, and analysts.
   Operational database (OLTP): OLTP systems are used by clerks, DBAs, or database professionals.
1. A goal of data mining includes which of the following?
a) To explain some observed event or condition
b) To confirm that data exists
c) To analyze data for expected relationships
d) To create a new data warehouse
2. A data warehouse is which of the following?
a) Can be updated by end users.
b) Contains numerous naming conventions and formats.
c) Organized around important subject areas.
d) Contains only current data.
3. __________ is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management decisions.
a) Data Mining
b) Data Warehousing
c) Web Mining
d) Text Mining
4. The data warehouse is __________.
a) read only
b) write only
c) read write only
d) none
5. Expansion for DSS in DW is __________.
a) Decision Support System
b) Decision Single System
c) Data Storable System
d) Data Support System
REFERENCES
https://www.tutorialspoint.com/dwh/dwh_overview.htm