module -3 BI


Business Intelligence and Analytics: Systems for Decision Support
(10th Edition)

Chapter 3:
Data Warehousing
Learning Objectives
 Understand the basic definitions and
concepts of data warehouses
 Learn different types of data
warehousing architectures; their
comparative advantages and
disadvantages
 Describe the processes used in
developing and managing data
warehouses
 Explain data warehousing operations
(Continued…)
Copyright © 2014 Pearson Education, Inc.
Learning Objectives
 Explain the role of data warehouses
in decision support
 Explain data integration and the
extraction, transformation, and load
(ETL) processes
 Describe real-time (a.k.a. right-time
and/or active) data warehousing
 Understand data warehouse
administration and security issues
Opening Vignette…
“Isle of Capri Casinos Is Winning
with Enterprise Data Warehouse”
 Company background
 Problem description
 Proposed solution
 Results
 Answer & discuss the case
questions.
Questions for the
Opening Vignette
1. Why is it important for Isle to have an EDW?
2. What were the business challenges or
opportunities that Isle was facing?
3. What was the process Isle followed to realize
EDW? Comment on the potential challenges Isle
might have had going through the process of
EDW development.
4. What were the benefits of implementing an EDW
at Isle? Can you think of other potential benefits
that were not listed in the case?
5. Why do you think large enterprises like Isle in
the gaming industry can succeed without having
a capable data warehouse/business intelligence …
Main Data Warehousing
Topics
 DW definition
 Characteristics of DW
 Data Marts
 ODS, EDW, Metadata
 DW Framework
 DW Architecture & ETL Process
 DW Development
 DW Issues



What is a Data Warehouse?
 A physical repository where
relational data are specially
organized to provide enterprise-wide,
cleansed data in a standardized
format
 “The data warehouse is a collection
of integrated, subject-oriented
databases designed to support DSS
functions, where each unit of data is
non-volatile and relevant to some …”
A Historical Perspective to
Data Warehousing
1970s
 Mainframe computers
 Simple data entry
 Routine reporting
 Primitive database structures
 Teradata incorporated

1980s
 Mini/personal computers (PCs)
 Business applications for PCs
 Distributed DBMS
 Relational DBMS
 Teradata ships commercial DBs
 Business Data Warehouse coined

1990s
 Centralized data storage
 Data warehousing was born
 Inmon, Building the Data Warehouse
 Kimball, The Data Warehouse Toolkit
 EDW architecture design

2000s
 Exponentially growing Web data
 Consolidation of DW/BI industry
 Data warehouse appliances emerged
 Business intelligence popularized
 Data mining and predictive modeling
 Open source software
 SaaS, PaaS, Cloud Computing

2010s
 Big Data analytics
 Social media analytics
 Text and Web Analytics
 Hadoop, MapReduce, NoSQL
 In-memory, in-database


Characteristics of DWs
 Subject oriented
 Integrated
 Time-variant (time series)
 Nonvolatile
 Summarized
 Not normalized
 Metadata
 Web based, relational/multidimensional
 Client/server, …
Data Mart
A departmental small-scale “DW”
that stores only limited/relevant data
 Dependent data mart
A subset that is created directly from a
data warehouse
 Independent data mart
A small data warehouse designed for a
strategic business unit or a department
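
The idea of a dependent data mart as a subset created directly from the warehouse can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module; the table `warehouse_sales`, the view `marketing_mart`, and the sample rows are hypothetical and not taken from the chapter.

```python
import sqlite3

# Hypothetical warehouse table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE warehouse_sales (dept TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("Marketing", "East", 120.0),
        ("Marketing", "West", 80.0),
        ("Finance", "East", 200.0),
    ],
)

# Dependent data mart: a departmental subset created directly from the warehouse.
conn.execute(
    "CREATE VIEW marketing_mart AS "
    "SELECT region, amount FROM warehouse_sales WHERE dept = 'Marketing'"
)

mart_rows = conn.execute(
    "SELECT region, amount FROM marketing_mart ORDER BY region"
).fetchall()
print(mart_rows)
```

An independent data mart, by contrast, would be loaded from the source systems on its own rather than derived from a shared warehouse table.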



Other DW Components
 Operational data stores (ODS)
A type of database often used as an
interim area for a data warehouse
 Oper mart: an operational data mart
 Enterprise data warehouse (EDW)
A data warehouse for the enterprise.
 Metadata: Data about data.
In a data warehouse, metadata describe
the contents of a data warehouse and the
manner of its acquisition and use
Application Case 3.1
A Better Data Plan: Well-Established
TELCOs Leverage Data Warehousing and
Analytics to Stay on Top in a Competitive
Industry
Questions for Discussion
1. What are the main challenges for TELCOs?
2. How can data warehousing and data
analytics help TELCOs in overcoming their
challenges?
3. Why do you think TELCOs are well suited to
take full advantage of data analytics?
A Generic DW Framework
[Figure: a generic DW framework. Data sources (ERP, legacy, POS, other OLTP/Web, external data) feed an ETL process (select, extract, transform, integrate, load) that populates the enterprise data warehouse, together with metadata and replication. The warehouse supplies data marts (marketing, engineering, finance, ...); a "no data marts" option is also shown. Access is through API/middleware to applications (visualization): routine business reporting, data/text mining, OLAP, dashboard, Web, and custom-built applications.]


Application Case 3.2
Data Warehousing Helps MultiCare
Save More Lives
Questions for Discussion
1. What do you think is the role of data
warehousing in healthcare systems?
2. How did MultiCare use data
warehousing to improve health
outcomes?
DW Architecture
 Three-tier architecture
1. Data acquisition software (back-end)
2. The data warehouse that contains the data &
software
3. Client (front-end) software that allows users
to access and analyze data from the
warehouse
 Two-tier architecture
The first two tiers of the three-tier architecture are
combined into one
… sometimes there is only one tier?
DW Architectures

3-tier architecture
Tier 1: Client workstation
Tier 2: Application server
Tier 3: Database server

2-tier architecture
Tier 1: Client workstation
Tier 2: Application & database server

1-tier architecture?


Data Warehousing
Architectures
 Issues to consider when deciding
which architecture to use:
 Which database management system
(DBMS) should be used?
 Will parallel processing and/or partitioning
be used?
 Will data migration tools be used to load
the data warehouse?
 What tools will be used to support data
retrieval and analysis?



A Web-Based DW
Architecture

[Figure: a Web-based DW architecture. Components shown: client (Web browser), Internet/intranet/extranet, Web server (Web pages), application server, and data warehouse.]


Alternative DW
Architectures
(a) Independent Data Marts Architecture
Source systems -> Staging area (ETL) -> Independent data marts (atomic/summarized data) -> End user access and applications

(b) Data Mart Bus Architecture with Linked Dimensional Data Marts
Source systems -> Staging area (ETL) -> Dimensionalized data marts linked by conformed dimensions (atomic/summarized data) -> End user access and applications

(c) Hub and Spoke Architecture (Corporate Information Factory)
Source systems -> Staging area (ETL) -> Normalized relational warehouse (atomic data), which feeds dependent data marts (summarized/some atomic data) -> End user access and applications

Alternative DW
Architectures
(d) Centralized Data Warehouse Architecture
Source systems -> Staging area (ETL) -> Normalized relational warehouse (atomic/some summarized data) -> End user access and applications

(e) Federated Architecture
Existing data warehouses, data marts, and legacy systems -> Data mapping / metadata -> Logical/physical integration of common data elements -> End user access and applications

 Each architecture has advantages
and disadvantages!
 Which architecture is the best?
Ten factors that potentially affect
the architecture selection
decision
1. Information interdependence between organizational units
2. Upper management’s information needs
3. Urgency of need for a data warehouse
4. Nature of end-user tasks
5. Constraints on resources
6. Strategic view of the data warehouse prior to implementation
7. Compatibility with existing systems
8. Perceived ability of the in-house IT staff
9. Technical issues
10. Social/political factors
Teradata Corp. DW
Architecture



Data Integration and the Extraction,
Transformation, and Load Process
 ETL = Extract Transform Load
 Data integration
Integration that comprises three major processes:
data access, data federation, and change
capture.
 Enterprise application integration (EAI)
A technology that provides a vehicle for pushing
data from source systems into a data warehouse
 Enterprise information integration (EII)
An evolving tool space that promises real-time
data integration from a variety of sources, such
as relational or multidimensional databases, Web
services, etc.
Data Integration and the Extraction,
Transformation, and Load Process

[Figure: the ETL process. Sources (packaged application, legacy system, other internal applications, transient data source) -> Extract -> Transform -> Cleanse -> Load -> Data warehouse and data mart.]
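
The extract, transform, cleanse, and load stages in the figure can be sketched as a small pipeline. This is only an illustration under stated assumptions: the source records, the `sales` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source records, standing in for a legacy system or packaged application.
source_rows = [
    {"id": "1", "customer": " alice ", "amount": "10.50"},
    {"id": "2", "customer": "BOB", "amount": "n/a"},  # unparseable amount
    {"id": "3", "customer": "Carol", "amount": "7"},
]

def extract(rows):
    """Extract: read raw records from the source."""
    return list(rows)

def transform(rows):
    """Transform: standardize names and convert types."""
    out = []
    for r in rows:
        try:
            amount = float(r["amount"])
        except ValueError:
            amount = None  # flagged for the cleansing step
        out.append((int(r["id"]), r["customer"].strip().title(), amount))
    return out

def cleanse(rows):
    """Cleanse: discard records whose amount could not be parsed."""
    return [r for r in rows if r[2] is not None]

def load(conn, rows):
    """Load: write the cleansed records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, cleanse(transform(extract(source_rows))))
loaded = conn.execute("SELECT id, customer, amount FROM sales ORDER BY id").fetchall()
print(loaded)
```

Keeping each stage as a separate function makes it easier to swap in a different source or cleansing rule; commercial ETL tools add metadata capture and many more connectors on top of this basic flow.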



ETL (Extract, Transform,
Load)
 Issues affecting the purchase of an ETL
tool
 Data transformation tools are expensive
 Data transformation tools may have a long
learning curve
 Important criteria in selecting an ETL tool
 Ability to read from and write to an unlimited
number of data sources/architectures
 Automatic capturing and delivery of metadata
 A history of conforming to open standards
 An easy-to-use interface for the developer and
the functional user
Data Warehouse
Development
Data warehouse development approaches
 Inmon Model: EDW approach (top-down)
 Kimball Model: Data mart approach
(bottom-up)
 Which model is best?
 Table 3.3 provides a comparative analysis
between EDW and Data Mart approach
 One alternative is the hosted warehouse



Application Case 3.5
Starwood Hotels & Resorts Manages
Hotel Profitability with Data
Warehousing
Questions for Discussion
1. How big and complex are the business
operations of Starwood Hotels & Resorts?
2. How did Starwood Hotels & Resorts use
data warehousing for better profitability?
3. What were the challenges, the proposed
solution, and the obtained results?
Additional DW Considerations
Hosted Data Warehouses
 Benefits:
 Requires minimal investment in infrastructure
 Frees up capacity on in-house systems
 Frees up cash flow
 Makes powerful solutions affordable
 Enables solutions that provide for growth
 Offers better quality equipment and software
 Provides faster connections
 … more in the book



Representation of Data in DW
 Dimensional Modeling
 A retrieval-based system that supports high-
volume query access
 Star schema
 The most commonly used and the simplest
style of dimensional modeling
 Contains a fact table surrounded by and
connected to several dimension tables
 Snowflake schema
 An extension of star schema where the
diagram resembles a snowflake in shape
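
A star schema can be sketched with one fact table and two dimension tables. The example below is a minimal illustration using Python's sqlite3 module; the table names (`fact_sales`, `dim_time`, `dim_product`), columns, and rows are assumptions made for the sketch, loosely modeled on a sales fact with time and product dimensions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold descriptive attributes.
CREATE TABLE dim_time    (time_id INTEGER PRIMARY KEY, quarter TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, brand TEXT);
-- The fact table holds measures plus a key into each dimension.
CREATE TABLE fact_sales (
    time_id    INTEGER REFERENCES dim_time(time_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    units_sold INTEGER
);
INSERT INTO dim_time VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO dim_product VALUES (1, 'Acme'), (2, 'Zenith');
INSERT INTO fact_sales VALUES (1, 1, 10), (1, 2, 5), (2, 1, 7), (2, 1, 3);
""")

# A typical dimensional query: join the fact table to its dimensions and aggregate.
units_by_quarter_brand = conn.execute("""
    SELECT t.quarter, p.brand, SUM(f.units_sold)
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_id = f.time_id
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY t.quarter, p.brand
    ORDER BY t.quarter, p.brand
""").fetchall()
print(units_by_quarter_brand)
```

A snowflake variant would further normalize a dimension, for example by moving `brand` into its own table referenced from `dim_product`.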
Multidimensionality
The ability to organize, present, and analyze
data by several dimensions, such as sales by
region, by product, by salesperson, and by
time (four dimensions)
 Multidimensional presentation
 Dimensions: products, salespeople, market
segments, business units, geographical
locations, distribution channels, country, or
industry
 Measures: money, sales volume, head count,
inventory profit, actual versus forecast
 Time: daily, weekly, monthly, quarterly, or …
Star versus Snowflake
Schema
[Figure: star schema versus snowflake schema.
Star schema: fact table SALES (UnitsSold, ...) surrounded by dimensions TIME (Quarter, ...), PRODUCT (Brand, ...), PEOPLE (Division, ...), and GEOGRAPHY (Country, ...).
Snowflake schema: fact table SALES (UnitsSold, ...) with dimensions DATE (Date, ...), PRODUCT (LineItem, ...), PEOPLE (Division, ...), and STORE (LocID, ...), plus further normalized dimensions MONTH (M_Name, ...), QUARTER (Q_Name, ...), BRAND (Brand, ...), CATEGORY (Category, ...), and LOCATION (State, ...).]


Analysis of Data in DW
 OLTP vs. OLAP…
 OLTP (online transaction processing)
 Capturing and storing data from ERP, CRM,
POS, …
 The main focus is on efficiency of routine tasks
 OLAP (online analytical processing)
 Converting data into information for decision
support
 Data cubes, drill-down / rollup, slice & dice, …
 Requesting ad hoc reports
 Conducting statistical and other analyses
 Developing multimedia-based applications
 …more in the book
OLAP vs. OLTP



OLAP Operations
 Slice - a subset of a multidimensional
array
 Dice - a slice on more than two
dimensions
 Drill Down/Up - navigating among levels of
data ranging from the most summarized
(up) to the most detailed (down)
 Roll Up - computing all of the data
relationships for one or more dimensions
 Pivot - used to change the dimensional
orientation of a report or an ad hoc query-…
Slicing Operations on a Simple Three-Dimensional Data Cube
[Figure: a 3-dimensional OLAP cube with slicing operations. The cube's dimensions are Time, Product, and Geography; cells are filled with numbers representing sales volumes. Three slices are shown: sales volumes of a specific Product on variable Time and Region; sales volumes of a specific Region on variable Time and Products; and sales volumes of a specific Time on variable Region and Products.]
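
The slice, dice, and roll-up operations can be sketched on a tiny in-memory cube. This is a hedged illustration, not how an OLAP server is implemented: the cube is a plain Python dictionary keyed by (time, product, region), and the function names and sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical cube: (time, product, region) -> sales volume.
cube = {
    ("Q1", "Widget", "East"): 10,
    ("Q1", "Widget", "West"): 4,
    ("Q1", "Gadget", "East"): 6,
    ("Q2", "Widget", "East"): 8,
    ("Q2", "Gadget", "West"): 5,
}
DIMS = ("time", "product", "region")

def slice_cube(cube, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return {k: v for k, v in cube.items() if k[i] == value}

def dice_cube(cube, **allowed):
    """Dice: restrict several dimensions to sets of allowed values."""
    return {
        k: v
        for k, v in cube.items()
        if all(k[DIMS.index(d)] in vals for d, vals in allowed.items())
    }

def roll_up(cube, keep):
    """Roll up: aggregate away every dimension not listed in keep."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for k, v in cube.items():
        totals[tuple(k[i] for i in idx)] += v
    return dict(totals)

widget_slice = slice_cube(cube, "product", "Widget")
east_q1 = dice_cube(cube, time={"Q1"}, product={"Widget", "Gadget"}, region={"East"})
by_time = roll_up(cube, ["time"])
print(widget_slice, east_q1, by_time)
```

Here `widget_slice` corresponds to "sales volumes of a specific Product on variable Time and Region", and drilling down is the reverse of `roll_up`: returning from the time totals to the detailed cells.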



Variations of OLAP
 Multidimensional OLAP (MOLAP)
OLAP implemented via a specialized
multidimensional database (or data
store) that summarizes transactions into
multidimensional views ahead of time
 Relational OLAP (ROLAP)
The implementation of an OLAP database
on top of an existing relational database
 Database OLAP and Web OLAP (DOLAP
and WOLAP); Desktop OLAP,…
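
The difference between MOLAP and ROLAP can be sketched by answering the same question two ways. This is a simplified illustration under stated assumptions: a dictionary of precomputed totals stands in for a multidimensional store, an in-memory SQLite table stands in for the existing relational database, and all names and rows are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical transactions: (quarter, region, units).
transactions = [("Q1", "East", 10), ("Q1", "West", 4), ("Q2", "East", 8), ("Q2", "East", 2)]

# MOLAP-style: summarize transactions into a multidimensional view ahead of time.
molap_store = defaultdict(int)
for quarter, region, units in transactions:
    molap_store[(quarter, region)] += units

def molap_lookup(quarter, region):
    """Answer from the precomputed summary without touching the raw rows."""
    return molap_store.get((quarter, region), 0)

# ROLAP-style: keep rows in a relational table and aggregate at query time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (quarter TEXT, region TEXT, units INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", transactions)

def rolap_lookup(quarter, region):
    """Answer by running an aggregate SQL query against the relational table."""
    row = conn.execute(
        "SELECT COALESCE(SUM(units), 0) FROM sales WHERE quarter = ? AND region = ?",
        (quarter, region),
    ).fetchone()
    return row[0]

print(molap_lookup("Q2", "East"), rolap_lookup("Q2", "East"))
```

Both lookups return the same total; the trade-off is that the precomputed store answers quickly but must be rebuilt when data changes, while the relational query always reflects the current rows.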
Technology Insights 3.2
Hands-On DW with MicroStrategy
 A wealth of teaching and learning
resources can be found at the TUN portal
www.teradatauniversitynetwork.com
 The available resources include
scripted demonstrations,
assignments, white papers, etc.


DW Implementation Issues
 Identification of data sources and
governance
 Data quality planning, data model design
 ETL tool selection
 Establishment of service-level agreements
 Data transport, data conversion
 Reconciliation process
 End-user support
 Political issues
 … more in the book
Successful DW
Implementation
Things to Avoid
 Starting with the wrong sponsorship chain
 Setting expectations that you cannot meet
 Engaging in politically naive behavior
 Loading the data warehouse with
information just because it is available
 Believing that data warehousing database
design is the same as transactional
database design
 Choosing a data warehouse manager who
is technology oriented rather than user
oriented
Failure Factors in DW Projects
 Lack of executive sponsorship
 Unclear business objectives
 Cultural issues being ignored
 Change management
 Unrealistic expectations
 Inappropriate architecture
 Low data quality / missing information
 Loading data just because it is
available
Massive DW and Scalability
 Scalability
 The main issues pertaining to scalability:
 The amount of data in the warehouse
 How quickly the warehouse is expected to
grow
 The number of concurrent users
 The complexity of user queries
 Good scalability means that queries and
other data-access functions will grow
linearly with the size of the warehouse
Real-Time/Active DW/BI
 Enabling real-time data updates for
real-time analysis and real-time
decision making is growing rapidly
 Push vs. Pull (of data)
 Concerns about real-time BI
 Not all data should be updated continuously
 Mismatch of reports generated minutes apart
 May be cost prohibitive
 May also be infeasible
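
The push versus pull distinction can be sketched with two toy classes. This is a hypothetical illustration only: `Source` and `Warehouse` are invented names, and a real active data warehouse would rely on change capture or messaging infrastructure rather than in-process method calls.

```python
class Warehouse:
    """Hypothetical warehouse that records the rows it receives."""

    def __init__(self):
        self.rows = []

    def receive(self, row):
        # Push: the source calls this as soon as a change happens.
        self.rows.append(row)

    def pull_from(self, source):
        # Pull: the warehouse asks the source for whatever has accumulated.
        self.rows.extend(source.drain())


class Source:
    """Hypothetical operational system producing new rows."""

    def __init__(self):
        self.pending = []
        self.subscribers = []

    def drain(self):
        rows, self.pending = self.pending, []
        return rows

    def record(self, row, push=False):
        if push:
            for subscriber in self.subscribers:
                subscriber.receive(row)
        else:
            self.pending.append(row)


src = Source()
pull_wh, push_wh = Warehouse(), Warehouse()
src.subscribers.append(push_wh)

src.record("order-1")             # waits until the warehouse pulls
rows_before_pull = list(pull_wh.rows)
pull_wh.pull_from(src)            # e.g. a scheduled batch load
src.record("order-2", push=True)  # delivered immediately
print(rows_before_pull, pull_wh.rows, push_wh.rows)
```

With pull, data freshness depends on how often the warehouse polls; with push, updates arrive as they occur, which is closer to real-time but puts more load on the source systems.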



Evolution and Data
Warehousing



Real-Time/Active DW at
Teradata



Traditional versus Active DW



DW Administration and
Security
 Data warehouse administrator (DWA)
 DWA should…
 have knowledge of high-performance software,
hardware, and networking technologies
 possess solid business knowledge and insight
 be familiar with the decision-making processes so as
to suitably design/maintain the data warehouse
structure
 possess excellent communication skills
 Security and privacy are pressing issues in
DW
 Safeguarding the most valuable assets
 Government regulations (HIPAA, etc.)

The Future of DW
 Sourcing…
 Web, social media, and Big Data
 Open source software
 SaaS (software as a service)
 Cloud computing
 Infrastructure…
 Columnar
 Real-time DW
 Data warehouse appliances
 Data management practices/technologies
 In-database & in-memory processing
 New DBMS
 Advanced analytics
 …
Free of Charge DW Portal
for Teaching & Learning
 www.TeradataStudentNetwork.com
 Password to signup: <check with your
instructor>



End of the Chapter

 Questions, comments



All rights reserved. No part of this publication may be
reproduced, stored in a retrieval system, or transmitted, in
any form or by any means, electronic, mechanical,
photocopying, recording, or otherwise, without the prior
written permission of the publisher. Printed in the United
States of America.

