
BI Introduction

BUSINESS INTELLIGENCE AND DATA WAREHOUSE
Goals
After completing this chapter, you will be able to:
• Describe the role of business intelligence in providing comprehensive business decision support
• Define BI architecture and its components
• Summarize main applications and business value for BI
• Differentiate between operational data and decision support data
• Identify the purpose, characteristics, and components of a data warehouse

THE BUSINESS DEMAND FOR DATA, INFORMATION, AND ANALYTICS
• Data → information → knowledge → data…
• Knowledge helps organizations make informed decisions and understand their operations, customers, competitors, suppliers, partners, employees, and stockholders:
  ▪ Insight into the past: what has happened in the business?
  ▪ Understanding the future: what might happen?
  ▪ Advice on possible outcomes: what action to take?

Need of IS use at high level

Wisdom → action: I shall book a train before other passengers realise the implications.
Knowledge: My experience says this will cause severe flight delays.
Information: Heathrow weather station; visibility 15 km, sky completely cloudy; wind direction north west, speed 85 kts; temperature 15.7 degrees C
Data: 03772 41565 83385 10157

Data
• Data is raw, random, and unorganized
• Data: a collection of ingredients sitting on the counter. They include carrots, onions, leeks, garlic, and potatoes from the farmer’s market, and a package of chicken, a box of rice, and some cans of broth from the grocery store
➔ In the data warehousing (DW)/BI world, this is like source data from different operational systems.

Information
• Information is data that has been organized, structured, and processed. Information is what you use to gain knowledge.
• Information: you get everything ready by washing, peeling, and cutting up the vegetables, cutting up the chicken, and opening the cans of broth. You put it all in the pot and turn on the heat, where it cooks and becomes soup.
➔ In the DW/BI world, the data has been moved into the ETL (extract, transform, and load) system and is transformed into information.

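The extract, transform, and load steps can be sketched in a few lines. This is only an illustration, not a production ETL design: the two source "systems" (`pos_rows`, `web_rows`), their field names, and the `warehouse` list are all assumptions made up for the example.

```python
# Hypothetical raw rows from two operational systems (assumed for illustration).
pos_rows = [{"sku": "A1", "qty": "2", "price": "3.50"},
            {"sku": "B2", "qty": "1", "price": "10.00"}]
web_rows = [{"item": "a1", "quantity": 3, "unit_price": 3.5}]


def extract():
    """Extract: pull raw rows from each source, tagged with their origin."""
    for row in pos_rows:
        yield "pos", row
    for row in web_rows:
        yield "web", row


def transform(source, row):
    """Transform: map each source's layout onto one common, typed layout."""
    if source == "pos":
        sku, qty, price = row["sku"], int(row["qty"]), float(row["price"])
    else:
        sku, qty, price = row["item"], row["quantity"], row["unit_price"]
    return {"sku": sku.upper(), "qty": qty, "revenue": round(qty * price, 2)}


def load(rows, target):
    """Load: append the cleaned rows to the target store."""
    target.extend(rows)


warehouse = []
load([transform(source, row) for source, row in extract()], warehouse)

# The integrated rows can now answer a question neither source could alone.
revenue_by_sku = {}
for row in warehouse:
    revenue_by_sku[row["sku"]] = revenue_by_sku.get(row["sku"], 0) + row["revenue"]
print(revenue_by_sku)
```

The point of the sketch is the transform step: the same product arrives as "A1" with text fields from one source and as "a1" with numeric fields from the other, and only after conforming both can revenue be totalled per product.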
Knowledge
• Knowledge: now the soup is ready to be put into bowls and eaten.
➔ In the DW/BI world, business people consume the information in reports to gain knowledge that helps them make informed business decisions.

THE BUSINESS DEMAND FOR DATA, INFORMATION, AND ANALYTICS
• Organizations tend to grow and prosper as they gain a better understanding of their environment
  ▪ Evaluate through tracking daily transactions and analyzing company data
• Organizations are always looking for a competitive advantage
  ▪ Product development, market positioning, sales promotions, and customer service
• Companies and software vendors addressed these multilevel decision support needs by creating autonomous applications for particular groups of users
• This more comprehensive and integrated decision support framework within organizations became known as business intelligence

THE BUSINESS DEMAND FOR DATA, INFORMATION, AND ANALYTICS
• While enterprises still need leaders and decision-makers with intuition, they depend on data to validate their intuitions
• Data becomes a strategic guide that helps executives see patterns they might not otherwise notice.
→ TOO MUCH DATA, TOO LITTLE INFORMATION
→ the importance of analytics

THE BUSINESS DEMAND FOR DATA, INFORMATION, AND ANALYTICS
• BI Analytics again ranks number one in the Gartner CIO Survey
  ▪ https://www.concorn.com/bi-analytics-gartner-cio-survey/
  ▪ Data and analytics is the most important technology initiative for 2014, with 72% of CIOs surveyed stating that it is a critical or high priority
• Many more people in an organization now need the information that comes from all this data.

Problem: Heterogeneous Information Sources
“Heterogeneities are everywhere”
[Figure: four kinds of sources: Personal Databases, World Wide Web, Scientific Databases, Digital Libraries]
• Different interfaces
• Different data representations
• Duplicate and inconsistent information
Slide credit: J. Hamme, IS 257 – Fall 2005
Problem: Data Management in Large Enterprises
• Vertical fragmentation of informational systems (vertical stove pipes)
• Result of application (user)-driven development of operational systems
[Figure: one stove pipe of applications per department]
Sales Administration: Sales Planning, Stock Mngmt, ...
Finance: Suppliers, Debt Mngmt, ...
Manufacturing: Num. Control, Inventory, ...
Slide credit: J. Hamme, IS 257 – Fall 2005
Goal: Unified Access to Data
[Figure: an Integration System on top of the World Wide Web, Digital Libraries, Scientific Databases, and Personal Databases]
• Collects and combines information
• Provides integrated view, uniform user interface
• Supports sharing
Slide credit: J. Hamme, IS 257 – Fall 2005
The Traditional Research Approach
• Query-driven (lazy, on-demand)
[Figure: Clients → Integration System (with Metadata) → one Wrapper per Source → Sources]
Slide credit: J. Hamme, IS 257 – Fall 2005
Disadvantages of Query-Driven Approach
• Delay in query processing
• Slow or unavailable information sources
• Complex filtering and integration
• Inefficient and potentially expensive for frequent queries
• Competes with local processing at sources
• Hasn’t caught on in industry
Slide credit: J. Hamme, IS 257 – Fall 2005
The Warehousing Approach
• Information integrated in advance
• Stored in WH for direct querying and analysis
[Figure: Clients → Data Warehouse → Integration System (with Metadata) → one Extractor/Monitor per Source → Sources]
Slide credit: J. Hamme, IS 257 – Fall 2005
BI solution
• Multidimensional aggregation and allocation
• Realtime reporting with analytical alert
• Key performance indicators optimization
• ….
International Journal of Mechatronics, Electrical and Computer Technology (IJMEC

A Framework for Business Intelligence
• DSS → EIS → BI

A Framework for Business Intelligence
• A High-Level Architecture of BI:
Source: Eckerson, W. Smart Companies in the 21st Century: The Secrets of Creating Successful Business Intelligence Solutions. The Data Warehousing Institute, Seattle, WA, 2003, p. 32

Operational data store
Inmon’s CIF architecture
Kimball’s enterprise data bus architecture
Source Systems
• Many possible sources – ERP, CRM, legacy systems, unstructured data, etc.
• Many platforms – IBM, Oracle, Microsoft, Sybase, SAS
• Many formats – Relational, Hierarchical, Columnar, Multidimensional, Big data MapReduce Databases, Unstructured text data

BI Services components
• Integration Services (ETL, Operational Data Feeds, Enterprise Application Integration, Enterprise Information Integration)
• Data Management Services (data warehouse, data marts, federated data marts, OLAP cubes, etc.)
• Reporting and Analytical Services (Analytical Reporting, ad-hoc query and batch reporting, dashboards/scorecards, predictive and prescriptive modeling, data & text mining/forecasting)
• Information Delivery and Consumption Services (Web portals, subscription, direct user access, internal portals)

Types of BI users
• IT developers
• Analysts
• Information workers
• Managers and executives
• Front line workers
• Suppliers, customers, and regulators

Source: Watson, H. J., "Tutorial: Business Intelligence – Past, Present, and Future," Communications of the Association for Information Systems: Vol. 25, Article 39, 2009.

BI Business Value
• According to Williams (2004), BI can add value to:
  ▪ Management Processes: planning, budgeting, performance monitoring/assessment, process improvement, cost analysis, optimization, etc.
  ▪ Revenue Generating Processes: customer segmentation, campaign management, channel management, sales management, etc.
  ▪ Resource Consumption Processes: product/service development, order management, manufacturing/operations, supply chain, purchasing, etc.

Adapted from Williams (2004), Assessing BI Readiness: A Key to BI ROI. Business Intelligence Journal, Vol. 9, pp. 15-23, Summer 2004

Business intelligence
• Business intelligence (BI) is a broad category of applications, technologies, and processes for gathering, storing, accessing, and analyzing data to help business users make better decisions.
Watson, Hugh J. (2009) "Tutorial: Business Intelligence – Past, Present, and Future," Communications of the Association for Information Systems: Vol. 25, Article 39.

• The term Business Intelligence (BI) refers to technologies, applications and practices for the collection, integration, analysis, and presentation of business information. The purpose of Business Intelligence is to support better business decision making.
https://olap.com/learn-bi-olap/olap-bi-definitions/business-intelligence/

Business Intelligence
• Comprehensive, cohesive, integrated set of tools and processes
  ▪ Captures, collects, integrates, stores, and analyzes data
  ▪ Generates and presents information to support business decision making
  ▪ Allows transformation of:
    ▪ Data into information
    ▪ Information into knowledge
    ▪ Knowledge into wisdom

Business Intelligence
• Concepts, practices, tools and techniques to help a business
  ▪ Understand core capabilities
  ▪ Provide snapshots of the company situation
  ▪ Identify key opportunities to create a competitive advantage
• Provides a framework for
  ▪ Collecting and storing operational data and aggregating it into decision support data
  ▪ Analyzing decision support data and presenting generated information to end users to support business decisions
  ▪ Making business decisions, which generate more data
  ▪ Monitoring results to evaluate outcomes and predicting future outcomes with a high degree of accuracy

Business Intelligence Benefits
• Improved decision making is the main goal of BI, but BI provides other benefits
  ▪ Integrating architecture
  ▪ Common user interface for data reporting and analysis
  ▪ Common data repository fosters single version of company data
  ▪ Improved organizational performance
• Achieving all these benefits takes a lot of human, financial, and technological resources, and time
• BI benefits are not achieved overnight; they are the result of a focused company-wide effort that could take a long time

Operational Data versus Decision Support Data
• Operational data and decision support data serve different purposes
  ▪ Operational data is useful for capturing daily business transactions
  ▪ Decision support data gives tactical and strategic business meaning to the operational data
• Decision support data differs from operational data in three main areas
  ▪ Time span
  ▪ Granularity (level of aggregation)
  ▪ Dimensionality

Operational Data versus Decision Support Data
Table 13.5
Contrasting Operational and Decision Support Data Characteristics

Characteristic | Operational Data | Decision Support Data
Data currency | Current operations; real-time data | Historic data; snapshot of company data; time component (week/month/year)
Granularity | Atomic-detailed data | Summarized data
Summarization level | Low; some aggregate yields | High; many aggregation levels
Data model | Highly normalized; mostly relational DBMSs | Non-normalized; complex structures; some relational, but mostly multidimensional DBMSs
Transaction type | Mostly updates | Mostly query
Transaction volumes | High-update volumes | Periodic loads and summary calculations
Transaction speed | Updates are critical | Retrievals are critical
Query activity | Low to medium | High
Query scope | Narrow range | Broad range
Query complexity | Simple to medium | Very complex
Data volumes | Hundreds of gigabytes | Terabytes to petabytes

Source: (Database Systems: Design, Implementation, and Management, 13th Edition)
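The difference in time span, granularity, and dimensionality can be illustrated by turning a few operational rows into decision support data. This is a minimal sketch under stated assumptions: the `transactions` rows, the region and product names, and the choice of a monthly grain are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical operational data: one atomic row per sale.
transactions = [
    (date(2024, 1, 5), "North", "Widget", 120.0),
    (date(2024, 1, 19), "North", "Widget", 80.0),
    (date(2024, 1, 7), "South", "Gadget", 50.0),
    (date(2024, 2, 2), "North", "Gadget", 200.0),
]


def summarize(rows):
    """Aggregate detail rows into monthly totals by region and product."""
    totals = defaultdict(float)
    for day, region, product, amount in rows:
        month = (day.year, day.month)  # coarser time grain than the sale date
        totals[(month, region, product)] += amount
    return dict(totals)


monthly = summarize(transactions)
for key in sorted(monthly):
    print(key, monthly[key])
```

Each summarized row is addressed by three dimensions (time, region, product) instead of by a transaction, which is the shape the decision support column of the table describes.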
Data warehouse

“A data warehouse is a system that extracts, cleans, conforms, and delivers source data into a dimensional data store and then supports and implements querying and analysis for the purpose of decision making”
(Ralph Kimball)
The Data Warehouse ETL Toolkit (John Wiley, 2004)

“A DW is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision making process”
(W.H. Inmon)
Building the Data Warehouse, Fourth Edition (John Wiley, 2005)

Data warehouse - Schema

Dimensional approach:
• Star schema
• Snowflake schema
Normalized approach:
• 3NF model

Star Schemas
• Data-modeling technique
  ▪ Maps multidimensional decision support data into a relational database
• Creates the near equivalent of a multidimensional database schema from an existing relational database
• Yields an easily implemented model for multidimensional data analysis

Star Schemas
• Basic star schema components
  ▪ Facts: numeric values that represent a specific business aspect
  ▪ Dimensions: qualifying characteristics that provide additional perspectives to a given fact
  ▪ Attributes: used to search, filter, and classify facts
    ▪ Slice and dice: ability to focus on slices of the data cube for more detailed analysis
  ▪ Attribute hierarchies: provide a top-down data organization
    ▪ Aggregation and drill-down/roll-up data analysis

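The components above can be sketched as a small star schema in SQLite: one fact table surrounded by dimension tables. All table names, column names, and rows below are assumptions chosen for illustration (the dealer-to-region assignment in particular is made up), not a schema taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold the attributes used to search, filter, and classify.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, model TEXT, color TEXT);
CREATE TABLE dim_dealer  (dealer_id  INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE dim_time    (time_id    INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);

-- The fact table holds the numeric measure plus one key per dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    dealer_id  INTEGER REFERENCES dim_dealer(dealer_id),
    time_id    INTEGER REFERENCES dim_time(time_id),
    units_sold INTEGER
);

INSERT INTO dim_product VALUES (1, 'Coupe', 'Blue'), (2, 'Sedan', 'Red');
INSERT INTO dim_dealer  VALUES (1, 'Clyde', 'Atlanta'), (2, 'Lucas', 'Athens');
INSERT INTO dim_time    VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO fact_sales  VALUES (1, 1, 1, 5), (1, 2, 1, 3), (2, 1, 2, 4), (1, 1, 2, 2);
""")

# Aggregate the fact by two dimension attributes: model and region.
rows = conn.execute("""
    SELECT p.model, d.region, SUM(f.units_sold)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_dealer  d ON d.dealer_id  = f.dealer_id
    GROUP BY p.model, d.region
    ORDER BY p.model, d.region
""").fetchall()
print(rows)
```

Swapping `d.region` for `d.name` in the query drills down from region to dealership along the attribute hierarchy without touching the fact table.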
BI - Online Analytical Processing (OLAP)
• Online analytical processing is the activity of interactively analyzing business transaction data stored in the dimensional data warehouse to make tactical and strategic business decisions.
• Examples:
  ▪ Analyzing the effectiveness of a marketing campaign by measuring sales growth over a certain period
  ▪ Analyzing the impact of a price increase on product sales in different regions and product groups during the same period of time

Example: OLAP Usage of an Automobile Marketer

The Story
An automobile marketer wants to improve business activity. Therefore he wants to view sales figures from different perspectives.

The Data Needs
➢ Sales by model
➢ Sales by dealership
➢ Sales by color
➢ Sales over time
➢ etc.

A Question
What is the trend in sales volumes over a period of time for a specific model and color across a specific group of dealerships?

Adapted from Teradata University Network presentation on OLAP.
Example: The Multidimensional View of the Data
[Figure: a data cube of Sales Volumes with three dimensions]
MODEL: Van, Coupe, Sedan
COLOR: Blue, Red, White
DEALERSHIP: Smith, Clyde, Miller
Adapted from Teradata University Network presentation on OLAP.
OLAP Features: “Slicing and Dicing” the Data
Choosing a range out of each dimension:
• Color: Blue and White
• Model: Coupe only
• Dealership: Clyde only
[Figure: the full Sales Volumes cube (MODEL × COLOR × DEALERSHIP) reduced to the “sliced and diced” data: Coupe, Blue and White, Clyde]
Adapted from Teradata University Network presentation on OLAP.
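Choosing a range out of each dimension can be sketched with a dictionary keyed by cube coordinates. The sales volumes and the `dice` helper are hypothetical; only the dimension members come from the slide.

```python
# Hypothetical sales volumes keyed by (model, color, dealership).
cube = {
    ("Van", "Blue", "Clyde"): 4, ("Van", "Red", "Smith"): 6,
    ("Coupe", "Blue", "Clyde"): 3, ("Coupe", "White", "Clyde"): 5,
    ("Coupe", "Red", "Clyde"): 2, ("Coupe", "Blue", "Miller"): 7,
    ("Sedan", "White", "Smith"): 8,
}


def dice(cube, models=None, colors=None, dealerships=None):
    """Keep only the cells whose coordinates fall in the chosen ranges.

    Passing None for a dimension means "no restriction on this dimension".
    """
    def keep(value, allowed):
        return allowed is None or value in allowed

    return {
        (model, color, dealer): volume
        for (model, color, dealer), volume in cube.items()
        if keep(model, models) and keep(color, colors) and keep(dealer, dealerships)
    }


# The selection from the slide: Blue and White, Coupe only, Clyde only.
sub_cube = dice(cube, models={"Coupe"}, colors={"Blue", "White"},
                dealerships={"Clyde"})
print(sub_cube)
```

Restricting a single dimension to one member (for example `dice(cube, models={"Coupe"})`) gives a slice; restricting several at once, as here, gives the diced sub-cube.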
OLAP Features: Rotating the Data
Different users will require different views of the multidimensional cube – OLAP allows easy rotation of data
[Figure: the Sales Volumes cube rotated by 90°]
View of the Product Manager: MODEL (Van, Coupe, Sedan) by COLOR (Blue, Red, White)
View of the Account Manager: MODEL (Van, Coupe, Sedan) by DEALERSHIP (Miller, Smith, Clyde)
Adapted from Teradata University Network presentation on OLAP.
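Rotation amounts to choosing which two dimensions face the user while the third is summed away. The sketch below assumes made-up sales volumes and a hypothetical `view` helper; it is one simple way to express the idea, not how any particular OLAP server does it.

```python
from collections import defaultdict

# Hypothetical sales volumes keyed by (model, color, dealership).
cells = {
    ("Van", "Blue", "Miller"): 4, ("Van", "Red", "Smith"): 6,
    ("Coupe", "Blue", "Clyde"): 3, ("Coupe", "White", "Miller"): 5,
    ("Sedan", "Red", "Clyde"): 2,
}
AXES = ("model", "color", "dealership")


def view(cells, row_axis, col_axis):
    """Project the cube onto two axes, summing over the remaining one."""
    r, c = AXES.index(row_axis), AXES.index(col_axis)
    table = defaultdict(lambda: defaultdict(int))
    for coords, volume in cells.items():
        table[coords[r]][coords[c]] += volume
    return {row: dict(cols) for row, cols in table.items()}


product_view = view(cells, "model", "color")       # product manager's view
account_view = view(cells, "model", "dealership")  # account manager's view
print(product_view)
print(account_view)
```

Both views are computed from the same cells; only the facing axes change, which is why rotation needs no new data.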
OLAP Features: Drill-Down and Roll-Up
Data can be disaggregated and aggregated along a dimension according to their natural hierarchy
[Figure: Sales Volumes by Organization Dimension, a three-level hierarchy; roll-up moves up the levels, drill-down moves down]
State: Georgia
Region: Atlanta, Athens
Dealership: Miller, Smith, Clyde, Lucas, Gleason
Adapted from Teradata University Network presentation on OLAP.
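Roll-up and drill-down along the organization hierarchy can be sketched as follows. The sales volumes are invented, and the assignment of dealerships to regions in `HIERARCHY` is an assumption: the slide lists the members of each level but not which dealership belongs to which region.

```python
# Assumed placement of dealerships in the hierarchy: dealership -> (region, state).
HIERARCHY = {
    "Miller": ("Atlanta", "Georgia"),
    "Smith": ("Atlanta", "Georgia"),
    "Clyde": ("Atlanta", "Georgia"),
    "Lucas": ("Athens", "Georgia"),
    "Gleason": ("Athens", "Georgia"),
}

# Hypothetical sales volumes at the lowest level of the hierarchy.
sales = {"Miller": 10, "Smith": 7, "Clyde": 5, "Lucas": 4, "Gleason": 6}


def roll_up(sales, level):
    """Aggregate dealership-level sales to 'dealership', 'region', or 'state'."""
    totals = {}
    for dealership, volume in sales.items():
        region, state = HIERARCHY[dealership]
        key = {"dealership": dealership, "region": region, "state": state}[level]
        totals[key] = totals.get(key, 0) + volume
    return totals


def drill_down(sales, region):
    """Show the dealership-level detail behind one region's total."""
    return {d: v for d, v in sales.items() if HIERARCHY[d][0] == region}


print(roll_up(sales, "region"))
print(roll_up(sales, "state"))
print(drill_down(sales, "Athens"))
```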
OLAP SERVER
• There are three main types of OLAP servers:

Relational OLAP
• Relational online analytical processing (ROLAP)
  ▪ Provides OLAP functionality using relational databases and familiar relational tools to store and analyze multidimensional data
• Extensions added to traditional RDBMS technology
  ▪ Multidimensional data schema support within the RDBMS
  ▪ Data access language and query performance optimized for multidimensional data
  ▪ Support for very large databases (VLDBs)

Multidimensional OLAP
• Multidimensional online analytical processing (MOLAP)
  ▪ Extends OLAP functionality to multidimensional database management systems (MDBMSs)
    ▪ An MDBMS uses proprietary techniques to store data in matrix-like n-dimensional arrays
    ▪ End users visualize stored data as a three-dimensional data cube
      ▪ Cubes can grow to n dimensions, thus becoming hypercubes
      ▪ Cubes are held in memory in a cube cache to speed access
  ▪ Sparsity: measures density of data held in the data cube

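One simple way to put a number on sparsity, assumed here rather than taken from the slides, is to compute density as populated cells divided by all possible cells and treat sparsity as its complement. The cube dimensions and cell counts below are hypothetical.

```python
from math import prod


def cube_density(populated_cells, dimension_sizes):
    """Fraction of the cube's possible cells that actually hold a value."""
    total_cells = prod(dimension_sizes)
    if total_cells == 0:
        raise ValueError("every dimension needs at least one member")
    return populated_cells / total_cells


# Hypothetical cube: 4 models x 5 colors x 5 dealerships = 100 possible cells,
# of which only 25 combinations were ever sold.
sizes = [4, 5, 5]
density = cube_density(25, sizes)
sparsity = 1 - density
print(f"density={density:.2f} sparsity={sparsity:.2f}")
```

Because the number of possible cells is the product of the dimension sizes, adding dimensions tends to make a cube sparser unless the data grows at the same rate.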
BI - mining
1. Define what we want to achieve:
   • “Is there a correlation between the sales of music, film, and audio book product types and the customer interest or occupation?”
2. Prepare the data
3. Build the mining models
4. Deploy and maintain the models in production


BI - mining
• Three Types of Analytics

Illustrative problems
• “Market basket” data
  ▪ Purchase(salesID, item)
  ▪ ...
  ▪ (3, bread)
  ▪ (3, milk)
  ▪ (3, eggs)
  ▪ (3, beer)
  ▪ (4, beer)
  ▪ (4, chips)
  ▪ ....
• Find rules of the form: {L1, L2, ..., Ln} -> R
• Interpretation: “If a customer buys all the items in the set {L1, L2, …, Ln}, that customer will also want item R”
• Examples:
  {bread, milk} -> eggs
  {diapers} -> beer
• Goal: quickly find association rules in a fairly large data set (e.g., Walmart’s sales records over many years)
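A rule such as {bread, milk} -> eggs is usually judged by its support and confidence, two standard measures that the slide itself does not name. The sketch below computes both by brute force. Sales 3 and 4 come from the slide; sales 1 and 2 and the helper names are assumptions added so the numbers are not trivial. A real miner over years of sales data would use an algorithm such as Apriori rather than this direct count.

```python
from collections import defaultdict

# Purchase(salesID, item) rows. Sales 3 and 4 are from the slide;
# sales 1 and 2 are made up for illustration.
purchases = [
    (1, "bread"), (1, "milk"),
    (2, "bread"), (2, "milk"), (2, "eggs"),
    (3, "bread"), (3, "milk"), (3, "eggs"), (3, "beer"),
    (4, "beer"), (4, "chips"),
]


def baskets_from(purchases):
    """Group the purchase rows into one set of items per sale."""
    baskets = defaultdict(set)
    for sales_id, item in purchases:
        baskets[sales_id].add(item)
    return list(baskets.values())


def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(baskets, lhs, rhs):
    """Of the baskets containing `lhs`, the fraction that also contain `rhs`."""
    lhs_support = support(baskets, lhs)
    if lhs_support == 0:
        return 0.0  # the left-hand side never occurs, so the rule has no evidence
    return support(baskets, set(lhs) | {rhs}) / lhs_support


baskets = baskets_from(purchases)
print(support(baskets, {"bread", "milk"}))
print(confidence(baskets, {"bread", "milk"}, "eggs"))
```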
Illustrative problems
• Classification tree (decision tree)
  ▪ Buyers(<attributes>, purchase)
  ▪ Predict the purchase from <attributes>
• Clustering
  ▪ Buyers(<attributes>)
  ▪ Automatically group the Buyers into N groups of similar buyers
• Top-N items
  ▪ Purchase(salesID, item)
  ▪ Which N items are bought most often? (by salesID)

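The Top-N question can be sketched with a counter over the Purchase rows. Reading "by salesID" as "count an item at most once per sale" is an assumption, and the rows for sale 5 are made up.

```python
from collections import Counter

# Purchase(salesID, item) rows; sale 5 is hypothetical.
purchases = [
    (3, "bread"), (3, "milk"), (3, "eggs"), (3, "beer"),
    (4, "beer"), (4, "chips"),
    (5, "beer"), (5, "bread"), (5, "bread"),  # bread scanned twice in sale 5
]


def top_n_items(purchases, n):
    """Return the n items that appear in the most sales, with their counts."""
    # set() drops repeated (salesID, item) rows so each sale counts an item once.
    counts = Counter(item for _, item in set(purchases))
    return counts.most_common(n)


print(top_n_items(purchases, 2))
```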
BI - dashboard
• Dashboards are a category of business intelligence applications that give a quick high-level summary of business performance in graphical gadgets, typically gauges, charts, indicators, and color-coded maps

https://public.datapine.com/?_ga=2.10453493.1940943450.1585127327-933313889.1565257605#board/DnjvEBVsJRVZteO3gGbSWA/null
Actual vs Forecast Financial Dashboard

https://public.datapine.com/#board/COcn8yeYXCcmXfzllGV5Ac

BI reporting
Print-perfect Operational Reports
• Via Web and Print
• Easy Navigation Through Hundreds of Report Pages
• Parameter Prompting Lets Users Specify Report Content
• Pixel-perfect Business Reports
• …..
(Business Intelligence Concepts, Tools, and Applications)

Alerting and Proactive Notification
Delivers Information via E-Mail, Print, or File
• Allows delivering the right information to the right person at the right time

References
• Rick Sherman, Business Intelligence Guidebook
• https://www.logility.com/blog/descriptive-predictive-and-prescriptive-analytics-explained/
• Database Systems: Design, Implementation, and Management (13th Edition)
• Lecture notes of the course Business Intelligence Concepts, Tools, and Applications, University of Colorado Denver