Lecture 1 - Introduction


CSC4304: Data Management II

INTRODUCTION
Abdullahi Ahamad Shehu (M.Sc. DS, M.Sc. CS)
aashehu.cs@buk.edu.ng
“Lecture Time”: Wednesdays (2-4, TH A)
“Office” : Faculty of Computing Extension Wing A

1
Course Overview
• Lectures/Tutorials:
• Lecture 1: Data Warehousing (DW) Concepts
• Lecture 2: Excel Business Intelligence Workflow & DAX
• Lecture 3: DW Design
• Lecture 4: ETL (Extraction, Transformation, Loading)
• Lecture 5: DW Analysis (OLAP: Online Analytical Processing)
• Lecture 6: DW Analysis with MDX (Multi-Dimensional eXpressions)
• Lecture 7: DW Analysis: MDX vs SQL, KPIs and Reporting

2
C.A Overview
Assessment: Coursework (30%)
Due date: Wednesday 5th February 2025, by 5pm

3
Resources

• Main Textbook
• A. Vaisman, E. Zimanyi, Data Warehouse Systems: Design and Implementation
(Springer, 2014)

• Other online lab-related resources will be indicated in the corresponding lab notes.
• Video walkthroughs of the labs will also be made available.

4
Introduction to
Data Warehousing (DW)
&
Business Intelligence (BI)

5
Motivation
• Businesses and organisations are accumulating growing amounts of
data in transactional (operational) databases.
• They want to turn this data into useful information that can be used
to support strategic decision making.
• However, operational databases were never designed to support such
business activities.

6
Business Intelligence (BI)
• BI refers to computer-based techniques used to analyse business
data and aims to support business decision making.
• BI technologies provide historical, current, and predictive views of
business operations.
• BI includes:
• Data warehousing
• Data mining
• Online Analytical Processing (OLAP), and
• End-user querying and reporting tools.

7
BI Components (Architecture)

8
Motivation - Scenario 1
• A supermarket with branches in Lagos, Abuja, and Kano.
• Each branch has a separate operational system.
• The Sales Manager wants a quarterly sales report. This is a difficult task to accomplish because of the differences in data definition and content.
[Figure: the Lagos, Abuja, and Kano branch systems feeding a sales report (per product, per branch, per quarter) for the Sales Manager]

9
BI can be the answer
Fuse all data into a single Data Warehouse
[Figure: data from Lagos, Abuja, and Kano passes through Data Extraction, Transformation tools into the data warehouse; the Sales Manager accesses the warehouse through Query & Reporting Tools]
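
The fusion step above can be sketched in a few lines of Python. This is only an illustrative sketch: the field names, date formats, and sample rows for each branch are invented here to stand in for "differences in data definition and content", and the `warehouse` list is a stand-in for a real data warehouse.

```python
from datetime import datetime

# Hypothetical exports: each branch uses its own field names and formats.
lagos_rows = [{"prod": "Rice 5kg", "date": "2024-01-15", "amount_ngn": 12000}]
abuja_rows = [{"product_name": "rice 5KG", "sold_on": "15/02/2024", "total": "9,500"}]
kano_rows = [{"item": "Rice 5kg ", "month": 3, "year": 2024, "sales": 7000.0}]


def quarter_of(month):
    """Map a month number (1-12) to its quarter (1-4)."""
    return (month - 1) // 3 + 1


def clean_product(name):
    """Give product names one common spelling across branches."""
    return name.strip().lower()


# One transformation per source, all producing the same row layout:
# (branch, product, year, quarter, amount)
def from_lagos(row):
    d = datetime.strptime(row["date"], "%Y-%m-%d")
    return ("Lagos", clean_product(row["prod"]), d.year,
            quarter_of(d.month), float(row["amount_ngn"]))


def from_abuja(row):
    d = datetime.strptime(row["sold_on"], "%d/%m/%Y")
    amount = float(row["total"].replace(",", ""))
    return ("Abuja", clean_product(row["product_name"]), d.year,
            quarter_of(d.month), amount)


def from_kano(row):
    return ("Kano", clean_product(row["item"]), row["year"],
            quarter_of(row["month"]), float(row["sales"]))


# "Load": a single list stands in for the data warehouse.
warehouse = ([from_lagos(r) for r in lagos_rows]
             + [from_abuja(r) for r in abuja_rows]
             + [from_kano(r) for r in kano_rows])


def quarterly_report(rows):
    """Total sales per product, per branch, per quarter."""
    totals = {}
    for branch, product, year, quarter, amount in rows:
        key = (product, branch, year, quarter)
        totals[key] = totals.get(key, 0.0) + amount
    return totals


report = quarterly_report(warehouse)
```

Once every source is mapped to the same layout, the Sales Manager's report becomes a single aggregation over `warehouse` rather than three separate, incompatible queries.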

10
Motivation - Scenario 2
• The Lagos branch has a huge operational database.
• Whenever the Sales Manager wants some reports, the database system becomes slow and data entry operators have to wait because the reports involve complex queries that access large volumes of data.
[Figure: at the Lagos branch, the Data Entry Operators wait while the Sales Manager's report runs]

11
BI can be the answer
Separate the day-to-day Operational system from the Decision Support system

[Figure: the Lagos Operational System feeds the data warehouse; the warehouse and the Query & Reporting Tools used by the Sales Manager form the Decision Support System]
12
Motivation - Scenario 3
• The Managing Director wants to know if the supermarket should open a new branch, and if yes, where?
• The Managing Director needs information to make the correct decision.
• Operational databases contain detailed data but do not include the historical data that can be used to gain insight into the business.
[Figure: Expansion?]

13
BI can be the answer
Use Data Mining and Data Analysis Tools to help in decision support
(Data Warehouse provides a vast amount of data that can be mined)
[Figure: Data mining tools and Query & Reporting Tools both draw on the data warehouse to support the Sales Manager]

14
DW - Definition
• In a nutshell, a DW is a large store of data accumulated from a wide range
of sources within a company and used to guide management decisions.
• Definition (Bill Inmon): A DW is a
• subject-oriented (data is organised around business areas (subjects), e.g., customers, rather than application areas, e.g., invoicing),
• integrated (data comes from multiple heterogeneous sources),
• time-variant (data is stored over a long period of time),
• nonvolatile (data is typically read-only)
collection of data in support of management's decision-making process.
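
To make "time-variant" and "nonvolatile" concrete, here is a minimal Python sketch. The `AppendOnlyStore` class, its method names, and the sample customer are assumptions made up for illustration; they are not part of Inmon's definition or of any particular product.

```python
from datetime import date


class AppendOnlyStore:
    """Toy store: rows can be appended and read, never updated or deleted."""

    def __init__(self):
        self._rows = []  # each row: (snapshot_date, customer, city)

    def append(self, snapshot_date, customer, city):
        # Nonvolatile: new facts are added, old facts are left untouched.
        self._rows.append((snapshot_date, customer, city))

    def as_of(self, customer, when):
        """Time-variant: return the city recorded for `customer` as of `when`."""
        matches = [r for r in self._rows if r[1] == customer and r[0] <= when]
        if not matches:
            return None
        return max(matches, key=lambda r: r[0])[2]


store = AppendOnlyStore()
store.append(date(2023, 1, 10), "Amina", "Kano")
store.append(date(2024, 3, 5), "Amina", "Lagos")  # a move is a new row, not an update

city_in_2023 = store.as_of("Amina", date(2023, 6, 1))
city_in_2024 = store.as_of("Amina", date(2024, 6, 1))
```

An operational system would typically overwrite the customer's city; keeping both rows is what lets the warehouse answer questions about the past.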

15
OLAP (OnLine Analytical Processing)
• Definition: The dynamic synthesis, analysis, and consolidation of large
volumes of multi-dimensional data.
• OLAP allows users to interactively query and aggregate the data and
analyse it at various levels of detail (drill-down, roll-up, slice, pivot, ...).
• OLAP is often contrasted with traditional database systems, or
OLTP (OnLine Transaction Processing) systems.
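
The operations named above can be illustrated on a tiny in-memory data set. This is a plain-Python sketch, not how an OLAP server is implemented; the `facts` rows, the `DIMS` mapping, and the helper names are all invented for the example.

```python
from collections import defaultdict

# Hypothetical sales facts: (branch, product, quarter, month, amount)
facts = [
    ("Lagos", "Rice", "Q1", "Jan", 100),
    ("Lagos", "Rice", "Q1", "Feb", 150),
    ("Lagos", "Milk", "Q1", "Jan", 40),
    ("Abuja", "Rice", "Q1", "Mar", 80),
    ("Abuja", "Milk", "Q2", "Apr", 60),
    ("Kano", "Rice", "Q2", "May", 90),
]
DIMS = {"branch": 0, "product": 1, "quarter": 2, "month": 3}


def aggregate(rows, dims):
    """Sum the amount for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[DIMS[d]] for d in dims)
        totals[key] += row[4]
    return dict(totals)


def slice_(rows, dim, value):
    """Slice: keep only the rows where one dimension has a fixed value."""
    return [r for r in rows if r[DIMS[dim]] == value]


def pivot(totals):
    """Pivot: turn {(row, col): value} into a grid {row: {col: value}}."""
    grid = defaultdict(dict)
    for (row_key, col_key), value in totals.items():
        grid[row_key][col_key] = value
    return dict(grid)


# Roll-up: months are summarised away, leaving branch x quarter totals.
by_branch_quarter = aggregate(facts, ["branch", "quarter"])

# Drill-down: the same totals broken out to the finer month level.
by_branch_month = aggregate(facts, ["branch", "quarter", "month"])

# Slice on product = Rice, then total per branch.
rice_by_branch = aggregate(slice_(facts, "product", "Rice"), ["branch"])

# Pivot the roll-up into a branch-by-quarter grid.
grid = pivot(by_branch_quarter)
```

Roll-up and drill-down are the same aggregation run at coarser or finer levels of a dimension, which is why a single `aggregate` helper covers both here.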

16
Data Mining
• The process of extracting hidden patterns and relationships in large
sets of data and using them to make crucial business decisions.
• Techniques include: decision trees, clustering, association rules…
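
As one example of these techniques, association rules are usually judged by their support and confidence. The sketch below computes both by brute force over a handful of made-up shopping baskets; real tools use more efficient algorithms, and the `baskets` data and function names are assumptions for illustration only.

```python
# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "sugar"},
    {"bread", "milk", "sugar"},
]


def count(itemset):
    """Number of baskets that contain every item in `itemset`."""
    return sum(1 for basket in baskets if itemset <= basket)


def support(itemset):
    """Fraction of all baskets that contain the itemset."""
    return count(itemset) / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    return count(antecedent | consequent) / count(antecedent)


# Rule under test: customers who buy bread also buy milk.
rule_support = support({"bread", "milk"})
rule_confidence = confidence({"bread"}, {"milk"})
```

Here bread and milk appear together in 3 of the 5 baskets, and 3 of the 4 baskets with bread also contain milk, so the rule has support 0.6 and confidence 0.75.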

17
Examples of BI Applications
• Retail / Marketing
• Identifying buying patterns of customers (shopping basket analysis).
• Predicting response to mailing campaigns.
• Insurance
• Claims analysis.
• Banking
• Detecting patterns of fraudulent credit card use.

18
Differences between OLTP and DW
Aspect                       | Operational Databases            | Data Warehouse
User type                    | Operators, office employees      | Managers, executives
Usage                        | Predictable, repetitive          | Ad-hoc, random
Activity focus               | Day to day transactions          | Decision support
Database design              | Application oriented             | Subject oriented
Database size                | 100s MB – a few GBs              | 100s GBs - TBs
Data content                 | Current, detailed data           | Historical, summarised data
Data structures              | Optimised for small transactions | Optimised for complex queries
Access type                  | Read, Insert, Update, Delete     | Read, Append
Number of records per access | Few                              | Many
Response time                | Short                            | Can be long
Concurrency level            | High                             | Low
Lock utilisation             | Needed                           | Not needed
Data redundancy              | Low (normalised tables)          | High (denormalised tables)
Data modelling               | UML, ER model                    | Multidimensional model

19
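
The "Access type" and "Number of records per access" rows can be seen in miniature with Python's built-in `sqlite3` module. This is only a sketch: a single in-memory table plays both roles here, whereas in practice the two workloads would run on separate systems, and the `sales` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, branch TEXT, product TEXT, quarter TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (branch, product, quarter, amount) VALUES (?, ?, ?, ?)",
    [
        ("Lagos", "Rice", "Q1", 100.0),
        ("Lagos", "Milk", "Q1", 40.0),
        ("Abuja", "Rice", "Q1", 80.0),
        ("Abuja", "Rice", "Q2", 60.0),
    ],
)

# OLTP-style access: a short transaction that touches one record by its key.
conn.execute("UPDATE sales SET amount = ? WHERE id = ?", (110.0, 1))
conn.commit()
one_row = conn.execute("SELECT amount FROM sales WHERE id = 1").fetchone()

# DW-style access: a read-only query that scans and summarises many records.
report = conn.execute(
    "SELECT branch, quarter, SUM(amount) FROM sales "
    "GROUP BY branch, quarter ORDER BY branch, quarter"
).fetchall()

conn.close()
```

The update reads and writes a single row, while the report has to visit every row in the table, which is why the two kinds of workload are tuned so differently.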
Summary & Reading
• DW is a large store of data accumulated from a wide range of sources
and used to guide management decisions.
• Further Reading:
• Textbook: Chapter 3 (Data Warehouse Concepts)

20
Lab
• Today’s lab will introduce you to Microsoft Excel’s Business
Intelligence tools. More about this next week.
• Follow along with the instructions.

20 December 2024

21
