DW 2022 Supplementary Exam Answer Booklet


Computing and Information Systems

BSC (HONS) BIDA

BI 205 – DATA WAREHOUSING Year 2 Semester 2

SUPPLEMENTARY EXAM ANSWER BOOKLET


Date: 13 July 2022 Time: 14:00
Total Marks: 100 Duration: 4hrs

CANDIDATE NUMBER:

CANDIDATE NAMES:

CONTACT NUMBER:

MODULE LEADER:

EXAMINATION DATE:
Question 1

BAC has hired you as a data warehouse specialist to help them decide whether
to develop a data warehouse or not.

a) Explain, in your own words, why it is a challenge to analyze data from BAC’s
operational systems. Give four reasons. [8 marks]

Answer here:

A business generates a large amount of data, which can be overwhelming for
employees. The abundance of data from different sources makes it difficult for employees
to identify the most important insights. As a result, they may end up analyzing whatever
data is readily available rather than the data that really adds value to the business (BAC).

b) Identify and give examples of data quality issues that may arise when extracting data from
operational systems within BAC. [10 marks]
Answer here:
Duplicate data
When the same information is entered more than once but in somewhat different ways, it
is considered duplicate data. When data is extracted from many siloed systems and
combined in a data warehouse, duplicate data frequently results, producing "copies" of the
same record. When duplication goes unnoticed, it might result in distorted or inaccurate
observations.
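
The duplicate-data problem described above can be illustrated with a minimal Python sketch. It assumes that two records refer to the same person when their name and email match after trimming whitespace and ignoring case. The field names, source system names, and sample rows are hypothetical and are not taken from BAC's actual systems.

```python
from collections import defaultdict

def normalize(record):
    """Build a comparison key: lowercase name with collapsed spaces, trimmed lowercase email."""
    name = " ".join(record["name"].lower().split())
    email = record["email"].strip().lower()
    return (name, email)

def find_duplicates(records):
    """Group records that share the same normalized key."""
    groups = defaultdict(list)
    for record in records:
        groups[normalize(record)].append(record)
    # Keep only the keys that occur more than once.
    return [group for group in groups.values() if len(group) > 1]

# Hypothetical rows extracted from two siloed systems.
records = [
    {"source": "admissions", "name": "Thabo  Moyo", "email": "T.Moyo@example.com"},
    {"source": "finance", "name": "thabo moyo", "email": "t.moyo@example.com "},
    {"source": "finance", "name": "Neo Kgosi", "email": "neo.kgosi@example.com"},
]

duplicates = find_duplicates(records)
for group in duplicates:
    print([row["source"] for row in group])
```

Exact matching on a normalized key only catches simple variations. Misspelled names would need fuzzy matching, which this sketch does not attempt.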

Inaccurate data
Running big data analytics or contacting clients based on inaccurate data is
pointless. Data can become inaccurate very quickly. Data is also incomplete if not all
the relevant information is collected, which prevents decisions from being based on
comprehensive and correct data sets. The most obvious source of faulty data is
systems that are prone to human error, such as when customers type incorrect
information or enter data into the wrong field.
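
As a rough illustration of how such entry errors could be caught during extraction, the following Python sketch checks one record for missing fields and for values that do not fit the field they were typed into. The field names, the simplified email pattern, and the phone rule are assumptions made for this example only.

```python
import re

# Deliberately simple pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate(record):
    """Return a list of problems found in one extracted record."""
    problems = []
    # Incomplete data: a required field is empty or absent.
    for field in ("name", "email", "phone"):
        if not record.get(field, "").strip():
            problems.append(f"missing {field}")
    # Inaccurate data: the value does not look like what the field should hold.
    email = record.get("email", "").strip()
    if email and not EMAIL_PATTERN.match(email):
        problems.append("invalid email")
    phone = record.get("phone", "").strip()
    if phone and not phone.replace("+", "").replace(" ", "").isdigit():
        problems.append("invalid phone")
    return problems

# Hypothetical record where the phone number was typed into the email field.
wrong_field = {"name": "Neo Kgosi", "email": "71234567", "phone": ""}
print(validate(wrong_field))
```

Records that return a non-empty list can be held back for correction instead of being loaded into the warehouse.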

Poorly defined data
Data is frequently ill-defined, which makes it difficult to determine how best to manage
it. For instance, incorrectly categorized data, such as a corporate account recorded as a
single person's contact, will seriously degrade your database and make it more
challenging to understand and search through.

Poor organization
If you’re not able to easily search through your data, you’ll find that it becomes
significantly more difficult to make use of. Through different organizational
methods and procedures, there are dozens of ways that data can be represented.
c) Justify why a data warehouse processing environment is better than an extract processing
environment. Give seven reasons. [7 marks]

Answer here

Ensure consistency
It is simpler for corporate decision-makers to study and share data insights with their
colleagues throughout the world, since data warehouses are designed to apply a
common format to all acquired data. Standardizing data from various sources also
reduces the likelihood of misinterpretation and increases overall accuracy.
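
The idea of applying a common format can be sketched in Python as follows. The example assumes three hypothetical source systems that each store dates differently and converts them all to ISO 8601 (YYYY-MM-DD). The source names and formats are illustrative assumptions, not BAC's real systems.

```python
from datetime import datetime

# Hypothetical date formats used by each source system.
SOURCE_FORMATS = {
    "admissions": "%d/%m/%Y",
    "finance": "%Y-%m-%d",
    "library": "%d %B %Y",
}

def to_common_format(source, raw_date):
    """Parse a source-specific date string and return it as YYYY-MM-DD."""
    parsed = datetime.strptime(raw_date.strip(), SOURCE_FORMATS[source])
    return parsed.date().isoformat()

# The same date as it might arrive from each system.
print(to_common_format("admissions", "13/07/2022"))
print(to_common_format("finance", "2022-07-13"))
print(to_common_format("library", "13 July 2022"))
```

Once every source is converted on load, the same date is stored identically regardless of where it came from, which is what makes cross-system comparison reliable.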

Increase revenue
Business leaders can easily access their organization's historical actions through data
warehouse platforms and assess earlier projects to see if they were successful or failed.
This enables executives to identify areas where their plan needs to be adjusted in order to
reduce expenses, maximize efficiency, and boost sales in order to improve their bottom
line.

A data warehouse saves a lot of time because it eliminates the need to retrieve data from
multiple data sources and to transform it according to each requirement.

A data warehouse delivers faster query processing. Its architecture can be designed
to optimize query performance, instead of reusing the structure of transactional
databases.

A data warehouse often maintains metadata that helps users understand the data.

Total [25 marks]

Question 2

a) Expand the acronym OLAP and suggest to BAC three examples of multidimensional queries,
written as English statements and not as SQL statements. [5 marks]
Answer here

b) Draw a cube that could be used by BAC in the data warehouse. The measure should be
course payment; the other dimensions are up to your choice. [4 marks]
Answer here:

c) Maintaining metadata is very important in a data warehouse. Analyze
this schema for a relation and give example metadata for each attribute:
BOOK(booId, Title, Publisher, Year) [4 marks]
Answer here

d) What are five possible business rules that can be applied in a library system and that
would still have to be maintained in the data warehouse for BAC? [5 marks]
Answer here

e) Give five (5) reasons to justify why BAC may use a star schema to represent their data in
the data warehouse. [4 marks]

Answer here

f) Why are hierarchies important in a data warehouse? [3 marks]
Answer here:

Total [25 marks]

Question 3
Study the ERD below and answer the questions that follow:

a) Write SQL statements to create the 4 tables in the diagram above. [12 marks]

Answer here

Dim_Date

Create table Dim_Date(
Id int,
Date date,
Day_of_work varchar(10),
Month varchar(10),
Month

b) Write an SQL query that shows how many products in a product_category of your choice
(e.g. fridge) have been sold, for each brand and country, in 2021. [8 marks]
Answer here

c) Identify five (5) characteristics of a fact table. [5 marks]
Answer here
Total [25 marks]

Question 4
a) BAC management would like your advice to help them decide whether the
top-down incremental development approach is suitable or not. Identify three
benefits and three drawbacks of this approach. [6 marks]
Answer here

b) Identify four (4) main properties of a data warehouse. [4 marks]
Answer here:

c) Management would like to know why security is important in the data warehouse. State
four (4) reasons why security is important. [4 marks]

Answer here:

d) Write one SQL statement to give a user the ability to insert and update data in a table
called Products and another to prevent a user from selecting and deleting in the same
table. [4 marks]

Answer here:

e) Justify to management why indexing of data may be needed in the data warehouse. Give two
reasons. [4 marks]
Answer here

f) List any three data warehouse design phases. [3 marks]
Answer here:

Total [25 marks]
Question 5
a) Compare fact tables and dimension tables. [8 marks]
Answer here

b) Describe any four (4) approaches that one can follow to handle missing values in the data
warehouse. [8 marks]
Answer here

c) The diagram below shows the star schema for a hotel company. Note that every room belongs
to a room type, such as standard, medium, or suite. Identify and correct any four
errors in the star diagram. [8 marks]

Day dimension: #day_id, Date_description, Dayofmonth, Weekofmonth

Occupancy Fact table: Noofroomsoccupied

Hotel dimension table: #hotelid, Hotelname, city

Room dimension: #room_id, Room_size

Roomtype dimension table: #typeid, type

Answer here

d) What does ETL stand for in data warehousing? [1 mark]


Answer here
Total [25 marks]
