DW 2022 Supplementary Exam Answer Booklet
CANDIDATE NUMBER:
CANDIDATE NAMES:
CONTACT NUMBER:
MODULE LEADER:
EXAMINATION DATE:
Question 1
BAC has hired you as a data warehouse specialist to help them decide whether to develop a
data warehouse or not.
a) Explain, in your own words, why it is a challenge to analyze data from BAC’s operational
systems. Give four reasons. [8 marks]
Answer here:
Running a business generates a large volume of data, which can overwhelm employees.
Because the data comes from many different sources, it is difficult for them to identify
the most important insights. As a result, they may end up analyzing whatever data is
readily available rather than the data that really adds value to the business (BAC).
b) Identify and give examples of data quality issues that may arise when extracting data from
operational systems within BAC. [10 marks]
Answer here:
Duplicate data
When the same information is entered more than once but in somewhat different ways, it
is considered duplicate data. When data is extracted from many siloed systems and
combined in a data warehouse, duplicate data frequently results, producing "copies" of the
same record. When duplication goes unnoticed, it might result in distorted or inaccurate
observations.
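The duplicate-data issue above can be illustrated with a minimal sketch. It assumes hypothetical customer rows with `id`, `name` and `email` fields (none of these come from BAC's actual systems) and treats two rows as duplicates when their name and email match after case and whitespace are normalized.

```python
from collections import defaultdict

def normalize(name, email):
    """Build a comparison key: collapse whitespace and ignore case."""
    return (" ".join(name.lower().split()), email.strip().lower())

def find_duplicates(records):
    """Return {key: [ids]} for every key shared by more than one record."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize(rec["name"], rec["email"])].append(rec["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

# Hypothetical rows extracted from two siloed source systems.
records = [
    {"id": 1, "name": "Thabo  Moyo", "email": "T.Moyo@example.com"},
    {"id": 2, "name": "thabo moyo", "email": "t.moyo@example.com "},
    {"id": 3, "name": "Lerato Dube", "email": "l.dube@example.com"},
]
duplicates = find_duplicates(records)
```

Here records 1 and 2 fall under the same key and would be flagged for review. Real matching rules would likely need to be looser, for example to tolerate misspelled names.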
Inaccurate data
Running big data analytics or getting in touch with clients based on inaccurate data is
pointless. Data can become inaccurate quickly. Data is also incomplete if not all of the
relevant information is collected, which prevents decisions from being based on
comprehensive and correct data sets. The most obvious source of faulty data is systems
that are prone to human error, such as when customers type incorrect information or
enter data into the wrong field.
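One way to catch such errors during extraction is to validate each row before it is loaded. The sketch below is an illustration only: the field names, the simple email pattern and the fixed reference date are all assumptions, not rules taken from BAC.

```python
import re
from datetime import date

# Deliberately simple pattern (assumption): something@something.something
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate(row, today=date(2022, 12, 31)):
    """Return a list of data quality problems found in one extracted row."""
    problems = []
    if not row.get("name", "").strip():
        problems.append("name is missing")
    if not EMAIL_RE.match(row.get("email", "")):
        problems.append("email is malformed")
    birth = row.get("birth_date")
    if birth is None or birth > today:
        problems.append("birth_date is missing or in the future")
    return problems

# A clean row, and a row where a phone number was typed into the email field.
good = validate({"name": "Thabo Moyo", "email": "t.moyo@example.com",
                 "birth_date": date(1990, 5, 1)})
bad = validate({"name": " ", "email": "0712345678",
                "birth_date": date(2090, 1, 1)})
```

Rows that return a non-empty list could be rejected or routed to a correction queue instead of being loaded.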
Poor organization
If data cannot be searched easily, it becomes significantly more difficult to use.
Because different systems follow different organizational methods and procedures,
the same data can be represented in dozens of ways.
c) Justify why a data warehouse processing environment is better than an extract processing
environment. Give seven reasons. [7 marks]
Answer here
Ensure consistency
Because a data warehouse applies a common format to all acquired data, it is simpler for
corporate decision-makers to study data insights and share them with colleagues
throughout the world. Standardizing data from various sources also reduces the chance of
misinterpretation and increases overall accuracy.
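Applying a common format can be sketched as below. The accepted date layouts, the country spellings and the function names are assumptions made for illustration. Ambiguous dates such as 03/04/2022 are read day-first here.

```python
from datetime import datetime

# Layouts assumed to occur in the source systems (hypothetical).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")
# Spellings assumed to occur in the source systems (hypothetical).
COUNTRY_CODES = {"botswana": "BW", "bw": "BW",
                 "south africa": "ZA", "rsa": "ZA", "za": "ZA"}

def to_iso_date(text):
    """Convert a date written in any known layout to YYYY-MM-DD."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next known layout
    raise ValueError(f"unrecognized date format: {text!r}")

def to_country_code(text):
    """Map a country spelling to one two-letter code; KeyError if unknown."""
    return COUNTRY_CODES[text.strip().lower()]

standard = [to_iso_date(d) for d in ("2022-03-14", "14/03/2022", "20220314")]
```

All three inputs end up as the same value, so reports built on the warehouse compare like with like.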
It saves a lot of time, because users no longer need to retrieve data from multiple
sources and transform it to meet each requirement.
It delivers faster query processing. The architecture of a data warehouse can be designed
to optimize query performance instead of reusing the structure of the transactional
databases.
A data warehouse usually maintains metadata that helps users understand the data.
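The point above about a structure designed for querying can be sketched with a tiny dimensional model in SQLite. The table names, columns and sample rows are hypothetical and chosen only for illustration. The measure (payment amount) sits in a fact table and the descriptive attribute sits in a dimension table, so a summary needs only one join and a GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_course (
    course_key  INTEGER PRIMARY KEY,
    course_name TEXT
);
CREATE TABLE fact_payment (
    course_key INTEGER REFERENCES dim_course(course_key),
    pay_year   INTEGER,
    amount     REAL
);
INSERT INTO dim_course VALUES (1, 'Accounting'), (2, 'Computing');
INSERT INTO fact_payment VALUES
    (1, 2021, 500.0), (1, 2021, 250.0), (2, 2021, 400.0), (2, 2022, 300.0);
""")

# Total payments per course for 2021.
totals = conn.execute("""
    SELECT c.course_name, SUM(f.amount)
    FROM fact_payment AS f
    JOIN dim_course AS c ON c.course_key = f.course_key
    WHERE f.pay_year = 2021
    GROUP BY c.course_name
    ORDER BY c.course_name
""").fetchall()
```

In a transactional design the same figure would typically require joining several normalized tables.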
Total [25 marks]
Question 2
a) Expand the acronym OLAP and suggest to BAC three examples of multidimensional queries,
written as English statements and not SQL statements. [5 marks]
Answer here
b) Draw a cube that could be used by BAC in the data warehouse. The measure should be
course payment; the other dimensions are your choice. [4 marks]
Answer here:
c) Maintaining metadata is very important in a data warehouse. Analyze this schema for a
relation and give example metadata for each attribute: BOOK(booId, Title, Publisher, Year)
[4 marks]
Answer here
d) What five business rules that apply in a library system would still have to be
maintained in the data warehouse for BAC? [5 marks]
Answer here
e) Give five (5) reasons to justify why BAC may use a star schema to represent their data in
the data warehouse. [4 marks]
Answer here
Total [25 marks]
Question 3
Study the ERD below and answer the questions that follow:
a) Write SQL statements to create the 4 tables in the diagram above. [12 marks]
Answer here
Dim_Date
b) Write an SQL query that shows how many products of a product_category of your choice
(e.g. fridge) have been sold, for each brand and country, in 2021. [8 marks]
Answer here
Question 4
a) BAC management would like your advice on whether the top-down incremental development
approach is suitable. Identify three benefits and three drawbacks of this approach.
[6 marks]
Answer here
c) Management would like to know why security is important in the data warehouse. State
four (4) reasons why security is important. [4 marks]
Answer here:
d) Write one SQL statement to give a user the ability to insert and update data in a table
called Products and another to prevent a user from selecting and deleting in the same
table. [4 marks]
Answer here:
e) Justify to management why indexing of data may be needed in the data warehouse. Give
two reasons. [4 marks]
Answer here
Total [25 marks]
Question 5
a) Compare fact tables and dimension tables. [8 marks]
Answer here
b) Describe any four (4) approaches that one can follow to handle missing values in the data
warehouse. [8 marks]
Answer here
c) The diagram below shows the star schema for a hotel company. Note that every room
belongs to a room type, such as standard, medium, or suite. Identify and correct any four
errors in the star diagram. [8 marks]
[Star schema diagram not reproduced. Labels recovered from the diagram: Day dimension;
Room dimension; Roomtype dimension; #room_id; table; Room_size; #typeid; type]
Answer here