Jss Mahavidyapeetha: AY 2019-20 (Even Semester)

This document contains a CIA-I test from a Data Warehousing and Data Mining course. It includes multiple choice and long answer questions testing student understanding of key concepts like the ETL process, data modeling, data preprocessing techniques, and data mining algorithms. It also provides the course outcomes which students should be able to demonstrate including describing data warehouse architecture and components, applying data modeling and preprocessing, using supervised and unsupervised algorithms, and applying techniques to multidimensional and web data.

Uploaded by

vik

Roll No.

JSS MAHAVIDYAPEETHA
JSS ACADEMY OF TECHNICAL EDUCATION, NOIDA
DEPARTMENT OF INFORMATION TECHNOLOGY

CIA-I
AY 2019-20 (Even Semester)
Course : B.Tech Date : 26-02-2020
Semester : VI Subject Code : RIT-062
Subject : Data Warehousing & Data Mining Max. Marks : 20
Time : 9:30 – 10:30 a.m.

COURSE OUTCOMES
C316.1 Describe processes, architecture and components of data warehouse.
C316.2 Apply Multi-Dimensional data modeling scheme to design data warehouse.
C316.3 Apply pre-processing techniques on raw data.
C316.4 Apply supervised and unsupervised data mining algorithms to discover patterns and estimate the accuracy of the algorithms.
C316.5 Apply mining techniques on multi-dimensional and World Wide Web data.

Q. No. Questions CO BL
PART- A: Attempt All Questions (5x1 = 5 Marks)
1. Define the ETL process. 1 1
2. Give a precise definition of the term “concept hierarchy”. 2 1
3. Write the formula to find outliers using the interquartile range (IQR) of the given data set. 3 2
4. Differentiate between ordinal and nominal attributes. 3 1
5. Compute the Manhattan distance between the two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8). 3 3
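
Questions 3 and 5 both reduce to short arithmetic, so a minimal Python sketch may help when checking an answer. It assumes the common 1.5 × IQR rule for the outlier fences (the paper does not state a multiplier). The function names `iqr_fences` and `manhattan_distance` are illustrative and do not come from the paper, and the quartiles passed to `iqr_fences` are hypothetical values.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences from the first and third quartiles.

    Assumes the common rule: a value is an outlier if it lies below
    Q1 - k*IQR or above Q3 + k*IQR, with k = 1.5 by convention.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def manhattan_distance(x, y):
    """Sum of absolute coordinate differences between two equal-length tuples."""
    if len(x) != len(y):
        raise ValueError("tuples must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y))


# Tuples from Question 5: |22-20| + |1-0| + |42-36| + |10-8| = 11
print(manhattan_distance((22, 1, 42, 10), (20, 0, 36, 8)))  # 11

# Hypothetical quartiles, chosen only to illustrate the fences
print(iqr_fences(20, 35))  # (-2.5, 57.5)
```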

PART-B: Attempt ANY THREE Questions (3x3 = 9 Marks)


6. Explain the three-tier architecture of a data warehouse with the help of a suitable diagram. 1 1
7. Briefly discuss why OLTP is not applicable to a data warehouse. 2 2
8. The data (in increasing order) for the attribute price (in dollars) are: 4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34. Use smoothing by bin means and by bin boundaries to smooth the given data, using a bin depth of 4. Illustrate your steps. 3 3

9. Find the distance between each pair of objects (students) on the basis of the grade obtained, and represent the distances in a dissimilarity matrix. 3 3
Student Name Grades
A B-
B B+
C B-
D A
E A-
F B+
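
The procedures behind Questions 8 and 9 can be sketched in a few lines of Python. This is only one reasonable reading of the questions, not an official solution. Two assumptions are made: ties in boundary smoothing go to the lower boundary (none occur in this data), and the ordinal scale contains only the four grades that appear in the table, ordered B- < B+ < A- < A. A scale that also includes B or other grades would give different distances. All function names are illustrative.

```python
def equal_depth_bins(sorted_values, depth):
    """Split already-sorted values into consecutive bins of `depth` items."""
    return [sorted_values[i:i + depth] for i in range(0, len(sorted_values), depth)]


def smooth_by_means(bins):
    """Replace every value in a bin with that bin's mean."""
    return [[sum(b) / len(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value with the closer of its bin's minimum or maximum.

    Ties go to the lower boundary (an assumption; none occur in this data).
    """
    out = []
    for b in bins:
        lo, hi = b[0], b[-1]  # bins are sorted, so the ends are the boundaries
        out.append([lo if v - lo <= hi - v else hi for v in b])
    return out


def ordinal_dissimilarity(values, scale):
    """Lower-triangular dissimilarity matrix for one ordinal attribute.

    Each value is mapped to its rank r in `scale` (1..M), normalised to
    z = (r - 1) / (M - 1), and d(i, j) = |z_i - z_j|.
    """
    m = len(scale)
    z = [scale.index(v) / (m - 1) for v in values]
    return [[abs(z[i] - z[j]) for j in range(i + 1)] for i in range(len(z))]


prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]
bins = equal_depth_bins(prices, 4)
print(smooth_by_means(bins))
print(smooth_by_boundaries(bins))

# Assumed scale: only the grades that appear in the table, lowest to highest.
scale = ["B-", "B+", "A-", "A"]
grades = ["B-", "B+", "B-", "A", "A-", "B+"]  # students A to F
for row in ordinal_dissimilarity(grades, scale):
    print([round(d, 2) for d in row])
```

Passing the scale as a parameter keeps the ranking assumption explicit, so the matrix can be recomputed with a different grade ordering.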

PART-C: Attempt ANY ONE Question (1x6 = 6 Marks)


10. a) With a suitable example, discuss the need for pre-processing data. 1 2
b) Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order) 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70. 3 3
i. What is the mean of the data? What is the median?
ii. What is the mode of the data? Comment on the data’s modality (i.e., bimodal,
trimodal, etc.).
iii. What is the midrange of the data?
iv. Show a boxplot of the data.
11. a) Differentiate between the star schema and the snowflake schema. 2 1
b) Suppose that a data warehouse for Big University consists of the four dimensions student, course, semester, and instructor, and two measures count and avg grade. At the lowest conceptual level (e.g., for a given student, course, semester, and instructor combination), the avg grade measure stores the actual course grade of the student. At higher conceptual levels, avg grade stores the average grade for the given combination. 2 4
i. Draw star schema diagram for the data warehouse.
ii. Starting with the base cuboid [student, course, semester, instructor], what specific OLAP operations (e.g., roll-up from semester to year) should you perform in order to list the average grade of IT courses for each Big University student?
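
The summary statistics asked for in Question 10(b), parts i to iii, can be checked with Python's standard `statistics` module. The sketch below is a verification aid rather than a model answer. It leaves out the quartiles needed for the boxplot in part iv, because their values depend on which quartile convention is used. `multimode` requires Python 3.8 or later.

```python
from statistics import mean, median, multimode

# Age values exactly as listed in Question 10(b)
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

avg = mean(ages)                       # sum 809 over 27 values
mid = median(ages)                     # 14th of 27 sorted values
modes = multimode(ages)                # every value tied for highest count
midrange = (min(ages) + max(ages)) / 2

print(round(avg, 2))  # 29.96
print(mid)            # 25
print(modes)          # [25, 35], so the data are bimodal
print(midrange)       # 41.5
```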
