Data Warehousing and Data Mining

The document discusses topics related to data warehousing and data mining including introduction to data warehousing, classification and prediction of data warehousing, mining time series data, mining data streams, web mining, and recent trends in distributed warehousing. Key concepts covered include data marts, data mining advantages over traditional approaches, and importance of association rules in data mining.

Uploaded by

sachin singh


DATA WAREHOUSING AND DATA MINING

Introduction to Data Warehouse

Classification and Prediction of Data Warehousing

Mining Time Series Data

Mining Data Streams

Web Mining

Recent Trends in Distributed Warehousing

NOTE:

The MAKAUT course structure and syllabus for the 6th semester changed from 2021 onward.
DATA WAREHOUSING AND DATA MINING, previously a 7th-semester subject, has been redesigned and
moved to the 6th semester under the present curriculum, and the organization of the subject has
changed slightly. With this in mind, we provide the relevant MAKAUT university solutions along
with some model questions and answers for the newly introduced topics, so that students can get
an idea of the university question patterns.
POPULAR PUBLICATIONS

INTRODUCTION TO DATA WAREHOUSING

Multiple Choice Type Questions

1. A data warehouse is an integrated collection of data because [WBUT 2009, 2015]

a) It is a collection of data of different types

b) It is a collection of data derived from multiple sources

c) It is a relational database

d) It contains summarized data

Answer: (b)

2. A data warehouse is said to contain a 'subject oriented' collection of data because [WBUT 2009,
2013]

a) Its contents have a common theme

b) It is built for a specific application

c) It cannot support multiple subjects

d) It is a generalization of 'object-oriented'

Answer: (a)

3. A data warehouse is said to contain a 'time-varying' collection of data because [WBUT 2010,
2013, 2015]

a) Its contents vary automatically with time

b) Its life-span is very limited

c) Every key structure of data warehouse contains either implicitly or explicitly an element of time

d) Its contents have an explicit time-stamp

Answer: (c)

4. Data Warehousing is used for [WBUT 2010, 2012]

a) Decision Support System

b) OLTP applications

c) Database applications

d) Data Manipulation applications

Answer: (a)

5. Which of the following is TRUE? [WBUT 2010, 2012]

a) Data warehouse can be used for analytical processing only

b) Data warehouse can be used for information processing (query, report) and analytical
processing

c) Data warehouse can be used for data mining only

d) Data warehouse can be used for information processing (query, report), analytical processing
and data mining

Answer: (d)

Short Answer Type Questions

1. Define Data Marts. [WBUT 2009, 2010, 2011, 2015, 2018]

Define the types of Data Marts. [WBUT 2009, 2010, 2011, 2018]

Answer:

1st Part:

A data mart is a group of subjects that are organized in a way that allows them to assist
departments in making specific decisions. For example, the advertising department will have its
own data mart, while the finance department will have a data mart that is separate from it. In
addition to this, each department will have full ownership of the software, hardware, and other
components that make up their data mart.

2nd Part:

There are two types of Data Marts:

Independent data mart - sourced from data captured from OLTP systems, from external providers, or
from data generated locally within a particular department or geographic area.

Dependent data mart - sourced directly from enterprise data warehouses.

2. Define data mining. What are the advantages of data mining over traditional approaches? [WBUT
2009]

Answer:

1st Part:

Data mining, which is also known as knowledge discovery, is one of the most popular topics in
information technology. It concerns the process of automatically extracting useful information and
holds the promise of discovering hidden relationships that exist in large databases. These
relationships represent valuable knowledge that is crucial for many applications. Data mining is
not confined to the analysis of data stored in data warehouses. It may analyze data existing at
more detailed granularities than the summarized data provided in a data warehouse. It may also
analyze transactional, textual, spatial, and multimedia data, which are difficult to model with
current multidimensional database technology.

2nd Part:

With the help of data mining, organizations are better placed to predict business trends, the
revenue that could be generated, the orders that could be expected, and the types of customers
that could be approached. Traditional approaches generally cannot produce results this accurate
because they rely on simpler algorithms. One major advantage of data mining over a traditional
statistical approach is its ability to deal directly with heterogeneous data fields.

These advantages help businesses grow, help keep customers satisfied, and help in many other
areas, such as data management.

3. What is the importance of Association Rules in Data mining? [WBUT 2009]

OR,

Explain support, confidence, frequent item set and give a formal definition of association rule.
[WBUT 2013]

OR,

What is an Association Rule? Define Support, Confidence, Item set and Frequent item set in
Association Rule Mining? [WBUT 2017]

Answer:

To illustrate the concepts, we use a small example from the supermarket domain. The set of items
is I = {milk, bread, butter, beer}, and a small database containing the items is shown in the
table below.

Transaction   Items

1             Milk, bread

2             Bread, butter

3             Beer

4             Milk, bread, butter

5             Bread, butter

Table 1: An example supermarket database with five transactions.

An example rule for the supermarket could be {milk, bread} → {butter}, meaning that if milk and
bread are bought, customers also buy butter. To select interesting rules from the set of all possible
rules, constraints on various measures of significance and interest can be used. The best-known
constraints are minimum thresholds on support and confidence. The support supp(X) of an itemset
X is defined as the proportion of transactions in the data set which contain the itemset. In the
example database in Table 1, the itemset {milk, bread} has a support of 2/5 = 0.4 since it occurs in
40% of all transactions (2 out of 5 transactions).

The confidence of a rule is defined as conf(X → Y) = supp(X ∪ Y) / supp(X). For example, the rule
{milk, bread} → {butter} has a confidence of 0.2/0.4 = 0.5 in the database in the table, which means
that for 50% of the transactions containing milk and bread the rule is correct. Confidence can be
interpreted as an estimate of the probability P(Y | X), the probability of finding the RHS of the rule
in transactions under the condition that these transactions also contain the LHS.
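
The two definitions above can be checked against the example database with a few lines of Python. This is only a minimal sketch: the function names support and confidence are chosen here for illustration and do not come from any particular library, and confidence assumes that the LHS occurs in at least one transaction.

```python
from fractions import Fraction

# The five transactions of the example supermarket database.
transactions = [
    {"milk", "bread"},
    {"bread", "butter"},
    {"beer"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
]

def support(itemset, transactions):
    """Proportion of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def confidence(lhs, rhs, transactions):
    """conf(X -> Y) = supp(X u Y) / supp(X); assumes supp(X) > 0."""
    both = support(set(lhs) | set(rhs), transactions)
    return both / support(lhs, transactions)

print(float(support({"milk", "bread"}, transactions)))                 # 0.4
print(float(confidence({"milk", "bread"}, {"butter"}, transactions)))  # 0.5
```

Exact fractions are used internally so that 2/5 and 1/5 are compared without rounding error; the values are converted to decimals only for printing.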

In many (but not all) situations, we only care about association rules or causalities involving sets of
items that appear frequently in baskets. For example, we cannot run a good marketing strategy
involving items that no one buys anyway. Thus, much data mining starts with the assumption that
we only care about sets of items with high support; i.e., they appear together in many baskets. We
then find association rules or causalities only involving a high-support set of items; that is, the
set must appear in at least a certain percentage of the baskets, called the support threshold. We use
the term frequent itemset for "a set S that appears in at least fraction s of the baskets," where s is
some chosen constant, typically 0.01 or 1%.
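
As an illustration of the frequent itemset definition, the sketch below enumerates every candidate itemset of the five-transaction supermarket example and keeps those whose support reaches the threshold s. The function name frequent_itemsets and the threshold 0.4 are assumptions made for this example (the database is too small for s = 0.01 to be meaningful). Brute-force enumeration is used only because the item set is tiny; practical algorithms such as Apriori avoid examining every subset.

```python
from itertools import combinations

# The five transactions of the example supermarket database.
transactions = [
    {"milk", "bread"},
    {"bread", "butter"},
    {"beer"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
]

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    result = {}
    # Try every non-empty subset of the items (exponential; fine for 4 items).
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= min_support:
                result[frozenset(candidate)] = count / n
    return result

frequent = frequent_itemsets(transactions, min_support=0.4)
for itemset, supp in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), supp)
```

With s = 0.4 the frequent itemsets are {bread}, {butter}, {milk}, {bread, butter} and {bread, milk}; beer is dropped because it appears in only one of the five baskets.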

Association rules are statements of the form {X1, X2, ..., Xn} → Y, meaning that if we find all of
X1, X2, ..., Xn in the market basket, then we have a good chance of finding Y. The probability of
finding Y for us to accept this rule is the confidence of the rule. We normally would search only for
rules that had confidence above a certain threshold.
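
Putting the two thresholds together, the following sketch searches the example supermarket transactions for rules of the form {X1, ..., Xn} → Y whose itemset meets a support threshold and whose confidence meets a confidence threshold. The function names and the threshold values (0.4 and 0.7) are illustrative assumptions, not values given in the text, and the exhaustive search is meant only for very small item sets.

```python
from itertools import combinations

# The five transactions of the example supermarket database.
transactions = [
    {"milk", "bread"},
    {"bread", "butter"},
    {"beer"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
]

def count(itemset, transactions):
    """Number of transactions containing every item of the itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)

def find_rules(transactions, min_support, min_confidence):
    """Return (lhs, y, confidence) for each rule lhs -> y passing both thresholds."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    rules = []
    for size in range(2, len(items) + 1):
        for candidate in combinations(items, size):
            whole = count(candidate, transactions)
            # Skip itemsets that never occur or fall below the support threshold.
            if whole == 0 or whole / n < min_support:
                continue
            for y in candidate:
                lhs = frozenset(candidate) - {y}
                conf = whole / count(lhs, transactions)
                if conf >= min_confidence:
                    rules.append((lhs, y, conf))
    return rules

for lhs, y, conf in find_rules(transactions, min_support=0.4, min_confidence=0.7):
    print(sorted(lhs), "->", y, conf)
```

Under these thresholds three rules survive: {butter} → bread and {milk} → bread with confidence 1.0, and {bread} → butter with confidence 0.75. The rule {bread} → milk is rejected because its confidence is only 0.5.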

