Data Mining New

This document contains multiple choice questions about data mining concepts and techniques. It covers topics like data preprocessing, frequent pattern mining, classification algorithms, and more. Some key points addressed are the definition of data mining as the discovery phase of knowledge discovery, uses of principal component analysis for data reduction, association rule mining and its measures of support and confidence, decision trees representing class labels at the leaf nodes, and classification mapping data into predefined groups.

Uploaded by

Dheeraj Kumar
Copyright
© All Rights Reserved

1. Data Mining is [ ]
A) A subject-oriented, integrated, time-variant, non-volatile collection of data in support of management
B) The stage of selecting the right data for a KDD process
C) The actual discovery phase of a knowledge discovery process
D) None of these
2. Principal Component Analysis can be used for [ ]
A) Data Integration B) Data Cleaning C) Data Discretization D) Data Reduction
3. Smoothing Techniques are [ ]
A) Binning B) Aggregation C) Attribute Creation D) All of the above
4. Extracting knowledge from large amounts of data is called _________ [ ]
A) Warehousing B) Data Mining C) Database D) Cluster
5. _______ is a summarization of general characteristics or features of a target class of
data. [ ]
A) Concept hierarchies B) Classification C) Characterization D) Association analysis
6. The leaf nodes in Decision Tree represent [ ]
A) Attributes B) Class Labels C) Both A and B D) None of the above
7. Association rules are discarded as uninteresting if they do not satisfy [ ]
A) Minimum Support threshold B) Minimum Confidence threshold
C) Both A and B D) None of the above
8. Treating incorrect or missing data is called ___________ [ ]
A) Selection B) Cleaning C) Transformation D) Interpretation
9. ______ analyzes data objects without consulting a known class label. [ ]
A) Classification B) Clustering C) Association analysis D) Characterization
10. Normalization Techniques are [ ]
A) Min-Max B) Z-Score C) Decimal Scaling D) All of the above
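All three normalization techniques named in question 10 can be sketched in a few lines of Python. This is a minimal illustration only; the sample attribute values are made up for the example:

```python
import math

values = [200.0, 300.0, 400.0, 600.0, 1000.0]  # hypothetical attribute values

# Min-max normalization: rescale linearly into [0, 1]
v_min, v_max = min(values), max(values)
min_max = [(v - v_min) / (v_max - v_min) for v in values]

# Z-score normalization: subtract the mean, divide by the standard deviation
mean = sum(values) / len(values)
std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
z_score = [(v - mean) / std for v in values]

# Decimal scaling: divide by 10^j, where j is the smallest integer
# such that every scaled value has absolute value below 1
j = len(str(int(max(abs(v) for v in values))))
decimal_scaled = [v / (10 ** j) for v in values]

print(min_max[0], min_max[-1])  # -> 0.0 1.0
print(decimal_scaled)
```

Note that min-max always maps the smallest value to the lower bound and the largest to the upper bound, while z-score produces values centered on zero.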
11. ____________ maps data into predefined groups. [ ]
A) Regression B) Time series analysis C) Prediction D) Classification
12. OLAP stands for [ ]
A) Online Academic Planning B) Online Analytical Processing
C) Offline Analytical Processing D) Offline Agricultural Planning
13. Removing noise is called ___________ [ ]
A) Selection B) Cleaning C) Transformation D) Interpretation
14. The leaf nodes of a decision tree represent ____________ [ ]
A) Attributes B) Noisy Values C) Attribute Values D) Class Labels
15. The problem of finding hidden structure in unlabeled data is called [ ]
A) Unsupervised learning B) Supervised learning C) Reinforcement learning D) None

16. _____ can be used to reduce the data by collecting and replacing low-level concepts with
higher-level concepts. [ ]
A) Concept hierarchies B) Classification C) Characterization D) Association analysis
17. Measures for pattern interestingness are [ ]
A) Confidence B) Support C) Both A and B D) None of the above
18. Support(A=>B) = _______________________ [ ]
A) P(A∪B) B) P(A) C) P(B) D) P(B|A)
19. Market Basket Analysis is an example of [ ]
A) Classification B) Clustering C) Outlier Analysis D) Frequent Pattern Analysis
20. Which of the following is a classification algorithm? [ ]
A) Apriori B) Decision Tree C) FP-Growth D) All of the above
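The support and confidence measures asked about in questions 17–19 are easy to compute directly. A minimal Python sketch follows; the five market-basket transactions are made up for illustration:

```python
# Support and confidence for an association rule A => B.
# The transactions below are hypothetical example data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

def support(itemset):
    """Fraction of transactions containing every item in `itemset`: P(A U B)."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(a, b):
    """support(A U B) / support(A): an estimate of P(B|A)."""
    return support(set(a) | set(b)) / support(a)

print(support({"bread", "milk"}))       # 3 of 5 transactions -> 0.6
print(confidence({"bread"}, {"milk"}))  # 0.6 / 0.8 -> 0.75
```

A rule such as bread => milk is kept only if both values meet the minimum support and minimum confidence thresholds (question 7).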

21. A _____________ is a repository of information collected from multiple sources, stored under a unified schema.
22. Data objects that do not comply with the general behavior or model of the data are called __________.

23. The ________________ technique can be used for dimensionality reduction.
24. Lift(A,B) = ______________
25. The two steps of the Apriori algorithm are ____________ and ___________.
26. A ___________________is a flowchart-like tree structure, where each internal node
denotes a test on an attribute, each branch represents an outcome of the test, and each leaf
node holds a class label.
27. Confidence(A=>B) = P(B|A) = _____________
28. The two steps of the Apriori algorithm are ____________ and ___________.
29. The number of elements in a sequence is called the ___________ of the sequence.
30. The ___________ of a classifier on a given test set is the percentage of test set tuples that
are correctly classified by the classifier.
31. ____________________ involves scaling all values for a given attribute so that they fall
within a small specified range, such as -1.0 to 1.0, or 0.0 to 1.0.
32. An itemset X is known as a __________ itemset in a dataset S if there exists no proper
super-itemset Y such that Y has the same support count as X in S.
33. A _______________ database stores sequences of ordered events.
34. The ________________ algorithm does not involve candidate generation to find frequent
itemsets.
35. _________ is a random error or variance in a measured variable.
36. An itemset X is a _________ itemset in a data set S if there exists no proper super-itemset
Y such that Y has the same support count as X in S.
37. ___________________ involves finding the “best” line to fit two attributes (or variables),
so that one attribute can be used to predict the other.
38. ____________ and _______________ are the two steps involved in classification.
39. __________________ technique predicts a continuous-valued function.
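Two of the fill-in formulas above, lift (question 24) and classifier accuracy (question 30), can be checked numerically. The support values and prediction lists below are toy numbers made up for the example:

```python
# Lift(A,B) = P(A U B) / (P(A) * P(B)); a value above 1 suggests A and B
# are positively correlated, below 1 negatively correlated.
def lift(support_ab, support_a, support_b):
    return support_ab / (support_a * support_b)

# Accuracy = correctly classified test tuples / total test tuples.
def accuracy(predicted, actual):
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

print(lift(0.6, 0.8, 0.8))  # 0.6 / 0.64 -> 0.9375 (slightly negative correlation)
print(accuracy(["yes", "no", "yes", "yes"],
               ["yes", "no", "no", "yes"]))  # 3 of 4 correct -> 0.75
```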
