
CE0716-Data Warehouse and Mining_Compulsory

The document outlines the course structure for Data Warehouse & Mining (CE0716) at Indus Institute of Technology & Engineering, detailing objectives, content, and outcomes. It covers topics such as data mining techniques, data pre-processing, and clustering methods, along with practical experiments using tools like Weka and RStudio. The course aims to equip students with skills in data analysis, problem-solving, and application of mining techniques in real-world scenarios.

INDUS INSTITUTE OF TECHNOLOGY & ENGINEERING

Constituent Institute of Indus University

Subject: Data Warehouse & Mining
Subject Code: CE0716
Program: B.Tech CE
Semester: VII

Teaching Scheme (Hours per week)
Lecture: 3
Tutorial: 0
Practical: 2
Credits: 4

Examination Evaluation Scheme (Marks)
University Theory Examination: 40
University Practical Examination: 40
Continuous Internal Evaluation (CIE) - Theory: 60
Continuous Internal Evaluation (CIE) - Practical: 60
Total: 200

Course Objectives:
1. To learn how to gather and analyze large sets of data to gain useful business understanding
and how to produce a quantitative analysis report/memo with the necessary information to
make decisions.
2. To develop and apply critical thinking, problem-solving, and decision-making skills, and to
define knowledge discovery and data mining for skill development.
3. To recognize the key areas and issues in data mining.
4. To apply the techniques of clustering, classification, association finding, feature selection and
visualization to real-world data for employability.
5. To determine whether a real-world problem has a data mining solution.
6. To apply evaluation metrics to select data mining techniques.

CONTENTS

UNIT-I

Introduction to Data Mining [10 hours]

Importance of Data Mining, Data Mining Functionalities, Classification of Data Mining Systems,
Data Mining Architecture, Major Issues in Data Mining, Applications of Data Mining, Social
Impacts of Data Mining.
Introduction to Data Warehouse and OLAP Technology for Data Mining

Data Warehouse, From Data Warehousing to Data Mining, OLAP versus OLTP, Data
Warehouse Architecture, Data Warehouse Development Approach, Multidimensional data
Model, Data Warehouse Design Schema

UNIT-II

Data Pre-processing [10 hours]

Data Cleaning: Filling In Missing Values, Noisy Data Removal, Outlier Analysis, Data
Cleaning as a Process; Data Integration: Correlation Techniques, Entity Identification Problem,
Tuple Duplication Problem; Data Reduction: Principal Component Analysis, Sampling, Attribute
Subset Selection, Histograms; Data Transformation: Normalization, Concept Hierarchy
Generation, Aggregation and Discretization

UNIT-III

Mining Frequent Patterns, Associations, Correlations [12 hours]

Market Basket Analysis, Association Rule Mining, Association Rule Mining Algorithms: Apriori
Algorithm, FP-Growth Algorithm; Mining of: Single-dimensional Association Rules, Multilevel
Association Rules, Multidimensional Association Rules and Constraint-based Association Rules

Classification and Prediction

Classification as a Process, Bayesian Classification, Classification by Decision Tree Induction,
Associative Classification, Classification by Backpropagation, Prediction: Fundamentals of
Prediction, Linear Regression and Non-Linear Regression; Issues in Classification and
Prediction

UNIT-IV

Cluster Analysis [12 hours]

Clustering as a Process, Clustering using Partitioning Methods, Hierarchical Methods,
Density-based Methods, Grid-based Methods and Model-based Methods

Mining Complex Types of Data

Introduction to Spatial Data Mining, Multimedia Data Mining, Temporal Data Mining, Text and
Web Mining

Course Outcomes:

At the end of this subject, students should be able to:

1. Understand various data mining applications in real-world scenarios.
2. Identify the analytical characteristics of mining techniques such as clustering,
classification, and outlier analysis.
3. Employ algorithms to model engineering problems.
4. Apply mining concepts to business intelligence to provide solutions, organizational
changes, products, technologies and methods for organizing key data to improve
performance and profit.
5. Understand various clustering techniques.
6. Understand pattern search and association rules.

Text Books:
1. Data Mining: Concepts and Techniques by Jiawei Han and Micheline Kamber – Elsevier.

Reference Books:
1. Data Mining by Arun K. Pujari – University Press.
2. Modern Data Warehousing, Data Mining and Visualization by George M. Marakas –
Pearson.
3. Data Mining by Vikram Pudi and P. Radha Krishna – Oxford Press.
4. Data Warehousing by Reema Thareja – Oxford Press.

Web Resources

1. NPTEL Lecture: https://nptel.ac.in/courses/110106064/
2. NPTEL Lecture: https://nptel.ac.in/courses/106101007/

LIST OF EXPERIMENTS

Each entry gives the experiment number and title, followed by its learning outcomes.

1. Study Practical: Introduction to Weka
Learning Outcomes: To learn how to gather and analyze large sets of data to gain useful business understanding and how to produce a quantitative analysis report/memo with the necessary information to make decisions.

2. Study Practical: Introduction to RStudio
Learning Outcomes: To learn how to gather and analyze large sets of data to gain useful business understanding and how to produce a quantitative analysis report/memo with the necessary information to make decisions.

3. To perform Classification using Naïve Bayes Classifier over Iris Dataset in Weka
Learning Outcomes: To apply the techniques of clustering, classification, association finding, feature selection and visualization to real-world data.

4. To perform Clustering using K-Means Algorithm over Iris Dataset in Weka
Learning Outcomes: To apply the techniques of clustering, classification, association finding, feature selection and visualization to real-world data.

5. To perform Frequent Pattern Mining using Apriori Algorithm over Weather.Nominal Dataset in Weka
Learning Outcomes: To apply the techniques of clustering, classification, association finding, feature selection and visualization to real-world data.

6. To perform Outlier Analysis using Boxplot in RStudio
Learning Outcomes: To apply the techniques of clustering, classification, association finding, feature selection and visualization to real-world data.

7. To perform Data Cleaning (filling in missing values) using Measures of Central Tendency in RStudio
Learning Outcomes: To apply the techniques of clustering, classification, association finding, feature selection and visualization to real-world data.

8. To perform Association Rule Mining using Apriori Algorithm over Iris Dataset in RStudio
Learning Outcomes: To apply the techniques of clustering, classification, association finding, feature selection and visualization to real-world data.

9. To perform Classification using Naïve Bayes Classifier over Iris Dataset in RStudio
Learning Outcomes: To apply the techniques of clustering, classification, association finding, feature selection and visualization to real-world data.

10. To perform Linear Regression in RStudio
Learning Outcomes: To apply the techniques of clustering, classification, association finding, feature selection and visualization to real-world data.

11. To perform Clustering using K-Means Algorithm over Iris Dataset in RStudio
Learning Outcomes: To apply the techniques of clustering, classification, association finding, feature selection and visualization to real-world data.

12. Project
Learning Outcomes: To learn how to gather and analyze large sets of data to gain useful business understanding and how to produce a quantitative analysis report/memo with the necessary information to make decisions; to develop and apply critical thinking, problem-solving, and decision-making skills; to define knowledge discovery and data mining; to recognize the key areas and issues in data mining; to apply the techniques of clustering, classification, association finding, feature selection and visualization to real-world data; to determine whether a real-world problem has a data mining solution; to apply evaluation metrics to select data mining techniques.
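
Illustrative sketch for Experiments 6 and 7 (not part of the official syllabus): the course runs these experiments in RStudio, so the Python below is only an assumed stand-in to show the underlying ideas. The function names and sample data are hypothetical. The outlier rule is the common 1.5 x IQR boxplot convention; quartile definitions differ between tools, so results may not match R's boxplot exactly.

```python
import statistics


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values.

    Assumes at least one value is present; otherwise statistics.mean raises.
    """
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    return [mean if v is None else v for v in values]


def boxplot_outliers(values):
    """Return the values lying outside the 1.5 * IQR boxplot whiskers."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]


if __name__ == "__main__":
    # Hypothetical sample data, chosen only for illustration.
    print(fill_missing_with_mean([4.0, None, 6.0, None, 8.0]))
    print(boxplot_outliers([10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 60]))
```

The mean can be swapped for statistics.median or statistics.mode when the attribute is skewed or categorical, which corresponds to the other measures of central tendency named in Experiment 7.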
