Chapter 02

sentiment analysis part2

Uploaded by

RameshPrasadBhatta

LITERATURE SURVEY

Chapter 2
Literature Survey

2.1 Historical Trends in Data Mining

Data mining is a combination of many disciplines, such as database management systems (DBMS),
statistics, artificial intelligence (AI), and machine learning (ML) [33]. It produces useful
patterns when algorithmic methods are applied to observational data.

2.1.1 Data Mining Trends

Data mining algorithms show their best results on numerical data, but with the emergence of
statistics and machine learning techniques, algorithms have also been developed to mine
non-numerical data and relational databases [34].

Earlier, most DM algorithms employed only statistical techniques [35]. Nowadays, computing
techniques such as artificial intelligence, machine learning, and pattern recognition are also an
integral part of DM [29], [34], so that huge volumes of heterogeneous data stored in data
warehouses can be mined easily [36], [37].

DM applications have been successfully implemented in various fields such as health care, finance,
retail, telecommunication, fraud detection, risk analysis, and education [38], [39], [40], [41].
Due to increasing complexity in these fields and evolving technologies, DM faces new challenges,
which include different data formats, distributed databases, and networking resources.

2.2 Knowledge Discovery in Databases

Data mining and knowledge discovery in databases (KDD) are related to each other and to fields
such as machine learning, statistics, and databases. KDD is the process of finding useful
knowledge in large datasets. Data preparation, pattern search, knowledge evaluation, and
refinement are the steps of KDD [42]. Data mining is one of the steps in the overall KDD process,
which consists of collection and pre-processing of data, data mining, interpretation, evaluation
of discovered knowledge, and finally post-processing [43]. The basic objective of KDD is to make
data meaningful by developing methods and techniques for effective mining, but a major problem
faced by the KDD process is mapping huge and heterogeneous data into a more understandable,
abstract, and useful form [44], [45].

The phrase knowledge discovery in databases emphasizes that knowledge is the end product of a
data-driven discovery [12], [44], [46], [47], [48]. The data mining step of KDD relies heavily on
known techniques from machine learning, pattern recognition, and statistics to find patterns in
data.

Data warehousing is one of the fields of databases [44], [47], [49], [50], which helps in business
analytics and decision support. Data warehousing helps set the stage for KDD in two ways: (a)
data cleaning and (b) data access. The approach followed for analysis of data warehouses is
called online analytical processing (OLAP) [14], [51], [52], [53], [54].

2.2.1 The Data Mining Step of the KDD Process

The data mining step of the KDD process involves repeated, iterative application of particular
data mining methods. There are two types of goals: (a) verification, in which the system is
limited to verifying the user's hypothesis, and (b) discovery, in which the system autonomously
finds new patterns.

DM helps in determining patterns from observed data. Knowledge is inferred from the fitted
models. Two primary mathematical formalisms are used in model fitting: (a) statistical and
(b) logical [44].

2.2.2 Data Mining Methods

The primary goals of data mining in practice are prediction and description. In prediction, some
variables or fields in the database are used to predict unknown values of other variables of
interest, whereas description focuses on finding human-understandable patterns describing the
data [13], [15].

Classification is learning a function that maps (classifies) a data item into one of several
predefined classes [6]. Classification methods are used as part of knowledge discovery
applications, which include (a) classifying trends in financial markets, (b) education, and
(c) identifying objects of interest in large image datasets [7]. Regression is a predictive
technique that maps a data item to a prediction variable. Clustering is a descriptive task that
identifies a finite set of categories or clusters to describe the data, e.g. identifying those
students who are short of attendance and who have shown poor performance in sessionals [8],
[9], [10]. Examples of clustering applications in a knowledge discovery context include
discovering similar groups [11]. Summarization involves methods such as calculating means and
standard deviations. Other methods involve the derivation of abstract rules, visualization
techniques, and the discovery of functional relationships between variables [44], [45].
Summarization techniques are often applied to interactive exploratory data analysis and
automated report generation.

2.2.2.1 Decision Trees and Rules

Decision trees are useful for multivariable analysis. They split a data set into branch-like
segments [56], [57].
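
The splitting idea can be illustrated with a minimal sketch. It assumes Gini impurity as the splitting criterion (the criterion commonly associated with CART-style trees); the survey itself does not state which criterion the cited works use. The function names `gini` and `best_split` and the attendance data are hypothetical and chosen only for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure segment)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels, feature):
    """Return (threshold, weighted_gini) of the best binary split on one numeric feature."""
    best = (None, float("inf"))
    values = sorted({row[feature] for row in rows})
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical toy data: attendance percentage and pass/fail outcome.
rows = [{"attendance": 55}, {"attendance": 60}, {"attendance": 72},
        {"attendance": 80}, {"attendance": 91}]
labels = ["fail", "fail", "pass", "pass", "pass"]

threshold, impurity = best_split(rows, labels, "attendance")
print(threshold, impurity)
```

A full decision tree would apply such a split recursively to each resulting segment until the segments are pure or a stopping rule is met.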

2.2.2.2 Classification Methods

These methods consist of techniques for prediction. Examples include Feed-Forward Neural
Networks, Adaptive Spline Methods, Projection Pursuit Regression, Multi-Layer Perceptrons,
Generalized Linear Models, Bayesian Networks, Decision Trees, and Support Vector Machines
[58], [59].

2.2.2.3 Example-Based Methods

In example-based methods, predictions for new examples are derived from stored examples for
which the predictions are known. The techniques include Nearest Neighbor Classification and
Regression Algorithms and Case-Based Reasoning Systems.
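
As a minimal sketch of nearest neighbor classification, the example below predicts the label of a new example by majority vote among its k closest stored examples. Euclidean distance, the function name `knn_predict`, and the student data are assumptions made for illustration; they are not taken from the surveyed works.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training examples.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Sort the stored examples by Euclidean distance to the query point.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical examples: (attendance %, sessional marks) -> outcome.
train = [((90, 80), "pass"), ((85, 75), "pass"), ((88, 60), "pass"),
         ((50, 40), "fail"), ((45, 55), "fail"), ((60, 35), "fail")]

prediction = knn_predict(train, (82, 70), k=3)
print(prediction)
```

No model is fitted in advance: all the work happens at prediction time, which is why such methods are called example-based.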

2.3 The Components of Data Mining Algorithms

One can identify three primary components [35], [36], [44] in any DM algorithm:

1. Model representation: A model representation is used to describe or extract patterns

2. Model evaluation: Model-evaluation criteria are statements of how well a particular pattern
or model meets the goals of the knowledge discovery process. Predictive models are judged by
their prediction accuracy on some dataset, and descriptive models are evaluated along the
dimensions of predictive accuracy, novelty, utility, and understandability of the model.

3. Search: A search method consists of two components: (a) parameter search and (b) model
search. Once the model representation and the model-evaluation criteria are fixed, the data
mining problem is reduced to an optimization task on the observational dataset.
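
The three components can be made concrete with a small sketch. Here the model representation is assumed to be a single threshold rule ("predict pass if the feature value is at least t"), the evaluation criterion is accuracy, parameter search tries every observed value as a threshold, and model search loops over the candidate features. All names and data are hypothetical; for brevity the rule is scored on the same data it is fitted to, whereas a real evaluation would use held-out data.

```python
def accuracy(feature, threshold, rows, labels):
    """Model evaluation: fraction of rows the threshold rule classifies correctly."""
    predictions = [row[feature] >= threshold for row in rows]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def parameter_search(feature, rows, labels):
    """Parameter search: best threshold for one fixed model (one feature)."""
    candidates = sorted({row[feature] for row in rows})
    return max(candidates, key=lambda t: accuracy(feature, t, rows, labels))

def model_search(features, rows, labels):
    """Model search: loop over candidate models, running parameter search in each."""
    best = None
    for feature in features:
        threshold = parameter_search(feature, rows, labels)
        score = accuracy(feature, threshold, rows, labels)
        if best is None or score > best[2]:
            best = (feature, threshold, score)
    return best

# Hypothetical student records; the label is True when the student passed.
rows = [{"attendance": 55, "age": 20}, {"attendance": 60, "age": 22},
        {"attendance": 72, "age": 19}, {"attendance": 80, "age": 23},
        {"attendance": 91, "age": 21}]
labels = [False, False, True, True, True]

best_model = model_search(["attendance", "age"], rows, labels)
print(best_model)
```

With the representation and the criterion fixed, the remaining work is pure optimization, which is the point made in item 3.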

2.4 Research and Application Challenges

Larger Databases: There are databases with hundreds of fields and tables and millions of
records, and deriving useful information from them is itself a challenge. Agrawal et al.
suggested methods for dealing with large data volumes using efficient algorithmic approaches,
because as the dataset grows, so does the chance of finding patterns that are invalid [60]. One
solution to this problem is the use of prior knowledge to identify irrelevant variables.

Pattern updating: Rapid changes to the data, or deletion of data, can make previously
discovered patterns invalid [55], [61], [62]. A possible solution is to develop methods for
updating the patterns.

Problem of missing and noisy data: This problem is related to business databases [16] and
mostly happens when KDD methods and tools easily incorporate prior knowledge
about a problem.

2.5 Steps in Data Mining

Link analysis identifies useful associations among datasets [1].

Deviation detection detects and explains why certain records cannot be put into specific
segments [1].

According to an IBM report, the three main steps in DM are preparing the data, reducing the data,
and finally looking for useful information [4].

Fayyad et al. proposed the following steps of data mining [44]:

Retrieving the data from a large database

Selecting the relevant subset to work with

Deciding on an appropriate sampling system and transformations, cleaning the data, and dealing
with missing fields and records

Fitting models to the pre-processed data

Predictive modeling uses inductive reasoning techniques and algorithms like neural networks
[63].

Database segmentation uses statistical clustering techniques to partition data into clusters [64].

2.6 DM Techniques

There are different data mining techniques which are used to extract information from a data set
and transform it into an understandable format for further use. Table 2.1 shows different data
mining techniques and their roles.

2.6.1 Statistics

Statistics is a vital component of data selection, sampling, data mining, and knowledge
evaluation. In the data cleaning process, statistics offers techniques to detect outliers, to
simplify data when necessary, and to estimate noise. It also deals with missing data using
estimation techniques [65], [66].
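
The following is a minimal sketch of two such cleaning steps, assuming mean imputation for missing values and a z-score rule for outliers. Both are common textbook choices rather than methods confirmed by the cited sources, and the function names, the threshold of 2.0, and the marks are hypothetical.

```python
import statistics

def impute_missing(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    return [mean if v is None else v for v in values]

def find_outliers(values, z_threshold=2.0):
    """Return the values whose z-score magnitude exceeds `z_threshold`."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs(v - mean) / stdev > z_threshold]

# Hypothetical sessional marks with one missing entry and one suspicious value.
marks = [62, 58, None, 65, 61, 60, 5, 63, 59, 64]

filled = impute_missing(marks)
outliers = find_outliers(filled)
print(outliers)
```

Note that the imputed mean here is itself pulled down by the outlier; in practice one might detect outliers first or use a robust estimate such as the median.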

2.6.2 Classification and Prediction

One of the most useful data mining techniques for e-learning is classification. Classification
maps data into the predefined group of classes. Classification is a supervised learning approach

performance with high accuracy is more beneficial for identifying the low academic performance
of the students at the beginning.

Table 2.1 Data Mining Techniques and their Roles

Techniques               Roles
Classification           Pre-defined examples.
Clustering               Identification of similar classes of objects.
Prediction               Regression technique.
Association Rules        Find frequent itemsets among large data sets.
Neural Networks          Derive meaning from complex or imprecise data; can be used to extract
                         patterns and detect trends that are complex.
Decision Trees           Represent sets of decisions using CART (Classification and Regression
                         Trees), CHAID (Chi-Square Automatic Interaction Detection), C4.5, and ID3.
Nearest Neighbor method  Classify each record in a dataset based on a combination of the classes
                         of the K records most similar to it in a historical dataset.

Classification [67] is the process of finding a set of models that describe and distinguish
data classes or concepts. The derived models may be represented in various forms, such as
classification (IF-THEN) rules, decision trees, or neural networks. The models can then be used
for predicting the class label of data objects. In many applications, there is a need to predict
some missing data values rather than class labels. This is usually the case when the predicted
values are numerical data, and it is often specifically referred to as prediction.

2.7 Clustering

Clustering groups data into classes that are not predefined, and it can identify dense and
sparse regions in object space. Unlike classification and prediction, which analyze
class-labeled data objects, clustering analyzes data objects without consulting a known class
label. The class labels are not present in the training data, and clustering can be used to
generate such labels. Clusters of objects are formed so that objects within a cluster have high
similarity to one another but are very dissimilar to objects in other clusters. Each cluster
formed can be viewed as a class of objects, from which rules can be derived [33]. Application of
clustering in education can help in

2.8 Association

Association rule mining finds sets of binary variables that occur together repeatedly in a
transaction database. Apriori is a widely used association rule mining algorithm [66], [68].
Association analysis is the discovery of association rules showing attribute-value conditions
that occur frequently together in a given set of data. The association rule A=>B means that
database tuples that satisfy the conditions in A are also likely to satisfy the conditions in B.
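
How strongly a rule A=>B holds is usually measured by its support and confidence. The sketch below computes both for a small transaction database; it illustrates the measures only and is not an implementation of Apriori. The function names and the transactions are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) for the rule antecedent => consequent."""
    antecedent_support = support(antecedent, transactions)
    if antecedent_support == 0:
        return 0.0  # the rule never fires, so report zero confidence
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / antecedent_support

# Hypothetical transaction database; each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "butter", "jam"},
]

rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
print(rule_support, rule_confidence)
```

In this data, bread and butter appear together in three of five transactions, and three of the four transactions containing bread also contain butter. Apriori-style algorithms use a minimum support threshold to prune candidate itemsets before rules are generated.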

2.9 Techniques for Mining Transactional/Relational Database

2.9.1 Artificial Intelligence (AI) Techniques

AI techniques consist of pattern recognition, machine learning, and neural networks. Other
techniques in AI such as knowledge acquisition, knowledge representation, and search are
relevant to the various processes in DM.

2.9.2 Decision Tree Approach

Decision trees are non-linear data structures which start from the root node and end with a leaf
node. Decision trees represent sets of decisions. This approach can generate rules for the
classification of a data set. Specific decision tree methods include Classification and Regression
Trees (CART) and Chi Square Automatic Interaction Detection (CHAID) [69]. These techniques
are used for classification of a data set. They provide a set of rules that are applied to an
unclassified dataset to predict results. CART typically requires less data preparation than
CHAID.

2.9.3 Visualization

Visual DM techniques are helpful in exploratory data analysis and in mining large databases.
This approach requires integrating a human into the DM process. There are examples of
visualization techniques that work on large data sets and produce interactive displays [70].

There are various techniques for visualizing multidimensional data, such as scatter plot
matrices, coplots, matrices, parallel coordinates, projection matrices, and other geometric
projection techniques, as well as icon-based, hierarchical, web-based, graph-based, and dynamic
techniques.

2.10 Various Data Mining Areas

2.10.1 Web Mining

Web mining is the application of data mining to discover patterns from the Web, using data
collected from online information databases, hyperlinks, and digital data. The data mining
techniques used in web mining include classification (supervised learning) and clustering
(unsupervised learning) [71], [72].

2.10.2 Ubiquitous Data Mining

Increasing computational capacity and the emergence of the latest electronic devices have led to
the ubiquitous, or pervasive, computing paradigm [73]. Ubiquitous computing environments give
rise to Ubiquitous Data Mining (UDM).

2.11 Data Mining using Multimedia

Multimedia data includes images, video, audio, and animation. The data mining techniques
applied to multimedia data include rule-based decision tree classification algorithms,
artificial neural networks, instance-based learning algorithms, support vector machines,
association rule mining, and clustering methods [74].

2.12 Spatial Data Mining

Spatial data includes astronomical data and data related to space technology. Spatial data
mining includes the use of spatial warehouses, spatial data cubes, spatial OLAP, and clustering
methods [75].

2.13 Emergence of Data Mining in Other Fields

Other data mining areas include visual, medical, pattern, wireless network, and
association-rule-based mining.

2.14 Performance Improvement in Education Sector

2.14.1 Data Mining Techniques for Education Sector

Applying data mining techniques to educational data for knowledge discovery is significant to
educational organizations as well as students. Knowledge-driven data supports educational
decision support systems. Educational data mining enhances our understanding of learning by
finding educational trends, which include improving student performance, course selection,
in-house training, and faculty development. Using linear regression analysis [29], some factors are
correlated to
income. Data min
improvement ratio, and increase the outcome. Thus, data mining techniques
are used to operate on large volumes of data to discover hidden patterns and relationships,
which help in effective decision making [65].

According to Han and Kamber, data mining software should be developed in such a manner that it
allows users to analyze data from different dimensions, categorize it, and summarize the
derived results [36]. Data mining can be applied to traditional as well as distance education.
There are many general data mining tools that provide mining algorithms, filtering, and
visualization techniques. Some examples of data mining tools are DBMiner, Clementine,
Intelligent Miner, RapidMiner, and Weka [29]. DM combines machine learning, statistics, and
visualization techniques to discover and extract knowledge. Questionnaires and feedback forms
are often used to collect data on attitudes towards educational patterns or trends, interest in
technologies, and the teaching methodologies followed; the collected data is then analyzed
using techniques such as decision trees and neural networks.

There are different mining models, such as Decision Trees, Naive Bayes, Support Vector Machines,
Linear Regression, Minimum Description Length, K-Nearest Neighbors, and K-Means. By using these
models, one can obtain student behavior patterns and course behavior patterns, predict student
retention and course suitability, and devise personalized intervention strategies [32].

2.14.2 Statistics and Visualization

Information visualization techniques can be used to graphically represent student data
collected by web-based educational systems, such as the technologies in which a student has
shown the most interest or the interest shown in solving questionnaires [76]. According to Tsantis and
Castellani, s the evaluation of an e-learning
system [77]. Visualization techniques can also be applied to conversations among online groups
and on social networking websites. These techniques are also helpful for instructors, who can
manipulate the generated graphical representations to gauge the understanding and interest of
their learners.

2.15 Web Mining

Srivastava et al. describe web mining as the extraction of knowledge from web data [78]. In web
content mining, useful information is extracted from the contents of web documents, whereas web
usage mining discovers meaningful patterns from data generated by client-server transactions on
one or more web localities.

2.15.1 Clustering, Classification and Outlier Detection

Clustering and classification both assign data items to groups: clustering is unsupervised,
whereas classification is supervised. Classification and prediction are also related
techniques: classification predicts class labels, whereas prediction models continuous-valued
functions. An outlier is an observation that is unusually large or small relative to the other
values in a dataset.

According to Liu, a decision tree algorithm (C5.0) and data cube technology are used for
managing classroom processes [79]. Induction analysis helps in identifying potential student
groups having similar characteristics. Talavera et al. propose mining student data using
clustering to discover patterns reflecting user behaviors [80].

2.15.2 Adaptive and Intelligent Web-Based Educational Systems

Tang et al. have given the concept of data clustering for web-based learning which helps in
solving learner based problems [81]. They find clusters of students with similar learning
characteristics based on the sequence and the contents of the pages they visited.

2.16 Association Rule Mining

Association rule mining is a popular method for discovering relationships among sets of items
in large databases. Here, one or more attributes of a dataset are associated with each other
using IF-THEN statements.

2.16.1 Particular Web-Based Courses

Ha et al. [82] perform web page navigational structure analysis of web-based virtual
classrooms, e-learning portals, and web pages navigated by learners.

The association fuzzy rules are implemented in a personalized e-learning material recommender
system. Fuzzy matching rules are used
and a list of learning materials [82]. Romero et al. [83] propose to use grammar-based genetic
programming with optimization techniques for providing a feedback to authors who designed

2.17 Text Mining

Text mining applies mining to text data and is related to web content mining. It is an
interdisciplinary area involving machine learning and data mining, statistics, information
retrieval, and natural language processing [76], [84]. Text mining can work with unstructured or
semi-structured datasets such as full-text documents, HTML files, and emails.
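
A first step in many text mining pipelines is turning unstructured text into term weights. The sketch below assumes a basic TF-IDF weighting (term frequency multiplied by the logarithm of inverse document frequency), which is a standard information retrieval scheme and not necessarily the one used in the cited works. The tokenizer, function names, and forum posts are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tf_idf(documents):
    """Return one {term: weight} dict per document using a basic TF-IDF scheme."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents in which each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical discussion-forum posts.
documents = [
    "Students discuss decision trees in the forum",
    "Students discuss clustering in the forum",
    "The forum thread covers exam dates",
]

weights = tf_idf(documents)
top_terms = [max(w, key=w.get) for w in weights]
print(top_terms)
```

Terms that occur in every document, such as "forum" here, receive a weight of zero, so the highest-weighted terms are those that distinguish one document from the others.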

2.18 Web-Based Educational Systems

Data mining and text mining technologies are used in Web-based educational systems for shared
learning. Text mining is used on a discussion board for expanded correspondence analysis.
Learners select the relevant category which represents their comment and the system provides

2.18.1 Well-Known Learning Content Management Systems

Dringus et al. [85] and Abdous et al. [86] have proposed using text mining as a strategy for
assessing conversations in asynchronous discussion forums. Text mining techniques also help in
evaluating the progress of a thread or of user group discussions. Data can be retrieved from PDF
interactive multimedia productions to help evaluate multimedia presentations, for statistical
purposes, and to extract relevant data [82], [83]. Web-based educational systems collect large
amounts of student data from weblog history, which can be further analyzed to derive meaningful
patterns [75].

2.18.2 Adaptive and Intelligent Web-Based Educational Systems

Tang et al. have proposed to construct a personalized web-based application by which mining
can be done on both the framework and structure of the courseware. Keyword-driven text mining
algorithms are used to select articles for distance learning students [81].

2.19 Conclusion

In this chapter, a literature survey has been conducted from the knowledge discovery
perspective, covering the role of data mining in an educational environment. Educational Data
Mining is an emerging field related to several well-established areas of research, including
e-learning, web mining, and text mining. Data mining techniques have been used to analyze
educational data and extract useful information from large amounts of data.

The KDD field is related to the development of methods and techniques which make the data
relevant. In the educational sector, software and visualization techniques can be developed using
data mining t
helps us to cluster those students who need special attention in their studies. Knowledge
discovery in databases results in better decision-making related to the latest technologies used in
classroom teaching as well as faculty enhancement programs and in-house training etc. Using
data mining techniques, one can obtain refined data from distributed databases. Data mining is
an efficient tool for improving institutional effectiveness and student learning. Knowledge
acquired through educational data mining not only helps teachers manage their classes and
improve their teaching skills and their students' learning processes, but also provides feedback
to institutions to improve their infrastructure and quality.

Using techniques like decision trees, the class results of students are predicted based on the
selected attributes. Decision tree classifiers have been applied to student data to predict
students' performance in the class result. These techniques help in identifying students who
(a) are short of attendance or (b) have shown poor performance in sessionals.

The main finding of using these techniques is the gathering

academic performance. Other helpful techniques include K-Means clustering, K-Nearest Neighbors,
and neural networks, through which students are grouped based on attributes such as (a) class
performance, (b) sessional marks, and (c) attendance in class. The centroid values are
calculated from the educational dataset for K clusters. This enhances the decision-making
approach used to monitor the performance of students. On increasing the value of K, the
accuracy improves for a huge dataset and a better grouping of the data can be found. It also
helps us to cluster those students who need special attention. This review of data mining is
helpful for finding useful patterns in educational data sets.
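
The centroid calculation mentioned above can be sketched as follows. This is a plain K-Means loop (Lloyd's algorithm) written for illustration, assuming Euclidean distance and the first K points as initial centroids so that the result is deterministic; it is not the implementation used in the surveyed studies, and the student records are hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain K-Means using the first k points as the initial centroids."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignment

# Hypothetical student records: (attendance %, sessional marks).
students = [(92, 85), (45, 38), (88, 79), (50, 42), (95, 90), (40, 35)]

centroids, assignment = kmeans(students, k=2)
print(centroids, assignment)
```

The cluster whose centroid has low attendance and low marks identifies the students who may need special attention. In practice the value of K and the initial centroids affect the result, so several values are usually compared.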
