Decision Trees


Supervised Learning

Input data is called training data and has a known label or result, such as spam/not-spam or a stock price at a point in time. A model is prepared through a training process in which it is required to make predictions and is corrected when those predictions are wrong. The training process continues until the model achieves a desired level of accuracy on the training data.

Example problems are classification and regression.

Example algorithms include Logistic Regression and the Back Propagation Neural
Network.
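As a minimal sketch of this workflow, assuming scikit-learn and using Logistic Regression (one of the example algorithms above); the breast-cancer dataset and pipeline choices are illustrative assumptions, not prescribed by the text:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Labelled training data: each row of X has a known label in y (benign/malignant).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Training adjusts the model's weights until it fits the labelled examples.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

# Accuracy on held-out labelled data measures how well the model learned.
print(round(model.score(X_test, y_test), 2))
```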

Unsupervised Learning
Input data is not labelled and does not have a known result. A model is prepared by deducing structures present in the input data. This may be to extract general rules, to work through a mathematical process that systematically reduces redundancy, or to organize data by similarity.

Example problems are clustering, dimensionality reduction and association rule learning.

Example algorithms include the Apriori algorithm and k-Means.


1. Decision Trees: A decision tree is a decision support tool that uses a tree-like
graph or model of decisions and their possible consequences, including
chance-event outcomes, resource costs, and utility.
From a business decision point of view, a decision tree is the minimum number of yes/no questions one has to ask to assess the probability of making a correct decision most of the time. As a method, it allows you to approach the problem in a structured and systematic way and arrive at a logical conclusion.

The most popular decision tree algorithms are:

1) Classification and Regression Tree (CART)
2) Iterative Dichotomiser 3 (ID3)
3) C4.5 and C5.0 (different versions of a powerful approach)
4) Chi-squared Automatic Interaction Detection (CHAID)
5) Decision Stump
6) M5
7) Conditional Decision Trees

SUPERVISED LEARNING: Decision trees are suited to supervised learning: labelled data are supplied, and the algorithm creates a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.
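A short sketch of this, assuming scikit-learn's DecisionTreeClassifier (an implementation of the CART algorithm listed above); the iris dataset and depth limit are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Labelled data: the tree infers yes/no threshold rules from the features.
X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# The learned model is a chain of yes/no questions on feature values.
print(export_text(tree, feature_names=load_iris().feature_names))
print(tree.predict(X[:1]))  # predict the class of one sample
```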

2) CLUSTERING ALGORITHM

Clustering, like regression, describes both a class of problem and a class of methods. Clustering methods are typically organized by modelling approach, such as centroid-based and hierarchical. All methods are concerned with using the inherent structures in the data to best organize the data into groups of maximum commonality.

The most popular clustering algorithms are:

k-Means
k-Medians
Expectation Maximisation (EM)
Hierarchical Clustering

UNSUPERVISED LEARNING: Clustering is an unsupervised machine learning technique: no labels are supplied. It involves applying one or more clustering algorithms with the goal of finding hidden patterns or groupings in a dataset. Clustering algorithms form groupings or clusters in such a way that data within a cluster have a higher measure of similarity to one another than to data in any other cluster.
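As an illustrative sketch of centroid-based clustering with k-Means (the synthetic two-blob data and the choice of k are assumptions, not from the text): no labels are given, yet the algorithm recovers the two groups by similarity.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Unlabelled data: two well-separated blobs of 2-D points.
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)  # one centroid near (0, 0), one near (5, 5)
print(km.labels_[:5])       # cluster assignment for the first five points
```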

3) SUPPORT VECTOR MACHINES

Support vector machines (SVMs) are machine learning models with associated learning algorithms that analyze data for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.

In addition to performing linear classification, SVMs can efficiently perform non-linear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces.

SUPERVISED LEARNING: The support vector machine is a supervised machine learning technique because it makes use of labeled data; it analyzes data and uses it for classification and regression analysis.
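A brief sketch assuming scikit-learn's SVC; the moons dataset is an illustrative choice to show why the kernel trick matters where a straight-line boundary cannot separate the classes:

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interleaving half-circles: not separable by a straight line.
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)  # kernel trick: implicit high-dim mapping

print(round(linear.score(X, y), 2))  # linear boundary struggles here
print(round(rbf.score(X, y), 2))     # non-linear boundary fits well
```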

4) FEATURE LEARNING

In machine learning, feature learning or representation learning is a set of techniques that learn a feature: a transformation of raw input data into a representation that can be effectively exploited in machine learning tasks. This obviates manual feature engineering, which would otherwise be necessary, and allows a machine both to learn a specific task (using the features) and to learn the features themselves: to learn how to learn.

Feature learning is motivated by the fact that machine learning tasks such as classification often require input that is mathematically and computationally convenient to process. However, real-world data such as images, video, and sensor measurements are usually complex, redundant, and highly variable. Thus, it is necessary to discover useful features or representations from raw data. Traditional hand-crafted features often require expensive human labor and often rely on expert knowledge, and they normally do not generalize well. This motivates the design of efficient feature learning techniques to automate and generalize this process.

SUPERVISED LEARNING: Feature learning can be a supervised machine learning technique because the features are learned from labeled input data. Examples include neural networks, the multilayer perceptron, and (supervised) dictionary learning.
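As a hedged sketch of supervised feature learning with a neural network (the digits dataset and the 32-unit hidden layer are illustrative assumptions): after training on labeled data, the hidden-layer activations of a multilayer perceptron serve as learned features of the raw input.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

# Raw inputs: 8x8 pixel images flattened to 64 features, with class labels.
X, y = load_digits(return_X_y=True)

# Training with labels shapes the hidden layer into task-useful features.
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=400, random_state=0)
mlp.fit(X, y)

# The hidden-layer activations (ReLU units) are the learned representation.
features = np.maximum(0, X @ mlp.coefs_[0] + mlp.intercepts_[0])
print(features.shape)  # each sample re-represented by 32 learned features
```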
