03 - Non Linear Classifiers
Figure: labeled points on the real line (x ∈ R) at x = −1, 0, 1 with labels +, −, +; they cannot be separated by a single threshold in x.

x → φ(x) = [x, x²]ᵀ,   θ = [θ1, θ2]ᵀ,   x ∈ R,  φ(x) ∈ R²

Figure: the same points mapped into the (φ1, φ2) plane via φ(x) = [φ1, φ2]ᵀ = [x, x²]ᵀ, where the two classes become linearly separable.
Back To the Real Line
h(x; θ, θ0) = sign(θ · φ(x) + θ0)
            = sign(θ1 x + θ2 x² + θ0)
Figure: back on the real line, the decision boundary consists of the two roots of θ1 x + θ2 x² + θ0 = 0 (marked ×), so the regions are labeled +, −, +.
A linear classifier in the new feature coordinates implies a nonlinear classifier in the
original x space.
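As a concrete illustration, here is a minimal sketch (not from the slides; NumPy, with hand-picked parameters θ = [0, 1] and θ0 = −0.5) of how a linear rule in φ-space acts as a nonlinear rule in x:

```python
# Minimal sketch: the feature map phi(x) = [x, x^2] turns the 1-D points
# {-1, 0, +1} with labels {+, -, +} into a linearly separable problem;
# the resulting rule is nonlinear in the original x.
import numpy as np

def phi(x):
    """Feature map R -> R^2."""
    return np.array([x, x ** 2])

# Hand-picked parameters (an assumption for illustration):
# theta = [0, 1], theta_0 = -0.5 classifies by the sign of x^2 - 0.5.
theta, theta_0 = np.array([0.0, 1.0]), -0.5

def h(x):
    return np.sign(theta @ phi(x) + theta_0)

print([h(x) for x in (-1.0, 0.0, 1.0)])  # [1.0, -1.0, 1.0]
```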
2-dim example
Figure: + and − points in the (x1, x2) plane together with their images in (φ1, φ2, φ3) feature coordinates. The (φ1, φ2) plane is actually an appropriate way to separate these two sets of examples.
φ(x) = [x1, x2, x1 x2]ᵀ,   θ = [θ1, θ2, θ3]ᵀ,   θ̂ = [0, 0, 1]ᵀ,   θ̂0 = 0
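A minimal sketch of this example (assuming an XOR-like quadrant labeling of four points, which is what the figure suggests; NumPy only):

```python
# Sketch: with the extra feature x1*x2, an XOR-like pattern becomes linearly
# separable; the classifier sign(x1 * x2), i.e. theta_hat = [0, 0, 1] with
# theta_0 = 0, labels all four points correctly.
import numpy as np

X = np.array([[1, 1], [-1, -1], [1, -1], [-1, 1]], dtype=float)
y = np.array([+1, +1, -1, -1])       # assumed labels: + in quadrants 1 and 3

def phi(x):
    return np.array([x[0], x[1], x[0] * x[1]])

theta_hat, theta_0 = np.array([0.0, 0.0, 1.0]), 0.0
preds = np.array([np.sign(theta_hat @ phi(x) + theta_0) for x in X])
print(np.all(preds == y))            # True
```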
Polynomial features
• We can add more polynomial terms
x ∈ R,   φ(x) = [x, x², x³, x⁴, …]ᵀ
• In higher dimensions this means a large number of features
x = [x1, x2]ᵀ ∈ R²,   φ(x) = [x1, x2, x1², x2², √2 x1 x2]ᵀ ∈ R⁵
• For instance, x and x² are linearly independent as functions, so each added coordinate always provides something above and beyond what the previous ones already capture.
Figure: data points marked × along the x axis; e.g. φ(x) = [x, x²]ᵀ
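A small sketch of the explicit degree-2 expansion above (the helper name phi_quadratic is mine, not from the slides):

```python
# Explicit degree-2 feature expansion for x in R^2: R^2 -> R^5.
import numpy as np

def phi_quadratic(x):
    """x = (x1, x2) -> [x1, x2, x1^2, x2^2, sqrt(2)*x1*x2]."""
    x1, x2 = x
    return np.array([x1, x2, x1 ** 2, x2 ** 2, np.sqrt(2) * x1 * x2])

print(phi_quadratic(np.array([2.0, 3.0])))
# [2.  3.  4.  9.  8.485...]
```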
Non linear regression
φ(x) = x,    φ(x) = [x, x²]ᵀ,    φ(x) = [x, x², x³]ᵀ
One question here is which one of these should we actually choose? Which one is
the correct one?
• At the extreme, you hold out each training example in turn, in a procedure called leave-one-out cross-validation.
– Take a single training example and remove it from the training set.
– Retrain the method.
– Test how well you would predict that particular held-out example, and do that for each training example in turn.
– Average the results.
If we now use this leave-one-out accuracy as a measure for selecting which of these is the correct explanation, we would actually select the linear one. That is the correct answer, because the data was generated from a linear model with some noise added.
Figure: 5th-order and 7th-order polynomial fits.
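A sketch of leave-one-out cross-validation for choosing the polynomial degree (hypothetical data drawn from a linear model plus noise; NumPy's polyfit stands in for 'retrain the method'):

```python
# Leave-one-out cross-validation over polynomial degrees: hold out each point,
# refit on the rest, measure the squared error on the held-out point, average.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 20)
y = 2.0 * x + 0.3 * rng.standard_normal(x.shape)     # linear model + noise

def loo_error(x, y, degree):
    errs = []
    for i in range(len(x)):
        mask = np.arange(len(x)) != i
        coeffs = np.polyfit(x[mask], y[mask], degree)  # retrain without point i
        pred = np.polyval(coeffs, x[i])                # predict the held-out point
        errs.append((pred - y[i]) ** 2)
    return np.mean(errs)

for degree in (1, 2, 3, 5, 7):
    print(degree, loo_error(x, y, degree))
# The degree-1 fit typically attains the lowest leave-one-out error on such data.
```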
x ∈ Rᵈ,   φ(x) = [x1, …, xd, {xi xj}, {xi xj xk}, …]ᵀ
The number of terms grows as O(d) for the linear terms, O(d²) for the pairwise terms, O(d³) for the triple terms, and so on.
• We would therefore want a more efficient way of doing this: operating with high-dimensional feature vectors without explicitly having to construct them. That is what kernel methods provide.
Non linear classification
• We want a decision boundary that is a polynomial of order p
Figure: two classes of + and − points arranged so that only a nonlinear (polynomial) decision boundary separates them.
• Add new features to data vectors x
– Let φ(x) consist of all terms of order ≤ p, such as x1 x2² x3^(p−3)
– A degree-p polynomial in x ⇔ linear in φ(x)
x = (x1, x2), p = 3:
φ(x) = (x1, x2, x1², x2², x1 x2, x1³, x2³, x1² x2, x1 x2²)
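One way to build such a feature map programmatically (a sketch; phi_poly is a hypothetical helper that enumerates all monomials of total order ≤ p):

```python
# Enumerate all monomial features of total order <= p for x in R^d.
# For d = 2, p = 3 this reproduces the nine terms listed above.
from itertools import combinations_with_replacement
import numpy as np

def phi_poly(x, p):
    feats = []
    for order in range(1, p + 1):
        for idxs in combinations_with_replacement(range(len(x)), order):
            feats.append(np.prod([x[i] for i in idxs]))
    return np.array(feats)

x = np.array([2.0, 3.0])
print(len(phi_poly(x, 3)))   # 9 features for d = 2, p = 3
```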
φ(x) = [x1, x2, x1², √2 x1 x2, x2²]ᵀ
φ(x′) = [x1′, x2′, x1′², √2 x1′ x2′, x2′²]ᵀ

K(x, x′) = φ(x) · φ(x′) = (x · x′) + (x · x′)²

and, with the order-3 terms included (and their cross terms suitably scaled), K(x, x′) = (x · x′) + (x · x′)² + (x · x′)³.
The inner product between two feature vectors can be evaluated cheaply, based on
just taking inner products of the original examples and doing some nonlinear
transformations of the result.
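A quick numerical check of this identity for the five-dimensional map above (illustrative code, not part of the slides):

```python
# Verify that K(x, x') = (x.x') + (x.x')^2 equals the explicit inner product
# phi(x).phi(x') for the 5-dimensional quadratic feature map.
import numpy as np

def phi(x):
    x1, x2 = x
    return np.array([x1, x2, x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def K(x, xp):
    d = x @ xp
    return d + d ** 2

x, xp = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(phi(x) @ phi(xp), K(x, xp))   # both equal 2.0
```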
• Why is this useful? Kernels give a way to compute dot products in some feature space without even knowing what this space is or what φ is.
• Our task now is to turn our linear methods into methods that can operate in
terms of the kernels, rather than directly in terms of the feature coordinates.
• We will implicitly operate with very high dimensional feature vectors and do the
linear prediction there but actually computationally only deal with the kernel
function.
θ = ∑_{j=1}^{n} αj y^(j) φ(x^(j)),   where αj is the number of times we have updated on the j-th point.
Instead of working with θ, we can work equivalently with the coefficients α1, …, αn, since θ = ∑_{j=1}^{n} αj y^(j) φ(x^(j)).
• Having written θ this way, we will not use it directly at all; we will use the vector α instead.
• The α vector is called the dual representation of θ
• We still have to compute the dot product of two very high dimensional vectors.
• Compute φ(x) · φ(z) without ever writing out φ(x) or φ(z).
• What is φ(x) · φ(z)?
φ(x) · φ(z) = (1, √2 x1, √2 x2, x1², x2², √2 x1 x2) · (1, √2 z1, √2 z2, z1², z2², √2 z1 z2)
            = 1 + 2 x1 z1 + 2 x2 z2 + x1² z1² + x2² z2² + 2 x1 z1 x2 z2
            = (1 + x1 z1 + x2 z2)²
            = (1 + x · z)²

More generally, for x, z ∈ Rᵈ:
φ(x) · φ(z) = 1 + 2 ∑_i xi zi + ∑_i xi² zi² + 2 ∑_{i<j} xi xj zi zj
            = (1 + x1 z1 + … + xd zd)²
            = (1 + x · z)²
sign( ∑_{j=1}^{n} αj y^(j) φ(x^(j)) · φ(x) + θ0 )  =  sign( θ · φ(x) + θ0 )
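Putting the pieces together, here is a kernel perceptron sketch in the dual (α) representation; the mistake-driven update rule, the offset update, and the toy data are assumptions for illustration rather than details taken from the slides:

```python
# Kernel perceptron in the dual representation: alpha_j counts the updates on
# point j, and both training and prediction use only kernel evaluations.
import numpy as np

def poly_kernel(x, z, p=2):
    return (1.0 + x @ z) ** p

def train_kernel_perceptron(X, y, kernel, epochs=10):
    n = len(X)
    alpha, theta_0 = np.zeros(n), 0.0
    for _ in range(epochs):
        for i in range(n):
            # the score uses only kernel values, never phi explicitly
            s = sum(alpha[j] * y[j] * kernel(X[j], X[i]) for j in range(n)) + theta_0
            if y[i] * s <= 0:            # mistake: increment alpha_i
                alpha[i] += 1.0
                theta_0 += y[i]
    return alpha, theta_0

def predict(x, X, y, alpha, theta_0, kernel):
    s = sum(alpha[j] * y[j] * kernel(X[j], x) for j in range(len(X))) + theta_0
    return np.sign(s)

# XOR-like toy data, separable with the quadratic kernel
X = np.array([[1, 1], [-1, -1], [1, -1], [-1, 1]], dtype=float)
y = np.array([+1, +1, -1, -1])
alpha, theta_0 = train_kernel_perceptron(X, y, poly_kernel)
print([predict(x, X, y, alpha, theta_0, poly_kernel) for x in X])
# [1.0, 1.0, -1.0, -1.0]
```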
Feature engineering, Kernels
• There are two major techniques to construct valid kernel functions: either from
an explicit feature map or from other valid kernel functions.
• If K1(x, x’) and K2(x, x’) are kernels then K(x, x’) = K1(x, x’) + K2(x, x’) is a kernel, with feature map φ(x) = [φ1(x), φ2(x)]ᵀ (the two feature maps stacked into one vector).
• If K1(x, x’) and K2(x, x’) are kernels then K(x, x’) = K1(x, x’) K2(x, x’) is a
kernel.
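A sketch of these two closure rules as code (the helper names sum_kernel and product_kernel are mine, not from the slides):

```python
# Building new kernels from existing ones: the sum and the product of two
# valid kernels are again valid kernels.
import numpy as np

def linear_kernel(x, z):
    return x @ z

def quadratic_kernel(x, z):
    return (x @ z) ** 2

def sum_kernel(k1, k2):
    # feature map of the sum: the two feature maps concatenated
    return lambda x, z: k1(x, z) + k2(x, z)

def product_kernel(k1, k2):
    return lambda x, z: k1(x, z) * k2(x, z)

K = sum_kernel(linear_kernel, quadratic_kernel)   # K(x, z) = x.z + (x.z)^2
x, z = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(K(x, z))   # 2.0
```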
• A point in the dataset will affect the nearby points more than it affects the
faraway points.
h(x) = ∑_{j=1}^{n} αj y^(j) e^(−γ ‖x − x^(j)‖²)

• Let K(x, x′) = e^(−γ ‖x − x′‖²), the Radial Basis Function (RBF) kernel. Then

h(x) = ∑_{j=1}^{n} αj y^(j) K(x, x^(j))
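A sketch of the resulting RBF-kernel classifier (the α values below are placeholders; in practice they come from a training procedure such as the kernel perceptron sketched earlier):

```python
# RBF-kernel classifier: score h(x) = sum_j alpha_j * y^(j) * K(x, x^(j)),
# with K(x, x') = exp(-gamma * ||x - x'||^2); the predicted label is sign(h(x)).
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    return np.exp(-gamma * np.sum((x - z) ** 2))

def h(x, X, y, alpha, gamma=1.0):
    return sum(alpha[j] * y[j] * rbf_kernel(x, X[j], gamma) for j in range(len(X)))

X = np.array([[0.0, 0.0], [2.0, 2.0]])
y = np.array([+1, -1])
alpha = np.ones(len(X))                                # placeholder coefficients
print(np.sign(h(np.array([0.2, 0.1]), X, y, alpha)))   # 1.0: the nearby + point dominates
```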