Algorithms: K Nearest Neighbors

Dr K Meena 03/05/2022
Simple Analogy
• Tell me about your friends (who your neighbors are) and I will tell you who you are.

Instance-based Learning

It's very similar to a desktop! (Everything is simply stored; nothing is processed until it is needed.)

KNN – Different names

• K-Nearest Neighbors
• Memory-Based Reasoning
• Example-Based Reasoning
• Instance-Based Learning
• Lazy Learning

What is KNN?
• A powerful classification algorithm used in pattern recognition.

• K nearest neighbors stores all available cases and classifies new cases based on a similarity measure (e.g., a distance function).

• One of the top data mining algorithms used today.

• A non-parametric, lazy learning algorithm (an instance-based learning method).

KNN: Classification Approach
• An object (a new instance) is classified by a majority vote of its neighbors' classes.
• The object is assigned to the most common class among its K nearest neighbors, as measured by a distance function (a sketch follows).
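
A minimal sketch of the voting step in Python (the function name and example labels are illustrative, not from the slides):

    from collections import Counter

    def majority_vote(neighbor_labels):
        # Return the most common class among the K nearest neighbors;
        # most_common(1) yields [(label, count)] for the top label.
        return Counter(neighbor_labels).most_common(1)[0][0]

    # With K = 3 neighbors labeled YES, YES, NO, the object is classified YES.
    print(majority_vote(["YES", "YES", "NO"]))  # -> YES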

Distance Measure

Given a test record, compute the distance to every training record, then choose the k "nearest" records.

Distance Measure for Continuous Variables
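
Commonly used measures for continuous variables are the Euclidean distance (defined on the next slide), the Manhattan distance, and the Minkowski distance that generalizes both. A minimal sketch, assuming these are the measures the slide intends:

    def manhattan(x, y):
        # Sum of absolute coordinate differences.
        return sum(abs(a - b) for a, b in zip(x, y))

    def minkowski(x, y, p):
        # p = 1 gives Manhattan distance; p = 2 gives Euclidean distance.
        return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

    print(manhattan([1, 2], [4, 6]))     # 7
    print(minkowski([1, 2], [4, 6], 2))  # 5.0 (Euclidean)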

Distance Between Neighbors
• Calculate the distance between the new example (E) and all examples in the training set.

• Euclidean distance between two examples.


– X = [x1, x2, x3, ..., xn]
– Y = [y1, y2, y3, ..., yn]
– The Euclidean distance between X and Y is defined as:

  D(X, Y) = sqrt( (x1 − y1)² + (x2 − y2)² + … + (xn − yn)² )
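
The formula translates directly into Python; a minimal sketch (the function name is illustrative):

    import math

    def euclidean_distance(x, y):
        # D(X, Y) = sqrt( sum over i of (x_i - y_i)^2 )
        return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

    print(euclidean_distance([22, 50, 2], [37, 50, 2]))  # 15.0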
K-Nearest Neighbor Algorithm
• All the instances correspond to points in an n-dimensional feature space.

• Each instance is represented by a set of numerical attributes.

• Each training example consists of a feature vector and its associated class label.

• Classification is done by comparing the feature vector of a new point with those of the K nearest points.

• Select the K nearest examples to E in the training set.

• Assign E to the most common class among its K nearest neighbors (sketched below).
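
Putting the steps above together, a minimal sketch of the whole classifier (names are illustrative; in practice a library implementation such as scikit-learn's KNeighborsClassifier would be used):

    import math
    from collections import Counter

    def knn_classify(training_data, new_example, k):
        # training_data: list of (feature_vector, class_label) pairs.
        # Step 1: compute the distance from the new example to every stored vector.
        distances = [
            (math.sqrt(sum((a - b) ** 2 for a, b in zip(vec, new_example))), label)
            for vec, label in training_data
        ]
        # Step 2: select the K nearest examples.
        nearest = sorted(distances)[:k]
        # Step 3: assign the most common class among those K neighbors.
        return Counter(label for _, label in nearest).most_common(1)[0][0]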
3-KNN: Example (1)

sqrt[(35 − 37)² + (35 − 50)² + (3 − 2)²] = 15.16
sqrt[(22 − 37)² + (50 − 50)² + (2 − 2)²] = 15.00
sqrt[(63 − 37)² + (200 − 50)² + (1 − 2)²] = 152.23
sqrt[(59 − 37)² + (170 − 50)² + (1 − 2)²] = 122.00
sqrt[(25 − 37)² + (40 − 50)² + (4 − 2)²] = 15.74

Result: the new instance (37, 50, 2) is classified YES by majority vote of its 3 nearest neighbors (distances 15.00, 15.16, and 15.74).
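
The numbers above can be checked with the knn_classify sketch from the previous slide. The query (37, 50, 2) and the five training vectors are read off the distance computations; the class labels are hypothetical, chosen only so that the 3-NN majority matches the slide's YES result:

    # Labels are assumed, not from the source (the slide's data table is not shown).
    training_data = [
        ([35, 35, 3], "YES"),   # distance 15.16
        ([22, 50, 2], "YES"),   # distance 15.00
        ([63, 200, 1], "NO"),   # distance 152.23
        ([59, 170, 1], "NO"),   # distance 122.00
        ([25, 40, 4], "NO"),    # distance 15.74
    ]
    print(knn_classify(training_data, [37, 50, 2], k=3))  # -> YES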

How to choose K?
• If K is too small, the classifier is sensitive to noise points.

• A larger K works well, but too large a K may bring in points from other classes.

• A rule of thumb: K < sqrt(n), where n is the number of examples (see the sketch below).
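
A minimal sketch of the rule of thumb (rounding K down to an odd number is a common convention to avoid ties in two-class voting, not something the slide states):

    import math

    def rule_of_thumb_k(n):
        # Pick K just under sqrt(n); make it odd to avoid voting ties.
        k = max(1, int(math.sqrt(n)))
        return k if k % 2 == 1 else k - 1

    print(rule_of_thumb_k(100))  # 9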

(a) 1-nearest neighbor   (b) 2-nearest neighbor   (c) 3-nearest neighbor

The k-nearest neighbors of a record x are the data points that have the k smallest distances to x.

Strengths of KNN
• Very simple and intuitive.
• Can be applied to data from any distribution.
• Good classification if the number of samples is large enough.

Weaknesses of KNN

• Classifying a new example takes more time, since the distance from the new example to all stored examples must be calculated and compared.
• Choosing k may be tricky.
• Needs a large number of samples for accuracy.
