ML Lec07 KNN
Fall 2023
[Diagram: ML pipeline — Preprocessing, Feature Extraction, ML Algorithms, Evaluation]
[Diagram: ML algorithms overview. Optimization: Gradient Descent, Normal Equations. Supervised learning: Regression (Linear Regression), Classification (Logistic Regression). Unsupervised learning: Clustering (K-means), Dimensionality Reduction (PCA)]
Lecture 7
k-Nearest Neighbor (kNN)
Slides from: Machine Learning Specialization, Stanford University (Andrew Ng)
https://www.coursera.org/specializations/machine-learning-introduction
Grading Distribution (Update)
Project (20%)
• Code (GitHub), pitching video, presentation.
Programming Assignment (10%): DataCamp track
Midterm exam (20%)
Final exam (50%)
k-NN Intuition
What does k-NN do for you?
[Figure: "Before K-NN" and "After K-NN" scatter plots of Category 1 and Category 2 points in the (x1, x2) plane; after k-NN, the new point is assigned to one of the two categories]
How did it do that?
STEP 1: Choose the number K of neighbors.
STEP 2: Take the K nearest neighbors of the new data point, according to the Euclidean distance.
STEP 3: Among these K neighbors, count the number of data points in each category.
STEP 4: Assign the new data point to the category where you counted the most neighbors.
Your Model is Ready.
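As a concrete illustration of these four steps, here is a minimal sketch of a brute-force k-NN classifier in Python/NumPy (the function and variable names are ours, not from the slides):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=5):
    """Classify x_new by majority vote among its k nearest training points."""
    # STEP 2: Euclidean distance from x_new to every training point
    dists = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Indices of the k closest training points (STEP 1 fixed k beforehand)
    nearest = np.argsort(dists)[:k]
    # STEP 3: count how many of these neighbors fall in each category
    votes = Counter(y_train[nearest])
    # STEP 4: assign the category with the most neighbors
    return votes.most_common(1)[0][0]

# Toy usage: two categories in a 2-D feature space
X_train = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0], [1.2, 0.5]])
y_train = np.array([0, 0, 1, 1, 0])
print(knn_predict(X_train, y_train, np.array([1.4, 1.6]), k=3))  # -> 0
```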
K-NN Algorithm
• STEP 1: Choose the number K of neighbors: K = 5
[Figure: new data point plotted among Category 1 and Category 2 points in the (x1, x2) plane]
Euclidean Distance
STEP 2: Take the K nearest neighbors of the new data point, according to the Euclidean distance.
[Figure: two points P1(x1, y1) and P2(x2, y2) in the plane; their Euclidean distance is d(P1, P2) = √((x2 − x1)² + (y2 − y1)²)]
K-NN Algorithm
STEP 3: Among these K neighbors, count the number of data points in each category.
STEP 4: Assign the new data point to the category where you counted the most neighbors.
[Figure: with K = 5, the new data point has 3 neighbors in Category 1 and 2 neighbors in Category 2, so it is assigned to Category 1]
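The voting step from this figure takes only a couple of lines; this small sketch (labels and variable names are ours) reproduces the 3-vs-2 count:

```python
from collections import Counter

# Labels of the 5 nearest neighbors, as in the figure above
neighbor_labels = ["Category 1", "Category 1", "Category 1", "Category 2", "Category 2"]
votes = Counter(neighbor_labels)          # Counter({'Category 1': 3, 'Category 2': 2})
prediction = votes.most_common(1)[0][0]   # -> 'Category 1'
```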
Simple Idea …
• Store all training examples.
[Figure: an unknown record to be classified among the stored training examples]
k-Nearest Neighbors (kNN)
The k-nearest neighbor classifier takes a test sample, compares it to every single training sample, and predicts the test sample's label from the labels of the k closest training samples.
Weighted k-NN Example
• The sum of the inverse distances is 14.1421 + ... + 6.3246 = 60.8834.
(Source: https://jamesmccaffrey.wordpress.com/2022/05/03/weighted-k-nearest-neighbors-classification-example-using-python/)
• The voting weights are the inverse distances divided by their sum:
class = 0 weight = 14.1421 / 60.8834 = 0.2323
class = 0 weight = 13.1306 / 60.8834 = 0.2157
class = 1 weight = 10.8465 / 60.8834 = 0.1782
class = 2 weight = 8.9443 / 60.8834 = 0.1469
class = 0 weight = 7.4953 / 60.8834 = 0.1231
class = 2 weight = 6.3246 / 60.8834 = 0.1039
• Because class 0 has the most voting weight, the prediction is that the unlabeled
data item is class 0.
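A small sketch of this weighted vote, following the numbers above (variable names are ours):

```python
# Inverse distances and class labels of the 6 nearest neighbors (from the example above)
inv_dists = [14.1421, 13.1306, 10.8465, 8.9443, 7.4953, 6.3246]
labels    = [0, 0, 1, 2, 0, 2]

total = sum(inv_dists)                     # 60.8834
weights = [d / total for d in inv_dists]   # 0.2323, 0.2157, 0.1782, 0.1469, 0.1231, 0.1039

# Accumulate the total voting weight of each class
class_weight = {}
for lbl, w in zip(labels, weights):
    class_weight[lbl] = class_weight.get(lbl, 0.0) + w

prediction = max(class_weight, key=class_weight.get)
print(class_weight)   # class 0 has the largest total weight (about 0.571)
print(prediction)     # -> 0
```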
K-NN Components
Components of a KNN Classifier
• Distance metric
− How do we measure distance between instances?
− Determines the layout of the example space
• The k hyperparameter
− How large a neighborhood should we consider?
− Determines the complexity of the hypothesis space
Both choices are very problem-dependent: you must try them out and see what works best.
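For instance, in scikit-learn (assuming it is installed; this example is ours, not from the slides) both components are exposed as constructor arguments of KNeighborsClassifier:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy training data: two classes in a 2-D feature space
X_train = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]])
y_train = np.array([0, 0, 1, 1])

# Both components are hyperparameters of the classifier:
#   n_neighbors -> the k hyperparameter (size of the neighborhood)
#   metric      -> the distance metric ('euclidean', 'manhattan', ...)
knn = KNeighborsClassifier(n_neighbors=3, metric="euclidean")
knn.fit(X_train, y_train)                   # "training" just stores the examples
print(knn.predict(np.array([[1.2, 1.5]])))  # prediction searches the stored examples -> [0]
```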
Distance Metrics
• We can use any distance function to select nearest neighbors.
• Different distances yield different neighborhoods.
[Figure: unit "circles" of different metrics: L2 distance (Euclidean), L1 distance (Manhattan), and the max norm]
Manhattan Distance (L1)
• The Manhattan distance in a 2-dimensional space is given as:
d1(p, q) = |p1 − q1| + |p2 − q2|
• For instance, if we have two points A(3, 6) and B(−5, 4):
L1(A, B) = |3 − (−5)| + |6 − 4| = 8 + 2 = 10
• In general, for n dimensions:
d1(p, q) = Σᵢ₌₁ⁿ |pᵢ − qᵢ|
where n is the number of dimensions and pᵢ, qᵢ are the i-th coordinates of the data points.
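A quick check of this example in Python (a sketch, not from the slides):

```python
# Manhattan (L1) distance between A(3, 6) and B(-5, 4)
A, B = (3, 6), (-5, 4)
l1 = sum(abs(a - b) for a, b in zip(A, B))
print(l1)  # -> 10
```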
Euclidean Distance (L2)
• The Euclidean distance in a 2-dimensional space is given as:
d2(p, q) = √((p1 − q1)² + (p2 − q2)²)
• For instance, if we have two points A(3, 6) and B(−5, 4):
L2(A, B) = [(3 − (−5))² + (6 − 4)²]^(1/2) = [64 + 4]^(1/2) ≈ 8.246
• In general, for n dimensions:
d2(p, q) = √( Σᵢ₌₁ⁿ (pᵢ − qᵢ)² )
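And the corresponding check for the Euclidean distance (again a sketch, not from the slides):

```python
import math

# Euclidean (L2) distance between A(3, 6) and B(-5, 4)
A, B = (3, 6), (-5, 4)
l2 = math.sqrt(sum((a - b) ** 2 for a, b in zip(A, B)))
print(l2)  # -> 8.246...
```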
Distance Metric to compare images
• Manhattan distance (L1): compare the images pixel by pixel and add up all the absolute differences. Given two images I1 and I2: d1(I1, I2) = Σₚ |I1(p) − I2(p)|, where the sum runs over all pixels p.
• Euclidean distance (L2): compute the pixel-wise differences as before, but this time square them, add them up, and finally take the square root: d2(I1, I2) = √( Σₚ (I1(p) − I2(p))² ).
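A minimal sketch of both pixel-wise distances with NumPy (the toy 2x2 "images" and array names are ours):

```python
import numpy as np

# Two toy grayscale "images" as pixel arrays
I1 = np.array([[56, 32], [10, 200]], dtype=float)
I2 = np.array([[10, 25], [12, 190]], dtype=float)

l1 = np.abs(I1 - I2).sum()              # Manhattan: sum of absolute pixel differences
l2 = np.sqrt(((I1 - I2) ** 2).sum())    # Euclidean: sqrt of the summed squared differences
print(l1, l2)
```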
Setting Hyperparameters
[Figure: classification accuracy plotted against k; bars indicate standard deviation. It seems that k ≈ 10 works best for this data.]
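One common way to produce such a plot is to score several candidate values of k with cross-validation; a sketch using scikit-learn's cross_val_score on the iris dataset (our example, not from the slides):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Evaluate several candidate values of k with 5-fold cross-validation
for k in (1, 3, 5, 10, 20):
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    print(f"k={k:>2}  mean accuracy={scores.mean():.3f}  std={scores.std():.3f}")
```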
K-NN Pros and Cons
Feature Scale
• Example:
− height of a person may vary from 1.5 m to 1.8 m
− weight of a person may vary from 90 lb to 300 lb
− income of a person may vary from $10K to $1M
• Because k-NN is distance-based, features with large numeric ranges (like income) would dominate the distance, so features should be rescaled (normalized) before applying k-NN.
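A minimal sketch of min-max scaling to bring such features onto a comparable range (the column order and toy values are made up for illustration):

```python
import numpy as np

# Columns: height (m), weight (lb), income ($) -- toy values
X = np.array([[1.6, 120,  30_000],
              [1.8, 250, 900_000],
              [1.5,  90,  15_000]])

# Min-max scaling: each feature is mapped to the [0, 1] range
X_min, X_max = X.min(axis=0), X.max(axis=0)
X_scaled = (X - X_min) / (X_max - X_min)
print(X_scaled)
```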
Disadvantages of K-NN
- It is computationally expensive in the testing phase, which is impractical in industry settings: prediction is O(N).
• This is bad: we want classifiers that are fast at prediction; slow training is OK.
- KNN can suffer from skewed class distributions.
• For example, if a certain class is very frequent in the training set, it will tend to dominate the majority voting for the new example (larger number = more common).
- Finally, the accuracy of KNN can be severely degraded by high-dimensional data. For that reason, it is rarely used directly on raw image pixels.
Advantages of K-NN
- Simple to understand and easy to implement.
- Little to no training time: training is O(1).
- KNN works just as easily with multiclass data sets (more than 2 classes).
Thanks