Week 1
Rex Cheung
Sparse PCA
Instead of a full PCA, sparse PCA encourages sparsity in the principal
component loadings (the weights).
The optimization is similar to the full PCA procedure, with the addition
of a sparsity constraint on the loadings (think of LASSO).
max_v vᵀΣv
subject to ||v||_2 = 1
           ||v||_0 ≤ k
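A minimal sketch using scikit-learn's SparsePCA on made-up toy data. Note that sklearn solves an L1-penalized (LASSO-style) relaxation rather than the hard ||v||_0 ≤ k constraint above; the data shape and alpha value here are arbitrary illustration choices.

import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.RandomState(0)
X = rng.randn(100, 10)              # toy data: 100 samples, 10 features

# alpha is the L1 penalty strength: larger alpha -> sparser loadings
spca = SparsePCA(n_components=3, alpha=1.0, random_state=0)
X_reduced = spca.fit_transform(X)

print(spca.components_.shape)           # (3, 10) loading vectors
print(np.mean(spca.components_ == 0.0)) # fraction of exactly-zero loadings

Unlike full PCA, many entries of components_ are exactly zero, which makes each component easier to interpret.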
Randomized PCA
Instead of computing the exact PC loadings, this uses a randomized
approximation to estimate the first k PC loadings.
Borrows randomized algorithms to quickly estimate the singular value
decomposition (SVD) of a matrix, which is the main algorithm for
computing the PCs.
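A short sketch, assuming scikit-learn: PCA accepts a randomized SVD solver that approximates the first k components much faster than a full SVD on large matrices. The data shape and component count below are arbitrary.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.randn(2000, 500)            # toy data: 2000 samples, 500 features

# svd_solver='randomized' approximates the SVD instead of computing it exactly
pca = PCA(n_components=10, svd_solver='randomized', random_state=0)
X_reduced = pca.fit_transform(X)    # approximate first 10 PCs
print(pca.explained_variance_ratio_)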
Incremental PCA
Useful when the data set is too large to fit in memory.
Instead of computing PCA on the entire data set, it performs PCA
on smaller chunks of data and updates the components sequentially.
Also useful for streaming data.
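A minimal sketch with scikit-learn's IncrementalPCA; the loop below stands in for reading batches from disk or a stream, and the chunk sizes are made up.

import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.RandomState(0)
ipca = IncrementalPCA(n_components=5)

# Feed the data in chunks; each partial_fit call updates the components
# sequentially, so the full data set never has to sit in memory at once.
for _ in range(20):
    X_chunk = rng.randn(100, 50)    # pretend this was read from disk / a stream
    ipca.partial_fit(X_chunk)

Z = ipca.transform(rng.randn(10, 50))   # project new observations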
Kernel PCA
Useful for preserving clusters after projection.
According to some statistical theory, mapping data to a
higher dimension can sometimes make a nonlinearly separable problem
become linearly separable.
Idea of Kernel PCA: map current data to an even higher dimension,
then perform PCA.
High-dimensional mapping will increase computation time. We can
use the kernel trick to help solve this problem.
Kernel: a function that computes the similarity (an inner product in the
mapped feature space) between two vectors, without ever computing the
high-dimensional mapping explicitly.
For more mathematical details of Kernel PCA, refer to
https://arxiv.org/pdf/1207.3538.pdf.
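A hedged sketch of the idea, assuming scikit-learn and a toy two-circles data set: in the original 2-D space the two classes are not linearly separable, but after an RBF Kernel PCA projection they (nearly) are. The gamma value is an arbitrary choice for illustration.

import numpy as np
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

# Two concentric circles: not linearly separable in the original space
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# The RBF kernel implicitly maps the data to a high-dimensional space;
# the kernel trick means this mapping is never computed explicitly.
kpca = KernelPCA(n_components=2, kernel='rbf', gamma=10)
X_kpca = kpca.fit_transform(X)      # classes become (nearly) separable here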
Remarks:
MDS can only be used for feature exploration.
It cannot be used to transform new observations (unlike PCA).
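A quick way to see this remark in code, assuming scikit-learn's manifold module: MDS exposes fit_transform but no transform method, whereas a fitted PCA can project unseen rows.

import numpy as np
from sklearn.manifold import MDS
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.randn(50, 5)

mds = MDS(n_components=2, random_state=0)
Z = mds.fit_transform(X)            # embeds the training set only
print(hasattr(mds, 'transform'))    # False: no way to map new observations

pca = PCA(n_components=2).fit(X)
Z_new = pca.transform(rng.randn(3, 5))   # PCA projects unseen data fine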
3. In the lower dimension, find new points z_i that minimize the expression

   ∑_{i=1}^{N} || z_i − ∑_{k=1}^{N} w_{i,k} z_k ||^2
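This minimization looks like the embedding step of Locally Linear Embedding (LLE), where the w_{i,k} are reconstruction weights fit in an earlier step. A minimal sketch with scikit-learn, which solves both the weight fitting and this minimization internally; the swiss-roll data and neighbor count are illustrative choices.

import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, _ = make_swiss_roll(n_samples=1000, random_state=0)

# n_neighbors controls how many z_k enter each reconstruction sum
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2, random_state=0)
Z = lle.fit_transform(X)            # the z_i minimizing the cost above
print(lle.reconstruction_error_)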