Ruiz Modified I2ml3e Chap6
INTRODUCTION
TO
MACHINE
LEARNING
3RD EDITION
ETHEM ALPAYDIN, modified by Prof. Carolina Ruiz
© The MIT Press, 2014, for CS539 Machine Learning at WPI
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml3e
CHAPTER 6:
DIMENSIONALITY
REDUCTION
Why Reduce Dimensionality?
3
[Figure: forward selection on the Iris data using a single feature; the chosen feature (F4) is marked.]
7
Iris data: Add one more feature to F4
[Figure: each remaining feature paired with F4; the chosen pair is marked.]
8
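The two figures above illustrate greedy forward selection: start with the single best feature, then repeatedly add the feature that most improves performance. A minimal sketch on the Iris data, assuming scikit-learn's k-NN classifier and cross-validated accuracy as the selection criterion (the classifier and scorer are illustrative choices, not prescribed by the slides):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def forward_select(X, y, n_features):
    """Greedy forward selection: at each step add the feature that most
    improves cross-validated accuracy on the currently selected subset."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_features:
        scores = {f: cross_val_score(KNeighborsClassifier(),
                                     X[:, selected + [f]], y, cv=5).mean()
                  for f in remaining}
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
        print(f"chose feature index {best} (accuracy {scores[best]:.3f})")
    return selected

X, y = load_iris(return_X_y=True)
forward_select(X, y, n_features=2)   # the slides' F4 corresponds to column index 3 (petal width)
```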
Principal Components Analysis
9
z = W^T (x − m)
where the columns of W are the eigenvectors of the covariance matrix Σ and m is the sample mean.
Centers the data at the origin and rotates the axes to align with the directions of maximum variance.
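A minimal sketch of this projection, assuming X is a NumPy array with one sample per row (the function and variable names are illustrative):

```python
import numpy as np

def pca_project(X, k):
    """Project each row of X onto the top-k principal components: z = W^T (x - m)."""
    m = X.mean(axis=0)                      # sample mean
    Xc = X - m                              # center the data at the origin
    cov = np.cov(Xc, rowvar=False)          # sample covariance matrix (d x d)
    eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric eigendecomposition, ascending order
    order = np.argsort(eigvals)[::-1]       # sort directions by decreasing variance
    W = eigvecs[:, order[:k]]               # columns of W = top-k eigenvectors of the covariance
    return Xc @ W                           # z^t = W^T (x^t - m) for every sample
```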
How to choose k ?
12
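One common heuristic, sketched below under the assumption that the eigenvalues are sorted in decreasing order, is to pick the smallest k whose proportion of variance (λ1 + … + λk) / (λ1 + … + λd) reaches a threshold such as 0.9 (the threshold is an illustrative choice):

```python
import numpy as np

def choose_k(eigvals, threshold=0.9):
    """Smallest k such that the top-k eigenvalues explain at least `threshold`
    of the total variance; eigvals is assumed sorted in decreasing order."""
    pov = np.cumsum(eigvals) / np.sum(eigvals)   # proportion of variance for k = 1..d
    return int(np.searchsorted(pov, threshold) + 1)
```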
Factor Analysis
18
[Figure: in PCA the new variables z are computed from the observed x; in factor analysis latent factors z are assumed to generate the observed x.]
Multidimensional Scaling (MDS): given the pairwise distances between N points, place the points on a low-dimensional map z = g(x | θ) so that the distances are preserved, by minimizing the stress

E(θ | X) = Σ_{r,s} ( ‖g(x^r | θ) − g(x^s | θ)‖ − ‖x^r − x^s‖ )² / Σ_{r,s} ‖x^r − x^s‖²
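The stress above is usually minimized numerically. A simpler, closely related variant that fits in a few lines is classical (Torgerson) MDS, which recovers coordinates directly from the distance matrix by eigendecomposition; the sketch below assumes D is an N × N matrix of Euclidean distances:

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (Torgerson) MDS: k-dimensional coordinates from pairwise distances D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n              # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                      # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:k]            # largest eigenvalues first
    scale = np.sqrt(np.maximum(eigvals[order], 0))   # clip small negatives due to noise
    return eigvecs[:, order] * scale                 # N x k map coordinates
```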
Map of Europe by MDS
22
Fisher's Linear Discriminant (two classes): find the projection direction w that maximizes

J(w) = (m1 − m2)² / (s1² + s2²)

where, with r^t = 1 if x^t belongs to class 1 and 0 otherwise,

m1 = Σ_t w^T x^t r^t / Σ_t r^t
s1² = Σ_t (w^T x^t − m1)² r^t

and m2, s2² are defined analogously for class 2.
23
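A minimal sketch of the two-class solution, using the standard closed form w ∝ S_W^{-1}(m_1 − m_2) (derived on the next slides); X1 and X2 are assumed to be NumPy arrays holding the samples of each class, one row per sample:

```python
import numpy as np

def fisher_lda_direction(X1, X2):
    """Fisher's discriminant direction for two classes: w proportional to S_W^{-1} (m_1 - m_2)."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = (X1 - m1).T @ (X1 - m1)          # within-class scatter of class 1
    S2 = (X2 - m2).T @ (X2 - m2)          # within-class scatter of class 2
    Sw = S1 + S2
    w = np.linalg.solve(Sw, m1 - m2)      # maximizes J(w) = (m1 - m2)^2 / (s1^2 + s2^2)
    return w / np.linalg.norm(w)          # the scale of w does not change J(w)
```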
Between-class scatter (m_1, m_2 are the class mean vectors, so m1 = w^T m_1 and m2 = w^T m_2):

(m1 − m2)² = (w^T m_1 − w^T m_2)²
           = w^T (m_1 − m_2)(m_1 − m_2)^T w
           = w^T S_B w,  where S_B = (m_1 − m_2)(m_1 − m_2)^T

Within-class scatter:

s1² = Σ_t (w^T x^t − m1)² r^t
    = Σ_t w^T (x^t − m_1)(x^t − m_1)^T w r^t = w^T S_1 w,
where S_1 = Σ_t r^t (x^t − m_1)(x^t − m_1)^T
For K > 2 classes:

Within-class scatter:
S_i = Σ_t r_i^t (x^t − m_i)(x^t − m_i)^T
S_W = Σ_{i=1..K} S_i

Between-class scatter:
S_B = Σ_{i=1..K} N_i (m_i − m)(m_i − m)^T,  where m = (1/K) Σ_{i=1..K} m_i

Find W that maximizes J(W) = |W^T S_B W| / |W^T S_W W|

The solution is given by the largest eigenvectors of S_W^{-1} S_B; S_B has rank at most K − 1.
27
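A sketch of the multi-class projection, assuming X is an N × d NumPy array and y holds integer class labels; the columns of W are the leading eigenvectors of S_W^{-1} S_B, of which at most K − 1 are meaningful:

```python
import numpy as np

def lda_project(X, y, k):
    """Multi-class LDA: project X onto the top-k eigenvectors of S_W^{-1} S_B."""
    classes = np.unique(y)
    d = X.shape[1]
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    m = means.mean(axis=0)                                   # mean of the class means
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for mi, c in zip(means, classes):
        Xc = X[y == c]
        Sw += (Xc - mi).T @ (Xc - mi)                        # within-class scatter
        Sb += len(Xc) * np.outer(mi - m, mi - m)             # between-class scatter
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(eigvals.real)[::-1][:k]               # k should be at most K - 1
    W = eigvecs[:, order].real
    return X @ W                                             # z^t = W^T x^t for every sample
```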
PCA vs LDA
28
Canonical Correlation Analysis
29
[Figure: Optdigits data plotted in two dimensions; each point is drawn as its digit class label (0–9), axes roughly −150 to 150.]
33
Locally Linear Embedding
34
Given x^r, find its neighbors x_(s)^r and the reconstruction weights W_rs that minimize

E(W | X) = Σ_r ‖ x^r − Σ_s W_rs x_(s)^r ‖²

then, keeping the weights fixed, find the new coordinates z^r that minimize

E(z | W) = Σ_r ‖ z^r − Σ_s W_rs z_(s)^r ‖²
35
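A minimal from-scratch sketch of these two steps, assuming X is an N × d NumPy array (the neighborhood size and regularization constant are illustrative choices):

```python
import numpy as np

def lle(X, n_neighbors=10, n_components=2, reg=1e-3):
    """Locally Linear Embedding: reconstruct each point from its neighbors,
    then find low-dimensional coordinates that preserve the same weights."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)     # pairwise squared distances
    np.fill_diagonal(d2, np.inf)
    neighbors = np.argsort(d2, axis=1)[:, :n_neighbors]     # k nearest neighbors of each x^r

    # Step 1: weights W_rs minimizing E(W | X) = sum_r ||x^r - sum_s W_rs x_(s)^r||^2
    W = np.zeros((n, n))
    for r in range(n):
        Z = X[neighbors[r]] - X[r]                           # neighbors centered on x^r
        G = Z @ Z.T                                          # local Gram matrix
        G += reg * np.trace(G) * np.eye(n_neighbors)         # regularize for stability
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[r, neighbors[r]] = w / w.sum()                     # weights sum to 1

    # Step 2: coordinates z^r minimizing E(z | W) = sum_r ||z^r - sum_s W_rs z_(s)^r||^2
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    eigvals, eigvecs = np.linalg.eigh(M)                     # ascending eigenvalues
    return eigvecs[:, 1:n_components + 1]                    # drop the constant eigenvector
```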
LLE on Optdigits
36
[Figure: LLE embedding of the Optdigits data in two dimensions; each point is drawn as its digit class label (0–9), horizontal axis roughly −3.5 to 1.5.]