
ECE 50024 / STAT 598 Machine Learning I

Lecture 8
Linear Discriminant Function

A linear discriminant function has the form

    g(x) = w^T x + w_0,    where w ∈ R^d and w_0 ∈ R.
[Figure: the line w^T x + w_0 = 0 in the (x_1, x_2) plane, with normal vector w.]

The decision boundary {x : w^T x + w_0 = 0} is called the separating hyperplane. If x_1 and x_2 are two points on the hyperplane, then

    w^T x_1 + w_0 = 0   and   w^T x_2 + w_0 = 0,

so subtracting gives

    w^T (x_1 − x_2) = 0.

Since x_1 − x_2 is an arbitrary direction within the hyperplane, w is orthogonal to the hyperplane.
Distance from a point to the hyperplane. Given a point x_0, let x_p be its projection onto the hyperplane:

    x_p = argmin_x ||x − x_0||   subject to   w^T x + w_0 = 0.

Decompose x_0 as

    x_0 = x_p + r · w / ||w||,

where r is the signed distance from x_0 to the hyperplane. Then

    g(x_0) = w^T x_0 + w_0
           = w^T x_p + w_0 + r · w^T w / ||w||
           = r ||w||,

because w^T x_p + w_0 = 0 and w^T w = ||w||². Therefore

    r = g(x_0) / ||w||,

the same as the previous derivation.
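To make this concrete, here is a minimal NumPy sketch (the function names are my own, not from the lecture) that computes the signed distance r = g(x_0)/||w|| and the projection x_p:

import numpy as np

def signed_distance(w, w0, x):
    # r = g(x) / ||w||, where g(x) = w^T x + w_0.
    return (w @ x + w0) / np.linalg.norm(w)

def project_onto_hyperplane(w, w0, x):
    # x_p = x - r * w/||w||: the closest point to x on {x : w^T x + w_0 = 0}.
    r = signed_distance(w, w0, x)
    return x - r * w / np.linalg.norm(w)

# Example: the line x_1 + x_2 - 1 = 0 in R^2.
w, w0 = np.array([1.0, 1.0]), -1.0
x0 = np.array([2.0, 2.0])
print(signed_distance(w, w0, x0))          # 3/sqrt(2) ≈ 2.1213
xp = project_onto_hyperplane(w, w0, x0)
print(xp, w @ xp + w0)                     # [0.5 0.5], and g(x_p) = 0

Note that r is positive on the side of the hyperplane that w points toward and negative on the other side.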

[Figure: two datasets in the (x_1, x_2) plane; one is linearly separable, the other is not.]
Separating Hyperplane Theorem. Let C_1 and C_2 be two closed convex sets such that C_1 ∩ C_2 = ∅. Then there exists a linear function

    g(x) = w^T x + w_0

such that

    g(x) > 0 for all x ∈ C_1,   and   g(x) < 0 for all x ∈ C_2.

[Figure: a hyperplane separating two disjoint convex sets; for nonconvex sets such a hyperplane need not exist.]
Linear Discriminant Analysis

Generative methods: build a model of the data, then derive the separating boundary from the model.
Discriminative methods: determine the separating boundary directly.

[Figure: a scatter of two classes separated by the line g(x) = w^T x + w_0 = 0.]

Generative Approach

Two classes C_1, C_2. Each class is modeled as a Gaussian distribution.
Likelihood function:   p_{X|Y}(x | i) = N(x | μ_i, Σ_i).
Prior distribution:    p_Y(i) = π_i.

Recall the multivariate Gaussian. For a random vector X ~ N(μ, Σ) in R^d, the density is

    p_X(x) = 1 / sqrt( (2π)^d |Σ| ) · exp( −(1/2) (x − μ)^T Σ^{−1} (x − μ) ).

The mean vector is μ = E[X], with entries E[X_i]. The covariance matrix is

    Σ = E[ (X − μ)(X − μ)^T ],

whose diagonal entries are Var[X_i] and whose off-diagonal entries are Cov[X_i, X_j]. Note: Σ is always positive semi-definite.
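As a quick numeric sanity check of the density formula (my own sketch, assuming SciPy is available), the following evaluates p_X(x) directly and compares it against scipy.stats.multivariate_normal:

import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([0.0, 0.0])

d = len(mu)
diff = x - mu
# Direct evaluation of 1/sqrt((2 pi)^d |Sigma|) * exp(-0.5 diff^T Sigma^{-1} diff).
direct = np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)) \
         / np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
print(direct)                                   # ≈ 0.0163
print(multivariate_normal(mu, Sigma).pdf(x))    # same value, via scipy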
Special Case: Σ = σ² I.

Then the off-diagonal covariances are zero, so X_i and X_j are independent for all i ≠ j, and the density simplifies to

    p_X(x) = 1 / (2πσ²)^{d/2} · exp( −||x − μ||² / (2σ²) ).
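Putting the pieces together, here is a minimal sketch of the generative approach (the function names and the toy data are my own, not from the lecture): fit (μ_i, Σ_i) and a prior π_i per class, then classify x by the larger log posterior log p(x | i) + log π_i.

import numpy as np

def fit_gaussian(X):
    # Estimate the class-conditional Gaussian from samples.
    return X.mean(axis=0), np.cov(X, rowvar=False)

def log_gaussian(x, mu, Sigma):
    # log N(x | mu, Sigma), computed stably via slogdet and solve.
    d = len(mu)
    diff = x - mu
    _, logdet = np.linalg.slogdet(Sigma)
    quad = diff @ np.linalg.solve(Sigma, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)

rng = np.random.default_rng(0)
X1 = rng.normal([0, 0], 1.0, size=(200, 2))   # class 1 samples
X2 = rng.normal([4, 4], 1.0, size=(300, 2))   # class 2 samples

(mu1, S1), (mu2, S2) = fit_gaussian(X1), fit_gaussian(X2)
pi1, pi2 = 200 / 500, 300 / 500               # priors = class frequencies

x = np.array([1.0, 1.0])
score1 = log_gaussian(x, mu1, S1) + np.log(pi1)
score2 = log_gaussian(x, mu2, S2) + np.log(pi2)
print("class 1" if score1 > score2 else "class 2")   # -> class 1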
Appendix

Proof of Separating Hyperplane Theorem

Let x* ∈ C_1 and y* ∈ C_2 be the closest pair of points between the two sets, and set w = x* − y* and x_0 = (x* + y*)/2.

Conjecture: let us see if this is the correct hyperplane:

    g(x) = w^T (x − x_0)
         = (x* − y*)^T ( x − (x* + y*)/2 )
         = (x* − y*)^T x − ( ||x*||² − ||y*||² ) / 2.

According to the picture, we want g(x) > 0 for all x ∈ C_1.

Suppose not. Assume there is some x ∈ C_1 with

    g(x) = (x* − y*)^T x − ( ||x*||² − ||y*||² ) / 2 < 0,

and see if we can find a contradiction.
C_1 is convex. Pick any x ∈ C_1 (x* is in C_1 as well), let 0 ≤ λ ≤ 1, and construct the point

    x_λ = (1 − λ) x* + λ x.

Convexity means x_λ ∈ C_1. Since x* is the point of C_1 closest to y*, we must have

    ||x_λ − y*|| ≥ ||x* − y*||.
The point x_λ lies on the line segment connecting x* and x, so convexity indeed gives x_λ ∈ C_1, and hence ||x_λ − y*|| ≥ ||x* − y*||; if not, something is wrong.

Let us do some algebra:

    ||x_λ − y*||² = ||(1 − λ) x* + λ x − y*||²
                  = ||x* − y* + λ (x − x*)||²
                  = ||x* − y*||² + 2λ (x* − y*)^T (x − x*) + λ² ||x − x*||²
                  = ||x* − y*||² + 2λ w^T (x − x*) + λ² ||x − x*||².

Remember: we assumed w^T (x − x_0) < 0.
Using w^T x < w^T x_0, the cross term can be bounded:

    ||x_λ − y*||² = ||x* − y*||² + 2λ w^T (x − x*) + λ² ||x − x*||²
                  < ||x* − y*||² + 2λ ( w^T x_0 − w^T x* ) + λ² ||x − x*||²
                  = ||x* − y*||² + 2λ ( (||x*||² − ||y*||²)/2 − w^T x* ) + λ² ||x − x*||²
                  = ||x* − y*||² − λ ||x* − y*||² + λ² ||x − x*||²
                  = ||x* − y*||² − λ (A − λB),

where A = ||x* − y*||² and B = ||x − x*||².
Therefore, if we choose λ such that A − λB > 0, i.e.,

    0 < λ < A/B = ||x* − y*||² / ||x − x*||²,

then −λ (A − λB) < 0, and so

    ||x_λ − y*||² < ||x* − y*||² − λ (A − λB) < ||x* − y*||².

Contradiction, because ||x* − y*||² should be the smallest distance!

Conclusion:
If x ∈ C_1, then g(x) > 0.
By symmetry, if x ∈ C_2, then g(x) < 0.
And we have found the separating hyperplane (w, w_0).
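The proof is constructive: w = x* − y* and x_0 = (x* + y*)/2. The sketch below (my own example with two disjoint balls, where the closest pair can be written in closed form, not from the lecture) checks numerically that the resulting g separates the sets:

import numpy as np

c1, r1 = np.array([0.0, 0.0]), 1.0   # C_1 = ball of radius r1 around c1
c2, r2 = np.array([5.0, 0.0]), 1.5   # C_2 = ball of radius r2 around c2

# For disjoint balls the closest pair lies on the segment joining the centers.
u = (c2 - c1) / np.linalg.norm(c2 - c1)
x_star = c1 + r1 * u                 # closest point of C_1 to C_2
y_star = c2 - r2 * u                 # closest point of C_2 to C_1

w  = x_star - y_star
w0 = -(np.dot(x_star, x_star) - np.dot(y_star, y_star)) / 2
g  = lambda x: w @ x + w0            # g(x) = w^T x - (||x*||^2 - ||y*||^2)/2

print(g(c1) > 0, g(c2) < 0)          # True True: centers on opposite sides
print(g((x_star + y_star) / 2))      # 0: the midpoint x_0 is on the hyperplane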
Q&A 1: What is a convex set?

A set C is convex if the following condition is met: pick any x ∈ C and y ∈ C, and let 0 < λ < 1. If λx + (1 − λ)y is also in C for every such x, y, and λ, then C is convex.

Basically, it says that you can pick any two points and draw the line segment between them. If the segment is also in the set, then the set is convex.

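As an illustration of the definition (my own sketch, not from the lecture), the following randomized test looks for a convex combination that leaves the set; it can refute convexity but never prove it:

import numpy as np

rng = np.random.default_rng(1)

in_ball  = lambda p: np.linalg.norm(p) <= 1.0          # convex
in_donut = lambda p: 0.5 <= np.linalg.norm(p) <= 1.0   # not convex

def looks_convex(inside, trials=10000):
    hits = 0
    while hits < trials:
        x, y = rng.uniform(-1, 1, 2), rng.uniform(-1, 1, 2)
        if not (inside(x) and inside(y)):
            continue                 # need both endpoints in the set
        hits += 1
        lam = rng.uniform(0, 1)
        if not inside(lam * x + (1 - lam) * y):
            return False             # found a segment leaving the set
    return True                      # no counterexample found (not a proof!)

print(looks_convex(in_ball))    # True
print(looks_convex(in_donut))   # False: e.g. midpoint of (-0.75,0), (0.75,0)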
Q&A 2: Is there a way to check whether two sets are linearly separable?

No, at least not that I know of.

The best you can do is to check whether a training set is linearly separable. To do so, solve the hard-margin SVM. If you can solve it with zero training error, then you have found a separating hyperplane. If the hard-margin SVM has no solution, then the training set is not separable.

Checking the testing set is impossible unless you know the distributions of the samples. But if you know the distributions, you can derive a formula to check linear separability. For example, two Gaussians are never linearly separable: no matter how unlikely, you can always find a sample that lands on the wrong side. Uniform distributions, by contrast, can be linearly separable because their supports are bounded.

Bottom line: linear separability, in my opinion, is more of a theoretical tool to describe the intrinsic property of the problem. It is not for computational purposes.
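Here is a sketch of the training-set check (my own variant, not from the lecture: instead of solving the full hard-margin SVM, it solves the equivalent linear feasibility problem y_i (w^T x_i + b) ≥ 1 with scipy.optimize.linprog):

import numpy as np
from scipy.optimize import linprog

def is_linearly_separable(X, y):
    """X: (n, d) samples; y: (n,) labels in {-1, +1}."""
    n, d = X.shape
    # Unknowns z = [w_1, ..., w_d, b]; encode -y_i (x_i^T w + b) <= -1.
    A_ub = -y[:, None] * np.hstack([X, np.ones((n, 1))])
    b_ub = -np.ones(n)
    res = linprog(c=np.zeros(d + 1), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (d + 1))
    return res.success        # feasible <=> linearly separable

rng = np.random.default_rng(2)
X1 = rng.normal([0, 0], 0.5, size=(50, 2))
X2 = rng.normal([3, 3], 0.5, size=(50, 2))
X = np.vstack([X1, X2])
y = np.hstack([-np.ones(50), np.ones(50)])
print(is_linearly_separable(X, y))   # True for these well-separated clouds

The zero objective makes this a pure feasibility problem: any strictly separating hyperplane can be rescaled to achieve margin 1, so feasibility is exactly equivalent to separability of the training set.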
Q&A 3: If two sets are not convex, how do I know if they are linearly separable?

You can look at the convex hulls.

A convex hull is the smallest convex set that contains the original set. If the two convex hulls do not overlap, then the sets are linearly separable.

For additional information about convex sets and convex hulls, you can check Chapter 2 of
https://web.stanford.edu/class/ee364a/lectures.html

