SP14 CS188 Lecture 13 - Markov Models
§ Preparation page up
§ Topics: Lectures 1 through 11 (inclusive)
§ Past exams
§ Special midterm 1 office hours
§ Practice Midterm 1
§ Optional
§ One point of EC on Midterm 1 for completing
§ Due: Saturday 3/10 at 11:59pm
AI Outside of 188: Angry Birds Competition
§ http://www.aibirds.org
CS 188: Artificial Intelligence
Markov Models
Instructors: Dan Klein and Pieter Abbeel --- University of California, Berkeley
[These slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]
Independence
§ Two variables X and Y are independent in a joint distribution if P(X, Y) = P(X)P(Y), i.e., for all x, y: P(x, y) = P(x)P(y)
§ Says the joint distribution factors into a product of two simpler distributions
§ Usually variables aren’t independent!
P(T):
T     P
hot   0.5
cold  0.5

P(W):
W     P
sun   0.6
rain  0.4

P(T, W) (the actual joint):
T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3

P(T)P(W) (product of the marginals):
T     W     P
hot   sun   0.3
hot   rain  0.2
cold  sun   0.3
cold  rain  0.2

Since P(hot, sun) = 0.4 but P(hot)P(sun) = 0.5 × 0.6 = 0.3, T and W are not independent in this joint distribution.
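A minimal sketch of this check in Python (my own illustration, not part of the slides): compute the marginals from the joint and compare their product against each joint entry.

from itertools import product

# Joint distribution P(T, W) from the slide.
joint = {
    ("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
    ("cold", "sun"): 0.2, ("cold", "rain"): 0.3,
}

# Marginalize out each variable: P(T) and P(W).
p_t, p_w = {}, {}
for (t, w), p in joint.items():
    p_t[t] = p_t.get(t, 0.0) + p
    p_w[w] = p_w.get(w, 0.0) + p

# Independent iff P(t, w) == P(t) * P(w) for every assignment.
independent = all(
    abs(joint[(t, w)] - p_t[t] * p_w[w]) < 1e-9
    for t, w in product(p_t, p_w)
)
print(independent)  # False: P(hot, sun) = 0.4 but P(hot)P(sun) = 0.3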
Example: Independence
§ N fair, independent coin flips: P(X1, X2, ..., Xn) = P(X1) P(X2) ... P(Xn), so the 2^n-entry joint factors into n two-entry tables (each P(Xi) is H: 0.5, T: 0.5)
§ Equivalent statements of "Toothache is conditionally independent of Catch given Cavity":
§ P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
§ P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
§ One can be derived from the other easily; see the derivation sketched below
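For instance, deriving the second statement from the first takes one application of the product rule (a standard two-line derivation, spelled out here; it is not reproduced from the slides):

\begin{aligned}
P(\text{Toothache}, \text{Catch} \mid \text{Cavity})
  &= P(\text{Toothache} \mid \text{Catch}, \text{Cavity})\, P(\text{Catch} \mid \text{Cavity}) && \text{(product rule)} \\
  &= P(\text{Toothache} \mid \text{Cavity})\, P(\text{Catch} \mid \text{Cavity}) && \text{(first statement)}
\end{aligned}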
Conditional Independence
§ Unconditional (absolute) independence very rare (why?)
Probability Recap
§ Conditional probability: P(x | y) = P(x, y) / P(y)
§ Product rule: P(x, y) = P(x | y) P(y)
§ Chain rule: P(X_1, X_2, \ldots, X_n) = P(X_1) P(X_2 | X_1) P(X_3 | X_1, X_2) \cdots = \prod_{i=1}^{n} P(X_i | X_1, \ldots, X_{i-1})
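A quick worked instance using the weather table above: conditioning the joint on T = hot gives

P(sun | hot) = P(hot, sun) / P(hot) = 0.4 / 0.5 = 0.8,

and the product rule runs the same computation in reverse: P(hot, sun) = P(sun | hot) P(hot) = 0.8 × 0.5 = 0.4.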
Reasoning over Time or Space
X1 → X2 → X3 → X4
§ Joint distribution:
P(X_1, X_2, X_3, X_4) = P(X_1) P(X_2 | X_1) P(X_3 | X_2) P(X_4 | X_3)
§ More generally:
P(X_1, X_2, \ldots, X_T) = P(X_1) P(X_2 | X_1) P(X_3 | X_2) \cdots P(X_T | X_{T-1}) = P(X_1) \prod_{t=2}^{T} P(X_t | X_{t-1})
§ Questions to be resolved:
§ Does this indeed define a joint distribution?
§ Can every joint distribution be factored this way, or are we making some assumptions about the joint distribution by using this factorization?
Chain Rule and Markov Models
X1 → X2 → X3 → X4
§ From the chain rule, every joint distribution over X_1, X_2, X_3, X_4 can be written as:
P(X_1, X_2, X_3, X_4) = P(X_1) P(X_2 | X_1) P(X_3 | X_1, X_2) P(X_4 | X_1, X_2, X_3)
§ Assuming that X_3 ⊥⊥ X_1 | X_2 and X_4 ⊥⊥ X_1, X_2 | X_3 results in the factorization from the previous slide:
P(X_1, X_2, X_3, X_4) = P(X_1) P(X_2 | X_1) P(X_3 | X_2) P(X_4 | X_3)
§ More generally, from the chain rule, every joint distribution over X_1, X_2, \ldots, X_T can be written as:
P(X_1, X_2, \ldots, X_T) = P(X_1) \prod_{t=2}^{T} P(X_t | X_1, X_2, \ldots, X_{t-1})
§ Assuming that for all t: X_t ⊥⊥ X_1, \ldots, X_{t-2} | X_{t-1} (the Markov assumption) gives:
P(X_1, X_2, \ldots, X_T) = P(X_1) \prod_{t=2}^{T} P(X_t | X_{t-1})
Implied Conditional Independencies
§ We assumed: X_3 ⊥⊥ X_1 | X_2 and X_4 ⊥⊥ X_1, X_2 | X_3
§ Do we also have X_1 ⊥⊥ X_3, X_4 | X_2?
§ Yes!
§ Proof:

\begin{aligned}
P(X_1 \mid X_2, X_3, X_4)
  &= \frac{P(X_1, X_2, X_3, X_4)}{P(X_2, X_3, X_4)} \\
  &= \frac{P(X_1)\, P(X_2 \mid X_1)\, P(X_3 \mid X_2)\, P(X_4 \mid X_3)}{\sum_{x_1} P(x_1)\, P(X_2 \mid x_1)\, P(X_3 \mid X_2)\, P(X_4 \mid X_3)} \\
  &= \frac{P(X_1)\, P(X_2 \mid X_1)}{\sum_{x_1} P(x_1)\, P(X_2 \mid x_1)} \qquad \text{(the last two factors cancel)} \\
  &= \frac{P(X_1, X_2)}{P(X_2)} = P(X_1 \mid X_2)
\end{aligned}
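This identity is easy to verify numerically. A minimal check (my own sketch, not from the slides): build the full joint of a four-step chain from arbitrary numbers and compare P(X1 | X2, X3, X4) against P(X1 | X2).

from itertools import product

states = ["sun", "rain"]
p0 = {"sun": 0.7, "rain": 0.3}                          # arbitrary P(X1)
trans = {("sun", "sun"): 0.9, ("sun", "rain"): 0.1,     # arbitrary P(Xt | Xt-1),
         ("rain", "sun"): 0.3, ("rain", "rain"): 0.7}   # keyed (x_{t-1}, x_t)

# Joint over (x1, x2, x3, x4) under the Markov factorization.
joint = {xs: p0[xs[0]] * trans[xs[0], xs[1]] * trans[xs[1], xs[2]] * trans[xs[2], xs[3]]
         for xs in product(states, repeat=4)}

def p_x1_given(given):
    """P(X1 = sun | the assignments in `given`, indexed 1..3 for X2..X4)."""
    num = sum(p for xs, p in joint.items()
              if xs[0] == "sun" and all(xs[i] == v for i, v in given.items()))
    den = sum(p for xs, p in joint.items()
              if all(xs[i] == v for i, v in given.items()))
    return num / den

# Conditioning on X3 and X4 does not change the answer:
print(p_x1_given({1: "rain", 2: "sun", 3: "sun"}))  # P(X1=sun | X2=rain, X3, X4)
print(p_x1_given({1: "rain"}))                      # P(X1=sun | X2=rain) -- same value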
Markov Models Recap
§ Explicit assumption for all t: X_t ⊥⊥ X_1, \ldots, X_{t-2} | X_{t-1}
§ CPT P(X_t | X_{t-1}): two new ways of representing the same CPT, a state-transition diagram over {rain, sun} and the table below

X_{t-1}   X_t    P(X_t | X_{t-1})
sun       sun    0.9
sun       rain   0.1
rain      sun    0.3
rain      rain   0.7
X1 → X2 → X3 → X4
Mini-Forward Algorithm
§ Question: what is P(X) on some day t?

P(x_t) = \sum_{x_{t-1}} P(x_{t-1}, x_t) = \sum_{x_{t-1}} P(x_t | x_{t-1}) P(x_{t-1})
Forward simulation
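A minimal Python sketch of this update (my own illustration; the transition numbers follow the weather CPT above):

# Transition model P(X_t | X_{t-1}) for the weather chain above.
trans = {"sun": {"sun": 0.9, "rain": 0.1},
         "rain": {"sun": 0.3, "rain": 0.7}}

def forward_step(prior):
    """One mini-forward update: P(x_t) = sum over x_{t-1} of P(x_t | x_{t-1}) P(x_{t-1})."""
    return {x: sum(trans[xp][x] * p for xp, p in prior.items()) for x in trans}

# From an initial observation of sun:
dist = {"sun": 1.0, "rain": 0.0}
for t in range(2, 7):
    dist = forward_step(dist)
    print(f"P(X{t}) =", dist)

Starting from all-rain instead drives the loop to the same limit (with these numbers, sun: 0.75, rain: 0.25), which is the point of the next slide.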
Example Run of Mini-Forward Algorithm
§ From initial observation of sun
[charts: P(X1), P(X2), P(X3), P(X4), ..., P(X∞)]
§ From initial observation of rain
[charts: P(X1), ..., P(X∞)]
§ Both runs converge to the same P(X∞)
[Demo: L13D1,2,3]
Video of Demo Ghostbusters Basic Dynamics
Video of Demo Ghostbusters Circular Dynamics
Video of Demo Ghostbusters Whirlpool Dynamics
Stationary Distributions
§ Stationary distribution: the distribution P∞ that forward simulation converges to; it satisfies P∞(X) = P∞+1(X) = \sum_{x} P(X | x) P∞(x)
Application of Stationary Distributions: Web Link Analysis
§ PageRank over a web graph: each web page is a state, and transitions usually follow a random outlink from the current page (with a small probability of jumping to a uniformly random page instead)
§ The stationary distribution of this chain ranks pages:
§ Will spend more time on highly reachable pages
§ E.g. many ways to get to the Acrobat Reader download page
§ Somewhat robust to link spam
§ Google 1.0 returned the set of pages containing all your keywords in decreasing rank; now all search engines use link analysis along with many other factors (rank is actually getting less important over time)
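A minimal sketch of this computation (my own illustration, not from the slides): power iteration on a made-up three-page graph, with a hypothetical jump probability of 0.15.

# Tiny web graph: page -> outlinks (a made-up example).
links = {"A": ["B"], "B": ["C"], "C": ["A", "B"]}
pages = list(links)
damping = 0.85  # probability of following an outlink rather than jumping anywhere

# Start uniform, then repeatedly apply the chain's transition (power iteration).
rank = {p: 1.0 / len(pages) for p in pages}
for _ in range(100):
    rank = {p: (1 - damping) / len(pages)
               + damping * sum(rank[q] / len(links[q]) for q in pages if p in links[q])
            for p in pages}
print(rank)  # stationary distribution: B is the most reachable page here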
Application of Stationary Distributions: Gibbs Sampling*
§ Transitions: with probability 1/n, resample variable X_j according to P(X_j | x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n, e_1, \ldots, e_m)
§ Stationary distribution: the conditional distribution P(X_1, X_2, \ldots, X_n | e_1, \ldots, e_m)
§ Means that when running Gibbs sampling long enough we get a sample from the desired distribution
§ Requires some proof to show this is true!
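A minimal Gibbs sampler over two binary variables (my own sketch; the joint table is made up, and with no evidence the stationary distribution is simply the joint itself):

import random

# Made-up joint P(X1, X2); long-run Gibbs samples should match this table.
joint = {(0, 0): 0.30, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.40}

def resample(j, state):
    """Draw X_j from P(X_j | the other variable), read off the joint."""
    weights = []
    for v in (0, 1):
        s = list(state); s[j] = v
        weights.append(joint[tuple(s)])
    return 0 if random.random() < weights[0] / (weights[0] + weights[1]) else 1

state, counts = [0, 0], {k: 0 for k in joint}
for step in range(100_000):
    j = random.randrange(2)       # pick a variable with probability 1/n
    state[j] = resample(j, state)
    if step >= 1_000:             # discard burn-in, then tally
        counts[tuple(state)] += 1
total = sum(counts.values())
print({k: round(v / total, 3) for k, v in counts.items()})  # approx. the joint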
Next Time: Hidden Markov Models!