STA 412 Markov chain
Markov Chains (It only matters where you are, not where you’ve been…)
➢ A Markov chain is a special type of stochastic process in which the probability of the next
state, conditional on the entire sequence of states up to the current one, depends only on the
current state.
➢ Markov chains were introduced in 1906 by the Russian mathematician Andrei Andreyevich
Markov (1856–1922) and are named in his honor. Markov was interested in understanding the
behavior of random processes, and he developed the theory of Markov chains in the early 20th
century as a way to model such processes.
Markov chains are often used to model systems that exhibit memoryless behavior, where, given
the present state, the system's future behavior is not influenced by its past.
In mathematical terms, the definition can be expressed as follows:
Definition
➢ A discrete-time stochastic process X = {X0, X1, X2, · · · } with a countable state space is a
Markov chain if
P(Xt+1 = j | Xt = i, Xt−1 = it−1, . . . , X0 = i0) = P(Xt+1 = j | Xt = i)
for all times t ≥ 0 and all states j, i, i0, . . . , it−1 for which the conditioning event has
positive probability.
➢ In words, a Markov Chain is a stochastic process such that given the value of the current
state, the distribution of the next state is independent of the past.
Definition:
The state of a Markov chain at time t is the value of Xt.
For example, if Xt = 6, we say the process is in state 6 at time t.
Definition:
The state space of a Markov chain, S, is the set of values that each Xt can take.
For example, S = {1, 2, 3, 4, 5, 6, 7}.
Definition: A trajectory of a Markov chain is a particular set of values for X0, X1, X2, . . ..
For example,
if X0 = 1, X1 = 5, and X2 = 6,
then the trajectory up to time t = 2 is 1, 5, 6.
More generally, if we refer to the trajectory s0, s1, s2, s3, . . ., we mean that
X0 = s0, X1 = s1, X2 = s2, X3 = s3, . . .
Markov Property
• The basic property of a Markov chain is that only the most recent point in the trajectory
affects what happens next.
This is called the Markov Property.
It means that Xt+1 depends upon Xt, but, given Xt, it does not depend upon Xt−1, . . . , X1, X0.
➢ A process which has the Markov property is said to be a Markov chain. If the state space is
finite, the process is called a finite Markov chain. In attempting to model any real system,
it is important to consider whether the Markov property is likely to hold.
The Transition Matrix
➢ The matrix describing the Markov chain is called the transition matrix (also the state
transition matrix or the transition probability matrix).
➢ For a chain with N states, the transition matrix P = (pij) is the N × N matrix whose entry
in the ith row and jth column is the transition probability
pij = P(Xt+1 = j | Xt = i).
➢ Assuming the states are 1, 2, ⋯ , n, the state transition matrix is given by
        ( p11  p12  ⋯  p1n )
   P =  ( p21  p22  ⋯  p2n )
        (  ⋮    ⋮        ⋮  )
        ( pn1  pn2  ⋯  pnn )
Note that pij ≥ 0, and for all i we have
∑j pij = 1   (each row of P sums to 1).
P is called the one-step transition matrix.
➢ Any transition probability matrix P = (pij) that satisfies both of these conditions is called
a STOCHASTIC MATRIX.
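As a quick illustration (a minimal sketch, not part of the lecture notes; the particular matrix
and the use of numpy are assumptions for demonstration), the following Python snippet checks
both stochastic-matrix conditions:

import numpy as np

# An illustrative 2-state transition matrix (rows index the current state)
P = np.array([[1/6, 5/6],
              [1/3, 2/3]])

# Condition 1: every entry is non-negative
print((P >= 0).all())                     # True
# Condition 2: every row sums to 1
print(np.allclose(P.sum(axis=1), 1.0))    # True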
Note
We also define πi to be the probability that the chain is in state i at time 0; that is,
P(X0 = i) = πi.
The vector π = [π1, π2, ⋯ , πn] is called the initial probability distribution of the Markov
chain.
Example: What is the transition probability matrix P of the stochastic process in Example 3?
Letting index 1 correspond to Lekki and index 2 correspond to Maryland, from the description
of the problem we have
p11 = P(Xt+1 = Lekki = 1 | Xt = Lekki = 1) = 1/6
p12 = P(Xt+1 = Maryland = 2 | Xt = Lekki = 1) = 5/6
p21 = P(Xt+1 = Lekki = 1 | Xt = Maryland = 2) = 1/3
p22 = P(Xt+1 = Maryland = 2 | Xt = Maryland = 2) = 2/3
Therefore the transition probability matrix is given by
   P = ( 1/6  5/6 )
       ( 1/3  2/3 )
The transition matrix can also be displayed as a transition diagram
State Transition Diagram:
A Markov chain is usually shown by a state transition diagram.
Example: For a Markov chain with transition matrix
   P = ( 0.6  0.4 )
       ( 0.3  0.7 )
the corresponding transition diagram is drawn below.
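A short simulation sketch (illustrative Python with numpy, not from the lecture; states are
relabelled 0 and 1 for indexing) shows the Markov property in action: each step draws the next
state using only the current state's row of P.

import numpy as np

rng = np.random.default_rng(seed=1)

# Transition matrix from the example above (states relabelled 0 and 1)
P = np.array([[0.6, 0.4],
              [0.3, 0.7]])

def simulate(P, x0, steps):
    # Each step uses only the current state's row -- the Markov property
    trajectory = [x0]
    x = x0
    for _ in range(steps):
        x = rng.choice(len(P), p=P[x])
        trajectory.append(x)
    return trajectory

print(simulate(P, x0=0, steps=10))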
Example: Consider a Markov chain with three possible states 1, 2, and 3 and the following
transition probabilities
   P = ( 1/4  1/2  1/4 )
       ( 1/3   0   2/3 )
       ( 1/2   0   1/2 )
The figure below shows the state transition diagram for the above Markov chain. In this diagram,
there are three possible states 1, 2, and 3, and the arrows from each state to other states show the
transition probabilities pij. When there is no arrow from state i to state j, it means that pij=0.
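To connect the matrix to its diagram, a small sketch (illustrative Python, assuming numpy)
lists exactly the arrows of the diagram, one for each positive entry pij:

import numpy as np

P = np.array([[1/4, 1/2, 1/4],
              [1/3, 0.0, 2/3],
              [1/2, 0.0, 1/2]])

# One arrow i -> j for every positive entry; p_ij = 0 means no arrow
for i in range(3):
    for j in range(3):
        if P[i, j] > 0:
            print(f"{i + 1} -> {j + 1}: {P[i, j]:.3f}")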
Distribution of Xt
Let {X0, X1, X2, . . .} be a Markov chain with state space S = {1, 2, . . . , N}. Each Xt is a
random variable, so it has a probability distribution. We can write the probability distribution
of Xt as an N × 1 vector.
For example, consider X0. Let π be the N × 1 vector denoting the probability distribution of X0:
π = (π1, π2, . . . , πN)ᵀ, where πi = P(X0 = i).
Probability distribution of X1: by the Partition Rule, conditioning on X0,
P(X1 = j) = ∑i P(X1 = j | X0 = i) P(X0 = i) = ∑i πi pij,
so the distribution of X1 is given by the row vector πᵀP.
Probability distribution of X2: using the Partition Rule as before, conditioning again on X0,
P(X2 = j) = ∑i P(X2 = j | X0 = i) P(X0 = i) = ∑i πi (P²)ij,
so the distribution of X2 is πᵀP².
Theorem
The distribution of Xt is πᵀPᵗ. Taking one step in the Markov chain corresponds to multiplying
by P on the right.
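As a sketch of this theorem (illustrative Python; the uniform initial distribution is an
assumption chosen purely for demonstration), one step of the chain is a vector–matrix product
on the right:

import numpy as np

P = np.array([[1/6, 5/6],
              [1/3, 2/3]])

pi = np.array([0.5, 0.5])                     # distribution of X0 (assumed uniform)

dist_X1 = pi @ P                              # pi^T P
dist_X2 = pi @ np.linalg.matrix_power(P, 2)   # pi^T P^2
print(dist_X1)    # distribution of X1
print(dist_X2)    # distribution of X2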
Example
Suppose the entire cola industry produces only two colas. Given that a person last purchased
cola A, there is a 90% chance that her next purchase will be cola A. Given that a person last
purchased cola B, there is an 80% chance that her next purchase will be cola B.
(i) If a person is currently a cola B purchaser, what is the probability that she will
purchase cola A two purchases from now?
(ii) If a person is currently a cola A purchaser, what is the probability that she will
purchase cola A three purchases from now?
Solution
State 1 = person has last purchased cola A
State 2 = person has last purchased cola B
   P² = ( 0.90  0.10 ) ( 0.90  0.10 )  =  ( 0.83  0.17 )
        ( 0.20  0.80 ) ( 0.20  0.80 )     ( 0.34  0.66 )
(i) P(X2 = 1 | X0 = 2) = P21^(2) = 0.34 = 34%
We may obtain this answer in a different way
P21^(2) = (prob. that 1st purchase is cola B and 2nd is cola A) + (prob. that 1st purchase is
cola A and 2nd is cola A)
= P22 P21 + P21 P11 = (0.80)(0.20) + (0.20)(0.90) = 0.34
(ii) P³ = P(P²) = ( 0.90  0.10 ) ( 0.83  0.17 )  =  ( 0.781  0.219 )
                  ( 0.20  0.80 ) ( 0.34  0.66 )     ( 0.438  0.562 )
Therefore, P11^(3) = P(X3 = 1 | X0 = 1) = 0.781 = 78.1%
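The matrix powers above are easy to verify numerically; this minimal Python check (numpy
assumed) reproduces both answers:

import numpy as np

P = np.array([[0.90, 0.10],    # row/column 0 = cola A
              [0.20, 0.80]])   # row/column 1 = cola B

P2 = np.linalg.matrix_power(P, 2)
P3 = np.linalg.matrix_power(P, 3)

print(P2[1, 0])    # P(X2 = A | X0 = B) = 0.34
print(P3[0, 0])    # P(X3 = A | X0 = A) = 0.781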
Example
Given the transition matrix:
Suppose X0 has distribution (1/3, 1/3, 1/3). (i) Find the probability distribution of X1.
(ii) Find the probability distribution of X2.
FURTHER READING: Introduction to Stochastic Processes (the first text cited in Lecture 1),
Markov Chains.
Go through each example, provide step-by-step solutions, and explain the concepts.