Introduction To Probability Theory: Rong Jin
Outline
Basic concepts in probability theory
Bayes rule
Random variable and distributions
Definition of Probability
Experiment: toss a coin twice
Sample space: possible outcomes of an experiment
S = {HH, HT, TH, TT}
Event: a subset of possible outcomes
A={HH}, B={HT, TH}
Probability of an event: a number assigned to an event, Pr(A)
Axiom 1: Pr(A) ≥ 0
Axiom 2: Pr(S) = 1
Axiom 3: For every sequence of disjoint events,
Pr(∪_i A_i) = Σ_i Pr(A_i)
Example: Pr(A) = n(A)/N (frequentist statistics)
Joint Probability
For events A and B, joint probability Pr(AB)
stands for the probability that both events
happen.
Example: A={HH}, B={HT, TH}, what is the joint
probability Pr(AB)?
Independence
Two events A and B are independent in case
Pr(AB) = Pr(A)Pr(B)
A set of events {A_i} is independent in case
Pr(∩_i A_i) = Π_i Pr(A_i)
Independence (contd)
Example: Drug test
Women Men
Success 200 1800
Failure 1800 200
A = {Patient is a woman}
B = {Drug fails}
Is event A independent of event B?
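One way to answer: compute Pr(AB) and Pr(A)Pr(B) from the table's counts (assuming the 4000 patients in the table are the whole sample space). A minimal sketch in Python:

```python
# Independence check for A = {patient is a woman}, B = {drug fails},
# using the 2x2 counts from the drug-test table (assumed exhaustive).
counts = {("woman", "success"): 200, ("man", "success"): 1800,
          ("woman", "failure"): 1800, ("man", "failure"): 200}
total = sum(counts.values())  # 4000 patients

p_A = sum(v for (sex, _), v in counts.items() if sex == "woman") / total    # 0.5
p_B = sum(v for (_, res), v in counts.items() if res == "failure") / total  # 0.5
p_AB = counts[("woman", "failure")] / total                                 # 0.45

# Pr(AB) = 0.45 but Pr(A)Pr(B) = 0.25, so A and B are NOT independent.
print(p_AB, p_A * p_B)
```

Since Pr(AB) ≠ Pr(A)Pr(B), gender and drug failure are dependent in this table.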
Independence
Consider the experiment of tossing a coin twice
Example I:
A = {HT, HH}, B = {HT}
Is event A independent of event B?
Example II:
A = {HT}, B = {TH}
Is event A independent of event B?
Disjoint ≠ Independent
If A is independent of B, and B is independent of C, will A be independent of C?
Conditioning
If A and B are events with Pr(A) > 0, the conditional probability of B given A is
Pr(B|A) = Pr(AB) / Pr(A)
Conditioning (contd)
Example: Drug test
         Women   Men
Success    200  1800
Failure   1800   200
A = {Patient is a woman}
B = {Drug fails}
Pr(B|A) = ?
Pr(A|B) = ?
Conditioning (contd)
Given that A is independent of B, what is the relationship between Pr(A|B) and Pr(A)?
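The two conditional probabilities asked for above follow directly from the definition Pr(B|A) = Pr(AB)/Pr(A). A quick computation from the table's counts:

```python
# Conditional probabilities from the drug-test table
# (counts assumed to cover all 4000 patients).
women_fail, women_success = 1800, 200
men_fail, men_success = 200, 1800
total = women_fail + women_success + men_fail + men_success  # 4000

p_A = (women_fail + women_success) / total  # Pr(woman) = 0.5
p_B = (women_fail + men_fail) / total       # Pr(drug fails) = 0.5
p_AB = women_fail / total                   # Pr(woman AND fails) = 0.45

p_B_given_A = p_AB / p_A  # Pr(fails | woman) = 0.9
p_A_given_B = p_AB / p_B  # Pr(woman | fails) = 0.9
print(p_B_given_A, p_A_given_B)
```

Note that if A were independent of B, we would have Pr(A|B) = Pr(AB)/Pr(B) = Pr(A)Pr(B)/Pr(B) = Pr(A); here 0.9 ≠ 0.5, consistent with the dependence seen earlier.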
Which Drug is Better?
Simpson's Paradox: View I
          Drug I  Drug II
Success      219     1010
Failure     1801     1190
A = {Using Drug I}
B = {Using Drug II}
C = {Drug succeeds}
Pr(C|A) ~ 10%
Pr(C|B) ~ 50%
Drug II is better than Drug I
Simpson's Paradox: View II
Female Patients
A = {Using Drug I}, B = {Using Drug II}, C = {Drug succeeds}
Pr(C|A) ~ 20%, Pr(C|B) ~ 5%
Male Patients
A = {Using Drug I}, B = {Using Drug II}, C = {Drug succeeds}
Pr(C|A) ~ 100%, Pr(C|B) ~ 50%
Drug I is better than Drug II
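The paradox can be reproduced numerically. The per-group counts below are illustrative (assumed here, chosen to match the slide's per-group rates of 20%/100% for Drug I and 5%/50% for Drug II); the aggregate comparison then reverses the per-group one:

```python
# Simpson's paradox: per-group success rates favor Drug I,
# but the aggregate rate favors Drug II.
# Counts are illustrative, chosen to match the slide's percentages.
data = {
    ("I",  "female"): (180, 900),   # 180/900  = 20% success
    ("I",  "male"):   (100, 100),   # 100/100  = 100%
    ("II", "female"): (5,   100),   # 5/100    = 5%
    ("II", "male"):   (450, 900),   # 450/900  = 50%
}

def rate(drug, group=None):
    """Success rate for a drug, overall or within one group."""
    cells = [(s, n) for (d, g), (s, n) in data.items()
             if d == drug and (group is None or g == group)]
    successes = sum(s for s, _ in cells)
    patients = sum(n for _, n in cells)
    return successes / patients

print(rate("I"), rate("II"))                      # aggregate: II looks better
print(rate("I", "female"), rate("II", "female"))  # within females: I better
print(rate("I", "male"), rate("II", "male"))      # within males: I better
```

The reversal happens because Drug I was mostly given to the group with low baseline success (females), and Drug II to the group with high baseline success (males).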
Conditional Independence
Event A and B are conditionally independent given
C in case
Pr(AB|C)=Pr(A|C)Pr(B|C)
A set of events {A_i} is conditionally independent given C in case
Pr(∩_i A_i | C) = Π_i Pr(A_i | C)
Conditional Independence (contd)
Example: There are three events: A, B, C
Pr(A) = Pr(B) = Pr(C) = 1/5
Pr(A,C) = Pr(B,C) = 1/25, Pr(A,B) = 1/10
Pr(A,B,C) = 1/125
Are A and B independent?
Are A and B conditionally independent given C?
Independence ≠ conditional independence
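Working through the example with the given probabilities shows the two notions really can disagree:

```python
# Independence vs. conditional independence, using the probabilities
# given in the example above.
p_A = p_B = p_C = 1 / 5
p_AC = p_BC = 1 / 25
p_AB = 1 / 10
p_ABC = 1 / 125

# Unconditional: Pr(AB) = 0.1 but Pr(A)Pr(B) = 0.04 -> NOT independent.
print(p_AB, p_A * p_B)

# Conditional on C: Pr(AB|C) = Pr(ABC)/Pr(C) = 1/25,
# and Pr(A|C)Pr(B|C) = (1/5)(1/5) = 1/25 -> conditionally independent.
p_AB_given_C = p_ABC / p_C
p_A_given_C = p_AC / p_C
p_B_given_C = p_BC / p_C
print(p_AB_given_C, p_A_given_C * p_B_given_C)
```

So A and B are dependent overall, yet become independent once C is given.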
Outline
Important concepts in probability theory
Bayes rule
Random variables and distributions
Bayes Rule
Given two events A and B, suppose that Pr(A) > 0. Then
Pr(B|A) = Pr(AB) / Pr(A) = Pr(A|B) Pr(B) / Pr(A)
Example:
Pr(W|R)     R    ¬R
W         0.7   0.4
¬W        0.3   0.6
R: It is a rainy day
W: The grass is wet
Pr(R) = 0.8
Pr(R|W) = ?
Bayes Rule
Pr(W|R)     R    ¬R
W         0.7   0.4
¬W        0.3   0.6
R: It rains
W: The grass is wet
R → W
Information: Pr(W|R)
Inference: Pr(R|W)
Pr(H|E) = Pr(E|H) Pr(H) / Pr(E)
Bayes Rule
Pr(W|R)     R    ¬R
W         0.7   0.4
¬W        0.3   0.6
R: It rains
W: The grass is wet
Hypothesis H, Evidence E
Information: Pr(E|H)
Inference: Pr(H|E)
Pr(H|E) = Pr(E|H) Pr(H) / Pr(E)
posterior = likelihood × prior / evidence
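The rain/wet-grass question can be answered by combining Bayes rule with the law of total probability, using the numbers from the table:

```python
# Bayes rule for the rain / wet-grass example:
# Pr(W|R) = 0.7, Pr(W|not R) = 0.4, prior Pr(R) = 0.8.
p_W_given_R, p_W_given_notR, p_R = 0.7, 0.4, 0.8

# Evidence via total probability:
# Pr(W) = Pr(W|R)Pr(R) + Pr(W|not R)Pr(not R) = 0.56 + 0.08 = 0.64
p_W = p_W_given_R * p_R + p_W_given_notR * (1 - p_R)

# Posterior: Pr(R|W) = Pr(W|R)Pr(R) / Pr(W) = 0.56 / 0.64 = 0.875
p_R_given_W = p_W_given_R * p_R / p_W
print(p_R_given_W)
```

Seeing wet grass raises the probability of rain from the prior 0.8 to the posterior 0.875.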
Bayes Rule: More Complicated
Suppose that B_1, B_2, ..., B_k form a partition of S:
B_i ∩ B_j = ∅ for i ≠ j, and ∪_i B_i = S
Suppose that Pr(B_i) > 0 and Pr(A) > 0. Then
Pr(B_i|A) = Pr(A|B_i) Pr(B_i) / Pr(A)
          = Pr(A|B_i) Pr(B_i) / Σ_{j=1}^k Pr(AB_j)
          = Pr(A|B_i) Pr(B_i) / Σ_{j=1}^k Pr(B_j) Pr(A|B_j)
Expectation
The expectation of a random variable X:
E[X] = (1/N) Σ_{i=1}^N x_i   (N equally likely outcomes x_i)
E[X] = ∫ x p(x) dx           (continuous X with density p(x))
Linearity: E[X_1 + X_2] = E[X_1] + E[X_2]
Expectation: Example
Let S be the set of all sequences of three rolls of a die.
Let X be the sum of the number of dots on the three
rolls.
What is E(X)?
Let S be the set of all sequences of three rolls of a die.
Let X be the product of the number of dots on the
three rolls.
What is E(X)?
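Both expectations can be computed by brute force over the 6³ = 216 equally likely outcomes:

```python
# E[X] for the sum and the product of three die rolls,
# by enumerating all 216 equally likely sequences.
from itertools import product

rolls = list(product(range(1, 7), repeat=3))  # all (r1, r2, r3) sequences

e_sum = sum(a + b + c for a, b, c in rolls) / len(rolls)   # 3 * 3.5 = 10.5
e_prod = sum(a * b * c for a, b, c in rolls) / len(rolls)  # 3.5**3 = 42.875
print(e_sum, e_prod)
```

The sum illustrates linearity (E[X] = 3 · 3.5 = 10.5); the product equals 3.5³ only because the three rolls are independent.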
Variance
The variance of a random variable X is the expectation of (X − E[X])²:
Var(X) = E((X − E[X])²)
       = E(X² − 2X·E[X] + (E[X])²)
       = E(X²) − 2(E[X])² + (E[X])²
       = E(X²) − (E[X])²
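The identity Var(X) = E[X²] − (E[X])² can be verified numerically, e.g. for a single fair die roll:

```python
# Verify Var(X) = E[X^2] - E[X]^2 for one fair die roll.
xs = range(1, 7)
e_x = sum(xs) / 6                       # E[X] = 3.5
e_x2 = sum(x * x for x in xs) / 6       # E[X^2] = 91/6

var_def = sum((x - e_x) ** 2 for x in xs) / 6  # definition E((X - E[X])^2)
var_id = e_x2 - e_x ** 2                       # identity E[X^2] - E[X]^2
print(var_def, var_id)                         # both equal 35/12
```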
Bernoulli Distribution
The outcome of an experiment can be either success (i.e., 1) or failure (i.e., 0).
Pr(X=1) = p, Pr(X=0) = 1−p, or p(x) = p^x (1−p)^(1−x)
E[X] = p, Var(X) = p(1−p)
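A quick sanity check of the Bernoulli mean and variance from the pmf (p = 0.3 is an arbitrary choice):

```python
# Bernoulli pmf p(x) = p^x (1-p)^(1-x), x in {0, 1}; check E[X] and Var(X).
p = 0.3

def pmf(x):
    return p ** x * (1 - p) ** (1 - x)

mean = sum(x * pmf(x) for x in (0, 1))               # = p
var = sum((x - mean) ** 2 * pmf(x) for x in (0, 1))  # = p(1-p)
print(mean, var)
```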
Binomial Distribution
n draws of a Bernoulli distribution
X_i ~ Bernoulli(p), X = Σ_{i=1}^n X_i, X ~ Bin(p, n)
Random variable X stands for the number of times that experiments are successful.
Pr(X = x) = p(x) = (n choose x) p^x (1−p)^(n−x) for x = 0, 1, ..., n; 0 otherwise
E[X] = np, Var(X) = np(1−p)
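The binomial pmf is easy to check numerically: it should sum to 1 over x = 0, ..., n, and its mean should come out to np (n = 10, p = 0.3 are arbitrary choices):

```python
# Binomial pmf via math.comb; check normalization and mean E[X] = np.
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3
total = sum(binom_pmf(x, n, p) for x in range(n + 1))     # should be 1.0
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))  # should be n*p = 3.0
print(total, mean)
```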
Poisson Distribution
Pr(X = x) = p(x) = λ^x e^(−λ) / x!  for x = 0, 1, 2, ...; 0 otherwise
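The Poisson pmf arises as the limit of Bin(p, n) with p = λ/n as n grows; a sketch comparing the two (λ = 2 and n = 10000 are arbitrary choices):

```python
# Poisson pmf, and the binomial approximation Bin(lam/n, n) -> Poisson(lam).
from math import comb, exp, factorial

lam = 2.0

def poisson_pmf(x):
    return lam ** x * exp(-lam) / factorial(x)

n = 10_000  # large n, small success probability lam/n

def binom_pmf(x):
    return comb(n, x) * (lam / n) ** x * (1 - lam / n) ** (n - x)

for x in range(5):
    # the two pmfs agree closely for large n
    print(x, poisson_pmf(x), binom_pmf(x))
```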