Probability Theorem Proofs
The axioms of Kolmogorov. Let $S$ denote an event set with a probability measure $P$
defined over it, such that the probability of any event $A \subset S$ is given by $P(A)$. Then the
probability measure obeys the following axioms:
(1) $P(A) \ge 0$,
(2) $P(S) = 1$,
(3) If $\{A_1, A_2, \ldots, A_j, \ldots\}$ is a sequence of mutually exclusive events such that $A_i \cap A_j = \emptyset$
for all $i \ne j$, then $P(A_1 \cup A_2 \cup \cdots \cup A_j \cup \cdots) = P(A_1) + P(A_2) + \cdots + P(A_j) + \cdots$.
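As a concrete illustration, the three axioms can be checked by brute force on a small finite event set. The sketch below assumes a fair six-sided die with a uniform measure; the die, the function `P` and the helper `all_events` are illustrative choices that do not come from the notes, and only the finite case of Axiom 3 (two disjoint events at a time) is tested.

```python
from fractions import Fraction
from itertools import combinations

# Assumption for illustration: the event set S is a fair six-sided die.
S = frozenset(range(1, 7))

def P(A):
    """Uniform probability measure: P(A) = |A| / |S| for A a subset of S."""
    A = frozenset(A)
    if not A <= S:
        raise ValueError("A must be a subset of S")
    return Fraction(len(A), len(S))

def all_events(space):
    """Return every subset of the space, i.e. every event."""
    items = sorted(space)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

events = all_events(S)

# Axiom 1: probabilities are non-negative.
axiom1 = all(P(A) >= 0 for A in events)
# Axiom 2: the whole event set has probability one.
axiom2 = P(S) == 1
# Axiom 3 (finite case only): additivity over mutually exclusive events.
axiom3 = all(P(A | B) == P(A) + P(B)
             for A in events for B in events if not (A & B))

print(axiom1, axiom2, axiom3)  # True True True
```

Exact rational arithmetic (`Fraction`) is used so that the equalities are tested without floating-point rounding.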
The axioms are supplemented by two definitions:
(4) The conditional probability of $A$ given $B$ is defined by
$$P(A|B) = \frac{P(A \cap B)}{P(B)}.$$
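The definition of conditional probability can be tried out numerically. The following is a minimal sketch on an assumed fair die; the events `even` and `high` and the function names `P` and `cond` are hypothetical and chosen only for illustration. The guard against $P(B) = 0$ is an addition, since the ratio is undefined in that case.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # assumption: a fair six-sided die

def P(A):
    """Uniform measure on S; elements outside S are ignored."""
    return Fraction(len(frozenset(A) & S), len(S))

def cond(A, B):
    """P(A|B) = P(A ∩ B) / P(B); undefined when P(B) = 0."""
    A, B = frozenset(A), frozenset(B)
    if P(B) == 0:
        raise ZeroDivisionError("P(B) = 0, so P(A|B) is undefined")
    return P(A & B) / P(B)

even = {2, 4, 6}   # the roll is even
high = {4, 5, 6}   # the roll is at least 4

p = cond(even, high)
print(p)  # 2/3
```

Here $P(\text{even} \cap \text{high}) = 2/6$ and $P(\text{high}) = 3/6$, so the ratio is $2/3$.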
The rules of Boolean Algebra. The binary operations of union $\cup$ and intersection $\cap$ are
roughly analogous, respectively, to the arithmetic operations of addition $+$ and multiplication
$\times$, and they obey a similar set of laws which have the status of axioms:
Commutative law: $A \cup B = B \cup A$, $A \cap B = B \cap A$,
Associative law: $(A \cup B) \cup C = A \cup (B \cup C)$, $(A \cap B) \cap C = A \cap (B \cap C)$,
Distributive law: $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$,
Idempotency law: $A \cup A = A$, $A \cap A = A$.
De Morgan's rule: $(A \cup B)^c = A^c \cap B^c$.
Amongst other useful results are those concerning the null set $\emptyset$ and the universal set $S$:
(i) $A \cup A^c = S$,
(ii) $A \cap A^c = \emptyset$,
(iii) $A \cup S = S$,
(iv) $A \cap S = A$,
(v) $A \cup \emptyset = A$,
(vi) $A \cap \emptyset = \emptyset$.
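The laws of set algebra listed above can be spot-checked exhaustively on a small universal set. The sketch below assumes a three-element set $S$ and uses Python's built-in `frozenset` operators (`|` for union, `&` for intersection); the names `complement` and `check_laws` are illustrative. Such a check is not a proof, but it catches a mis-stated law quickly.

```python
from itertools import combinations, product

S = frozenset({1, 2, 3})   # assumption: a small universal set
EMPTY = frozenset()        # the null set

def complement(A):
    """Complement of A relative to the universal set S."""
    return S - A

# All 8 subsets of S.
subsets = [frozenset(c) for r in range(len(S) + 1)
           for c in combinations(sorted(S), r)]

def check_laws():
    """Assert every listed law for all triples of subsets of S."""
    for A, B, C in product(subsets, repeat=3):
        # Commutative, associative, distributive and idempotency laws.
        assert A | B == B | A and A & B == B & A
        assert (A | B) | C == A | (B | C)
        assert (A & B) & C == A & (B & C)
        assert A & (B | C) == (A & B) | (A & C)
        assert A | (B & C) == (A | B) & (A | C)
        assert A | A == A and A & A == A
        # De Morgan's rule.
        assert complement(A | B) == complement(A) & complement(B)
        # Results (i) to (vi) on the null set and the universal set.
        assert A | complement(A) == S
        assert A & complement(A) == EMPTY
        assert A | S == S and A & S == A
        assert A | EMPTY == A and A & EMPTY == EMPTY
    return True

laws_hold = check_laws()
print(laws_hold)  # True
```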
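LEMMA: the probability of the complementary event. If $A$ and $A^c$ are complementary events, then
$$P(A^c) = 1 - P(A).$$
Proof. There are
$$A \cup A^c = S \quad \text{and} \quad A \cap A^c = \emptyset.$$
Therefore, by Axiom 3,
$$P(A \cup A^c) = P(A) + P(A^c) = 1,$$
where the final equality holds because $P(A \cup A^c) = P(S) = 1$ by Axiom 2. Rearranging gives $P(A^c) = 1 - P(A)$.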
LEMMA: the probability of the null event. The probability of the null event is
$P(\emptyset) = 0$.
Proof. Axiom 3 implies that
$$P(S \cup \emptyset) = P(S) + P(\emptyset),$$
since $S$ and $\emptyset$ are disjoint sets by definition, i.e. $S \cap \emptyset = \emptyset$. But also $S \cup \emptyset = S$, so
$$P(S \cup \emptyset) = P(S) = 1,$$
where the second equality is from Axiom 2. Therefore,
$$P(S \cup \emptyset) = P(S) + P(\emptyset) = P(S) = 1,$$
so $P(\emptyset) = 0$.
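Both lemmas can be confirmed numerically on a finite example. The sketch below again assumes a fair die with a uniform measure (an illustrative choice, not part of the notes) and checks the complement rule for every event together with the probability of the null event.

```python
from fractions import Fraction
from itertools import combinations

S = frozenset(range(1, 7))  # assumption: a fair six-sided die

def P(A):
    """Uniform measure on S."""
    return Fraction(len(frozenset(A) & S), len(S))

# Every subset of S is an event.
events = [frozenset(c) for r in range(len(S) + 1)
          for c in combinations(sorted(S), r)]

# Complement lemma: P(A^c) = 1 - P(A) for every event A.
complement_rule = all(P(S - A) == 1 - P(A) for A in events)

# Null-event lemma: P(empty set) = 0.
null_probability = P(frozenset())

print(complement_rule, null_probability)  # True 0
```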
THEOREM: the union of events. The probability that either $A$ or $B$ will happen or
that both will happen is the probability of $A$ happening plus the probability of $B$ happening
less the probability of the joint occurrence of $A$ and $B$:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B).$$
Proof. There is $A \cup (B \cap A^c) = (A \cup B) \cap (A \cup A^c) = A \cup B$, which is to say that $A \cup B$ can
be expressed as the union of two disjoint sets, $A$ and $B \cap A^c$. Therefore, according to Axiom 3, there is
$$P(A \cup B) = P(A) + P(B \cap A^c).$$
But $B = B \cap (A \cup A^c) = (B \cap A) \cup (B \cap A^c)$ is also the union of two disjoint sets, so there
is also
$$P(B) = P(B \cap A) + P(B \cap A^c) \implies P(B \cap A^c) = P(B) - P(B \cap A).$$
Substituting the latter expression into the one above gives
$$P(A \cup B) = P(A) + P(B) - P(A \cap B).$$
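The two decompositions used in the proof, and the final formula, can be traced on a small example. The sketch below assumes a fair die with $A$ the even rolls and $B$ the rolls of at least 4; these events and the names `step1` and `step2` are hypothetical choices made for illustration.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # assumption: a fair six-sided die

def P(A):
    """Uniform measure on S."""
    return Fraction(len(frozenset(A) & S), len(S))

A = frozenset({2, 4, 6})    # the roll is even
B = frozenset({4, 5, 6})    # the roll is at least 4
Ac = S - A                  # complement of A

# First step of the proof: A ∪ B is the disjoint union of A and B ∩ A^c.
step1 = P(A | B) == P(A) + P(B & Ac)
# Second step: B is the disjoint union of B ∩ A and B ∩ A^c.
step2 = P(B) == P(B & A) + P(B & Ac)

# The theorem itself.
lhs = P(A | B)
rhs = P(A) + P(B) - P(A & B)

print(step1, step2, lhs, rhs)  # True True 2/3 2/3
```

Here $A \cup B = \{2, 4, 5, 6\}$ has probability $4/6 = 2/3$, which agrees with $1/2 + 1/2 - 1/3$.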
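Observe that the formula for conditional probability implies that
$$P(A \cap B) = P(A|B)P(B) = P(B|A)P(A),$$
whence we obtain
$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}.$$
Writing $H_i$ for a hypothesis and $E$ for the evidence, the same rule reads
$$P(H_i|E) = \frac{P(E|H_i)P(H_i)}{P(E)}.$$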
A Bayesian Problem. The dawn train collects any milk churns that were left on the
platform of Worplesham station on weekdays. Churns are left on three of the five days. A
man arrives at the station not knowing the exact time and thinking that there is a fifty-fifty
chance that he has missed the train. Then he notices that there are no milk churns on the
platform. How should he reassess the chances that he has missed the train?
Answer: Let $T$ be the event that the train has already passed and let $N$ be the event of
there being no milk churns on the platform when the man arrives. We have
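$$P(T|N) = \frac{P(N|T)P(T)}{P(N)},$$
where
$$P(N) = P(N|T)P(T) + P(N|T^c)P(T^c).$$
We take
$$P(N|T) = 1, \quad P(T) = P(T^c) = \frac{1}{2} \quad \text{and} \quad P(N|T^c) = \frac{2}{5}.$$
Then
$$P(N) = 1 \cdot \frac{1}{2} + \frac{2}{5} \cdot \frac{1}{2} = \frac{7}{10}$$
and
$$P(T|N) = 1 \cdot \frac{1}{2} \cdot \frac{10}{7} = \frac{5}{7}.$$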
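The arithmetic of the answer can be reproduced with exact fractions. The sketch below takes its inputs from the problem statement (a fifty-fifty prior, churns left on three days in five, and a train that collects every churn); the variable names are illustrative.

```python
from fractions import Fraction

p_T = Fraction(1, 2)                # prior: the train has already passed
p_Tc = 1 - p_T                      # prior: the train has not yet passed
p_N_given_T = Fraction(1)           # the train collects any churns
p_N_given_Tc = 1 - Fraction(3, 5)   # churns are left on 3 days out of 5

# Law of total probability for the event N (no churns on the platform).
p_N = p_N_given_T * p_T + p_N_given_Tc * p_Tc

# Bayes' rule for the revised chance that the train has passed.
p_T_given_N = p_N_given_T * p_T / p_N

print(p_N, p_T_given_N)  # 7/10 5/7
```

The absence of churns therefore raises the man's assessment that he has missed the train from 1/2 to 5/7.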