Module 2 Material


The classical definition of probability

If there are m outcomes in a sample space, and all are equally likely to be the
result of an experimental measurement, then the probability of observing an event
that contains s outcomes is given by s/m

e.g. Probability of drawing an ace from a deck of 52 cards.


The sample space consists of 52 outcomes.
The desired event (ace) is a set of 4 outcomes (number of desired outcomes is 4).
Therefore the probability of drawing an ace is 4/52 = 1/13 ≈ 0.0769 (7.69%)

e.g. There are 10 motors, two of which do not work and eight of which do.
a) What is the probability of picking 2 working motors?
s = 8!/(2!6!) = 28, m = 10!/(2!8!) = 45, so P = 28/45
b) What is the probability of picking 1 working and 1 non-working motor?
s = 8 · 2 = 16, m = 10!/(2!8!) = 45, so P = 16/45
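These counts can be checked with Python's `math.comb` (a quick verification sketch added here; it is not part of the original notes):

```python
from math import comb

# 10 motors: 8 work, 2 do not; pick 2 at random.
total = comb(10, 2)                            # 45 equally likely pairs
p_two_working = comb(8, 2) / total             # both picks from the 8 working
p_one_each = comb(8, 1) * comb(2, 1) / total   # one working, one non-working

print(p_two_working)   # 28/45 ≈ 0.6222
print(p_one_each)      # 16/45 ≈ 0.3556
```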
Mathematically, in defining probabilities of events we are deriving a set function on a
sample space. A set function assigns to each subset in the sample space a real number.

Example: Consider the set function that assigns to each subset (event) A the number
N(A) of outcomes in the set. This set function is additive, that is, if two events A and B
have no outcomes in common (are mutually exclusive), then N(A ∪ B) = N(A) + N(B).

Counter-example: Fig. 3.7

Fig. 3.7. Measurements on 500 machine parts
I = incompletely assembled, D = defective, S = satisfactory

N(I ∪ D) ≠ N(I) + N(D), as I and D are not mutually exclusive
The axioms of probability
Let S be a finite sample space and A an event in S. We define P(A), the probability of A, to
be the value of an additive set function P(·) that satisfies the following three conditions:
Axiom 1. 0 ≤ P(A) ≤ 1 for each event A in S
(probabilities are real numbers on the interval [0, 1])
Axiom 2. P(S) = 1
(the probability of some event occurring from S is unity)
Axiom 3. If A and B are mutually exclusive events in S, then
P(A ∪ B) = P(A) + P(B)
(the probability function is an additive set function)

Note: these axioms do not tell us what the set function P(A) is, only what properties
it must satisfy

The classical definition of probability defines the probability function as

P(A) = N(A)/N(S) for any event A in the sample space S

Note that this definition satisfies all three axioms.
Elementary properties of probability functions

Theorem 3.4. If A1, A2, …, An are mutually exclusive events in a sample space S, then by
induction on Axiom 3,
P(A1 ∪ A2 ∪ … ∪ An) = P(A1) + P(A2) + … + P(An)

Theorem 3.7. If A is an event in S, then
P(A′) = 1 − P(A)
Proof: P(A) + P(A′) = P(A ∪ A′) = P(S) = 1

Example: drawing 5 cards with at least one spade (event A)

N(A) is very difficult to count, but N(A′) is easier:

P(A) = 1 − P(A′) = 1 − C(39, 5)/C(52, 5) ≈ 0.778
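The complement-rule calculation can be reproduced directly (an added sketch, not part of the original notes):

```python
from math import comb

# P(at least one spade in 5 cards) via the complement rule.
p_no_spade = comb(39, 5) / comb(52, 5)   # all 5 cards drawn from the 39 non-spades
p_at_least_one = 1 - p_no_spade

print(round(p_at_least_one, 4))  # ≈ 0.7785
```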
Theorem 3.6. If A and B are any (not necessarily mutually exclusive) events in S, then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Example: Find the probability of drawing 5 cards with exactly one spade or exactly one club.

A: drawing 5 cards with exactly one spade; B: drawing 5 cards with exactly one club.

N(A) = C(13, 1)·C(39, 4), N(B) = C(13, 1)·C(39, 4), N(A ∩ B) = C(13, 1)·C(13, 1)·C(26, 3)

P(A ∪ B) = [C(13, 1)·C(39, 4) + C(13, 1)·C(39, 4) − C(13, 1)·C(13, 1)·C(26, 3)] / C(52, 5)
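As a numerical check of the general addition rule on this example (added illustration; the "exactly one" reading follows the binomial coefficients above):

```python
from math import comb

# General addition rule for "exactly one spade" or "exactly one club" in 5 cards.
n_s = comb(52, 5)
n_a = comb(13, 1) * comb(39, 4)                  # exactly one spade
n_b = comb(13, 1) * comb(39, 4)                  # exactly one club
n_ab = comb(13, 1) * comb(13, 1) * comb(26, 3)   # one spade, one club, 3 from the rest

p_union = (n_a + n_b - n_ab) / n_s
print(round(p_union, 4))
```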
[Fig. 3.7 Venn diagram: I only = 20, D only = 5, I ∩ D = 10, remainder (satisfactory) = 465]

Fig. 3.7. Measurements on 500 machine parts
I = incompletely assembled, D = defective, S = satisfactory

Probability of picking an unsatisfactory part, i.e. I ∪ D:

P(I ∪ D) = P(I) + P(D) − P(I ∩ D)
         = 30/500 + 15/500 − 10/500 = 35/500 = 0.07
Conditional Probability.
The probability of an event is only meaningful if we know the sample space S under
consideration.
The probability that you are the tallest person changes if we are discussing being
the tallest person in your family, or the tallest person in this class.
This is clarified using the notation P(A|S), the conditional probability of event A
relative to the sample space S.
(When S is understood we simply use P(A))

e.g. (using classical probability)

From Fig. 3.7,
P(D) = P(D|S) = N(D)/N(S) = (10 + 5)/500 = 3/100 = 0.03

P(D|I) = N(D ∩ I)/N(I) = 10/30 = 1/3 ≈ 0.333

Note:
P(D|I) = [N(D ∩ I)/N(S)] / [N(I)/N(S)] = P(D ∩ I)/P(I)
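The Fig. 3.7 counts make this concrete (an added sketch, not part of the original notes; the variable names are illustrative):

```python
# Counts from Fig. 3.7 (500 machine parts).
n_total = 500
n_i = 30      # incompletely assembled (20 only-I plus 10 in the overlap)
n_d = 15      # defective (5 only-D plus 10 in the overlap)
n_di = 10     # both defective and incompletely assembled

p_d = n_d / n_total          # P(D) = 0.03
p_d_given_i = n_di / n_i     # P(D|I) = 1/3

# Equivalent form: P(D|I) = P(D ∩ I)/P(I)
assert abs(p_d_given_i - (n_di / n_total) / (n_i / n_total)) < 1e-12
print(p_d, round(p_d_given_i, 3))
```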
Conditional Probability.
If A and B are any events in S and P(B) ≠ 0, the conditional probability of A relative to
B (often stated 'A given B') is

P(A|B) = P(A ∩ B)/P(B)    (CP)

From the definition of conditional probability (CP) we see that

P(B|A) = P(B ∩ A)/P(A)    (∗)
Since P(B ∩ A) = P(A ∩ B), we have from (CP) and (∗)

Theorem 3.8. If A and B are events in S, then

P(A ∩ B) = P(A)·P(B|A) if P(A) ≠ 0 (from ∗)
P(A ∩ B) = P(B)·P(A|B) if P(B) ≠ 0 (from CP)
Theorem 3.8 is the general multiplication rule of probability
Theorem 3.8 is a rather shocking statement. The definition (CP) of conditional
probability implies that we compute P(A|B) by knowing P(A∩B) and P(B).
However, Theorem 3.8 implies we can compute P(A∩B) by knowing P(A|B) and P(B).
This implies that we often have another means at hand for computing P(A|B) rather
than the definition (CP) !! (See next example)

The Venn diagram on conditional probability
e.g. Use of the general multiplication rule
20 workers, 12 are For, 8 are Against. What is the probability of randomly picking 2
workers who are Against? (Assume classical probability.)
There are 4 classes of outcomes for the 2 picks: FF, FA, AF, AA.
[Venn diagram of the sample space S of all ordered 2-picks: regions AF, AA, FA, FF,
with sets A and B overlapping in AA]
Set A: all outcomes where the first worker is Against
Set B: all outcomes where the second worker is Against
We want P(A ∩ B) = P(A)·P(B|A)

P(A) = the probability that the first pick is Against
     = the probability of picking one Against from the 20 workers
     = N(Against)/N(workers) = 8/20

P(B|A) = the probability that the second pick is Against, given that the first pick is Against
       = the probability of picking one Against from 19 workers (1 Against removed)
       = N(Against)/N(workers) = 7/19

Therefore P(A ∩ B) = (8/20) · (7/19) = 14/95
Check by classical calculation of probability

[Same Venn diagram with counts: N(AA) = 8·7, N(AF) = 8·12, N(FA) = 12·8,
N(FF) = 12·11, N(S) = 20·19]

P(A) = (8·19)/(20·19) = 8/20

P(B|A) = (8·7)/(8·19) = 7/19

P(A ∩ B) = N(A ∩ B)/N(S) = (8·7)/(20·19) = 14/95
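The multiplication-rule answer can also be verified by brute-force enumeration of all ordered picks (an added check, not part of the original notes):

```python
from fractions import Fraction
from itertools import permutations

# 20 workers: 12 For ('F'), 8 Against ('A'); enumerate all ordered 2-picks.
workers = ['F'] * 12 + ['A'] * 8
both_against = sum(1 for i, j in permutations(range(20), 2)
                   if workers[i] == 'A' and workers[j] == 'A')
total = 20 * 19   # number of ordered 2-picks

print(Fraction(both_against, total))   # 14/95
```

Exact rational arithmetic with `Fraction` avoids any floating-point doubt in the comparison.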
If we find that P(A|B) = P(A), then we state that event A is independent of event B.
We will see that event A is independent of event B iff event B is independent of event A.
It is therefore customary to state that A and B are independent events.

Theorem 3.9.
Two events A and B are independent events iff P(A ∩ B) = P(A)·P(B)
Proof:
(→) If A and B are independent, that is, P(B|A) = P(B),
then, by Theorem 3.8,
P(A ∩ B) = P(A)·P(B|A) = P(A)·P(B)
(←) If P(A ∩ B) = P(A)·P(B),
then, by the definition of conditional probability,
P(A|B) = P(A ∩ B)/P(B) = P(A)·P(B)/P(B) = P(A)

Theorem 3.9 is the special product rule of probability and states that the probability
that two independent events will both occur is the product of the probabilities that
each alone will occur.
Example: probability of getting two heads in two flips of a balanced coin
(Assumption: balance implies that the two flips are independent)
Therefore P(A ∩ B) = P(A)·P(B)
[Venn diagram: A = first flip is heads, B = second flip is heads, in S]

Example: probability of selecting two aces at random from a deck of cards if the first card
is replaced before the second card is drawn
(Assumption: replacing the first card returns the deck to its original condition,
making the two draws independent of each other)
Therefore P(A ∩ B) = P(A)·P(B)
[Venn diagram: A = first draw is an ace, B = second draw is an ace, in S]
Example: probability of selecting two aces at random from a deck of cards if the first card
is not replaced before the second card is drawn
(Picking the second card is now dependent on the first card chosen)
Therefore P(A ∩ B) = P(A)·P(B|A)
[Venn diagram: A = first draw is an ace, B = second draw is an ace, in S]
Example: (false positives) 1% probability of getting a false reading on a test.
Assuming that each test is independent of the others:
(a) probability of two tests both receiving an accurate reading
(0.99)^2 = 0.9801
(b) probability of 1 test per week for 2 years all receiving accurate readings
(0.99)^104 ≈ 0.35 (!) (a 65% chance that 1 or more of the 104 tests gives a false reading)
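The special product rule applied 104 times gives this surprising result (a quick check added here, not part of the original notes):

```python
# Probability that n independent tests, each 99% accurate, all read correctly.
p_all_ok_2 = 0.99 ** 2       # two tests: 0.9801
p_all_ok_104 = 0.99 ** 104   # one test per week for two years

print(round(p_all_ok_104, 2))  # ≈ 0.35, i.e. a 65% chance of at least one bad reading
```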
Example: redundancy in message transmission to reduce transmission errors.
Probability p that a 0 → 1 or 1 → 0 error occurs in transmission.

sent   possible     probability      read   probability of reading
       reception    of reception     as
111    111          (1 − p)^3        111
       110          p(1 − p)^2       111
       101          p(1 − p)^2       111    (1 − p)^2 (1 + 2p)
       011          p(1 − p)^2       111
       001          p^2 (1 − p)      000
       010          p^2 (1 − p)      000    p^2 (3 − 2p)
       100          p^2 (1 − p)      000
       000          p^3              000

                              p = 0.01   p = 0.02   p = 0.05
Prob. of reading correctly:
  triple mode                 0.9997     0.9988     0.9928
  single mode                 0.99       0.98       0.95
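The table rows follow from the closed form (1 − p)^2(1 + 2p), since a triple-redundant bit is read correctly when at most one of the three transmitted bits flips (a verification sketch added here, not part of the original notes):

```python
# Probability of a correct read under triple redundancy vs. a single bit.
def triple_mode(p):
    # Correct if at most one of the three bits flips:
    # (1-p)^3 + 3*p*(1-p)^2, which simplifies to (1-p)^2 * (1+2p).
    return (1 - p) ** 2 * (1 + 2 * p)

for p in (0.01, 0.02, 0.05):
    print(p, round(triple_mode(p), 4), 1 - p)
```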
Theorem 3.8 shows that P(A|B) and P(B|A) are related, specifically:

P(B|A) = P(B)·P(A|B)/P(A)    (B)

Remember,
P(A|B) is the ratio of the probability of event A ∩ B to the probability of event B, and
P(B|A) is the ratio of the probability of event A ∩ B to the probability of event A.
Therefore, to go from P(A|B) to P(B|A) one has to apply a correction, multiplying by
the probability of B and dividing by the probability of A.

[Venn diagram: sets A and B with overlap A ∩ B in S, with arrows marking P(A|B) and P(B|A)]
In the figure, probabilities are represented by area. P(B|A) is larger than P(A|B) by the
area fraction P(B)/P(A).
Example: each year about 1/1000 people develop lung cancer. Suppose a
diagnostic method is 99.9 percent accurate (if you have cancer, there is a 99.9
percent chance of being diagnosed as positive; if you don't have cancer, there is a
0.1 percent chance of being diagnosed as positive). If you are diagnosed as positive
for lung cancer, what is the probability that you really have cancer?

Solution: Let A be the event that you do not have lung cancer, and B the event that
you are diagnosed positive. If you are diagnosed positive, what is the probability of
being healthy? That is

P(A|B) = [P(A)/P(B)]·P(B|A)

P(A) = 0.999, P(B|A) = 0.001, P(B) = ?

P(B) = P(B|A)·P(A) + P(B|A′)·P(A′) = 0.001 × 0.999 + 0.999 × 0.001 = 0.001998

Substituting into the calculation:

P(A|B) = (0.999 × 0.001)/0.001998 = 0.5

So even after a positive diagnosis, the probability of actually having cancer is only
1 − 0.5 = 0.5.


[Venn diagram: sets A and B with overlap A ∩ B]

False positive: P(A|B) = P(A ∩ B)/P(B) = N(A ∩ B)/N(B)
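The arithmetic of this example can be laid out in a few lines (an added sketch, not part of the original notes; the variable names are illustrative):

```python
# Bayes computation for the lung-cancer false-positive example.
p_healthy = 0.999             # P(A)
p_cancer = 0.001              # P(A')
p_pos_given_healthy = 0.001   # P(B|A), false-positive rate
p_pos_given_cancer = 0.999    # P(B|A'), true-positive rate

# Total probability of testing positive.
p_pos = p_pos_given_healthy * p_healthy + p_pos_given_cancer * p_cancer

# Probability of being healthy despite a positive test.
p_healthy_given_pos = p_pos_given_healthy * p_healthy / p_pos

print(round(p_pos, 6), round(p_healthy_given_pos, 3))
```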
The evolution of thinking
The relation
P(B|A) = P(B)·P(A|B)/P(A)
is a specific example of Bayes' theorem. On the right-hand side we have the
(conditional) probability of getting outcome A considered as part of event B (having
occurred). On the left-hand side, we have the probability of getting outcome B
considered as part of event A (having occurred).
This can be diagrammed as follows:

[Diagram: S split on A with weight P(A), branch P(B|A) leading to A ∩ B;
S split on B with weight P(B), branch P(A|B) leading to the same A ∩ B]
or more completely ….
Partition all outcomes into those with and without property A, and then
subpartition those into ones with and without property B:

P(A|B)·P(B) = P(A ∩ B) = P(B|A)·P(A)

Partition all outcomes into those with and without property B, and then
subpartition those into ones with and without property A:

P(A|B)·P(B) = P(A ∩ B) = P(B|A)·P(A)

[Tree diagrams: first split on A/A′ with branches B|A, B|A′; second split on
B/B′ with branches A|B, A|B′; both trees reach the same cell A ∩ B]
Bayes' result can be generalized.
Consider three mutually exclusive events, B1, B2, and B3, one of which must occur.
e.g. the Bi are supply companies of voltage regulators to a single manufacturer.
Let A be the event that a voltage regulator works satisfactorily. This might be
diagrammed as follows for a hypothetical manufacturer:

[Tree diagram over S, all voltage regulators:
P(B1) = 0.6, P(A|B1) = 0.95 → A ∩ B1
P(B2) = 0.3, P(A|B2) = 0.80 → A ∩ B2
P(B3) = 0.1, P(A|B3) = 0.65 → A ∩ B3]

P(B1) = 0.6 is the probability of getting a regulator from company B1
P(A|B1) = 0.95 is the probability of getting a working regulator from company B1

Choosing at random from all regulators, what is the probability of getting a
working regulator? (i.e. what is P(A)?)
[Venn diagram: S = all voltage regulators, partitioned among suppliers B1, B2, B3;
A = working voltage regulators cuts across all three]

From the diagram we see
A = A ∩ (B1 ∪ B2 ∪ B3) = (A ∩ B1) ∪ (A ∩ B2) ∪ (A ∩ B3)

As B1, B2, and B3 are mutually exclusive,
so are A ∩ B1, A ∩ B2, and A ∩ B3 (see diagram).

By Theorem 3.4,
P(A) = P(A ∩ B1) + P(A ∩ B2) + P(A ∩ B3)

By Theorem 3.8,
P(A) = P(B1)·P(A|B1) + P(B2)·P(A|B2) + P(B3)·P(A|B3)

Theorem 3.10. If B1, B2, …, Bn are mutually exclusive events of which one must
occur, then for any event A,
P(A) = Σ P(Bi)·P(A|Bi), summed over i = 1, …, n
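For the voltage-regulator numbers, the rule of total probability works out as follows (an added numerical sketch, not part of the original notes):

```python
# Rule of total probability for the voltage-regulator example.
p_b = [0.6, 0.3, 0.1]             # supplier shares P(B1), P(B2), P(B3)
p_a_given_b = [0.95, 0.80, 0.65]  # per-supplier working rates P(A|Bi)

p_a = sum(pb * pa for pb, pa in zip(p_b, p_a_given_b))
print(round(p_a, 3))  # 0.875
```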
Example: each year about 1/1000 people develop lung cancer. Suppose a
diagnostic method is 99.9 percent accurate (if you have cancer, there is a 99.9
percent chance of being diagnosed as positive; if you don't have cancer, there is a
0.1 percent chance of being diagnosed as positive). If you are diagnosed as positive
for lung cancer, what is the probability that you really have cancer?

Now let's change the question: what is the probability that a person is tested
positive?

Solution: If a person is sick, there is a probability of 0.999 of being tested positive;
if a person is healthy, there is a probability of 0.001 of being tested positive.

A: a person is healthy; A′: a person has lung cancer.
B: a person is tested positive.

P(A) = 0.999; P(A′) = 0.001; P(B|A) = 0.001; P(B|A′) = 0.999.

P(B) = P(A)·P(B|A) + P(A′)·P(B|A′) = 0.999 × 0.001 + 0.001 × 0.999 = 0.001998
Theorem 3.10 expresses the probability of event A in terms of the probabilities that
each of the constituent events Bi contributes to event A.
(i.e. in our example, P(A) is expressed in terms of the probabilities that each
constituent Bi provided a working regulator)

Suppose we want to know the probability that a working regulator came from a
particular supplier Bi.
e.g. suppose we wanted to know P(B3|A)
By the definition of conditional probability,
P(B3|A) = P(B3 ∩ A)/P(A)
        = P(A ∩ B3)/P(A)
        = P(B3)·P(A|B3)/P(A)                                      (Theorem 3.8)
        = P(B3)·P(A|B3) / [Σ P(Bi)·P(A|Bi), i = 1, …, 3]          (Theorem 3.10)

From the tree diagram,
P(B3|A) = (0.1 · 0.65)/(0.6 · 0.95 + 0.3 · 0.80 + 0.1 · 0.65) = 0.074

The probability that a regulator comes from B3 is 0.1.
The probability that a working regulator came from B3 is P(B3|A) = 0.074.
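The posterior for supplier B3 can be computed in a couple of lines (an added check, not part of the original notes):

```python
# Bayes' theorem: probability that a working regulator came from supplier B3.
p_b = [0.6, 0.3, 0.1]             # prior supplier shares
p_a_given_b = [0.95, 0.80, 0.65]  # per-supplier working rates

p_a = sum(pb * pa for pb, pa in zip(p_b, p_a_given_b))   # total probability, 0.875
p_b3_given_a = p_b[2] * p_a_given_b[2] / p_a

print(round(p_b3_given_a, 3))  # ≈ 0.074
```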
The generalization of this three-set example is Bayes' theorem.

Theorem 3.11. If B1, B2, …, Bn are mutually exclusive events of which one must occur,
then
P(Br|A) = P(Br)·P(A|Br) / [Σ P(Bi)·P(A|Bi), i = 1, …, n]
for r = 1, 2, …, n

The numerator in Bayes' theorem is the probability of achieving event A through the r-th
branch of the tree.
The denominator is the sum of all probabilities of achieving event A.
e.g. Janet (𝐵1 ) handles 20% of the breakdowns in a computer system
Tom (𝐵2 ) handles 60%
Georgia (𝐵3 ) handles 15%
Peter (𝐵4 ) handles 5%
Janet makes an incomplete repair 1 time in 20 (i.e. 5% of the time)
Tom: 1 time in 10 (10%)
Georgia: 1 time in 10 (10%)
Peter: 1 time in 20 (5%)
If a system breakdown is incompletely repaired, what is the probability that Janet made
the repair? (i.e. desire P(B1|A))
[Tree diagram over S, all breakdowns; A = all incomplete repairs:
B1 (Janet, 0.20), P(A|B1) = 0.05 → A ∩ B1
B2 (Tom, 0.60), P(A|B2) = 0.10 → A ∩ B2
B3 (Georgia, 0.15), P(A|B3) = 0.10 → A ∩ B3
B4 (Peter, 0.05), P(A|B4) = 0.05 → A ∩ B4]

The probability that the incomplete repair was made by Janet is

P(B1|A) = (0.2 · 0.05) / [(0.2)(0.05) + (0.6)(0.1) + (0.15)(0.1) + (0.05)(0.05)] = 0.114

Therefore, although Janet makes an incomplete repair only 5% of the time, because she
handles 20% of all breakdowns, she is the cause of 11.4% of all incomplete repairs
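These numbers can be reproduced with a short script that computes the posterior for every repairer at once (an added sketch, not part of the original notes; the dictionary names are illustrative):

```python
# Posterior probability of each repairer, given an incomplete repair.
shares = {'Janet': 0.20, 'Tom': 0.60, 'Georgia': 0.15, 'Peter': 0.05}
miss_rates = {'Janet': 0.05, 'Tom': 0.10, 'Georgia': 0.10, 'Peter': 0.05}

# Total probability of an incomplete repair (Theorem 3.10).
p_a = sum(shares[w] * miss_rates[w] for w in shares)

# Bayes' theorem (Theorem 3.11) for each repairer.
posterior = {w: shares[w] * miss_rates[w] / p_a for w in shares}

print(round(posterior['Janet'], 3))  # ≈ 0.114
```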
Summary of the chapter

1. Sample space, event, set operation, and Venn diagram


2. Counting principles, addition and multiplication
3. Permutation and combination
4. Classical probability, axioms
5. Independent events
6. Conditional probability
7. Bayes theorem
Formulas to remember
1. De Morgan's laws
(A ∪ B)′ = A′ ∩ B′
(A ∩ B)′ = A′ ∪ B′
2. nPr = n!/(n − r)!,  C(n, r) = n!/[(n − r)! r!]
3. P(A) = N(A)/N(S)
4. N(A ∪ B) = N(A) + N(B) − N(A ∩ B)
5. P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
6. P(A′) = 1 − P(A)
7. For independent events A and B: P(A ∩ B) = P(A)·P(B)
8. Conditional probability: P(A|B) = P(A ∩ B)/P(B)
9. If A and B are independent, P(A|B) = P(A) and P(B|A) = P(B)
10. Theorem 3.8 (bridge theorem)
P(A)·P(B|A) = P(A ∩ B) = P(B)·P(A|B)
11. Total probability
P(A) = Σ P(Bi)·P(A|Bi), summed over i = 1, …, n,
where the Bi form a mutually exclusive partition of the sample space.
12. Bayes' theorem:
P(B|A) = P(B)·P(A|B)/P(A) = P(B)·P(A|B) / [P(B)·P(A|B) + P(B′)·P(A|B′)]

13. This can be generalized to give:
P(Br|A) = P(Br)·P(A|Br)/P(A) = P(Br)·P(A|Br) / [Σ P(Bi)·P(A|Bi), i = 1, …, n],  r = 1, 2, …, n

14. Independent events are NOT the same as mutually exclusive events.
