Statistics Notes
ENEE 331
LECTURE NOTES
BY
JUNE, 2009
FUNDAMENTAL CONCEPTS OF PROBABILITY CHAPTER II
CHAPTER I
FUNDAMENTAL CONCEPTS
OF PROBABILITY
Basic Definitions:
We start our treatment of probability theory by introducing some basic definitions.
Experiment:
By an experiment, we mean any procedure that:
1- Can be repeated, theoretically, an infinite number of times.
2- Has a well-defined set of possible outcomes.
Sample Outcome:
Each of the potential outcomes of an experiment is referred to as a sample outcome (s).
Sample Space:
The totality of sample outcomes is called the sample space (S).
Event:
Any designated collection of sample outcomes, including individual outcomes, the entire
sample space and the null space, constitutes an event.
Occur:
An event is said to occur if the outcome of the experiment is one of the members of that event.
EXAMPLE (2-1):
Consider the experiment of flipping a coin three times.
a- What is the sample space?
b- Which sample outcomes make up the event:
A : Majority of coins show heads.
SOLUTION:
a- Sample Space (S) = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
b- A = {HHH, HHT, HTH, THH}
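The enumeration in Example (2-1) can be reproduced mechanically. The following is a minimal sketch; the function names `sample_space` and `majority_heads` are illustrative and not part of the notes.

```python
from itertools import product

def sample_space(n_flips):
    """List every outcome of flipping a coin n_flips times, e.g. 'HHT'."""
    return ["".join(seq) for seq in product("HT", repeat=n_flips)]

def majority_heads(outcomes):
    """Event A: keep the outcomes that show more heads than tails."""
    return [s for s in outcomes if s.count("H") > s.count("T")]

S = sample_space(3)      # the 2^3 = 8 sample outcomes
A = majority_heads(S)    # the outcomes that make up event A
print(S)
print(A)
```

Changing the argument of `sample_space` gives the sample space for any number of flips.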
Algebra of Events:
Let A and B be two events defined over the sample space S, then:
- The intersection of A and B, (A ∩ B), is the event whose outcomes belong to both A and B.
- The union of A and B, (A U B), is the event whose outcomes belong to either A or B or both.
- Events A and B are said to be Mutually Exclusive (or disjoint) if they have no outcomes in
common, that is A ∩ B = Ø, where Ø is the null set (a set which contains no outcomes).
- The complement of A (Ac or Ā) is the event consisting of all outcomes in S other than those
contained in A.
- A Venn diagram is a graphical format often used to simplify the manipulation of complex
events.
[Figure: Venn diagrams showing A ∩ B, A U B, and Ac as regions of the sample space S]
EXAMPLE (2-2):
An experiment has its sample space specified as:
S = {1, 2, 3, ……, 48, 49, 50}. Define the events
A : set of numbers divisible by 6
B : set of elements divisible by 8
C : set of numbers of the form 2^n , n = 1, 2, 3,…
Find: 1- A, B, C 2- A U B U C 3- A ∩ B ∩ C
SOLUTION:
1- Events A, B, and C are:
A = {6, 12, 18, 24, 30, 36, 42, 48}
B = {8, 16, 24, 32, 40, 48}
C = {2, 4, 8, 16, 32}
2- A U B U C = {6, 12, 18, 24, 30, 36, 42, 48,
8, 16, 32, 40,
2, 4}
3- A ∩ B ∩ C = Ø
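The events of Example (2-2) map directly onto set operations. This is a small sketch using Python's built-in sets; the variable names are chosen here for illustration.

```python
# Sample space and events of Example (2-2)
S = set(range(1, 51))
A = {x for x in S if x % 6 == 0}                    # divisible by 6
B = {x for x in S if x % 8 == 0}                    # divisible by 8
C = {2 ** n for n in range(1, 7) if 2 ** n in S}    # powers of two inside S

union = A | B | C          # A U B U C
intersection = A & B & C   # A ∩ B ∩ C

print(sorted(union))
print(intersection)        # an empty set means the three events share no outcome
```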
EXAMPLE (2-3):
The sample space of an experiment is:
S = {x : -20 ≤ x ≤ 14}. If A = {x : -10 ≤ x ≤ 5} and B = {x : -7 ≤ x ≤ 0}, find:
1- A U B 2- A ∩ B
SOLUTION:
[Figure: intervals A (from -10 to 5) and B (from -7 to 0) marked on the x-axis, which runs from -20 to 14]
1- A U B = {x : -10 ≤ x ≤ 5}
2- A ∩ B = {x : -7 ≤ x ≤ 0}
Definitions of Probability:
Four definitions of probability have evolved over the years:
- Definition I: Classical (a priori)
If the sample space S of an experiment consists of finitely many outcomes (points) that are
equally likely, then the probability of event A, P(A) is:
$P(A) = \frac{\text{Number of outcomes in A}}{\text{Number of outcomes in S}}$
Thus in particular, P(S) = 1
- Definition II: Relative Frequency (a posteriori)
Let an experiment be repeated (n) times under identical conditions. Then the probability of (A)
is the limit of its relative frequency:
$P(A) = \lim_{n \to \infty} \frac{f(A)}{n} = \lim_{n \to \infty} \frac{\text{Number of times A occurs}}{\text{Number of trials}}$
f(A) is called the frequency of (A)
Clearly $0 \le \frac{f(A)}{n} \le 1$
$\frac{f(A)}{n} = 0$ if (A) does not occur in the sequence of trials
$\frac{f(A)}{n} = 1$ if (A) occurs on each of the (n) trials
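The relative frequency definition can be illustrated by simulation. The sketch below is not part of the notes: it assumes a fair six-sided die as the experiment and "an even number shows" as the event (A), and the helper names are hypothetical. A fixed seed keeps the run repeatable.

```python
import random

def relative_frequency(event, experiment, n_trials, seed=0):
    """Return f(A)/n: the fraction of n_trials in which the event occurred."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_trials) if event(experiment(rng)))
    return hits / n_trials

def roll_die(rng):
    """Assumed experiment: one roll of a fair six-sided die."""
    return rng.randint(1, 6)

def is_even(outcome):
    """Assumed event A: the roll shows an even number."""
    return outcome % 2 == 0

# f(A)/n settles near the classical value 3/6 as n grows
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(is_even, roll_die, n))
```

For a finite number of trials the ratio is only an estimate, which is why Example (2-4) speaks of estimating (p).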
EXAMPLE (2-4):
In digital data transmission, the bit error probability is (p). If 10,000 bits are transmitted over
a noisy communication channel and 5 bits were found to be in error, find the bit error
probability (p).
SOLUTION:
According to the relative frequency definition we can estimate (p) as:
$p \approx \frac{5}{10{,}000} = 0.0005$
Where events (1) and (2) and (3) are mutually exclusive
P(A U B) = P(1) + P(2) + P(3)
P(A) = P(1) + P(2)
P(B) = P(2) + P(3)
P(A U B) = {P(1) + P(2)} + {P(2) + P(3)} – {P(2)}
P(A U B) = P(A) + P(B) – P(A ∩ B)
- Theorem:
If A, B, and C are three events, then:
P(A U B U C) = P(A) + P(B) + P(C) – P(A ∩ B) – P(A ∩ C) – P(B ∩ C) + P(A ∩ B ∩ C)
EXAMPLE (2-5):
One integer is chosen at random from the numbers {1, 2, ……, 50}. What is the probability
that the chosen number is divisible by 6? Assume all 50 outcomes are equally likely.
SOLUTION:
S = {1, 2, 3, …………, 50}
A = {6, 12, 18, 24, 30, 36, 42, 48}
$P(A) = \frac{\text{Number of elements in A}}{\text{Number of elements in S}} = \frac{8}{50}$
EXAMPLE (2-6):
Suppose that, in Example (2-5), an even number is twice as likely to occur as an odd
number. Find P(A), where A is defined above.
SOLUTION:
P(S) = P(even) + P(odd) = 1 ;
Let (P) be the probability of occurrence of an odd number,
then (2P) will be the probability of occurrence of an even number.
(25)(2P) + (25)(P) = 1
$(50 + 25)(P) = 1 \Rightarrow P = \frac{1}{75}$
Every element of A is even, so each has probability 2P:
$P(A) = 8 \times 2P = \frac{16}{75}$
EXAMPLE (2-7):
Suppose that a company has 100 employees who are classified according to their marital
status and according to whether they are college graduates or not. It is known that 30% of the
employees are married, and the percent of graduate employees is 80%. Moreover, 10
employees are neither married nor graduates. What proportion of married employees are
graduates?
SOLUTION:
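Let: M : set of married employees
G : set of graduate employees
N(.) : number of members in any set (.)
N(S) = 100
N(M) = 0.3 × 100 = 30
N(G) = 0.8 × 100 = 80
N((M U G)c) = 10
[Figure: Venn diagram of M and G in S, with 10 employees in M only, 20 in M ∩ G, 60 in G only, and 10 outside both]
N(M U G) = 100 – 10 = 90
N(M U G) = N(M) + N(G) – N(M ∩ G)
90 = 30 + 80 – N(M ∩ G)
N(M ∩ G) = 30 + 80 – 90 = 20
The proportion of married employees who are graduates is N(M ∩ G) / N(M) = 20/30, so two thirds of
the married employees in the company are graduates.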
EXAMPLE (2-8):
An experiment has two possible outcomes; the first occurs with probability (P), the second
with probability (P^2). Find (P).
SOLUTION:
P(S) = 1
P + P^2 = 1
P^2 + P – 1 = 0
$P = \frac{-1 + \sqrt{5}}{2} \approx 0.618$ ; (only the positive root is taken)
EXAMPLE (2-9):
A sample space “S” consists of the integers 1 to 6 inclusive. Each outcome has an associated
probability proportional to its magnitude. If one number is chosen at random, what is the
probability that an even number appears?
SOLUTION:
Sample Space “S” = {1 , 2 , 3 , 4 , 5 , 6}
Event (A) = {2 , 4 , 6}
P(A) = P(2) + P(4) + P(6)
$P(S) = 1 = \sum_{i=1}^{6} p(i) = \sum_{i=1}^{6} \alpha\, i = \alpha\, \frac{6(6+1)}{2} = 21\,\alpha$
The proportionality constant is $\alpha = \frac{1}{21}$
$P(A) = \frac{2}{21} + \frac{4}{21} + \frac{6}{21} = \frac{12}{21}$
EXAMPLE (2-10):
Let (A) and (B) be any two events defined on (S). Suppose that P(A) = 0.4, P(B) = 0.5, and
P(A ∩ B) = 0.1.
Find the probability that:
1- (A) or (B) but not both occur.
2- Neither (A) nor (B) will occur.
3- At least one event will occur.
4- Both events occur.
SOLUTION:
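P(A) = P[(A ∩ Bc) U (A ∩ B)] = P(A ∩ Bc) + P(A ∩ B)
Using the Venn diagram:
[Figure: Venn diagram of A and B in S, with 0.3 in A only, 0.1 in A ∩ B, 0.4 in B only, and 0.2 outside both]
P(A only) = P(A) – P(A ∩ B) = 0.3
P(B only) = P(B) – P(A ∩ B) = 0.4
1- P(A or B but not both) = 0.3 + 0.4 = 0.7
Note that:
P(A U B) = P(A) + P(B) – P(A ∩ B)
P(A U B) = 0.4 + 0.5 – 0.1 = 0.8
2- P(none) = P((A U B)c) = 1 – P(A U B) = 0.2
3- P(at least one) = P(A U B) = 0.8
4- P(both) = P(A ∩ B) = 0.1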
EXAMPLE (2-11):
The outcome of an experiment is either a success with probability p or a failure with
probability (1-p). The experiment is repeated until a success comes up for the first
time. Let X be the number of times the experiment is performed; then the discrete probability
function for the countably infinite sample space is
$P(x) = p(1 - p)^{x-1}$ ; x = 1, 2, …
What is the probability that a success occurs on an odd-numbered trial?
SOLUTION:
The sample space for the experiment is S={1, 2, 3, ….}
Let A be the event that a success occurs on an odd numbered trial. Then A consists of the
sample points: A = {1, 3, 5, …}
P(A) = P(1) + P(3) + P(5) + …
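$P(A) = p(1-p)^{1-1} + p(1-p)^{3-1} + p(1-p)^{5-1} + \dots$
$P(A) = p\left[1 + (1-p)^2 + (1-p)^4 + \dots\right] = \frac{p}{1 - (1-p)^2}$ , by virtue of the geometric series $\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}$ , |x| < 1
In the special case when p = ½, P(A) becomes
$P(A) = \frac{1}{2} \cdot \frac{1}{1 - \frac{1}{4}} = \frac{2}{3}$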
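As a numerical check of Example (2-11), the sketch below sums the terms p(1-p)^(x-1) over odd x and compares the partial sum with the closed form p / (1 - (1-p)^2) obtained from the geometric series with ratio (1-p)^2. The function names and the truncation length `n_terms` are assumptions made for this illustration.

```python
def p_success_on_odd_trial(p, n_terms=10_000):
    """Partial sum of p(1-p)^(x-1) over the odd trials x = 1, 3, 5, ..."""
    return sum(p * (1 - p) ** (x - 1) for x in range(1, 2 * n_terms, 2))

def closed_form(p):
    """Geometric-series result: p / (1 - (1-p)^2)."""
    return p / (1 - (1 - p) ** 2)

for p in (0.1, 0.5, 0.9):
    print(p, p_success_on_odd_trial(p), closed_form(p))
```

The truncated sum is adequate for the values of p shown; a very small p would need more terms.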
EXAMPLE (2-12):
The discrete probability function for the countably infinite sample space S = {1, 2, 3, …} is:
$P(x) = \frac{C}{x^2}$ ; x = 1, 2, 3, ……
a- Find the constant “C” so that P(x) is a valid discrete probability function.
b- Find the probability that the outcome of the experiment is a number less than 4.
SOLUTION:
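a. By Axiom 2, P(S) = 1
$\sum_{x=1}^{\infty} \frac{C}{x^2} = 1$ , where $\sum_{x=1}^{\infty} \frac{1}{x^2} = \frac{\pi^2}{6}$
$C \cdot \frac{\pi^2}{6} = 1 \Rightarrow C = \frac{6}{\pi^2}$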
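The constant in Example (2-12) can be checked numerically, and part (b), whose worked solution does not appear in the text, follows by adding the first three terms. This is a sketch under that reading of part (b) (outcomes 1, 2 and 3); the names are illustrative.

```python
import math

C = 6 / math.pi ** 2          # normalising constant from part (a)

def pmf(x):
    """P(x) = C / x^2 for x = 1, 2, 3, ..."""
    return C / x ** 2

# Partial sum approaches 1 as more terms are included
partial_sum = sum(pmf(x) for x in range(1, 100_001))

# Part (b): outcome less than 4 means x in {1, 2, 3}
p_less_than_4 = pmf(1) + pmf(2) + pmf(3)

print(partial_sum)
print(p_less_than_4)          # equals C * (1 + 1/4 + 1/9), about 0.827
```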
SOLUTION:
a- $P(S) = \int_{1}^{2} f(x)\,dx = 1 \Rightarrow \int_{1}^{2} \frac{k}{x^2}\,dx = 1 \Rightarrow k = 2$
b- $P(x \le 1.5) = \int_{1}^{1.5} \frac{2}{x^2}\,dx = \frac{2}{3}$
EXAMPLE (2-14):
The length of a pin that is a part of a wheel assembly is supposed to be 6 cm. The machine
that stamps out the parts makes them 6 + x cm long, where x varies from pin to pin according
to the probability function:
f(x) = k(x + x^2) ; 0 ≤ x ≤ 2
where (k) is a constant. If a pin is longer than 7 cm, it is unusable. What proportion of pins
produced by this machine will be unusable?
SOLUTION:
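$P(S) = \int_{-\infty}^{\infty} f(x)\,dx = 1$
$k \int_{0}^{2} (x + x^2)\,dx = 1$
$k \left[ \frac{x^2}{2} + \frac{x^3}{3} \right]_{0}^{2} = 1 \Rightarrow k = \frac{6}{28}$
A cotter pin is not accepted if the error x > 1 cm.
$P(x > 1) = \int_{1}^{2} k\,(x + x^2)\,dx = k \left[ \frac{x^2}{2} + \frac{x^3}{3} \right]_{1}^{2} = \frac{6}{28} \left[ \left( \frac{4}{2} + \frac{8}{3} \right) - \left( \frac{1}{2} + \frac{1}{3} \right) \right] = \frac{23}{28}$
$P(x > 1) = P(\text{pin length} > 7 \text{ cm}) = \frac{23}{28}$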
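The two integrals in Example (2-14) can be verified exactly with rational arithmetic. The sketch below uses the antiderivative x^2/2 + x^3/3 of (x + x^2); the helper name `integral_x_plus_x2` is hypothetical.

```python
from fractions import Fraction

def integral_x_plus_x2(lo, hi):
    """Exact integral of (x + x^2) from lo to hi via x^2/2 + x^3/3."""
    def antiderivative(x):
        x = Fraction(x)
        return x ** 2 / 2 + x ** 3 / 3
    return antiderivative(hi) - antiderivative(lo)

# Normalisation: k times the integral over [0, 2] must equal 1
k = 1 / integral_x_plus_x2(0, 2)

# Unusable pins: error x between 1 and 2 cm
p_unusable = k * integral_x_plus_x2(1, 2)

print(k)           # 3/14, the reduced form of 6/28
print(p_unusable)  # 23/28
```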
EXAMPLE (2-15):
A sample space (S) consists of the integers 1 to n inclusive. Each has an associated
probability proportional to its magnitude. One integer is chosen at random, what is the
probability that number 1 is chosen given that the number selected is in the first (m) integers.
SOLUTION:
Let (A) be the event “number 1” occurs
(A) = {1}
(B) the event “outcome belongs to the first m integers”
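(B) = {1 , 2 , 3 , … , m}
$\sum_{i=1}^{n} P_i = \sum_{i=1}^{n} \alpha\, i = \alpha\, \frac{n(n+1)}{2} = 1 \Rightarrow \alpha = \frac{2}{n(n+1)}$
$P(A/B) = \frac{P(A \cap B)}{P(B)} = \frac{P(1)}{P(B)} = \frac{\alpha}{\sum_{i=1}^{m} \alpha\, i} = \frac{\alpha}{\alpha\, \frac{m(m+1)}{2}} \Rightarrow P(A/B) = \frac{2}{m(m+1)}$
A priori probability: $P(A) = \frac{2}{n(n+1)}$
A posteriori probability: $P(A/B) = \frac{2}{m(m+1)}$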
Clearly P(A/B) > P(A) due to the additional information given by event (B).
EXAMPLE (2-16):
A certain computer becomes inoperable if two components A and B both fail. The probability
that A fails is 0.001 and the probability that B fails is 0.005. However, the probability that B
fails increases by a factor of 4 if A has failed. Calculate the probability that:
a- The computer becomes inoperable.
b- A will fail if B has failed.
SOLUTION:
P(A) = 0.001
P(B) = 0.005
P(B/A) = 4 × 0.005 = 0.020
a- The system fails when both A and B fail, i.e.,
P(A ∩ B) = P(A) P(B/A)
P(A ∩ B) = 0.001 × 0.020 = 0.00002
b- P(A ∩ B) = P(A) P(B/A) = P(B) P(A/B)
$P(A/B) = \frac{0.001 \times 0.020}{0.005} = 0.004$
EXAMPLE (2-17):
A box contains 20 non-defective (N) items and 5 defective (D) items. Three items are drawn
without replacement.
a. Find the probability that the sequence of objects obtained is (NND) in the given order.
b. Find the probability that exactly one defective item is obtained.
SOLUTION:
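a. P(NND) = P(N) × P(N/N) × P(D/N,N)
$P(NND) = \left(\frac{20}{25}\right)\left(\frac{20-1}{25-1}\right)\left(\frac{5}{25-2}\right) = \left(\frac{20}{25}\right)\left(\frac{19}{24}\right)\left(\frac{5}{23}\right)$
b. One defective item is obtained when any one of the following sequences occurs:
(NND), (NDN), (DNN)
The probability of getting one defective item is the sum of the probabilities of these
sequences and is given as:
$\left(\frac{20}{25}\right)\left(\frac{19}{24}\right)\left(\frac{5}{23}\right) + \left(\frac{20}{25}\right)\left(\frac{5}{24}\right)\left(\frac{19}{23}\right) + \left(\frac{5}{25}\right)\left(\frac{20}{24}\right)\left(\frac{19}{23}\right) = (3)\left(\frac{20}{25}\right)\left(\frac{19}{24}\right)\left(\frac{5}{23}\right)$
Later in Chapter 2, we will see that Part (b) can be solved using the hypergeometric
distribution.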
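The chain of conditional probabilities in Example (2-17) can be sketched as a loop that updates the box contents after each draw. The function name `sequence_probability` is illustrative; the last line compares the result of part (b) with the standard hypergeometric count C(20,2) C(5,1) / C(25,3).

```python
from fractions import Fraction
from math import comb

def sequence_probability(seq, n_good=20, n_bad=5):
    """Probability of drawing the given N/D sequence without replacement."""
    good, bad, prob = n_good, n_bad, Fraction(1)
    for item in seq:
        total = good + bad
        if item == "N":
            prob *= Fraction(good, total)   # draw a non-defective item
            good -= 1
        else:
            prob *= Fraction(bad, total)    # draw a defective item
            bad -= 1
    return prob

p_nnd = sequence_probability("NND")
p_one_defective = sum(sequence_probability(s) for s in ("NND", "NDN", "DNN"))
p_hypergeometric = Fraction(comb(20, 2) * comb(5, 1), comb(25, 3))

print(p_nnd, p_one_defective, p_hypergeometric)
```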
EXAMPLE (2-18):
Let S = {1 , 2 , 3 , 4} ; $P_i = \frac{1}{4}$ . A = {1 , 2} and B = {2 , 3}. Are (A) and (B) independent?
SOLUTION:
EXAMPLE (2-19):
Consider an experiment in which the sample space contains four outcomes {S1, S2, S3, S4}
such that $P(S_i) = \frac{1}{4}$ . Let events (A), (B) and (C) be defined as:
A = {S1, S2} , B = {S1, S3} , C = {S1, S4}
Are these events independent?
SOLUTION:
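$P(A) = P(B) = P(C) = \frac{1}{2}$
(A ∩ B) = {S1} ; (A ∩ C) = {S1} ; (B ∩ C) = {S1} ; (A ∩ B ∩ C) = {S1}
$P(A \cap B) = \frac{1}{4}$ ; $P(A \cap C) = \frac{1}{4}$ ; $P(B \cap C) = \frac{1}{4}$ ; $P(A \cap B \cap C) = \frac{1}{4}$
Check the conditions:
- $P(A \cap B) = \frac{1}{4} = P(A)\,P(B) = \frac{1}{2} \cdot \frac{1}{2}$ ; $P(A \cap C) = \frac{1}{4} = P(A)\,P(C) = \frac{1}{2} \cdot \frac{1}{2}$
- $P(B \cap C) = \frac{1}{4} = P(B)\,P(C) = \frac{1}{2} \cdot \frac{1}{2}$
- $P(A \cap B \cap C) = \frac{1}{4} \ne P(A)\,P(B)\,P(C) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{8}$
The events are not independent (even though the pairwise conditions of independence are satisfied).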
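The independence checks in Examples (2-18) and (2-19) can be automated for any sample space with equally likely outcomes. The sketch below tests the product rule on every sub-collection of two or more events; `prob` and `mutually_independent` are hypothetical helper names, and the outcomes S1 to S4 are represented by the integers 1 to 4.

```python
from fractions import Fraction
from itertools import combinations

def prob(event, n_outcomes=4):
    """Probability of an event when all n_outcomes are equally likely."""
    return Fraction(len(event), n_outcomes)

def mutually_independent(events, n_outcomes=4):
    """True if P(intersection) equals the product of the probabilities
    for every sub-collection of two or more of the given events."""
    for r in range(2, len(events) + 1):
        for group in combinations(events, r):
            joint = set.intersection(*group)
            expected = Fraction(1)
            for e in group:
                expected *= prob(e, n_outcomes)
            if prob(joint, n_outcomes) != expected:
                return False
    return True

# Example (2-18): A = {1, 2}, B = {2, 3}
print(mutually_independent([{1, 2}, {2, 3}]))

# Example (2-19): pairwise independent, but not as a triple
A, B, C = {1, 2}, {1, 3}, {1, 4}
print(mutually_independent([A, B]), mutually_independent([A, B, C]))
```

Under this check the pair in Example (2-18) satisfies P(A ∩ B) = P(A) P(B), since P({2}) = 1/4.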
SOLUTION:
[Figure: two components C1 and C2 connected in parallel, each working with probability P]
Reliability of the system = P(system works)
P(system works) = P(C1 or C2 or both C1 and C2 work)
= P(C1 U C2)
= P(C1) + P(C2) – P(C1 ∩ C2)
= P + P – (P × P) = 2P – P^2 (taking C1 and C2 to work independently, so that P(C1 ∩ C2) = P × P)
EXERCISE:
A pressure control apparatus contains 4 electronic tubes. The apparatus will not work unless
all tubes are operative. If the probability of failure of each tube is 0.03, what is the probability
of failure of the apparatus assuming that all components work independently?
EXAMPLE (2-22):
A coin may be fair or it may have two heads. We toss it (n) times and it comes up heads on
each occasion. If our initial judgment was that both options for the coin (fair or both sides
heads) were equally likely (probable), what is our revised judgment in the light of the data?
SOLUTION:
Let A : event representing coin is fair
B : event representing coin with two heads
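C : outcome of the experiment, H H H ... H (heads on each of the n tosses)
A priori probabilities:
$P(A) = \frac{1}{2}$ , $P(B) = \frac{1}{2}$
We need to find P(A/C) = ?
$P(A/C) = \frac{P(A \cap C)}{P(C)} = \frac{P(A)\,P(C/A)}{P(C)}$
$P(A/C) = \frac{P(A)\,P(HHH \dots H / \text{fair coin})}{P(A)\,P(HHH \dots H / \text{fair coin}) + P(B)\,P(HHH \dots H / \text{coin with two heads})}$
$P(A/C) = \frac{\frac{1}{2}\left(\frac{1}{2}\right)^n}{\frac{1}{2}\left(\frac{1}{2}\right)^n + \frac{1}{2}(1)} = \frac{\left(\frac{1}{2}\right)^n}{\left(\frac{1}{2}\right)^n + 1} = \frac{1}{1 + 2^n}$
$P(B/C) = 1 - P(A/C) = 1 - \frac{1}{1 + 2^n} = \frac{2^n}{1 + 2^n}$
[Figure: tree diagram in which each event Ai, reached with probability P(Ai), leads to B with conditional probability P(B/Ai), for i = 1, 2, …, n]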
EXAMPLE (2-23):
Female students constitute 30% of the student body in the Faculty of Engineering, and 40%
of them have a GPA > 80, while 25% of the male students have a GPA > 80. What is the
probability that a person selected at random will have a GPA > 80?
SOLUTION:
A1 = Event representing the selected person is a female
A2 = Event representing the selected person is a male
B = Event representing GPA > 80
[Figure: sample space split into A1 and A2, with B overlapping both as A1∩B and A2∩B]
P(A1) = 0.3
P(A2) = 0.7
P(B/A1) = 0.4 ; P(B/A2) = 0.25
B = (A1∩B) U (A2∩B), and the two parts are mutually exclusive, so P(B) = P(A1∩B) + P(A2∩B)
P(B) = P(A1) P(B/A1) + P(A2) P(B/A2)
P(B) = (0.3 × 0.4) + (0.7 × 0.25)
P(B) = 0.295
Bayes' Theorem:
If A1, A2, A3, ……, An are disjoint events defined on (S), and (B) is another event defined on
(S) (same conditions as above), then:
$P(A_i/B) = \frac{P(A_i)\,P(B/A_i)}{\sum_{j=1}^{n} P(A_j)\,P(B/A_j)}$
EXAMPLE (2-24):
Suppose that when a machine is adjusted properly, 50% of the items produced by it are of
high quality and the other 50% are of medium quality. Suppose, however, that the machine is
improperly adjusted during 10% of the time and that under these conditions 25% of the items
produced by it are of high quality and 75% are of medium quality.
a- Suppose that one item produced by the machine is selected at random, find the
probability that it is of medium quality.
b- If one item is selected at random, and found to be of medium quality, what is the
probability that the machine was adjusted properly.
SOLUTION:
A1 = Event representing machine is properly adjusted
A2 = Event representing machine is improperly adjusted
H = Event representing item is of high quality
M = Event representing item is of medium quality
From the problem statement we have:
P(A1) = 0.9 ; P(A2) = 0.1
P(H/A1) = 0.5 ; P(H/A2) = 0.25
P(M/A1) = 0.5 ; P(M/A2) = 0.75
[Figure: tree diagram; A1 leads to M with probability 0.5, and A2, with P(A2) = 0.1, leads to M with probability 0.75]
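Parts (a) and (b) of Example (2-24) follow from the total probability rule and Bayes' theorem applied to the values listed above. The sketch below is one way to carry out that calculation; the function names `total_probability` and `posterior` are illustrative and not from the notes.

```python
def total_probability(priors, likelihoods):
    """P(B) = sum over i of P(Ai) * P(B/Ai)."""
    return sum(p * l for p, l in zip(priors, likelihoods))

def posterior(priors, likelihoods):
    """Bayes' theorem: list of P(Ai/B) for every i."""
    evidence = total_probability(priors, likelihoods)
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

priors = [0.9, 0.1]            # P(A1), P(A2)
medium_given = [0.5, 0.75]     # P(M/A1), P(M/A2)

p_medium = total_probability(priors, medium_given)          # part (a), about 0.525
p_proper_given_medium = posterior(priors, medium_given)[0]  # part (b), about 0.857

print(p_medium, p_proper_given_medium)
```

The same two functions accept any number of disjoint events, for instance four machines with their production shares as priors and their defect rates as likelihoods.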
EXAMPLE (2-25):
Consider the problem of transmitting binary data over a noisy communication channel. Due
to the presence of noise, a certain amount of transmission error is introduced. Suppose that
the probability of transmitting a binary 0 is 0.7 (70% of transmitted digits are zeros) and
there is a 0.8 probability that a given 0 or 1 is received properly.
a- What is the probability of receiving a binary 1.
b- If a 1 is received, what is the probability that a 0 was sent.
SOLUTION:
[Figure: binary channel diagram; A0 to B0 and A1 to B1 with probability 0.8, A0 to B1 and A1 to B0 with probability 0.2; P(A0) = 0.7, P(A1) = 0.3]
A0 = Event representing 0 is sent
A1 = Event representing 1 is sent
B0 = Event representing 0 is received
B1 = Event representing 1 is received
From the problem statement we have:
P(A0) = 0.7 ; P(A1) = 0.3
P(B0/A0) = 0.8 ; P(B0/A1) = 0.2
P(B1/A0) = 0.2 ; P(B1/A1) = 0.8
a- P(B1) = P(A0) P(B1/A0) + P(A1) P(B1/A1)
P(B1) = (0.7)(0.2) + (0.3)(0.8) = 0.38
P(B0) = 1 – P(B1) = 0.62
b- $P(A_0/B_1) = \frac{P(A_0 \cap B_1)}{P(B_1)} = \frac{P(A_0)\,P(B_1/A_0)}{P(B_1)}$
$P(A_0/B_1) = \frac{(0.7)(0.2)}{0.38} = 0.3684$
EXAMPLE (2-26):
In a factory, four machines produce the same product. Machine A1 produces 10% of the
product, A2 20%, A3 30%, and A4 40%. The proportion of defective items produced by the
machines follows:
A1: 0.001 ; A2: 0.005 ; A3: 0.005 ; A4: 0.002
An item selected at random is found to be defective, what is the probability that the item was
produced by machine A1?
SOLUTION:
Let D be the event: Selected item is defective
P(D) = P(A1) P(D/A1) + P(A2) P(D/A2) + P(A3) P(D/A3) + P(A4) P(D/A4)
P(D) = (0.1 × 0.001) + (0.2 × 0.005) + (0.3 × 0.005) + (0.4 × 0.002)
P(D) = 0.0034
$P(A_1/D) = \frac{P(A_1)\,P(D/A_1)}{P(D)} = \frac{(0.1)(0.001)}{0.0034} = \frac{0.0001}{0.0034} = \frac{1}{34}$
Counting techniques:
Here we introduce systematic counting of sample points in a sample space. This is necessary
for computing the probability P(A) in experiments with a finite sample space (S) consisting of
(n) equally likely outcomes. Then each outcome has probability $\frac{1}{n}$ .
And if (A) consists of (m) outcomes, then $P(A) = \frac{m}{n}$
- Multiplication Rule:
If operation A can be performed in n1 different ways and operation B in n2 different ways, then
the sequence (operation A , operation B) can be performed in n1 x n2 different ways.
EXAMPLE (2-27):
There are two roads between A and B and four roads between B and C. How many different
routes can one travel between A and C?
SOLUTION:
[Figure: two roads from A to B and four roads from B to C]
n = 2 × 4 = 8
Permutation:
Consider an urn having (n) distinguishable objects (numbered 1 to n). We perform the
following two experiments:
1- Sampling without replacement:
An object is drawn; its number is recorded and then put aside, another object is drawn; its
number is recorded and then put aside, the process is repeated (k) times. The total number of
ordered sequences {x1, x2, ……., xk} (repetition is not allowed), called permutations, is:
N = n (n – 1) (n – 2) …… (n – k + 1)
$N = \frac{n!}{(n-k)!}$ …………….. (1)
where n! = n (n – 1) (n – 2) …… (3) (2) (1)
[Figure: urn containing objects numbered 1, 2, 3, …, n]
2- Sampling with replacement:
If, in the previous experiment, each drawn object is dropped back into the urn and the process
is repeated (k) times, the number of possible sequences {x1, x2, ……., xk} of length (k) that
can be formed from the set of (n) distinct objects (repetition allowed) is:
N = n^k …………….. (2)
EXAMPLE (2-28):
How many different five-letter computer passwords can be formed:
a- If a letter can be used more than once.
b- If each word contains each letter no more than once.
SOLUTION:
a- N = (26)^5
b- $N = \frac{26!}{(26-5)!}$
EXAMPLE (2-29):
An apartment building has eight floors (numbered 1 to 8). If seven people get on the elevator
on the first floor, what is the probability that:
a- All get off on different floors?
b- All get off on the same floor?
[Figure: elevator shaft with floors numbered 1 to 8]
SOLUTION:
Number of points in the sample space:
The first person can get off at any of the 7 floors.
Person (2) can get off at any of the 7 floors, and so on.
The number of ways people can get off:
(N) = 7 × 7 × 7 × 7 × 7 × 7 × 7 = 7^7
a- Here the problem is to find the number of permutations of 7 objects taken 7 at a time.
$P = \frac{7!}{7^7}$
b- Here there are 7 ways whereby all seven persons get off on the same floor.
$P = \frac{7}{7^7}$
EXAMPLE (2-30):
If the number of people getting on the elevator on the first floor is 3:
a- Find the probability they get off the elevator on different floors.
b- Find the probability they get off the elevator on the same floor.
SOLUTION:
Number of points in the sample space (N) = 7 × 7 × 7 = 7^3
a- $P = \frac{7 \times 6 \times 5}{7^3}$
b- $P = \frac{7}{7^3}$
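The elevator counts in Example (2-30) are small enough to verify by listing every point of the sample space. This brute-force sketch assumes, as in the solution, that each person independently picks one of the 7 available floors; `elevator_probabilities` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

def elevator_probabilities(n_people, n_floors):
    """Return (P(all different floors), P(all same floor)) by enumeration."""
    outcomes = list(product(range(n_floors), repeat=n_people))
    all_different = sum(1 for o in outcomes if len(set(o)) == n_people)
    all_same = sum(1 for o in outcomes if len(set(o)) == 1)
    total = len(outcomes)          # n_floors ** n_people sample points
    return Fraction(all_different, total), Fraction(all_same, total)

p_different, p_same = elevator_probabilities(n_people=3, n_floors=7)
print(p_different, p_same)
```

Enumeration grows as n_floors ** n_people, so the counting formulas remain the practical route for larger cases.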
EXAMPLE (2-31):
If the number of floors is 5 (numbered 1 to 5) and the number of people getting on the
elevator is 8. Find the probability that exactly 2 people get off the elevator on each floor.
SOLUTION:
Number of points in the sample space (N) = 4 × 4 × 4 × 4 × 4 × 4 × 4 × 4 = 4^8
$P = \frac{\binom{8}{2}\binom{6}{2}\binom{4}{2}\binom{2}{2}}{4^8}$
EXAMPLE (2-32):
To determine an "odd man out", (n) players each toss a fair coin. If one player's coin turns up
differently from all the others, that person is declared the odd man out. Let (A) be the event
that someone is declared an odd man out.
a- Find P(A)
b- Find the probability that the game is terminated with an odd man out after (k) trials
SOLUTION:
a- $P(A) = \frac{\text{Number of outcomes in event (A)}}{\text{Number of possible sequences}}$
Number of outcomes leading to an odd man out:
(n – 1) Heads and one Tail: (n) outcomes
(n – 1) Tails and one Head: (n) outcomes
[Figure: the (n) sequences of tosses with a single Tail among Heads, and the (n) sequences with a single Head among Tails]
$P(A) = \frac{2n}{2^n} = \frac{n}{2^{n-1}}$
With an odd man out, a success is obtained and the game is over.
b- A second trial is needed when the experiment ends with a failure:
P(a second trial is needed) = 1 – P(A)
For (k) trials, with (k – 1) failures (F) followed by a success (S):
$P(F F F \dots F S) = P(F)^{k-1}\,P(S)$
$P(F F F \dots F S) = [1 - P(A)]^{k-1}\,P(A)$
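The count behind P(A) in Example (2-32) can be confirmed by enumerating all 2^n toss sequences. The sketch below assumes n ≥ 3 players, so that "one coin differs from all the others" means exactly one Head or exactly one Tail; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def p_odd_man_out(n):
    """P(A) by brute force: exactly one Head or exactly one Tail among n tosses."""
    hits = 0
    for tosses in product("HT", repeat=n):
        heads = tosses.count("H")
        if heads == 1 or heads == n - 1:
            hits += 1
    return Fraction(hits, 2 ** n)

def p_game_ends_on_trial(n, k):
    """(k - 1) failures followed by a success: [1 - P(A)]^(k-1) * P(A)."""
    p = p_odd_man_out(n)
    return (1 - p) ** (k - 1) * p

for n in range(3, 7):
    print(n, p_odd_man_out(n), Fraction(n, 2 ** (n - 1)))
```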
Combination:
In a permutation, the order of the selected objects is essential. In contrast, a combination of
given objects means any selection of one or more objects without regard to order.
The number of combinations of (n) different objects, taken (k) at a time, without repetition is
the number of sets that can be made up from the (n) given objects, each set containing (k)
different objects and no two sets containing exactly the same (k) objects.
The number is:
$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$
Note that arranging (k) objects selected from (n) is the same as first selecting (k) objects
from (n) and then arranging the (k) selected objects:
$N = \binom{n}{k}\,k!$ , where $N = \frac{n!}{(n-k)!}$
$\binom{n}{k} = \frac{N}{k!} = \frac{n!}{k!\,(n-k)!}$
EXAMPLE (2-33):
From four persons (set of elements), how many committees (subsets) of two members
(elements) may be chosen?
SOLUTION:
Let the persons be identified by the initials A, B, C and D
Subsets: (A , B) , (A , C) , (A , D) , (B , C) , (B , D) , (C , D)
$N = \binom{4}{2} = \frac{4!}{2!\,(4-2)!} = 6$
Missing sequences: (A , A) , (B , B) , (C , C) , (D , D) (repetition is not allowed)
Missing sequences: (B , A) , (C , A) , (D , A)
(C , B) , (D , B) , (D , C) (order is not important)
EXAMPLE (2-34):
Consider the rolling of a die twice, how many pairs of numbers can be formed for each case?
SOLUTION:
n = 6 and k = 2
Case I: Permutation
a- With repetition
N = n^k = 6^2 = 36
[Table: ordered pairs (D1, D2) with the first die D1 as rows and the second die D2 as columns; the row D1 = 1 reads (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)]
EXAMPLE (2-35):
In how many ways can we arrange 5 balls numbered 1 to 5 in 10 baskets each of which can
accommodate one ball?
SOLUTION:
The number of ways $(N) = \frac{n!}{(n-k)!} = \frac{10!}{(10-5)!} = \frac{10!}{5!}$
NOTE:
If we remove the numbers of the balls so that the balls are no longer distinguishable, then:
The number of ways $= \binom{n}{k} = \frac{n!}{k!\,(n-k)!} = \frac{10!}{5!\,(10-5)!} = \frac{10!}{5!\,5!}$
This is because the permutation within the 5 balls is no longer needed.
elements in the set is given by the binomial coefficient. Suppose, for example, that we have k
ones and (n-k) zeros to be arranged in a row, then the number of binary numbers that can be
formed is $\binom{n}{k}$ . If n = 4 and k = 1, then the possible binary numbers are (0001, 0010, 0100,
1000).
Exercise: How many different binary numbers of five digits can be formed from the numbers
1, 0? List these numbers.
Exercise: How many different binary numbers of five digits can be formed from the numbers
1, 0 such that each number contains two ones? List these numbers.
Exercise: In how many ways can a group of five persons be seated in a row of 10 chairs?
SINGLE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS CHAPTER III
CHAPTER II
Definition:
A real-valued function whose domain is the sample space is called a random variable (r.v).
- The random variable is given an uppercase letter X, Y, Z, … while the values assumed by this
random variable are given lowercase letters x, y, z, …
- The whole idea behind the r.v is a mapping from the sample space into the real line
via the mapping function X(s). The mapping need not be one to one: several outcomes may be
mapped to the same value.
- Associated with each discrete r.v (X) is a Probability Mass Function P(X = x). This mass
function is the sum of all probabilities associated with the outcomes in the sample space that
get mapped into (x) by the mapping function (random variable X).
- Associated with each continuous r.v (X) is a Probability Density Function (pdf) fX(x). This fX(x)
is not the probability that the random variable (X) takes on the value (x); rather, fX(x) is a
continuous curve having the property that:
$P(a \le X \le b) = \int_{a}^{b} f_X(x)\,dx$
Definition:
The cumulative distribution function of a r.v (X) defined on a sample space (S) is given by:
FX(x) = P{X ≤ x}
- Properties of FX(x)
1- FX(– ∞) = 0
2- FX(∞) = 1
3- 0 ≤ FX(x) ≤ 1
4- FX(x1) ≤ FX(x2) if x1 ≤ x2
5- FX(x+) = FX(x) ; the function is continuous from the right
6- P{x1 < X ≤ x2} = FX(x2) – FX(x1)
EXAMPLE (3-1):
A chance experiment has two possible outcomes, a success with probability 0.75 and a failure
with probability 0.25. Mapping function (random variable X) is defined as:
x = 1 if outcome is a success
x = 0 if outcome is a failure
SOLUTION:
P(X < 0) = 0 ; P(X ≤ 0) = 0.25 ; P(X < 1) = 0.25 ; P(X ≤ 1) = 1
[Figure: outcomes F and S of the sample space mapped to x = 0 and x = 1 on the real line. Probability Mass Function P(X = x): 0.25 at x = 0 and 0.75 at x = 1. Cumulative Distribution Function P(X ≤ x): 0.25 from x = 0, stepping to 1.0 at x = 1]
EXAMPLE (3-2):
Let the above experiment be conducted three times in a row.
a- Find the sample space.
b- Define a random variable (X) as X = number of successes in the three trials.
c- Find the probability mass function P(X = x).
d- Find the cumulative distribution function FX(x) = P{X ≤ x}
SOLUTION:
In the table below we show the possible outcomes and the mapping process:
Sample Outcome    P(si)               x    P(X = x)
F F F             (0.25)^3            0    (0.25)^3 = 0.015625
F F S             (0.75) (0.25)^2
S F F             (0.75) (0.25)^2     1    3 × (0.75) (0.25)^2 = 0.140625
F S F             (0.75) (0.25)^2
S S F             (0.75)^2 (0.25)
S F S             (0.75)^2 (0.25)     2    3 × (0.75)^2 (0.25) = 0.421875
F S S             (0.75)^2 (0.25)
S S S             (0.75)^3            3    (0.75)^3 = 0.421875
[Figure: the eight sample outcomes mapped onto x = 0, 1, 2, 3 on the real line. Probability Mass Function: P(X = 0) = 0.015625, P(X = 1) = 0.140625, P(X = 2) = 0.421875, P(X = 3) = 0.421875. Cumulative Distribution Function: P(X = 0) = 0.015625 at x = 0, P(0) + P(1) = 0.15625 at x = 1, P(0) + P(1) + P(2) = 0.578125 at x = 2, and 1 at x = 3]
Binomial Distribution: (n) = number of trials ; (x) number of successes in (n) trials
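The table of Example (3-2) can be rebuilt by enumerating the outcomes and adding the probabilities that map to each value of x. For comparison, the sketch also evaluates the standard binomial formula C(n, x) p^x (1-p)^(n-x), which is the usual closed form for this situation; the function names are illustrative.

```python
from itertools import product
from math import comb

def pmf_by_enumeration(n, p):
    """Add up P(si) over all S/F sequences with the same number of successes."""
    pmf = [0.0] * (n + 1)
    for outcome in product("SF", repeat=n):
        successes = outcome.count("S")
        pmf[successes] += p ** successes * (1 - p) ** (n - successes)
    return pmf

def binomial_pmf(n, p):
    """Standard binomial formula for x = 0, 1, ..., n."""
    return [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

def cdf_from_pmf(pmf):
    """Running sums P(X <= x) for x = 0, 1, ..., n."""
    running, values = 0.0, []
    for p in pmf:
        running += p
        values.append(running)
    return values

pmf = pmf_by_enumeration(3, 0.75)
print(pmf)
print(binomial_pmf(3, 0.75))
print(cdf_from_pmf(pmf))
```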
EXAMPLE (3-3):
Suppose that 5 people including you and your friend line up at random. Let (X) denote the
number of people standing between you and your friend. Find the probability mass function
for the random variable (X).
SOLUTION:
Number of different ways by which the 5 people can arrange themselves = 5!
This is the total number of points in the sample space.
Let (A) denote you,
(B) denote your friend.
The random variable (X) assumes four possible values 0, 1, 2, 3 as shown below:
(X = 0): A B O O O ; O A B O O ; O O A B O ; O O O A B
(X = 1): A O B O O ; O A O B O ; O O A O B
(X = 2): A O O B O ; O A O O B
(X = 3): A O O O B
Any sequence similar to those shown can be arranged in 2! × 3! ways (2! for you and your
friend, 3! for the other people).
$P(X = 0) = \frac{4 \times 2! \times 3!}{5!} = 0.4$
$P(X = 1) = \frac{3 \times 2! \times 3!}{5!} = 0.3$
$P(X = 2) = \frac{2 \times 2! \times 3!}{5!} = 0.2$
$P(X = 3) = \frac{1 \times 2! \times 3!}{5!} = 0.1$
[Figure: probability mass function fX(x) with values 0.4, 0.3, 0.2, 0.1 at x = 0, 1, 2, 3]
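The probability mass function of Example (3-3) can be checked by listing all 5! line-ups and counting the people standing between the two friends. In this sketch, person 0 plays "you" and person 1 plays "your friend"; that labelling and the function name `between_pmf` are choices made for the illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

def between_pmf(n_people=5):
    """PMF of the number of people standing between person 0 and person 1."""
    counts = Counter()
    total = 0
    for lineup in permutations(range(n_people)):
        gap = abs(lineup.index(0) - lineup.index(1)) - 1
        counts[gap] += 1
        total += 1
    return {x: Fraction(c, total) for x, c in sorted(counts.items())}

print(between_pmf())
```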
EXAMPLE (3-4):
Let (X) have the pdf: fX(x) = 0.75 (1 – x^2) ; {-1 ≤ x ≤ 1}
1- Verify that fX(x) is indeed a valid pdf.
2- Find:
a- FX(x)
b- $P\{-\frac{1}{2} \le X \le \frac{1}{2}\}$
[Figure: Probability Density Function fX(x), with peak value 0.75 at x = 0, plotted for x between -1 and 1]
SOLUTION:
1- $\int_{-\infty}^{\infty} f_X(x)\,dx = 2 \int_{0}^{1} 0.75\,(1 - x^2)\,dx = 2 \left[ 0.75\,x - 0.75\,\frac{x^3}{3} \right]_{0}^{1} = 2\,(0.75 - 0.25) = 1.0$
2-a) $F_X(x) = \int_{-1}^{x} 0.75\,(1 - u^2)\,du = 0.5 + 0.75x - 0.25x^3$ ; -1 ≤ x ≤ 1
2-b) $P\{-\frac{1}{2} \le X \le \frac{1}{2}\} = \int_{-1/2}^{1/2} 0.75\,(1 - u^2)\,du = F_X(\frac{1}{2}) - F_X(-\frac{1}{2}) = 0.6875$
[Figure: Cumulative Distribution Function FX(x), rising from 0 at x = -1 through 0.5 at x = 0 to 1.0 at x = 1]
EXERCISE:
SOLUTION:
$P\{X \le x_0\} = 0.5 + 0.75x_0 - 0.25x_0^3 = 0.95 \Rightarrow 3x_0 - x_0^3 = 1.8 \Rightarrow x_0 \approx 0.73$
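The cumulative distribution function of Example (3-4), FX(x) = 0.5 + 0.75x – 0.25x^3 on [-1, 1], can be used both for interval probabilities and for solving FX(x0) = 0.95 as in the exercise. The sketch below solves for x0 by bisection, which is one simple root-finding choice and not necessarily the method intended in the notes; the function names are illustrative.

```python
def cdf(x):
    """F_X(x) for the pdf 0.75(1 - x^2) on [-1, 1]."""
    if x < -1:
        return 0.0
    if x > 1:
        return 1.0
    return 0.5 + 0.75 * x - 0.25 * x ** 3

def quantile(q, lo=-1.0, hi=1.0, tol=1e-10):
    """Find x0 with cdf(x0) = q by bisection (cdf is increasing on [-1, 1])."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(cdf(0.5) - cdf(-0.5))   # P{-1/2 <= X <= 1/2}
print(quantile(0.95))         # x0 with P{X <= x0} = 0.95, close to 0.73
```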
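- Definition:
The variance of a random variable (X) is defined as:
$\sigma_X^2 = E\{(X - \mu_X)^2\} = \sum_{i} (x_i - \mu_X)^2\,P(X = x_i)$ if X is discrete
$\sigma_X^2 = E\{(X - \mu_X)^2\} = \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx$ if X is continuous
- Definition:
For any random variable (X) and any continuous function Y = g (X), the expected value of
g(X) is defined as:
$E\{g(X)\} = \sum_{i} g(x_i)\,P(X = x_i)$ if X is discrete
$E\{g(X)\} = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx$ if X is continuous
- Theorem:
Let (X) be a random variable with mean $\mu_X$ , then:
$\sigma_X^2 = E(X^2) - \mu_X^2$
Proof:
$\sigma_X^2 = E\{(X - \mu_X)^2\} = \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx$
$\sigma_X^2 = \int_{-\infty}^{\infty} (x^2 - 2x\mu_X + \mu_X^2)\,f_X(x)\,dx$
$\sigma_X^2 = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx - 2\mu_X \int_{-\infty}^{\infty} x f_X(x)\,dx + \mu_X^2 \int_{-\infty}^{\infty} f_X(x)\,dx$
$\sigma_X^2 = E(X^2) - 2\mu_X \mu_X + \mu_X^2$
$\sigma_X^2 = E(X^2) - \mu_X^2$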
Illustration:
1. The center of mass for a system of particles of masses m1, m2 ..., mn placed at x1, x2 …, xn is:
$x_{cm} = \frac{1}{\sum m_i} (x_1 m_1 + x_2 m_2 + \dots + x_n m_n)$
If we let m1 = p1, m2 = p2, …, then:
$x_{cm} = x_1 p_1 + x_2 p_2 + \dots + x_n p_n$ (The mean of a discrete distribution)
2. If ρ(x) is the density of a rigid body along the x-axis, then the center of mass is:
$x_{cm} = \frac{1}{M} \int x\,\rho(x)\,dx$ , where $M = \int \rho(x)\,dx$
Again, if ρ(x) is replaced by fX(x), the pdf function, then:
$x_{cm} = \int x\,f_X(x)\,dx$ is the mean of a continuous distribution.
Moment of Inertia:
3. If the particles in (1) above rotate with angular velocity (w), then the moment of inertia is
evaluated as:
$I = \sum_{i=1}^{n} m_i x_i^2$
With mi replaced by pi, we have:
$I = \sum_{i=1}^{n} p_i x_i^2$
4. If the rigid body in (2) rotates with angular velocity (w), then:
$I = \int x^2 \rho(x)\,dx$ , which corresponds to $E(x^2) = \int x^2 f_X(x)\,dx$
5. The variance $E\{(X - \mu_X)^2\}$ parallels the moment of inertia about the center of mass.
“Recall the parallel axis theorem”
I = Icm + M h^2
$E(x^2) = \sigma^2 + E^2(x)$
EXAMPLE (3-5):
In the kinetic theory of gases, the distance (x) that a molecule travels between collisions is
described by the exponential density function
$f_X(x) = \frac{1}{\lambda} e^{-x/\lambda}$ ; x ≥ 0
a- The mean free path, defined as the average distance between collisions, is calculated as:
$\text{Mean Free Path} = \mu_X = E\{X\} = \int_{0}^{\infty} x\,f_X(x)\,dx = \int_{0}^{\infty} x\,\frac{1}{\lambda}\,e^{-x/\lambda}\,dx = \lambda$
$\lambda = \frac{1}{\sqrt{2}\,\pi\,d^2\,N/V}$
where (N/V) is the number of molecules per unit volume and (d) is the molecular diameter.
b- If the average speed of a molecule is ν m/s, then the average collision rate is
$\text{Rate} = \frac{\nu}{\lambda}$
EXAMPLE (3-6):
Maxwell’s Distribution Law:
The speed of gas molecules follows the distribution:
$f(v) = 4\pi \left( \frac{M}{2\pi R T} \right)^{3/2} v^2\, e^{-M v^2 / (2RT)}$ ; v ≥ 0
[Figure: plot of f(v) against v]
where v is the molecular speed
T is the gas temperature in Kelvin
R is the gas constant (8.31 J/mol.K)
M is the molar mass of the gas
a- Find the average speed, $\bar{v}$
b- Find the root mean square speed vrms
c- Find the most probable speed
SOLUTION:
a- $\bar{v} = E(v) = \int_{0}^{\infty} v\,f(v)\,dv = \sqrt{\frac{8RT}{\pi M}}$
b- $E\{v^2\} = v_{rms}^2 = \int_{0}^{\infty} v^2 f(v)\,dv = \frac{3RT}{M}$ ; $v_{rms} = \sqrt{E(v^2)} = \sqrt{\frac{3RT}{M}}$
c- The most probable speed is the speed at which f(v) attains its maximum value.
Therefore, we differentiate f(v) with respect to (v), set the derivative to zero and solve
for the maximum. The result is:
$\text{Most probable speed} = \sqrt{\frac{2RT}{M}}$
Note: vrms is the square Root of the Mean, E(.), of the Square v^2, i.e. $v_{rms} = \sqrt{E(v^2)}$
Exercise
The radial probability density function for the ground state of the hydrogen atom (the pdf of
the electron's distance from the center of the atom) is given by
$f(r) = \frac{4}{a^3}\,r^2\,e^{-2r/a}$ for r > 0
where a is the Bohr radius (a = 52.9 pm).
a. At what distance from the center of the atom is the electron most likely to be
found?
b. Find the average value of r (the mean distance of the electron from the center of the
atom).
c. What is the probability that the electron will be found within a sphere of radius a
centered at the origin?
- Theorem:
Let (X) be a random variable with mean $\mu_X$ and variance $\sigma_X^2$ .
Define Y = aX + b ; (a) and (b) are real constants, then:
$\mu_Y = a\,\mu_X + b$ ………… (a)
$\sigma_Y^2 = a^2 \sigma_X^2$ ………… (b)
Proof:
a- $\mu_Y = E\{aX + b\} = \int_{-\infty}^{\infty} (ax + b)\,f_X(x)\,dx = a \int_{-\infty}^{\infty} x\,f_X(x)\,dx + b \int_{-\infty}^{\infty} f_X(x)\,dx \Rightarrow \mu_Y = a\,\mu_X + b$
b- $\sigma_Y^2 = E\{(Y - \mu_Y)^2\} = E\{[(aX + b) - (a\mu_X + b)]^2\} = E\{[a(X - \mu_X)]^2\} = a^2 E\{(X - \mu_X)^2\} \Rightarrow \sigma_Y^2 = a^2 \sigma_X^2$
EXAMPLE (3-7):
Find the mean and the variance of the binomial distribution considered earlier (Example 3-2)
with n = 3 and P(S) = 0.75
SOLUTION:
Mean $= \mu_X = E\{X\} = \sum_i x_i\, P(X = x_i)$
x    P(X = x)    x . P(X = x)
0    0.015625    0
1    0.140625    0.140625
2    0.421875    0.843750
3    0.421875    1.265625
∑                2.25
$E\{X^2\} = \sum_i x_i^2\, P(X = x_i)$
x    x2    P(X = x)    x2 . P(X = x)
0    0     0.015625    0
1    1     0.140625    0.140625
2    4     0.421875    1.687500
3    9     0.421875    3.796875
∑                      5.625
Variance $= \sigma_X^2 = E\{X^2\} - [E\{X\}]^2 = 5.625 - (2.25)^2 = 0.5625$
EXAMPLE (3-8):
Find the mean and the variance of the uniform distribution shown in the figure.
SOLUTION:
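[Figure: uniform pdf fX(x) of height 1/(b - a) between x = a and x = b]
Mean $= \mu_X = E\{X\} = \int_{-\infty}^{\infty} x\, f_X(x)\, dx$
$\mu_X = \int_a^b x\, \frac{1}{b-a}\, dx = \frac{a+b}{2}$
Var(X) $= \sigma_X^2 = E(X^2) - [E(X)]^2$
$E\{X^2\} = \int_a^b x^2\, \frac{1}{b-a}\, dx = \frac{b^3 - a^3}{3(b-a)} = \frac{a^2 + ab + b^2}{3}$
$\sigma_X^2 = \frac{a^2 + ab + b^2}{3} - \left(\frac{a+b}{2}\right)^2 = \frac{(b-a)^2}{12}$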
EXAMPLE (3-9):
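Let $Z = \frac{X - \mu_X}{\sigma_X}$ (standardized r.v.). Show that the mean of (Z) is zero and the variance is 1.
SOLUTION:
Z can be written as: $Z = \frac{X}{\sigma_X} - \frac{\mu_X}{\sigma_X} = aX + b$, with $a = 1/\sigma_X$ and $b = -\mu_X/\sigma_X$
Mean $= \mu_Z = E\{Z\} = \frac{1}{\sigma_X} E\{X - \mu_X\} = \frac{1}{\sigma_X} \{E(X) - \mu_X\} = 0$
Var(Z) $= \sigma_Z^2 = \frac{1}{\sigma_X^2}\, \sigma_X^2 = 1$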
EXAMPLE (3-10):
Let X be a discrete random variable with the following pmf: P (X = 0) = 0.4 , P (X = 1) = 0.3
P (X = 2) = 0.2 , P (X = 3) = 0.1 . Find the mean and variance of X.
SOLUTION:
$\mu_X = E\{X\} = \sum_i x_i\, P(X = x_i)$ = 0(0.4) + 1(0.3) + 2(0.2) + 3(0.1) = 1
$E\{X^2\} = \sum_i x_i^2\, P(X = x_i)$ = 0(0.4) + 1(0.3) + (2)^2(0.2) + (3)^2(0.1) = 2
Variance $= \sigma_X^2 = E(X^2) - [E(X)]^2$ = 2 - 1 = 1
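The mean and variance of a discrete random variable can be computed directly from its pmf. The following is a minimal sketch; the helper name `mean_and_variance` is introduced here for illustration, and the two pmfs are the ones used in Examples (3-10) and (3-7).

```python
def mean_and_variance(pmf):
    """Return (mean, variance) of a discrete r.v. given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    second_moment = sum(x * x * p for x, p in pmf.items())
    return mean, second_moment - mean ** 2

pmf_3_10 = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}
pmf_3_7 = {0: 0.015625, 1: 0.140625, 2: 0.421875, 3: 0.421875}

mean_a, var_a = mean_and_variance(pmf_3_10)
mean_b, var_b = mean_and_variance(pmf_3_7)
print(mean_a, var_a)
print(mean_b, var_b)
```

For the binomial pmf of Example (3-7) the results agree with n p = 2.25 and n p (1 - p) = 0.5625.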
- Definition:
If a random variable (X) has a pdf fX(x), then the value of (x) for which fX(x) is maximum is
called the mode of the distribution.
EXAMPLE (3-11):
Find the median and the mode for the random variable X with pdf: $f_X(x) = 2x\, e^{-x^2}$ , x > 0
SOLUTION:
The median is a point ($x_0$) such that
$\int_0^{x_0} 2x\, e^{-x^2} dx = \int_{x_0}^{\infty} 2x\, e^{-x^2} dx = \frac{1}{2}$
($x_0$) is the solution to $e^{-x_0^2} = 0.5$, which results in $x_0 = \sqrt{\ln 2} \approx 0.8326$
To find the mode we differentiate $f_X(x)$ with respect to x and set the derivative to zero:
$\frac{d f_X(x)}{dx} = 2 e^{-x^2} - 4x^2 e^{-x^2} = 0$ , the solution of which is $x = 1/\sqrt{2}$.
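The median and mode of Example (3-11) can also be located numerically. This is a sketch using bisection: the median is the root of F(x) - 0.5 and the mode is the root of the slope of the pdf. The helper names and the search intervals are choices made here for illustration.

```python
import math

def pdf(x):
    return 2.0 * x * math.exp(-x * x)

def cdf(x):
    # Integral of the pdf from 0 to x, available in closed form for this pdf.
    return 1.0 - math.exp(-x * x)

def bisect_root(g, lo, hi, tol=1e-12):
    """Find a root of g in [lo, hi], assuming g(lo) and g(hi) differ in sign."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (g(lo) < 0) == (g(mid) < 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pdf_slope(x, h=1e-6):
    # Central-difference estimate of the derivative of the pdf.
    return (pdf(x + h) - pdf(x - h)) / (2.0 * h)

median = bisect_root(lambda x: cdf(x) - 0.5, 0.0, 5.0)
mode = bisect_root(pdf_slope, 0.1, 2.0)
print(median, mode)
```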
$\sigma_X^2 = \mathrm{Var}(X) = n\, p\, (1 - p)$
Proof:
The summation on the right hand side equals 1 since this is the summation of probabilities of a
binomial distribution with parameters m and p. The mean value of X is then
$\mu_X = n\, p$
This result simply states that the mean value of a binomial random variable with parameters
(n, p) equals the number of the times the experiment is repeated times the probability of a
success on each trial.
EXAMPLE (3-12):
Suppose that the probability that any particle emitted by a radioactive material will penetrate
a certain shield is 0.02. If 10 particles are emitted, find the probability that:
EXAMPLE (3-13):
Consider the parallel system shown in the figure. The system fails if at least three of the five
machines making up the system fail. Find the reliability of the system assuming that the
probability of failure of each unit is 0.1 over a given period of time.
SOLUTION:
Let (X) be the number of machines in failure.
(X) has a binomial distribution.
P(system fails) = P(number of machines in failure ≥ 3)
= P(X ≥ 3)
$= \binom{5}{3} p^3 (1-p)^2 + \binom{5}{4} p^4 (1-p) + \binom{5}{5} p^5$
P(system fails) = 0.00856 ; when p = 0.1
Reliability = 1 – P(Failure) = 0.99144
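The reliability calculation of Example (3-13) is a binomial tail sum. A short sketch follows; the function names are introduced here for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial r.v. with parameters (n, p)."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def prob_at_least(k_min, n, p):
    """P(X >= k_min) for a binomial r.v. with parameters (n, p)."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))

p_system_fails = prob_at_least(3, 5, 0.1)
reliability = 1.0 - p_system_fails
print(p_system_fails, reliability)
```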
EXAMPLE (3-14):
The process of manufacturing screws is checked every hour by inspecting 10 screws selected
at random from the hour’s production. If one or more screws are found defective, the
production process is halted and carefully examined. Otherwise the process continues. From
past experience it is known that 1% of the screws produced are defective. Find the probability
that the process is not halted.
SOLUTION:
Let (X) be the number of defective items in the sample.
P(process is not halted) = P(X = 0) = P(number of defective items is zero)
$= \binom{10}{0} p^0 (1-p)^{10-0} = \binom{10}{0} (0.01)^0 (0.99)^{10-0} = (0.99)^{10} = 0.9043$
EXAMPLE (3-15):
Thirty students in a class compare birthdays. What is the probability that:
a- 5 of the students have their birthday in January?
b- 5 of the students have their birthday on January 1st?
c- At least one student is born in January?
SOLUTION:
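a- P(success) $= \frac{1}{12}$ ; P(failure) $= \frac{11}{12}$ , Number of trials (n) = 30
Required number of successes (k) = 5
P(5 successes in 30 trials) $= \binom{n}{k} p^k (1-p)^{n-k} = \binom{30}{5} \left(\frac{1}{12}\right)^5 \left(\frac{11}{12}\right)^{30-5} = \binom{30}{5} \left(\frac{1}{12}\right)^5 \left(\frac{11}{12}\right)^{25}$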
EXAMPLE (3-16):
The captain of a navy gunboat orders a volley of 25 missiles to be fired at random along
a 500-foot stretch of shoreline that he hopes to establish as a beach head. Dug into the beach
is a 30-foot long bunker serving as the enemy's first line of defense.
a. What is the probability that exactly three shells will hit the bunker?
b. Find the number of shells expected to hit the bunker.
[Figure: a 500 ft stretch of shoreline containing a 30 ft bunker]
SOLUTION:
a. P(success) $= \frac{30}{500} = 0.06$
P(3 successes in 25 shells) $= \binom{n}{k} p^k (1-p)^{n-k}$
For p = 0.06 and n = 25
P(3 successes in 25 shells) $= \binom{25}{3} (0.06)^3 (1-0.06)^{25-3} = \binom{25}{3} (0.06)^3 (0.94)^{22}$
b. E(X) = n p = 25(0.06) = 1.5.
$= (1-p)^{x-1}\, p$ ; x = 1, 2, 3, ……
- Theorem:
The mean and the variance of (X) are:
$\mu_X = E(X) = \frac{1}{p}$
$\sigma_X^2 = \mathrm{Var}(X) = \frac{1-p}{p^2}$
EXAMPLE (3-17):
Let the probability of occurrence of a flood of magnitude greater than a critical magnitude in
a given year be 0.02. Assuming that floods occur independently, determine the “return period”
defined as the average number of years between floods.
SOLUTION:
(X) has a geometric distribution with p = 0.02
$\mu_X = E(X) = \frac{1}{p} = \frac{1}{0.02} = 50$ years
EXAMPLE (3-18):
Show that the mean value of the geometric distribution is $1/p$ and the variance is $\sigma_X^2 = \frac{1-p}{p^2}$,
where p is the probability of a success.
SOLUTION:
$\mu_X = E(X) = \sum_{x=1}^{\infty} x\, p\, (1-p)^{x-1}$
$= p\,\{1 + 2(1-p) + 3(1-p)^2 + 4(1-p)^3 + \dots\}$
Hyper-geometric Distribution
Consider the sampling without replacement of a lot of (N) items, (k) of which are of one type
and (N – k) of a second type. The probability of obtaining (x) items in a selection of (n) items
without replacement obeys the hyper-geometric distribution:
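$P(X = x) = \frac{\binom{k}{x} \binom{N-k}{n-x}}{\binom{N}{n}}$
x = 0 , 1 , 2 , ……… , min(n , k)
[Figure: a population of (N) objects, k of Type I and (N – k) of Type II; a sample of size (n) containing x of Type I and (n – x) of Type II]
NOTE:
$p = \frac{k}{N}$ is the ratio of items of type (I) to the total population.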
- Theorem:
The mean and the variance of the hyper-geometric random variable are:
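$\mu_X = E(X) = n \frac{k}{N} = n\, p$
$\sigma_X^2 = \mathrm{Var}(X) = \frac{n\, k\, (N-k)(N-n)}{N^2 (N-1)} = n\, \frac{k}{N} \cdot \frac{N-k}{N} \cdot \frac{N-n}{N-1}$
$= n\, \frac{k}{N} \left(1 - \frac{k}{N}\right) \frac{N-n}{N-1} = n\, p\, (1-p)\, \frac{N-n}{N-1}$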
EXAMPLE (3-19):
Fifty small electric motors are to be shipped. But before such a shipment is accepted, an
inspector chooses 5 of the motors randomly and inspects them. If none of these tested motors
are defective, the lot is accepted. If one or more are found to be defective, the entire shipment
is inspected. Suppose that there are, in fact, three defective motors in the lot. What is the
probability that the entire shipment is inspected?
SOLUTION:
Let (X) be the number of defective motors found, then (X) assumes the values (0 , 1 , 2 , 3).
Theorem:
For large (N), one can use the approximation:
$P(X = x) \approx \binom{n}{x} P^x (1 - P)^{n-x}$ ; $P = \frac{k}{N}$
This approximation gives very good results if $\frac{n}{N} \le 0.1$. For the example above:
$P = \frac{3}{50} = 0.06$ ; $P(X = 0) \approx \binom{5}{0} (0.06)^0 (1 - 0.06)^{5-0} = 0.734$
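The quality of the binomial approximation can be checked against the exact hyper-geometric probability for the motor inspection problem of Example (3-19) (N = 50, k = 3 defective, n = 5 sampled). The sketch below is one way to do this; the function names are introduced here for illustration.

```python
from math import comb

def hypergeom_pmf(x, N, k, n):
    """Exact P(X = x): x items of type I in a sample of n drawn without replacement."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def binomial_pmf(x, n, p):
    """Binomial approximation with p = k / N."""
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)

N, k, n = 50, 3, 5
exact_p0 = hypergeom_pmf(0, N, k, n)      # lot accepted, exact
approx_p0 = binomial_pmf(0, n, k / N)     # lot accepted, approximation
p_full_inspection = 1.0 - exact_p0        # one or more defective motors found
print(exact_p0, approx_p0, p_full_inspection)
```

Here n/N = 0.1, which is at the edge of the stated rule of thumb, and the two values differ by about 0.01.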
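$\sum_{x=0}^{\infty} e^{-b}\, \frac{b^x}{x!} = e^{-b} \sum_{x=0}^{\infty} \frac{b^x}{x!}$
The summation on the right side is easily recognized as the power series expansion of $e^{b}$.
Therefore,
$\sum_{x=0}^{\infty} e^{-b}\, \frac{b^x}{x!} = e^{-b}\, e^{b} = 1$ .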
- Theorem:
If (X) is a Poisson r.v with parameter (b), then its mean and variance are:
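$\mu_X = E(X) = b$
$\sigma_X^2 = \mathrm{Var}(X) = b$
Proof:
First we find the mean value of X. The method we follow is quite similar to the one used to
find the mean and variance of the binomial distribution.
$E(X) = \sum_{x=0}^{\infty} x\, e^{-b}\, \frac{b^x}{x!} = \sum_{x=1}^{\infty} x\, e^{-b}\, \frac{b^x}{x!}$
$= \sum_{x=1}^{\infty} x\, e^{-b}\, \frac{b^x}{x\,(x-1)!} = \sum_{x=1}^{\infty} e^{-b}\, \frac{b^x}{(x-1)!}$
Let u = x - 1 (or x = u + 1), and change the index of the summation from x to u. The result is
$E(X) = \sum_{u=0}^{\infty} e^{-b}\, \frac{b^{u+1}}{u!} = b \sum_{u=0}^{\infty} e^{-b}\, \frac{b^u}{u!}$
As was shown earlier, the summation on the right side equals 1. Therefore,
$E(X) = b$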
which completes the proof.
But, $E(X(X-1)) = E(X^2) - E(X)$
Or, $E(X^2) = E(X(X-1)) + E(X)$
The variance of X can, therefore, be obtained as
$\sigma_X^2 = E(X^2) - \mu_X^2 = E(X(X-1)) + E(X) - \mu_X^2$
$\sigma_X^2 = b^2 + b - b^2 = b$
- Poisson Process:
Consider a counting process in which events occur at a rate of λ occurrences per unit time.
Let X(t) be the number of occurrences recorded in the interval (0 , t). We define the Poisson
process by the following assumptions:
1- X(0) = 0 , i.e., we begin the counting at time t = 0.
2- For non-overlapping time intervals (0 , t1) , (t2 , t3), the numbers of occurrences {X(t1) – X(0)}
and {X(t3) – X(t2)} are independent.
3- The probability distribution of the number of occurrences in any time interval depends only on
the length of that interval.
4- The probability of an occurrence in a small time interval (Δt) is approximately (λ Δt).
[Figure: time axis marked t = 0, t1, t2, t3 with the corresponding counts X(t0), X(t1), X(t2), X(t3)]
Using the above assumptions, one can show that the probability of exactly (x) occurrences in
any time interval of length (T) follows the Poisson distribution:
$P(X = x) = e^{-\lambda T}\, \frac{(\lambda T)^x}{x!}$ ; x = 0 , 1 , 2 , 3 , ………
- Theorem:
Let (b) be a fixed number and (n) any arbitrary positive integer. For each nonnegative integer (x):
$\lim_{n \to \infty} \binom{n}{x} p^x (1-p)^{n-x} = e^{-b}\, \frac{b^x}{x!}$ ; where p = b/n
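The limit in this theorem can be observed numerically by fixing b and x and letting n grow. The values b = 2 and x = 3 below are illustrative choices, not taken from the notes.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)

def poisson_pmf(x, b):
    return exp(-b) * b ** x / factorial(x)

b, x = 2.0, 3   # illustrative values
errors = [abs(binomial_pmf(x, n, b / n) - poisson_pmf(x, b))
          for n in (10, 100, 1000, 10000)]
print(errors)
```

The gap between the binomial and Poisson probabilities shrinks as n increases with p = b/n.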
EXAMPLE (3-21):
Messages arrive to a computer server according to a Poisson distribution with a mean rate
of 10 messages/hour.
a- What is the probability that 3 messages will arrive in one hour.
b- What is the probability that 6 messages will arrive in 30 minutes.
SOLUTION:
a- λ = 10 messages/hour ; T = 1 hour
$P(X = x) = e^{-10 \times 1}\, \frac{(10 \times 1)^x}{x!} = e^{-10}\, \frac{(10)^x}{x!}$ ; x = 0 , 1 , 2 , 3 , ………
$P(X = 3) = e^{-10}\, \frac{(10)^3}{3!}$
b- λ = 10 messages/hour ; T = 0.5 hour
$P(X = x) = e^{-10 \times \frac{1}{2}}\, \frac{(10 \times \frac{1}{2})^x}{x!} = e^{-5}\, \frac{(5)^x}{x!}$ ; x = 0 , 1 , 2 , 3 , ………
$P(X = 6) = e^{-5}\, \frac{(5)^6}{6!}$
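The two probabilities of Example (3-21) can be evaluated with a small helper that takes the rate and the interval length. The function name `poisson_process_prob` is introduced here for illustration.

```python
from math import exp, factorial

def poisson_process_prob(x, rate, duration):
    """P(exactly x occurrences) in an interval of the given duration."""
    mean_count = rate * duration
    return exp(-mean_count) * mean_count ** x / factorial(x)

p_three_in_one_hour = poisson_process_prob(3, 10.0, 1.0)
p_six_in_half_hour = poisson_process_prob(6, 10.0, 0.5)
print(p_three_in_one_hour, p_six_in_half_hour)
```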
EXAMPLE (3-22):
The number of cracks in a section of a highway that are significant enough to require repair is
assumed to follow a Poisson distribution with a mean of two cracks per mile.
a- What is the probability that there are no cracks in 5 miles of highway?
b- What is the probability that at least one crack requires repair in ½ mile of highway?
c- What is the probability that there is at least one crack in 5 miles of highway?
SOLUTION:
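a- λ = 2 cracks/mile ; T = 5 miles
$P(X = x) = e^{-2 \times 5}\, \frac{(2 \times 5)^x}{x!} = e^{-10}\, \frac{(10)^x}{x!}$ ; x = 0 , 1 , 2 , 3 , ………
$P(X = 0) = e^{-10}$
b- λ = 2 cracks/mile ; T = 0.5 mile
$P(X = x) = e^{-2 \times \frac{1}{2}}\, \frac{(2 \times \frac{1}{2})^x}{x!} = e^{-1}\, \frac{(1)^x}{x!} = \frac{e^{-1}}{x!}$ ; x = 0 , 1 , 2 , 3 , ………
$P(X \ge 1) = \sum_{x=1}^{\infty} \frac{e^{-1}}{x!} = 1 - P(X = 0) = 1 - e^{-1}$
c- Using the result of part (a): $P(X \ge 1) = 1 - P(X = 0) = 1 - e^{-10}$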
EXAMPLE (3-23):
Given 1000 transmitted bits, find the probability that exactly 10 will be in error. Assume that
the bit error probability is $\frac{1}{365}$.
SOLUTION:
X: random variable representing number of bits in error.
Exact solution:
P(bit error) $= \frac{1}{365}$ ; Number of trials (n) = 1000
Required number of bits in error (k) = 10
$P(X = 10) = \binom{n}{k} p^k (1-p)^{n-k} = \binom{1000}{10} \left(\frac{1}{365}\right)^{10} \left(\frac{364}{365}\right)^{990}$
$f_X(x) = \lambda\, e^{-\lambda x}$ ; x ≥ 0
The cumulative distribution function is:
$F_X(x) = 1 - e^{-\lambda x}$ ; x ≥ 0
[Figure: plots of the exponential pdf fX(x) and cdf FX(x) versus x]
The exponential distribution is often used in practical problems to represent the distribution of
the time that elapses before the occurrence of some event. It has been used to represent
such periods of time as the period for which a machine or an electronic component will
operate without breaking down, the period required to take care of a customer at some service
facility, and the period between the arrivals of two successive customers at a facility.
If the event being considered occurs in accordance with a Poisson process, then both the
waiting time until an event will occur and the period of time between any two successive
events will have exponential distribution.
[Figure: occurrences of events marked along the time axis (t), with the waiting times x between successive events]
- Theorem:
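If the random variable (X) has an exponential distribution with parameter (λ), then:
$\mu_X = E(X) = \int_0^{\infty} x\, \lambda\, e^{-\lambda x}\, dx = \frac{1}{\lambda}$ and
$E(X^2) = \int_0^{\infty} x^2\, \lambda\, e^{-\lambda x}\, dx = \frac{2}{\lambda^2}$
$\sigma_X^2 = E(X^2) - E^2(X) = \frac{1}{\lambda^2}$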
Exercise
The number of telephone calls that arrive at a certain office is modeled by a Poisson random
variable. Assume that on the average there are five calls per hour.
a. What is the average (mean) time between phone calls?
b. What is the probability that at least 30 minutes will pass without receiving any
phone call?
c. What is the probability that there are exactly three calls in an observation interval
of two consecutive hours?
d. What is the probability that there is exactly one call in the first hour and exactly two
calls in the second hour of a two-hour observation interval?
EXAMPLE (3-24):
Suppose that the depth of water, measured in meters, behind a dam is described by an
exponential random variable with pdf:
$f_X(x) = \frac{1}{13.5}\, e^{-x/13.5}$ for x > 0, and $f_X(x) = 0$ otherwise
There is an emergency overflow at the top of the dam that prevents the depth from exceeding
40.6 m. There is a pipe placed 32.0 m below the overflow that feeds water to a hydroelectric
generator (turbine).
c- P(generator has water to drive it / water is not wasted) = P(X > 8.6 / X < 40.6)
$= \frac{P(X > 8.6 \cap X < 40.6)}{P(X < 40.6)} = \frac{P(8.6 < X < 40.6)}{P(X < 40.6)} = \frac{\int_{8.6}^{40.6} \frac{1}{13.5}\, e^{-x/13.5}\, dx}{\int_{0}^{40.6} \frac{1}{13.5}\, e^{-x/13.5}\, dx} = 0.504$
$F_X(x) = 1 - e^{-x^2/b}$ ; x ≥ 0
The Rayleigh pdf describes the envelope of white noise when passed through a band pass filter.
It is used in the analysis of errors in various measurement systems.
- Theorem:
$\mu_X = E(X) = \sqrt{\frac{\pi b}{4}}$ and $\sigma_X^2 = \mathrm{Var}(X) = \frac{b(4 - \pi)}{4}$
III. Cauchy Random Variable:
This random variable has:
$f_X(x) = \frac{\alpha/\pi}{x^2 + \alpha^2}$ , $F_X(x) = \frac{1}{2} + \frac{1}{\pi} \tan^{-1}\left(\frac{x}{\alpha}\right)$
Exercise: Prove that the mean and variance of the Rayleigh distribution are as given in the
theorem above.
Exercise: Find the mode and the median of the Rayleigh distribution
Exercise: Find the mean and variance of the Cauchy distribution.
[Figure: normal pdf curves compared for the cases μ1 ≠ μ2 with σ1 = σ2, μ1 = μ2 with σ1 ≠ σ2, and μ1 ≠ μ2 with σ1 ≠ σ2]
- Definition:
A normal random variable with mean zero and variance one is called a standard normal
random variable. A standard normal random variable is denoted as Z.
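[Figure: the normal pdf fX(x), with peak value $1/\sqrt{2\pi\sigma_X^2}$ at x = μX and value $0.607/\sqrt{2\pi\sigma_X^2}$ at x = μX ± σX, shown next to the standard normal pdf fZ(z) centered at z = 0]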
- Definition:
The function Ф(z) = P{Z ≤ z} is used to denote the cumulative distribution function of a
standard normal random variable:
$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\, du$
[Figure: the x-scale (μX – 4σX to μX + 4σX) aligned with the z-scale (–4 to 4); areas under fZ(z) showing Ф(1) to the left of z = 1, Ф(b) – Ф(a) between z = a and z = b, and the tail area 1 – Ф(1) to the right of z = 1, which equals Ф(–1)]
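$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\, du$
$F_X(x) = \Phi\left(\frac{x - \mu_X}{\sigma_X}\right)$
Therefore, we conclude that:
1- $P(X \le x_0) = \Phi\left(\frac{x_0 - \mu_X}{\sigma_X}\right)$
2- $P(x_0 < X \le x_1) = \Phi\left(\frac{x_1 - \mu_X}{\sigma_X}\right) - \Phi\left(\frac{x_0 - \mu_X}{\sigma_X}\right)$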
EXAMPLE (3-25):
Suppose the current measurements in a strip of wire are assumed to follow a normal
distribution with a mean of 10 mA and variance 4 (mA)2. What is the probability that
a measurement will exceed 13 mA?
SOLUTION:
X = current in mA
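$Z = \frac{X - \mu_X}{\sigma_X} = \frac{X - 10}{2}$
$P(X > 13) = P\left(\frac{X - 10}{2} > \frac{13 - 10}{2}\right) = P(Z > 1.5)$
$P(X > 13) = P(Z > 1.5) = 1 - \Phi(1.5)$ ; from tables:
= 1 – 0.93319 = 0.06681
[Figure: the area under fX(x) to the right of x = 13 and the equivalent area under fZ(z) to the right of z = 1.5]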
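Instead of reading Ф(z) from tables, it can be evaluated through the error function available in Python's standard `math` module, using the identity Ф(z) = ½[1 + erf(z/√2)]. A minimal sketch for Example (3-25); the function names are chosen here for illustration.

```python
import math

def std_normal_cdf(z):
    """Phi(z) for the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mean, std):
    """P(X <= x) for a normal r.v. with the given mean and standard deviation."""
    return std_normal_cdf((x - mean) / std)

p_exceeds_13 = 1.0 - normal_cdf(13.0, 10.0, 2.0)
print(p_exceeds_13)
```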
EXAMPLE (3-26):
The diameter of a shaft in an optical storage drive is normally distributed with mean 0.25
inch and standard deviation of 0.0005 inch. The specifications on the shaft are 0.25 ± 0.0015
inch. What proportion of shafts conforms to specifications?
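SOLUTION:
$P(0.2485 \le X \le 0.2515) = P\left(\frac{0.2485 - 0.2500}{0.0005} \le Z \le \frac{0.2515 - 0.2500}{0.0005}\right) = P(-3 \le Z \le 3)$
= Ф(3) – Ф(–3)
= 2 Ф(3) – 1 ; from tables:
= (2 x 0.99865) – 1 = 0.9973
[Figure: the area under fX(x) between 0.2485 and 0.2515, and the equivalent area under fZ(z) between z = -3 and z = 3]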
EXAMPLE (3-27):
Assume that the height of clouds above the ground at some location is a Gaussian random
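variable (X) with mean 1830 m and standard deviation 460 m. Find the probability that clouds
will be higher than 2750 m.
SOLUTION:
$P(X > 2750) = 1 - P\left(Z \le \frac{2750 - 1830}{460}\right) = 1 - P(Z \le 2.0)$
= 1 – Ф(2.0) ; from tables:
= 1 – 0.9772
P(X > 2750) = 0.0228
[Figure: the area under fX(x) to the right of x = 2750 and the equivalent area under fZ(z) to the right of z = 2.0]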
Exercise
The tensile strength of paper is modeled by a normal distribution with a mean of 35 pounds
per square inch and a standard deviation of 2 pounds per square inch.
a. If the specifications require the tensile strength to exceed 33 lb/in2 , what is the
probability that a given sample will pass the specification test?
b. If 10 samples undergo the specification test, what is the probability that at least 9
will pass the test?
c. If 20 samples undergo the test, what is the expected number of samples that pass
the test?
Exercise
The rainfall over Ramallah district follows the normal distribution with a mean of 600 mm and
a standard deviation of 80 mm. The rainfall is distributed over 500 km2 area. Find:
1. The probability of obtaining a rainwater volume less than 206 MCM (MCM = Million
Cubic Meter)
2. Find the mean and the standard deviation of the volume (V) of rainfall in MCM.
3. Flooding condition will be considered if the rainfall is higher than 900 mm.
Find the probability of flooding for any given year.
- Remark:
The area under the Gaussian curve within (k) standard deviations of the mean is given in the
following table:
k    Area = P(μX – k σX < X ≤ μX + k σX)
1    0.6826
2    0.9544
3    0.9973
4    0.99994
The total probability outside an interval of 4 standard deviations on each side of the mean is only 0.00006.
$\frac{b - np}{\sqrt{npq}}$ and $\frac{a - np}{\sqrt{npq}}$
EXAMPLE (3-28):
Consider a binomial experiment with n = 1000 and p = 0.2. If X is the number of successes,
find the probability that X ≤ 240.
SOLUTION:
Exact solution: $P(X \le 240) = \sum_{x=0}^{240} \binom{1000}{x} (0.2)^x (1 - 0.2)^{1000-x}$
Applying the De Moivre-Laplace theorem:
$P(X \le 240) \approx \Phi\left(\frac{240 - 1000 \times 0.2}{\sqrt{1000 \times 0.2 \times 0.8}}\right) = \Phi(3.162) = 0.999216$
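The exact binomial sum of Example (3-28) is tedious by hand but easy to evaluate by machine, which allows a direct comparison with the normal approximation. The sketch below writes p = 0.2 as 1/5 so that the exact sum uses integer arithmetic only (a choice made here to avoid floating-point underflow); Ф is computed through `math.erf`.

```python
import math
from math import comb

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n = 1000
# C(n, x) (1/5)^x (4/5)^(n - x) = C(n, x) 4^(n - x) / 5^n
numerator = sum(comb(n, x) * 4 ** (n - x) for x in range(0, 241))
exact = numerator / 5 ** n

mean = n * 0.2
std = math.sqrt(n * 0.2 * 0.8)
approx = std_normal_cdf((240 - mean) / std)
print(exact, approx)
```

The two values agree to about three decimal places, which is typical when n p (1 - p) is large.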
- Theorem:
If (X) is a Poisson r.v with E(X) = b and Var(X) = b, then: $Z = \frac{X - b}{\sqrt{b}}$
is approximately a standard normal r.v. The approximation is good for (b > 5).
$e^{-b}\, \frac{b^x}{x!} \approx \frac{1}{\sqrt{2\pi b}}\, e^{-(x-b)^2/(2b)}$
EXAMPLE (3-29):
Assume the number of asbestos particles in a cm3 of dust follows a Poisson distribution with
a mean of 1000. If a cm3 of dust is analyzed, what is the probability that less than 950
particles are found in 1 cm3?
SOLUTION:
Exact solution: $P(X \le 950) = \sum_{x=0}^{950} e^{-1000}\, \frac{(1000)^x}{x!}$
Approximate: $P(X \le 950) \approx P\left(Z \le \frac{950 - 1000}{\sqrt{1000}}\right) = P(Z \le -1.58) = 0.057$
I. Discrete Case:
EXAMPLE (3-30):
Let (X) be a binomial r.v with parameters (n = 3) and (p = 0.75). Let Y = g(x) = 2X + 3
P(Y = y) = P(X = x)
SOLUTION:
The table below shows the (x) and (y) values and their probabilities.
$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$ ; x takes the values (0, 1, 2, 3) and Y = g(X) takes the values (3, 5, 7, 9)
[Figure: plot of the line y = 2x + 3, mapping each value of (x) to a value of (y)]
x    y    P(X = x)          P(Y = y)
0    3    (1 – p)^3         (1 – p)^3
1    5    3 p (1 – p)^2     3 p (1 – p)^2
2    7    3 p^2 (1 – p)     3 p^2 (1 – p)
3    9    p^3               p^3
EXAMPLE (3-31):
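Let (X) have the distribution P{X = x} = $\frac{1}{6}$ ; x = -3 , -2 , -1 , 0 , 1 , 2
Define Y = g(X) = X^2. Find the pmf of the random variable Y.
SOLUTION:
x     y    P(X = x)    contribution to P(Y = y)
-3    9    1/6         1/6
-2    4    1/6         1/6
-1    1    1/6         1/6
0     0    1/6         1/6
1     1    1/6         1/6
2     4    1/6         1/6
Adding the contributions of the x values that map to the same y gives:
P(Y = 0) = 1/6 , P(Y = 1) = 2/6 , P(Y = 4) = 2/6 , P(Y = 9) = 1/6
[Figure: the pmf of X, P(X = x) = 1/6 at x = -3, …, 2, and the resulting pmf of Y at y = 0, 1, 4, 9]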
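For a discrete random variable, the pmf of Y = g(X) is obtained by adding the probabilities of all x values that map to the same y. The sketch below does this for Example (3-31) with exact fractions; the helper name `pmf_of_function` is introduced here for illustration.

```python
from collections import defaultdict
from fractions import Fraction

def pmf_of_function(pmf_x, g):
    """Return the pmf of Y = g(X) given the pmf of X as {value: probability}."""
    pmf_y = defaultdict(Fraction)   # missing keys start at Fraction(0)
    for x, p in pmf_x.items():
        pmf_y[g(x)] += p
    return dict(pmf_y)

pmf_x = {x: Fraction(1, 6) for x in (-3, -2, -1, 0, 1, 2)}
pmf_y = pmf_of_function(pmf_x, lambda x: x * x)
print(pmf_y)
```

Because two values of x map to y = 1 and to y = 4, those outcomes receive probability 2/6, while y = 0 and y = 9 receive 1/6.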
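$f_X(x)\, \Delta x = f_Y(y)\, \Delta y$
$f_Y(y) = f_X(x)\, \frac{\Delta x}{\Delta y} = \frac{f_X(x)}{|\Delta y / \Delta x|} \;\rightarrow\; \frac{f_X(x)}{|dy/dx|}$
[Figure: the curve y(x); the interval (x , x + Δx) on the x-axis maps to the interval y1 < y < y2 on the y-axis]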
EXAMPLE (3-32):
Let (X) be a Gaussian r.v with mean (0) variance (1).
Let Y = X2. Find fY(y)
SOLUTION:
$f_Y(y) = 2\, \frac{f_X(x)}{|dy/dx|}$ , 0 ≤ y < ∞ ; $\frac{dy}{dx} = 2x$
$f_Y(y) = \frac{2}{2x} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$ , $x = \sqrt{y}$
$f_Y(y) = \frac{1}{\sqrt{y}} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-y/2}$
$f_Y(y) = \frac{1}{\sqrt{2\pi y}}\, e^{-y/2}$ ; y ≥ 0
EXAMPLE (3-33):
Let (X) be a uniform r.v in the interval (-1 , 4). If Y = X2. Find fY(y)
SOLUTION:
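For (-1 ≤ X ≤ 1), i.e. 0 ≤ y ≤ 1 : $f_Y(y) = 2\, \frac{f_X(x)}{|dy/dx|} = \frac{2\,(1/5)}{2|x|} = \frac{1}{5\sqrt{y}}$
For (1 < X ≤ 4), i.e. 1 < y ≤ 16 : $f_Y(y) = \frac{f_X(x)}{|dy/dx|} = \frac{1/5}{2x} = \frac{1}{10\sqrt{y}}$
$f_Y(y) = \frac{1}{5\sqrt{y}}$ for 0 ≤ y ≤ 1
$f_Y(y) = \frac{1}{10\sqrt{y}}$ for 1 < y ≤ 16
$f_Y(y) = 0$ otherwise
[Figure: the uniform pdf of X on the interval (-1 , 4), with the points x = -1, 1 and 4 marked]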
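A derived density such as the one in Example (3-33) can be sanity-checked by comparing its integral with probabilities computed directly from X. The sketch below integrates fY(y) using the substitution y = s², which removes the singularity at y = 0; the function names are chosen here for illustration.

```python
import math

def pdf_y(y):
    """Density of Y = X**2 for X uniform on (-1, 4), as derived in the example."""
    if 0.0 < y <= 1.0:
        return 1.0 / (5.0 * math.sqrt(y))
    if 1.0 < y <= 16.0:
        return 1.0 / (10.0 * math.sqrt(y))
    return 0.0

def cdf_y_direct(y):
    """P(X**2 <= y) computed directly from X uniform on (-1, 4)."""
    if y <= 0.0:
        return 0.0
    root = math.sqrt(y)
    low, high = max(-root, -1.0), min(root, 4.0)
    return (high - low) / 5.0

def cdf_y_from_pdf(y, n=20_000):
    """Integrate pdf_y from 0 to y with the midpoint rule after substituting y = s**2."""
    upper = math.sqrt(y)
    h = upper / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += pdf_y(s * s) * 2.0 * s * h
    return total

print(cdf_y_from_pdf(4.0), cdf_y_direct(4.0))
print(cdf_y_from_pdf(16.0))
```

The first line compares P(Y ≤ 4) computed both ways, and the second confirms that the density integrates to 1.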
EXAMPLE (3-32):
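Let (X) be a r.v with the exponential pdf: $f_X(x) = \lambda\, e^{-\lambda x}$ for x ≥ 0, and $f_X(x) = 0$ for x < 0
Let Y = 2X + 3. Find fY(y) and the region over which it is defined.
SOLUTION:
$f_Y(y) = \frac{f_X(x)}{|dy/dx|}$ ; $\frac{dy}{dx} = 2$
$f_Y(y) = \frac{f_X(x)}{2}$ , but $x = \frac{y-3}{2}$ , so $f_Y(y) = \frac{1}{2}\, f_X\left(\frac{y-3}{2}\right)$
$f_Y(y) = \frac{\lambda}{2}\, e^{-\lambda (y-3)/2}$ for $\frac{y-3}{2} \ge 0$ , i.e. y ≥ 3
$f_Y(y) = 0$ for y < 3
NOTE: $P(3 < Y \le 5) = P(0 < X \le 1) = \int_0^1 \lambda\, e^{-\lambda x}\, dx = 1 - e^{-\lambda}$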
EXAMPLE (3-33):
PROBABILITY DISTRIBUTIONS FOR MORE THAN ONE R.V. CHAPTER IV
CHAPTER I I I
PROBABILITY DISTRIBUTIONS
FOR MORE THAN ONE
RANDOM VARIABLE
In certain experiments we may be interested in observing several quantities as they occur, such
as carbon content (X) and hardness (Y) of steel; input (X) to a system and output (Y) at a
given time too.
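- If we observe two quantities (X) and (Y), each trial gives a pair of values X = x and Y = y,
which represents a point (x,y) in the x-y plane.
- The joint cumulative distribution function of two r.v X and Y is defined as:
$F_{XY}(x,y) = P\{X \le x , Y \le y\}$
Event (A) = {X ≤ x}
Event (B) = {Y ≤ y}
[Figure: the region R of the x-y plane to the left of and below the point (x,y), corresponding to the event A ∩ B in the sample space S]
$F_{XY}(x,y) = \sum_{x_i \le x} \sum_{y_j \le y} P_{ij}$ , where $P_{ij} = P(X = x_i , Y = y_j)$
and $\sum_i \sum_j P_{ij} = 1$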
where fXY(x,y) is the joint probability density function (f being continuous and nonnegative)
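- Properties of the joint pdf:
1- $f_{XY}(x,y) \ge 0$
2- $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x,y)\, dx\, dy = 1$
3- $P(x_1 < X \le x_2 , y_1 < Y \le y_2) = \int_{y_1}^{y_2} \int_{x_1}^{x_2} f_{XY}(x,y)\, dx\, dy$
and in general:
$P((x,y) \in R) = \iint_R f_{XY}(x,y)\, dx\, dy$
[Figure: a rectangular region R with corners at (x1 , y1) and (x2 , y2), and a general region R in the x-y plane]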
Marginal Distributions of a Discrete Distribution:
$P(X = x_i) = P(X = x_i , Y \text{ arbitrary})$
$= \sum_{y_j} P(X = x_i , Y = y_j)$
This is the probability that (X) assumes the value ($x_i$), while (Y) may assume any value,
which we ignore.
Likewise:
$P(Y = y_j) = \sum_{x_i} P(X = x_i , Y = y_j)$
Conditional Densities:
Let (X) and (Y) be discrete random variables. The conditional probability density function of
(Y) given (X = x), that is the probability that (Y) takes on the value (y) given that (X = x), is
given by:
$f_{Y/X}(y) = P(Y = y / X = x) = \frac{P(X = x , Y = y)}{P(X = x)}$
If (X) and (Y) are continuous, then the conditional pdf of (Y) given (X = x) is given by:
$f_{Y/X}(y) = \frac{f_{X,Y}(x,y)}{f_X(x)}$
EXAMPLE (4-1):
Let (X) and (Y) be continuous random variables with joint pdf:
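$f_{X,Y}(x,y) = \frac{1}{8}(6 - x - y)$ ; 0 < x ≤ 2 , 2 < y ≤ 4
1- Find fX(x) and fY(y).
2- Find fY/X(y).
3- Find P(2 < Y ≤ 3)
4- Find P(2 < Y ≤ 3 / X = 1)
[Figure: the rectangular region R given by 0 < x ≤ 2 , 2 < y ≤ 4 in the x-y plane]
SOLUTION:
1- $f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x,y)\, dy = \int_2^4 \frac{1}{8}(6 - x - y)\, dy = \frac{1}{8}(6 - 2x)$ ; 0 < x ≤ 2
$f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x,y)\, dx = \int_0^2 \frac{1}{8}(6 - x - y)\, dx = \frac{1}{4}(5 - y)$ ; 2 < y ≤ 4
2- $f_{Y/X}(y) = \frac{f_{X,Y}(x,y)}{f_X(x)} = \frac{\frac{1}{8}(6 - x - y)}{\frac{1}{8}(6 - 2x)} = \frac{6 - x - y}{6 - 2x}$ ; 0 < x ≤ 2 , 2 < y ≤ 4
3- $P(2 < Y \le 3) = \int_2^3 f_Y(y)\, dy = \int_2^3 \frac{1}{4}(5 - y)\, dy = \frac{5}{8}$
4- $f_{Y/X}(y / x = 1) = \frac{5 - y}{4}$ ; 2 < y ≤ 4
$P(2 < Y \le 3 / X = 1) = \int_2^3 \frac{5 - y}{4}\, dy = \frac{5}{8}$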
Exercise:
1- Find P(2<Y 3 / 0 X 1)
2- Find $\mu_X$ , $\mu_Y$ , $\sigma_X^2$ , and $\sigma_Y^2$
3- Are X and Y independent?
EXAMPLE (4-2):
Suppose that the random variable (X) can take only the values (1 , 2 , 3) and the random
variable (Y) can take only the values (1 , 2 , 3 , 4). The joint pdf is shown in the table.
         Y = 1    Y = 2    Y = 3    Y = 4
X = 1    0.1      0        0.1      0
X = 2    0.3      0        0.1      0.2
X = 3    0        0.2      0        0
[Figure: three-dimensional bar plot of the joint pmf fXY(x, y) over the (x , y) grid]
1- Find fX(x) and fY(y).
2- Find P(X ≥ 2)
3- Are (X) and (Y) independent?
SOLUTION:
1- P(X = 1) = 0.2        P(Y = 1) = 0.4
   P(X = 2) = 0.6        P(Y = 2) = 0.2
   P(X = 3) = 0.2        P(Y = 3) = 0.2
                         P(Y = 4) = 0.2
   ∑ = 1.0               ∑ = 1.0
2- P(X ≥ 2) = P(X = 2) + P(X = 3) = 0.6 + 0.2 = 0.8
3- Check all pairs (x,y) for: P(X = x , Y = y) = P(X = x) P(Y = y) ?
P(X = 1 , Y = 1) = 0.1 ≠ (0.2 x 0.4) = 0.08 , so we do not need to continue.
⇒ X and Y are not independent
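The marginal pmfs and the independence check of Example (4-2) follow mechanically from the joint table. A sketch is given below; the dictionary layout and function names are choices made here for illustration.

```python
# Joint pmf from the table of Example (4-2): keys are (x, y) pairs.
joint = {
    (1, 1): 0.1, (1, 2): 0.0, (1, 3): 0.1, (1, 4): 0.0,
    (2, 1): 0.3, (2, 2): 0.0, (2, 3): 0.1, (2, 4): 0.2,
    (3, 1): 0.0, (3, 2): 0.2, (3, 3): 0.0, (3, 4): 0.0,
}

def marginals(joint_pmf):
    """Sum the joint pmf over the other variable to get P(X = x) and P(Y = y)."""
    px, py = {}, {}
    for (x, y), p in joint_pmf.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py

def is_independent(joint_pmf, tol=1e-9):
    """True only if P(X = x, Y = y) = P(X = x) P(Y = y) for every pair."""
    px, py = marginals(joint_pmf)
    return all(abs(p - px[x] * py[y]) <= tol for (x, y), p in joint_pmf.items())

px, py = marginals(joint)
p_x_at_least_2 = sum(p for x, p in px.items() if x >= 2)
independent = is_independent(joint)
print(px, py, p_x_at_least_2, independent)
```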
Exercise:
– Find $\mu_X$ and $\sigma_X^2$
EXAMPLE (4-3):
Let (X) and (Y) have the joint pdf:
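$f_{XY}(x,y) = k\, x\, y$ for 0 ≤ x ≤ 1 , 0 ≤ y ≤ 1 , and $f_{XY}(x,y) = 0$ otherwise
a- Find (k) so that fXY(x,y) is a proper pdf.
b- Find P(X > 0.5 , Y > 0.5)
c- Find fX(x) and fY(y)
d- Are (X) and (Y) independent?
[Figure: the unit square 0 ≤ x ≤ 1 , 0 ≤ y ≤ 1, with the region R where x > 0.5 and y > 0.5]
SOLUTION:
a- $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x,y)\, dx\, dy = 1 \;\Rightarrow\; k \int_0^1 y \int_0^1 x\, dx\, dy = 1$
$k \int_0^1 y \left[\frac{x^2}{2}\right]_0^1 dy = \frac{k}{2} \int_0^1 y\, dy = \frac{k}{2} \left[\frac{y^2}{2}\right]_0^1 = \frac{k}{4} = 1 \;\Rightarrow\; k = 4$
b- $P(X > 0.5 , Y > 0.5) = \int_{0.5}^{1} \int_{0.5}^{1} 4\, x\, y\, dx\, dy = 4 \left[\frac{x^2}{2}\right]_{0.5}^{1} \left[\frac{y^2}{2}\right]_{0.5}^{1} = (1 - 0.25)(1 - 0.25) = 0.75 \times 0.75 = 0.5625$
c- $f_X(x) = \int_0^1 f_{XY}(x,y)\, dy = \int_0^1 4\, x\, y\, dy = 4x \left[\frac{y^2}{2}\right]_0^1 = 2x$ ; 0 ≤ x ≤ 1
$f_Y(y) = \int_0^1 4\, x\, y\, dx = 4y \left[\frac{x^2}{2}\right]_0^1 = 2y$ ; 0 ≤ y ≤ 1
d- Since fXY(x,y) = fX(x) fY(y), that is 4 x y = (2 x) (2 y), (X) and (Y) are independent.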
EXAMPLE (4-5):
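For example (4-3), find P(X > Y).
SOLUTION:
$P(Y < X) = \iint_R 4\, x\, y\, dx\, dy = \int_0^1 \int_0^x 4\, x\, y\, dy\, dx = \int_0^1 \int_y^1 4\, x\, y\, dx\, dy$
$= \int_0^1 4x \left[\frac{y^2}{2}\right]_0^x dx = \int_0^1 4x\, \frac{x^2}{2}\, dx = \int_0^1 2 x^3\, dx$
$P(Y < X) = 2 \left[\frac{x^4}{4}\right]_0^1 = \frac{1}{2}$
[Figure: the region R below the line y = x inside the unit square]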
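Probabilities of regions under a joint pdf can be approximated by summing the density over a fine grid. The sketch below applies a midpoint rule to the joint pdf 4xy of Example (4-3); the grid size and function names are choices made here for illustration, and regions with curved or diagonal boundaries are only approximated.

```python
def joint_pdf(x, y):
    """Joint pdf of Example (4-3): 4xy on the unit square, zero elsewhere."""
    return 4.0 * x * y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def probability(event, n=400):
    """Midpoint-rule estimate of P(event) on an n-by-n grid over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if event(x, y):
                total += joint_pdf(x, y) * h * h
    return total

p_x_greater_y = probability(lambda x, y: x > y)
p_both_above_half = probability(lambda x, y: x > 0.5 and y > 0.5)
print(p_x_greater_y, p_both_above_half)
```

The estimates should be close to 1/2 and 0.5625, the values obtained analytically.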
EXAMPLE (4-4):
Two random variables (X) and (Y) have the joint pdf:
$f_{XY}(x,y) = \frac{5}{16}\, x^2 y$ for 0 ≤ y ≤ x ≤ 2 , and $f_{XY}(x,y) = 0$ otherwise
a- Verify that fXY(x,y) is a valid pdf.
b- Find the marginal density functions of X and Y.
c- Are X and Y statistically independent?
d- Find P{X<1} , P{Y<0.5} , P{XY<1}
[Figure: the triangular region R given by 0 ≤ y ≤ x ≤ 2, below the line y = x]
SOLUTION:
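a- $\iint_R f_{XY}(x,y)\, dx\, dy = \int_0^2 \int_0^x \frac{5}{16}\, x^2 y\, dy\, dx = \frac{5}{16} \int_0^2 x^2 \left[\frac{y^2}{2}\right]_0^x dx = \frac{5}{16} \int_0^2 \frac{x^4}{2}\, dx = \frac{5}{16} \cdot \frac{1}{2} \left[\frac{x^5}{5}\right]_0^2 = \frac{5}{16} \cdot \frac{1}{2} \cdot \frac{32}{5} = 1$
b- $f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x,y)\, dy = \int_0^x \frac{5}{16}\, x^2 y\, dy = \frac{5}{16}\, x^2 \left[\frac{y^2}{2}\right]_0^x = \frac{5}{32}\, x^4$
$f_X(x) = \frac{5}{32}\, x^4$ for 0 ≤ x ≤ 2 , and 0 otherwise
Check: $\int_0^2 \frac{5}{32}\, x^4\, dx = \frac{5}{32} \left[\frac{x^5}{5}\right]_0^2 = 1$ OK
$f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x,y)\, dx = \int_y^2 \frac{5}{16}\, x^2 y\, dx = \frac{5}{16}\, y \left[\frac{x^3}{3}\right]_y^2 = \frac{5}{48}\, y\, (8 - y^3)$
$f_Y(y) = \frac{5}{48}\, y\, (8 - y^3)$ for 0 ≤ y ≤ 2 , and 0 otherwise
Check: $\int_0^2 \frac{5}{48} (8y - y^4)\, dy = \frac{5}{48} \left[4y^2 - \frac{y^5}{5}\right]_0^2 = 1$ OK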
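c- Since fXY(x,y) ≠ fX(x) fY(y), (X) and (Y) are not statistically independent.
d- $P\{X < 1\} = \int_0^1 f_X(x)\, dx = \int_0^1 \frac{5}{32}\, x^4\, dx = \frac{5}{32} \left[\frac{x^5}{5}\right]_0^1 = \frac{1}{32} = 0.03125$
$P\{Y < 0.5\} = \int_0^{0.5} f_Y(y)\, dy = \int_0^{0.5} \frac{5}{48}\, y\, (8 - y^3)\, dy = \frac{5}{48} \left[4y^2 - \frac{y^5}{5}\right]_0^{0.5} = \frac{159}{1536} \approx 0.1035$
$P\{XY < 1\} = P\{Y < \frac{1}{X}\} = \iint_R f_{XY}(x,y)\, dx\, dy$ , where R is the part of the triangle 0 ≤ y ≤ x ≤ 2 lying below the curve y = 1/x
$P\{Y < \frac{1}{X}\} = \int_0^1 \int_0^x \frac{5}{16}\, x^2 y\, dy\, dx + \int_1^2 \int_0^{1/x} \frac{5}{16}\, x^2 y\, dy\, dx$
= 1/32 + 5/32 = 6/32
[Figure: the region R bounded by the line y = x for 0 ≤ x ≤ 1 and by the curve y = 1/x for 1 < x ≤ 2]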
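The probabilities in part (d) of Example (4-4) can be cross-checked by summing the joint pdf over a grid. The sketch below uses a midpoint rule on the square (0 , 2) x (0 , 2); cells cut by the line y = x or the curve y = 1/x are handled only approximately, so small discrepancies are expected. Grid size and function names are choices made here for illustration.

```python
def joint_pdf(x, y):
    """Joint pdf of Example (4-4): (5/16) x^2 y on 0 <= y <= x <= 2."""
    return (5.0 / 16.0) * x * x * y if 0.0 <= y <= x <= 2.0 else 0.0

def probability(event, n=400):
    """Midpoint-rule estimate of P(event) on an n-by-n grid over (0, 2) x (0, 2)."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if event(x, y):
                total += joint_pdf(x, y) * h * h
    return total

total_mass = probability(lambda x, y: True)
p_x_below_1 = probability(lambda x, y: x < 1.0)
p_y_below_half = probability(lambda x, y: y < 0.5)
p_xy_below_1 = probability(lambda x, y: x * y < 1.0)
print(total_mass, p_x_below_1, p_y_below_half, p_xy_below_1)
```

The estimates should be near 1, 1/32, 159/1536 and 6/32 respectively.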
- Definition:
The expected value of a function g(x,y) of two random variables (X) and (Y) is:
$E\{g(X,Y)\} = \sum_{x_i} \sum_{y_j} g(x_i , y_j)\, P(X = x_i , Y = y_j)$ ; X and Y are discrete
$= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x , y)\, f_{XY}(x,y)\, dx\, dy$ ; X and Y are continuous
- Theorem:
Let Y = a1X1 + a2X2 , then
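$\sigma_Y^2 = a_1^2 \sigma_{X_1}^2 + a_2^2 \sigma_{X_2}^2 + 2\, a_1 a_2\, \sigma_{X_1} \sigma_{X_2}\, \rho_{X_1 X_2}$
Proof:
$\sigma_Y^2 = E\{(Y - \mu_Y)^2\} = E\{(a_1 X_1 + a_2 X_2 - a_1 \mu_{X_1} - a_2 \mu_{X_2})^2\}$
$= E\{[a_1 (X_1 - \mu_{X_1}) + a_2 (X_2 - \mu_{X_2})]^2\}$
$= E\{a_1^2 (X_1 - \mu_{X_1})^2\} + E\{a_2^2 (X_2 - \mu_{X_2})^2\} + 2\, a_1 a_2\, E\{(X_1 - \mu_{X_1})(X_2 - \mu_{X_2})\}$
$= a_1^2 \sigma_{X_1}^2 + a_2^2 \sigma_{X_2}^2 + 2\, a_1 a_2\, \mu_{X_1 X_2}$
Since $\rho_{X_1 X_2} = \frac{\mu_{X_1 X_2}}{\sigma_{X_1} \sigma_{X_2}}$ $\;\Rightarrow\; \sigma_Y^2 = a_1^2 \sigma_{X_1}^2 + a_2^2 \sigma_{X_2}^2 + 2\, a_1 a_2\, \sigma_{X_1} \sigma_{X_2}\, \rho_{X_1 X_2}$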
- Theorem: Multiplication of Means
If (X) and (Y) are independent random variables, then they are uncorrelated.
Proof:
$\mu_{XY} = E\{(X - \mu_X)(Y - \mu_Y)\}$
$= E\{XY\} - \mu_Y E\{X\} - \mu_X E\{Y\} + \mu_X \mu_Y$
$= E\{XY\} - E\{X\} E\{Y\}$
But since (X) and (Y) are independent, then E{XY} = E{X}E{Y}
$\Rightarrow \mu_{XY} = 0 \;\Rightarrow\; \rho_{XY} = \frac{\mu_{XY}}{\sigma_X \sigma_Y} = 0$
This result asserts that if X and Y are independent then they are uncorrelated ($\rho_{XY} = 0$).
However, the converse is not necessarily true. That is, if $\rho_{XY} = 0$, then X and Y are not
necessarily independent. The only exception is when X and Y are jointly Gaussian. In this case,
$\rho_{XY} = 0$ implies that X and Y are independent.
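A small discrete example shows that zero correlation does not imply independence. In the sketch below, X takes the values -1, 0, 1 with equal probability and Y = X²; this example is introduced here for illustration and is not taken from the notes.

```python
from fractions import Fraction

third = Fraction(1, 3)
# Joint pmf of (X, Y) with Y = X**2 and X uniform on {-1, 0, 1}.
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

def expect(g):
    """E{g(X, Y)} computed from the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
covariance = expect(lambda x, y: (x - mean_x) * (y - mean_y))

p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))
print(covariance, p_x0_y0, p_x0 * p_y0)
```

The covariance is exactly zero, yet P(X = 0 , Y = 0) = 1/3 differs from P(X = 0) P(Y = 0) = 1/9, so X and Y are uncorrelated but not independent.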
- Theorem:
Let Y = a1X1 + a2X2 , where (X1) and (X2) are independent random variables, then
$\sigma_Y^2 = a_1^2 \sigma_{X_1}^2 + a_2^2 \sigma_{X_2}^2$
This result follows immediately from the above two theorems.
In particular, the variance of the sum of independent random variables equals the sum of the variances of these variables.
- In the case of continuous random variables (X) and (Y), with Z = g(X,Y), we find FZ(z) first:
$F_Z(z) = P\{Z \le z\} = \iint_{g(x,y) \le z} f_{XY}(x,y)\, dx\, dy$
Then we find: $f_Z(z) = \frac{d F_Z(z)}{dz}$
- Theorem:
Let Z = X + Y and let (X) and (Y) be independent random variables, then;
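$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx$
Proof:
$F_Z(z) = P(Z \le z) = P(X + Y \le z)$
$F_Z(z) = P(Y \le z - X) = \iint_R f_{XY}(x,y)\, dx\, dy = \int_{-\infty}^{\infty} \int_{-\infty}^{z-x} f_{XY}(x,y)\, dy\, dx$
$F_Z(z) = \int_{-\infty}^{\infty} \left[\int_{-\infty}^{z-x} f_Y(y)\, dy\right] f_X(x)\, dx$
$F_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, F_Y(z - x)\, dx$
$f_Z(z) = \frac{d F_Z(z)}{dz}$
$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx$
[Figure: the region R below the line y = z - x, which crosses both axes at z]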
EXAMPLE (4-6):
Consider the joint pdf shown in the table (considered before in example 4-2).
Let Z = X + Y.
1- Find the probability mass function of (Z), P{Z = z}.
2- Find P(X = Y).
3- Find E{XY}
         Y = 1    Y = 2    Y = 3    Y = 4
X = 1    0.1      0        0.1      0
X = 2    0.3      0        0.1      0.2
X = 3    0        0.2      0        0
SOLUTION:
1- Possible values of (Z) and their probabilities are shown as follows:
Z    P(Z = z)
2    P(X = 1 , Y = 1) = 0.1
3    P(X = 1 , Y = 2) + P(X = 2 , Y = 1) = 0 + 0.3 = 0.3
4    P(X = 1 , Y = 3) + P(X = 3 , Y = 1) + P(X = 2 , Y = 2) = 0.1 + 0 + 0 = 0.1
5    P(X = 1 , Y = 4) + P(X = 2 , Y = 3) + P(X = 3 , Y = 2) = 0 + 0.1 + 0.2 = 0.3
6    P(X = 2 , Y = 4) + P(X = 3 , Y = 3) = 0.2 + 0 = 0.2
7    P(X = 3 , Y = 4) = 0
2- P(X = Y) = P(X = 1 , Y = 1) + P(X = 2 , Y = 2) + P(X = 3 , Y = 3) = 0.1 + 0 + 0 = 0.1
3- $E\{XY\} = \sum_{x_i} \sum_{y_j} x_i\, y_j\, P(X = x_i , Y = y_j)$
E{XY} = (1)(1)(0.1) + (1)(3)(0.1) + (2)(1)(0.3) + (2)(3)(0.1) + (2)(4)(0.2) + (3)(2)(0.2) = 4.4
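The quantities asked for in Example (4-6) can be computed directly from the joint table. The sketch below accumulates the pmf of Z = X + Y and evaluates P(X = Y) and E{XY}; the dictionary layout is a choice made here for illustration.

```python
from collections import defaultdict

# Joint pmf from the table: keys are (x, y) pairs.
joint = {
    (1, 1): 0.1, (1, 2): 0.0, (1, 3): 0.1, (1, 4): 0.0,
    (2, 1): 0.3, (2, 2): 0.0, (2, 3): 0.1, (2, 4): 0.2,
    (3, 1): 0.0, (3, 2): 0.2, (3, 3): 0.0, (3, 4): 0.0,
}

pmf_z = defaultdict(float)
for (x, y), p in joint.items():
    pmf_z[x + y] += p            # every pair with the same sum contributes to P(Z = z)

p_x_equals_y = sum(p for (x, y), p in joint.items() if x == y)
e_xy = sum(x * y * p for (x, y), p in joint.items())
print(dict(pmf_z), p_x_equals_y, e_xy)
```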
Exercise:
Let Z = |X – Y|
- Find the pmf of Z: P(Z = z)
EXAMPLE (4-7):
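Let (X) and (Y) be two independent exponential random variables, such that:
$f_X(x) = \alpha\, e^{-\alpha x}$ for x ≥ 0 (0 otherwise) ; $f_Y(y) = \beta\, e^{-\beta y}$ for y > 0 (0 otherwise)
Let Z = X + Y. Find fZ(z).
SOLUTION:
$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx = \int_0^z \alpha\, e^{-\alpha x} \cdot \beta\, e^{-\beta (z - x)}\, dx$
$f_Z(z) = \alpha \beta\, e^{-\beta z} \int_0^z e^{-(\alpha - \beta) x}\, dx$
$f_Z(z) = \frac{\alpha \beta}{\alpha - \beta}\, e^{-\beta z} \left(1 - e^{-(\alpha - \beta) z}\right)$
$f_Z(z) = \frac{\alpha \beta}{\alpha - \beta} \left(e^{-\beta z} - e^{-\alpha z}\right)$ ; z ≥ 0 , α ≠ β
[Figure: graphical convolution showing fX(x), fY(x), the reflected fY(-x) and the shifted fY(z - x), which overlaps fX(x) on 0 ≤ x ≤ z]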
EXAMPLE (4-8):
Let (X) and (Y) be two identical and independent random variables, such that:
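$f_X(x) = \frac{1}{2}$ for 0 ≤ x ≤ 2 (0 otherwise) ; $f_Y(y) = \frac{1}{2}$ for 0 ≤ y ≤ 2 (0 otherwise)
Let Z = X + Y. Find fZ(z).
SOLUTION:
Z = X + Y ⇒ 0 < z < 4
- For (z < 0): fZ(z) = 0
- For (0 < z < 2): $f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx = \int_0^z \frac{1}{2} \cdot \frac{1}{2}\, dx = \frac{z}{4}$
- For (z = 2): $f_Z(z) = \int_0^2 \frac{1}{2} \cdot \frac{1}{2}\, dx = \left[\frac{x}{4}\right]_0^2 = \frac{1}{2}$
- For (2 < z < 4): $f_Z(z) = \int_{-2+z}^{2} \frac{1}{2} \cdot \frac{1}{2}\, dx = \left[\frac{x}{4}\right]_{-2+z}^{2} = \frac{2 - (-2 + z)}{4} = \frac{4 - z}{4}$
Total area $= \frac{1}{2} \times 4 \times \frac{1}{2} = 1$ ⇒ It is a pdf
[Figure: graphical convolution showing fX(x), fY(x), the reflected fY(-x) and the shifted fY(z - x) extending from -2 + z to z; the resulting triangular fZ(z) on (0 , 4) with peak ½ at z = 2]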
EXAMPLE (4-9):
Let (X) and (Y) be two uniformly distributed and independent random variables, such that:
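$f_X(x) = \frac{1}{2}$ for 0 ≤ x ≤ 2 (0 otherwise) ; $f_Y(y) = \frac{1}{4}$ for 0 ≤ y ≤ 4 (0 otherwise)
Let Z = X + Y. Find fZ(z).
SOLUTION:
- For (z < 0): fZ(z) = 0
- For (0 ≤ z ≤ 2): $f_Z(z) = \int_0^z \frac{1}{2} \cdot \frac{1}{4}\, dx = \frac{1}{8}\, z$
- For (2 ≤ z ≤ 4): $f_Z(z) = \int_0^2 \frac{1}{2} \cdot \frac{1}{4}\, dx = \frac{1}{8} \times 2 = \frac{1}{4}$
- For (4 ≤ z ≤ 6): $f_Z(z) = \int_{-4+z}^{2} \frac{1}{2} \cdot \frac{1}{4}\, dx = \frac{1}{8} \left[x\right]_{-4+z}^{2} = \frac{1}{8}(6 - z)$
$f_Z(z) = \frac{1}{8}\, z$ for 0 ≤ z ≤ 2
$f_Z(z) = \frac{1}{4}$ for 2 ≤ z ≤ 4
$f_Z(z) = \frac{1}{8}(6 - z)$ for 4 ≤ z ≤ 6
[Figure: graphical convolution showing fX(x), fY(x), the reflected fY(-x) and the shifted fY(z - x) extending from -4 + z to z; the resulting trapezoidal fZ(z) on (0 , 6) with height 1/4 between z = 2 and z = 4, and total area 1]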
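The piecewise result of Example (4-9) can be checked by evaluating the convolution integral numerically at a few points. The sketch below uses a midpoint rule over the support of fX; the helper names and the test points z = 1, 3, 5 are choices made here for illustration.

```python
def uniform_pdf(width):
    """Return the pdf of a uniform r.v. on (0, width)."""
    return lambda t: 1.0 / width if 0.0 <= t <= width else 0.0

def convolve_at(f_x, f_y, z, x_max, n=4000):
    """Midpoint-rule value of the convolution integral at z,
    assuming f_x is zero outside (0, x_max)."""
    h = x_max / n
    return h * sum(f_x((i + 0.5) * h) * f_y(z - (i + 0.5) * h) for i in range(n))

def f_z_closed_form(z):
    """Piecewise pdf of Z = X + Y derived in the example."""
    if 0.0 <= z <= 2.0:
        return z / 8.0
    if 2.0 < z <= 4.0:
        return 0.25
    if 4.0 < z <= 6.0:
        return (6.0 - z) / 8.0
    return 0.0

f_x, f_y = uniform_pdf(2.0), uniform_pdf(4.0)
checks = [(z, convolve_at(f_x, f_y, z, 2.0), f_z_closed_form(z)) for z in (1.0, 3.0, 5.0)]
print(checks)
```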
EXAMPLE (4-10):
Let (X) and (Y) be two identical and independent random variables, such that:
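$f_X(x) = \lambda\, e^{-\lambda x}$ for x ≥ 0 (0 otherwise) ; $f_Y(y) = \lambda\, e^{-\lambda y}$ for y ≥ 0 (0 otherwise)
Let Z = X + Y. Find fZ(z).
SOLUTION:
$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx = \int_0^z \lambda\, e^{-\lambda x} \cdot \lambda\, e^{-\lambda (z - x)}\, dx$
$f_Z(z) = \lambda^2 \int_0^z e^{-\lambda x} \cdot e^{-\lambda z + \lambda x}\, dx$
$f_Z(z) = \lambda^2 e^{-\lambda z} \int_0^z dx \;\Rightarrow\; f_Z(z) = \lambda^2 z\, e^{-\lambda z}$ ; z ≥ 0
[Figure: graphical convolution showing fX(x), fY(x), the reflected fY(-x) and the shifted fY(z - x)]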
P{ ≤ ≤ +∆ , ≤ ≤ +∆ } = P{ ≤ ≤ +∆ , ≤ ≤ +∆ }
( , )d d = ( , )d d
( , )= =
The denominator is called the Jacobian. It is the ratio of the differential area in the
plane of the new variables to the differential area in the plane of the original variables.
|J| = ; J 0
( , )=
Note that :
J=| |=
J= =
( ( , )d
Example 1
Define =
=
a. Find the joint pdf of and .
b. Find the marginal pdf’s ( and ( .
c. Are and independent?
Solution
= =
J= = =
Therefore , ( =
= , =
= =
( = =
( =
( = =
( =
Example 2
Let and be two independent uniform R.V with a joint pdf
Define =
=
Solution
= , J= = =2
= for
( = 0<
= ( =
( = = 1< <2
= = 1< <2
( =
= ( = 1+
( = = =1-
Example 3
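Let X and Y be two zero-mean, unit-variance independent Gaussian random variables. Define the polar random variables R and Θ as
$$R = \sqrt{X^2 + Y^2}, \qquad \Theta = \tan^{-1}(Y / X)$$
Find $f_{R\Theta}(r, \theta)$, $f_R(r)$, $f_\Theta(\theta)$.
Solution
It can be easily shown that:
$$x = r\cos\theta, \qquad y = r\sin\theta$$
The Jacobian is
$$J = \begin{vmatrix} \partial r/\partial x & \partial r/\partial y \\ \partial\theta/\partial x & \partial\theta/\partial y \end{vmatrix} = \begin{vmatrix} \cos\theta & \sin\theta \\ -\sin\theta / r & \cos\theta / r \end{vmatrix} = \frac{\cos^2\theta + \sin^2\theta}{r} = \frac{1}{r}$$
Therefore,
$$f_{R\Theta}(r, \theta) = \frac{f_{XY}(x, y)}{|J|} = r \cdot \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \cdot \frac{1}{\sqrt{2\pi}} e^{-y^2/2}$$
$$f_{R\Theta}(r, \theta) = \frac{r}{2\pi}\, e^{-r^2/2}, \qquad 0 \le r < \infty, \quad 0 \le \theta \le 2\pi$$
$$f_R(r) = \int_{0}^{2\pi} f_{R\Theta}(r, \theta)\, d\theta = r\, e^{-r^2/2}, \qquad r \ge 0$$
$$f_\Theta(\theta) = \int_{0}^{\infty} f_{R\Theta}(r, \theta)\, dr = \frac{1}{2\pi}, \qquad 0 \le \theta \le 2\pi$$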
ELEMENTARY STATISTICS CHAPTER V
CHAPTER IV
ELEMENTARY STATISTICS
- In statistics, we take a random sample (X1, X2, ……, Xn) of size (n) from a distribution X
(population) for which the pdf is f X (x) by performing that experiment (n) times. The purpose
is to draw conclusions from properties of the sample about properties of the distribution of the
corresponding X (the population)
- First let us introduce some basic definitions about the random sample
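- The sample mean $\hat{\mu}_X$ (or $\bar{X}$) is defined as:
$$\hat{\mu}_X = \frac{1}{n} \sum_{i=1}^{n} x_i$$
- The sample variance $\hat{\sigma}_X^2$ is defined as:
$$\hat{\sigma}_X^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \hat{\mu}_X)^2$$
or, equivalently,
$$\hat{\sigma}_X^2 = \frac{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}{n (n - 1)}$$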
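The two expressions for the sample variance (the definition with divisor n - 1, and the shortcut form in terms of the sums of x and x squared) can be compared on a small data set. This is a minimal sketch; the data values are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
def sample_mean(xs):
    # (1/n) * sum of the observations
    return sum(xs) / len(xs)


def sample_variance(xs):
    # Definition: squared deviations from the sample mean, divided by n - 1
    n = len(xs)
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    # Shortcut: [n * sum(x^2) - (sum x)^2] / [n (n - 1)]
    n = len(xs)
    return (n * sum(x * x for x in xs) - sum(xs) ** 2) / (n * (n - 1))


data = [2.0, 4.0, 4.0, 5.0, 7.0]  # hypothetical measurements
print(sample_mean(data))
print(sample_variance(data), sample_variance_shortcut(data))
```

For this data the mean is 22/5 = 4.4 and both variance formulas give 13.2/4 = 3.3.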
Regression Techniques:
Suppose in a certain experiment we take measurements in pairs, i.e. (x1,y1) , (x2,y2), … (xn,yn).
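We suspect that the data can be fit by a straight line of the form $y = \alpha x + \beta$.
Suppose that the line is to be fitted to the (n) points and let $\varepsilon(\alpha, \beta)$ denote the sum of the squares of the vertical distances at the (n) points, then
$$\varepsilon(\alpha, \beta) = \sum_{i=1}^{n} \left[ y_i - (\alpha x_i + \beta) \right]^2$$
The method of least squares specifies the values of α and β so that ε is minimized.
$$\frac{\partial \varepsilon}{\partial \alpha} = -2 \sum_{i=1}^{n} (y_i - \alpha x_i - \beta)\, x_i = 0$$
$$\frac{\partial \varepsilon}{\partial \beta} = -2 \sum_{i=1}^{n} (y_i - \alpha x_i - \beta) = 0$$
Rearranging:
$$n \beta + \alpha \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i \qquad \ldots (1)$$
$$\beta \sum_{i=1}^{n} x_i + \alpha \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i \qquad \ldots (2)$$
In matrix form, these equations are:
$$\begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix} \begin{bmatrix} \beta \\ \alpha \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}$$
[Figure: a measured point (xi, yi) and the corresponding point (xi, αxi + β) on the fitted line.]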
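Solving equations (1) and (2) gives:
$$\alpha = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n (n - 1)\, \hat{\sigma}_X^2} = \frac{C_{XY}}{\hat{\sigma}_X^2}$$
where
$$C_{XY} = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \hat{\mu}_X)(y_i - \hat{\mu}_Y) = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n (n - 1)}$$
is the sample covariance between x and y, and $\hat{\sigma}_X^2$ is the sample variance of the x measurements (as defined earlier).
$$\beta = \hat{\mu}_Y - \alpha\, \hat{\mu}_X$$
where $\hat{\mu}_X = \frac{1}{n} \sum_{i=1}^{n} x_i$ and $\hat{\mu}_Y = \frac{1}{n} \sum_{i=1}^{n} y_i$ are the means of the x and the y measurements, respectively.
Equivalently,
$$\alpha = \frac{\sum_{i=1}^{n} x_i y_i - n\, \hat{\mu}_X \hat{\mu}_Y}{\sum_{i=1}^{n} x_i^2 - n\, \hat{\mu}_X^2}$$
(The fitted line passes through the point $(\hat{\mu}_X, \hat{\mu}_Y)$.)
Finally, the sample correlation coefficient can be calculated as
$$\hat{\rho}_{X,Y} = \frac{C_{XY}}{\hat{\sigma}_X\, \hat{\sigma}_Y}$$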
Fitting a Polynomial by the Method of Least Squares:
Suppose now that instead of simply fitting a straight line to (n) plotted points, we wish to fit a polynomial of the form:
$$y = \beta_1 + \beta_2 x + \beta_3 x^2$$
The method of least squares specifies the constants $\beta_1$, $\beta_2$ and $\beta_3$ so that the sum of the squares of errors is minimized.
$$\varepsilon = \sum_{i=1}^{n} \left[ y_i - (\beta_1 + \beta_2 x_i + \beta_3 x_i^2) \right]^2$$
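For the straight-line case treated earlier in this section, the closed-form least-squares solution can be sketched in a few lines. The sketch assumes the convention used in Example (5-5), y = αx + β with α the slope and β the intercept; the function name and the data are hypothetical and lie exactly on y = 2x + 1 so that the expected answer is obvious.

```python
import math


def fit_line(xs, ys):
    # Least-squares slope (alpha) and intercept (beta) for y = alpha*x + beta,
    # plus the sample correlation coefficient.
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    cov = (n * sxy - sx * sy) / (n * (n - 1))    # sample covariance C_XY
    var_x = (n * sxx - sx ** 2) / (n * (n - 1))  # sample variance of x
    var_y = (n * syy - sy ** 2) / (n * (n - 1))  # sample variance of y
    alpha = cov / var_x
    beta = sy / n - alpha * sx / n
    rho = cov / math.sqrt(var_x * var_y)
    return alpha, beta, rho


xs = [1.0, 2.0, 3.0, 4.0]  # hypothetical data on the line y = 2x + 1
ys = [3.0, 5.0, 7.0, 9.0]
alpha, beta, rho = fit_line(xs, ys)
print(alpha, beta, rho)
```

Because the points are exactly collinear with positive slope, the slope is 2, the intercept is 1, and the correlation coefficient is 1 up to rounding.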
EXAMPLE (5-1):
Suppose that the polynomial to be fitted to a set of (n) points is y = b x. It can be shown that:
$$b = \sum_{i=1}^{n} x_i y_i \Big/ \sum_{i=1}^{n} x_i^2$$
EXAMPLE (5-2):
Let $y = a x^b$.
Taking the ln of both sides, then:
ln y = ln a + b ln x
y' = β' + α' x' (Linear regression)
where: y' = ln y , β' = ln a , α' = b , x' = ln x
EXAMPLE (5-3):
If $y = 1 - e^{-x^b / a}$, then $\ln[-\ln(1 - y)] = -\ln a + b \ln x$, which is again linear in $\ln x$.
EXAMPLE (5-4):
If $y = \dfrac{L}{1 + e^{a + b x}}$
This form reduces to:
$$\ln\!\left( \frac{L - y}{y} \right) = a + b x$$
which is in the standard form:
y' = β' + α' x' (Linear regression)
EXAMPLE (5-5):
The number of pounds of steam used per month by a chemical plant is thought to be related to
the average ambient temperature (in ˚F) for that month. The past year’s usage and temperature
are shown in the following table
Solution:
a. The linear regression model to be fit is $y = \alpha x + \beta$.
Here, $\sum_{i=1}^{12} x_i = 558$, $\sum_{i=1}^{12} x_i^2 = 29256$, $\sum_{i=1}^{12} x_i y_i = 265607$, $\sum_{i=1}^{12} y_i = 5060$.
The equation parameters are given by: α = 9.2182, β = -7.3126. The minimum value of the mean square error is MMSE = 38.1315. VERIFY THESE RESULTS.
b. When the temperature is 55 ˚F, the linear model predicts a usage of y = 9.2182(55) - 7.3126 = 499.69. (Note that this temperature is not one of those that appear in the table, yet the model can predict the usage at this temperature.)
c. The correlation coefficient between the x and y data is 0.9999. This is very close to 1, meaning that the data are highly correlated (we know that when y is an increasing linear function of x, the correlation coefficient = 1).
Now let us try to fit the data with a polynomial $y = \beta_1 + \beta_2 x + \beta_3 x^2$. The equation parameters are:
$\beta_1 = -5.0455$, $\beta_2 = 9.1068$, $\beta_3 = 0.0012$. The MMSE = 37.0561.
Note that the second-order fit reduces the mean square error only slightly, which essentially implies that the linear model is quite satisfactory.
The linear model and the measured data are shown in the figure below.
Theorem:
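Let X1, X2, …, Xn be a sequence of jointly Gaussian random variables, then any linear combination of them is Gaussian, i.e., if
Y = C1X1 + C2X2 + …… + CnXn
then the pdf of Y is:
$$f_Y(y) = \frac{1}{\sqrt{2\pi \sigma_Y^2}}\; e^{-\frac{(y - \mu_Y)^2}{2 \sigma_Y^2}}$$
where $\mu_Y = C_1 \mu_1 + C_2 \mu_2 + \ldots + C_n \mu_n$
and $\sigma_Y^2 = C_1^2 \sigma_1^2 + C_2^2 \sigma_2^2 + \ldots + C_n^2 \sigma_n^2 + 2 C_1 C_2 \sigma_1 \sigma_2 \rho_{1,2} + 2 C_1 C_3 \sigma_1 \sigma_3 \rho_{1,3} + 2 C_2 C_3 \sigma_2 \sigma_3 \rho_{2,3} + \ldots$
Here $\rho_{i,j}$ is the correlation coefficient between $X_i$ and $X_j$.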
The following example illustrates this theorem for the case when the random variables are
dependent Gaussian random variables.
EXAMPLE (5-6):
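Let X1 and X2 be two Gaussian random variables such that: $\mu_1 = 0$, $\sigma_1^2 = 4$, $\mu_2 = 10$, $\sigma_2^2 = 9$, $\rho_{1,2} = 0.25$. Define Y = 2X1 + 3X2
a. Find the mean and variance of Y
b. Find P(Y ≤ 35).
SOLUTION:
$\mu_Y = 2\mu_1 + 3\mu_2 = 2(0) + 3(10) = 30$
$\sigma_Y^2 = 4\sigma_1^2 + 9\sigma_2^2 + 2(2)(3)(\sigma_1)(\sigma_2)\rho_{1,2} = 4(4) + 9(9) + 2(2)(3)(2)(3)(0.25) = 115$
$$P(Y \le 35) = \Phi\!\left( \frac{35 - 30}{\sqrt{115}} \right) = \Phi(0.466) = 0.6794$$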
Theorem:
Let X1, X2, …, Xn be a sequence of independent Gaussian random variables, each with mean $\mu_i$ and variance $\sigma_i^2$. Define
Y = C1X1 + C2X2 + …… + CnXn
Then Y has a Gaussian distribution with mean and variance given by:
$\mu_Y = C_1 \mu_1 + C_2 \mu_2 + \ldots + C_n \mu_n$
$\sigma_Y^2 = C_1^2 \sigma_1^2 + C_2^2 \sigma_2^2 + \ldots + C_n^2 \sigma_n^2$
EXAMPLE (5-7):
Let X1 and X2 be two independent Gaussian random variables such that: $\mu_1 = 0$, $\sigma_1^2 = 4$, $\mu_2 = 10$, $\sigma_2^2 = 9$. Define Y = 2X1 + 3X2
a. Find the mean and variance of Y
b. Find P(Y ≤ 35).
SOLUTION:
$\mu_Y = 2\mu_1 + 3\mu_2 = 2(0) + 3(10) = 30$
$\sigma_Y^2 = 4\sigma_1^2 + 9\sigma_2^2 = 4(4) + 9(9) = 97$
$$P(Y \le 35) = \Phi\!\left( \frac{35 - 30}{\sqrt{97}} \right) = \Phi(0.5077) = 0.6942$$
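The arithmetic in Examples (5-6) and (5-7) can be reproduced with the standard normal cdf written in terms of the error function. This is a sketch for the two-variable case only. It assumes σ1² = 4 and σ2² = 9, the values the worked solutions actually substitute (4(4) + 9(9)); the function names are hypothetical.

```python
import math


def phi(z):
    # Standard normal cdf via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def combo_mean_var(c1, c2, mu1, mu2, var1, var2, rho=0.0):
    # Mean and variance of Y = c1*X1 + c2*X2 for jointly Gaussian X1, X2
    mean = c1 * mu1 + c2 * mu2
    var = (c1 ** 2 * var1 + c2 ** 2 * var2
           + 2 * c1 * c2 * math.sqrt(var1 * var2) * rho)
    return mean, var


# Correlated case (Example 5-6): rho = 0.25
m_dep, v_dep = combo_mean_var(2, 3, 0.0, 10.0, 4.0, 9.0, rho=0.25)
p_dep = phi((35.0 - m_dep) / math.sqrt(v_dep))

# Independent case (Example 5-7): rho = 0
m_ind, v_ind = combo_mean_var(2, 3, 0.0, 10.0, 4.0, 9.0)
p_ind = phi((35.0 - m_ind) / math.sqrt(v_ind))

print(m_dep, v_dep, round(p_dep, 4))
print(m_ind, v_ind, round(p_ind, 4))
```

Positive correlation increases the variance of Y from 97 to 115, which is why P(Y ≤ 35) is slightly smaller in the correlated case.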
Theorem:
Let X1, X2, …, Xn be a sequence of independent Gaussian random variables, each with mean $\mu$ and variance $\sigma^2$. Define
Y = (X1 + X2 + …… + Xn) / n
Then Y has a Gaussian distribution with mean and variance given by: $\mu_Y = \mu$, $\sigma_Y^2 = \sigma^2 / n$. Y is called the sample mean and will be denoted by $\hat{\mu}$ or $\bar{X}$.
EXAMPLE (5-8):
Soft-drink cans are filled by an automated filling machine. The mean fill volume is 330 ml
and the standard deviation is 1.5 ml. Assume that the fill volumes of the cans are independent
Gaussian random variables. What is the probability that the average volume of 10 cans
selected at random from this process is less than 328 ml.
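SOLUTION:
$\hat{\mu} = (X_1 + X_2 + \ldots + X_n) / n$
$E\{\hat{\mu}\} = (\mu + \mu + \ldots + \mu) / n = 330$
$Var(\hat{\mu}) = \sigma^2 / n = (1.5)^2 / 10 = 0.225$
$\hat{\mu}$ is Gaussian with mean 330 and variance 0.225.
$$P(\hat{\mu} < 328) = \Phi\!\left( \frac{328 - 330}{\sqrt{0.225}} \right) = \Phi(-4.21) = 1.2769 \times 10^{-5}$$
By the central limit theorem, the sample mean is approximately Gaussian even when the population itself is not. The approximation works well even for small samples (n = 4, 5) when the population has a continuous distribution, as illustrated in the following example.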
EXAMPLE (5-9):
Let Y = X1+ X2+X3 , where Xi are uniform over the interval (0 ≤ Xi ≤ 1) and are independent.
Find and sketch the pdf of Y.
SOLUTION:
First we find the pdf of (X1 + X2) by convolving the pdf of X1 with that of X2. Then the new pdf is convolved with that of X3. The result is:
$$f_Y(y) = \begin{cases} 0, & y < 0 \\ y^2 / 2, & 0 \le y \le 1 \\ -y^2 + 3y - 3/2, & 1 \le y \le 2 \\ (3 - y)^2 / 2, & 2 \le y \le 3 \\ 0, & y > 3 \end{cases}$$
The mean and variance of Y are: $\mu_Y = 3(1/2) = 3/2$, $\sigma_Y^2 = 3\sigma_X^2 = 3(1/12) = 3/12$. In the figure below we plot the pdf's of X and (X1 + X2). Also, we plot the Gaussian pdf (solid line) with mean 3/2 and variance 3/12 on the same graph as the pdf of Y = X1 + X2 + X3 (dashed line).
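It is very clear that even for n = 3, fY(y) is very close to the Gaussian curve.
Now let us calculate P(0 ≤ Y ≤ 1) using the exact formula and the approximation.
$$P(0 \le Y \le 1) = \int_{0}^{1} \frac{y^2}{2}\, dy = \frac{1}{6} \approx 0.1667 \quad \text{(exact probability)}$$
$$P(0 \le Y' \le 1) = \Phi\!\left( \frac{1 - 1.5}{\sqrt{0.25}} \right) - \Phi\!\left( \frac{0 - 1.5}{\sqrt{0.25}} \right) = 0.1587 - 0.0013 = 0.1574 \quad \text{(Gaussian approximation)}$$
$$\frac{P(0 \le Y' \le 1)}{P(0 \le Y \le 1)} = \frac{0.1574}{0.1667} \approx 94.4\%$$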
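The comparison between the exact probability and its Gaussian approximation can be reproduced as follows. This is a sketch only: the exact value is obtained by integrating y²/2 numerically over [0, 1], and the approximation uses a normal distribution with mean 3/2 and variance 3/12, as in the example. The helper names are hypothetical.

```python
import math


def phi(z):
    # Standard normal cdf via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def integrate(f, a, b, steps=10000):
    # Midpoint-rule numerical integration of f over [a, b]
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h


# Exact probability from the pdf of Y = X1 + X2 + X3 on [0, 1]
exact = integrate(lambda y: y * y / 2.0, 0.0, 1.0)

# Gaussian approximation with the same mean and variance as Y
mu, var = 1.5, 3.0 / 12.0
approx = phi((1.0 - mu) / math.sqrt(var)) - phi((0.0 - mu) / math.sqrt(var))

print(round(exact, 4), round(approx, 4), round(approx / exact, 3))
```

The exact value is 1/6, and the approximation is Φ(-1) - Φ(-3), so the two agree to within about six percent even though only three uniform variables are summed.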
EXAMPLE (5-10):
An electronic company manufactures resistors that have a mean resistance of 100 Ω and a
standard deviation of 10 Ω. Find the probability that a random sample of n = 25 resistors will
have an average resistance less than 95 Ω.
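SOLUTION:
$\hat{\mu}_X$ is approximately normal with:
mean = $E(\hat{\mu}_X)$ = 100 Ω.
$$Var(\hat{\mu}_X) = \sigma_{\hat{\mu}_X}^2 = \frac{\sigma_X^2}{n} = \frac{10^2}{25} = 4$$
$$\sigma_{\hat{\mu}_X} = \sqrt{\frac{\sigma_X^2}{n}} = \sqrt{\frac{10^2}{25}} = 2$$
$$P\{\hat{\mu}_X < 95\} = P\left\{ Z < \frac{95 - 100}{2} \right\} = \Phi(-2.5) = 0.00621$$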
EXAMPLE (5-11):
The lifetime of a special type of battery is a random variable with mean 40 hours and standard deviation 20 hours. A battery is used until it fails, then it is immediately replaced by a new one. Assume we have 25 such batteries, the lifetimes of which are independent. Approximate the probability that at least 1100 hours of use can be obtained.
SOLUTION:
Let X1, X2, …, X25 be the lifetimes of the batteries.
Let Y = X1 + X2 + …… + X25 be the overall lifetime of the system.
Since the Xi are independent, Y will be approximately normal with mean and variance:
$\mu_Y = \mu_1 + \mu_2 + \ldots + \mu_{25} = 25\mu_X = 25 \times 40 = 1000$
$\sigma_Y^2 = \sigma_1^2 + \sigma_2^2 + \ldots + \sigma_{25}^2 = 25\sigma_X^2 = 25 \times (20)^2 = 10000$
$$P(Y > 1100) = P\!\left( Z > \frac{1100 - 1000}{\sqrt{10000}} \right) = P(Z > 1) = 1 - \Phi(1) = 0.158655$$
EXAMPLE (5-12):
Suppose that the random variable X has a uniform distribution over the interval 0 ≤ X ≤ 1. A random sample of size 30 is drawn from this distribution.
a. Find the probability distribution of the sample mean $\hat{\mu}_X$
b. Find $P(\hat{\mu}_X < 0.52)$
SOLUTION:
Since X has a continuous distribution, and since n = 30, the probability density function of the sample mean $\hat{\mu}_X$ is approximately normal with:
$E(\hat{\mu}_X) = E(X) = 1/2$.
$$Var(\hat{\mu}_X) = \sigma_{\hat{\mu}_X}^2 = \frac{\sigma_X^2}{n} = \frac{1/12}{30} = \frac{1}{360}$$
$$P\{\hat{\mu}_X < 0.52\} = P\left\{ Z < \frac{0.52 - 0.5}{\sqrt{1/360}} \right\} = \Phi(0.379) \approx 0.648$$
EXAMPLE (5-13):
Suppose that X has a discrete distribution which assumes the two values 1 and 0 with equal probability. A random sample of size 50 is drawn from this distribution.
a. Find the probability distribution of the sample mean $\hat{\mu}_X$
b. Find $P(\hat{\mu}_X < 0.6)$
SOLUTION:
Since n = 50 > 30, we can approximate the distribution of the sample mean by a normal distribution with:
$E(\hat{\mu}_X) = E(X) = 0 \cdot 1/2 + 1 \cdot 1/2 = 1/2$.
$$Var(\hat{\mu}_X) = \sigma_{\hat{\mu}_X}^2 = \frac{\sigma_X^2}{n} = \frac{(0 - 1/2)^2 \cdot 1/2 + (1 - 1/2)^2 \cdot 1/2}{50} = \frac{1}{200}$$
$$P\{\hat{\mu}_X < 0.6\} = P\left\{ Z < \frac{0.6 - 0.5}{\sqrt{1/200}} \right\} = \Phi(1.414) \approx 0.921$$
Estimation of Parameters:
- The field of statistical inference consists of those methods used to make decisions or to draw conclusions about a population. These methods utilize the information contained in a sample from a population in drawing conclusions.
- The random variables X1, X2, …, Xn, called a random sample, have the same distribution and are assumed to be independent.
- Estimator: a function of the observable sample data that is used to estimate an unknown population parameter.
Point Estimation:
Point estimation involves the use of the sample data to calculate a single value which is to serve as a best guess for an unknown parameter. In other words, a point estimate of some population parameter ($\theta$) is a single numerical value $\hat{\theta} = f(x_1, x_2, \ldots, x_n)$.
- In the table below we list some examples of point estimators and the parameters that they are used to estimate.
Unknown Parameter ($\theta$) | Statistic ($\hat{\Theta}$) | Remarks
$\mu_X$ | $\hat{\mu}_X = \frac{1}{n} \sum_{i=1}^{n} x_i$ | Used to estimate the mean regardless of whether the variance is known or unknown.
$\sigma_X^2$ | $\hat{\sigma}_X^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \hat{\mu}_X)^2$ | Used to estimate the variance when the mean is unknown.
$\sigma_X^2$ | $\hat{\sigma}_X^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_X)^2$ | Used to estimate the variance when the mean is known.
$p$ | $\hat{P} = \frac{x}{n}$ | Used to estimate the probability of a success in a binomial distribution. n: sample size; x: number of successes in the sample.
$\mu_{X1} - \mu_{X2}$ | $\hat{\mu}_{X1} - \hat{\mu}_{X2} = \sum_{i=1}^{n_1} \frac{x_{1i}}{n_1} - \sum_{i=1}^{n_2} \frac{x_{2i}}{n_2}$ | Used to estimate the difference in the means of two populations.
$p_1 - p_2$ | $\hat{P}_1 - \hat{P}_2 = \frac{x_1}{n_1} - \frac{x_2}{n_2}$ | Used to estimate the difference in the proportions of two populations.
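3- The mean square error of an estimator ($\hat{\theta}$) of the parameter ($\theta$) is defined as:
$$MSE(\hat{\theta}) = E\{(\hat{\theta} - \theta)^2\}$$
This measure of goodness takes into account both the bias and the imprecision.
$MSE(\hat{\theta})$ can also be expressed as:
$$MSE(\hat{\theta}) = E\{[\hat{\theta} - E(\hat{\theta}) + E(\hat{\theta}) - \theta]^2\} = E\{[(\hat{\theta} - E(\hat{\theta})) + (E(\hat{\theta}) - \theta)]^2\} = Var(\hat{\theta}) + B^2$$
where $B = E(\hat{\theta}) - \theta$ is the bias of the estimator. The cross term vanishes because $E\{\hat{\theta} - E(\hat{\theta})\} = 0$.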
- Definition:
An estimator whose variance and bias go to zero as the number of observations goes to infinity
is called consistent.
EXAMPLE (5-14):
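Show that the sample mean
$$\hat{\mu}_X = \frac{1}{n} \sum_{i=1}^{n} x_i$$
is an unbiased estimator of the population mean $\mu_X$.
SOLUTION:
$$E\{\hat{\mu}_X\} = E\left\{ \frac{1}{n} \sum_{i=1}^{n} x_i \right\} = \frac{1}{n} \sum_{i=1}^{n} E\{x_i\} = \frac{1}{n} \sum_{i=1}^{n} \mu_X = \frac{1}{n} (n \mu_X) = \mu_X \quad \text{(unbiased estimator)}$$
The variance of $\hat{\mu}_X$ is
$$Var\{\hat{\mu}_X\} = \frac{1}{n^2}\, Var\left\{ \sum_{i=1}^{n} x_i \right\} = \frac{1}{n^2}\,(n \sigma_X^2) = \frac{\sigma_X^2}{n}$$
(The variance tends to zero as n tends to infinity. Therefore, $\hat{\mu}_X$ is an unbiased and consistent estimator.) When n goes to infinity,
$$\frac{1}{n} \sum_{i=1}^{n} x_i \to \mu_X$$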
EXAMPLE (5-15):
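Show that the sample variance $\hat{\sigma}_X^2$ (when the mean is unknown)
$$\hat{\sigma}_X^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \hat{\mu}_X)^2$$
is an unbiased estimator of the population variance $\sigma_X^2$.
SOLUTION:
The sample variance can be written as
$$\hat{\sigma}_X^2 = \frac{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}{n (n - 1)}$$
so that
$$E\{\hat{\sigma}_X^2\} = \frac{1}{n(n - 1)}\, E\left\{ n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 \right\} = \frac{1}{n(n - 1)} \left[ n \sum_{i=1}^{n} E(x_i^2) - E\left\{ \left( \sum_{i=1}^{n} x_i \right)^2 \right\} \right]$$
Note that since $E\{x_i^2\} = \sigma_X^2 + \mu_X^2$, then $n \sum_{i=1}^{n} E(x_i^2) = n^2 (\sigma_X^2 + \mu_X^2)$.
$$E\left\{ \left( \sum_{i=1}^{n} x_i \right)^2 \right\} = E\left\{ \sum_{i=1}^{n} x_i \sum_{j=1}^{n} x_j \right\} = E\left\{ \sum_{i=1}^{n} \sum_{j=1}^{n} x_i x_j \right\} = \sum_{i=1}^{n} \sum_{j=1}^{n} E(x_i x_j)$$
The double summation contains $n^2$ elements: n terms are such that i = j, and $(n^2 - n) = n(n - 1)$ are such that i ≠ j. When i = j, $E\{x_i^2\} = \sigma_X^2 + \mu_X^2$, and when i ≠ j, $E(x_i x_j) = E(x_i) E(x_j) = \mu_X^2$ since the random variables are independent.
Therefore,
$$\sum_{i=1}^{n} \sum_{j=1}^{n} E(x_i x_j) = n(\sigma_X^2 + \mu_X^2) + n(n - 1)\mu_X^2 = n \sigma_X^2 + n^2 \mu_X^2$$
$$E\{\hat{\sigma}_X^2\} = \frac{1}{n(n - 1)} \left[ n^2 (\sigma_X^2 + \mu_X^2) - n \sigma_X^2 - n^2 \mu_X^2 \right]$$
$$E\{\hat{\sigma}_X^2\} = \frac{1}{n(n - 1)} \left[ n^2 \sigma_X^2 - n \sigma_X^2 \right] = \frac{1}{n(n - 1)} \left[ n(n - 1)\, \sigma_X^2 \right]$$
$$E\{\hat{\sigma}_X^2\} = \sigma_X^2$$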
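The unbiasedness of the sample variance can also be confirmed by brute force on a tiny population. The sketch below is an illustration under stated assumptions: a hypothetical population taking the values 0 and 1 with equal probability (mean 1/2, variance 1/4) and samples of size n = 3. It enumerates all equally likely samples and averages the estimate exactly, using rational arithmetic.

```python
from fractions import Fraction
from itertools import product

values = [0, 1]  # hypothetical population: 0 or 1, each with probability 1/2
n = 3            # sample size
pop_var = Fraction(1, 4)


def variance_estimate(sample, divisor):
    # Sum of squared deviations from the sample mean, divided by `divisor`
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample) / divisor


# All 2^n samples are equally likely, so a plain average is the expectation.
samples = list(product(values, repeat=n))
e_div_n_minus_1 = sum(variance_estimate(s, n - 1) for s in samples) / len(samples)
e_div_n = sum(variance_estimate(s, n) for s in samples) / len(samples)

print(e_div_n_minus_1, e_div_n)
```

Dividing by n - 1 gives an expectation equal to the population variance 1/4, while dividing by n gives (n - 1)/n times that value, here 1/6.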
EXAMPLE (5-16):
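Let X1 and X2 be a random sample of size two from a population with mean $\mu_X$ and variance $\sigma_X^2$. Two estimators for $\mu_X$ are proposed: $\hat{\mu}_1 = \dfrac{X_1 + X_2}{2}$ and $\hat{\mu}_2 = \dfrac{X_1 + 2X_2}{3}$. Which estimator is better and in what sense?
SOLUTION:
$$E(\hat{\mu}_1) = E\!\left( \frac{X_1 + X_2}{2} \right) = \frac{\mu_X + \mu_X}{2} = \mu_X$$
(Therefore, $\hat{\mu}_1$ is an unbiased estimator of $\mu_X$.)
$$E(\hat{\mu}_2) = E\!\left( \frac{X_1 + 2X_2}{3} \right) = \frac{\mu_X + 2\mu_X}{3} = \mu_X$$
(Therefore, $\hat{\mu}_2$ is also an unbiased estimator of $\mu_X$.)
Now, we evaluate the variance of the two estimators. From previous results we know:
$$Var(\hat{\mu}_1) = Var\!\left( \frac{X_1 + X_2}{2} \right) = \frac{1}{4}\sigma_X^2 + \frac{1}{4}\sigma_X^2 = \frac{1}{2}\sigma_X^2$$
$$Var(\hat{\mu}_2) = Var\!\left( \frac{X_1 + 2X_2}{3} \right) = \frac{1}{9}\sigma_X^2 + \frac{4}{9}\sigma_X^2 = \frac{5}{9}\sigma_X^2$$
Since $Var(\hat{\mu}_1) = \frac{1}{2}\sigma_X^2 < Var(\hat{\mu}_2) = \frac{5}{9}\sigma_X^2$, the first estimator is more efficient, and therefore is better than the second.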
ESTIMATION THEORY AND APPLICATIONS CHAPTER VI
CHAPTER V
ESTIMATION THEORY
AND APPLICATIONS
Method for Obtaining Point Estimators: The Maximum Likelihood (ML) Estimator
Let us start this section with two motivating examples:
EXAMPLE (6-1):
The probability p = P(H) of a coin may be 0.1 or it may be 0.9. To resolve the uncertainty,
the coin was tossed 10 times and 3 heads were observed. What will be your estimate for p in
light of the experiment outcome?
Solution:
Let us calculate the probability of getting 3 successes in 10 trials for the two possible values of p using the binomial distribution:
$$P(x = 3;\ 0.1) = \binom{10}{3} (0.1)^3 (1 - 0.1)^7 = 0.0574$$
$$P(x = 3;\ 0.9) = \binom{10}{3} (0.9)^3 (1 - 0.9)^7 = 8.748 \times 10^{-6}$$
Therefore, we conclude that p = 0.1 has a higher probability of producing the outcome, and our estimate for p would be $\hat{p} = 0.1$.
Instead, suppose that the experiment resulted in 8 heads. What would be our estimate for p?
Again, we calculate
$P(x = 8;\ 0.1) = 3.645 \times 10^{-7}$
$P(x = 8;\ 0.9) = 0.1937$
In this case $\hat{p} = 0.9$.
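The four binomial probabilities in Example (6-1) can be reproduced directly, and the maximum likelihood choice is simply the candidate with the larger probability. This is a minimal sketch; the function names are hypothetical and the candidate list holds only the two values of p considered in the example.

```python
from math import comb


def binom_pmf(k, n, p):
    # Probability of exactly k successes in n independent trials
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def most_likely_p(k, n, candidates):
    # Pick the candidate value of p that gives the observed count
    # the largest probability.
    return max(candidates, key=lambda p: binom_pmf(k, n, p))


candidates = [0.1, 0.9]
for heads in (3, 8):
    probs = [binom_pmf(heads, 10, p) for p in candidates]
    print(heads, probs, most_likely_p(heads, 10, candidates))
```

With 3 heads the estimate is 0.1, and with 8 heads it is 0.9, matching the conclusions of the example.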
EXAMPLE (6-2):
Let p be the probability of a success in a binomial distribution. This probability is unknown.
To estimate p, the experiment is performed 10 times and 3 successes were observed. Find a
maximum likelihood estimate for p.
Solution:
Any value of p in the range 0 < p < 1 can produce the three successes in the 10 trials. But there is a specific value, $\hat{p}$, to be estimated, that has the highest probability of producing the result. This value of p is called the maximum likelihood estimate.
The probability of the observed result, viewed as a function of p, is
$$f(p) = \binom{10}{3} p^3 (1 - p)^7$$
Setting $\frac{d}{dp} \ln f(p) = \frac{3}{p} - \frac{7}{1 - p} = 0$ gives $\hat{p} = 3/10$.
For the sake of comparison, let us compute f(p) at three different values of p: p = 0.3, p = 0.35, and p = 0.25. f(0.3) = 0.2668, f(0.35) = 0.2522, f(0.25) = 0.2503. Hence $\hat{p} = 3/10$ has the highest probability of generating the 3 successes in 10 trials.
To find the maximum likelihood estimator of a parameter θ, we base our estimation on (n)
statistically independent samples {X1, X2, …, Xn} taken from the population. The maximum
likelihood estimator selects the parameter value which gives the observed data the largest
possible probability. The following steps summarize the procedure for obtaining a maximum
likelihood estimator for a continuous parameter θ.
- The joint pdf of the samples (expressed in terms of $\theta$) is
$$L(\theta) = f\{x_1, x_2, \ldots, x_n;\ \theta\} = f\{x_1;\ \theta\} \cdot f\{x_2;\ \theta\} \cdots f\{x_n;\ \theta\}$$
due to the independence of the $x_i$.
$L(\theta)$ is called the likelihood function. The maximum likelihood technique looks for the value ($\hat{\theta}$) of the parameter that maximizes the joint pdf of the samples.
- A necessary condition for the maximum likelihood estimator of ($\theta$) is:
$$\frac{d}{d\theta} L(\theta) = 0 \quad \text{or equivalently} \quad \frac{d}{d\theta} \ln L(\theta) = 0$$
(ln(·) is a monotonically increasing function of its argument, so both conditions locate the same maximum.)
The following example illustrates this technique.
EXAMPLE (6-3):
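Given a random sample of size (n) taken from a Gaussian population with parameters $\mu_X$ and $\sigma_X^2$. Use the ML technique to find estimators for the cases:
a- The mean $\mu_X$ when the variance $\sigma_X^2$ is assumed known.
b- The variance $\sigma_X^2$ when the mean $\mu_X$ is assumed known.
c- The mean $\mu_X$ and variance $\sigma_X^2$ when both are assumed unknown.
SOLUTION:
$$L = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma_X^2}}\, e^{-\frac{(x_i - \mu_X)^2}{2\sigma_X^2}} = \frac{1}{(2\pi\sigma_X^2)^{n/2}}\, e^{-\sum_{i=1}^{n} \frac{(x_i - \mu_X)^2}{2\sigma_X^2}}$$
$$\ln(L) = -\frac{n}{2} \ln(2\pi\sigma_X^2) - \sum_{i=1}^{n} \frac{(x_i - \mu_X)^2}{2\sigma_X^2}$$
a- Set $\frac{d}{d\mu_X} \ln L(\mu_X) = 0$, treating $\sigma_X^2$ as a constant.
$$\sum_{i=1}^{n} \frac{2(x_i - \hat{\mu}_X)}{2\sigma_X^2} = 0 \quad \Rightarrow \quad \hat{\mu}_X = \frac{1}{n} \sum_{i=1}^{n} x_i \qquad \ldots (1) \quad \text{Unbiased Estimator}$$
Thus the ML estimator of the mean is the sample average mentioned earlier.
b- Set $\frac{d}{d\sigma_X^2} \ln L(\sigma_X^2) = 0$, treating $\mu_X$ as a constant.
The result is
$$\hat{\sigma}_X^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_X)^2 \qquad \ldots (2) \quad \text{Unbiased Estimator}$$
Note that the division is by (n) since we are using the known mean of the distribution.
c- Set $\frac{\partial}{\partial \mu_X} \ln L(\mu_X, \sigma_X^2) = 0$ and $\frac{\partial}{\partial \sigma_X^2} \ln L(\mu_X, \sigma_X^2) = 0$.
This results in:
$$\hat{\mu}_X = \frac{1}{n} \sum_{i=1}^{n} x_i \quad \text{and} \quad \hat{\sigma}_X^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu}_X)^2$$
$\hat{\sigma}_X^2$ is a biased estimator since $E\{\hat{\sigma}_X^2\} = \dfrac{(n - 1)\, \sigma_X^2}{n}$.
For this general case, the unbiased estimator of $\sigma_X^2$ is:
$$\hat{\sigma}_X^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \hat{\mu}_X)^2$$
which is the sample variance introduced earlier.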
EXAMPLE (6-4):
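Given a random sample of size (n) taken from a distribution X with pdf
$$f(x) = (\alpha + 1)\, x^{\alpha}, \quad 0 < x < 1.$$
Use the ML technique to find an estimator for α.
SOLUTION:
The likelihood function is
$$L(\alpha) = f(x_1)\, f(x_2) \cdots f(x_n)$$
$$L(\alpha) = (\alpha + 1)\, x_1^{\alpha} \cdots (\alpha + 1)\, x_n^{\alpha} = (\alpha + 1)^n\, (x_1 \cdots x_n)^{\alpha}$$
$$\ln L(\alpha) = n \ln(\alpha + 1) + \alpha \ln x_1 + \ldots + \alpha \ln x_n$$
Differentiating with respect to α and setting the derivative to zero, we get
$$\frac{d}{d\alpha} \ln L(\alpha) = \frac{n}{\hat{\alpha} + 1} + \ln x_1 + \ldots + \ln x_n = 0$$
Solving for $\hat{\alpha}$ we get
$$\hat{\alpha} = -1 - \frac{n}{\ln x_1 + \ldots + \ln x_n} = -1 - \frac{1}{\left( \sum_{i=1}^{n} \ln x_i \right) / n}$$
(note that $\ln x_i < 0$ since 0 < x < 1)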
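The closed-form estimator of Example (6-4) can be sanity-checked numerically. The sketch below assumes the pdf f(x) = (α + 1)x^α on 0 < x < 1 and uses a small hypothetical sample; it evaluates the derivative of the log-likelihood at the estimate and compares the log-likelihood with nearby values of α.

```python
import math


def log_likelihood(alpha, xs):
    # ln L(alpha) = n ln(alpha + 1) + alpha * sum(ln x_i)
    return len(xs) * math.log(alpha + 1.0) + alpha * sum(math.log(x) for x in xs)


def score(alpha, xs):
    # Derivative of ln L with respect to alpha
    return len(xs) / (alpha + 1.0) + sum(math.log(x) for x in xs)


def ml_alpha(xs):
    # Closed-form ML estimate: -1 - n / sum(ln x_i)
    return -1.0 - len(xs) / sum(math.log(x) for x in xs)


xs = [0.5, 0.8, 0.9, 0.7]  # hypothetical observations in (0, 1)
alpha_hat = ml_alpha(xs)
print(alpha_hat, score(alpha_hat, xs))
```

For this sample the estimate is about 1.90, the score is zero up to rounding, and moving α in either direction lowers the log-likelihood, as expected at a maximum.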
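$$P\{-z_{\alpha/2} \le Z \le z_{\alpha/2}\} = 1 - \alpha \quad \Rightarrow \quad P\left\{ -z_{\alpha/2} \le \frac{\hat{\mu}_X - \mu_X}{\sigma_X / \sqrt{n}} \le z_{\alpha/2} \right\} = 1 - \alpha$$
$$P\left\{ \hat{\mu}_X - z_{\alpha/2}\, \frac{\sigma_X}{\sqrt{n}} \le \mu_X \le \hat{\mu}_X + z_{\alpha/2}\, \frac{\sigma_X}{\sqrt{n}} \right\} = 1 - \alpha$$
[Figure: standard normal pdf with area α/2 in each tail beyond ±z_{α/2}; the confidence interval runs from $\hat{\mu}_X - z_{\alpha/2}\sigma_X/\sqrt{n}$ to $\hat{\mu}_X + z_{\alpha/2}\sigma_X/\sqrt{n}$, and the error is the distance between $\hat{\mu}_X$ and $\mu_X$.]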
- Definition:
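If $\hat{\mu}_X$ is the sample mean of a random sample of size (n) from a population with known variance $\sigma_X^2$, a 100(1 – α)% confidence interval on $\mu_X$ is given by:
$$\hat{\mu}_X - z_{\alpha/2}\, \frac{\sigma_X}{\sqrt{n}} \le \mu_X \le \hat{\mu}_X + z_{\alpha/2}\, \frac{\sigma_X}{\sqrt{n}}$$
If the sample size (n) can be controlled, we can choose (n) so that we are 100(1 – α)% confident that the error in estimating $\mu_X$ is less than a specified error (E).
(n) is chosen such that
$$E = z_{\alpha/2}\, \frac{\sigma_X}{\sqrt{n}} \quad \Rightarrow \quad n = \left( \frac{z_{\alpha/2}\, \sigma_X}{E} \right)^2$$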
EXAMPLE (6-5):
The following samples are drawn from a population that is known to be Gaussian.
7.31 10.80 11.27 11.91 5.51 8.00 9.03 14.42 10.24 10.91
Find the confidence limits for a 95% confidence level if the variance of the population is 4.
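SOLUTION:
From the sample we have:
n = 10
$\hat{\mu}_X = \frac{1}{n} \sum_{i=1}^{n} x_i = 9.94$
$z_{\alpha/2} = z_{0.025} = 1.96$
$$P\left\{ \hat{\mu}_X - z_{\alpha/2}\, \frac{\sigma_X}{\sqrt{n}} \le \mu_X \le \hat{\mu}_X + z_{\alpha/2}\, \frac{\sigma_X}{\sqrt{n}} \right\} = 1 - \alpha$$
$$P\left\{ 9.94 - \frac{1.96 \sqrt{4}}{\sqrt{10}} \le \mu_X \le 9.94 + \frac{1.96 \sqrt{4}}{\sqrt{10}} \right\} = 0.95$$
$$P\{8.70 \le \mu_X \le 11.18\} = 0.95$$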
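The interval in Example (6-5) can be recomputed from the raw data. This is a sketch assuming a known population variance of 4 and the tabulated value z = 1.96 for a 95% level; the function name is hypothetical.

```python
import math

data = [7.31, 10.80, 11.27, 11.91, 5.51, 8.00, 9.03, 14.42, 10.24, 10.91]


def z_interval(xs, sigma, z):
    # Confidence interval for the mean when the population
    # standard deviation sigma is known.
    n = len(xs)
    mean = sum(xs) / n
    half_width = z * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width


lower, upper = z_interval(data, sigma=math.sqrt(4.0), z=1.96)
print(round(lower, 2), round(upper, 2))
```

The half-width depends only on z, σ and n, so quadrupling the sample size would halve the interval length.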
- Definition:
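Let X1, X2, ……, Xn be a random sample from a normal distribution with unknown mean $\mu_X$ and unknown variance $\sigma_X^2$. The quantity
$$T = \frac{\hat{\mu}_X - \mu_X}{\hat{\sigma}_X / \sqrt{n}}$$
has a t-distribution with (n – 1) degrees of freedom.
The t-pdf is:
$$f_T(t) = \frac{\Gamma\!\left( \frac{k + 1}{2} \right)}{\sqrt{\pi k}\; \Gamma\!\left( \frac{k}{2} \right)} \cdot \frac{1}{\left( \frac{t^2}{k} + 1 \right)^{(k + 1)/2}}, \qquad -\infty < t < \infty$$
(k) is the number of degrees of freedom.
The mean of the t-distribution is zero and the variance is $\dfrac{k}{k - 2}$ (for k > 2).
The t-distribution is symmetric and unimodal, with its maximum at t = 0 (quite similar to the normal distribution; as k → ∞, the t-distribution approaches the standard normal distribution).
[Figure: t pdf fT(t) with area α/2 in each tail beyond ±t_{α/2}.]
$$P\{-t_{\alpha/2,\, n-1} \le T \le t_{\alpha/2,\, n-1}\} = 1 - \alpha$$
$T = \dfrac{\hat{\mu}_X - \mu_X}{\hat{\sigma}_X / \sqrt{n}}$ follows the t-distribution with (n – 1) degrees of freedom.
$t_{\alpha/2,\, n-1}$ is the upper 100(α/2)% point of the t-distribution with (n – 1) degrees of freedom.
$$P\left\{ -t_{\alpha/2,\, n-1} \le \frac{\hat{\mu}_X - \mu_X}{\hat{\sigma}_X / \sqrt{n}} \le t_{\alpha/2,\, n-1} \right\} = 1 - \alpha$$
$$P\left\{ \hat{\mu}_X - t_{\alpha/2,\, n-1}\, \frac{\hat{\sigma}_X}{\sqrt{n}} \le \mu_X \le \hat{\mu}_X + t_{\alpha/2,\, n-1}\, \frac{\hat{\sigma}_X}{\sqrt{n}} \right\} = 1 - \alpha$$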
- Definition:
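If $\hat{\mu}_X$ and $\hat{\sigma}_X$ are the mean and standard deviation of a random sample from a normal distribution with unknown variance $\sigma_X^2$, the 100(1 – α)% confidence interval on $\mu_X$ is:
$$P\left\{ \hat{\mu}_X - t_{\alpha/2,\, n-1}\, \frac{\hat{\sigma}_X}{\sqrt{n}} \le \mu_X \le \hat{\mu}_X + t_{\alpha/2,\, n-1}\, \frac{\hat{\sigma}_X}{\sqrt{n}} \right\} = 1 - \alpha$$
where $t_{\alpha/2,\, n-1}$ is the upper 100(α/2)% point of the t-distribution with (n – 1) degrees of freedom.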
EXAMPLE (6-6):
For the following samples drawn from a normal population:
7.31 10.80 11.27 11.91 5.51 8.00 9.03 14.42 10.24 10.91
Find 95% confidence interval for the mean if the variance of the population is unknown.
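SOLUTION:
From the sample we have:
$\hat{\mu}_X = \frac{1}{n} \sum_{i=1}^{n} x_i = 9.94$
$\hat{\sigma}_X^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \hat{\mu}_X)^2 = 6.51$
With n – 1 = 9 degrees of freedom, $t_{0.025,\, 9} = 2.262$.
$$P\left\{ \hat{\mu}_X - t_{\alpha/2,\, n-1}\, \frac{\hat{\sigma}_X}{\sqrt{n}} \le \mu_X \le \hat{\mu}_X + t_{\alpha/2,\, n-1}\, \frac{\hat{\sigma}_X}{\sqrt{n}} \right\} = 1 - \alpha$$
$$P\left\{ 9.94 - 2.262\, \frac{\sqrt{6.51}}{\sqrt{10}} \le \mu_X \le 9.94 + 2.262\, \frac{\sqrt{6.51}}{\sqrt{10}} \right\} = 0.95$$
$$P\{8.11 \le \mu_X \le 11.77\} = 0.95$$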
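[Figure: chi-square pdf with area α/2 in each tail, below $\chi^2_{1-\alpha/2,\, n}$ and above $\chi^2_{\alpha/2,\, n}$, and area 1 – α between them.]
$$P\left\{ \frac{n\, \hat{\sigma}_X^2}{\chi^2_{\alpha/2,\, n}} \le \sigma_X^2 \le \frac{n\, \hat{\sigma}_X^2}{\chi^2_{1-\alpha/2,\, n}} \right\} = 1 - \alpha$$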
- Definition:
If $\hat{\sigma}_X^2$ is the sample variance from a random sample of (n) observations from a normal distribution with a known mean and an unknown variance $\sigma_X^2$, then a 100(1 – α)% confidence interval on $\sigma_X^2$ is:
$$\frac{n\, \hat{\sigma}_X^2}{\chi^2_{\alpha/2,\, n}} \le \sigma_X^2 \le \frac{n\, \hat{\sigma}_X^2}{\chi^2_{1-\alpha/2,\, n}}$$
where $\chi^2_{\alpha/2,\, n}$ and $\chi^2_{1-\alpha/2,\, n}$ are the upper and lower 100(α/2)% points of the chi-square distribution with (n) degrees of freedom, respectively.
EXAMPLE (6-7):
For the following samples drawn from a normal population:
7.31 10.80 11.27 11.91 5.51 8.00 9.03 14.42 10.24 10.91
Find 95% confidence interval for estimation of the variance if the mean of the population is
known to be 10.
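SOLUTION:
From the sample we have:
$\hat{\mu}_X = \frac{1}{n} \sum_{i=1}^{n} x_i = 9.94$ and $\hat{\sigma}_X^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_X)^2 = 5.866$
From tables of the $\chi^2$-distribution with n = 10 degrees of freedom: $\chi^2_{0.025,\, 10} = 20.483$ and $\chi^2_{0.975,\, 10} = 3.247$.
$$P\left\{ \frac{10 \times 5.866}{20.483} \le \sigma_X^2 \le \frac{10 \times 5.866}{3.247} \right\} = 0.95$$
$$P\{2.863 \le \sigma_X^2 \le 18.065\} = 0.95$$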
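$$\chi^2 = \sum_{i=1}^{n} \frac{(x_i - \hat{\mu}_X)^2}{\sigma_X^2} = \frac{(n - 1)\, \hat{\sigma}_X^2}{\sigma_X^2} \quad ; \quad \hat{\sigma}_X^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \hat{\mu}_X)^2$$
where $\chi^2_{\alpha/2,\, n-1}$ and $\chi^2_{1-\alpha/2,\, n-1}$ are the upper and lower 100(α/2)% points of the chi-square distribution with (n – 1) degrees of freedom, respectively.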
EXAMPLE (6-8):
For the following samples drawn from a normal population:
7.31 10.80 11.27 11.91 5.51 8.00 9.03 14.42 10.24 10.91
Find 95% confidence interval for estimation of the variance if the mean of the population is
unknown.
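SOLUTION:
From the sample we have:
$\hat{\mu}_X = \frac{1}{n} \sum_{i=1}^{n} x_i = 9.94$
$\hat{\sigma}_X^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \hat{\mu}_X)^2 = 6.51$
$$P\left\{ \frac{(n - 1)\, \hat{\sigma}_X^2}{\chi^2_{\alpha/2,\, n-1}} \le \sigma_X^2 \le \frac{(n - 1)\, \hat{\sigma}_X^2}{\chi^2_{1-\alpha/2,\, n-1}} \right\} = 1 - \alpha$$
$$P\left\{ \frac{9 \times 6.51}{19.023} \le \sigma_X^2 \le \frac{9 \times 6.51}{2.7} \right\} = 0.95$$
$$P\{3.0799 \le \sigma_X^2 \le 21.7\} = 0.95$$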
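- To find a 100(1 – α)% confidence interval on the binomial proportion using the normal approximation we construct the statistic:
$$Z = \frac{X - np}{\sqrt{np(1 - p)}} = \frac{\hat{P} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$
$$P\{-z_{\alpha/2} \le Z \le z_{\alpha/2}\} = 1 - \alpha$$
$$P\left\{ -z_{\alpha/2} \le \frac{\hat{P} - p}{\sqrt{\dfrac{p(1 - p)}{n}}} \le z_{\alpha/2} \right\} = 1 - \alpha$$
$$P\left\{ \hat{P} - z_{\alpha/2} \sqrt{\frac{p(1 - p)}{n}} \le p \le \hat{P} + z_{\alpha/2} \sqrt{\frac{p(1 - p)}{n}} \right\} = 1 - \alpha$$
The last equation expresses the upper and lower limits of the confidence interval in terms of the unknown parameter.
- The solution is to replace (p) by $\hat{P}$ in $\sqrt{\dfrac{p(1 - p)}{n}}$ so that:
$$P\left\{ \hat{P} - z_{\alpha/2} \sqrt{\frac{\hat{P}(1 - \hat{P})}{n}} \le p \le \hat{P} + z_{\alpha/2} \sqrt{\frac{\hat{P}(1 - \hat{P})}{n}} \right\} = 1 - \alpha$$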
EXAMPLE (6-9):
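In a random sample of 85 automobile engine crankshaft bearings, 10 have a surface finish that is rougher than the specifications allow. A 95% confidence interval for (p) is:
$z_{\alpha/2} = z_{0.025} = 1.96$ and $\hat{P} = \dfrac{x}{n} = \dfrac{10}{85} = 0.12$
$$P\left\{ \hat{P} - z_{\alpha/2} \sqrt{\frac{\hat{P}(1 - \hat{P})}{n}} \le p \le \hat{P} + z_{\alpha/2} \sqrt{\frac{\hat{P}(1 - \hat{P})}{n}} \right\} = 1 - \alpha$$
$$P\left\{ 0.12 - 1.96 \sqrt{\frac{0.12(1 - 0.12)}{85}} \le p \le 0.12 + 1.96 \sqrt{\frac{0.12(1 - 0.12)}{85}} \right\} = 0.95$$
$$P\{0.05 \le p \le 0.19\} = 0.95$$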
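The crankshaft-bearing interval can be reproduced with a few lines. This sketch uses the unrounded proportion 10/85 rather than 0.12, so the limits differ from the worked values in the third decimal place; the function name is hypothetical and z = 1.96 is the tabulated 95% value.

```python
import math


def proportion_interval(successes, n, z):
    # Normal-approximation confidence interval for a binomial proportion,
    # with p replaced by its estimate inside the square root.
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, p_hat - half_width, p_hat + half_width


p_hat, lower, upper = proportion_interval(10, 85, 1.96)
print(round(p_hat, 4), round(lower, 3), round(upper, 3))
```

Rounded to two decimals, the limits agree with the interval 0.05 ≤ p ≤ 0.19 obtained in the example.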
ENGINEERING DECISION CHAPTER VII
CHAPTER VI
ENGINEERING DECISION
Hypothesis Testing:
In the last chapter we illustrated how a parameter can be estimated (point or interval estimation) from sample data. However, many problems require that we decide whether to accept or reject a statement about some parameter. The statement is called a hypothesis, and the decision-making procedure is called Hypothesis Testing.
- Definition:
The power of the test (1 – β) is the probability of accepting the alternative hypothesis when the alternative hypothesis is true.
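[Figure: acceptance and rejection regions for testing H0: θ = θ0 against the alternatives H1: θ < θ0, H1: θ > θ0 and H1: θ ≠ θ0; H0 is rejected (H1 accepted) when the estimate $\hat{\theta}$ falls in the critical region around $\theta_0$.]
[Figure: pdf of the test statistic, N(0, 1) under H0 and a shifted unit-variance normal under H1, with the critical value $z_\alpha$ marked.]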
EXAMPLE (7-1):
Aircrew escape systems are powered by a solid propellant. The burning rate of this propellant is an important product characteristic. Specifications require that the mean burning rate must be 50 cm/s. We know that the standard deviation of the burning rate is 2 cm/s. The experimenter decided to specify a type I error probability, or significance level, of α = 0.05. He selects a random sample of n = 25 and obtains a sample average burning rate of $\hat{\mu}$ = 51.3 cm/s. What conclusions should be drawn?
SOLUTION:
Test H0 : μ = 50 cm/s against H1 : μ ≠ 50 cm/s, with α = 0.05.
Reject H0 if z > 1.96 or z < -1.96.
The test statistic is
$$z = \frac{\hat{\mu} - \mu_0}{\sigma / \sqrt{n}} = \frac{51.3 - 50}{2 / \sqrt{25}} = 3.25$$
Since 3.25 > 1.96, H0 is rejected at the 0.05 level: the data indicate that the mean burning rate differs from 50 cm/s.
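The two-sided z-test of Example (7-1) can be carried out as follows. This is a sketch under the example's assumptions (known σ = 2 cm/s, n = 25, α = 0.05, critical value 1.96). The p-value line is an addition for illustration and is not part of the notes; the function names are hypothetical.

```python
import math


def phi(z):
    # Standard normal cdf via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_sided_z_test(sample_mean, mu0, sigma, n, z_crit):
    # Test statistic, reject/accept decision and two-sided p-value for
    # H0: mu = mu0 against H1: mu != mu0 with known sigma.
    z0 = (sample_mean - mu0) / (sigma / math.sqrt(n))
    reject = abs(z0) > z_crit
    p_value = 2.0 * (1.0 - phi(abs(z0)))
    return z0, reject, p_value


z0, reject, p_value = two_sided_z_test(51.3, 50.0, 2.0, 25, 1.96)
print(round(z0, 2), reject, round(p_value, 4))
```

The statistic is 3.25, which lies outside ±1.96, so H0: μ = 50 cm/s is rejected at the 0.05 level.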
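Null hypothesis | Test statistic | Alternative hypothesis | Criteria for rejection
H0 : $\mu = \mu_0$, $\sigma^2$ known | $Z = \dfrac{\hat{\mu} - \mu_0}{\sigma / \sqrt{n}}$, N(0 , 1) | H1 : $\mu \ne \mu_0$ | $|Z| > z_{\alpha/2}$
 | | H1 : $\mu > \mu_0$ | $Z > z_{\alpha}$
 | | H1 : $\mu < \mu_0$ | $Z < -z_{\alpha}$
H0 : $\mu = \mu_0$, $\sigma^2$ unknown | $T = \dfrac{\hat{\mu} - \mu_0}{\hat{\sigma} / \sqrt{n}}$, student t-distribution with (n – 1) degrees of freedom | H1 : $\mu \ne \mu_0$ | $|t| > t_{\alpha/2,\, n-1}$
 | | H1 : $\mu > \mu_0$ | $t > t_{\alpha,\, n-1}$
 | | H1 : $\mu < \mu_0$ | $t < -t_{\alpha,\, n-1}$
H0 : $\sigma^2 = \sigma_0^2$, $\mu$ unknown | $\chi^2 = \dfrac{(n - 1)\, \hat{\sigma}^2}{\sigma_0^2}$, chi-square distribution with (n – 1) degrees of freedom | H1 : $\sigma^2 \ne \sigma_0^2$ | $\chi^2 > \chi^2_{\alpha/2,\, n-1}$ or $\chi^2 < \chi^2_{1-\alpha/2,\, n-1}$
 | | H1 : $\sigma^2 > \sigma_0^2$ | $\chi^2 > \chi^2_{\alpha,\, n-1}$
 | | H1 : $\sigma^2 < \sigma_0^2$ | $\chi^2 < \chi^2_{1-\alpha,\, n-1}$
H0 : $\sigma^2 = \sigma_0^2$, $\mu$ known | $\chi^2 = \dfrac{n\, \hat{\sigma}^2}{\sigma_0^2}$, chi-square distribution with (n) degrees of freedom | H1 : $\sigma^2 \ne \sigma_0^2$ | $\chi^2 > \chi^2_{\alpha/2,\, n}$ or $\chi^2 < \chi^2_{1-\alpha/2,\, n}$
 | | H1 : $\sigma^2 > \sigma_0^2$ | $\chi^2 > \chi^2_{\alpha,\, n}$
 | | H1 : $\sigma^2 < \sigma_0^2$ | $\chi^2 < \chi^2_{1-\alpha,\, n}$
H0 : p = p0 | $Z = \dfrac{X - n p_0}{\sqrt{n p_0 (1 - p_0)}} = \dfrac{\hat{P} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$, N(0 , 1) | H1 : $p \ne p_0$ | $|Z| > z_{\alpha/2}$
 | | H1 : $p > p_0$ | $Z > z_{\alpha}$
 | | H1 : $p < p_0$ | $Z < -z_{\alpha}$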
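- Population (1) has mean $\mu_1$ and variance $\sigma_1^2$, population (2) has mean $\mu_2$ and variance $\sigma_2^2$.
Inferences will be based on two random samples of sizes (n1) and (n2).
That is, X11, X12, ……, X1n1 is a random sample of (n1) observations from population 1, and X21, X22, ……, X2n2 is a random sample of (n2) observations from population 2.
When $\sigma_1^2$ and $\sigma_2^2$ are known, the test statistic for H0 : $\mu_1 - \mu_2 = \Delta_0$ is
$$Z = \frac{\hat{\mu}_1 - \hat{\mu}_2 - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
and the criteria for rejection are:
H1 : $\mu_1 - \mu_2 \ne \Delta_0$ : $Z > z_{\alpha/2}$ or $Z < -z_{\alpha/2}$
H1 : $\mu_1 - \mu_2 > \Delta_0$ : $Z > z_{\alpha}$
H1 : $\mu_1 - \mu_2 < \Delta_0$ : $Z < -z_{\alpha}$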
H1 : $\mu_1 - \mu_2 > \Delta_0$ : $t > t_{\alpha,\, n_1 + n_2 - 2}$
H1 : $\mu_1 - \mu_2 < \Delta_0$ : $t < -t_{\alpha,\, n_1 + n_2 - 2}$
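If $\hat{\mu}_1$, $\hat{\mu}_2$, $S_1^2$, and $S_2^2$ are the means and variances of two random samples of sizes (n1) and (n2) respectively from two independent normal populations with unknown but equal variances, then a 100(1 – α)% confidence interval on the difference in means ($\mu_1 - \mu_2$) is:
$$\hat{\mu}_1 - \hat{\mu}_2 - t_{\alpha/2,\, n_1 + n_2 - 2}\; s_P \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le \hat{\mu}_1 - \hat{\mu}_2 + t_{\alpha/2,\, n_1 + n_2 - 2}\; s_P \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$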
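CASE II: $\sigma_1^2 \ne \sigma_2^2$
If H0 : $\mu_1 - \mu_2 = \Delta_0$ is true, then the test statistic is
$$T^* = \frac{\hat{\mu}_1 - \hat{\mu}_2 - \Delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
If $\hat{\mu}_1$, $\hat{\mu}_2$, $S_1^2$, and $S_2^2$ are the means and variances of two random samples of sizes (n1) and (n2) respectively from two independent normal populations with unknown and unequal variances, then an approximate 100(1 – α)% confidence interval on the difference in means ($\mu_1 - \mu_2$) is:
$$\hat{\mu}_1 - \hat{\mu}_2 - t_{\alpha/2,\, \nu} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \le \mu_1 - \mu_2 \le \hat{\mu}_1 - \hat{\mu}_2 + t_{\alpha/2,\, \nu} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$
where $\nu$ is the number of degrees of freedom of the approximating t-distribution.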
- Definition:
Let X11, X12, ……, X1n1 be a random sample from a normal population with mean $\mu_1$ and variance $\sigma_1^2$, and let X21, X22, ……, X2n2 be a random sample from a second normal population with mean $\mu_2$ and variance $\sigma_2^2$. Assume that both normal populations are independent. Let $S_1^2$ and $S_2^2$ be the sample variances, then the ratio:
$$F = \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2}$$
has an F distribution with (n1 – 1) numerator degrees of freedom and (n2 – 1) denominator degrees of freedom.
Test Statistic: $F_0 = \dfrac{S_1^2}{S_2^2}$
H1 : $\sigma_1^2 > \sigma_2^2$ : $f_0 > f_{\alpha,\, n_1 - 1,\, n_2 - 1}$
H1 : $p_1 > p_2$ : $Z_0 > z_{\alpha}$
H1 : $p_1 < p_2$ : $Z_0 < -z_{\alpha}$