Sta 2200 Notes
DESCRIPTION
Random variables: discrete and continuous; probability mass, density and distribution
functions; expectation, variance, percentiles and mode. Moments and moment generating
functions. Moment generating function and transformation. Change of variable technique for
univariate distributions. Probability distributions: hypergeometric, binomial, Poisson, uniform,
normal, beta and gamma. Statistical inference including one-sample normal and t tests.
Pre-Requisites: STA 2100 Probability and Statistics I, SMA 2104 Mathematics for Science
Course Text Books
1) RV Hogg, JW McKean & AT Craig Introduction to Mathematical Statistics, 6th ed.,
Prentice Hall, 2003 ISBN 0-13-177698-3
2) J Crawshaw & J Chambers A Concise Course in A-Level statistics, with worked examples,
3rd ed. Stanley Thornes, 1994 ISBN 0-534-42362-0
Course Journals:
1) Journal of Applied Statistics (J. Appl. Stat.) [0266-4763; 1360-0532]
2) Statistics (Statistics) [0233-1888]
Proverbs 21:5 The plans of the diligent lead to profit as surely as haste leads to poverty By J. K. Kiingati Page 1
Time is precious, but we do not know yet how precious it really is. We will only know when we are no longer able to take advantage of it…
1. RANDOM VARIABLES
1.1 Introduction
In applications of probability, we are often interested in a number associated with the outcome
of a random experiment. A quantity whose value is determined by the outcome of a
random experiment is called a random variable. It can also be defined as any quantity or
attribute whose value varies from one unit of the population to another.
A discrete random variable is a function whose range is finite or countably infinite, i.e. it can
only assume values in a finite or countably infinite set. A continuous random variable is
one that can take any value in an interval of real numbers. (There are uncountably many real
numbers in an interval of positive length.)
Note: The probability that X takes on the value x, i.e. $P(X = x)$, is defined as the sum of the
probabilities of all points in S that are assigned the value x.
We can say that this pmf places mass 3/8 on the value X = 2.
The “masses” (or probabilities) for a pmf should be between 0 and 1.
The total mass (i.e. total probability) must add up to 1.
Definition: The probability mass function of a discrete variable is a graph, table, or formula
that specifies the proportions (or probabilities) associated with each possible value the random
variable can take. The mass function $P(X = x)$ (or just $p(x)$) has the following properties:
$0 \le p(x) \le 1$ and $\sum_{\text{all } x} p(x) = 1$
More generally, let X have the following properties:
i) It is a discrete variable that can only assume the values $x_1, x_2, \ldots, x_n$.
ii) The probabilities associated with these values are
$P(X = x_1) = p_1,\ P(X = x_2) = p_2,\ \ldots,\ P(X = x_n) = p_n$
Then X is a discrete random variable if $0 \le p_i \le 1$ and $\sum_{i=1}^{n} p_i = 1$.
Remark: We denote random variables with capital letters while realized or particular values
are denoted by lower case letters.
Example 1
Two tetrahedral dice are rolled together once and the sum of the scores facing down was noted.
Find the pmf of the random variable ‘the sum of the scores facing down.’
Solution
The possible sums are shown in the table below.
 +    1   2   3   4
 1    2   3   4   5
 2    3   4   5   6
 3    4   5   6   7
 4    5   6   7   8
Therefore the pmf is given by the table below.
x          2      3      4      5      6      7      8
P(X=x)   1/16   1/8    3/16   1/4    3/16   1/8    1/16
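The counting behind Example 1 can be reproduced mechanically. The sketch below is only an illustration (the names `faces`, `counts` and `pmf` are not from the notes): it enumerates the 16 equally likely outcomes of two fair tetrahedral dice and tallies each sum.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Faces of one fair tetrahedral die (assumed labelled 1 to 4).
faces = [1, 2, 3, 4]

# Count how many of the 16 equally likely outcomes give each sum.
counts = Counter(a + b for a, b in product(faces, repeat=2))

# Convert counts to exact probabilities.
pmf = {s: Fraction(n, len(faces) ** 2) for s, n in sorted(counts.items())}

for s, p in pmf.items():
    print(s, p)
```

Using exact fractions keeps the probabilities in the same form as the hand-built table and makes it easy to confirm that they add up to 1.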
Example 3
A discrete random variable Y has a pmf given by the table below
Y 0 1 2 3 4
P(Y=y) c 2c 5c 10c 17c
Find the value of the constant c hence computes P1 Y 3
Solution
$\sum_{\text{all } y} P(Y = y) = 1 \Rightarrow c(1 + 2 + 5 + 10 + 17) = 1 \Rightarrow c = \frac{1}{35}$
Exercise 1.1
1. A die is loaded such that the probability of a face showing up is proportional to the face
number. Determine the probability of each sample point.
2. Roll a fair die and let X be the square of the score that shows up. Write down the
probability distribution of X hence compute P X 15 and P3 X 30
3. Let X be the random variable the number of fours observed when two dice are rolled
together once. Show that X is a discrete random variable.
4. The pmf of a discrete random variable X is given by P( X x ) kx for x 1, 2 , 3, 4 , 5 , 6
Find the value of the constant k, P X 4 and P3 X 6
5. A fair coin is flipped until a head appears. Let N represent the number of tosses required to
realize a head. Find the pmf of N.
6. A discrete random variable Y has a pmf given by P(Y y ) c 34 for y 0 ,1, 2 , .....
x
b) f(x) cx 2 for x 0 ,1, 2 , .....k d) f(x) c2 x for x for x 0 ,1, 2 , ....
9. A coin is loaded so that heads is three times as likely as tails. For 3 independent
tosses of the coin, find the pmf of the total number of heads realized and the probability of
realizing at most 2 heads.
Remark: A crucial property is that, for any real number x, we have $P(X = x) = 0$ (implying
there is no difference between $P(X \le x)$ and $P(X < x)$); that is, it is not possible to talk about
the probability of the random variable assuming a particular value. Instead, we talk about the
probability of the random variable assuming a value within a given interval. The probability
of the random variable assuming a value within some given interval from $x = a$ to $x = b$ is
defined to be the area under the graph of the probability density function
between $x = a$ and $x = b$.
Example 1
Let X be a continuous random variable. Show that the function
$$f(x) = \begin{cases} \frac{1}{2}x, & 0 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}$$
is a pdf of X, and hence compute $P(0 \le X \le 1)$ and $P(-1 \le X \le 1)$.
Solution
$f(x) \ge 0$ for all x in the interval $0 \le x \le 2$ and
$$\int_0^2 \tfrac{1}{2}x\,dx = \left[\tfrac{1}{4}x^2\right]_0^2 = 1.$$
Therefore f(x) is indeed a pdf of X.
Now
$$P(0 \le X \le 1) = \int_0^1 \tfrac{1}{2}x\,dx = \left[\tfrac{1}{4}x^2\right]_0^1 = \tfrac{1}{4}$$
and
$$P(-1 \le X \le 1) = \int_{-1}^{0} 0\,dx + \int_0^1 \tfrac{1}{2}x\,dx = 0 + \tfrac{1}{4} = \tfrac{1}{4}.$$
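A quick numerical check of this kind of calculation is sometimes useful. The sketch below assumes the density f(x) = x/2 on [0, 2] from Example 1 and uses a simple midpoint rule; the helper `integrate` is an illustrative name, not something defined in these notes.

```python
def f(x):
    """Density assumed from Example 1: x/2 on [0, 2], zero elsewhere."""
    return 0.5 * x if 0 <= x <= 2 else 0.0


def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


total_area = integrate(f, 0, 2)    # should be close to 1 for a valid pdf
p_0_to_1 = integrate(f, 0, 1)      # P(0 <= X <= 1)
p_m1_to_1 = integrate(f, -1, 1)    # P(-1 <= X <= 1); f is zero below 0

print(total_area, p_0_to_1, p_m1_to_1)
```

Because the density is zero outside [0, 2], integrating from -1 gives the same probability as integrating from 0.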
Example 2
The time X, in hours, between computer failures is a continuous random variable with density
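$$f(x) = \begin{cases} \lambda e^{-0.01x}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$
Find $\lambda$ and hence compute $P(50 < X < 150)$ and $P(X < 100)$.
Solution
$f(x) \ge 0$ for all x in $0 < x < \infty$. Thus
$$1 = \int_0^\infty \lambda e^{-0.01x}\,dx = \left[-100\lambda e^{-0.01x}\right]_0^\infty = 100\lambda \Rightarrow \lambda = 0.01.$$
$$P(50 < X < 150) = \int_{50}^{150} 0.01e^{-0.01x}\,dx = \left[-e^{-0.01x}\right]_{50}^{150} = e^{-0.5} - e^{-1.5} = 0.3834005$$
and
$$P(X < 100) = \int_0^{100} 0.01e^{-0.01x}\,dx = \left[-e^{-0.01x}\right]_0^{100} = 1 - e^{-1} = 0.6321206$$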
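The two probabilities in Example 2 can be checked with the closed-form cdf of this density, F(x) = 1 - e^{-0.01x} for x > 0. A minimal sketch, with `lam` and `cdf` as illustrative names:

```python
from math import exp

lam = 0.01  # rate constant found in the solution above


def cdf(x):
    """cdf of the failure-time density: 1 - exp(-lam * x) for x > 0."""
    return 1 - exp(-lam * x) if x > 0 else 0.0


p_50_150 = cdf(150) - cdf(50)   # P(50 < X < 150)
p_lt_100 = cdf(100)             # P(X < 100)

print(round(p_50_150, 7), round(p_lt_100, 7))
```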
Exercise 1.2
1) Suppose that the random variable X has p.d.f. given by
$f(x) = \begin{cases} cx, & 0 \le x \le 1 \\ 0, & \text{elsewhere} \end{cases}$
Find the value of the constant c and hence determine m so that $P(X \le m) = \frac{1}{2}$.
5x k , 0 x 3
2) Let X be a continuous random variable with pdf f(x) Find the value
0, elsewhere
of the constant k hence compute P1 X 3
k (1 y ), 4 x 7
3) A continuous random variable Y has the pdf given by f(y) Find
0, elsewhere
the value of the constant k hence compute PY 5 and P5 Y 6
4) The continuous random variable X has probability density function
k ( x 1)2 , 0 x 2
f( x) Find the value of the constant k k hence compute P X 1.5
0, otherwise
k (1 y 2 ), 1 x 1
5) A continuous r.v Y has probability density function f( x) Find
0, otherwise
the value of the constant k hence compute PY 0.5 and P(0.5 Y 0.5)
kx(1 x) , 0 x 1
6) Let X be a continuous r.v witha pdf f ( x) . Calculate P X 12 14
0, otherwise
6 , 0 x 2
1
Reminder: If the cdf of X is F(x) and the pdf is f(x), then differentiate F(x) to get f(x), and
integrate f(x) to get F(x).
Theorem: For any random variable X and real values a < b, $P(a < X \le b) = F(b) - F(a)$.
Example 1
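Let X be a discrete random variable with pmf given by
$$f(x) = \begin{cases} \frac{1}{20}(1 + x), & x = 1, 2, 3, 4, 5 \\ 0, & \text{elsewhere} \end{cases}$$
Determine the cdf of X and hence compute $P(X > 3)$.
Solution
$$F(x) = \sum_{t=1}^{x} f(t) = \frac{1}{20}\sum_{t=1}^{x}(t + 1) = \frac{1}{20}\bigl(2 + 3 + \cdots + (x + 1)\bigr) = \frac{1}{20}\cdot\frac{x}{2}(x + 3) = \frac{x(x+3)}{40}$$
(Recall that for an AP, $S_n = \frac{n}{2}\left[2a + (n-1)d\right]$.) Hence
$$F(x) = \begin{cases} 0, & x < 1 \\ \frac{x(x+3)}{40}, & x = 1, 2, 3, 4, 5 \\ 1, & x > 5 \end{cases}$$
$$P(X > 3) = 1 - P(X \le 3) = 1 - \frac{3 \times 6}{40} = \frac{11}{20}$$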
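The cdf of a discrete variable is just a running total of the pmf. The sketch below assumes the pmf f(x) = (1 + x)/20 on x = 1, ..., 5 from Example 1 and compares the running total with the closed form x(x + 3)/40; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

support = [1, 2, 3, 4, 5]
pmf = [Fraction(1 + x, 20) for x in support]

# Running totals of the pmf give the cdf at each support point.
cdf = dict(zip(support, accumulate(pmf)))

# Compare the running sum with the closed form x(x + 3)/40.
for x in support:
    assert cdf[x] == Fraction(x * (x + 3), 40)

p_greater_3 = 1 - cdf[3]
print(p_greater_3)
```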
Example 2
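Suppose X is a continuous random variable whose pdf f(x) is given by
$$f(x) = \begin{cases} \frac{1}{2}x, & 0 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}$$
Obtain the cdf of X and hence compute $P\left(X > \frac{2}{3}\right)$.
Solution
$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_0^x \tfrac{1}{2}t\,dt = \left[\tfrac{1}{4}t^2\right]_0^x = \tfrac{1}{4}x^2$$
thus
$$F(x) = \begin{cases} 0, & x < 0 \\ \frac{x^2}{4}, & 0 \le x \le 2 \\ 1, & x > 2 \end{cases}$$
$$P\left(X > \tfrac{2}{3}\right) = 1 - P\left(X \le \tfrac{2}{3}\right) = 1 - \tfrac{1}{4}\left(\tfrac{2}{3}\right)^2 = \tfrac{8}{9}$$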
Example 3
A continuous random variable X has a probability density function given by
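$$f(x) = \begin{cases} 0.25, & 0 \le x \le 2 \\ 0.5x - c, & 2 < x \le 3 \\ 0, & \text{elsewhere} \end{cases}$$
Find the cdf of X and hence compute $P(1.5 < X < 2.5)$.
Solution
The total area under f(x) must be 1, so $\int_0^2 0.25\,dx + \int_2^3 (0.5x - c)\,dx = 0.5 + 1.25 - c = 1$, giving $c = \frac{3}{4}$.
$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \begin{cases} \int_0^x \frac{1}{4}\,dt = \frac{x}{4}, & 0 \le x \le 2 \\ \int \left(\frac{1}{2}t - \frac{3}{4}\right)dt = \frac{x^2 - 3x}{4} + k, & 2 < x \le 3 \end{cases}$$
Under the two levels, F(2) must be the same (the reason for introducing k):
$$F(2) = \frac{2}{4} = \frac{4 - 6}{4} + k \Rightarrow k = 1$$
$$F(x) = \begin{cases} 0, & x < 0 \\ \frac{x}{4}, & 0 \le x \le 2 \\ \frac{x^2 - 3x}{4} + 1, & 2 < x \le 3 \\ 1, & x > 3 \end{cases}$$
$$P(1.5 < X < 2.5) = F(2.5) - F(1.5) = 0.6875 - 0.375 = 0.3125$$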
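A piecewise cdf like this one translates directly into a small function. The sketch below assumes the cdf pieces x/4 on [0, 2] and (x^2 - 3x)/4 + 1 on (2, 3], which follow from c = 3/4 and k = 1; it is an illustration rather than part of the original solution.

```python
def cdf(x):
    """Piecewise cdf for Example 3 (pieces assumed from c = 3/4, k = 1)."""
    if x < 0:
        return 0.0
    if x <= 2:
        return x / 4
    if x <= 3:
        return (x * x - 3 * x) / 4 + 1
    return 1.0


# The two pieces must agree at x = 2 and the cdf must reach 1 at x = 3.
print(cdf(2), cdf(3))

p = cdf(2.5) - cdf(1.5)   # P(1.5 < X < 2.5)
print(p)
```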
Exercise 1.3
x3 k
1. The CDF of a discrete random variable X is given by F( X ) , x 1, 2, 3
40
Example 1
Given the pmf of a random variable X as
$$f(x) = \begin{cases} \frac{x}{6}, & x = 1, 2, 3 \\ 0, & \text{elsewhere} \end{cases}$$
find the pmf of $Y = X^2$.
Solution
The only values of Y with non-zero probabilities are Y = 1, Y = 4 and Y = 9. Now
$P(Y = 1) = P(X^2 = 1) = P(X = 1) = \frac{1}{6}$, $P(Y = 4) = P(X^2 = 4) = P(X = 2) = \frac{1}{3}$ and
$P(Y = 9) = P(X^2 = 9) = P(X = 3) = \frac{1}{2}$
In some cases several values of X will give rise to the same value of Y. The procedure
is just the same as above, but it is necessary to add the probabilities associated
with all the values x that map to the same value y.
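Example 2 Given the pmf of a r.v. X as
$$f(x) = \begin{cases} \frac{x+1}{15}, & x = 0, 1, 2, 3, 4 \\ 0, & \text{elsewhere} \end{cases}$$
find the pmf of $Y = (X - 2)^2$.
Solution
x   0   1   2   3   4
y   4   1   0   1   4
Adding the probabilities of the x values that give the same y:
y          0      1      4
P(Y=y)    1/5    2/5    2/5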
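The "add the probabilities that share a y value" rule can be written as a short routine. This is a sketch under the assumption that the pmf is f(x) = (x + 1)/15 on x = 0, ..., 4 and that Y = (X - 2)^2; `transform_pmf` is a hypothetical helper name.

```python
from collections import defaultdict
from fractions import Fraction


def transform_pmf(pmf, g):
    """Push a discrete pmf through y = g(x), summing mass on equal y values."""
    out = defaultdict(Fraction)
    for x, p in pmf.items():
        out[g(x)] += p
    return dict(sorted(out.items()))


pmf_x = {x: Fraction(x + 1, 15) for x in range(5)}
pmf_y = transform_pmf(pmf_x, lambda x: (x - 2) ** 2)

for y, p in pmf_y.items():
    print(y, p)
```

The same helper handles one-to-one transformations such as Y = X^2 on positive values, where no two x values collide.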
Exercise 1.4
1 6 for x 1, 2 , 3 , 4 , 5 , 6
1. Suppose the pmf of a r.v X is given by f( x) , Obtain the pmf of
0 , elsewhere
Y 2 X 2 and Z X 3
x2 1
for x 0 , 1, 2 , 3
2. Let the pmf of a r.v X be given by f( x) 18 , determine the pmf of
0 , elsewhere
Y X 1
2
x 10 for x 0 , 1, 2 , 3 , 4
3. Suppose the pmf of a r.v X is given by f( x) , Obtain the pmf of
0 , elsewhere
Y X 2
1 for x 1, 2 , 3 , ....
x
Example 1
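A continuous r.v. X has a pdf given by
$$f(x) = \begin{cases} 5x^4, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$
Determine the pdf of $Y = X^3$.
Solution
$Y = X^3 \Rightarrow X = Y^{1/3}$, so $\frac{dx}{dy} = \frac{1}{3}y^{-2/3}$ and
$$g(y) = f(x)\left|\frac{dx}{dy}\right| = 5y^{4/3}\cdot\tfrac{1}{3}y^{-2/3} = \tfrac{5}{3}y^{2/3}$$
Hence
$$g(y) = \begin{cases} \frac{5}{3}y^{2/3}, & 0 \le y \le 1 \\ 0, & \text{otherwise} \end{cases}$$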
Example 2 A r.v. X has pdf
$$f(x) = \begin{cases} 24x^2, & 0 \le x \le \frac{1}{2} \\ 0, & \text{otherwise} \end{cases}$$
Determine the pdf of $Y = 8X^3$.
Solution
$Y = 8X^3 \Rightarrow X = \frac{1}{2}Y^{1/3}$, so $\frac{dx}{dy} = \frac{1}{6}y^{-2/3}$ and
$$g(y) = f(x)\left|\frac{dx}{dy}\right| = 24\left(\tfrac{1}{2}y^{1/3}\right)^2\cdot\tfrac{1}{6}y^{-2/3} = 1$$
Hence
$$g(y) = \begin{cases} 1, & 0 \le y \le 1 \\ 0, & \text{elsewhere} \end{cases}$$
NB: $Y = 8X^3$ is the cdf of X.
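One way to sanity-check a change of variable is through the cdf. The sketch below assumes f(x) = 24x^2 on [0, 1/2], so that F_X(x) = 8x^3, and evaluates P(Y <= y) for Y = 8X^3 by inverting the transformation. If Y is uniform on (0, 1), the result should equal y. Function names are illustrative.

```python
def cdf_x(x):
    """cdf of X for the assumed density f(x) = 24 x^2 on [0, 1/2]."""
    if x < 0:
        return 0.0
    if x > 0.5:
        return 1.0
    return 8 * x ** 3


def cdf_y(y):
    """P(Y <= y) for Y = 8 X^3, using X <= (1/2) y^(1/3)."""
    if y <= 0:
        return 0.0
    if y >= 1:
        return 1.0
    return cdf_x(0.5 * y ** (1 / 3))


for y in (0.1, 0.25, 0.5, 0.9):
    print(y, cdf_y(y))
```

A cdf equal to y on (0, 1) has derivative 1 there, which agrees with the density obtained by the change of variable formula.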
Exercise 1.5
5 x 4 , 0 x 1
1. For a r.v X with f( x) , determine the pdf of Y 2 ln x and its range.
0 otherwise
e x , x 0
2. A r.v X has pdf f( x) determine the pdf of Y X 4
0 otherwise
3. Let X be a continuous random variable with density function f( x) x12 for x 2 and
f( x) 0 otherwise . Determine the density function of Y 1
X 1
.
2
X
4. The probability density function of X is given by f( x) ( x 4)
2
Obtain
0 otherwise
the probability density function of Y tan1 ( 2x )
5. Which transformation will change a r.v X with pdf is as below to a uniform R.V Y whose
2e 2 x , x 0 1 (x - 3), 3 x 5
range is 0 x 1 a) f( x) b) f( x) 2
0 otherwise 0 otherwise
for x 1
6. Suppose that X has probability density function f( x) x ln If U X 2 , what
0 , elsewhere
is the probability density function f(u) of U?
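(iv) $E[ag_1(X) + bg_2(X)] = \int_{\text{all } x}\left[ag_1(x) + bg_2(x)\right]f(x)\,dx = \int_{\text{all } x} ag_1(x)f(x)\,dx + \int_{\text{all } x} bg_2(x)f(x)\,dx$
$= E[ag_1(X)] + E[bg_2(X)] = aE[g_1(X)] + bE[g_2(X)]$, i.e. from part (iii).
Proof:
Recall that $E(aX + b) = a\mu + b$, therefore
$$Var(aX + b) = E\left[(aX + b) - (a\mu + b)\right]^2 = E\left[a(X - \mu)\right]^2 = E\left[a^2(X - \mu)^2\right] = a^2E(X - \mu)^2 = a^2\,Var(X)$$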
Remark
(i) The expected value of X always lies between the smallest and largest values of X.
(ii) In computations, bear in mind that variance cannot be negative!
Example 1
Given a probability distribution of X as below, find the mean and standard deviation of X.
x 0 1 2 3
P(X=x) 1/8 1/4 3/8 1/4
Solution
x             0      1      2      3     total
P(X=x)       1/8    1/4    3/8    1/4      1
xP(X=x)       0     1/4    3/4    3/4     7/4
x^2 P(X=x)    0     1/4    3/2    9/4      4
$E(X) = \sum_{x=0}^{3} xP(X = x) = 1.75$ and
standard deviation $\sigma = \sqrt{E(X^2) - \mu^2} = \sqrt{4 - 1.75^2} = 0.968246$
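The tabular calculation in Example 1 amounts to two weighted sums. A minimal sketch, assuming the pmf 1/8, 1/4, 3/8, 1/4 on x = 0, 1, 2, 3 (variable names are illustrative):

```python
from fractions import Fraction
from math import sqrt

pmf = {0: Fraction(1, 8), 1: Fraction(1, 4), 2: Fraction(3, 8), 3: Fraction(1, 4)}

mean = sum(x * p for x, p in pmf.items())               # E(X)
second_moment = sum(x * x * p for x, p in pmf.items())  # E(X^2)
variance = second_moment - mean ** 2                    # E(X^2) - mu^2
sd = sqrt(variance)

print(mean, second_moment, variance, round(sd, 6))
```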
Example 2 The probability distribution of a r.v. X is as shown below. Find the mean and
standard deviation of: a) X b) $Y = 12X + 6$.
x         0     1     2
P(X=x)   1/6   1/2   1/3
Solution
x             0      1      2     total
P(X=x)       1/6    1/2    1/3      1
xP(X=x)       0     1/2    2/3     7/6
x^2 P(X=x)    0     1/2    4/3    11/6
$E(X) = \sum_{x=0}^{2} xP(X = x) = \frac{7}{6}$ and $E(X^2) = \sum_{x=0}^{2} x^2P(X = x) = \frac{11}{6}$
Standard deviation $\sigma = \sqrt{E(X^2) - \mu^2} = \sqrt{\frac{11}{6} - \left(\frac{7}{6}\right)^2} = \sqrt{\frac{17}{36}} \approx 0.6872$
Now $E(Y) = 12E(X) + 6 = 12\left(\frac{7}{6}\right) + 6 = 20$
$Var(Y) = Var(12X + 6) = 12^2\,Var(X) = 144 \times \frac{17}{36} = 68$, so the standard deviation of Y is $\sqrt{68} \approx 8.246$
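The rules E(aX + b) = aE(X) + b and Var(aX + b) = a^2 Var(X) can be checked by computing the moments of Y directly from its own pmf. The sketch below assumes the pmf 1/6, 1/2, 1/3 on x = 0, 1, 2 and Y = 12X + 6; `mean_var` is a hypothetical helper.

```python
from fractions import Fraction


def mean_var(pmf):
    """Return (mean, variance) of a discrete pmf given as {value: probability}."""
    mean = sum(v * p for v, p in pmf.items())
    var = sum(v * v * p for v, p in pmf.items()) - mean ** 2
    return mean, var


pmf_x = {0: Fraction(1, 6), 1: Fraction(1, 2), 2: Fraction(1, 3)}
a, b = 12, 6

# Y = aX + b takes the value a*x + b with the same probability as X = x.
pmf_y = {a * x + b: p for x, p in pmf_x.items()}

mean_x, var_x = mean_var(pmf_x)
mean_y, var_y = mean_var(pmf_y)

print(mean_x, var_x)
print(mean_y, var_y)
print(mean_y == a * mean_x + b, var_y == a ** 2 * var_x)
```

Note that the variance scales by a^2 while the standard deviation scales by |a|, which is an easy place to slip in hand calculations.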
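Example 3 A continuous r.v. X has a pdf given by
$$f(x) = \begin{cases} \frac{1}{2}x, & 0 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}$$
Find the mean and standard deviation of X.
Solution
$E(X) = \int_{-\infty}^{\infty} xf(x)\,dx = \int_0^2 \tfrac{1}{2}x^2\,dx = \left[\tfrac{1}{6}x^3\right]_0^2 = \tfrac{4}{3}$ and $E(X^2) = \int_{-\infty}^{\infty} x^2f(x)\,dx = \int_0^2 \tfrac{1}{2}x^3\,dx = \left[\tfrac{1}{8}x^4\right]_0^2 = 2$
Standard deviation $\sigma = \sqrt{E(X^2) - \mu^2} = \sqrt{2 - \left(\tfrac{4}{3}\right)^2} = \tfrac{\sqrt{2}}{3} \approx 0.4714$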
Exercise 1.6
1. Suppose X has a probability mass function given by the table below
x 2 3 4 5 6
P(X=x) 0.01 0.25 0.4 0.3 0.04
Find the mean and variance of; X
2. Suppose X has a probability mass function given by the table below
x 11 12 13 14 15
P(X=x) 0.4 0.2 0.2 0.1 0.1
Find the mean and variance of; X
3. Let X be a random variable with P(X = 1) = 0.2, P(X = 2) = 0.3, and P(X = 3) = 0.5. What is the
expected value and standard deviation of; a) X b) Y 5 X 10 ?
4. A random variable W has the probability distribution shown below,
w 0 1 2 3
P(W=w) 2d 0.3 d 0.1
Find the values of the constant d hence determine the mean and variance of W. Also find
the mean and variance of Y 10 X 25
k (1 10
m
) for M 10
16 A continuous r.v M has the pdf given by f(m) , find the value of
0 , elsewhere
the constant k, the mean and the variance of X
k (1 x) for 0 x 1
17 A continuous r.v X has the pdf given by f(x) , findt the value of
0 , elsewhere
the constant k. Also find the mean and the variance of X
d 2 if T 1
18 The lifetime of new bus engines, T years, has continuous pdf f(t) t find the
0 , if T 1
value of the constant d hence determine the mean and standard deviation of T
19 An archer shoots an arrow at a target. The distance of the arrow from the centre of the
target is a random variable X whose p.d.f. is given by
$$f(x) = \begin{cases} k(3 + 2x - x^2), & x \le 3 \\ 0, & x > 3 \end{cases}$$
Find the value of the constant k. Also find the mean and standard deviation of X.
20 The random variable Y has probability density function f(y) given by
$$f(y) = \begin{cases} ky(a - y), & 0 \le y \le 3 \\ 0, & \text{elsewhere} \end{cases}$$
where k and a are positive constants.
a) Explain why a ≥ 3 and then show that $k = \frac{2}{9(a - 2)}$
b) Given that E(Y) = 1.75, show that a = 4 and write down the value of k.
c) For these values of a and k, sketch the probability density function.
d) Write down the mode of Y.
21 A continuous random variable X has the following pdf
$$f(x) = \begin{cases} a + bx, & 0 \le x \le 5 \\ 0, & \text{otherwise} \end{cases}$$
where a and b are constants. Show that $10a + 25b = 2$.
a) Given $E(X) = \frac{35}{12}$, find a second equation in a and b and hence find the values of a and b.
b) Find the median of X.
22 The queuing time X minutes of a customer at a till of a supermarket has a pdf
$$f(x) = \begin{cases} \frac{3}{32}x(k - x), & 0 \le x \le k \\ 0, & \text{otherwise} \end{cases}$$
a) Show that $k = 4$. Also find E(X) and Var(X).
b) Find the probability that a randomly chosen customer's queuing time will differ from the mean
by at least half a minute.
23 The probability density function f(x) can be written in the following form.
$$f(x) = \begin{cases} ax, & 0 \le x \le 2 \\ b - ax, & 2 < x \le 4 \\ 0, & \text{otherwise} \end{cases}$$
a) Find the values of the constants a and b.
b) Show that σ, the standard deviation of X, is 0.816 to 3 decimal places.
c) Find the lower quartile of X.
d) State, giving a reason, whether P(2 – σ < X < 2 + σ) is more or less than 0.5
k (1 x) , 1 x 0
24 A continuous r.v X has the pdf given by f(x) 2k (1 x), 0 x 1 , find the value of
0 , elsewhere
the constant k. Also find the mean and the variance of X
e x for x 0
25 A continuous r.v X has the pdf given by f(x) , find the mean and
0 , elsewhere
3x
standard deviation of; a) X b) Y e 4
The lower quartile $Q_1$ and the upper quartile $Q_3$ are similarly defined by
$F_X(Q_1) = 0.25$ and $F_X(Q_3) = 0.75$
Thus, the probability that X lies between $Q_1$ and $Q_3$ is 0.75 - 0.25 = 0.5, so the quartiles give
an estimate of how spread-out the distribution is. More generally, we define the nth percentile
of X to be the value $x_n$ such that $F_X(x_n) = 0.01n$ (or $\frac{n}{100}$), that is, the probability that X is
smaller than $x_n$ is n%.
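Example A random variable X has the pdf given by
$$f(x) = \begin{cases} 2x, & 0 \le x \le 1 \\ 0, & \text{elsewhere} \end{cases}$$
Find the lower, middle and upper quartiles.
Solution
On the interval $0 \le x \le 1$, the cdf of X is given by $F(x) = x^2$, thus
a) At the lower quartile $Q_1$, $F(Q_1) = Q_1^2 = 0.25 \Rightarrow Q_1 = \sqrt{0.25} = 0.5$
b) At the median $Q_2$, $F(Q_2) = Q_2^2 = 0.5 \Rightarrow Q_2 = \sqrt{0.5} \approx 0.7071$
c) At the upper quartile $Q_3$, $F(Q_3) = Q_3^2 = 0.75 \Rightarrow Q_3 = \sqrt{0.75} \approx 0.8660$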
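When the cdf cannot be inverted by hand, a percentile can be found numerically by solving F(x) = n/100. The sketch below uses bisection on the cdf F(x) = x^2 from the example above; `percentile` is a hypothetical helper and the bracketing interval [0, 1] is an assumption tied to this particular density.

```python
def cdf(x):
    """cdf for the density f(x) = 2x on [0, 1]."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    return x * x


def percentile(n, lo=0.0, hi=1.0, tol=1e-12):
    """Solve cdf(x) = n/100 by bisection on [lo, hi]."""
    target = n / 100
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


q1, median, q3 = percentile(25), percentile(50), percentile(75)
print(round(q1, 4), round(median, 4), round(q3, 4))
```

Bisection only needs the cdf to be non-decreasing, so the same routine applies to any continuous distribution once a bracketing interval is chosen.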
2. PROBABILITY DISTRIBUTION
2.1 Discrete Distribution
The discrete distributions discussed in this topic include the Bernoulli,
binomial, Poisson, geometric and hypergeometric distributions.
Example 1
A biased coin is tossed 6 times. The probability of heads on any toss is 0.3. Let X denote the
number of heads that come up. Calculate: (i) $P(X = 2)$ (ii) $P(X = 3)$ (iii) $P(1 < X \le 5)$
Solution
If we call heads a success then X has a binomial distribution with parameters n = 6 and p = 0.3.
(i) $P(X = 2) = {}^{6}C_2(0.3)^2(0.7)^4 = 0.324135$
(ii) $P(X = 3) = {}^{6}C_3(0.3)^3(0.7)^3 = 0.18522$
(iii) $P(1 < X \le 5) = P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5)$
$= 0.3241 + 0.1852 + 0.0595 + 0.0102 = 0.579$
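Binomial probabilities like these are easy to evaluate with the standard library. The sketch below assumes n = 6 and p = 0.3 as in Example 1; `binom_pmf` is an illustrative helper built on `math.comb`.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 6, 0.3
p2 = binom_pmf(2, n, p)
p3 = binom_pmf(3, n, p)

# Sum of P(X = 2), P(X = 3), P(X = 4) and P(X = 5).
p_2_to_5 = sum(binom_pmf(k, n, p) for k in range(2, 6))

print(round(p2, 6), round(p3, 5), round(p_2_to_5, 3))
```

Summing unrounded terms and rounding only at the end avoids the small drift that comes from adding values already rounded to three decimals.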
Example 2 A quality control engineer is in charge of testing whether or not 90% of the
DVD players produced by his company conform to specifications. To do this, the engineer
randomly selects a batch of 12 DVD players from each day's production. The day's
production is acceptable provided no more than 1 DVD player fails to meet specifications.
Otherwise, the entire day's production has to be tested.
a) What is the probability that the engineer incorrectly passes a day's production as
acceptable if only 80% of the day's DVD players actually conform to specification?
b) What is the probability that the engineer unnecessarily requires the entire day's
production to be tested if in fact 90% of the DVD players conform to specifications?
Solution
Let X denote the number of DVD players in the sample that fail to meet specifications.
a) In part a we want $P(X \le 1)$ with binomial parameters n = 12 and p = 0.2
$P(X \le 1) = P(X = 0) + P(X = 1) = {}^{12}C_0(0.2)^0(0.8)^{12} + {}^{12}C_1(0.2)^1(0.8)^{11}$
$= 0.069 + 0.206 = 0.275$
b) In part b we require $P(X > 1) = 1 - P(X \le 1)$ with parameters n = 12 and p = 0.1.
$P(X \le 1) = P(X = 0) + P(X = 1) = {}^{12}C_0(0.1)^0(0.9)^{12} + {}^{12}C_1(0.1)^1(0.9)^{11} = 0.659$
So $P(X > 1) = 0.341$
Example 3 Bits are sent over a communications channel in packets of 12. If the probability
of a bit being corrupted over this channel is 0.1 and such errors are independent, what is the
probability that no more than 2 bits in a packet are corrupted?
If 6 packets are sent over the channel, what is the probability that at least one packet will
contain 3 or more corrupted bits?
Let X denote the number of packets containing 3 or more corrupted bits. What is the
probability that X will exceed its mean by more than 2 standard deviations?
Solution
Let C denote the number of corrupted bits in a packet. Then in the first question, we want
$P(C \le 2) = P(C = 0) + P(C = 1) + P(C = 2)$
$= {}^{12}C_0(0.1)^0(0.9)^{12} + {}^{12}C_1(0.1)^1(0.9)^{11} + {}^{12}C_2(0.1)^2(0.9)^{10}$
$= 0.282 + 0.377 + 0.230 = 0.889.$
This implies that the probability of a packet containing 3 or more corrupted bits is
$P(C \ge 3) = 1 - P(C \le 2) = 1 - 0.889 = 0.111.$
Therefore X = 'number of packets containing 3 or more corrupted bits' can be modelled with a
binomial distribution with parameters n = 6 and p = 0.111. The probability that at least one
packet will contain 3 or more corrupted bits is:
$P(X \ge 1) = 1 - P(X = 0) = 1 - {}^{6}C_0(0.111)^0(0.889)^6 = 1 - 0.494 = 0.506.$
The mean of X is E(X) = 6(0.111) = 0.666 and its standard deviation is
$\sigma = \sqrt{6(0.111)(0.889)} = 0.77$
So the probability that X exceeds its mean by more than 2 standard deviations is
$P(X - \mu > 2\sigma) = P(X > 2.2) = P(X \ge 3)$ since X is discrete.
Now $P(X \ge 3) = 1 - P(X \le 2) = 1 - \left[P(X = 0) + P(X = 1) + P(X = 2)\right]$
$= 1 - \left[{}^{6}C_0(0.111)^0(0.889)^6 + {}^{6}C_1(0.111)^1(0.889)^5 + {}^{6}C_2(0.111)^2(0.889)^4\right]$
$= 1 - (0.4936 + 0.3698 + 0.1154) = 0.021$
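Multi-stage calculations like this one are easy to get wrong by hand, so a short script is a useful cross-check. The sketch below chains two binomial models: bits per packet (n = 12, p = 0.1) and packets per batch (n = 6). It carries the unrounded packet probability through, so results may differ in the third decimal from hand calculations that round to 0.111 first. All names are illustrative.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Stage 1: a packet is "bad" if 3 or more of its 12 bits are corrupted.
p_bad = 1 - sum(binom_pmf(k, 12, 0.1) for k in range(3))

# Stage 2: number of bad packets among 6 independent packets.
n_packets = 6
p_at_least_one = 1 - binom_pmf(0, n_packets, p_bad)

mean = n_packets * p_bad
sd = sqrt(n_packets * p_bad * (1 - p_bad))
threshold = mean + 2 * sd

# P(X > mean + 2 sd): sum the pmf over the integers above the threshold.
p_exceed = sum(
    binom_pmf(k, n_packets, p_bad) for k in range(n_packets + 1) if k > threshold
)

print(round(p_bad, 3), round(p_at_least_one, 3), round(p_exceed, 3))
```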
Exercise 2.1
1. A fair coin is tossed 10 times. What is the probability that exactly 6 heads will occur.
2. If 3% of the electric bulbs manufactured by a company are defective find the probability
that in a sample of 100 bulbs exactly 5 bulbs are defective.
3. Suppose that 10% of inmates in a large prison are known to be innocent. A non-profit
group randomly selects 20 inmates from this prison. Find the probability the group will
find at least 3 innocent inmates.
4. An oil exploration firm is formed with enough capital to finance 10 explorations. The
probability of a particular exploration being successful is 0.1. Find the mean and variance
of the number of successful explorations.
5. Emily hits 60% of her free throws in basketball games. She had 25 free throws in last
week’s game.
a) What is the expected number and the standard deviation of Emily’s hit?
b) Suppose Emily had 7 free throws in yesterday’s game, what is the probability that she
made at least 5 hits?
6. A coin is loaded so that heads has 60% chance of showing up. This coin is tossed 3 times.
a) What are the mean and the standard deviation of the number of heads that turned out?
b) What is the probability that the head turns out at least twice?
c) What is the probability that an odd number of heads turn out in 3 flips?
7. According to the 2009 current Population Survey conducted by the U.S. Census Bureau,
40% of the U.S. population 25 years old and above have completed a bachelor’s degree or
more. Given a random sample of 50 people 25 years old or above, what is expected
number of people and the standard deviation of the number of people who have
completed a bachelor’s degree.
8. Joe throws a fair die six times and face number 3 appears twice. Is he incredibly lucky, or is
this unusual?
9. If the probability of being a smoker among a group of cases with lung cancer is .6, what’s
the probability that in a group of 8 cases you have; a) less than 2 smokers? b) More than
5? c) What are the expected value and variance of the number of smokers?
10. The manufacturer of the disk drives in one of the well-known brands of microcomputers
expects 2% of the disk drives to malfunction during the microcomputer’s warranty
period. Calculate the probability that in a sample of 100 disk drives, that not more than
three will malfunction
11. A manufacturer of television sets knows that on average 5% of their product is defective.
They sell television sets in consignments of 100 and guarantee that not more than 2 sets
will be defective. What is the probability that a consignment will fail to meet the guaranteed
quality?
12. Suppose 90% of the cars on the Thika superhighway do over 17 km per litre.
a) What are the expected number and the standard deviation of the number of cars that will
do over 17 km per litre in a random sample of 15 cars?
b) What is the probability that in a random sample of 15 cars exactly 10 of these will do
over 17 km per litre?
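The formula for the Poisson probability mass function is
$$P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots$$
This is abbreviated as $X \sim Po(\lambda)$. $\lambda$ is the shape parameter which indicates the average number of
events in the given time interval. The mean and variance of this distribution are equal, i.e.
$\mu = \sigma^2 = \lambda$.
Let's check to make sure that if X has a Poisson distribution, then $\sum_{x=0}^{\infty} P(X = x) = 1$. We will
need to recall that
$$e^{\lambda} = 1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \frac{\lambda^4}{4!} + \cdots$$
Consequently
$$\sum_{x=0}^{\infty} P(X = x) = \sum_{x=0}^{\infty} \frac{\lambda^x e^{-\lambda}}{x!} = e^{-\lambda}\sum_{x=0}^{\infty} \frac{\lambda^x}{x!} = e^{-\lambda}e^{\lambda} = e^0 = 1$$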
Remark The major difference between Poisson and Binomial distributions is that the Poisson
does not have a fixed number of trials. Instead, it uses the fixed interval of time or space in which
the number of successes is recorded.
Example 1 Consider a computer system with Poisson job-arrival stream at an average of 2 per
minute. Determine the probability that in any one-minute interval there will be
a) 0 jobs; b) exactly 3 jobs; c) at most 3 arrivals. d) more than 3 arrivals
Solution
Job arrivals with $\lambda = 2$
a) No job arrivals: $P(X = 0) = e^{-2} = 0.1353353$
b) Exactly 3 job arrivals: $P(X = 3) = \frac{2^3 e^{-2}}{3!} = 0.1804470$
c) At most 3 arrivals:
$P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = \left(1 + \frac{2}{1} + \frac{2^2}{2} + \frac{2^3}{3!}\right)e^{-2} = 0.8571$
d) More than 3 arrivals: $P(X > 3) = 1 - P(X \le 3) = 1 - 0.8571 = 0.1429$
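The four probabilities in Example 1 follow from one small function. A minimal sketch, assuming a mean of 2 arrivals per minute; `poisson_pmf` is an illustrative helper.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return lam ** x * exp(-lam) / factorial(x)


lam = 2  # average number of job arrivals per minute

p0 = poisson_pmf(0, lam)
p3 = poisson_pmf(3, lam)
p_at_most_3 = sum(poisson_pmf(x, lam) for x in range(4))
p_more_than_3 = 1 - p_at_most_3

print(round(p0, 7), round(p3, 7), round(p_at_most_3, 4), round(p_more_than_3, 4))
```

The "more than 3" case uses the complement because the Poisson support is unbounded, so the upper tail cannot be summed term by term.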
Example 2 If there are 500 customers per eight-hour day in a check-out lane, what is the
probability that there will be exactly 3 in line during any five-minute period?
Solution
The expected number of customers during any one five-minute period is 500/96 = 5.2083333,
because there are 96 five-minute periods in eight hours. So you expect about 5.2
customers in 5 minutes and want to know the probability of getting exactly 3.
$$P(X = 3) = \frac{(500/96)^3 e^{-500/96}}{3!} = 0.1288 \text{ (approx)}$$
Example 3 If new cases of West Nile in New England are occurring at a rate of about 2 per
month, then what’s the probability that exactly 4 cases will occur in the next 3 months?
Solution
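$X \sim Poisson(\lambda = 2/\text{month})$
$$P(X = 4 \text{ in 3 months}) = \frac{(2 \times 3)^4 e^{-(2 \times 3)}}{4!} = \frac{6^4 e^{-6}}{4!} = 13.4\%$$
Exactly 6 cases?
$$P(X = 6 \text{ in 3 months}) = \frac{(2 \times 3)^6 e^{-(2 \times 3)}}{6!} = \frac{6^6 e^{-6}}{6!} = 16\%$$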
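Examples 2 and 3 both rescale a rate to the length of the interval of interest before applying the Poisson formula. The sketch below makes that step explicit; `poisson_over_window` is a hypothetical helper, and the rates are taken from the two examples (500 customers per 480 minutes, 2 cases per month).

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return lam ** x * exp(-lam) / factorial(x)


def poisson_over_window(x, rate, length):
    """P(exactly x events in a window), where the mean is rate * length."""
    return poisson_pmf(x, rate * length)


# Example 2: 500 customers per 480 minutes, window of 5 minutes.
p_three_in_line = poisson_over_window(3, 500 / 480, 5)

# Example 3: 2 cases per month, window of 3 months.
p_four_cases = poisson_over_window(4, 2, 3)
p_six_cases = poisson_over_window(6, 2, 3)

print(round(p_three_in_line, 4), round(p_four_cases, 3), round(p_six_cases, 3))
```

Keeping the rate and the window length as separate arguments makes the units visible, which helps avoid mixing per-minute and per-hour rates.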
Exercise 2.2
1. Calculate the Poisson probability P(X = 6) when λ (the average rate of success) is 3 and X is
the Poisson random variable.
2. Customers arrive at a checkout counter according to a Poisson distribution at an average
of 7 per hour. During a given hour, what are the probabilities that
a) No more than 3 customers arrive?
b) At least 2 customers arrive?
c) Exactly 5 customers arrive?
3. It is known from past experience that in a certain plant there are on average 4
industrial accidents per month. Find the probability that in a given year there will be fewer
than 3 accidents.
4. Suppose that the chance of an individual coal miner being killed in a mining accident
during a year is 1.1499. Use the Poisson distribution to calculate the probability that in
a mine employing 350 miners there will be at least one accident in a year.
5. The number of road construction projects that take place at any one time in a certain city
follows a Poisson distribution with a mean of 3. Find the probability that exactly five road
construction projects are currently taking place in this city. (0.100819)
6. The number of road construction projects that take place at any one time in a certain city
follows a Poisson distribution with a mean of 7. Find the probability that more than four
road construction projects are currently taking place in the city. (0.827008)
7. The number of traffic accidents that occur on a particular stretch of road during a month
follows a Poisson distribution with a mean of 7.6. Find the probability that less than three
accidents will occur next month on this stretch of road. (0.018757)
8. The number of traffic accidents that occur on a particular stretch of road during a month
follows a Poisson distribution with a mean of 7. Find the probability of observing exactly
three accidents on this stretch of road next month. (0.052129)
9. The number of traffic accidents that occur on a particular stretch of road during a month
follows a Poisson distribution with a mean of 6.8. Find the probability that the next two
months will both result in four accidents each occurring on this stretch of road. (0.00985)
10. Suppose the number of babies born during an 8-hour shift at a hospital's maternity wing
follows a Poisson distribution with a mean of 6 an hour. Find the probability that five
babies are born during a particular 1-hour period in this maternity wing. (0.160623)
11. The average number of claims per day made to the Insurance Company for damage or
losses is 3.1. What is the probability that in any given day; (i)exactly 2 (ii) at most 2
(iii) more than 2 claims will be made?
12. The university policy department must write, on average, five tickets per day to keep
department revenues at budgeted levels. Suppose the number of tickets written per day
follows a Poisson distribution with a mean of 8.8 tickets per day. Find the probability that
less than six tickets are written on a randomly selected day from this distribution.
(0.128387)
13. A taxi firm has two cars which it hires out day by day. The number of demands for a car
on each day follows a Poisson distribution with mean 1.5. Calculate the proportion
of days on which neither car is used and the proportion of days on which some demand
is refused.
14. If calls to your cell phone are a Poisson process with a constant rate λ = 0.5 calls per hour,
what is the probability that, if you forget to turn your phone off in a 3-hour lecture, your
phone rings during that time? How many phone calls do you expect to get during this
lecture?
15. The average number of defects per wafer (defect density) is 3. The redundancy built into
the design allows for up to 4 defects per wafer. What is the probability that the
redundancy will not be sufficient if the defects follow a Poisson distribution?
16. The mean number of errors due to a particular bug occurring in a minute is 0.0001
a) What is the probability that no error will occur in 20 minutes?
b) How long would the program need to run to ensure that there will be a 99.95% chance
that an error will show up to highlight this bug?
Properties of Poisson
The mean and variance are both equal to λ.
The sum of independent Poisson variables is a further Poisson variable with mean
equal to the sum of the individual means.
As well as cropping up in the situations already mentioned, the Poisson distribution
provides an approximation for the Binomial distribution.
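The Poisson calculations in the exercises above, and the approximation to the binomial, can be checked numerically. The following is a minimal sketch using only the standard library; the helper names `poisson_pmf` and `poisson_cdf` and the values n = 1000, p = 0.003 are illustrative assumptions, not part of the notes.

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

# Mean 7, probability of more than four events: 1 - P(X <= 4).
p_more_than_4 = 1 - poisson_cdf(4, 7)

# Poisson approximation to the binomial with lam = n * p
# (n and p below are illustrative values only).
n, p = 1000, 0.003
binom_p2 = comb(n, 2) * p**2 * (1 - p) ** (n - 2)
pois_p2 = poisson_pmf(2, n * p)

print(round(p_more_than_4, 6))
print(binom_p2, pois_p2)
```

The two values on the last line should agree closely because n is large and p is small.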
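Recall that the sum to infinity of a convergent G.P. is $s_\infty = \frac{a}{1-r}$.
The cdf of a geometric distribution is given by
$F(y) = P(Y \le y) = P(Y=1) + P(Y=2) + P(Y=3) + \ldots + P(Y=y)$
$= p + pq + pq^2 + \ldots + pq^{y-1} = \frac{p(1-q^y)}{1-q} = 1 - q^y$
Let $Y \sim \text{Geo}(p)$; then $E(Y) = \frac{1}{p}$ and $Var(Y) = \frac{q}{p^2} = \frac{1-p}{p^2}$. Show?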
Solution
Let X be the random variable 'the number of shots required to realize the 1st hit'.
$X \sim \text{Geo}(0.7)$ and $P(X = x) = 0.7(1-0.7)^{x-1} = 0.7(0.3)^{x-1}$, $x = 1, 2, 3, \ldots$
b) $\mu = \frac{1}{p} = \frac{1}{0.7} = 1.428571$ and $\sigma = \frac{\sqrt{1-p}}{p} = \frac{\sqrt{1-0.7}}{0.7} = 0.78$
Example 2
The State Department is trying to identify an individual who speaks Farsi to fill a foreign
embassy position. They have determined that 4% of the applicant pool are fluent in Farsi.
a) If applicants are contacted randomly, how many individuals can they expect to
interview in order to find one who is fluent in Farsi?
b) What is the probability that they will have to interview more than 25 until they find
one who speaks Farsi?
Solution
a) $\mu = \frac{1}{p} = \frac{1}{0.04} = 25$
b) $P(X \le 25) = 1 - q^{25} = 1 - 0.96^{25} = 0.6396 \Rightarrow P(X > 25) = 1 - P(X \le 25) = 0.3604$
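The geometric formulas used in this example can be checked with a short sketch. The function names `geom_pmf` and `geom_cdf` are assumptions made for illustration; the formulas are the pmf $pq^{x-1}$ and cdf $1 - q^x$ for the number of trials up to and including the first success.

```python
def geom_pmf(x, p):
    """P(X = x): first success occurs on trial x (x = 1, 2, 3, ...)."""
    return p * (1 - p) ** (x - 1)

def geom_cdf(x, p):
    """P(X <= x) = 1 - q**x."""
    return 1 - (1 - p) ** x

p = 0.04                                # proportion fluent in Farsi
expected_interviews = 1 / p             # E(X) = 1/p
p_more_than_25 = 1 - geom_cdf(25, p)    # P(X > 25) = q**25

# The cdf should equal the running sum of the pmf.
cdf_by_sum = sum(geom_pmf(x, p) for x in range(1, 26))

print(expected_interviews, round(p_more_than_25, 4), round(cdf_by_sum, 4))
```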
Example 3
From past experience it is known that 3% of accounts in a large accounting population are in
error. What is the probability that 5 accounts are audited before an account in error is found?
What is the probability that the first account in error occurs in the first five accounts audited?
Solution
$P(Y > 5) = 1 - F(5) = 1 - (1 - 0.97^5) = 0.97^5 = 0.8587$ and $P(Y \le 5) = 1 - 0.97^5 = 0.1413$
Exercise 2.3
1. Over a very long period of time, it has been noted that on Fridays 25% of the customers
at the drive-in window at the bank make deposits. What is the probability that it takes 4
customers at the drive-in window before the first one makes a deposit?
2. It is estimated that 45% of people in Fast-Food restaurants order a diet drink with their
lunch. Find the probability that the fourth person orders a diet drink. Also find the
probability that the first diet drinker of the day occurs before the 5th person.
3. What is the probability of rolling a sum of seven in fewer than three rolls of a pair of
dice? Hint (The random variable, X, is the number of rolls before a sum of 7.)
4. In New York City at rush hour, the chance that a taxicab passes someone and is
available is 15%. a) How many cabs can you expect to pass you for you to find one that
is free and b) what is the probability that more than 10 cabs pass you before you find
one that is free.
5. An urn contains N white and M black balls. Balls are randomly selected, one at a time,
until a black ball is obtained. If we assume that each selected ball is replaced before the
next one is drawn, what is;
a) the probability that exactly n draws are needed?
b) the probability that at least k draws are needed?
c) the expected value and Variance of the number of balls drawn?
6. In a gambling game a player tosses a coin until a head appears. He then receives $2n ,
where n is the number of tosses.
a) What is the probability that the player receives $8.00 in one play of the game?
b) If the player must pay $5.00 to play, what is the win/loss per game?
7. An oil prospector will drill a succession of holes in a given area to find a productive
well. The probability of success is 0.2.
a) What is the probability that the 3rd hole drilled is the first to yield a productive well?
b) If the prospector can afford to drill at most 10 wells, what is the probability that he will
fail to find a productive well?
8. A well-travelled highway has its traffic lights green for 82% of the time. If a person
travelling the road goes through 8 traffic intersections, complete the chart to find a) the
probability that the first red light occurs on the nth traffic light and b) the cumulative
probability that the person will hit the red light on or before the nth traffic light.
Example 1
Boxes contain 200 items of which 10% are defective. Find the probability that no more than
2 defectives will be obtained in a sample of size 10 drawn without replacement.
Solution
Let Y be the number of defectives
$P(Y \le 2) = P(Y=0) + P(Y=1) + P(Y=2) = \frac{{}^{180}C_{10}}{{}^{200}C_{10}} + \frac{{}^{20}C_1\,{}^{180}C_9}{{}^{200}C_{10}} + \frac{{}^{20}C_2\,{}^{180}C_8}{{}^{200}C_{10}}$
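The expression above is left unevaluated, so a numerical check may help. This is a sketch assuming a lot of 200 items with 20 defectives, as used in the solution; the function name `hypergeom_pmf` and its argument order are illustrative choices.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(Y = k): k defectives in a sample of n drawn without replacement
    from N items of which K are defective."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

N, K, n = 200, 20, 10
p_at_most_2 = sum(hypergeom_pmf(k, N, K, n) for k in range(3))

# All probabilities over k = 0..n should add up to 1.
total = sum(hypergeom_pmf(k, N, K, n) for k in range(n + 1))

print(p_at_most_2, total)
```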
Example 2
How many ways can 3 men and 4 women be selected from a group of 7 men and 10 women?
Solution
The number of ways is ${}^{7}C_3 \times {}^{10}C_4 = 35 \times 210 = 7350$.
If 7 people are chosen at random from the 17, the probability of getting 3 men and 4 women is
$\frac{{}^{7}C_3 \times {}^{10}C_4}{{}^{17}C_7} = \frac{7350}{19448} = 0.3779$ (approx)
Note that the numbers in the numerator add up to the numbers used in the combination
in the denominator (7 + 10 = 17 and 3 + 4 = 7).
This can be extended to more than two groups and called an extended hypergeometric
problem.
Exercise 2.4
1. A bottle contains 4 laxative and 5 aspirin tablets. 3 tablets are drawn at random from the
bottle. Find the probability that; a) exactly one, b) at most 1 c) at least 2 are laxative
tablets.
2. What is the probability of getting at most 2 diamonds in 5 cards selected without
replacement from a well shuffled deck?
3. A messenger has to deliver 10 out of 16 letters to the computing department and the rest to
the statistics department. She mixed up the letters and delivered 10 letters at random to
the computing department. What is the probability that only 6 letters for the computing
department actually got there?
4. In a class there are 20 students. 6 are compulsive smokers and they always keep
cigarettes in their lockers. One day prefects checked at random on 10 lockers. What is the
probability that they find cigarettes in at most 2 lockers?
5. A box holds 8 green, 4 white and 8 red beads. 6 beads are drawn at random without
replacement from the box. What is the probability that 3 red, 2 green and 1 white beads
are drawn?
For $X \sim U(a, b)$: $\mu = \frac{a+b}{2}$ and $\sigma^2 = \frac{(b-a)^2}{12}$
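The cdf F(x) is given by $F(x) = \int_a^x \frac{1}{b-a}\,dt = \frac{x-a}{b-a}$ for $a \le x \le b$, so that
$F(x) = \begin{cases} 0, & x < a \\ \frac{x-a}{b-a}, & a \le x \le b \\ 1, & x > b \end{cases}$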
Example: Prof Hinga always travels by plane. From past experience he feels that
take-off time is uniformly distributed between 80 and 120 minutes after check-in.
Determine the probability that: a) he waits for more than 105 minutes for take-off after
check-in; b) the waiting time will be within 1.5 standard deviations of the mean.
Solution
$X \sim U(80, 120) \Rightarrow f(x) = \frac{1}{40}$ for $80 \le x \le 120$ and $f(x) = 0$ elsewhere
a) $P(X > 105) = 1 - P(X \le 105) = 1 - \frac{105-80}{40} = \frac{3}{8}$
b) $P(\mu - 1.5\sigma < X < \mu + 1.5\sigma) = \int_{\mu-1.5\sigma}^{\mu+1.5\sigma} \frac{1}{40}\,dx = \left[\frac{x}{40}\right]_{\mu-1.5\sigma}^{\mu+1.5\sigma} = \frac{3\sigma}{40}$
But $\sigma = \frac{b-a}{\sqrt{12}} = \frac{40}{\sqrt{12}}$, so
$P(\mu - 1.5\sigma < X < \mu + 1.5\sigma) = \frac{3}{40} \times \frac{40}{\sqrt{12}} = \frac{3}{\sqrt{12}} \approx 0.866$
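The uniform calculations for U(80, 120) can be reproduced with the cdf $F(x) = \frac{x-a}{b-a}$. This is a minimal sketch; the function name `uniform_cdf` is an assumption for illustration, and part a) is read as the probability that take-off is more than 105 minutes after check-in, which is what the solution computes.

```python
from math import sqrt

def uniform_cdf(x, a, b):
    """F(x) for X ~ U(a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

a, b = 80.0, 120.0
mu = (a + b) / 2
sigma = (b - a) / sqrt(12)

p_more_than_105 = 1 - uniform_cdf(105, a, b)
p_within = uniform_cdf(mu + 1.5 * sigma, a, b) - uniform_cdf(mu - 1.5 * sigma, a, b)

print(p_more_than_105, round(p_within, 4))
```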
Exercise 2.5
1. Uniform: The amount of time, in minutes, that a person must wait for a bus is
uniformly distributed between 0 and 15 minutes, inclusive. What is the probability
that a person waits fewer than 12.5 minutes? What is the probability that the waiting
time will be within 0.5 standard deviations of the mean?
2. The current (in mA) measured in a piece of copper wire is known to follow a
uniform distribution over the interval [0, 25]. Write down the formula for the
probability density function f(x) of the random variable X representing the current.
Calculate the mean and variance of the distribution and find the cumulative
distribution function F(x)
3. Slater customers are charged for the amount of salad they take. Sampling
suggests that the amount of salad taken is uniformly distributed between 5 ounces
and 15 ounces. Let x = salad plate filling weight, find the expected Value and the
Variance of x. What is the probability that a customer will take between 12 and 15
ounces of salad?
4. The thickness x of a protective coating applied to a conductor designed to work in
corrosive conditions follows a uniform distribution over the interval [20, 40]
microns. Find the mean, standard deviation and cumulative distribution function
of the thickness of the protective coating. Find also the probability that the coating
is less than 35 microns thick.
5. The average number of donuts a nine-year old child eats per month is uniformly
distributed from 0.5 to 4 donuts, inclusive. Determine the probability that a
randomly selected nine-year old child eats an average of;
a) more than two donuts
b) more than two donuts given that his or her amount is more than 1.5 donuts.
6. Starting at 5 pm every half hour there is a flight from Nairobi to Mombasa.
Suppose that none of these plane tickets are completely sold out and they always
have room for passengers. A person who wants to fly to Mombasa arrives at the
airport at a random time between 8.45 AM and 9.45 AM. Determine the
probability that he waits for
a) At most 10 minutes b) At least 15 minutes
exponential random variable occur in the following way. There are fewer large values
and more small values. For example, the amount of money customers spend in one
trip to the supermarket follows an exponential distribution. There are more people that
spend less money and fewer people that spend large amounts of money.
The exponential distribution is widely used in the field of reliability. Reliability deals
with the amount of time a product lasts.
In brief, this distribution is commonly used to model waiting times
between occurrences of rare events and lifetimes of electrical or mechanical devices.
d) $\mu = \frac{1}{0.01} = 100$, so $P(T > 100) = \int_{100}^{\infty} 0.01e^{-0.01t}\,dt = \left[-e^{-0.01t}\right]_{100}^{\infty} = e^{-1} = 0.3679$
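The tail probability in part d) follows from $P(T > t) = e^{-\lambda t}$. The sketch below checks it with rate 0.01 and also illustrates the standard memoryless property $P(T > s + t \mid T > s) = P(T > t)$, which is useful for conditional questions such as Exercise 2.6 question 2(b). The function name `exp_sf` and the value s = 50 are illustrative assumptions.

```python
from math import exp

def exp_sf(t, lam):
    """Survival function P(T > t) for T ~ Exp(rate lam)."""
    return exp(-lam * t)

lam = 0.01
p_gt_100 = exp_sf(100, lam)      # P(T > 100) = e^{-1}

# Memoryless property: having already waited s units does not change
# the distribution of the remaining waiting time.
s, t = 50, 100
p_conditional = exp_sf(s + t, lam) / exp_sf(s, lam)

print(round(p_gt_100, 4), round(p_conditional, 4))
```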
Exercise 2.6:
1. Jobs are sent to a printer at an average of 3 jobs per hour.
a) What is the expected time between jobs?
b) What is the probability that the next job is sent within 5 minutes?
2. The time required to repair a machine is an exponential random variable
with rate λ= 0.5 downs/hour
a) what is the probability that a repair time exceeds 2 hours?
b) what is the probability that the repair time will take at least 4 hours
given that the repair man has been working on the machine for 3
hours?
3. Buses arrive at a bus stop according to an exponential distribution with
rate λ = 4 buses/hour. If you arrived at 8:00 am at the bus stop,
a) what is the expected time of the next bus?
b) Assume you asked one of the people waiting for the bus about the
arrival time of the last bus and he told you that the last bus left at
7:40 am. What is the expected time of the next bus?
Proof
$\Gamma(\alpha) = \int_0^\infty x^{\alpha-1}e^{-x}\,dx = \left[-x^{\alpha-1}e^{-x}\right]_0^\infty + (\alpha-1)\int_0^\infty x^{\alpha-2}e^{-x}\,dx = (\alpha-1)\Gamma(\alpha-1)$
For $\alpha = 1$, $\Gamma(1) = \int_0^\infty e^{-x}\,dx = \left[-e^{-x}\right]_0^\infty = 1$. Now suppose the result holds for $\alpha = k$, i.e.
$\Gamma(k) = (k-1)!$ for $k \in Z^+$; then $\Gamma(k+1) = k\Gamma(k) = k(k-1)! = k!$. Thus if the result holds for
$\alpha = k$ then it must also hold for $\alpha = k+1$. But the result is true for $\alpha = 1$.
Therefore the result is true for any $\alpha \in Z^+$.
From $\Gamma(\alpha) = \int_0^\infty x^{\alpha-1}e^{-x}\,dx$, put $x = u^2 \Rightarrow dx = 2u\,du$, so that
$\Gamma(\alpha) = \int_0^\infty u^{2\alpha-2}e^{-u^2}\,2u\,du = 2\int_0^\infty u^{2\alpha-1}e^{-u^2}\,du$
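The recursion $\Gamma(\alpha) = (\alpha-1)\Gamma(\alpha-1)$ and the factorial result $\Gamma(k) = (k-1)!$ can be spot-checked with the standard library's `math.gamma`. This is only a numerical illustration for a few arbitrarily chosen values, not a proof.

```python
from math import factorial, gamma, isclose

# Recursion Gamma(a) = (a - 1) * Gamma(a - 1) for a few non-integer
# and integer arguments (values chosen arbitrarily for illustration).
recursion_ok = all(
    isclose(gamma(a), (a - 1) * gamma(a - 1)) for a in (2.5, 4.0, 7.3)
)

# Gamma(k) = (k - 1)! for positive integers k.
factorial_ok = all(isclose(gamma(k), factorial(k - 1)) for k in range(1, 11))

print(recursion_ok, factorial_ok)
```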
Gamma Distribution
The Gamma(α, β) distribution models the time required for α events to occur, given that
the events occur randomly in a Poisson process with a mean time between events of β. For
example, an insurance company observes that large commercial fire claims occur randomly
in time with a mean of 0.7 years between claims. Beyond such everyday applications, the Gamma
distribution is also widely used in many scientific areas, like Reliability Assessment, Queuing
Theory, Computer Evaluations, or biological studies. In a nutshell, this distribution is used
to model the total waiting time of a procedure that consists of α independent stages, each stage
with a waiting time having an exponential distribution with mean β. Then the total time has a Gamma
distribution with parameters α and β.
Some examples of gamma distributions are plotted below. Notice that the modes shift to the
right as the ratio of increases.
Remarks
a) If β = 1 then we have the standard gamma distribution.
$P(X > 4) = 1 - P(X \le 4) = 1 - \int_0^4 xe^{-x}\,dx = 1 - \left[-(x+1)e^{-x}\right]_0^4 = 1 - \left(1 - 5e^{-4}\right) = 5e^{-4} = 0.09158$
Example 2 Suppose the survival time X in weeks of a randomly selected male mouse
exposed to 240 rads of gamma radiation has a gamma distribution with α = 8 and β = 15.
a) Find the expected value and the standard deviation of the survival time.
b) What is the probability that a mouse survives (i) between 60 and 120 weeks (ii) at least 30
weeks?
Solution
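$X \sim \text{Gamma}(8, 15) \Rightarrow f(x) = \frac{1}{15^8\,\Gamma(8)}\,x^7 e^{-x/15}$, $x > 0$, and $f(x) = 0$ otherwise
a) $E(X) = \alpha\beta = (8)(15) = 120$ weeks and $\sigma = \sqrt{\alpha\beta^2} = \sqrt{8(15)^2} = 30\sqrt{2} = 42.4264$ weeks
b) (i) Putting $y = x/15$,
$P(60 \le X \le 120) = \int_{60}^{120} \frac{1}{15^8\,\Gamma(8)}\,x^7 e^{-x/15}\,dx = \frac{1}{\Gamma(8)}\int_4^8 y^7 e^{-y}\,dy$
$= \left[-\frac{e^{-y}}{7!}\left(y^7 + 7y^6 + 42y^5 + 210y^4 + 840y^3 + 2520y^2 + 5040y + 5040\right)\right]_4^8$
$= \frac{261104e^{-4} - 6805296e^{-8}}{7!} = 0.4959$
(ii) $P(X \ge 30) = \int_{30}^{\infty} \frac{1}{15^8\,\Gamma(8)}\,x^7 e^{-x/15}\,dx = \frac{1}{7!}\int_2^\infty y^7 e^{-y}\,dy$
$= \left[-\frac{e^{-y}}{7!}\left(y^7 + 7y^6 + 42y^5 + 210y^4 + 840y^3 + 2520y^2 + 5040y + 5040\right)\right]_2^\infty$
$= \frac{37200e^{-2} - 0}{7!} = 0.9989$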
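For integer α the repeated integration by parts above collapses to a finite Poisson-type sum: $P(X > x) = \sum_{k=0}^{\alpha-1} e^{-y}y^k/k!$ with $y = x/\beta$. The sketch below uses that identity to check the mouse example; the function name `erlang_sf` is an assumption for illustration, and it is valid only when α is a positive integer and β is the scale parameter (so that E(X) = αβ).

```python
from math import exp, factorial, sqrt

def erlang_sf(x, alpha, beta):
    """P(X > x) for X ~ Gamma(alpha, beta) with integer alpha and scale beta."""
    y = x / beta
    return sum(exp(-y) * y**k / factorial(k) for k in range(alpha))

alpha, beta = 8, 15
mean = alpha * beta                 # E(X) = alpha * beta
sd = sqrt(alpha) * beta             # sqrt(alpha * beta**2)

p_60_120 = erlang_sf(60, alpha, beta) - erlang_sf(120, alpha, beta)
p_ge_30 = erlang_sf(30, alpha, beta)

print(mean, round(sd, 4), round(p_60_120, 4), round(p_ge_30, 4))
```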
Question The time between failures of a laser machine is exponentially distributed with a
mean of 25,000 hours. What is the expected time until the second failure? What is the
probability that the time until the third failure exceeds 50,000 hours?
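Theorem:
i) $B(\alpha, \beta) = B(\beta, \alpha)$, i.e. by putting $y = 1 - x$ in the function
ii) $B(\alpha, \beta) = \int_0^\infty \frac{u^{\alpha-1}}{(1+u)^{\alpha+\beta}}\,du$, i.e. by putting $u = \frac{x}{1-x}$ or $x = \frac{u}{1+u}$ in the function
iii) $B(\alpha, \beta) = 2\int_0^{\pi/2} \sin^{2\alpha-1}t\,\cos^{2\beta-1}t\,dt$
iv) $B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$
v) $\Gamma(\frac12) = \sqrt{\pi}$: use $B(\frac12, \frac12) = \frac{\Gamma(\frac12)\Gamma(\frac12)}{\Gamma(1)} = \left[\Gamma(\frac12)\right]^2$, but
$B(\frac12, \frac12) = 2\int_0^{\pi/2} \sin^{2(\frac12)-1}t\,\cos^{2(\frac12)-1}t\,dt = 2\int_0^{\pi/2} dt = \pi \Rightarrow \Gamma(\frac12) = \sqrt{\pi}$
Note $\Gamma(\frac12) = \int_0^\infty x^{-\frac12}e^{-x}\,dx = 2\int_0^\infty e^{-u^2}\,du = \sqrt{\pi}$. Simply $\int_0^\infty e^{-x^2}\,dx = \frac12\sqrt{\pi}$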
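The identity $B(\alpha,\beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha+\beta)$ and the value $\Gamma(\frac12) = \sqrt{\pi}$ can be checked numerically. The sketch below compares the gamma-function formula with a simple midpoint-rule estimate of $\int_0^1 x^{\alpha-1}(1-x)^{\beta-1}dx$; the function names and the test values α = 2, β = 3 are illustrative assumptions.

```python
from math import gamma, pi

def beta_fn(a, b):
    """B(a, b) via the gamma-function identity."""
    return gamma(a) * gamma(b) / gamma(a + b)

def beta_numeric(a, b, n=10_000):
    """Midpoint-rule estimate of the beta integral on (0, 1)."""
    h = 1.0 / n
    return h * sum(
        ((i + 0.5) * h) ** (a - 1) * (1 - (i + 0.5) * h) ** (b - 1)
        for i in range(n)
    )

b_exact = beta_fn(2, 3)          # equals 1/12
b_approx = beta_numeric(2, 3)
gamma_half_squared = gamma(0.5) ** 2   # should equal pi

print(b_exact, b_approx, gamma_half_squared)
```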
Beta Distribution
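Definition: A random variable X is said to have a standard beta distribution with parameters
α and β if its probability density function is given by
$f(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}$, $0 < x < 1$, and $f(x) = 0$ elsewhere. We denote this as $X \sim \text{Beta}(\alpha, \beta)$
Theorem: If X has a standard beta distribution with parameters α and β, then
$E(X) = \frac{\alpha}{\alpha+\beta}$ and $Var(X) = \frac{\alpha\beta}{(\alpha+\beta+1)(\alpha+\beta)^2}$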
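Theorem: If $M(t)$ exists, then for any $k \in N$, $\left.\frac{d^k M(t)}{dt^k}\right|_{t=0} = \mu'_k = E(X^k)$
Proof
$M(t) = 1 + t\mu'_1 + \frac{t^2}{2!}\mu'_2 + \frac{t^3}{3!}\mu'_3 + \ldots$
$M'(t) = \mu'_1 + \frac{2t}{2!}\mu'_2 + \frac{3t^2}{3!}\mu'_3 + \ldots \Rightarrow M'(0) = \mu'_1 = E(X)$
$M''(t) = \mu'_2 + \frac{2t}{2!}\mu'_3 + \frac{3t^2}{3!}\mu'_4 + \ldots \Rightarrow M''(0) = \mu'_2 = E(X^2)$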
Remark: The mgf of a particular distribution is unique and we can recognize the pdf if we
are given the mgf.
Example 1
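The mgf of a r.v Y is given by $M(t) = \frac16 e^t + \frac13 e^{2t} + \frac12 e^{3t}$. Find the mean and variance of Y.
Solution
$E(Y) = M'(0) = \left[\frac16 e^t + \frac23 e^{2t} + \frac32 e^{3t}\right]_{t=0} = \frac16 + \frac23 + \frac32 = \frac73$
$E(Y^2) = M''(0) = \left[\frac16 e^t + \frac43 e^{2t} + \frac92 e^{3t}\right]_{t=0} = \frac16 + \frac43 + \frac92 = 6$, so $Var(Y) = 6 - \left(\frac73\right)^2 = \frac59$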
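$M(t) = E\left(e^{tX}\right) = \sum_{x=0}^{\infty} e^{tx}\,\frac{\lambda^x e^{-\lambda}}{x!} = e^{-\lambda}\sum_{x=0}^{\infty}\frac{\left(\lambda e^t\right)^x}{x!} = e^{-\lambda}e^{\lambda e^t} = e^{\lambda\left(e^t - 1\right)}$
Now $E(X) = M'(0) = \left[\lambda e^t e^{\lambda\left(e^t-1\right)}\right]_{t=0} = \lambda$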
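$M(t) = \sum_{x=0}^{\infty} e^{tx}\,\frac16\left(\frac56\right)^x = \frac16\sum_{x=0}^{\infty}\left(\frac56 e^t\right)^x = \frac{1}{6 - 5e^t}$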
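$M'(t) = 5e^t\left(6 - 5e^t\right)^{-2}$ and $M''(t) = 5e^t\left(6 - 5e^t\right)^{-2} + 50e^{2t}\left(6 - 5e^t\right)^{-3}$
$E(X) = \left[5e^t\left(6 - 5e^t\right)^{-2}\right]_{t=0} = 5$ and $E(X^2) = \left[5e^t\left(6 - 5e^t\right)^{-2} + 50e^{2t}\left(6 - 5e^t\right)^{-3}\right]_{t=0} = 55$
$Var(X) = E(X^2) - \mu^2 = 55 - 5^2 = 30$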
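Moments obtained by differentiating an mgf can be sanity-checked numerically by approximating $M'(0)$ and $M''(0)$ with central finite differences. The sketch below applies this to the mgf $\frac16 e^t + \frac13 e^{2t} + \frac12 e^{3t}$ from Example 1 and to $M(t) = 1/(6 - 5e^t)$, the mgf whose derivatives are worked out above. The helper name `moments_from_mgf` and the step size h are assumptions; finite differences give approximations, not exact values.

```python
from math import exp

def moments_from_mgf(M, h=1e-4):
    """Approximate (mean, variance) from an mgf M using central differences at t = 0."""
    m1 = (M(h) - M(-h)) / (2 * h)              # approximates M'(0) = E(X)
    m2 = (M(h) - 2 * M(0.0) + M(-h)) / h**2    # approximates M''(0) = E(X^2)
    return m1, m2 - m1**2

mean_y, var_y = moments_from_mgf(
    lambda t: exp(t) / 6 + exp(2 * t) / 3 + exp(3 * t) / 2
)
mean_x, var_x = moments_from_mgf(lambda t: 1 / (6 - 5 * exp(t)))

print(round(mean_y, 3), round(var_y, 3))
print(round(mean_x, 3), round(var_x, 3))
```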
Exercise 3.1
1) The mgf of a r.v Y is given by; a) $M(t) = e^{2t + 3t^2}$ b) $M(t) = \exp\left(\frac12\sigma^2t^2 + \mu t\right)$ Find the
4 NORMAL DISTRIBUTION
4.1 Introduction
The normal, or Gaussian, distribution is one of the most important distributions in probability
theory. It is widely used in statistical inference. One reason for this is that sums of random
variables often approximately follow a normal distribution.
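Definition A r.v X has a normal distribution with parameters μ and σ², abbreviated
$X \sim N(\mu, \sigma^2)$, if it has probability density function
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac12\left(\frac{x-\mu}{\sigma}\right)^2\right\}$ for $-\infty < x < \infty$ and $\sigma > 0$
where μ is the mean and σ is the standard deviation.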
4.1.1 Properties of normal distribution
1) The normal distribution curve is bell-shaped and symmetric, about the mean
2) The curve is asymptotic to the horizontal axis at the extremes.
3) The highest point on the normal curve is at the mean, which is also the median and mode.
4) The mean can be any numerical value: negative, zero, or positive
5) The standard deviation determines the width of the curve: larger values result in wider,
flatter curves
6) Probabilities for the normal random variable are given by areas under the curve. The total
area under the curve is 1 (0.5 to the left of the mean and 0.5 to the right).
7) It has inflection points at $\mu - \sigma$ and $\mu + \sigma$.
8) Empirical Rule:
a) 68.26% of values of a normal random variable are within 1 standard deviation of its
mean, i.e. $P(\mu - \sigma \le X \le \mu + \sigma) = 0.6826$
b) 95.44% of values of a normal random variable are within 2 standard deviations of its
mean, i.e. $P(\mu - 2\sigma \le X \le \mu + 2\sigma) = 0.9544$
c) 99.73% of values of a normal random variable are within 3 standard deviations of its
mean, i.e. $P(\mu - 3\sigma \le X \le \mu + 3\sigma) = 0.9973$
$\phi(z) = \frac{1}{\sqrt{2\pi}}\exp\left\{-\frac12 z^2\right\}$ for $-\infty < z < \infty$
The cumulative distribution function of a standard normal random variable is denoted $\Phi(z)$,
and is given by
$\Phi(z) = \int_{-\infty}^{z}\phi(t)\,dt$
Example 1
Given $Z \sim N(0, 1)$, find;
a) $P(Z \le z)$ if z = 1.65, -1.65, 1.0, -1.0
b) $P(Z \ge z)$ for z = 1.02, -1.65
c) $P(0.365 \le Z \le 1.75)$
d) $P(-0.696 \le Z \le 1.865)$
e) $P(-2.345 \le Z \le 1.65)$
f) $P(|Z| \le 1.43)$
Solution
a) Look up and report the value for $\Phi(z)$ from the standard normal probabilities table:
$P(Z \le 1.65) = \Phi(1.65) = 0.9505$, $\Phi(-1.65) = 0.0495$, $\Phi(1.0) = 0.8413$, $\Phi(-1.0) = 0.1587$
b) $P(Z \ge z) = 1 - \Phi(z) = \Phi(-z)$. Thus $P(Z \ge 1.02) = \Phi(-1.02) = 0.1539$ and $P(Z \ge -1.65) = \Phi(1.65) = 0.9505$
c) $P(0.365 \le Z \le 1.75) = \Phi(1.75) - \Phi(0.365) = 0.9599 - 0.6424 = 0.3175$
d) $P(-0.696 \le Z \le 1.865) = \Phi(1.865) - \Phi(-0.696) = 0.9689 - 0.2432 = 0.7257$
e) $P(-2.345 \le Z \le 1.65) = \Phi(1.65) - \Phi(-2.345) = 0.9505 - 0.0095 = 0.9410$
f) $P(|Z| \le 1.43) = P(-1.43 \le Z \le 1.43) = 2\Phi(1.43) - 1 = 2(0.9236) - 1 = 0.8472$
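Table look-ups like those in Example 1 can be cross-checked by computing $\Phi(z)$ directly from the error function, using the standard identity $\Phi(z) = \frac12\left[1 + \text{erf}(z/\sqrt2)\right]$. This is a minimal sketch; the function name `Phi` is an assumption, and small differences from table values are expected because tables are rounded to four decimals.

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal cdf computed from the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

part_a = [Phi(z) for z in (1.65, -1.65, 1.0, -1.0)]
part_b = [1 - Phi(z) for z in (1.02, -1.65)]
part_c = Phi(1.75) - Phi(0.365)
part_d = Phi(1.865) - Phi(-0.696)
part_e = Phi(1.65) - Phi(-2.345)
part_f = 2 * Phi(1.43) - 1

for name, value in [("c", part_c), ("d", part_d), ("e", part_e), ("f", part_f)]:
    print(name, round(value, 4))
```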
Exercise 4.1
1. Given $Z \sim N(0, 1)$, find;
a) $P(Z \le z)$ if z = 1.95, -1.89, 1.074, -1.53
b) $P(Z \ge z)$ for z = 1.72, -1.15
c) $P(0 \le Z \le 1.05)$
d) $P(-1.396 \le Z \le 1.125)$
e) $P(-1.96 \le Z \le 1.65)$
f) $P(|Z| \le 2.33)$
2. If $Z \sim N(0, 1)$, find the value of a (or t) for which;
a) $P(Z \le a)$ = 0.973, 0.6693, 0.4634
b) $P(Z \ge a)$ = 0.3719, 0.9545, 0.7546
c) $P(-1.21 \le Z \le t) = 0.6965$
d) $P(|Z| \le t)$ = 0.9544, 0.9905, 0.3750
$Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$. It is also easily shown that the cumulative distribution function satisfies
$F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$
and so the cumulative probabilities for any normal random variable can be calculated using
the tables for the standard normal distribution.
Let $X \sim N(\mu, \sigma^2)$; then $P(a \le X \le b) = P\left(\frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma}\right) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)$
where $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$
Example 3
The top 5% of applicants (as measured by GRE scores) will receive scholarships. If
GRE ~ N(500,100 2 ) , how high does your GRE score have to be to qualify for a scholarship?
Solution
Let X = GRE. We want to find x such that $P(X \ge x) = 0.05$. This is too hard to solve as it
stands, so instead compute $Z = \frac{X - 500}{100} \sim N(0, 1)$ and find z for the problem,
$P(Z \ge z) = 1 - \Phi(z) = 0.05 \Rightarrow \Phi(z) = 0.95 \Rightarrow z = 1.645$
To find the equivalent x, compute $X = \mu + Z\sigma \Rightarrow x = 500 + 100(1.645) = 664.5$
Thus, your GRE score needs to be 665 or higher to qualify for a scholarship.
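Percentile problems like this one invert the cdf. As a hedged sketch, Python's standard-library `statistics.NormalDist` (available from Python 3.8) provides `inv_cdf` for exactly this purpose; the variable names below are illustrative.

```python
from statistics import NormalDist

gre = NormalDist(mu=500, sigma=100)

# Top 5% means we need the 95th percentile of the score distribution.
cutoff = gre.inv_cdf(0.95)

# Check: the probability of scoring above the cutoff should be 0.05.
p_above = 1 - gre.cdf(cutoff)

print(round(cutoff, 1), round(p_above, 4))
```

The small difference from the table-based answer comes from using z = 1.645 rounded to three decimals.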
Exercise 4.2
1) Suppose X ~ N130 , 25 . Find; a) P( X 140) b) P( X 120) c) P(130 X 135)
2) The random variable X is normally distributed with mean 500 and standard deviation 100.
Find; (i) P( X 400) , (ii) P( X 620) (iii) the 90th percentile (iv) the lower and upper
quartiles. Use graphs with labels to illustrate your answers.
3) A radar unit is used to measure speeds of cars on a motorway. The speeds are normally
distributed with a mean of 90 km/hr and a standard deviation of 10 km/hr. What is the
probability that a car picked at random is travelling at more than 100 km/hr?
4) For a certain type of computer, the length of time between charges of the battery is
normally distributed with a mean of 50 hours and a standard deviation of 15 hours. John
owns one of these computers and wants to know the probability that the length of time
will be between 50 and 70 hours
5) Entry to a certain University is determined by a national test. The scores on this test are
normally distributed with a mean of 500 and a standard deviation of 100. Tom wants to
be admitted to this university and he knows that he must score better than at least 70% of
the students who took the test. Tom takes the test and scores 585. Will he be admitted to
this university?
6) A large group of students took a test in Physics and the final grades have a mean of 70
and a standard deviation of 10. If we can approximate the distribution of these grades by a
normal distribution, what percent of the students; (a) scored higher than 80? (b) should
pass the test (grades≥60)? (c) should fail the test (grades<60)?
7) A machine produces bolts which are N(4, 0.09) where measurements are in cm. Bolts are
measured accurately and any bolt smaller than 3.5 cm or larger than 4.4 cm is rejected.
Out of 500 bolts how many would be accepted? Ans 430
8) Suppose IQ ~ N(100, 225). A woman wants to form an Egghead society which only
admits people with the top 1% IQ score. Where should she set the cut-off in the
test to allow this to happen? Ans 134.9
9) A manufacturer does not know the mean and standard deviation of ball bearing he is
producing. However a sieving system rejects all the bearings larger than 2.4 cm and those
under 1.8 cm in diameter. Out of 1,000 ball bearings, 8% are rejected as too small and
5.5% as too big. What is the mean and standard deviation of the ball bearings produced?
Ans mean=2.08 sigma=0.2
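a) $P(X = 4) = {}^{10}C_4\left(\frac12\right)^{10} = \frac{105}{512} = 0.2051$
b) $P(X \le 4) = \left[{}^{10}C_0 + {}^{10}C_1 + {}^{10}C_2 + {}^{10}C_3 + {}^{10}C_4\right]0.5^{10} = \frac{193}{512} = 0.3770$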
for a large enough n, a binomial variable X is approximately ∼ N(np, npq). Hence, the
normal distribution can be used to approximate the binomial distribution.
To get a feel for why this might work, let us draw the probability histogram for 10 tosses of a
fair coin. The histogram looks bell-shaped, as long as the number of trials is not too small
The usual way to solve this problem is to associate 1/2 of the interval from a to a + 1 with
each adjacent integer. The continuous approximation to the probability $P(X \le a)$ would thus
be $P(X \le a + \frac12)$, while the continuous approximation to $P(X \ge a + 1)$ would be
$P(X \ge a + \frac12)$. This adjustment is called a continuity correction. More specifically, with the
binomial statement on the left, the normal approximation in the middle and the standard normal form on the right,
$P(X \le x) = P(X < x + 1) \approx P(X \le x + 0.5) = P\left(Z \le \frac{x + 0.5 - np}{\sqrt{npq}}\right)$
$P(X \ge x) = P(X > x - 1) \approx P(X \ge x - 0.5) = P\left(Z \ge \frac{x - 0.5 - np}{\sqrt{npq}}\right)$
$P(a \le X \le b) = P(a - 1 < X < b + 1) \approx P(a - 0.5 \le X \le b + 0.5) = P\left(\frac{a - 0.5 - np}{\sqrt{npq}} \le Z \le \frac{b + 0.5 - np}{\sqrt{npq}}\right)$
NB: For the binomial distribution, the values to the right of each = sign are primarily
included for illustrative purposes. The equalities which hold in the binomial distribution do
not hold in the normal distribution, because there is a gap between consecutive values of a.
The normal approximation deals with this by “splitting” the difference.
Returning to the case of coin tossing, suppose we wish to find $P(X \le 4)$, the probability that
the binomial r.v is less than or equal to 4. In the diagram above, the bars represent the
binomial distribution with n = 10, p = 0.5. The superimposed curve is a normal density f(x).
The mean of the normal is $\mu = np = 5$ and the standard deviation is
$\sigma = \sqrt{10(0.5)(0.5)} = 1.58$. Using the normal approximation, we need to calculate the
probability that our normal r.v is less than or equal to 4.5, i.e.
$P(X \le 4) \approx P(X \le 4.5) = P\left(Z \le \frac{4.5 - 5}{1.58}\right) = \Phi(-0.3162) = 0.3759$, which is very close to the
exact binomial value of 0.3770.
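The comparison between the exact binomial probability and its continuity-corrected normal approximation for 10 tosses of a fair coin can be reproduced as follows. This is a minimal sketch using the standard library (`statistics.NormalDist` requires Python 3.8 or later); the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 10, 0.5

# Exact binomial: P(X <= 4) = sum of C(10, k) * 0.5**10 for k = 0..4.
exact = sum(comb(n, k) for k in range(5)) * p**n

# Normal approximation with continuity correction: P(X <= 4.5)
# under N(np, npq).
approx = NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(4.5)

print(round(exact, 4), round(approx, 4))
```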
Example 1 Suppose 50% of the population approves of the job the governor is doing, and
that 20 individuals are drawn at random from the population. Solve the following, using the
normal approximation to the binomial. What is the probability that;
a) exactly 7 people will support the governor?
b) at least 7 people will support the governor?
c) more than 11 people will support the governor?
d) 11 or fewer will support the governor?
Solution
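Note that n = 20, p = 0.5 ⇒ $\mu = np = 10$ and $\sigma^2 = npq = 5$. Since $np \ge 5$ and $nq \ge 5$, it is
probably safe to assume that $X \sim N(10, 5)$
a) $P(X = 7) \approx P(6.5 \le X \le 7.5) = P\left(\frac{6.5 - 10}{\sqrt5} \le Z \le \frac{7.5 - 10}{\sqrt5}\right) = P(-1.565 \le Z \le -1.118)$
$= \Phi(-1.118) - \Phi(-1.565) = 0.1318 - 0.0588 = 0.0730$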
Example 2
In each of 25 races, the Democrats have a 60% chance of winning. What are the odds that the
Democrats will win 19 or more races? Use the normal approximation to the binomial
Solution
Note that n = 25, p = 0.6 ⇒ $\mu = np = 15$ and $\sigma^2 = npq = 6$. Since $np \ge 5$ and $nq \ge 5$, it is
probably safe to assume that $X \sim N(15, 6)$.
Using the normal approximation to the binomial,
$P(X \ge 19) \approx P(X \ge 18.5) = P(Z \ge 1.4289) = 1 - \Phi(1.4289) = 1 - 0.9235 = 0.0765$
Hence, Democrats have a little less than an 8% chance of winning 19 or more races.
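An upper-tail approximation of this kind can be compared with the exact binomial tail. The sketch below wraps both calculations in small helper functions; the names `binom_tail_ge` and `normal_approx_ge` are assumptions made for illustration, and the approximation uses the continuity correction $P(X \ge k) \approx P(X \ge k - 0.5)$.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_tail_ge(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def normal_approx_ge(k, n, p):
    """Normal approximation to P(X >= k) with continuity correction."""
    dist = NormalDist(n * p, sqrt(n * p * (1 - p)))
    return 1 - dist.cdf(k - 0.5)

approx = normal_approx_ge(19, 25, 0.6)
exact = binom_tail_ge(19, 25, 0.6)

print(round(approx, 4), round(exact, 4))
```

The two values differ slightly, which is expected for n as small as 25.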
Example 3 Tomorrow morning Iberia flight to Madrid can seat 370 passengers. From past
experience, Iberia knows that the probability is 0.90 that a given ticket-holder will show up
for the flight. They have sold 400 tickets, deliberately overbooking the flight. How confident
can Iberia be that no passenger will need to be bumped (denied boarding)?
Solution:
We will assume that the number (X) of passengers showing up for the flight has a binomial
distribution with mean $\mu = 400 \times 0.9 = 360$ and standard deviation $\sigma = \sqrt{400 \times 0.9 \times 0.1} = 6$
We want $P(X \le 370) \approx P(X \le 370.5) = P(Z \le 1.75) = 0.9599$. So the probability that
no passenger will need to be bumped is about 0.96.
Exercise 4.3 (Use the normal approximation to the binomial in the entire exercise)
1. A coin is loaded such that heads is thrice as likely as the tails. Find the probability of
observing between 4 and 7 heads inclusive with 12 tosses of the coin.
2. Based upon past experience, 40% of all customers at Miller’s Automotive Service Station
pay for their purchases with a credit card. If a random sample of 200 customers is
selected, what is the approximate probability that;
a) at least 75 pay with a credit card?
b) not more than 70 pay with a credit card?
c) between 70 and 75 customers, inclusive, pay with a credit card?
3. The probability that Ron scores a goal in any game against a tough opponent in soccer is
0.3. What is the probability that he scores 30 goals in the next 100 games which he plays?
4. Crafty Computers limited produces PCs. The probability that one of their computers has a
virus is 0.25. JKUAT ICSIT buys 300 computers from the company. What is the
probability that between 70 and 80 PCs inclusive have a virus? Would you advise the
director of JKUAT ICSIT to buy computers from this company in future?
5. For overseas flights, an airline has three different choices on its dessert menu—ice cream,
apple pie, and chocolate cake. Based on past experience the airline feels that each dessert
is equally likely to be chosen.
a) If a random sample of four passengers is selected, what is the probability that at
least two will choose ice cream for dessert
b) If a random sample of 21 passengers is selected, what is the approximate
probability that at least eight will choose ice cream for dessert?
6. In a family of 11 children, what is the probability that there will be more boys than girls?
7. A baseball player has a long term batting average of 0.300. What is the chance he gets an
average of 0.330 or higher in his next 100 bats?
8. Suppose we draw a Simple Random Sample of 1,500 Americans and want to assess
whether the representation of blacks in the sample is accurate. We know that about 12%
of Americans are black. What is the probability that the sample contains at most 170 blacks?
9. Let T be the lifetime in years of new bus engines. Suppose that T is continuous with
probability density function $f(t) = d/t^3$ for $t > 1$ and $f(t) = 0$ for $t < 1$, for some constant d.
a) Find the value of d and the mean and median of T.
b) Suppose that 240 new bus engines are installed at the same time, and that their
lifetimes are independent. By using a normal approximation to the binomial, find the
probability that at most 10 of the engines last for 4 years or more.
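Example If $X \sim N(60, 16)$ and $Y \sim N(70, 9)$ are two independent random variables, find
(a) $P(X + Y < 140)$ (b) $P(120 < X + Y < 135)$ (c) $P(Y - X > 7)$ (d) $P(2 < Y - X < 12)$
Solution
$X + Y \sim N(130, 25)$ and $Y - X \sim N(10, 25)$, therefore
a) $P(X + Y < 140) = P\left(Z < \frac{140 - 130}{5}\right) = \Phi(2) = 0.9772$
b) $P(120 < X + Y < 135) = P\left(\frac{120 - 130}{5} < Z < \frac{135 - 130}{5}\right) = \Phi(1) - \Phi(-2) = 0.8413 - 0.0228 = 0.8185$
c) $P(Y - X > 7) = P\left(Z > \frac{7 - 10}{5}\right) = P(Z > -0.6) = P(Z < 0.6) = \Phi(0.6) = 0.7257$
d) $P(2 < Y - X < 12) = P\left(\frac{2 - 10}{5} < Z < \frac{12 - 10}{5}\right) = \Phi(0.4) - \Phi(-1.6) = 0.6554 - 0.0548 = 0.6006$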
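The four answers above can be checked numerically. The following is a minimal sketch in Python, assuming only the standard library's `statistics.NormalDist`; the variable names are illustrative and not part of the notes. It uses the same fact as the solution: for independent normal variables the means add or subtract, while the variances add.

```python
from statistics import NormalDist

# NormalDist takes the standard deviation, so the variances 16 and 9
# from the example become sigma = 4 and sigma = 3.
X = NormalDist(mu=60, sigma=4)
Y = NormalDist(mu=70, sigma=3)

# Adding or subtracting two NormalDist objects treats them as independent.
total = X + Y   # N(130, 25)
diff = Y - X    # N(10, 25)

p_a = total.cdf(140)                   # P(X + Y < 140)
p_b = total.cdf(135) - total.cdf(120)  # P(120 < X + Y < 135)
p_c = 1 - diff.cdf(7)                  # P(Y - X > 7)
p_d = diff.cdf(12) - diff.cdf(2)       # P(2 < Y - X < 12)

for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(f"({label}) {p:.4f}")
```

The printed values agree with the table-based answers to about three decimal places; any small difference comes from rounding in the standard normal table.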
Exercise 4.4
1. If $X \sim N(65, 28)$ and $Y \sim N(85, 36)$ are two independent random variables, find (a) $P(X + Y < 142)$
(b) $P(134 < X + Y < 166)$ (c) $P(Y - X > 4)$ (d) $P(12 < Y - X < 24)$
2. Each day Mr. Njoroge walks to the library to read a newspaper. Total time spent
walking is normally distributed with mean 15 minutes and standard deviation 2 minutes.
Total time spent in the library is also normally distributed with mean 25 minutes and
standard deviation 12 minutes. Find the probability that on one day;
a) he is away from his home for more than 45 minutes.
b) he spends more time walking than in the library
random, we would observe a different value of x̄ if we repeated the procedure. That is, x̄ is
also a random quantity. Its value is determined partly by which people are randomly chosen
to be in the sample. If we repeatedly drew samples of size n and calculated x̄, we could
ascertain the sampling distribution of x̄.
[Figure: Many possible samples, many possible x̄'s. Twenty panels, one per sample, each plotted on a 0 to 10 axis; the sample means range from 1.38 to 1.78.]
We will have a better idea of how good our one estimate is if we have good knowledge of
how x̄ behaves; that is, if we know the probability distribution of x̄.
Example The law firm of Hoya and Associates has six partners (A, B, C, D, E, F). At their
weekly partners' meeting each reported the number of hours they charged clients for their
services last week: A 24, B 26, C 28, D 26, E 24, F 26 (e.g. Mr. E charged 24 hours).
If n = 2 partners are selected randomly, how many different samples are possible?
This is the number of combinations of 6 objects taken 2 at a time, i.e. there are $\binom{6}{2} = 15$ possible samples.
The 15 sample means are given below (e.g. if the sample has A and B, the sample mean is 25):
AB 25, AC 26, AD 25, AE 24, AF 25, BC 27, BD 26, BE 25, BF 26, CD 27, CE 26, CF 27,
DE 25, DF 26, EF 25. Putting this in a frequency table we get
Sample mean x̄   24   25   26   27
Frequency f       1    6    5    3
This is almost the sampling distribution of means. If we divide individual frequencies by total
frequency (ie 15) we get “relative frequency” or probability. These probabilities add up to
one, so we have a prob. distribution. The above information says that the probability that
sample mean is 24 is 1 out of 15 or 0.066667. Now draw a histogram for sampling distribution
of means and fit a normal curve on it.
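The enumeration above is small enough to reproduce by machine. The sketch below is one possible way to do it in Python (the variable names are illustrative): it lists every sample of size 2, tabulates the sample means, and converts the frequencies to probabilities. It also checks a fact not stated in the example but easy to confirm here: the mean of the sampling distribution equals the population mean.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hours charged last week by each partner, as given in the example.
hours = {"A": 24, "B": 26, "C": 28, "D": 26, "E": 24, "F": 26}

# Every unordered pair of partners is one possible sample of size n = 2.
samples = list(combinations(sorted(hours), 2))

# Sample mean for each pair, kept exact with Fraction.
means = [Fraction(hours[a] + hours[b], 2) for a, b in samples]

# Frequency table of the sample means, then relative frequencies.
freq = Counter(means)
dist = {m: Fraction(f, len(samples)) for m, f in sorted(freq.items())}

for m, p in dist.items():
    print(f"mean = {float(m):.0f}  f = {freq[m]}  P = {float(p):.4f}")

# Mean of the sampling distribution versus the population mean.
mean_of_means = sum(m * p for m, p in dist.items())
population_mean = Fraction(sum(hours.values()), len(hours))
print(float(mean_of_means), float(population_mean))
```

Exact fractions are used so that the two means can be compared for equality without floating-point rounding.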
The sampling distribution is simply this probability distribution defined over all possible
samples of size n from the population of size N. In real-world problems N will be large
(e.g. a US population of 200 million) and n will also be large (e.g. 1000 people surveyed), so
$\binom{N}{n}$ will be an astronomical number. Then the sampling distribution can only be imagined. We
have chosen a simple example with N = 6, n = 2 so that the entire sampling distribution can be
explicitly computed and visualized. Now the random variable is x̄; it is no longer just X.
Definitions
Central Limit Theorem:- States that as the sample size increases, the sampling distribution of
the sample means becomes approximately normal, whatever the shape of the population
distribution (provided the population variance is finite).
Sampling Distribution of the Sample Means:- Distribution obtained by using the means
computed from random samples of a specific size.
Sampling Error :- Difference which occurs between the sample statistic and the population
parameter due to the fact that the sample isn't a perfect representation of the population.
Standard Error of the Mean:- The standard deviation of the sampling distribution of the
sample means. It is equal to the standard deviation of the population divided by the square
root of the sample size, i.e. σ/√n.
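The standard error formula can be illustrated by simulation. The sketch below is an illustration under stated assumptions rather than part of the notes: it draws repeated samples from an exponential population with mean 1 and standard deviation 1 (a deliberately skewed, hypothetical choice) and compares the observed standard deviation of the sample means with the population standard deviation divided by the square root of n. The seed, the number of repetitions and the sample sizes are arbitrary.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(2200)  # arbitrary fixed seed so that reruns give the same numbers

SIGMA = 1.0   # standard deviation of the exponential(1) population
REPS = 5000   # number of repeated samples per sample size (arbitrary)

def sample_means(n, reps=REPS):
    """Return the means of `reps` independent samples of size n."""
    return [fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

results = {}
for n in (4, 25, 100):
    observed = pstdev(sample_means(n))   # spread of the simulated means
    theory = SIGMA / sqrt(n)             # standard error sigma / sqrt(n)
    results[n] = (observed, theory)
    print(f"n = {n:3d}  observed = {observed:.4f}  sigma/sqrt(n) = {theory:.4f}")
```

With these settings the observed and theoretical values should be close for each n, and both shrink as n grows. Plotting a histogram of the simulated means for each n is a natural next step for seeing the approach to normality.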
Remarks
The central limit theorem involves two different distributions: the distribution of the original
population and the distribution of the sample means.
The formula for a z-score when working with the sample means is:
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$
Example Intelligence Quotient (IQ) is normally distributed with mean 110 and standard
deviation of 10. A moron is a person with IQ less than 80. Find the probability that a
randomly chosen person is a moron. Let idiot be defined as one with an IQ less than 90.
Find the probability that a randomly chosen person is an idiot. (Hint this random variable is
for a single person X)
If a sample of 25 students is available, what is the probability that the average IQ exceeds
105? What is the probability that the average IQ exceeds 115? (Hint: this random variable is for
an average over 25 persons, or $\bar{X}$.)
Solution
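IQ $X \sim N(110, 10^2)$, and therefore for a sample of 25 people the average IQ $\bar{X} \sim N(110, 4)$.
The probability that a randomly chosen person is a moron is given by
$P(X < 80) = P\left(Z < \frac{80 - 110}{10}\right) = \Phi(-3) = 0.0013$
The probability that a randomly chosen person is an idiot is given by
$P(X < 90) = P\left(Z < \frac{90 - 110}{10}\right) = \Phi(-2) = 0.0228$
The probability that the average IQ of the 25 students exceeds 105 is given by
$P(\bar{X} > 105) = P\left(Z > \frac{105 - 110}{2}\right) = P(Z > -2.5) = \Phi(2.5) = 0.9938$
The probability that the average IQ of the 25 students exceeds 115 is given by
$P(\bar{X} > 115) = P\left(Z > \frac{115 - 110}{2}\right) = P(Z > 2.5) = 1 - \Phi(2.5) = 0.0062$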
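The difference between a probability for one person and a probability for a sample average is only a change of standard deviation, from σ to σ/√n. The following sketch, assuming Python's standard-library `NormalDist` and illustrative variable names, evaluates the two single-person probabilities of this example and, by the same method, the two sample-average probabilities that the example asks for.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 110, 10, 25   # figures from the example

person = NormalDist(mu, sigma)             # IQ of one person, X
average = NormalDist(mu, sigma / sqrt(n))  # mean IQ of 25 people, sd = 2

p_below_80 = person.cdf(80)            # P(X < 80)
p_below_90 = person.cdf(90)            # P(X < 90)
p_avg_above_105 = 1 - average.cdf(105)  # P(X-bar > 105)
p_avg_above_115 = 1 - average.cdf(115)  # P(X-bar > 115)

print(f"P(X < 80)      = {p_below_80:.4f}")
print(f"P(X < 90)      = {p_below_90:.4f}")
print(f"P(X-bar > 105) = {p_avg_above_105:.4f}")
print(f"P(X-bar > 115) = {p_avg_above_115:.4f}")
```

Note how a cut-off 5 points from the mean is only half a standard deviation away for one person but 2.5 standard errors away for an average of 25.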
Exercise 4.5
1) The annual salaries of employees in a large company are approximately normally
distributed with a mean of $50,000 and a standard deviation of $10,000. If a random
sample of 50 employees is taken, what is the probability that their average salary is;
a) less than $45,000?
b) between $45,000 and $65,000?
c) more than $70,000
2) A library usually has 13% of its books checked out. Find the probability that in a sample of
588 books more than 14% are checked out. ANS = 0.2358
3) The length of similar components produced by a company are approximated by a normal
distribution model with a mean of 5 cm and a standard deviation of 0.02 cm.
a) If a component is chosen at random what is the probability that the length of this
component is between 4.98 and 5.02 cm?
b) What is the probability that the average length of a sample of 25 components is
between 4.96 and 5.04 cm?
4) The length of life of an instrument produced by a machine has a normal distribution with
a mean of 12 months and a standard deviation of 2 months. Find the probability that in a
random sample of 4 instruments produced by this machine, the average length of life is
a) less than 10.5 months. b) between 11 and 13 months.
5) The time taken to assemble a car in a certain plant is a random variable having a normal
distribution with a mean of 20 hours and a standard deviation of 2 hours. What is the
probability that a car can be assembled at this plant in a period of time
a) less than 19.5 hours? b) between 20 and 22 hours?
5 STATISTICAL INFERENCES
5.1 Introduction
In research, one always has some fixed ideas about certain population parameters based on
say, prior experiments, surveys or experience. However, these are only ideas. There is
therefore a need to ascertain whether these ideas /claims are correct or not.
The ascertaining of claims is done by first collecting information in the form of sample data.
We then decide whether our sample observations (statistic) have come from a postulated
population or not.
Definitions
A hypothesis is a claim (assumption) about a population parameter such as the population
mean, the population proportion or the population standard deviation. It is a postulated or
stipulated value of a parameter.
Example: The mean monthly cell phone bill in this city is μ= $42
The proportion of adults in this city with cell phones is π = 0.68
On the basis of observation data, one then performs a test to decide whether the postulated
hypothesis should be accepted or not. However, we note that the decision aspect is prone to
error/risk.
Null Hypothesis (denoted H0): Statement of zero or no change and is the hypothesis which is
to be actually tested for acceptance or rejection. If the original claim includes equality (<=, =,
or >=), it is the null hypothesis. If the original claim does not include equality (<, not equal,
>) then the null hypothesis is the complement of the original claim. The null hypothesis
always includes the equal sign. The decision is based on the null hypothesis.
E.g. The average number of TV sets in U.S. homes is equal to three (H0: μ = 3).
It is always about a population parameter, and not about a sample statistic,
i.e. H0: μ = 3 but NOT H0: x̄ = 3.
We begin with the assumption that the null hypothesis is true
- Similar to the notion of innocent until proven guilty
Alternative Hypothesis (denoted H1 or Ha): Statement which is true if the null hypothesis is
false. It challenges the status quo. It is generally the hypothesis that the researcher is trying to
prove; it is accepted when H0 is rejected, and not accepted when H0 is not rejected. The type
of test (left, right, or two-tail) is based on the alternative hypothesis.
- If the sample mean is close to the assumed population mean, the null hypothesis is not
rejected.
- If the sample mean is far from the assumed population mean, the null hypothesis is
rejected.
How far is “far enough” to reject H0? The critical value of a test statistic creates a “line in the
sand” for decision making -- it answers the question of how far is far enough.
Critical values
                     Actual Situation
Decision             H0 True                         H0 False
Do Not Reject H0     No Error (probability 1 - α)    Type II Error (probability β)
Reject H0            Type I Error (probability α)    No Error (probability 1 - β)
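The four cells of this table can be given numbers once a specific test and a specific true mean are fixed. The sketch below is purely illustrative: the null value 3, σ = 0.8, n = 100 and α = 0.05 are taken as a working two-tail Z test, and the "true" mean of 2.8 is a hypothetical value chosen only so that β can be computed. None of the variable names come from the notes.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 3.0, 0.8, 100, 0.05   # illustrative two-tail Z test
true_mu = 2.8                                # hypothetical true mean (assumption)

se = sigma / sqrt(n)                          # standard error of the sample mean
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
lower, upper = mu0 - z_crit * se, mu0 + z_crit * se  # non-rejection region for x-bar

# Type I error: H0 is true, yet x-bar lands outside the non-rejection region.
under_h0 = NormalDist(mu0, se)
type_i = under_h0.cdf(lower) + (1 - under_h0.cdf(upper))

# Type II error: H0 is false (mean is true_mu), yet x-bar lands inside it.
under_alt = NormalDist(true_mu, se)
beta = under_alt.cdf(upper) - under_alt.cdf(lower)
power = 1 - beta

print(f"P(Type I error)  = {type_i:.4f}")
print(f"P(Type II error) = {beta:.4f}")
print(f"Power (1 - beta) = {power:.4f}")
```

The Type I error probability reproduces α by construction, whereas β depends on how far the assumed true mean lies from the null value.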
H0: μ = 3
H1: μ ≠ 3
Test statistic: Sample statistic used to decide whether to reject or fail to reject the null
hypothesis
Probability Value (P-value): The probability, computed assuming the null hypothesis is true,
of getting results at least as extreme as those obtained. If this probability is too small
(smaller than the level of significance), then we reject the null hypothesis. If the level of
significance is the area beyond the critical values, then the probability value is the area
beyond the test statistic.
Decision: A statement based upon the null hypothesis. It is either "reject the null hypothesis"
or "fail to reject the null hypothesis". We will never accept the null hypothesis.
Conclusion: A statement which indicates the level of evidence (sufficient or insufficient), at
what level of significance, and whether the original claim is rejected (null) or supported
(alternative).
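The idea that the P-value is the area beyond the test statistic depends on which tail the alternative hypothesis points to. The following is a minimal sketch for a test statistic that follows the standard normal distribution; the function names `p_value` and `decide` and the `tail` argument are hypothetical, chosen for this illustration only, and z = -2.0 is just a sample input.

```python
from statistics import NormalDist

def p_value(z, tail):
    """Area beyond the z statistic; tail is 'left', 'right' or 'two'."""
    phi = NormalDist().cdf   # standard normal distribution function
    if tail == "left":
        return phi(z)
    if tail == "right":
        return 1 - phi(z)
    if tail == "two":
        return 2 * (1 - phi(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")

def decide(z, tail, alpha=0.05):
    """Reject H0 when the P-value is smaller than the significance level."""
    return "reject H0" if p_value(z, tail) < alpha else "fail to reject H0"

z = -2.0   # illustrative test statistic
for tail in ("left", "right", "two"):
    print(f"{tail:5s}  p = {p_value(z, tail):.4f}  ->  {decide(z, tail)}")
```

The same statistic can lead to different decisions under different alternatives, which is why the alternative hypothesis must be fixed before looking at the data.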
Remarks
The first thing to do when given a claim is to write the claim mathematically (if possible),
and decide whether the given claim is the null or alternative hypothesis. If the given claim
contains equality, or a statement of no change from the given or accepted condition, then it is
the null hypothesis, otherwise, if it represents change, it is the alternative hypothesis.
The type of test is determined by the Alternative Hypothesis ( H1 )
If the test statistic falls into the non rejection region, do not reject the null hypothesis H0. If
the test statistic falls into the rejection region, reject the null hypothesis. Express the
managerial conclusion in the context of the problem
Conclusions are sentence answers which include whether there is enough evidence or not
(based on the decision) and whether the original claim is supported or rejected. Conclusions
are based on the original claim, which may be the null or alternative hypotheses.
Example Test at the 5% level the claim that the true mean number of TV sets in US homes is equal to
3. Suppose the sample results are n = 100, x̄ = 2.84 (σ = 0.8 is assumed known).
Solution
State the appropriate null and alternative hypotheses
- H0: μ = 3 H1: μ ≠ 3 (This is a two-tail test)
Determine the appropriate technique
- σ is assumed known so this is a Z test.
Determine the critical values
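- For α = 0.05 the critical Z values are ±1.96
Compute the test statistic
- $Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{2.84 - 3}{0.8 / \sqrt{100}} = \frac{-0.16}{0.08} = -2.0$
Make the decision
- Since $Z_{STAT} = -2.0 < -1.96$, the test statistic falls in the rejection region, so we reject H0
and conclude that there is sufficient evidence at the 5% level that the mean number of TV sets is not 3.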
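The Z test in this example can be reproduced in a few lines. This is a sketch using Python's standard library, not a prescribed method; the variable names are illustrative. It computes the test statistic from the figures in the example and compares it with the two-tail critical value.

```python
from math import sqrt
from statistics import NormalDist

mu0, x_bar, sigma, n, alpha = 3.0, 2.84, 0.8, 100, 0.05   # from the example

# Test statistic: distance of the sample mean from mu0 in standard errors.
z_stat = (x_bar - mu0) / (sigma / sqrt(n))

# Two-tail critical value: alpha is split equally between the two tails.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z_stat) > z_crit
print(f"Z = {z_stat:.2f}, critical values = ±{z_crit:.2f}")
print("reject H0" if reject else "fail to reject H0")
```

The statistic lies only just beyond the critical value, so the decision would change at a stricter level such as α = 0.01.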
Exercise 5.1
1. A simple random sample of 10 people from a certain population has a mean age of 27.
Can we conclude that the mean age of the population is less than 30? The variance is
known to be 20. Let α = 0.05.
2. Bon Air Elementary School has 300 students. The principal of the school thinks that the
average IQ of students at Bon Air is at least 110. To prove her point, she administers an
IQ test to 20 randomly selected students. Among the sampled students, the average IQ is
108. Assuming the variance is known to be 100, should the principal accept or reject her
original hypothesis at the 5% level of significance?
3. The central bank believes that if consumer confidence is too high, the economy risks
overheating. Low confidence is a warning that recession might be on the way. In either case,
the bank may choose to intervene by altering interest rates. The ideal value for the bank's
chosen measure is 50. We may assume the measure is normally distributed with standard
deviation 10. The bank takes a survey of 25 people, which returned a sample mean of 54
for the index. What would you advise the bank to do? Use α = 0.05.
4. A manager will switch to a new technology if the production process exceeds 80 units per
hour. The manager asks the company statistician to test the null hypothesis: H0: μ = 80
against the alternative hypothesis: H1: μ >80 If there is strong evidence to reject the null
hypothesis then the new technology will be adopted. Past experience has shown that the
standard deviation is 8. A data set with n = 25 for the new technology has a sample mean
of 83. Does this justify adoption of the new technology?
Example 1 A fertilizer mixing machine is set to give 12 kg of nitrate for every 100kg bag of
fertilizer. Ten 100kg bags are examined. The percentages of nitrate are as follows: 11, 14, 13,
12, 13, 12, 13, 14, 11, 12. Is there reason to believe that the machine is defective at 5% level
of significance?
Solution
Hypothesis H0: μ = 12 H1: μ ≠ 12 (This is a two-tail test)
Critical region based on α = 0.05 and n - 1 = 9 degrees of freedom:
$t_{9, 0.025} = 2.262$, i.e. reject H0: μ = 12 if $|t_c| > 2.262$
From calculator x̄ = 12.5 and s = 1.0801
Test statistic $t_c = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{12.5 - 12}{1.0801 / \sqrt{10}} = 1.4639$
Decision: since $|t_c| = 1.4639 < 2.262$, we fail to reject H0 and conclude that there is no
evidence at the 5% level that the machine is defective.
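The calculator steps in this example can be reproduced from the raw data. The sketch below uses Python's standard library, which has no t distribution, so the critical value 2.262 is taken from the notes rather than computed; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Percentage of nitrate in the ten bags examined.
nitrate = [11, 14, 13, 12, 13, 12, 13, 14, 11, 12]
mu0 = 12        # machine setting under H0
t_crit = 2.262  # t(9 d.f., 0.025) as quoted in the notes, not computed here

n = len(nitrate)
x_bar = mean(nitrate)
s = stdev(nitrate)   # sample standard deviation (divisor n - 1)

t_stat = (x_bar - mu0) / (s / sqrt(n))
reject = abs(t_stat) > t_crit   # two-tail rejection rule

print(f"x-bar = {x_bar}, s = {s:.4f}, t = {t_stat:.4f}")
print("reject H0" if reject else "fail to reject H0")
```

`stdev` divides by n - 1, which matches the sample standard deviation s used in the t statistic; `pstdev` would divide by n and give a different value.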
Example 2 The following figures give the end of year profits of ten randomly selected
Chemists in Nairobi county.
On the basis of this data, test whether the average profit is greater than 30M KSH at 1% level
of significance
Solution
Hypothesis H0: μ = 30 H1: μ > 30 (This is a one-tail test)
Critical region based on α = 0.01 and n - 1 = 9 degrees of freedom:
$t_{9, 0.01} = 2.82$, i.e. reject H0: μ = 30 if $t_c > 2.82$
From calculator x̄ = 29.415 and s = 3.6601
Test statistic $t_c = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{29.415 - 30}{3.6601 / \sqrt{10}} = -0.51$
Decision: since $t_c = -0.51 < 2.82$, we do not reject H0 and conclude that there is no
evidence at the 1% level that the average profit is greater than 30M KSH.
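When only summary statistics are available, as in this example and in several of the exercises that follow, the t statistic can be computed directly from them. The helper below is a hypothetical sketch (the name `t_statistic` is not from the notes); the critical value 2.82 is the table value quoted in the solution, and the rule shown is for a right-tail test only.

```python
from math import sqrt

def t_statistic(x_bar, mu0, s, n):
    """One-sample t statistic from the sample mean, sd and size."""
    return (x_bar - mu0) / (s / sqrt(n))

# Summary statistics quoted in the solution.
t_stat = t_statistic(x_bar=29.415, mu0=30, s=3.6601, n=10)
t_crit = 2.82   # t(9 d.f., 0.01) as quoted in the notes

# Right-tail test: reject H0 only for large positive values of t.
reject = t_stat > t_crit
print(f"t = {t_stat:.2f}")
print("reject H0" if reject else "do not reject H0")
```

Because the sample mean is below 30, the statistic is negative and can never fall in a right-tail rejection region, whatever the significance level.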
Exercise 5.2
1. Identify the critical t value for each of the following tests:
a. A two-tailed test with α = 0.05 and 11 degrees of freedom
b. A one-tailed test with α = 0.01 and n = 17
2. Consider a sample with n = 20, x̄ = 8.0 and s = 2. Do the following hypothesis tests.
a) H0: μ = 8.7 H1: μ > 8.7 at α = 0.01
b) H0: μ = 8.7 H1: μ ≠ 8.7 at α = 0.05
3. It is widely believed that the average body temperature for healthy adults is 98.6 degrees
Fahrenheit. A study was conducted a few years ago to examine this belief. The body
temperatures of n = 130 healthy adults were measured (half male and half female). The
average temperature from the sample was found to be x̄ = 98.249 with a standard
deviation s = 0.7332. Do these statistics contradict the belief that the average body
temperature is 98.6? Test at the 1% level of significance.
4. A study is to be done to determine if the cognitive ability of children living near a lead
smelter is negatively impacted by increased exposure to lead. Suppose the average IQ for
children in the United States is 100. From a pilot study, the mean and standard deviation
were estimated to be x̄ = 89 and s = 14.4 respectively. Test at the 5% level whether there is a
negative impact.
5. The average cost of a hotel room in New York is said to be $168 per night. To determine
if this is true, a random sample of 25 hotels is taken and resulted in x̄ = $172.50
and s = $15.40. Test the appropriate hypotheses at α = 0.05.
6. A sample of eleven plants gave the following shoot lengths
Shoot length (cm) 10.1 21.5 11.7 12.9 14.8 11.0 19.2 11.4 22.6 10.8 10.2
An earlier study reported that the mean shoot length is 15cm. Test whether the
experimental data confirms the old view at 5% level of significance.
7. A simple random sample of 14 people from a certain population gives body mass indices
as shown in Table 7.2.1. Can we conclude that the mean BMI is not 35? Let α = 0.05.
Subject   1   2   3   4   5   6   7   8   9  10  11  12  13  14
BMI      23  25  21  37  39  21  23  24  32  57  23  26  31  45
8. A company selling licenses for a franchise operation claims that, in the first year, the
yield on an initial investment is 10%. How should a hypothesis test be stated? If there is
strong evidence that the mean return on the investment is below 10% this will give a
cautionary warning to a potential investor. Therefore, test the null hypothesis H0: μ = 10
against the alternative hypothesis H1: μ < 10. From a sample of n = 10 observations, the
sample statistics are x̄ = 8.82 and s = 2.40.
9. We know the distance that an athlete can jump is normally distributed but we do not
know the standard deviation. We record 15 jumps: 7.48 7.34 7.97 5.88 7.48 7.67 7.49
7.48 8.51 5.79 7.13 6.80 6.19 6.95 5.93 Test whether these values are consistent with
a mean jump length of 7m. Do you have any reservations about this test?
10. The manufacturing process should give a weight of 20 ounces. Does the data show
evidence that the process is operating correctly? Test the null hypothesis H0: μ = 20 (the
process is operating correctly) against the alternative H1: μ ≠ 20 (the process is not
operating correctly). From the data set, the sample statistics are: n = 9, x̄ = 20.356
(ounces) and s = 0.6126.