Statistical Method
CHAPTER ONE
1. REVIEW OF SOME BASIC CONCEPTS
1.1 Basic concepts: population, sample, parameter, statistic
Statistics is a set of scientific principles and techniques that are useful in reaching conclusions about
populations and processes when the available information is both limited and variable; that is, statistics is
the science of learning from data. Almost everyone, including corporate presidents, marketing
representatives, social scientists, engineers, medical researchers, and consumers, deals with data. Although
the objective of statistical methods is to make the process of scientific research as efficient and productive
as possible, many scientists and engineers have inadequate training in experimental design and in the
proper selection of statistical analyses for experimentally acquired data.
A population is the set of all items or individuals of interest whose properties are to be
analyzed. It consists of all subjects (human or otherwise) that are being studied. Most of the time, due to
the expense, time, size of population, medical concerns, etc., it is not possible to use the entire population
for a statistical study; therefore, researchers use samples.
A sample is a group of subjects selected from a population or it is a subset of the population. If the
subjects of a sample are properly selected, most of the time they should possess the same or similar
characteristics as the subjects in the population.
Why sample?
Less time consuming than a census
Less costly to administer than a census
When it’s impossible to study the whole population
Parameter: is a descriptive measure of a population, or summary value calculated from a population.
Examples: Population Range, Population Average, Population proportion, Population variance, etc.
Statistic: is a descriptive measure of a sample, or summary value calculated from a sample.
Example: Sample Proportion, Sample Average, Sample Range, Sample variance.
Target Population: The population to be studied/ to which the investigator wants to generalize his
results.
Sampling Unit: the smallest unit from which a sample can be selected
Statistical Methods (Stat 1013) Lecture Notes, UoG
Sampling frame: the list of all the sampling units from which the sample is drawn
Example: a college dean is interested in learning about the average age of faculty members. Identify the
basic terms in this situation:
The population is the age of all faculty members at the college.
A sample is any subset of the population. For example, we might select 10 faculty members and
determine their age.
The variable is the “age” of each faculty member
The experiment would be the method used to select the ages forming the sample and determining the
actual age of each faculty member in the sample.
The parameter of interest is the “average” age of all faculty members at the college.
The statistic is the “average” age of the faculty members in the sample.
1.2 Review of descriptive statistics
Depending on how data are used, statistics is divided into two main areas:
descriptive and inferential statistics.
Descriptive statistics
Is concerned with summary calculations, graphs, charts and tables.
Collecting, presenting, and describing data
Measures of averages and dispersions
Summarizing and Graphing Data
Frequency distribution (frequency table): shows how a data set is partitioned among all of
several categories (classes) by listing all of the categories along with the number of data values
in each of the categories.
Statistical Graphics
Objective is to identify a suitable graph for representing the data set. The graph should be effective
in revealing the important characteristics of the data.
A histogram is a visual tool used to analyze the shape of the distribution of the data.
Histogram, Frequency Polygon, Relative Frequency Polygon,
Dot Plot, Ogive (cumulative frequency curve or line graph), Bar Graph,
Pie Chart, Scatter Plot (or Scatter Diagram),
Time Series Plot, Box Plot
Statistical graphics are used to:
Describing data: consider distribution, center, variation, and outliers
Exploring data: features that reveal some useful and/or interesting characteristic of the data set.
Examples of discrete data include the number of subjects sensitive to the effect of a drug (the number of
“successes” and the number of “failures”).
Continuous data can assume any value within a specific range, such as the air pressure in a tire.
Continuous variables are variables which can assume an infinite number of possible values between any
two specific values. They are usually obtained by measurement.
Examples of continuous data are weight, height, pressure, and survival time.
There are many numerical indicators for summarizing and describing data. The most common ones
indicate central tendency, variability, and proportional representation (the sample mean, variance, and
percentiles, respectively).
Inferential statistics
Making statements about a population by examining sample results
Drawing conclusions and/or making decisions concerning a population based on sample results.
The two major activities of inferential statistics are:
To use sample data to estimate values of population parameters, and
To test hypotheses or claims made about population parameters.
Inference can be either estimation or hypothesis testing.
1.3 Review of probability concepts and distributions
Probability is a measure of the chance that something will occur; it quantifies uncertainty.
Random experiment: involves obtaining observations of some kind. It is an experiment that can be
repeated any number of times under similar conditions and it is possible to enumerate the total number of
outcomes without predicting an individual outcome.
Examples: tossing a coin, throwing a die, counting arrivals at an emergency room, etc.
Population: Set of all possible observations. Conceptually, a population could be generated by repeating
an experiment indefinitely.
Outcome: The result of a single trial of a random experiment.
Elementary event (simple event): one possible outcome of an experiment
Event (Compound event): One or more possible outcomes of a random experiment.
Sample space: the set of all sample points (simple events) for an experiment is called a sample space; or
set of all possible outcomes for an experiment.
Notation:
Sample space: S
Sample point: E1, E2 . . . etc.
Limitations (of the classical approach):
If it is not possible to enumerate all the possible outcomes for an experiment.
If the sample points (outcomes) are not mutually independent.
If the total number of outcomes is infinite.
If the outcomes are not equally likely.
Example 1. A fair die is tossed once. What is the probability of getting:
a) the number 4?
b) an odd number?
c) a number greater than 4?
d) either 1 or 2 or … or 6?
2. A box of 80 candles consists of 30 defective and 50 non-defective candles. If 10 of these candles are
selected at random, what is the probability that:
a) all will be defective?
b) 6 will be non-defective?
c) all will be non-defective?
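The die probabilities follow the classical rule (favorable outcomes over total outcomes), and the candle exercise is the standard hypergeometric count C(30, d)·C(50, 10 − d) / C(80, 10). A minimal sketch, using only the standard library:

```python
from fractions import Fraction
from math import comb

# Classical probability: P(A) = favorable outcomes / total outcomes.
die = range(1, 7)
p_four = Fraction(sum(1 for x in die if x == 4), 6)      # a) 1/6
p_odd = Fraction(sum(1 for x in die if x % 2 == 1), 6)   # b) 3/6 = 1/2
p_gt4 = Fraction(sum(1 for x in die if x > 4), 6)        # c) 2/6 = 1/3
p_any = Fraction(6, 6)                                   # d) certain event = 1

# Candles: 30 defective, 50 non-defective, 10 selected at random.
# Counts of each outcome follow C(30, d) * C(50, 10 - d) out of C(80, 10).
total = comb(80, 10)
p_all_defective = Fraction(comb(30, 10), total)            # a)
p_six_good = Fraction(comb(50, 6) * comb(30, 4), total)    # b)
p_all_good = Fraction(comb(50, 10), total)                 # c)
```

Using `Fraction` keeps the answers exact instead of rounding them to floats.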
The Frequentist Approach
This is based on the relative frequencies of outcomes belonging to an event.
Definition: The probability of an event A is the long-run proportion of outcomes favorable to A when
the experiment is repeated under the same conditions:
P(A) = lim (N→∞) N_A / N,
where N_A is the number of times A occurs in N repetitions.
Example: If records show that 60 out of 100,000 bulbs produced are defective, what is the probability that a
newly produced bulb is defective? P(defective) = 60/100,000 = 0.0006.
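The long-run relative frequency idea can be illustrated by simulation. A hypothetical sketch, treating the recorded rate 60/100,000 = 0.0006 as the true underlying probability (the seed is arbitrary, chosen only for reproducibility):

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

# Estimate P(defective) by relative frequency over many simulated bulbs.
p_true = 60 / 100_000          # 0.0006, from the records above
n_trials = 1_000_000
defects = sum(1 for _ in range(n_trials) if random.random() < p_true)
p_estimate = defects / n_trials  # approaches 0.0006 as n_trials grows
```
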
Axiomatic Approach:
Let E be a random experiment and S be a sample space associated with E. With each event A we
associate a real number P(A), called the probability of A, satisfying the following properties, called the
axioms (or postulates) of probability:
1. 0 ≤ P(A) ≤ 1
2. P(S) =1
3. If A and B are mutually exclusive events, the probability that one or the other occurs equals the sum of
the two probabilities, i.e., P(A ∪ B) = P(A) + P(B)
Subjective Approach
It is always based on some prior body of knowledge. Hence subjective measures of uncertainty are always
conditional on this prior knowledge. The subjective approach accepts unreservedly that different people
(even experts) may have vastly different beliefs about the uncertainty of the same event.
Example: Abebe’s belief about the chances of Ethiopia Buna club winning the FA Cup this year may be
very different from Daniel's. Abebe, using only his knowledge of the current team and past achievements
may rate the chances at 30%.
Daniel, on the other hand, may rate the chances as 10% based on some inside knowledge he has about key
players having to be sold in the next two months.
Some basic properties of probability
1) For any event A, 0 ≤ P(A) ≤ 1
2) P(∅) = 0
3) For any events A and B, P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
4) P(A′) = 1 − P(A)
5) P(S) = 1 (since S contains all the outcomes, S always occurs).
Generalized basic principle of counting rules:
Addition Rule
Suppose that the 1st procedure can be performed in n1 ways and the 2nd procedure in n2 ways. Suppose
furthermore that it is not possible to perform procedures 1 and 2 together. Then the number of ways in
which we can perform procedure 1 or procedure 2 is n1 + n2. If there are k such procedures, the kth of
which can be performed in nk ways, there are n1 + n2 + … + nk possible ways in total.
Example: Suppose there are 2 ways to go from place A to place B if one takes a bus, and 3 ways if one
takes a train. In how many ways can somebody go from place A to B? (By the addition rule, 2 + 3 = 5 ways.)
Example
1. How many possible answer sequences are there for 3 questions, each with 4 possible answers?
The number of possible answer sequences = 4 × 4 × 4 = 64 (by the multiplication rule).
Permutation Rules:
1. The number of permutations of n distinct objects taken all together is n!,
where n! = n × (n − 1) × (n − 2) × … × 2 × 1.
2. The arrangement of n objects in a specified order using r objects at a time is called the permutation of n
objects taken r objects at a time. It is written as nPr, and the formula is
nPr = n! / (n − r)!
3. The number of distinct permutations of n objects in which k1 are alike, k2 are alike, etc., is
n! / (k1! · k2! · … · kn!)
Example 1. Suppose we have the letters A, B, C, D.
a) How many permutations are there taking all four?
b) How many permutations are there taking two letters at a time?
2. How many different permutations can be made from the letters in the word
“CORRECTION”?
Combination
A selection of objects without regard to order is called a combination.
Example: Given the letters A, B, C, and D list the permutation and combination for selecting two letters.
Solutions:
Permutation: AB, AC, AD, BA, BC, BD, CA, CB, CD, DA, DB, DC
Combination: AB, AC, AD, BC, BD, CD
Note that in permutation AB is different from BA, but in combination AB is the same as BA.
Combination Rule
The number of combinations of r objects selected from n objects is denoted by nCr and is given
by the formula:
nCr = n! / ((n − r)! · r!)
Exercise:
1. In how many ways can a committee of 5 people be chosen out of 9 people?
2. A committee of 5 people must be selected out of 5 men and 8 women. In how many ways
can the selection be made if there are exactly three women on the committee?
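Python's `math` module (3.8+) provides `perm`, `comb`, and `factorial`, so the examples and exercises above can be checked directly. A sketch (exercise 2 is read as "exactly three women", matching the counting used):

```python
from math import comb, factorial, perm

# Letters A, B, C, D
all_four = perm(4, 4)        # taking all four: 4! = 24
two_at_a_time = perm(4, 2)   # 4!/(4 - 2)! = 12
pairs = comb(4, 2)           # combinations of two letters: 6

# Permutations of the letters of "CORRECTION":
# 10 letters in which C, O and R each appear twice.
correction = factorial(10) // (factorial(2) * factorial(2) * factorial(2))

# Exercise 1: a committee of 5 chosen out of 9 people.
committee = comb(9, 5)

# Exercise 2: exactly 3 of the 8 women and 2 of the 5 men.
with_three_women = comb(8, 3) * comb(5, 2)
```
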
Probability Distribution:
Definition of random variables and probability Distribution:
Random variable: a numerical-valued function defined on the sample space. It assigns a real number
to each element of the sample space. Generally, random variables are denoted by capital letters and the
values of the random variables by small letters.
Example: Consider an experiment of tossing a fair coin three times. Let the random variable X be the
number of heads in the three tosses; then find the possible values of X.
Random variables are of two types:
1. Discrete random variable: are variables which can assume only a specific number of values. They
have values that can be counted
Examples:
• Toss a coin n time and count the number of heads.
• Number of children in a family.
• Number of car accidents per week.
• Number of defective items in a given company.
• Number of bacteria per two cubic centimeter of water.
2. Continuous random variable: a variable that can assume all values between any two given values.
Examples:
• Height of students at certain college.
• Mark of a student.
• Life time of light bulbs.
• Length of time required to complete a given training.
Probability distribution: consists of the values a random variable can assume and the corresponding
probabilities of those values; it is a function that assigns a probability to each value of the random variable.
2) P(X = xi) ≥ 0, i.e., 0 ≤ P(X = xi) ≤ 1
3) If X is a discrete random variable, then
P(a < X < b) = Σ P(x), summed over x = a + 1, …, b − 1
P(a ≤ X < b) = Σ P(x), summed over x = a, …, b − 1
P(a < X ≤ b) = Σ P(x), summed over x = a + 1, …, b
P(a ≤ X ≤ b) = Σ P(x), summed over x = a, …, b
If X is a continuous random variable, then
b) P(X = x) = 0 for any single value x, and
c) P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b)
Introduction to expectation
Definition:
1. Let a discrete random variable X assume the values X1, X2, …, Xn with the probabilities P(X1), P(X2),
…, P(Xn) respectively. Then the expected value of X, denoted E(X), is defined as:
E(X) = X1·P(X1) + X2·P(X2) + … + Xn·P(Xn) = Σ Xi·P(Xi), summed over i = 1, …, n
2. Let X be a continuous random variable assuming values in the interval (a, b) with density f(x) such that
∫ f(x) dx = 1 over (a, b). Then
E(X) = ∫ x f(x) dx, and E(X²) = ∫ x² f(x) dx, over (a, b), if X is continuous.
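The discrete definition can be sketched with the earlier coin-toss example (X = number of heads in three tosses of a fair coin):

```python
from fractions import Fraction
from itertools import product

# X = number of heads in three tosses of a fair coin.
outcomes = list(product("HT", repeat=3))   # 8 equally likely outcomes
pmf = {}
for o in outcomes:
    x = o.count("H")
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, 8)

# E(X) = sum of x * P(X = x)
expected = sum(x * p for x, p in pmf.items())   # 3/2 for a fair coin
```
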
Rule of Expectation
1) Let X be a random variable and k be a real number; then
Example: Let X be a continuous random variable with density
f(x) = x/2, for 0 < x < 2
f(x) = 0, otherwise
Then find a) P(1 < X < 1.5)
b) E(X)
c) Var(X)
d) E(3X² + 2X)
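Assuming the density above reads f(x) = x/2 on 0 < x < 2 (as reconstructed from the notes), the requested quantities can be checked by simple numerical integration; a midpoint-rule sketch:

```python
def f(x):
    # density as reconstructed from the notes: f(x) = x/2 on (0, 2)
    return x / 2 if 0 < x < 2 else 0.0

def integrate(g, a, b, n=100_000):
    # simple midpoint-rule numerical integration of g over [a, b]
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, 0, 2)                        # should be 1
p_1_to_15 = integrate(f, 1, 1.5)                  # (1.5^2 - 1^2)/4 = 0.3125
e_x = integrate(lambda x: x * f(x), 0, 2)         # 4/3
e_x2 = integrate(lambda x: x * x * f(x), 0, 2)    # 2
var_x = e_x2 - e_x ** 2                           # 2 - 16/9 = 2/9
e_poly = 3 * e_x2 + 2 * e_x                       # E(3X^2 + 2X) by linearity
```

The last line uses the linearity of expectation, E(3X² + 2X) = 3E(X²) + 2E(X).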
Common Discrete Probability Distributions
1. Binomial Distribution
A binomial experiment is a probability experiment that satisfies the following four requirements called
assumptions of a binomial distribution.
1. The experiment consists of n identical trials.
2. Each trial has only one of the two possible mutually exclusive outcomes, success or a failure.
3. The probability of each outcome does not change from trial to trial, and
4. The trials are independent, thus we must sample with replacement.
Examples of binomial experiments
• Tossing a coin 20 times to see how many tails occur.
• Asking 200 people if they watch BBC news.
• Registering a newly produced product as defective or non-defective.
• Asking 100 people if they favor the ruling party.
• Rolling a die to see if a 5 appears.
Definition: The outcomes of the binomial experiment and the corresponding probabilities of these
outcomes are called Binomial Distribution.
Let p = the probability of success and q = 1 − p = the probability of failure on any given trial.
Then the probability of getting x successes in n trials becomes
P(X = x) = C(n, x) · pˣ · qⁿ⁻ˣ, for x = 0, 1, 2, …, n
P(X = x) = 0, otherwise
This is sometimes written as
X ~ Bin(n, p)
When using the binomial formula to solve problems, we have to identify three things:
• The number of trials (n)
• The probability of a success on any one trial (P) and
• The number of successes desired (X).
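The binomial formula translates directly into code with `math.comb`; a sketch, applied to the coin example above (exactly 10 tails in 20 tosses of a fair coin):

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Tossing a fair coin 20 times: probability of exactly 10 tails.
p10 = binom_pmf(10, 20, 0.5)

# Sanity check: the pmf sums to 1 over x = 0, 1, ..., n.
total = sum(binom_pmf(x, 20, 0.5) for x in range(21))
```
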
Note: The Poisson probability distribution provides a close approximation to the binomial probability
distribution when n is large and p is quite small or quite large, with λ = np:
P(X = x) = ((np)ˣ · e⁻ⁿᵖ) / x!, for x = 0, 1, 2, …
where λ = np is the average number of successes.
Usually we use this approximation if np ≤ 5; in other words, if n > 20 and np ≤ 5 (or n(1 − p) ≤ 5 when p
is close to 1), we may use the Poisson distribution as an approximation to the binomial distribution.
Example:
1. If 1.6 accidents can be expected at an intersection on any given day, what is the probability that there
will be 3 accidents on any given day?
2. If there are 200 typographical errors randomly distributed in a 500-page manuscript, find the
probability that a given page contains exactly 3 errors.
3. A sales firm receives, on average, 3 calls per hour on its toll-free number. For any given hour, find
the probability that it will receive the following.
a) At most 3 calls
b) At least 3 calls
c) Five or more calls
4. If approximately 2% of people are left-handed, find the probability that in a room of 200 people there
are exactly 5 people who are left-handed.
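Example 4 is exactly the approximation setting above (n = 200, p = 0.02, so λ = np = 4). A sketch comparing the Poisson approximation with the exact binomial answer:

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson(lam) random variable."""
    return lam**x * exp(-lam) / factorial(x)

# Example 4: n = 200, p = 0.02, so lambda = np = 4.
approx = poisson_pmf(5, 4.0)

# Exact binomial answer for comparison:
exact = comb(200, 5) * 0.02**5 * 0.98**195
```

The two values agree to about two decimal places, illustrating how good the approximation is here.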
Common Continuous Probability Distributions
Normal Distribution
A random variable X is said to have a normal distribution if its probability density function is given by
f(x) = (1/(σ√(2π))) · e^(−½((x−μ)/σ)²), for −∞ < x < ∞, −∞ < μ < ∞, σ > 0
where μ = E(X) and σ² = Var(X) are the parameters of the normal distribution.
Properties of Normal Distribution:
1. It is bell shaped and symmetrical about its mean, and it is mesokurtic. The maximum ordinate is at
x = μ and is given by:
f(μ) = 1/(σ√(2π))
2. It is asymptotic to the axis, i.e., it extends indefinitely in either direction from the mean.
3. It is a continuous distribution, i.e., there are no gaps or holes.
4. It is a family of curves, i.e., every unique pair of mean and standard deviation defines a different
normal distribution. Thus, the normal distribution is completely described by two parameters: mean
and standard deviation.
5. The total area under the curve is 1, and the area of the distribution on each side of the mean is 0.5:
∫ f(x) dx = 1, over (−∞, ∞)
Student’s t Distribution
In statistics, as long as the sample size is large enough, the sample mean can be modeled with the standard
normal distribution. But when the sample size is small, statisticians rely on the distribution of the t statistic
(also known as the t score), whose value is given by:
t = (x̄ − μ) / (s/√n)
where x̄ is the sample mean, μ is the population mean, s is the standard deviation of the sample, and n is
the sample size.
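The t score is easy to compute with the standard library's `statistics` module (note that `stdev` uses the n − 1 divisor, as the formula requires). A sketch with a hypothetical small sample and a claimed population mean of 50:

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample (n = 9) and a claimed population mean of 50.
sample = [48.2, 51.1, 49.5, 50.3, 47.8, 52.0, 49.0, 50.6, 48.9]
mu = 50.0

n = len(sample)
x_bar = mean(sample)
s = stdev(sample)                  # sample standard deviation (n - 1 divisor)
t = (x_bar - mu) / (s / sqrt(n))   # t score with df = n - 1 = 8
```
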
The distribution of the t statistic is called the t distribution or the Student t distribution. The particular
form of the t distribution is determined by its Degrees of Freedom (df). The degree of freedom refers to
the number of independent observations in a set of data. When estimating a mean score or a proportion
from a single sample, the number of independent observations is equal to the sample size minus one. The t
distribution can be used with any statistic having a bell-shaped distribution (i.e., approximately normal).
The t distribution has the following properties:
The mean of the distribution is equal to 0.
The variance is equal to v / (v - 2), where v is the degrees of freedom.
With infinite degrees of freedom, the t distribution is the same as the standard normal distribution.
The t distribution is similar to standard normal distribution in the following ways
It is bell-shaped.
2). To find the chi-square critical value for a specific α when the hypothesis test is one-tailed left, the α
value must be subtracted from 1 and the left side of the table used, because the χ² table gives the area to
the right of the critical value, and the χ² statistic cannot be negative.
Example: The critical χ² value for 10 df when α = 0.05 and the test is one-tailed left is 3.940.
3). To find the chi-square critical values for a specific α when the hypothesis test is two-tailed, the area
must be split. For example, to find the critical chi-square values for 22 degrees of freedom when α = 0.05,
we use the area to the right of the larger value, 0.025 (0.05/2), and the area to the right of the smaller
value, 0.975 (1 − 0.05/2). Hence, one must use the table columns for 0.025 and 0.975; with 22 degrees of
freedom the critical values are 36.781 and 10.982 respectively.
Note that after the degrees of freedom reach 30, the chi-square table only gives values for multiples of 10
(40, 50, 60, etc.). When the exact degrees of freedom one is seeking are not given in the table, the closest
smaller value should be used.
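The tabulated values can be sanity-checked without a χ² table by simulation, since a chi-square variate with df degrees of freedom is a sum of df squared standard normals. A Monte Carlo sketch (the seed and sample count are arbitrary choices) that recovers the left-tail critical value 3.940 for 10 df at α = 0.05:

```python
import random

random.seed(0)

def chi2_sample(df):
    # a chi-square variate is a sum of df squared standard normals
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df))

draws = sorted(chi2_sample(10) for _ in range(100_000))

# Empirical 5th percentile: area 0.05 to the LEFT (0.95 to the right),
# matching the tabulated value 3.940 for 10 degrees of freedom.
critical_left = draws[int(0.05 * len(draws))]

# The simulated mean should also be close to df = 10 (see the property below).
sim_mean = sum(draws) / len(draws)
```
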
The chi-square distribution has the following properties:
The mean of the distribution is equal to the number of degrees of freedom (v), i.e., μ = v.
F-Distribution
The F distribution is an asymmetric distribution that has a minimum value of 0, but no maximum value.
The curve reaches a peak not far to the right of 0, and then gradually approaches the horizontal axis the
larger the F value is. The F distribution approaches, but never quite touches the horizontal axis.
The F distribution has two degrees of freedom, n for the numerator, m for the denominator. For each
combination of these degrees of freedom there is a different F distribution. The F distribution is most
spread out when the degrees of freedom are small. As the degrees of freedom increase, the F distribution
is less dispersed.
Let U ~ χ²(n) and V ~ χ²(m); that is, if U and V are independent chi-squared random variables, then the statistic
(U/n) / (V/m) ~ F(n, m).
Let X1, X2, …, X_{n1} be i.i.d. random variables from N(μ1, σ1²) and Y1, Y2, …, Y_{n2} be i.i.d. random variables from
N(μ2, σ2²); then:
(S1²/σ1²) / (S2²/σ2²) ~ F(n1 − 1, n2 − 1), where
S1² = (1/(n1 − 1)) Σ (xᵢ − x̄)², with x̄ = (1/n1) Σ xᵢ
S2² = (1/(n2 − 1)) Σ (yᵢ − ȳ)², with ȳ = (1/n2) Σ yᵢ
CHAPTER TWO
2. Inference about a Population Mean and Proportion
2.1 Introduction
Recall the five stages in statistical investigations:
Collection of data
Organization of data
Presentation of data
Analysis of data and
Interpretation of data
As we have discussed so far, one of the primary objectives of a statistical analysis is to use data from a
sample to make inferences about the population from which the sample was drawn. In this chapter we will
discuss the basic procedures for making such inferences.
Definition: Inference is the process of making interpretations or conclusions from sample data for the
totality of the population. Statistical inference may be divided into two major areas: statistical estimation
and statistical hypothesis testing. For instance, suppose that a structural engineer is analyzing the tensile
strength of a component used in an automobile chassis. Since variability in tensile strength is naturally
present between the individual components because of differences in raw material batches, manufacturing
processes, and measurement procedures, the engineer is interested in estimating the mean tensile strength
of the components. In practice, the engineer will use sample data to compute a number that is in some
sense a reasonable value (or guess) of the true mean. This number is called a point estimate. Now
consider a situation in which two different reaction temperatures can be used in a chemical process, say t1
and t2. The engineer conjectures that t1 results in higher yields than does t2. Statistical hypothesis testing
is a framework for solving problems of this type. In this case, the hypothesis would be that the mean yield
using temperature t1 is greater than the mean yield using temperature t2. Notice that there is no emphasis
on estimating yields; instead, the focus is on drawing conclusions about a stated hypothesis.
2.2 Estimation
Statistical Estimation: This is one way of making inference about the population parameter where the
investigator does not have any prior notion about values or characteristics of the population parameter.
It is the process by which sample data are used to indicate the value of an unknown quantity in a
population.
There are two ways of estimation.
1) Point Estimation
2) Interval estimation
It is the procedure that results in an interval of values as an estimate for a parameter; the interval contains
the likely values of the parameter. It deals with identifying the upper and lower limits of a parameter. The
limits themselves are random variables.
Estimator and Estimate
Estimator: is the rule or random variable that helps us to approximate a population parameter.
It is any quantity calculated from the data which is used to give information about an unknown quantity in
a population.
Estimate: is a particular value which an estimator can assume. It is an indication of the value of
an unknown quantity based on observed data; that is, it is the value of an estimator obtained
from a particular sample of data and used to indicate the value of a parameter.
Example: The sample mean X̄ = Σxᵢ/n is an estimator for the population mean, and x̄ = 10 is an estimate.
3. Relatively Efficient Estimator: The estimator for a parameter with the smallest variance. This
actually compares two or more estimators for one parameter.
Definition: A point estimate: is a specific numerical value estimate of a parameter. It is a single value
computed from the observations in a sample that is used to estimate the value of the target parameter.
Definition: An interval estimate for a parameter is an interval of numbers within which we expect the
true value of the population parameter to be contained. An “interval estimator” draws inferences about a
population by estimating the value of an unknown parameter using an interval. The endpoints of the
interval are computed based on sample information.
Example: the mean weekly income of a sample of 25 households is $380-$420.
2.2.2 Sampling Distribution of the Sample Mean
Sampling Distribution
Suppose we have a finite population and we draw all possible simple random samples of size n, either
without replacement or with replacement. For each sample we calculate some statistic (the sample mean X̄,
proportion P̂, etc.). All possible values of the statistic make a probability distribution, which is called the
sampling distribution.
The sampling distribution of a statistic is the distribution of all possible values of the statistic, computed
from samples of the same size randomly drawn from the same population. When sampling a discrete,
finite population, a sampling distribution can be constructed. The number of all possible samples is
usually very large, and obviously the number of statistics (any function of the sample) will be equal to the
number of samples if one and only one statistic is calculated from each sample. In fact, in practical
situations, the sampling distribution has a very large number of values. The shape of the sampling
distribution depends upon the size of the sample, the nature of the population, and the statistic which is
calculated from all possible simple random samples.
Sampling Distribution of the Sample Mean (X̄)
Given a finite population with mean μ and variance σ², when sampling from a normally distributed
population, it can be shown that the distribution of the sample mean has the following properties.
The probability distribution of all possible values of X̄ calculated from all possible simple random samples
is called the sampling distribution of X̄. In brief, we shall call it the distribution of X̄. The mean of this
distribution is called the expected value of X̄ and is written E(X̄) or μ_X̄.
The standard deviation (standard error) of this distribution is denoted by S.E.(X̄) or σ_X̄, and the variance of
X̄ is denoted by Var(X̄) or σ²_X̄. The distribution of X̄ has some important properties:
An important property of the distribution of X̄ is that it is approximately a normal distribution when the size
of the sample is large. When the sample size n is more than 30, we call it a large sample. The shape of the
population distribution does not matter: the population may be normal or non-normal, and the distribution of
X̄ is approximately normal for n > 30. As the distribution of the random variable X̄ is (approximately)
normal, X̄ can be transformed into the standard normal variable Z, where
Z = (X̄ − μ) / (σ/√n)
The distribution of X̄ follows the t-distribution when the population is normal and n ≤ 30. Diagram (a) shows
the normal distribution and diagram (b) shows the t-distribution.
The square root of the variance of the sampling distribution is called the standard error of the mean, or
simply the standard error. The standard error (standard deviation) of X̄ is related to the standard deviation σ
of the population through the relations:
S.E.(X̄) = σ_X̄ = σ/√n. This is true when the population is infinite (N is very large) or when the sampling
is with replacement.
S.E.(X̄) = σ_X̄ = (σ/√n) · √((N − n)/(N − 1)). This is true when sampling is without replacement from a
finite population.
The above two equations between σ_X̄ and σ are true for small as well as large sample sizes.
The Central Limit Theorem: Even if the population is not normal, sample means from the population
will be approximately normal as long as the sample size is large enough.
Example 1:
Assume there is a population of size N = 4 and that the random variable X is the age of individuals, with
values of X: 18, 20, 22, 24 (years). Consider all possible samples of size n = 2 drawn with replacement,
and construct the sampling distribution of the sample mean.
μ = ΣXᵢ/N = 84/4 = 21,  σ = √(Σ(Xᵢ − μ)²/N) = √5 ≈ 2.236
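The 4² = 16 equally likely samples can be enumerated directly, confirming that the mean of the sample means equals μ and that the standard error equals σ/√n. A sketch:

```python
from itertools import product
from math import sqrt
from statistics import pstdev

ages = [18, 20, 22, 24]

# Population mean and standard deviation
mu = sum(ages) / len(ages)   # 21
sigma = pstdev(ages)         # sqrt(5) ~ 2.236

# All 16 samples of size n = 2 drawn with replacement
sample_means = [sum(s) / 2 for s in product(ages, repeat=2)]
mean_of_means = sum(sample_means) / len(sample_means)   # equals mu
se = pstdev(sample_means)                               # equals sigma / sqrt(2)
```
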
Example 2:
Draw all possible samples of size 2 without replacement from a population consisting of 3, 6, 9, 12, 15.
Form the sample distribution of sample means and verify the results:
The sampling distribution of the sample mean (X̄) and its mean and variance are:
E(X̄) = ΣX̄·P(X̄) = 90/10 = 9
Var(X̄) = ΣX̄²·P(X̄) − [ΣX̄·P(X̄)]² = 87.75 − 81 = 6.75
For the population: μ = ΣX/N = 45/5 = 9 and σ² = ΣX²/N − μ² = 495/5 − 9² = 99 − 81 = 18
Verification:
E(X̄) = μ = 9, and Var(X̄) = (σ²/n)·(N − n)/(N − 1) = (18/2)·(3/4) = 6.75
number of independent random effects and hence are approximately normally distributed by the central
limit theorem.
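Example 2 above can be checked the same way by enumerating the C(5, 2) = 10 equally likely samples drawn without replacement, which also verifies the finite population correction factor. A sketch:

```python
from itertools import combinations

pop = [3, 6, 9, 12, 15]
N, n = len(pop), 2

mu = sum(pop) / N                             # 9
sigma2 = sum((x - mu) ** 2 for x in pop) / N  # 18

# All C(5, 2) = 10 equally likely samples without replacement
means = [sum(s) / n for s in combinations(pop, n)]
e_xbar = sum(means) / len(means)                               # 9
var_xbar = sum((m - e_xbar) ** 2 for m in means) / len(means)  # 6.75

# Finite population correction: Var(Xbar) = (sigma2 / n) * (N - n) / (N - 1)
fpc_check = (sigma2 / n) * (N - n) / (N - 1)   # also 6.75
```
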
2.2.3 Point and Interval Estimation of Population Mean
Point Estimation
Another term for a statistic is point estimate, since we are estimating the parameter value. A point
estimate is a single value (or point) used to approximate a population parameter. For instance, the sum of
the observations divided by n is the point estimator used to compute the estimate of the population mean
μ; that is, X̄ = Σxᵢ/n is a point estimator of the population mean. A point estimate of the population mean μ
is the sample mean x̄, which varies from sample to sample. A drawback of the point estimate is that it fails
to make a probability statement about how close the estimate X̄ is to the population parameter μ.
Confidence Interval Estimation of the Population Mean
Although X̄ possesses nearly all the qualities of a good estimator, because of sampling error, we know
that it's not likely that our sample statistic will be equal to the population parameter, but instead will fall
into an interval of values. We will have to be satisfied knowing that the statistic is "close to" the
parameter. That leads to the obvious question, what is "close"?
We can phrase the latter question differently: How confident can we be that the value of the statistic falls
within a certain "distance" of the parameter? Or, what is the probability that the parameter's value is
within a certain range of the statistic's value? This range is the confidence interval.
A confidence interval allows us to estimate the unknown parameter, and provides a margin for error
indicating how good our estimate is. The confidence interval method gives an idea of the actual numerical
value of a parameter by specifying a range of possible values for the parameter and the degree of our
confidence that the unknown population parameter lies within that range.
The confidence level is the probability that the value of the parameter falls within the range specified by
the confidence interval surrounding the statistic. There are different cases to be considered to construct
confidence intervals.
Case 1: If the population is normal with known variance
Recall the Central Limit Theorem, which applies to the sampling distribution of the mean of a sample.
Consider samples of size n drawn from a population whose mean is μ and standard deviation is σ, with
replacement and order important. The population can have any frequency distribution.
The sampling distribution of X̄ will have a mean μ_X̄ = μ and a standard deviation σ_X̄ = σ/√n, and
approaches a normal distribution as n gets large. This allows us to use the normal distribution curve for
computing confidence intervals.
If X ~ ( , ), ℎ ~ , = ~ (0, 1)
√
=> = ±Z = ± , ℎ .
√
=> =Z
√
For the interval estimator to be good the error E should be small.
How can E be made small (how do we minimize E)?
· By making n large
· By having small variability (small σ)
· By taking Z small
- The smaller the interval, the more precise the estimate.
- Let α be the probability that the parameter lies outside the interval.
- To obtain the value of Z, we attach this to a probability statement. That is, there is an area of
size 1 − α such that
X̄ − Z(α/2)·σ/√n < μ < X̄ + Z(α/2)·σ/√n
The (1 − α)·100% confidence interval estimator for μ is the random interval
(X̄ − Z(α/2)·σ/√n, X̄ + Z(α/2)·σ/√n)
The probability is (1 − α) that the above random interval includes the true value of μ; a proportion (1 − α)
of all random samples will produce an interval that contains μ (the (1 − α) confidence interval estimate for μ).
Note that a computed confidence interval estimate contains no random quantities at all. The statement is either
absolutely certain to be true or absolutely certain to be false (depending on the values of μ, σ, x̄, n and α).
Case 2: If the sample size is large and the variance is unknown
Usually σ is not known; in that case we estimate σ by its point estimator S.
=> (X̄ − Z(α/2)·S/√n, X̄ + Z(α/2)·S/√n) is a (1 − α)100% confidence interval for μ.
Here are the z values corresponding to the most commonly used confidence levels.
100(1−α)%   α      α/2     Z(α/2)
90          0.10   0.05    1.645
95          0.05   0.025   1.96
99          0.01   0.005   2.58
Case 3: If the sample size is small and the population variance σ² is not known.
Sampling where σ is unknown: in this case we cannot assume that our sampling distribution follows a
normal distribution. Now we must use the t-distribution (sometimes called Student's t-distribution) to
make probability statements.
t-distribution – a family of distributions indexed by degrees of freedom. Each distribution with its
degrees of freedom has its own features. As the degrees of freedom (d.f.) go up, the distributions get closer
and closer to the standard normal, because the variability is reduced. We read the table the same way as
the standard normal, with the only difference being that we now have df, which is (n − 1).
t = (X̄ − μ)/(S/√n) has a t distribution with n − 1 degrees of freedom; df = n − 1 is the number of observations that are free to vary.
The only difference from before is that we don’t use our population standard deviation σ because it is not
known. If one thinks about it, if we don’t know the mean, it would be impossible to know the standard
deviation since the mean is required to get this value.
Two-tailed confidence intervals for μ are given by:
=> (X̄ − t(α/2)·S/√n, X̄ + t(α/2)·S/√n) is a 100(1−α)% confidence interval for μ.
The unit of measurement of the confidence interval is the standard error. This is just the standard deviation
of the sampling distribution of the statistic.
Definition: The confidence coefficient is the proportion of times that a confidence interval encloses the
true value of the population parameter if the confidence interval procedure is used repeatedly a very large
number of times.
Example 1:
From a normal sample of size 25 a mean of 32 was found. Given that the population standard deviation is
4.2, find
a) A 95% confidence interval for the population mean.
b) A 99% confidence interval for the population mean.
Solution: x̄ = 32, σ = 4.2, 1 − α = 0.95 => α = 0.05, α/2 = 0.025 => Z(0.025) = 1.96
=> The confidence interval is x̄ ± Z(α/2)·σ/√n
a) = 32 ± 1.96 × 4.2/√25
= 32 ± 1.65
= (30.35, 33.65)
b) x̄ = 32, σ = 4.2, 1 − α = 0.99 => α = 0.01, α/2 = 0.005 => Z(0.005) = 2.58
=> The confidence interval is x̄ ± Z(α/2)·σ/√n
= 32 ± 2.58 × 4.2/√25
= 32 ± 2.17
= (29.83, 34.17)
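The arithmetic in Example 1 can be checked with a short script. This is a minimal sketch: the helper name `z_interval` is ours, and the z values are the table values used above.

```python
import math

def z_interval(xbar, sigma, n, z):
    """Two-sided confidence interval for the mean when the population sigma is known."""
    e = z * sigma / math.sqrt(n)   # margin of error E = z * sigma / sqrt(n)
    return (xbar - e, xbar + e)

# Example 1 from the notes: n = 25, x-bar = 32, sigma = 4.2
lo95, hi95 = z_interval(32, 4.2, 25, 1.96)   # 95% CI
lo99, hi99 = z_interval(32, 4.2, 25, 2.58)   # 99% CI
print(round(lo95, 2), round(hi95, 2))  # 30.35 33.65
print(round(lo99, 2), round(hi99, 2))  # 29.83 34.17
```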
Example 2:
A drug company is testing a new drug which is supposed to reduce blood pressure. From the six people
who are used as subjects, it is found that the average drop in blood pressure is 2.28 points, with a standard
deviation of .95 points. What is the 95% confidence interval for the mean change in pressure?
Solution: Since n = 6 is small and σ is unknown, use the t distribution with df = n − 1 = 5: t(0.025, 5) = 2.571.
x̄ ± t(α/2)·S/√n = 2.28 ± 2.571 × 0.95/√6 = 2.28 ± 1.00 = (1.28, 3.28).
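Example 2 calls for a t interval (n = 6 is small and σ is unknown). A minimal numeric sketch follows; the helper name `t_interval` is ours, and the critical value t(0.025, df = 5) = 2.571 is supplied as a constant taken from a t table.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """CI for the mean with s estimated from a small sample; t_crit comes from a t table."""
    e = t_crit * s / math.sqrt(n)  # margin of error E = t * s / sqrt(n)
    return (xbar - e, xbar + e)

# n = 6, x-bar = 2.28, s = 0.95; t(0.025, df = 5) = 2.571 (table value)
lo, hi = t_interval(2.28, 0.95, 6, 2.571)
print(round(lo, 2), round(hi, 2))  # 1.28 3.28
```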
Confidence interval for a single population proportion
The sample proportion satisfies 0 ≤ p̂ ≤ 1.
The number of successes X = np̂ has a binomial distribution, but p̂ can be approximated by a normal distribution when np(1 − p) > 5.
Properties of the sample proportion
Construction of the sampling distribution of the sample proportion is done in a manner similar to that of
the mean and the difference between two means. When the sample size is large, the distribution of the
sample proportion is approximately normally distributed because of the central limit theorem.
Mean and variance
The mean of the distribution of p̂ will be equal to the true population proportion, E(p̂) = P, and the
variance of the distribution will be equal to Var(p̂) = P(1 − P)/n (where P = population proportion).
V(p̂) = V(X/n) = (1/n²)·V(X) = P(1 − P)/n
Therefore, for sufficiently large n, np and nq (namely, np > 10 and nq > 10), p̂ ~ N(P, P(1 − P)/n)
Compare this with X̄ ~ N(μ, σ²/n), for which the corresponding confidence interval has endpoints
x̄ ± Z(α/2)·(σ/√n); the endpoints for p are p̂ ± Z(α/2)·√(p̂(1 − p̂)/n).
Example 1: From a random sample of one thousand silicon wafers, 750 pass a quality control test. Find a
99% confidence interval estimate for p (the true proportion of wafers in the population that are good).
n = 1000 and x = 750
=> p̂ = x/n = 750/1000 = 0.75
=> q̂ = 1 − p̂ = 0.25
α/2 = 0.005
Z(0.005) ≈ 2.576
Endpoints of the CI: p̂ ± Z(α/2)·√(p̂q̂/n) = 0.75 ± 2.576·√(0.75 × 0.25/1000) = 0.75 ± 0.03527.
Therefore the 99% confidence interval estimate for p is 71.5% ≤ p ≤ 78.5% correct to three significant
figures.
A more precise version of the confidence interval yields a very similar result here; the computation is omitted.
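The silicon-wafer interval can be reproduced with a short script. A minimal sketch; the helper name `prop_interval` is ours, and z = 2.576 is the table value used above.

```python
import math

def prop_interval(x, n, z):
    """Large-sample CI for a proportion: p-hat +/- z * sqrt(p-hat(1 - p-hat)/n)."""
    p_hat = x / n
    e = z * math.sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return (p_hat - e, p_hat + e)

# 750 passes out of 1000 wafers, 99% confidence (z = 2.576)
lo, hi = prop_interval(750, 1000, 2.576)
print(round(lo, 3), round(hi, 3))  # 0.715 0.785
```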
Now we want to extend our statements about a population parameter: we use our knowledge of sampling
and sampling distributions to decide whether to reject, or fail to reject, a claim that a population has
certain characteristics.
Hypothesis Testing: This is another way of making inference about a population parameter, where the
investigator has a prior notion about the value of the parameter. For instance, we may think that the average
income level in a certain community is around $45,000; if we take an appropriate random sample and
find that the income is much lower, we now have a basis to declare that notion false (with a certain level of
confidence → think of and recall confidence intervals).
Null hypothesis: It is the hypothesis of no difference, stated about the population parameter and assumed true until the evidence indicates otherwise. Usually denoted by H0.
Alternative hypothesis: It is the hypothesis that is adopted when the null hypothesis has to be rejected.
It is the hypothesis of difference.
Usually denoted by H1 or Ha.
Types and size of errors
No matter which hypothesis represents the claim, always begin the hypothesis test assuming that the
equality condition in the null hypothesis is true. (Note: This may or may not be the claim)
At the end of the test, one of two decisions will be made:
reject the null hypothesis
fail to reject the null hypothesis
Because the decision is based on a sample, there is the possibility of making the wrong decision.
Testing hypotheses is based on sample data, which may involve sampling and non-sampling errors.
The following table gives a summary of the possible results of any hypothesis test:

Decision             H0 is true          H0 is false
Reject H0            Type I error (α)    Correct decision
Fail to reject H0    Correct decision    Type II error (β)
In practice we set α at some value and design a test that minimizes β. This is because a type I error is
often considered to be more serious, and therefore more important to avoid, than a type II error.
Since rejecting a null hypothesis carries a chance of committing a type I error, we make α small by
selecting an appropriately small significance level.
Generally, we do not control β, even though it is generally greater than α. However, when failing to
reject a null hypothesis, the risk of error is unknown.
Example: The USDA limit for salmonella contamination for chicken is 20%. A meat inspector reports
that the chicken produced by a company exceeds the USDA limit. You perform a hypothesis test to
determine whether the meat inspector’s claim is true. When will a type I or type II error occur? Which is
more serious?
Solution:
Let p represent the proportion of chicken that is contaminated.
Hypotheses: H0: p ≤ 0.2 versus Ha: p > 0.2
Chicken meets USDA limits: H0: p ≤ 0.20; chicken exceeds USDA limits: Ha: p > 0.20.
A type I error is rejecting H0 when it is true.
The actual proportion of contaminated chicken is less than or equal to 0.2, but you decide to reject H0.
A type II error is failing to reject H0 when it is false.
The actual proportion of contaminated chicken is greater than 0.2, but you do not reject H0.
General steps in hypothesis testing:
1. The first step in hypothesis testing is to specify the null hypothesis (H0) and the alternative hypothesis
(H1).
2. The next step is to select a significance level, α,
Level of significance is your maximum allowable probability of making a type I error. By setting the
level of significance at a small value, you are saying that you want the probability of rejecting a true
null hypothesis to be small.
3. Identify the sampling distribution of the estimator.
4. The fourth step is to calculate a statistic analogous to the parameter specified by the null hypothesis.
5. Identify the critical region (rejection region). It will tell us to reject the null hypothesis if the test
statistic falls in the rejection area, and to accept it if it falls in the acceptance region.
To use a rejection region to conduct a hypothesis test, calculate the standardized test statistic, z. If the
standardized test statistic
Is in the rejection region, then reject H0.
Is not in the rejection region, then fail to reject H0.
6. Make the decision.
Decision             Claim is H0                               Claim is H1
Reject H0            There is enough evidence to reject        There is enough evidence to support
                     the claim                                 the claim
Fail to reject H0    There is not enough evidence to reject    There is not enough evidence to support
                     the claim                                 the claim
Write a statement to interpret the decision in the context of the original claim. If we fail to reject H0 we
will say that the data do not provide sufficient evidence to cause rejection.
If H0 is rejected we say that the data are not compatible with H0 and support the alternative hypothesis (Ha).
7. Conclusion or Summarization of the result.
Where Z_cal = (X̄ − μ0)/(σ/√n)
Case 2: When sampling is from a normal distribution with unknown σ² and small sample size
The relevant test statistic is t_cal = (X̄ − μ0)/(S/√n) ~ t with n − 1 degrees of freedom.
After specifying α we have the following regions on the Student t-distribution corresponding to the
above three hypotheses.
Summary Table for Decision Rule
Where t_cal = (X̄ − μ0)/(S/√n)
Case 3: When sampling is from a non-normally distributed population or a population whose
functional form is unknown.
If the sample size is large one can perform a test of hypothesis about the mean by using:
Z_cal = (X̄ − μ0)/(σ/√n), if σ² is known
Z_cal = (X̄ − μ0)/(S/√n), if σ² is unknown
Step 4: make the decision. Since the test value, +1.59, is less than the critical value, +1.65, and is not in
the critical region, the decision is to not reject the null hypothesis. This test is summarized in the following
figure:
Step 5: summarize the result. There is not enough evidence to support the claim that the mean time is
greater than 29 days.
Example 2: An industrial company claims that the mean pH level of the water in a nearby river is 6.8. You
randomly select 19 water samples and measure the pH of each. The sample mean and standard deviation
are 6.7 and 0.24, respectively. Is there enough evidence to reject the company’s claim at α = 0.05?
Assume the population is normally distributed.
Solution:
H0: μ = 6.8 (claim) versus Ha: μ ≠ 6.8
α = 0.05, df = 19 − 1 = 18, t(α/2)(18) = t(0.025)(18) = ±2.101
Test statistic: t = (x̄ − μ0)/(S/√n) = (6.7 − 6.8)/(0.24/√19) = −1.816
Rejection region: |t| > 2.101. Since |−1.816| < 2.101, the test statistic does not fall in the rejection
region: fail to reject H0. At the 5% level of significance there is not enough evidence to reject the
company's claim.
Example 4: The mean life time of a sample of 16 fluorescent light bulbs produced by a company is
computed to be 1570 hours. The population standard deviation is 120 hours. Suppose the hypothesized
value for the population mean is 1600 hours. Can we conclude that the life time of light bulbs is
decreasing? (Use α = 0.05 and assume the normality of the population)
Solution: Let μ be the population mean; μ0 = 1600
Step 1: Identify the appropriate hypothesis
H0: μ = 1600 versus Ha: μ < 1600
Step 2: Select the level of significance, α = 0.05 (given), −Z(0.05) = −1.645
Step 3: Select an appropriate test statistic
The Z-statistic is appropriate because the population variance is known.
Step 4: Identify the critical region.
The critical region is Z_cal < −Z(0.05) = −1.645 => (−1.645, ∞) is the acceptance region.
Step 5: Computation: Z_cal = (x̄ − μ0)/(σ/√n) = (1570 − 1600)/(120/√16) = −30/30 = −1
Step 6: Decision
Accept H0, since Z_cal = −1 is in the acceptance region.
Step 7: Conclusion
At 5% level of significance, we have no evidence to say that the life time of light bulbs is
decreasing, based on the given sample data.
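The z statistic in Example 4 can be computed with a few lines. A minimal sketch; the helper name `z_test` is ours, and −1.645 is the table critical value used above.

```python
import math

def z_test(xbar, mu0, sigma, n):
    """One-sample z statistic: (x-bar - mu0)/(sigma/sqrt(n))."""
    return (xbar - mu0) / (sigma / math.sqrt(n))

# Example 4: x-bar = 1570, mu0 = 1600, sigma = 120, n = 16; left-tailed test
z = z_test(1570, 1600, 120, 16)
print(round(z, 2), z < -1.645)  # -1.0 False -> fail to reject H0
```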
Example 5: Suppose it is known that the average cholesterol level in children is 175 mg/dl. A group of
men who have died from heart disease within the past year are identified, and the cholesterol levels of
their offspring are measured. Suppose the mean cholesterol level of 10 children whose fathers died from
heart disease is 200mg/dl, with standard deviation 50mg/dl. Test the hypothesis that the mean cholesterol
level is higher in this group than the general population. Use the 0.05 level of significance.
Solution: Let μ be the population mean; μ0 = 175
Step 1: Identify the appropriate hypothesis: H0: μ = 175 versus Ha: μ > 175 (claim)
Step 2: Select the level of significance, α = 0.05 (given)
Step 3: Select an appropriate test statistic
The t-statistic is appropriate because the population variance is not known and the sample size is small.
Step 4: Identify the critical region.
Here we have one critical region since we have a one-tailed hypothesis.
The critical region is t_cal > t(0.05)(9) = 1.833
=> (−∞, 1.833) is the acceptance region.
Step 5: Computation: t_cal = (x̄ − μ0)/(S/√n) = (200 − 175)/(50/√10) = 1.58
Step 6: Decision
Accept H0, since t_cal = 1.58 is in the acceptance region.
Step 7: Conclusion
At 5% level of significance, we have no evidence to say that the mean cholesterol level is higher in
children whose fathers have died from heart disease within the past year than the general population.
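The t statistic in Example 5 can be verified the same way. A minimal sketch; the helper name `t_test` is ours, and 1.833 is the table value t(0.05, 9) used above.

```python
import math

def t_test(xbar, mu0, s, n):
    """One-sample t statistic: (x-bar - mu0)/(s/sqrt(n)), with df = n - 1."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Example 5: x-bar = 200, mu0 = 175, s = 50, n = 10; right-tailed test
t = t_test(200, 175, 50, 10)
print(round(t, 2), t > 1.833)  # 1.58 False -> fail to reject H0
```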
Example 6: It is hoped that a newly developed pain reliever will more quickly produce perceptible
reduction in pain to patients after minor surgeries than a standard pain reliever. The standard pain reliever
is known to bring relief in an average of 3.5 minutes. To test whether the new pain reliever works more
quickly than the standard one, 50 patients with minor surgeries were given the new pain reliever and their
times to relief were recorded. The experiment yielded a sample mean of 3.1 minutes and a sample
standard deviation of 1.5 minutes. Is there sufficient evidence in the sample to indicate, at the 5% level of
significance, that the newly developed pain reliever does deliver perceptible relief more quickly? (Use α =
0.05 and assume the normality of the population)
Solution: Let μ be the population mean; μ0 = 3.5
Step 1: Identify the appropriate hypothesis: H0: μ = 3.5 versus Ha: μ < 3.5 (claim)
Step 2: Select the level of significance, α = 0.05 (given)
Step 3: Select an appropriate test statistic
The Z-statistic is appropriate because the sample size is large (n = 50), so the sample standard deviation
may be used in place of σ.
Step 4: Identify the critical region.
The critical region is Z_cal < −Z(0.05) = −1.645
=> (−1.645, ∞) is the acceptance region.
Step 5: Computation: x̄ = 3.1, S = 1.5
=> Z_cal = (x̄ − μ0)/(S/√n) = (3.1 − 3.5)/(1.5/√50) = −1.886
Step 6: Decision, Reject H0, since the test statistic falls in the rejection region, the decision is to
reject H0.
Step 7: Conclusion
The data provide sufficient evidence, at the 5% level of significance, to conclude that the average time
until patients experience perceptible relief from pain using the new pain reliever is smaller than the
average time for the standard pain reliever.
Exercise: It is known from a pharmacological experiment that rats fed a particular diet over a certain
period gain an average of 40 gms in weight. A new diet was tried on a sample of 20 rats, yielding a weight
gain of 43 gms with variance 7 gms². Test the hypothesis that the new diet is an improvement, assuming
normality.
a) State the appropriate hypothesis
b) What is the appropriate test statistic? Why?
c) Identify the critical region(s)
d) On the basis of the given information test the hypothesis and make conclusion.
Testing Hypothesis Using the p-Value Method: in this case we use the p-value instead of the test-statistic
method. It is almost identical to the above method, but instead of comparing a test statistic to some critical
value we simply compare the p-value of the test statistic to α.
P-value – the probability, assuming the null hypothesis is true, that the test statistic would take on a value
as extreme as, or more extreme than, that actually observed. For a one-sided test this is the one-tail area
beyond the test statistic; for a two-sided test it is twice that area. It is the smallest value of α for which H0
can be rejected, so it gives a more precise statement about the probability of rejecting H0 when it is true
than the alpha level does; instead of merely saying the test statistic is significant or not, we report the
exact probability of rejecting H0 when it is true.
The p-value depends on the nature of the test.
The p-value method and the test-statistic method should give the same results; if they do not, a mistake
has been made. If we get a p-value that is smaller than α, we say that the data are significant at level α.
Decision rule when using a p-value
If p-value ≤ α, reject the null hypothesis.
If p-value > α, do not reject the null hypothesis.
Example: a researcher claims that the average wind speed in a certain city is 8 miles per hour. A sample of
32 days has an average wind speed of 8.2 miles per hour. The standard deviation of the population is 0.6
mile per hour. At α =0.05, is there enough evidence to reject the claim? Use the p-value method.
Solution:
Step 1: state the hypothesis and identify the claim
H0: μ = 8 (claim) versus Ha: μ ≠ 8
Step 2: Compute the test value:
Z = (x̄ − μ0)/(σ/√n) = (8.2 − 8)/(0.6/√32) = 1.89
Step 3: Find the p-value. Using the standard normal table, find the area corresponding to z = 1.89; it is
0.9706. Subtract this value from 1.0000:
1.0000 − 0.9706 = 0.0294. Since this is a two-tailed test, the area of 0.0294 must be doubled to get the
p-value, i.e. 2(0.0294) = 0.0588.
Step 4: make the decision. The decision is to not reject the null hypothesis, since the p-value is greater
than 0.05. See the following figure:
Step 5: summarize the result, there is not enough evidence to reject the claim that the average wind speed
is 8 miles per hour.
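The table lookup in Step 3 can be replaced by the standard normal CDF, which is expressible through the error function. A minimal sketch of the wind-speed computation; the helper names are ours.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def two_sided_p(z):
    """Two-tailed p-value for an observed z statistic."""
    return 2 * (1 - normal_cdf(abs(z)))

# Wind-speed example: x-bar = 8.2, mu0 = 8, sigma = 0.6, n = 32
z = (8.2 - 8) / (0.6 / math.sqrt(32))
p = two_sided_p(z)
print(round(z, 2))  # z is about 1.89; p is about 0.059 > 0.05 -> fail to reject H0
```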
Nature of the test
Three types of hypothesis tests
left-tailed test
right-tailed test
two-tailed test
The type of test depends on the region of the sampling distribution that favors a rejection of . This
region is indicated by the alternative hypothesis. In other words, we are testing the alternative hypothesis.
Left-tailed test: H0: μ = μ0 versus Ha: μ < μ0
Right-tailed test: H0: μ = μ0 versus Ha: μ > μ0
Two-tailed test: H0: μ = μ0 versus Ha: μ ≠ μ0
Hypothesis Testing About a Population Proportion
A hypothesis test involving a population proportion can be considered a binomial experiment when
there are only two outcomes and the probability of a success does not change from trial to trial.
Since a normal distribution can be used to approximate the binomial distribution when np ≥ 5 and nq ≥ 5,
the standard normal distribution can be used to test hypotheses for proportions. The estimate of p from a
sample of size n is the sample proportion, ̂ = x/n, where x is the number of successes in the sample.
Using the normal approximation, the appropriate statistic to perform inferences on p is
Z = (p̂ − p0) / √(p0(1 − p0)/n)
Under the conditions for binomial distributions, this statistic has the standard normal distribution,
assuming sufficient sample size for the approximation to be valid.
When you use proportions, you state your null and alternative hypotheses in terms of the population
parameter (p), using your sample statistic (p̂) as an estimator.
The null hypothesis will be a statement about the parameter that includes the equality.
Alpha (α) determines the size of the rejection region.
Test can be one or two-tailed, depending on how the alternative hypothesis is formulated. The most
important requirement to perform a hypothesis test for proportion is to assume that the distribution is
normal, and in this case, n has to be sufficiently large such that np > 5 and n(1-p) > 5 .
These are the steps for a hypothesis test:
1. Formulate the hypotheses (one-tailed or two-tailed):
H0: p = p0 versus Ha: p ≠ p0 (or a one-sided alternative), where p0 is the hypothesized value.
2. Compute the test statistic Z, using the α provided by the problem; Z is compared to the appropriate
critical value(s) from the normal distribution, or a p-value is calculated from the normal distribution.
3. State the decision rule:
If |Z| > Z(α/2), reject H0 (use the decision rule that best fits the alternative hypothesis).
Note that we do not use the t distribution here because the variance is not estimated as a sum of squares
divided by degrees of freedom. Of course, the use of the normal distribution is an approximation, and it is
generally recommended to be used only if np ≥ 5 and n(1 − p) ≥ 5.
Example 1: A research center claims that 25% of college graduates think a college degree is not worth the
cost. You decide to test this claim and ask a random sample of 200 college graduates whether they think a
college degree is not worth the cost. Of those surveyed, 21% reply yes. At α = 0.10 is there enough
evidence to reject the claim?
Solution: Verify that np ≥ 5 and nq ≥ 5: np = 200(0.25) = 50 and nq = 200(0.75) = 150
H0: p = 0.25 (claim) versus Ha: p ≠ 0.25, α = 0.10
Rejection region: |Z| > Z(0.05) = 1.645
Test statistic:
Z = (p̂ − p) / √(p(1 − p)/n) = (0.21 − 0.25) / √(0.25(1 − 0.25)/200) = −1.31
Decision: Fail to reject H0, since |−1.31| < 1.645.
At the 10% level of significance, there is not enough evidence to reject the claim that 25% of college
graduates think a college degree is not worth the cost.
Example 2: Suppose that the national smoking rate among men is 25% and we want to study the smoking
rate among men in Addis Ababa. Let p be the proportion of Addis Ababa men who smoke. Test the null
hypothesis that the smoking prevalence in Addis Ababa is the same as the national rate. Suppose that we
plan to take a sample of size n = 100 and in the sample, 15 are smokers.
Does this evidence indicate that the true proportion of smokers is significantly different from the national
rate? Test at significance level α = 0.05.
Solution: H0: p = 0.25 versus Ha: p ≠ 0.25
Compute Z = (p̂ − p0)/√(p0(1 − p0)/n) = (0.15 − 0.25)/√(0.25(1 − 0.25)/100) = −0.10/0.0433 = −2.3094
Z(α/2) = Z(0.025) = 1.96
|Z| = 2.3094 > Z(α/2) = 1.96, so reject H0.
Conclusion: At 5% level of significance this indicates that the proportion of smokers in Addis Ababa is
significantly different from the national rate.
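The smoking-rate test reduces to a few lines of arithmetic. A minimal sketch; the helper name `prop_z_test` is ours, and 1.96 is the two-tailed critical value used above.

```python
import math

def prop_z_test(p_hat, p0, n):
    """z statistic for a proportion: (p-hat - p0) / sqrt(p0(1 - p0)/n)."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Smoking example: 15 smokers out of n = 100, p0 = 0.25, two-tailed at alpha = 0.05
z = prop_z_test(0.15, 0.25, 100)
print(round(z, 4), abs(z) > 1.96)  # -2.3094 True -> reject H0
```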
Exercise 1: A telephone company representative estimates that 40% of its customers have call-waiting
service. To test this hypothesis, she selected a sample of 100 customers and found that 37% had call
waiting. At α = 0.01, is there enough evidence to reject the claim?
Exercise 2: Suppose it is claimed that in a very large batch of components, about 10% of items contain
some form of defect. It is proposed to check whether this proportion has increased, and this will be done
by drawing randomly a sample of 150 components. In the sample, 20 are defective. Does this evidence
indicate that the true proportion of defective components is significantly larger than 10%? Test at
significance level α = 0.05.
Sample Size Determination
The error between the estimator and the quantity it is supposed to estimate: (X̄ − μ)/(σ/√n) is a random
variable having approximately the standard normal distribution, so we can assert with probability 1 − α
that the inequality
−Z(α/2) ≤ (X̄ − μ)/(σ/√n) ≤ Z(α/2)
holds. The error |X̄ − μ| will be less than E = Z(α/2)·σ/√n with probability 1 − α.
Suppose that we want to use the mean of a large random sample to estimate the mean of a population, and
want to be able to assert with probability 1 − α that the error will be at most some prescribed quantity E.
As before, we get:
n = (Z(α/2)·σ/E)²
The formula for determining the sample size in the case of a proportion follows from the confidence interval
C.I. = p̂ ± Z(α/2)·√(p̂(1 − p̂)/n)
To construct a confidence interval about a proportion, you must use the maximum error of the estimate,
which is E = Z(α/2)·√(p̂(1 − p̂)/n)
The minimum sample size needed for an interval estimate of a population proportion is:
n = p̂(1 − p̂)·(Z(α/2)/E)²
Example 1:
The Dairy Farm Company wants to estimate the proportion of customers that will purchase its new
broccoli-flavored ice cream. The company wants to be 90% confident that they have estimated the
population proportion (p) to within .03. How many customers should they sample?
Solution:
The desired margin of error is E = 0.03. The company wants to be 90% confident, so Z(α/2) = 1.645; the
required sample size is:
n = p̂(1 − p̂)·(Z(α/2)/E)² = p̂(1 − p̂)·(1.645/0.03)²
Since the sample has not yet been taken, the sample proportion p̂ is still unknown.
We proceed using either one of the following two methods:
Method 1:
There is no knowledge about the value of p
Let p = 0.5. This results in the largest possible n needed for a 90% confidence interval of the form
p̂ ± 0.03.
If the true proportion does not equal 0.5, the actual margin of error will be narrower than 0.03 with the
n obtained by the formula:
n = (1.645/0.03)²(0.5)(0.5) = 751.67, round up to 752
Method 2:
There is some idea about the value of p (say p ~ 0.2)
Use the value of p to calculate the sample size
n = (1.645/0.03)²(0.2)(0.8) = 481.07, round up to 482
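Both methods can be sketched in a few lines. The helper name `n_for_proportion` is ours; rounding up is done with `math.ceil`, matching the "round up" rule used above.

```python
import math

def n_for_proportion(z, e, p_hat=0.5):
    """Minimum n for estimating p to within margin e: n = p-hat(1 - p-hat)(z/e)^2, rounded up.
    Defaulting p_hat to 0.5 gives the most conservative (largest) n."""
    return math.ceil(p_hat * (1 - p_hat) * (z / e) ** 2)

# Ice-cream example: 90% confidence (z = 1.645), margin E = 0.03
print(n_for_proportion(1.645, 0.03))       # 752  (no prior guess, p = 0.5)
print(n_for_proportion(1.645, 0.03, 0.2))  # 482  (prior guess p = 0.2)
```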
CHAPTER THREE
3. Inference about the Difference between Two Populations
Learning Objectives
To understand the logical framework for estimating the difference between the means of two distinct
populations and performing tests of hypotheses concerning those means.
To learn how to construct a confidence interval for the difference in the means of two distinct
populations using independent and paired samples.
To learn how to perform a test of hypotheses concerning the difference between the means of two
distinct populations using independent and paired samples.
To learn how to perform a test of hypotheses concerning the difference between the proportions of two
distinct populations using independent and paired samples.
Make decisions about sample size in comparing means and proportions
3. 1 Inference About the Difference Between Two Population Means
3.1.1 Introduction
Inferences concerning differences between two population parameters are considered comparative
studies. Comparative studies are designed to discover and evaluate differences between groups or between
treatments. In a comparative study, experiments are conducted to collect informative data, and conclusions
are drawn based on the experimental evidence.
In studies involving comparison of two groups there are two ways of taking the samples and conducting
the experiment:
i. Paired samples and
ii. Independent samples
Definition:
a. Two samples are said to be paired if each data point in the first sample is matched and related to a
unique data point in the second sample. Pairs of similar individuals (observations) are selected. In an
experiment, one treatment is applied to one member of each pair and another treatment is applied to the
other member. A common application occurs in self-pairing, where a single individual is measured on
two occasions. The aim of pairing is to make the comparison more accurate by having the members of
each pair as alike as possible except for the differences in treatment that the investigator deliberately
introduces.
b. Two samples are independent if the data points in one sample are unrelated to the data points in the
second sample. This case arises when we wish to compare two populations and have drawn a sample from
each quite independently. Independent samples are widely used when there is no suitable basis for
pairing.
Motivating example:
Consider two populations; the two population descriptions are as follows:
Figure 3.1 Comparing Two Population Means
The hypotheses are about μ1 − μ2, such as
H0: μ1 − μ2 = D0
H1: μ1 − μ2 ≠ D0
for a two-tailed test, where D0 is the hypothesized difference in the means. Often, D0 is zero.
For a one-tailed test with rejection on the left, the hypotheses will be
H0: μ1 − μ2 = D0
H1: μ1 − μ2 < D0
and for rejection on the right tail,
H0: μ1 − μ2 = D0
H1: μ1 − μ2 > D0
3.1.2 Sampling Distribution of the Difference Between Two Means
It often becomes important to compare two population means. Knowledge of the sampling distribution of
the difference between two means is useful in studies of this type. It is generally assumed that the two
populations are normally distributed.
Sampling distribution of X̄1 − X̄2
Plotting mean sample differences against frequency gives a normal distribution with mean equal to
μ1 − μ2, which is the difference between the two population means.
Variance
The variance of the distribution of the sample differences is equal to σ1²/n1 + σ2²/n2. Therefore, the
standard error of the difference between two means is √(σ1²/n1 + σ2²/n2).
For instance, we may be testing the difference between the average strengths of pins produced from two
different raw materials but using the same machinery and process. It is reasonable in this case to assume
equal variances.
Assumptions:
1. The samples from the two populations were drawn independently.
2. The population variances/standard deviations are equal.
3. The populations are both normally distributed.
From sampling theory, we note that
E[X̄1 − X̄2] = μ1 − μ2
By the Central Limit Theorem, as n1 and n2 both increase, X̄1 − X̄2 will approach the normal distribution.
If the common variance is unknown, it is estimated by the pooled estimator of the population variance
given by:
S_p² = ((n1 − 1)S1² + (n2 − 1)S2²) / (n1 + n2 − 2)
Decision rule: we shall reject H0 at the α level of significance if:
|Z| > Z(α/2) for H0: μ1 = μ2 versus H1: μ1 ≠ μ2
Confidence interval: A 100(1 − α)% confidence interval for the mean difference μ1 − μ2 is given by:
(X̄1 − X̄2) ± Z(α/2)·σ·√(1/n1 + 1/n2)
If σ is unknown, T = ((X̄1 − X̄2) − (μ1 − μ2)) / (S_p·√(1/n1 + 1/n2)) has a t-distribution with
n1 + n2 − 2 degrees of freedom, and the interval is
(X̄1 − X̄2) ± t(α/2, n1 + n2 − 2)·S_p·√(1/n1 + 1/n2)
( ) ( ) ( )( . ) ( )( . ) . .
Where:S = = = = 21.2165
( ) ( ) ( . . ) .
t= = = .
= 2.6563
. ∗
Critical region:
With α = 0.05 and df = 23, the critical value of t is 1.7139. We reject H0 if t > 1.7139.
Reject H0 because 2.6563 > 1.7139 (p-value = 0.014).
Conclusion: At a significance level of 5% there is sufficient evidence that smokers, in general, have
greater lung damage than do non-smokers; i.e., the mean lung damage of smokers is significantly higher
than the mean damage of non-smokers.
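The pooled two-sample t statistic can be sketched as below. The summary numbers here are hypothetical (the raw data of the smokers example are not reproduced in these notes), and the helper name `pooled_t` is ours.

```python
import math

def pooled_t(x1, x2, s1, s2, n1, n2):
    """Pooled two-sample t statistic (equal-variance assumption); returns (t, df)."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)  # pooled variance S_p^2
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))                      # standard error of X1bar - X2bar
    return (x1 - x2) / se, n1 + n2 - 2

# Hypothetical summary statistics for two independent groups
t, df = pooled_t(x1=85, x2=81, s1=5, s2=4, n1=10, n2=12)
print(round(t, 2), df)  # 2.09 20
```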
Case 2: Unequal variances but σ1² and σ2² are known
Assumptions:
1. The samples from the two populations were drawn independently.
2. The population variances/standard deviations are NOT equal.
3. The populations are both normally distributed.
When σ1² and σ2² are known, we can conduct a z-test if both n1 and n2 are more than 30 and the
two populations are normal. The z statistic for testing H0: μ1 = μ2 is given by the formula
z = ((X̄1 − X̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2)
Decision rule: we shall reject H0 at the α level of significance if:
|Z| > Z(α/2) for H0: μ1 = μ2 versus H1: μ1 ≠ μ2
Confidence interval: (X̄1 − X̄2) ± Z(α/2)·√(σ1²/n1 + σ2²/n2)
Solution:
Assumptions
o Two independent random samples
Z = ((X̄1 − X̄2) − (μ1 − μ2)) / σ(X̄1 − X̄2), where σ(X̄1 − X̄2) = √(S1²/n1 + S2²/n2)
S1² = (1/(n1 − 1))·Σ(X1i − X̄1)² and S2² = (1/(n2 − 1))·Σ(X2i − X̄2)²
Confidence interval: (X̄1 − X̄2) ± Z(α/2)·√(S1²/n1 + S2²/n2)
Example 1: These data were obtained in a study comparing persons with disabilities with persons without
disabilities. A scale known as the Barriers to Health Promotion Activities for Disabled Persons (BHADP)
Scale gave the data. We wish to know if we may conclude, at the 99% confidence level, that persons with
disabilities score higher than persons without disabilities.
Given
Disabled: x̄1 = 31.83, n1 = 132, s1 = 7.93
Non-disabled: x̄2 = 25.07, n2 = 137, s2 = 4.80, α = 0.01
Solution:
Assumptions
Independent random samples
large samples
unknown variance
Hypothesis: H₀: μ₁ = μ₂ versus H₁: μ₁ > μ₂
Test statistic: Because of the large samples, the central limit theorem permits calculation of the z score as
opposed to using t. The z score is calculated using the given sample standard deviations. If the
assumptions are correct and H is true, the test statistic is approximately normally distributed
z = [(X̄₁ − X̄₂) − (μ₁ − μ₂)] / √(s₁²/n₁ + s₂²/n₂) = [(31.83 − 25.07) − 0] / √((7.93)²/132 + (4.80)²/137) = 6.76 / 0.8029 = 8.42
Critical region:
With α = 0.01 and a one-tailed test, the critical value of z is 2.33. We reject H₀ if z > 2.33.
Decision: Reject H₀ because 8.42 > 2.33.
Conclusion: At 99% level of confidence we conclude that the data support the claim that persons with
disabilities score higher than persons without disabilities.
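The arithmetic in this example is easy to verify with a few lines of Python, using the summary statistics given above:

```python
from math import sqrt

# BHADP example: large samples, so z uses the sample variances.
x1, n1, s1 = 31.83, 132, 7.93   # persons with disabilities
x2, n2, s2 = 25.07, 137, 4.80   # persons without disabilities

z = (x1 - x2) / sqrt(s1**2 / n1 + s2**2 / n2)
print(round(z, 2))  # 8.42, far beyond the critical value 2.33
```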
Case 4: Unequal variances with no information about σ₁² and σ₂², and small sample sizes
There are many situations in which the comparison of means has to be made based on small samples from populations with different variances. Among the common situations in which we cannot assume σ₁² = σ₂² are:
i. When the samples come from different types of population as in comparisons made from survey data.
ii. When computing confidence limits in cases in which the population means differ widely, the common result that σ changes (though slowly) as μ changes will make us hesitant to assume σ₁² = σ₂².
iii. When one treatment is erratic in its performance sometimes giving high sometimes low responses. In
populations that are markedly skew, the relationship between μ and σ is relatively strong.
The test statistic is
t = (X̄₁ − X̄₂) / √(S₁²/n₁ + S₂²/n₂)
with approximate (Satterthwaite) degrees of freedom
v = (S₁²/n₁ + S₂²/n₂)² / [ (S₁²/n₁)²/(n₁ − 1) + (S₂²/n₂)²/(n₂ − 1) ]
Decision rule: we shall reject H₀ at the α level of significance if:
|t| > t(α/2, v) for H₀: μ₁ = μ₂ versus H₁: μ₁ ≠ μ₂
The (1 − α)100% confidence interval for μ₁ − μ₂ is
(X̄₁ − X̄₂) ± t(α/2, v) ∗ √(S₁²/n₁ + S₂²/n₂)
Example 1: We wish to compare the mean gestational age (in weeks) of babies born to women with preeclampsia during pregnancy vs. those who had normal pregnancies. Is the mean gestational age for babies born to preeclamptic mothers less than the mean gestational age for babies born to mothers with normal pregnancies at the 95% confidence level?
Data:
Preeclampsia: 38, 32, 42, 30, 38, 35, 32, 38, 39, 29, 29, 32
Normal: 40, 41, 38, 40, 40, 39, 39, 41, 41, 40, 40, 40
Solution:
Preeclampsia: x̄₁ = 34.5, n₁ = 12, s₁² = 19.36
Normal: x̄₂ = 39.92, n₂ = 12, s₂² = 0.81, α = 0.05
Hypothesis: H₀: μ₁ = μ₂ versus H₁: μ₁ < μ₂
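The remaining steps follow the Case 4 (Welch/Satterthwaite) procedure; a Python sketch on the given data:

```python
from statistics import mean, variance
from math import sqrt

pre  = [38, 32, 42, 30, 38, 35, 32, 38, 39, 29, 29, 32]  # preeclampsia
norm = [40, 41, 38, 40, 40, 39, 39, 41, 41, 40, 40, 40]  # normal pregnancies

n1, n2 = len(pre), len(norm)
v1, v2 = variance(pre), variance(norm)   # s1^2 = 19.36, s2^2 = 0.81
se_sq = v1 / n1 + v2 / n2

t = (mean(pre) - mean(norm)) / sqrt(se_sq)
# Satterthwaite approximate degrees of freedom
df = se_sq**2 / ((v1 / n1)**2 / (n1 - 1) + (v2 / n2)**2 / (n2 - 1))
print(round(t, 2), round(df))  # t is about -4.18 on about 12 df
```

Since t falls far below the lower-tail critical value of t for about 12 df at α = 0.05, the data support the claim that gestational age is lower for the preeclamptic group.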
4. Make the decision. Do not reject the null hypothesis, since -0.57 > -2.365.
5. Summarize the results. There is no significant difference between the average sizes of the two farms.
3.1.4 Comparison of Means in Paired Samples
At times it might be possible to pair the observations in the two samples and take the difference in each
pair of observations. Usually this happens when subjects are exposed to a treatment, measurements are
taken before and after the treatment, and these measurements are compared to test the effectiveness of the
treatment. In effect, this amounts to a single population test where the population is the set of all possible
differences in the measurements.
A paired difference test is more efficient than the other tests because, for the same sampling effort, it has a smaller chance of a Type II error (the Type I error rate stays fixed at α). The efficiency obtains because when measurements are made on the same subject before and after a treatment, effects of extraneous variables such as age, race and gender on the before/after difference are avoided. Thus the measured difference due to the treatment is more accurate and reliable. Hence, whenever a paired difference test is possible, one should settle for that rather than for other types of tests.
When using dependent samples each observation from population 1 has a one-to-one correspondence with
an observation from population 2. One of the most common cases where this arises is when we measure
the response on the same subjects before and after treatment.
This is commonly called a “pre-test/post-test” situation. However, sometimes we have pairs of subjects in
the two populations meaningfully matched on some prespecified criteria. For example, we might match
individuals who are the same race, gender, socio-economic status, height, weight, etc... to control for the
influence these characteristics might have on the response of interest. When this is done we say that we
are “controlling for the effects of race, gender, etc...”. By using matched-pairs of subjects we are in effect
removing the effect of potential confounding factors, thus giving us a clearer picture of the difference
between the two populations being studied.
When two samples are not independent and observations are taken in pairs the paired t-test is applicable.
In this case for each data point in one sample there is corresponding data point in the second sample.
Consider n paired sample points (x₁₁, x₂₁), (x₁₂, x₂₂), …, (x₁ₙ, x₂ₙ). Let the mean difference among pairs in the population be denoted by μ_d.
Example: Tumor size was originally measured using the linear distance across the tumor, but this was found to be very variable because of the irregular
shape of some tumors. A new method called the RECIST criteria traces the outside of the tumor. The
RECIST method was believed to give more consistent measures of the volume of the tumor. For a portion
of the study, a pair of doctors were shown the same set of tumor pictures. The volume of the tumor was
measured by two separate physicians under similar conditions.
Question of interest: Did the measurements from the two physicians significantly differ? If not, then there
would be no evidence that the volume measurements change based on physician.
Tumor 1 2 3 4 5 6 7 8 9 10
Dr.1 15.8 22.3 14.5 15.7 26.8 24.0 21.8 23.0 29.3 20.5
Dr.2 17.2 20.3 14.2 18.5 28.0 24.8 20.3 25.4 27.5 19.7
Solution:
We can measure the disagreement for each tumor by taking the difference dᵢ = x₁ᵢ − x₂ᵢ (Dr. 1 minus Dr. 2). Instead of having two samples, we can consider our dataset to be one sample of differences.
Tumor 1 2 3 4 5 6 7 8 9 10
Dr.1 15.8 22.3 14.5 15.7 26.8 24.0 21.8 23.0 29.3 20.5
Dr.2 17.2 20.3 14.2 18.5 28.0 24.8 20.3 25.4 27.5 19.7
Difference -1.4 2.0 0.3 -2.8 -1.2 -0.8 1.5 -2.4 1.8 0.8
3) Test statistic
t = d̄ / (s_d/√n) = −0.22 / (1.7441/√10) ≈ −0.399
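The statistic can be recomputed directly from the table of differences (a Python sketch):

```python
from statistics import mean, stdev
from math import sqrt

dr1 = [15.8, 22.3, 14.5, 15.7, 26.8, 24.0, 21.8, 23.0, 29.3, 20.5]
dr2 = [17.2, 20.3, 14.2, 18.5, 28.0, 24.8, 20.3, 25.4, 27.5, 19.7]

d = [a - b for a, b in zip(dr1, dr2)]   # per-tumor differences
n = len(d)
t = mean(d) / (stdev(d) / sqrt(n))      # paired t on n - 1 = 9 df
print(round(mean(d), 2), round(t, 3))   # d-bar = -0.22, t is about -0.399
```

Such a small |t| gives no evidence that the two physicians' volume measurements differ.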
t = d̄ / (s_d/√n)
Decision Rule: Since |t| > the critical t value, H₀ is rejected at the α = 5% level of significance.
Conclusion: There is strong evidence to conclude that vapor pressure has an effect on the nectar
secretion.
b) The 99% confidence interval for μ₁ − μ₂ (i.e., for μ_d) is given by: d̄ ± t(α/2, n − 1) ∗ s_d/√n
Solution:
For vitamin E to be effective, the “before weights” must be less than the “after weights”; i.e., the difference in the population means must be negative in order for the vitamin to be effective.
Thus the null and alternative hypotheses to be tested are:
Hypothesis: H₀: μ_B − μ_A = 0 (μ_d = 0) (Vitamin E does not increase strength, or vitamin E is not effective)
H₁: μ_B − μ_A < 0 (μ_d < 0) (Vitamin E increases strength, i.e., athletes gain more weight)
Athletes 1 2 3 4 5 6 7 8 Total
Before 210 230 182 205 262 253 219 216
After 219 236 179 204 270 250 222 216
d = B − A -9 -6 3 1 -8 3 -3 0 -19
d² 81 36 9 1 64 9 9 0 209
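From the column totals Σd = −19 and Σd² = 209, the paired t statistic can be computed as follows (a sketch; the critical value t(0.05, 7) = 1.895 is from the t table):

```python
from math import sqrt

# Totals from the table: sum of d and sum of d^2 for the 8 athletes
n, sum_d, sum_d2 = 8, -19, 209

d_bar = sum_d / n                               # -2.375
s_d = sqrt((sum_d2 - sum_d**2 / n) / (n - 1))   # sample SD of the differences
t = d_bar / (s_d / sqrt(n))
print(round(t, 2))  # about -1.39; compare with -t(0.05, 7) = -1.895
```

Since t does not fall below −1.895, H₀ is not rejected: the data do not show that vitamin E increases strength.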
After 111 110 107 101 121 115 122 118 123 105
a) Find the mean change in the measure after the medication.
b) Find the standard deviation of the change and the standard error of the mean.
c) Construct a 99% confidence interval for the true mean change.
d) Test for a significant difference between the measures before and after the medication at the 5%
level of significance.
3.2 Inference About the Difference Between Two Population Proportions
3.2.1 Introduction
We now consider the case where there are two binomial parameters of interest, say, p₁ and p₂, and we wish to draw inferences about these proportions. Often you want to analyze differences between two
groups in the proportion of items that are in a particular category. The sample statistics needed to analyze
these differences are the proportion of occurrences in group 1 and the proportion of occurrences in group
2. With a sufficient sample size in each group, the sampling distribution of the difference between the two
proportions approximately follows a normal distribution. Suppose we wish to compare the proportions of
two populations that have a specific characteristic, such as the proportion of men who are left-handed
compared to the proportion of women who are left-handed. Each population is divided into two groups,
the group of elements that have the characteristic of interest (for example, being left handed) and the
group of elements that do not. We arbitrarily label one population as Population 1 and the other as
Population 2, and we draw a random sample from Population 1 and, without reference to the first sample
we draw a sample from Population 2.
Our goal is to use the information in the samples to estimate the difference P₁ − P₂ in the two population
proportions and to make statistically valid inferences about it.
3.2.2 Sampling Distribution of the Difference Between Two Proportions
We assess the probability associated with a difference in proportions computed from samples drawn from
each of these populations.
Sampling distribution of P̂₁ − P̂₂.
The sampling distribution of the difference between two sample proportions is constructed in a manner
similar to the difference between two means. Independent random samples of size n₁ and n₂ are drawn from two populations of dichotomous variables, where the proportions of observations with the characteristic of interest in the two populations are P₁ and P₂, respectively.
The distribution of the difference between two sample proportions, P̂₁ − P̂₂, is approximately normal when both n₁ and n₂ are large.
The mean of the difference P̂₁ − P̂₂ is μ = P₁ − P₂
The variance of P̂₁ − P̂₂ is σ² = P₁(1 − P₁)/n₁ + P₂(1 − P₂)/n₂
The z score for the difference between two proportions is given by:
Z = [(P̂₁ − P̂₂) − (P₁ − P₂)] / √(P₁(1 − P₁)/n₁ + P₂(1 − P₂)/n₂), which has a standard normal distribution.
Therefore (P̂₁ − P̂₂) ~ N[μ = P₁ − P₂, σ² = P₁(1 − P₁)/n₁ + P₂(1 − P₂)/n₂] for large sample sizes.
Note that the expression for the variance of the difference contains the unknown parameters P₁ and P₂. In the single-population case, the null hypothesis value for the population parameter p was used in calculating the variance. If the two population proportions are hypothesized to be different, then we substitute the sample proportions p̂₁ and p̂₂ in their places.
If the two population proportions are hypothesized to be equal, then we substitute a pooled proportion p̂ in both places. This is analogous to the use of S₁² and S₂² separately or combining them into S_p² in the case of the t-test for comparing population means. Letting p̂₁ and p̂₂ be the sample proportions for samples 1 and
2, respectively, the estimate of the common proportion p is a weighted mean of the two sample proportions.
The pooled proportion is given by the formula
p̂ = (x₁ + x₂)/(n₁ + n₂) = (n₁p̂₁ + n₂p̂₂)/(n₁ + n₂)
In construction of a confidence interval for the difference in proportions, we cannot assume a common proportion; hence we use the individual estimates p̂₁ and p̂₂ in the variance estimate. The (1 − α)100% confidence interval on the difference P₁ − P₂ is:
(p̂₁ − p̂₂) ± Z(α/2) ∗ √(p̂₁(1 − p̂₁)/n₁ + p̂₂(1 − p̂₂)/n₂)
As in the one-population case the use of the t distribution is not appropriate since the variance is not
calculated as a sum of squares divided by degrees of freedom. However, samples must be reasonably large
in order to use the normal approximation.
Comparing two population proportions can be done using a z-test if the two samples are sufficiently large. Here a large sample means that n₁p̂₁, n₁(1 − p̂₁), n₂p̂₂ and n₂(1 − p̂₂) are all at least 5.
Procedures
Hypothesis
The null hypothesis is H₀: P₁ = P₂ and the alternative hypothesis is one of the following:
H₁: P₁ ≠ P₂ Two-tailed
H₁: P₁ < P₂ Left-tailed
H₁: P₁ > P₂ Right-tailed
Decide on the significance level, α.
The critical value(s) are
Use Z-table to find the critical value(s)
±Z(α/2) for two-tailed
−Z(α) for left-tailed
Z(α) for right-tailed
If the value of the test statistic falls in the rejection region, reject H0 otherwise, do not reject H0.
State the conclusion in words.
The p-value approach compares the p-value for the test statistic with the α level.
Example 1: 200 patients suffering from a certain disease were randomly divided into two groups. Of the
first group consisting of 120 patients, who received treatment A, 99 recovered within three days. Out of
the other 80, who were treated by treatment B, 62 recovered within 3 days. Can we conclude that
treatment A is more effective?
Solution: Let P₁ = the population proportion of treatment A and P₂ = that of treatment B.
Given that n₁ = 120, n₂ = 80, x₁ = 99, x₂ = 62, and n = n₁ + n₂ = 200. Thus,
p̂₁ = x₁/n₁ = 99/120 = 0.83
p̂₂ = x₂/n₂ = 62/80 = 0.78
p̂ = (x₁ + x₂)/(n₁ + n₂) = 161/200 = 0.81
q̂ = 1 − p̂ = 1 − 0.81 = 0.19
The test statistic is
Z = (p̂₁ − p̂₂) / √(p̂q̂(1/n₁ + 1/n₂)) = 0.88
Z(α) = Z(0.05) = 1.645; that is, R: z > 1.645, but the calculated value of Z does not lie in the rejection region.
Hence, we fail to reject H₀ and conclude the two treatments are equally effective.
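The same test computed from the raw counts (a Python sketch; using the unrounded proportions gives z of about 0.87 rather than the 0.88 obtained above from the rounded values):

```python
from math import sqrt

x1, n1 = 99, 120   # treatment A recoveries
x2, n2 = 62, 80    # treatment B recoveries

p1, p2 = x1 / n1, x2 / n2
p = (x1 + x2) / (n1 + n2)              # pooled proportion under H0
z = (p1 - p2) / sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
print(round(z, 2))  # about 0.87, well below the critical value 1.645
```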
Example 2: A sample of 100 students at the university showed that 43 had taken one or more remedial
courses. A sample of 200 students at a junior college showed that 90 had taken one or more remedial
college courses. At α = 0.05, test the claim that there is no difference in the proportion of students who complete remedial courses at a university or a junior college.
Solution: Let p₁ = the population proportion of university students who complete remedial courses and p₂ = the population proportion of junior college students who complete remedial courses. Thus,
The hypothesis to be tested is:
H₀: p₁ = p₂
H₁: p₁ ≠ p₂
Level of significance, α = 0.05
Test statistic:
Since n₁ = 100 and n₂ = 200 are large, an appropriate test statistic is:
Z = (p̂₁ − p̂₂) / √(p̂q̂(1/n₁ + 1/n₂))
Since the two population probabilities are dependent, we cannot use the same approach for estimating the
standard error of the difference that we used in the previous section. Instead of showing the steps in the
derivation of the formula, we simply present the formula for the estimated standard error.
Under the null hypothesis H₀, the standard error of p̂₁ − p̂₂ is given by
SE₀(p̂₁ − p̂₂) = √(b + c) / n
where b and c are the numbers of discordant pairs. The sampling distribution of p̂₁ − p̂₂ is approximately normal with mean zero and this variance under the null hypothesis.
Confidence Intervals:
The confidence interval for the difference of two dependent proportions, P₁ − P₂, is then given by
(p̂₁ − p̂₂) ± Z(α/2) ∗ SE(p̂₁ − p̂₂), where SE(p̂₁ − p̂₂) = (1/n) √(b + c − (b − c)²/n)
Example 1: Suppose that 100 students took both calculus and computer tests, and 18 failed in calculus (p̂₁ = 0.18) and 10 failed in computer (p̂₂ = 0.10). There is an 8 percentage point difference (p̂₁ − p̂₂ = 0.08).
The confidence interval for the difference of these two failure rates cannot be constructed using the
method in the previous subsection because the two rates are dependent.
We need additional information to assess the dependency. Nine students failed both tests (p̂₁₂ = 0.09), and this reflects the dependency. The dependency between p̂₁ and p̂₂ can be seen more clearly when the data
are presented in a 2 by 2 table.
Calculus \ Computer | Failed | Passed | Total
Failed | 9 (a) | 9 (b) | 18
Passed | 1 (c) | 81 (d) | 82
Total | 10 | 90 | 100 (n)
Solution:
Hypothesis:
H : p − p = 0 versus H : p − p ≠ 0
The marginal totals reflect the two failure rates. The numbers in the diagonal cells (a, d) are concordant
pairs of test scores (those who passed or failed both tests), and those in the off-diagonal cells (b, c) are
discordant pairs (those who passed one test but failed the other). Important information for comparing the
two dependent failure rates is contained in discordant pairs, as the estimated difference of the two
proportions and its estimated standard error are dependent on b and c.
Z = (b − c − 0.5) / √(b + c) = (9 − 1 − 0.5) / √(9 + 1) = 2.372
Critical value
±Z(α/2) = ±Z(0.025) = ±1.96 => −1.96 ≤ Z ≤ 1.96
Since 2.372 > 1.96 reject the null hypothesis.
Using the standard error equation, we have
Estimated SE(p̂₁ − p̂₂) = (1/100) √(9 + 1 − (9 − 1)²/100) = 0.0306
Then the 95 percent confidence interval for the difference of these two dependent proportions is 0.08 − 1.96(0.0306) < p₁ − p₂ < 0.08 + 1.96(0.0306), or (0.0200, 0.1400).
This interval does not include 0, suggesting that the failure rates of these two tests are significantly
different.
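The whole procedure for dependent proportions, test and interval, can be sketched from the discordant counts b and c:

```python
from math import sqrt

# Discordant pairs from the 2x2 table:
# b = failed calculus only, c = failed computer only
n, b, c = 100, 9, 1

z = (b - c - 0.5) / sqrt(b + c)            # continuity-corrected z statistic
diff = (b - c) / n                         # p1-hat - p2-hat = 0.08
se = sqrt(b + c - (b - c)**2 / n) / n      # SE for the confidence interval
lo, hi = diff - 1.96 * se, diff + 1.96 * se
print(round(z, 3), round(lo, 3), round(hi, 3))  # 2.372, about 0.020 to 0.140
```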
Exercise: Suppose we have two methods of teaching, A and B. Select n pairs of students (pairing students within each section) and randomly assign one member of each pair to each method.
Data layout
Method A Method B Number of students
Success Success 52
Success Failed 21
Failed Success 9
Failed Failed 18
Is there a difference between the proportions of successful individuals taught by methods A and B?
Use α = 0.05. Construct 95% confidence limits.
3.3 Sample Size Determination in Comparative Experiments
3.3.1. Comparing two means from independent samples
An important issue in planning a new study is the determination of an appropriate sample size required to
meet certain conditions. For example, for a study dealing with blood cholesterol levels, these conditions
are typically expressed in terms such as “How large a sample do I need to be able to reject the null
hypothesis that two population means are equal if the difference between them is μ₁ − μ₂ = 10 mg/dl?”
We focus on the sample size required to test a specific hypothesis. In general, there exists a formula for
calculating a sample size for the specific test statistic appropriate to test a specified hypothesis. Typically,
these formulae require that the user specify the α-level and Power = (1 – β) desired, as well as the
difference to be detected and the variability of the measure in question. Importantly, it is usually wise not
to calculate a single number for the sample size. Rather, calculate a range of values by varying the
assumptions so that you can get a sense of their impact on the resulting projected sample size. Then you
can pick a more suitable sample size from this range.
The hypotheses to be tested are:
H₀: μ₁ = μ₂ versus H₁: μ₁ ≠ μ₂,
H₁: μ₁ > μ₂, or
H₁: μ₁ < μ₂
Consider the test based on t = (X̄₁ − X̄₂) / (S_p √(1/n₁ + 1/n₂)), with the equal variance assumption.
For the two-sided alternative hypothesis with significance level α, the sample size n₁ = n₂ = n required to detect a true difference in means of d = μ₁ − μ₂ with power at least 1 − β is approximately:
n = 2(Z(α/2) + Z(β))² σ² / d²
Example 1: We are interested in the size for a sample from a population of blood cholesterol levels. We know that typically σ is about 30 mg/dl for these populations. How large a sample would be needed for comparing two approaches to cholesterol lowering using α = 0.05, to detect a difference of d = 20 mg/dl or more with power 1 − β = 0.90?
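The solution follows directly from the sample-size formula; a sketch (the normal quantiles 1.96 and 1.2816 are standard table values):

```python
from math import ceil

# Known normal quantiles: Z(0.025) = 1.96, Z(0.10) = 1.2816
z_a2, z_b = 1.96, 1.2816
sigma, d = 30.0, 20.0   # mg/dl

n = 2 * (z_a2 + z_b)**2 * sigma**2 / d**2
print(ceil(n))  # 48 subjects per group
```

Varying σ and d over a plausible range, as the text advises, gives a band of sample sizes rather than a single number.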
The required sample size depends on:
The α level
The power
The variance in the sample difference
Whether the test is one-sided or two-sided
The hypotheses to be tested are:
H₀: p₁ − p₂ = 0 versus H₁: p₁ − p₂ > 0, H₁: p₁ − p₂ < 0, or H₁: p₁ − p₂ ≠ 0
Setting the rejection boundary under H₀ equal to the detection boundary under H₁ and solving for n gives the required sample size per group:
n = (Z(α) + Z(β))² [p₁(1 − p₁) + p₂(1 − p₂)] / (p₁ − p₂)²
(for a two-sided test, replace Z(α) by Z(α/2)).
For a specified pair of values p₁ and p₂, we can find the sample sizes n₁ = n₂ = n required to give the test of size α that has specified type II error β.
Example: d = p₁ − p₂ = 0.7 − 0.5 = 0.2
with β = 0.10 and α = 0.05 (two-sided), so Z(α/2) = 1.96 and Z(β) = 1.2816.
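Plugging into the formula for proportions (a sketch of the calculation, which these notes do not show in full):

```python
from math import ceil

z_a2, z_b = 1.96, 1.2816      # alpha = 0.05 two-sided, power = 0.90
p1, p2 = 0.7, 0.5

n = (z_a2 + z_b)**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2)**2
print(ceil(n))  # 121 per group
```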
Chapter 4
4. Analysis of Variance (ANOVA)
4.1. Introduction
Analysis of variance is for two different purposes: (1) to estimate and test hypotheses about population
variances and (2) to estimate and test hypotheses about population means.
This chapter presents statistical methods for comparing means among any number of populations based on
samples from these populations. The t test for comparing two means cannot be generalized to the comparison of more than two means. Instead, the analysis most frequently used for this purpose is based on a comparison of variances (F test), and is therefore called the analysis of variance, often referred to by the acronym ANOVA. When ANOVA (the F test) is applied to only two populations, the results are equivalent
to those of the t test.
Analysis of variance is a widely used statistical technique that partitions the total variability in our data
into components of variability that are used to test hypotheses.
Model:
X_ij = μ_i + ε_ij,  i = 1, 2, …, k;  j = 1, 2, …, n_i
Now let μ_i = μ + τ_i, i = 1, 2, …, k; the above formula becomes:
X_ij = μ + τ_i + ε_ij,  i = 1, 2, …, k;  j = 1, 2, …, n_i
Where X_ij denotes the jth observed sample value from population i,
μ is a parameter common to all treatments under the null hypothesis, called the overall mean (grand mean),
τ_i is a parameter associated with the ith population (treatment), called the ith treatment effect, and
ε_ij is a random error component.
Assumptions:
The X_ij are assumed normally distributed with mean μ_i and variance σ².
We assume that the errors are normally distributed with constant variance, i.e., ε_ij ~ NID(0, σ²): independently and identically normally distributed with mean 0 and constant variance σ².
This implies that the populations being sampled are also normally distributed with equal variances.
In the fixed-effects model, the treatment effects τ_i are usually defined as deviations from the overall mean μ, so that Σ τ_i = 0.
Groups are independent
Equal variances between the groups
4.2. Test of Hypothesis About the Equality of More than Two Population Means
The comparison of two or more means is based on partitioning the variation in the dependent variable into its components; hence the method is called the analysis of variance (ANOVA).
In general, suppose there are k normal populations with possibly different means, μ₁, μ₂, …, μ_k, but all with the same variance σ². The study question is whether all the k population means are the same. We formulate this question as the test of hypotheses
H₀: μ₁ = μ₂ = … = μ_k versus H₁: not all k population means are equal.
To perform the test k independent random samples are taken from the k normal populations. The k sample
means, the k sample variances, and the k sample sizes are summarized in the table:
Population | Population size | Sample size | Population mean | Sample mean | Population variance | Sample variance
1 | N₁ | n₁ | μ₁ | x̄₁ | σ² | s₁²
2 | N₂ | n₂ | μ₂ | x̄₂ | σ² | s₂²
3 | N₃ | n₃ | μ₃ | x̄₃ | σ² | s₃²
⋮
k | N_k | n_k | μ_k | x̄_k | σ² | s_k²
The ANOVA identity (assume equal sample sizes, n₁ = n₂ = ⋯ = n_k = n). Let X_ij represent the jth of the observations under the ith sample (treatment) and x̄_i. represent the average of the observations under the ith sample. Similarly, let x.. = Σᵢ Σⱼ x_ij, i = 1, 2, …, k and j = 1, 2, …, n, represent the grand total of all observations and x̄.. represent the grand mean of all observations:
x̄_i. = (Σⱼ x_ij)/n_i, i = 1, 2, …, k;  x̄.. = x../N, N = n₁ + n₂ + ⋯ + n_k
If we consider all of the data together, regardless of which sample the observation belongs to, we can measure the overall total variability in the data by: TSS = Σᵢ Σⱼ (x_ij − x̄..)²
But x_ij − x̄.. = (x_ij − x̄_i.) + (x̄_i. − x̄..); squaring both sides and summing over i and j gives
Σᵢ Σⱼ (x_ij − x̄..)² = Σᵢ Σⱼ (x_ij − x̄_i.)² + Σᵢ n_i (x̄_i. − x̄..)², i.e., TSS = WSS + BSS.
The result of the forgoing calculations may then be summarized as in the following table known as the
One-way ANOVA table.
Source of variation | Degrees of freedom | Sum of squares | Mean squares | F-test
Between treatments | k − 1 | BSS | BMS = BSS/(k − 1) | F = BMS/WMS
Within treatments (error) | N − k | WSS | WMS = WSS/(N − k) |
Total | N − 1 | TSS | |
Definition: Analysis of variance provides a subdivision of the total variation in the responses of experimental units into separate components, each representing a different source of variation, so that the relative importance of the different sources can be assessed. Secondly, and more importantly, it gives an estimate of the underlying variation between units, which provides a basis for inferences about the effects of the applied treatments or the population means.
In ANOVA, we compare the between-group variation with the within-group variation to assess
whether there is a difference in the population means. Thus by comparing these two measures of
variance (spread) with one another, we are able to detect if there are true differences among the
underlying group population means.
If the variation between the sample means is large, relative to the variation within the samples, then we
would be likely to detect significant differences among the sample means.
If the variation between the sample means is small, relative to the variation within the samples, then
there would be considerable overlap of observations in the different samples, and we would be
unlikely to detect any differences among the population means.
Another estimate of the common variance σ² can be found using the pooled within-sample sums of squares.
Thus if the null hypothesis is true we have two estimates of the common variance (σ²), namely the Mean Square for Treatments (BMS) and the Mean Square Error (WMS). If BMS is large relative to WMS, i.e. the between group variation is large relative to the within group variation, we reject H₀.
If BMS ~ WMS we fail to reject Ho, i.e. the between group variation is not large relative to the within
group variation.
Our test statistic is the F-ratio (F-statistic), which compares these two mean squares: F = BMS/WMS.
A large F-statistic provides evidence against Ho while a small F-statistic indicates that the data and Ho
are compatible.
A point estimate of the ith population (treatment) mean may be easily determined as μ̂_i = x̄_i.. If we assume that the sample means x̄_i. are normally distributed, x̄_i. ~ N(μ_i, σ²/n_i). Thus if σ² were known we could use the normal distribution to define the confidence interval. Using the mean square error (WMS) as an estimator of σ², we would base the confidence interval on the t-distribution:
x̄_i. ± t(α/2, N − k) √(WMS/n_i)
Example 1: Assume ”treatment results” from 13 patients visiting one of three doctors are given:
Doctor A: 24, 26, 31, 27
Doctor B: 29, 31, 30, 36, 33
Doctor C: 29, 27, 34, 26
At the 5% level, do the treatment results come from the same population of results?
Solution:
Ho: The treatment results are from the same population of results
Ha: They are from different populations
Group | Sample size | Sample mean | Sample variance
Dr. A | n₁ = 4 | x̄₁ = 27 | s₁² = 8.67
Dr. B | n₂ = 5 | x̄₂ = 31.8 | s₂² = 7.7
Dr. C | n₃ = 4 | x̄₃ = 29 | s₃² = 12.67
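The remaining computations (the sums of squares and the F ratio) can be sketched directly from the raw data:

```python
# One-way ANOVA for the "treatment results" data (a computational sketch).
groups = {
    "A": [24, 26, 31, 27],
    "B": [29, 31, 30, 36, 33],
    "C": [29, 27, 34, 26],
}

all_obs = [x for g in groups.values() for x in g]
N, k = len(all_obs), len(groups)
grand = sum(all_obs) / N                         # grand mean

bss = sum(len(g) * (sum(g) / len(g) - grand)**2 for g in groups.values())
wss = sum(sum((x - sum(g) / len(g))**2 for x in g) for g in groups.values())

F = (bss / (k - 1)) / (wss / (N - k))
print(round(bss, 1), round(wss, 1), round(F, 2))  # 52.4, 94.8, F about 2.77
```

Since F is below the tabulated F₀.₀₅(2, 10) = 4.10, H₀ is not rejected: the treatment results are consistent with a single population.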
Unbalanced data
In some single-factor experiments, the number of observations taken under each treatment may be
different. We then say that the design is unbalanced. In this situation, slight modifications must be made
in the sums of squares formulas. Let n_i observations be taken under treatment i (i = 1, 2, …, k), and let the total number of observations be N = Σ n_i.
The sums of squares computing formulas for the ANOVA with unequal sample sizes n in each treatment
are:
WSS = Σᵢ Σⱼ x_ij² − Σᵢ (x_i.²/n_i),  BSS = Σᵢ (x_i.²/n_i) − x..²/N,  and  TSS = Σᵢ Σⱼ x_ij² − x..²/N
ANOVA table:
Source of variation | Degrees of freedom | Sum of squares | Mean squares | F-test
Between treatments | k − 1 | BSS | BMS | F = BMS/WMS
Within treatments (error) | N − k | WSS | WMS |
Total | N − 1 | TSS | |
Summary:
When sample sizes are equal, we say design is balanced.
When sample sizes are unequal we say design is unbalanced.
Example 4:
A research laboratory developed two treatments which are believed to have the potential of prolonging the
survival times of patients with an acute form of thymic leukemia. To evaluate the potential treatment
effects 33 laboratory mice with thymic leukemia were randomly divided into three groups. One group
received Treatment 1, one received Treatment 2, and the third was observed as a control group. The
survival times of these mice are given in table below "Mice Survival Times in Days". Test, at the 1% level
of significance, whether these data provide sufficient evidence to confirm the belief that at least one of the
two treatments affects the average survival time of mice with thymic leukemia.
Treatment 1: 71, 75, 72, 73, 75, 72, 80, 65, 60, 63, 65, 69, 63, 64, 78, 71
Treatment 2: 77, 67, 79, 78, 81, 72, 71, 84, 91
Control: 81, 79, 73, 71, 75, 84, 77, 67
Solution:
Step1. The test of hypotheses is
H : μ = μ = μ versus H : not all three population means are equal.
α = 0.01
Summary statistics
If we index the population of mice receiving Treatment 1by 1, Treatment 2 by 2, and no treatment by 3,
then the sample sizes, sample means, and sample variances of the three samples in the following table
mice survival times in days are summarized by:
Group | Sample size | Sample mean | Sample variance
Treatment 1 | n₁ = 16 | x̄₁ = 69.75 | s₁² = 34.47
Treatment 2 | n₂ = 9 | x̄₂ = 77.78 | s₂² = 52.69
Control | n₃ = 8 | x̄₃ = 75.88 | s₃² = 30.69
The average of all 33 observations is x̄.. = 2423/33 = 73.42.
ANOVA table:
Source of variation | Degrees of freedom | Sum of squares | Mean squares | F-test
Between treatments | 3 − 1 = 2 | 435 | 217.50 | F = 5.65
Within treatments (error) | 33 − 3 = 30 | 1153.5 | 38.45 |
Total | 33 − 1 = 32 | 1588.5 | |
The critical value is F(k − 1, N − k) = F₀.₀₁(2, 30) = 5.39; thus the rejection region is [5.39, ∞). As the F test value 5.65 is in the rejection region, the null hypothesis of equal means is rejected, and we conclude that there is a difference in mean survival time among the two treatments and the control group at the 1% level of significance.
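The table entries can be reproduced with the unbalanced computing formulas (a sketch; the group lists match the summary statistics above):

```python
# Unbalanced one-way ANOVA for the mice survival data, using the
# computing formulas for the sums of squares.
groups = [
    [71, 75, 72, 73, 75, 72, 80, 65, 60, 63, 65, 69, 63, 64, 78, 71],  # Treatment 1
    [77, 67, 79, 78, 81, 72, 71, 84, 91],                              # Treatment 2
    [81, 79, 73, 71, 75, 84, 77, 67],                                  # Control
]

N = sum(len(g) for g in groups)
k = len(groups)
grand_total = sum(sum(g) for g in groups)

sum_sq = sum(x * x for g in groups for x in g)         # sum of all x_ij^2
treat_term = sum(sum(g)**2 / len(g) for g in groups)   # sum of x_i.^2 / n_i

bss = treat_term - grand_total**2 / N
wss = sum_sq - treat_term
F = (bss / (k - 1)) / (wss / (N - k))
print(round(bss), round(wss), round(F, 2))  # about 435, 1153, F about 5.65
```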
Assumption for the F test for comparing three or more means
1. The populations from which the samples were obtained must be normally or approximately normally
distributed.
2. The samples must be independent of one another.
3. The variances of the populations must be equal.
4. The sample are drawn randomly from population
LSD_ij = t(α/2, N − k) √(WMS(1/n_i + 1/n_j)), where n_i and n_j are the respective sample sizes from populations i and j, and t is the critical value for α/2 with df = N − k, the degrees of freedom for WMS.
Note that for n₁ = n₂ = ⋯ = n_k = n
LSD = t(α/2, N − k) √(2WMS/n)
5. Then compare all pairs of sample means. If |x̄_i. − x̄_j.| ≥ LSD, declare the corresponding population means μ_i and μ_j different.
6. For each pair wise comparison of population means, the probability of a type I error is fixed at a
specified value of α.
Determining significance
The difference between two mean values is compared to the LSD value. If the difference is greater than
the LSD value, then the means are significantly different.
Example: We want to compare the mean leaf width of a new variety (A) with two comparators (B & C).
The data is presented as follows:
Variety A | 17 15 10 14
Variety B | 34 26 23 22
Variety C | 23 21 8 16
Solution: We can solve this problem by following the five steps listed for the LSD procedure.
H₀: μ₁ = μ₂ = μ₃ versus H₁: at least one of the means differs from the rest.
First, calculate the variety totals and means. Then, calculate the replicate totals.
Variety | Sample size | Sample mean | Sample variance
A | n₁ = 4 | 14 | s₁² = 8.6667
B | n₂ = 4 | 26.25 | s₂² = 29.5833
C | n₃ = 4 | 17 | s₃² = 44.6667
The grand mean is:
X̄.. = ΣΣ X_ij / N = Σ n_i x̄_i / N = [4(14) + 4(26.25) + 4(17)] / 12 = 19.08
BSS = Σᵢ n_i (X̄_i. − X̄..)² = 4(14 − 19.08)² + 4(26.25 − 19.08)² + 4(17 − 19.08)² = 326.1668
WSS = Σᵢ (n_i − 1)s_i² = 3(8.6667) + 3(29.5833) + 3(44.6667) = 248.7501
TSS = BSS + WSS = 326.1668 + 248.7501 = 574.9169
ANOVA table:
Source of variation | Df | SS | MSS | F-test
Between treatments | 2 | 326.1668 | 163.0834 | 5.9005
Within treatments (error) | 9 | 248.7501 | 27.6389 |
Total | 11 | 574.9169 | |
Decision: reject the null hypothesis, since the calculated value is greater than the tabulated value.
Conclusion: The calculated F value 5.9005 is more than the table F value 4.26. This reveals that there are
significant differences among the three varieties, i.e., at least one variety differs from the others.
The next step is to calculate LSD.
The least significant difference for comparing two means based on samples of size 4 is then
LSD = t(0.025, 9) √(2WMS/n) = 2.262 × √(2(27.6389)/4) = 8.4089
Note that the appropriate t value (2.262) was obtained from the t table with α/2 = 0.025 and df = 9.
Step 5. When we have equal sample sizes, it is convenient to use this single LSD for all pairwise comparisons among the sample means, because the same LSD applies to every pair.
Determining if the two varietal means are significantly different:
The difference between two mean values is compared to the LSD value. If the difference is greater than
the LSD value, then the means are significantly different. The variety means for the above example are A
(candidate) = 14, B (comparator 1) = 26.25, C (comparator 2) = 17.
The absolute difference between A and B is 12.25, which is greater than 8.4089; therefore A and B are significantly different at P ≤ 0.05. This confirms the result of the F test.
The difference between A and C is 3, which is less than 8.4089; therefore A and C are not significantly different at P ≤ 0.05.
The difference between B and C is 9.25, which is greater than 8.4089; therefore B and C are significantly different at P ≤ 0.05.
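The pairwise LSD comparisons can be checked with a short script; the WMS, t value, and group means below are taken from the example:

```python
import math

# Pairwise LSD comparisons for the three variety means (values from the example).
wms, n, t_crit = 27.6389, 4, 2.262   # within mean square, per-group size, t(0.025, df=9)
lsd = t_crit * math.sqrt(2 * wms / n)

means = {"A": 14.0, "B": 26.25, "C": 17.0}
for g1, g2 in [("A", "B"), ("A", "C"), ("B", "C")]:
    diff = abs(means[g1] - means[g2])
    verdict = "significant" if diff >= lsd else "not significant"
    print(f"{g1} vs {g2}: |diff| = {diff:.2f}, LSD = {lsd:.4f} -> {verdict}")
```

This reproduces LSD ≈ 8.4089 and flags A vs B and B vs C as significant, with A vs C not significant.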
4.3.2 Studentized Range Test (Tukey's W Method)
The Tukey-Kramer method is the recommended procedure when one wishes to estimate simultaneously all pairwise differences among the means in a one-way ANOVA, assuming that the variances are equal. It is the best known of the proposed alternatives; it replaces the LSD with a criterion based on its own table of critical values. The table provides q-values at α = 0.01 and α = 0.05. The tabulated values depend on two parameters: the number of treatments (a) and the degrees of freedom associated with WMS (f).
H0: μ1 = μ2 = ⋯ = μa versus H1: μi ≠ μj for at least one pair i ≠ j
The overall level of significance is exactly α when the sample sizes are equal and at most α when the
sample sizes are unequal.
Steps:
Calculate |x̄i. − x̄j.| for all i ≠ j.
Read from the studentized range table the value of q_α(a, f) for the required a and f.
Compute T_α = q_α(a, f) √(WMS/n) for equal sample sizes, or T_α = (q_α(a, f)/√2) √(WMS(1/ni + 1/nj)) for unequal sample sizes (the Tukey-Kramer form).
Equivalently, the test statistic for a pair of means is
q = (x̄i − x̄j) / √(WMS/n)
where x̄i and x̄j are the means of the samples being compared, n is the size of the samples, and WMS is the within-group variance.
When the absolute value of q is greater than the critical value for the Tukey test, there is a significant
difference between the two means being compared.
Example: Consider Example 2 and identify which means differ significantly.
Solution: Since the sample sizes are equal, we can calculate as follows:
T_α = q_α(a, f) √(WMS/n). Since k = 3, d.f. = 12, and α = 0.05, the critical value is q_{0.05}(3, 12) = 3.77,
where x̄i and x̄j are the means of the samples being compared, ni and nj are the respective sample sizes, and WMS is the within-group variance.
To find the critical value F′ for the Scheffé test, multiply the critical value for the F test by k − 1:
F′ = (k − 1)(C.V.)
There is a significant difference between the two means being compared when the test value F_s is greater than F′.
Using the Scheffé test, test each pair of means in Example 2 to see whether a specific difference exists, at α = 0.05.
Solution:
i) For x̄1 versus x̄2:
F_s = (x̄1 − x̄2)² / [s²_W(1/n1 + 1/n2)] = (11.8 − 3.8)² / [8.73(1/5 + 1/5)] = 18.33
ii) For x̄2 versus x̄3:
F_s = (x̄2 − x̄3)² / [s²_W(1/n2 + 1/n3)] = (3.8 − 7.6)² / [8.73(1/5 + 1/5)] = 4.14
iii) For x̄1 versus x̄3:
F_s = (x̄1 − x̄3)² / [s²_W(1/n1 + 1/n3)] = (11.8 − 7.6)² / [8.73(1/5 + 1/5)] = 5.05
The critical value for the analysis of variance, from the F table with α = 0.05, d.f.N. = k − 1 = 2, and d.f.D. = N − k = 12, is 3.89. The critical value for F′ at α = 0.05, with d.f.N. = 2 and d.f.D. = 12, is:
F′ = (k − 1)(C.V.) = (3 − 1)(3.89) = 7.78
Since only the F_s test value for part i (x̄1 versus x̄2) is greater than the critical value 7.78, the only significant difference is between x̄1 and x̄2, that is, between medication and exercise. These results agree with the Tukey analysis.
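A minimal sketch of the Scheffé comparisons, using the means, sample sizes, and within-group variance quoted in the example:

```python
# Scheffé pairwise comparisons for Example 2 (means and s_W^2 taken from the text).
means = [11.8, 3.8, 7.6]     # x1, x2, x3
n = [5, 5, 5]
s_w2 = 8.73                  # within-group variance
k = len(means)
cv = 3.89                    # F critical value, df = (2, 12), alpha = 0.05
f_prime = (k - 1) * cv       # Scheffé critical value = 7.78

for i in range(k):
    for j in range(i + 1, k):
        f_s = (means[i] - means[j]) ** 2 / (s_w2 * (1 / n[i] + 1 / n[j]))
        verdict = "significant" if f_s > f_prime else "not significant"
        print(f"x{i+1} vs x{j+1}: F_s = {f_s:.2f} -> {verdict}")
```

Only the first pair (F_s ≈ 18.33 > 7.78) comes out significant, matching the conclusion above.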
Chapter 5
5. Inference About Population Variance
5.1. Introduction
Comparing variances: Suppose a factory manager is considering whether to buy packaging Machine A or Machine B. During test runs, Machine A produced sample variance s²_A while Machine B produced sample variance s²_B. Question: Are these variances significantly different?
Suppose the population variances for weights of mealie-meal bags packaged from machines A and B are, respectively, σ²_A and σ²_B. We can answer the question concerning whether the variances are different by testing the null hypothesis H0: σ²_A = σ²_B against the alternative H1: σ²_A ≠ σ²_B.
Other applications where testing for variance may be important include the following:
Foreign exchange stability is important in any economy. Too much variation of a currency is not
good.
Price stability of other commodities is also important.
5.2. Sampling Distribution of the Sample Variance
Let x1, x2, …, xn be a random sample from a population. The sample variance is: s² = [1/(n − 1)] Σ (xi − x̄)²
The square root of the sample variance is called the sample standard deviation
The sample variance is different for different random samples from the same population
The sampling distribution of s² has mean σ²: E(s²) = σ².
If the population distribution is normal, then Var(s²) = 2σ⁴/(n − 1).
If the population distribution is normal, then (n − 1)s²/σ² has a χ² distribution with n − 1 degrees of freedom.
Thus the 99% confidence interval for σ becomes (√2.99, √23.05) = (1.73, 4.8) mm.
From the 95% as well as the 99% confidence interval for σ, we see that the true standard deviation for the particular shift is greater than 1.5 mm (that is, neither interval contains 1.5 mm).
Hence the fluctuation of the thickness is not in the tolerable range. This is thus a sign of an assignable cause affecting the quality of the plastic sheets.
By using the lower- or upper-tail χ² value alone, we can get a one-sided confidence interval for σ² as well as σ.
If the underlying data are normally distributed, the appropriate test statistic uses the ratio
χ² = Σ(xi − x̄)²/σ0² = (n − 1)s²/σ0²
which has the chi-squared distribution with n − 1 degrees of freedom under H0 (often written χ²_{n−1}).
Decision rule: Compare this with chi-square tables, or use the p-value.
For a two-tailed test H0: σ² = σ0² versus H1: σ² ≠ σ0², we pick two values of chi-squared, χ²_{1−α/2, n−1} and χ²_{α/2, n−1}, and accept the null hypothesis if χ² lies between them; that is, we reject H0 if
χ² > χ²_{α/2, n−1} or χ² < χ²_{1−α/2, n−1}
Example 1: Assume that we believe that the distribution of the ages of a group of workers is normal, and
we wish to test our belief that the variance is 64. Our data is a sample of 17 workers, and our computations
give us a sample variance of 100. Let us set our significance level at 2% and state our problem as follows:
Hypothesis: H : σ = 64 versus H : σ ≠ 64
n = 17, df = n-1 = 16, s = 100, σ = 64, α = 0.02
We can compute χ² = (n − 1)s²/σ0² = (16)(100)/64 = 25. Since α/2 = 0.01 and df = 16, we go to the chi-squared table to find χ²_{0.99}(16) = 5.812 and χ²_{0.01}(16) = 32.0.
The acceptance region is between these two values, so we cannot reject the null hypothesis, since 5.812 < 25 < 32.0.
Conclusion: At the 2% level of significance there is not enough evidence to reject the statement that the variance of the ages of this group of workers is 64.
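The test statistic and decision for Example 1 can be verified with a few lines (the critical values are hard-coded from a chi-square table):

```python
# Chi-square test for a single variance (Example 1: n = 17, s^2 = 100, sigma0^2 = 64).
n, s2, sigma0_sq = 17, 100, 64
chi_sq = (n - 1) * s2 / sigma0_sq          # (16)(100)/64 = 25.0

# Two-tailed critical values from a chi-square table, df = 16, alpha = 0.02.
lower, upper = 5.812, 32.0
decision = "reject H0" if (chi_sq < lower or chi_sq > upper) else "fail to reject H0"
print(chi_sq, decision)
```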
Example 3: A company claims that the standard deviation of the lengths of time it takes an incoming telephone call to be transferred to the correct office is less than 1.4 minutes. A random sample of 25 incoming telephone calls has a standard deviation of 1.1 minutes. At α = 10%, is there enough evidence to support the company's claim? Assume the population is normally distributed.
Solution:
Hypothesis: H0: σ = 1.4 min. versus H1: σ < 1.4 min. (claim)
α = 0.10, df = 25 − 1 = 24
Rejection region: χ² < χ²_{0.90}(24) = 15.659
Test statistic:
χ² = (n − 1)s²/σ0² = (25 − 1)(1.1)²/(1.4)² = 14.816
Decision: Reject H .
At the 10% level of significance, there is enough evidence to support the claim that the standard deviation
of the lengths of time it takes an incoming telephone call to be transferred to the correct office is less than
1.4 minutes.
5.5. Estimation and Hypothesis Testing for Comparing Two Population Variance
Suppose that two independent normal populations are of interest, where the population means and variances, say, μ1, σ1², μ2, and σ2², are unknown. We wish to test hypotheses about the equality of the two variances, say, H0: σ1² = σ2². If we have two separate samples and want to test whether their variances are equal, we test a ratio of chi-squares instead of a difference, as we would with means or proportions. The ratio of two χ²'s from the same population has the F distribution, a distribution named after R. A. Fisher, one of the founders of modern statistics.
Definition: Let W and Y be independent chi-square random variables with u and v degrees of freedom, respectively. Then the ratio F = (W/u)/(Y/v) is said to follow the F distribution with u numerator and v denominator degrees of freedom.
Assume that two random samples of size n1 from population 1 and of size n2 from population 2 are available, and let s1² and s2² be the sample variances. We wish to test the hypotheses
H0: σ1²/σ2² = 1 versus H1: σ1²/σ2² ≠ 1
In Sections 5.3 and 5.4 we found that χ² = (n − 1)s²/σ² has the chi-squared distribution with n − 1 degrees of freedom. Thus the first of the two sample variances, s1², when multiplied by n1 − 1 and divided by σ1², the variance of its parent distribution, will have the chi-squared distribution with n1 − 1 degrees of freedom, and we can write
χ1² = (n1 − 1)s1²/σ1², and similarly, χ2² = (n2 − 1)s2²/σ2². If this is true and σ1² = σ2², the ratio F = s1²/s2² follows the F distribution with n1 − 1 and n2 − 1 degrees of freedom.
Let x11, x12, …, x1n1 be a random sample from a normal population with mean μ1 and variance σ1², and let x21, x22, …, x2n2 be a random sample from a second normal population with mean μ2 and variance σ2². Assume that the two normal populations are independent.
H0: σ1²/σ2² = 1 versus H1: σ1²/σ2² > 1
From these data we compute s1² = 1.797 with n1 = 5, and s2² = 0.056 with n2 = 6. Thus the F-ratio is:
F = s1²/s2² = 1.797/0.056 = 32.09
Since df1 = n1 − 1 = 4 and df2 = n2 − 1 = 5, we look up F_{0.01}(4, 5) = 11.39 in our F table. Since the calculated value F is much larger than the critical value, we reject the null hypothesis and conclude that the variance for government lawyers, which serves as a measure of risk, is smaller.
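The F-ratio computation can be sketched as follows, with the critical value hard-coded from an F table:

```python
# F-ratio test for equality of two variances (values from the example).
s1_sq, n1 = 1.797, 5
s2_sq, n2 = 0.056, 6

f_ratio = s1_sq / s2_sq        # larger variance in the numerator
df1, df2 = n1 - 1, n2 - 1      # (4, 5)
f_crit = 11.39                 # F_{0.01}(4, 5) from an F table
print(round(f_ratio, 2), f_ratio > f_crit)
```

This reproduces F ≈ 32.09, well beyond the 11.39 critical value, so H0 is rejected.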
Example 2: A study was performed on patients with pituitary adenomas. The standard deviation of the
weights of 12 patients with pituitary adenomas was 21.4 kg. A control group of 5 patients without
pituitary adenomas had a standard deviation of the weights of 12.4 kg. We wish to know if the weights of
the patients with pituitary adenomas are more variable than the weights of the control group.
Solution:
Pituitary adenomas: n1 = 12, s1 = 21.4 kg
Control: n2 = 5, s2 = 12.4 kg
α = 0.05
Assumptions
• Each sample is a simple random sample
• Samples are independent
• Both populations are normally distributed
Discussion: We cannot reject H0 because 2.98 < 5.91. The calculated value of F falls in the nonrejection region (p-value = 0.1517).
Conclusions: The weights of the population of patients may not be any more variable than the weights of
the control subjects.
Chapter 6
6. Chi-Square Tests
6.1. Introduction
Consider we have two categorical variables, and we want to know if the differences in sample proportions
are likely to have occurred just by chance due to random sampling.
Chi-Square Distribution
The chi-squared distribution is concentrated over nonnegative values. It has mean equal to its degrees of
freedom (df), and its standard deviation equals √(2df). The distribution is skewed to the right, but it
becomes more bell-shaped (normal) as df increases. The df value equals the difference between the
number of parameters in the alternative hypothesis and in the null hypothesis, as explained later in this
section. Figure 6.1 displays chi-squared densities having df = 1, 5, 10, and 20.
If the attributes are independent, then the probability of possessing both A and B is P_A × P_B, where P_A is the probability that a member has attribute A and P_B is the probability that a member has attribute B. Suppose A has r mutually exclusive and exhaustive classes, and B has c mutually exclusive and exhaustive classes.
Objective: To study the association/relationship between two categorical variables. The two
characteristics can be cross classified as follows:
where n_ij is the observed count for the ith level of A and the jth level of B. Such a table of frequency counts is called a contingency table; since there are r rows and c columns, we call it an r×c contingency table.
n_i. = Σj n_ij, n._j = Σi n_ij, and n = Σi Σj n_ij
The chi-square test is used to test the hypothesis of independence of two attributes. For instance, we may be interested in:
o Whether the presence or absence of hypertension is independent of smoking habit or not.
o Whether the size of the family is independent of the level of education attained by the mothers.
o Whether there is association between father and son regarding baldness.
o Whether there is association between stability of marriage and period of acquaintanceship prior to marriage.
The χ² statistic is given by:
χ² = Σi Σj (O_ij − e_ij)²/e_ij ~ χ²((r − 1)(c − 1))
where the expected frequencies are
e_ij = (n_i. × n._j)/n
Remark:
n = Σi Σj O_ij = Σi Σj e_ij
The null and alternative hypothesis may be stated as:
H : There is no association between A and B
H : Not H (there is association between A and B)
Decision Rule:
Reject H for independency at α level of significance if the calculated value of χ exceeds the tabulated
value with degree of freedom equal to (r-1)(c-1).
Example 1: A random sample of 200 retired men was classified according to education and number of children, as shown below.
Education level        Number of children
                       0-1    2-3    Over 3
Elementary              14     37     32
Secondary and above     31     59     27
Test the hypothesis that the size of the family is independent of the level of education attained by fathers.
(Use 5% level of significance)
Solution:
H : There is no association between the size of the family and the level of education attained by fathers.
H : Not H
First calculate the row and column totals:
n1. = 83, n2. = 117, n.1 = 45, n.2 = 96, n.3 = 59
Then calculate the expected frequencies:
e_ij = (n_i. × n._j)/n, e.g., e11 = (83 × 45)/200 = 18.675
Summing (O_ij − e_ij)²/e_ij over all six cells gives χ²_cal ≈ 6.29.
χ²_{0.05}(2) = 5.99 from the table.
The decision is to reject H0 since χ²_cal > χ²_{0.05}(2).
Conclusion: At 5% level of significance we have evidence to say there is association between number
of children and education attained by fathers, based on this sample data.
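The expected counts and chi-square statistic for this 2×3 table can be recomputed programmatically; the script below is a sketch of the general r×c procedure:

```python
# Chi-square test of independence for the 2x3 education-by-children table.
observed = [
    [14, 37, 32],   # elementary
    [31, 59, 27],   # secondary and above
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n    # expected count for cell (i, j)
        chi_sq += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi_sq, 2), df)   # compare with the table value 5.99 at alpha = 0.05, df = 2
```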
2x2 Contingency table
Consider two categorical variables with only two levels each, say A (A1, A2) and B (B1, B2).
Objective: Testing for independence or no association
                     Characteristic B
Characteristic A     B1           B2           Total
A1                   n11 (a)      n12 (b)      n1. (a+b)
A2                   n21 (c)      n22 (d)      n2. (c+d)
Total                n.1 (a+c)    n.2 (b+d)    n (a+b+c+d)
where a, b, c, and d are observed frequencies and n = total sample size
Hypothesis to be tested:
The chi-square test value can be computed as:
χ² = n(ad − bc)² / [(a + b)(c + d)(a + c)(b + d)] ~ χ²(1)
Examples: 1. A geneticist took a random sample of 300 men to study whether there is association between father and son regarding baldness. He obtained the following results.
                 Son
Father           Bald    Not bald
Bald              85       59
Not bald          65       91
Using α = 5%, test whether there is association between father and son regarding baldness.
Solution:
H0: There is no association between father and son regarding baldness.
H : Not H
The appropriate test statistic is:
χ² = n(ad − bc)² / [(a + b)(c + d)(a + c)(b + d)] = 300(85×91 − 59×65)² / [(144)(156)(150)(150)] = 4,563,000,000/505,440,000 = 9.03
χ²_{0.05}(1) = 3.841 from the table.
The decision is to reject H0 since χ²_cal > χ²_{0.05}(1).
Conclusion: At the 5% level of significance we have evidence to say there is association between father and son regarding baldness, based on this sample data.
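The shortcut 2×2 formula is easy to verify in code, using the observed cell counts a, b, c, d from the table:

```python
# Shortcut chi-square statistic for a 2x2 contingency table (father/son data).
a, b, c, d = 85, 59, 65, 91
n = a + b + c + d

chi_sq = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
print(round(chi_sq, 2))   # compare with the table value 3.841 (df = 1, alpha = 0.05)
```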
6.3. Chi-square Test of Homogeneity
Test the null hypothesis that the samples are drawn from populations that are homogeneous with respect to
some factor.
In a chi-square test for homogeneity of proportions, we test whether different populations have the same proportion of individuals with some characteristic, i.e., we determine whether k populations are homogeneous with respect to a certain characteristic.
The procedures for performing a test of homogeneity are identical to those for a test of independence.
Procedures
Select individuals from each population:
n1 from population 1
n2 from population 2
.
.
.
nk from population k
Let the k populations be A1, A2, …, Ak. Classify each sample into c categories of characteristic B1, B2, …, Bc. The resulting classification is a k×c contingency table.
Hypothesis to be tested
H0: The k populations are homogeneous with respect to characteristic B
Ha: H0 is not true
Equivalently
H0: The proportions for each population are equal, i.e., H0: p1j = p2j = ⋯ = pkj for all j = 1, 2, …, c
H1: At least one of the proportions is different from the others.
where pij is the probability or proportion of the jth level of characteristic B within the ith population Ai.
The expected frequencies are found by multiplying the appropriate row and column totals and then
dividing by the total sample size. Usually they are given in parentheses in the contingency table, along
with the observed frequencies. The expected frequency in the (i, j)th cell is given as:
e_ij = (ith row total × jth column total) / grand total
While the degrees of freedom are obtained similarly to that of independence test, i.e, df = (k-1)x(c-1).
Example: In a study of demand for new product a random sample of 270, 250, and 300 customers are
interviewed from three cities respectively. The data are as follows:
Do the data indicate that demand for the new products differ in the three cities?
Solution:
Hypothesis:
H : The three cities are homogeneous with respect to demand for the new product.
H : H is not true
Calculate the expected frequencies
Note: Estimated expected frequencies are presented along with the observed frequencies in parentheses.
Calculate the chi square statistic:
χ² = Σi Σj (O_ij − e_ij)²/e_ij = 52.46
This statistic takes its minimum value of zero when all O_ij = e_ij. For a fixed sample size, greater differences {O_ij − e_ij} produce larger χ² values and stronger evidence against H0. Since larger χ² values are more contradictory to H0, the P-value is the null probability that χ² is at least as large as the observed value. The χ² statistic has approximately a chi-squared distribution for large n. The P-value is the chi-squared right-tail probability above the observed χ² value.
Read the critical value from chi-square distribution table or find p-value
Compare the calculated value and tabulated values of the chi-square statistic and make decision. Reject
the null hypothesis for large values of χ statistic.
Example 1: Suppose, as a market analyst, you wished to see whether consumers have any preference among five flavors of a new fruit soda. A sample of 100 people provided these data:
Cherry Strawberry Orange Lime Grape
32 28 16 14 10
If there were no preference, you would expect each flavor to be selected with equal frequency. In this
case, the equal frequency is 100/5 =20. That is, approximately 20 people would select each flavor.
Frequency Cherry Strawberry Orange Lime Grape
Observed 32 28 16 14 10
Expected 20 20 20 20 20
Is there enough evidence to reject the claim that there is no preference in the selection of fruit soda
flavors, using the data shown previously? Let α = 0.05.
Solution:
H : Consumers show no preference for flavors (claim).
H : Consumers show a preference.
Find the critical value. The degrees of freedom are 5 -1 =4, and α = 0.05.
Hence, χ²_{0.05}(4) = 9.488
χ² = Σ (O − E)²/E = (32 − 20)²/20 + (28 − 20)²/20 + (16 − 20)²/20 + (14 − 20)²/20 + (10 − 20)²/20 = 18.0
Make the decision. The decision is to reject the null hypothesis, since 18.0 > 9.488
Conclusion: There is enough evidence to reject the claim that consumers show no preference for the
flavors.
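The goodness-of-fit computation can be sketched as:

```python
# Goodness-of-fit statistic for the fruit-soda preference data.
observed = [32, 28, 16, 14, 10]
n = sum(observed)
expected = [n / len(observed)] * len(observed)   # equal preference: 20 each

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 1))   # 18.0, versus the critical value 9.488 (df = 4, alpha = 0.05)
```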
Example 2: Mendelian Law of Genetics
The shape and color of a certain pea can be classified into four groups, "round and yellow," "round and green," "angular and yellow," and "angular and green," expected in the ratio 9:3:3:1. For an experiment with n = 556 peas, the table below was observed. We are interested in whether there is good agreement between the observed counts and the expected ratio 9:3:3:1. Assume that these measurements came from an underlying known discrete probability distribution 9/16, 3/16, 3/16, 1/16. How can the validity of this assumption be tested?
Chapter 7
7. Non parametric Methods
7.1. Introduction:
All statistical inference procedures discussed so far are based on specific assumptions regarding the nature
of the underlying population distribution. The most commonly used underlying population distribution is
called a normal distribution. The normal distribution plays a very important role in statistical inference. Although the underlying population distribution might not be "exactly" normally distributed, in many cases it can be very well approximated by a normal distribution. In practice, however, the data
might come from a given population that cannot be well approximated by a normal distribution. For
example, the distribution might be very flat, peaked, or strongly skewed to the right or left.
In a nonparametric statistical inference, the methods do not depend on the specific distribution of the
population from which the sample was drawn. Therefore, assumptions regarding the underlying
population are not necessary. In order to understand what non parametric statistics are, it is first necessary
to know what parametric statistics are.
When we do significance tests, we rely on the assumption that the sampling distribution of samples taken
follows the t-distribution or the normal-distribution, depending on the situation. When this assumption is
not true, none of our tests, which are called “parametric statistical inference tests,” are reliable.
The basic assumption for nonparametric statistics is that the sample or samples are randomly obtained.
When two or more samples are used, they must be independent of each other unless otherwise stated.
(M_L, M_U) = (x_(L_{α/2}), x_(U_{α/2}))
where L_{α/2} = C_{α(2),n} + 1 and U_{α/2} = n − C_{α(2),n}, with C_{α(2),n} read from a binomial table.
Or, for large samples, the cutoff may be approximated by
C_{α(2),n} ≈ (n/2) − z_{α/2}(√n/2)
The weekly weight of recyclable material (in pounds/week) for each household is given here.
14.2, 5.3, 2.9, 4.2, 1.2, 4.3, 1.1, 2.6, 6.7, 7.8, 25.9, 43.8, 2.7, 5.6, 7.8, 3.9, 4.7, 6.5, 29.5, 2.1, 34.8, 3.6, 5.8,
4.5, 6.7
The sample median and a confidence interval on the population are given by the following computations.
Solution:
First, we order the data from smallest value to largest value:
1.1, 1.2, 2.1, 2.6, 2.7, 2.9, 3.6, 3.9, 4.2, 4.3, 4.5, 4.7, 5.3, 5.6, 5.8, 6.5, 6.7, 6.7, 7.8, 7.8, 14.2, 25.9, 29.5,
34.8, 43.8
The number of values in the data set is an odd number (n = 25), so the sample median is the 13th value in the ordered list: M = 5.3 pounds/week.
Next we will construct a 95% confidence interval for the population median. From Table value, we find
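Independently of the table lookup, the ordering and the sample median can be verified with a short script:

```python
# Sample median of the weekly recyclable-material data (pounds/week).
weights = [14.2, 5.3, 2.9, 4.2, 1.2, 4.3, 1.1, 2.6, 6.7, 7.8, 25.9, 43.8, 2.7,
           5.6, 7.8, 3.9, 4.7, 6.5, 29.5, 2.1, 34.8, 3.6, 5.8, 4.5, 6.7]

ordered = sorted(weights)
n = len(ordered)
median = ordered[n // 2]        # middle value for odd n (the 13th of 25)
print(n, median)
```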
The main objective of this section is to understand the sign test procedure, which is among the easiest for
non-parametric tests.
The sign test is a non-parametric (distribution free) test that uses plus and minus signs to test different
claims, including:
18 39 43 34 40 39 16 45 22 28
30 36 29 40 32 34 37 39 36 52
At α = 0.05, test the owner’s hypothesis.
Solution:
Step 1: State the hypothesis and identify the claim.
H0: Median = 40 (claim) and H1: Median ≠ 40
Step 2: find the critical value. Compare each value of the data with the median. If the value is greater than
the median, replace the value with a plus sign. If it is less than the median, replace it with a minus sign.
And if it is equal to the median, replace it with a 0. The completed sign table follows.
- - + - 0 - - + - -
- - - 0 - - - - - +
Using n = 18 (the total number of plus and minus signs; omit the zeros) and α = 0.05 for a two-tailed test, the critical value is 4.
Step 3 Compute the test value. Count the number of plus and minus signs obtained in step 2, and use the
smaller value as the test value. Since there are 3 plus signs and 15 minus signs, 3 is the test value.
Step 4 Make the decision. Compare the test value 3 with the critical value 4. If the test value is less than
or equal to the critical value, the null hypothesis is rejected. In this case, the null hypothesis is rejected
since 3 < 4.
Step 5 Summarize the results. There is enough evidence to reject the claim that the median number of
snow cones sold per day is 40.
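The sign tallies in Steps 2 and 3 can be reproduced programmatically from the raw sales data:

```python
# Sign test tallies for the snow-cone example (hypothesized median = 40).
sales = [18, 39, 43, 34, 40, 39, 16, 45, 22, 28,
         30, 36, 29, 40, 32, 34, 37, 39, 36, 52]
median0 = 40

plus = sum(1 for x in sales if x > median0)
minus = sum(1 for x in sales if x < median0)
ties = sum(1 for x in sales if x == median0)

test_value = min(plus, minus)   # smaller of the two sign counts
n = plus + minus                # zeros are omitted
print(plus, minus, ties, test_value, n)
```

This yields 3 plus signs, 15 minus signs, 2 zeros, test value 3, and n = 18, matching the solution.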
Example 2:-Based on information from the U.S. Census Bureau, the median age of foreign-born U.S.
residents is 36.4 years. A researcher selects a sample of 50 foreign-born U.S. residents in his area and
finds that 21 are older than 36.4 years. At α = 0.05, test the claim that the median age of the residents is at
least 36.4 years.
Solution:
Step 1 State the hypotheses and identify the claim.
H0: MD = 36.4 and H1: MD > 36.4
Step 2 Find the critical value. Since α = 0.05 and n = 50, and since it is a right-tailed test, the critical value
is 1.65, obtained from Z-table.
Step 3 Compute the test value:
z = [(X + 0.5) − (n/2)] / (√n/2) = [(21 + 0.5) − (50/2)] / (√50/2) = −3.5/3.5355 = −0.99
Step 4 Make the decision. Since the test value of - 0.99 is less than 1.65, the decision is to not reject the
null hypothesis.
Step 5 Summarize the results. There is not enough evidence to reject the claim that the median age of the
residents is at least 36.4.
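The large-sample z computation in Step 3 can be sketched as follows (X = 21 is the smaller sign count):

```python
import math

# Large-sample sign test z statistic, with continuity correction.
n = 50    # sample size
x = 21    # smaller number of signs (residents older than 36.4)

z = ((x + 0.5) - n / 2) / (math.sqrt(n) / 2)
print(round(z, 2))   # -0.99
```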
7.3.2 Paired-Sample Sign Test
The sign test can also be used to test sample means in a comparison of two dependent samples, such as a
before-and-after test. Recall that when dependent samples are taken from normally distributed
populations, the t test is used. When the condition of normality cannot be met, the nonparametric sign test
can be used. When using the sign test with data that are matched by pairs, we convert the raw data to plus
and minus signs as follows:
Example: - A medical researcher believed the number of ear infections in swimmers can be reduced if the
swimmers use earplugs. A sample of 10 people was selected, and the number of infections for a four-
month period was recorded. During the first two months, the swimmers did not use the earplugs; during
the second two months, they did. At the beginning of the second two-month period, each swimmer was
examined to make sure that no infections were present.
The data are shown here. At α = 0.05, can the researcher conclude that using earplugs reduced the number of ear infections?
Number of ear infections
Swimmer     Before     After
A 3 2
B 0 1
C 5 4
D 4 0
E 2 1
F 4 3
G 3 1
H 5 3
I 2 2
J 1 3
Solution:
Step 1 State the hypotheses and identify the claim.
H0: The number of ear infections will not be reduced.
H1: The number of ear infections will be reduced (claim).
Step 2 Find the critical value. Subtract the after values from the before values and indicate the difference by a positive or negative sign or 0, according to the value, as shown in the table.
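The sign assignment for the paired data can be sketched as follows, using the before/after counts from the table:

```python
# Sign assignment for the paired-sample sign test (earplug example).
before = [3, 0, 5, 4, 2, 4, 3, 5, 2, 1]
after  = [2, 1, 4, 0, 1, 3, 1, 3, 2, 3]

signs = []
for b, a in zip(before, after):
    d = b - a
    signs.append("+" if d > 0 else "-" if d < 0 else "0")

plus = signs.count("+")
minus = signs.count("-")
test_value = min(plus, minus)   # smaller sign count, zeros omitted
print(signs, plus, minus, test_value)
```

This gives 7 plus signs, 2 minus signs, one zero, and a test value of 2.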
Denoting hired men by + and hired women by -, we have 30 positive signs and 70 negative signs. We note
that the value of n =100 is above 25, so the test statistic x is converted (using a correction for continuity)
to the test statistic z as follows:
z = [(x + 0.5) − (n/2)] / (√n/2) = [(30 + 0.5) − (100/2)] / (√100/2) = −19.5/5 = −3.90
With α = 0.05 in a two-tailed test, the critical values are z = ±1.96.
The test statistic z = −3.90 is less than the critical value −1.96, so we reject the null hypothesis that the proportion of hired men is equal to 0.5. There is sufficient sample evidence to warrant rejection of the claim that the hiring practices are fair, with the proportions of hired men and women both equal to 0.5.
7.4 Wilcoxon Signed-Rank Test
The reader should note that the sign test utilizes only the plus and minus signs of the differences between
the observations and μ0 in the one-sample case, or the plus and minus signs of the differences between the
pairs of observations in the paired-sample case; it does not take into consideration the magnitudes of these
differences. A test utilizing both direction and magnitude, proposed in 1945 by Frank Wilcoxon, is now
commonly referred to as the Wilcoxon signed-rank test.
The analyst can extract more information from the data in a nonparametric fashion if it is reasonable to
invoke an additional restriction on the distribution from which the data were taken. The Wilcoxon signed-
rank test applies in the case of a symmetric continuous distribution. Under this condition, we can test
the null hypothesis μ = μ0. We first subtract μ0 from each sample value, discarding all differences equal to
zero. The remaining differences are then ranked without regard to sign. A rank of 1 is assigned to the
smallest absolute difference (i.e., without sign), a rank of 2 to the next smallest, and so on. When the
absolute value of two or more differences is the same, assign to each the average of the ranks that would
have been assigned if the differences were distinguishable.
For example, if the fifth and sixth smallest differences are equal in absolute value, each is assigned a rank of 5.5. If the hypothesis μ = μ0 is true, the total of the ranks corresponding to the positive differences should nearly equal the total of the ranks corresponding to the negative differences. Let us represent these totals by w+ and w−, respectively. We designate the smaller of w+ and w− by w.
In selecting repeated samples, we would expect w+ and w−, and therefore w, to vary. Thus, we may think
of w+, w−, and w as values of the corresponding random variables W+, W−, and W. The null hypothesis μ = μ0 can be rejected in favor of the alternative μ < μ0 only if w+ is small and w− is large. Likewise, the alternative μ > μ0 can be accepted only if w+ is large and w− is small. For a two-sided alternative, we may
reject H0 in favor of H1 if either w+ or w−, and hence w, is sufficiently small. Therefore, no matter what
the alternative hypothesis may be, we reject the null hypothesis when the value of the appropriate statistic
W+, W−, or W is sufficiently small.
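The ranking-with-ties bookkeeping and the w+, w−, w totals described above can be sketched on hypothetical data (the sample values and μ0 below are illustrative, not from the text):

```python
# Wilcoxon signed-rank bookkeeping on illustrative data (mu0 and the sample
# are hypothetical, chosen only to demonstrate the ranking rules).
sample = [10.3, 9.1, 8.6, 10.8, 11.4, 9.8, 10.1, 8.9]
mu0 = 10.0

# Subtract mu0, discard zero differences, and round to avoid float fuzz in ties.
diffs = [round(x - mu0, 6) for x in sample if round(x - mu0, 6) != 0]
ordered = sorted(diffs, key=abs)

# Assign ranks by absolute size, giving tied values the average of their positions.
rank_of = {}
i = 0
while i < len(ordered):
    j = i
    while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
        j += 1
    avg_rank = ((i + 1) + j) / 2        # average of positions i+1 .. j
    for d in ordered[i:j]:
        rank_of[abs(d)] = avg_rank
    i = j

w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
w = min(w_plus, w_minus)
print(w_plus, w_minus, w)
```

Here the two tied differences of 1.4 share the average rank 7.5, and w is the smaller of the two signed-rank totals.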
Two Samples with Paired Observations
To test the null hypothesis that we are sampling two continuous symmetric populations with μ1 = μ2 for the paired-sample case, we rank the differences of the paired observations without regard to sign and proceed as in the single-sample case. The various test procedures for both the single- and paired-sample cases are summarized in the following table.
Or
H0: The two samples come from populations with the same distribution
H1: The two samples come from populations with different distribution
The Wilcoxon signed rank test can also be used to test the claim that a sample comes from a population
with a specified median.
It is not difficult to show that whenever n < 5 and the level of significance does not exceed 0.05 for a one-
tailed test or 0.10 for a two-tailed test, all possible values of w+, w−, or w will lead to the acceptance of
the null hypothesis. However, when 5 ≤ n ≤ 30, the table shows approximate critical values of W+ and
W− for levels of significance equal to 0.01, 0.025, and 0.05 for a one-tailed test and critical values of W
for levels of significance equal to 0.02, 0.05, and 0.10 for a two-tailed test. The null hypothesis is rejected
if the computed value w+, w−, or w is less than or equal to the appropriate tabled value. For example,
when n = 12, the table shows that a value of w+ ≤ 17 is required for the one-sided alternative μ̃ < μ̃0 to be
significant at the 0.05 level.
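As a sketch of how w+, w−, and w are obtained, the ranking-and-summing step can be written in a few lines of Python (this code is not part of the original notes; the function name and example data are illustrative):

```python
def signed_rank_sums(data, hypothesized_median):
    """Return (w_plus, w_minus, w) for a one-sample Wilcoxon signed-rank test."""
    # Differences from the hypothesized median; zero differences are dropped.
    diffs = [x - hypothesized_median for x in data if x != hypothesized_median]
    # Rank the absolute differences, giving tied values their average rank.
    abs_sorted = sorted(abs(d) for d in diffs)
    def avg_rank(value):
        positions = [i + 1 for i, v in enumerate(abs_sorted) if v == value]
        return sum(positions) / len(positions)
    # Sum the ranks of the positive and of the negative differences.
    w_plus = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)
```

For a one-sided alternative, the appropriate one of w+ or w− is compared with the tabled critical value; for a two-sided alternative, w = min(w+, w−) is used.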
Example 1: According to the director of a county tourist bureau, there is a median of 10 hours of sunshine
per day during the summer months. For a random sample of 20 days during the past three summers, the
number of hours of sunshine has been recorded below.
Use the 0.05 level in evaluating the director’s claim.
Step 4: Find the sum of the absolute values of the negative ranks. Also find the sum of the positive ranks.
Step 5: Let T be the smaller of the two sums found in Step 4. Either sum could be used, but for a
simplified procedure we arbitrarily select the smaller of the two sums.
Step 6: Let n be the number of pairs of data for which the difference d is not 0.
Step 7: Determine the test statistic and critical values based on the sample size, as shown above.
Step 8: When forming the conclusion, reject the null hypothesis if the sample data lead to a test statistic
that is in the critical region; that is, if the test statistic is less than or equal to the critical value(s).
Otherwise, fail to reject the null hypothesis.
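Steps 4 through 8 above can be sketched for the paired-sample case as follows (a hypothetical Python helper, not from the notes; the critical value must still be read from the Wilcoxon table):

```python
def paired_signed_rank(before, after):
    """Steps 4-6: return (T, n), where T is the smaller of the two rank sums
    and n is the number of pairs whose difference d is not 0."""
    diffs = [b - a for b, a in zip(before, after) if b != a]  # Step 6: drop d = 0
    n = len(diffs)
    abs_sorted = sorted(abs(d) for d in diffs)
    def avg_rank(value):  # average rank for tied |d|
        positions = [i + 1 for i, v in enumerate(abs_sorted) if v == value]
        return sum(positions) / len(positions)
    pos_sum = sum(avg_rank(abs(d)) for d in diffs if d > 0)   # Step 4
    neg_sum = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    return min(pos_sum, neg_sum), n                           # Step 5

def decision(T, critical_value):
    """Step 8: reject H0 when the test statistic falls in the critical region."""
    return "reject H0" if T <= critical_value else "fail to reject H0"
```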
Example 2: In a large department store, the owner wishes to see whether the number of shoplifting
incidents per day will change if the number of uniformed security officers is doubled. A sample of 7 days
before security is increased and 7 days after the increase shows the number of shoplifting incidents.
Step 2: Find the critical value. Since n = 7 and α = 0.05 for this two-tailed test, the critical value is 2.
Step 3: Find the test value.
When samples are selected, you assume that they are selected at random. How do you know if the data
obtained from a sample are truly random?
Example: Consider a researcher interviewing 20 students for a pilot survey. Let gender be represented by
M for male and F for female. Suppose the participants were chosen as follows:
Situation 1: MMMMMMMMMMFFFFFFFFFF (It does not look random: the 10 males were selected first
and the 10 females next.)
Situation 2: FMFMFMFMFMFMFMFMFMFM (It seems as if the researcher selected M and F alternately.)
Situation 3: FFFMMFMFMMFFMMFFMMMF (It looks random; there is no apparent pattern to the selection.)
Rather than try to guess whether the data of a sample have been selected at random, statisticians have
devised a nonparametric test to determine randomness. That test is called the runs test.
A run is a succession of identical letters preceded or followed by a different letter, or by no letter at all,
such as at the beginning or the end of the sequence.
Example: Situations 1 and 3 have 2 and 11 runs, respectively.
Situation 1: Run 1: MMMMMMMMMM & Run 2: FFFFFFFFFF
Situation 3: Run 1: FFF Run 2: MM Run 3: F Run 4: M Run 5: F Run 6: MM Run 7: FF Run 8:
MM Run 9: FF Run 10: MMM Run 11: F
The main objective of this section is to introduce the runs test for randomness, which can be used to
determine whether the sample data in a sequence are in a random order.
The test for randomness considers the number of runs rather than the frequency of the letters. For data to
have been selected at random, there should be neither too few nor too many runs.
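Counting runs is mechanical: a new run starts whenever the current letter differs from the one before it. A minimal Python sketch (illustrative, not part of the notes):

```python
def count_runs(sequence):
    """Count the maximal blocks of identical symbols (runs) in a sequence."""
    runs = 0
    previous = None
    for symbol in sequence:
        if symbol != previous:  # the symbol changed, so a new run begins
            runs += 1
            previous = symbol
    return runs
```

Applied to Situations 1 and 3 above, this gives 2 and 11 runs, respectively.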
Example 1:
Listed below are the most recent (as of this writing) winners of the NBA basketball championship game.
Let W denote a winner from the Western Conference and E a winner from the Eastern Conference.
Use a 0.05 significance level to test for randomness in the sequence:
EEWWWWWEWEWEWWW
Solution:
Let n1 = number of Eastern Conference Winners = 5
n2 = number of Western Conference Winners = 10
R = number of runs = 8
Because n1 ≤ 20, n2 ≤ 20, and α = 0.05, the test statistic is R = 8. From the table, the critical values are 3
and 12.
Because R = 8 is neither less than or equal to 3 nor greater than or equal to 12, we do not reject
randomness. There is not sufficient evidence to reject randomness in the sequence of winners.
Example 2:
Consider the sequence of products from a certain machine listed below. Test the claim that the sequence
is random, using a 0.05 significance level.
NNN D NNNNNNN D NN DD NNNNNN D NNNN D NNNNN
DDDD NNNNNNNNNNNN
Where N = normal product
D = defective product
The basic idea underlying the Wilcoxon rank-sum test is this: If two samples are drawn from identical
populations and the individual values are all ranked as one combined collection of values, then the high
and low ranks should fall evenly between the two samples. If the low ranks are found predominantly in
one sample and the high ranks are found predominantly in the other sample, we suspect that the two
populations have different medians.
o For small samples (n1 +n2 ≤ 30), compare the smaller of T1 and T2 with the rejection region
consisting of values less than the critical values. If either T1 or T2 falls in the rejection region, we
reject the null hypothesis. Note that even though this is a two-tailed test, we only use the lower
quantiles of the tabled distribution.
o The test statistic is T = smaller of T1 or T2.
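The combined ranking and the two rank sums T1 and T2 can be sketched as follows (an illustrative Python helper, not from the notes; ties receive their average rank):

```python
def rank_sum_statistics(sample1, sample2):
    """Rank the combined samples, then return (T1, T2, T),
    where T is the smaller of the two rank sums."""
    combined = sorted(sample1 + sample2)
    def avg_rank(value):
        positions = [i + 1 for i, v in enumerate(combined) if v == value]
        return sum(positions) / len(positions)
    T1 = sum(avg_rank(x) for x in sample1)
    T2 = sum(avg_rank(x) for x in sample2)
    return T1, T2, min(T1, T2)
```

If the samples come from identical populations, T1 and T2 should each be close to its share of the total rank sum (n1 + n2)(n1 + n2 + 1)/2; a very small T signals that one sample holds mostly low ranks.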