Statistics
CHAPTER ONE
Statistics
1.1. Introduction
To a layman, ‘statistics’ means numerical information expressed in quantitative terms. This information may
relate to objects, subjects, activities, phenomena, or regions of space. As a matter of fact, data have no limits
as to their reference, coverage, and scope.
A student knows statistics more intimately as a subject of study like economics, mathematics, chemistry,
physics, and others. It is a discipline that deals scientifically with data, and is often described as the science
of data. To handle data, the discipline has developed appropriate methods of collecting, presenting,
summarizing, and analyzing them, and thus consists of a body of these methods.
1.1.1. Definition of Statistics: The word statistics derives from two Italian words: stato, which
means the state, and statista, which refers to a person involved with the affairs of the state. Therefore,
statistics originally meant the collection of facts useful to the state.
Nowadays, statistics is not restricted to information about the state; it extends to almost every
realm of human endeavor.
Statistics is defined by many scholars in different ways:
Seligman explains that “statistics is a science that deals with the methods of collecting, classifying,
presenting, comparing and interpreting numerical data collected to throw some light on any sphere of
enquiry.”
Spiegel defines statistics by highlighting its role in decision-making, particularly under uncertainty, as
follows: “statistics is concerned with scientific method for collecting, organizing, summarizing, presenting
and analyzing data as well as drawing valid conclusions and making reasonable decisions on the basis of
such analysis.”
Statistics is defined as the science or process of collecting, organizing, presenting, analyzing and interpreting
data to assist in making effective decisions.
1.1.2. Areas (types) of statistics: There are two major divisions of statistics: descriptive statistics and
inferential statistics.
a) Descriptive Statistics: When the population of interest is small and we can conduct a census of the
population we will be able to directly describe the important aspects of the population measurement. The
subject area of descriptive statistics includes procedures used to summarize masses of data and present
them in an understandable manner. However, it describes only the data at hand and makes no predictions about the future.
b) Inferential Statistics: Also known as inductive statistics. A conclusion drawn about a population
based on information in a sample from that population is called statistical inference. Statistics is usually
concerned with inference: the population we want to study is usually large or infinite, so we need to
select a sample, since it is impractical or impossible to study the whole population.
1.1.3. Importance (uses) of statistics: Statistics is useful for:
Government officials, for making policy decisions on unemployment, inflation, health, education,
infrastructure, etc.
Financial planners, for trend analysis, the stock market, future investment, etc.
Businesses, for product development and customer satisfaction.
Production supervisors, for quality control and improving product quality.
Politicians, for legislation and campaign strategy.
Physicians and hospitals, for assessing the effectiveness of drugs, disease surveillance, etc.
Managers, for statistical analysis of data to help improve business processes.
ETHIO-SMART COLLEGE
Variable: A variable is a factor or characteristic that can take on different possible values or outcomes. A
variable differs from a constant in that the latter implies values or outcomes that are always the same.
Income, height, weight, sex, age, etc. are examples of variables. In an investigation, data are collected about
one or more variables of interest. A variable can be qualitative or quantitative (numeric).
Elementary Unit: An elementary unit is a specific person, business, product, account, and so on, with some
characteristic to be measured or categorized.
Population: In statistics the term population is used to mean the totality of cases (items) under consideration
in a given investigation or research. In other words, the largest collection of observations on a variable
constitutes the population. A population can be finite (limited in its size) or infinite (unrestricted). In a finite
population, observations are countable, at least in theory. In contrast, an infinite population is indefinitely large:
the observations cannot be counted, even in theory.
Sample: Any non-empty subset of a population is called a sample. There are different possible samples that
can be selected from a single population. Nevertheless, the one that best reflects or represents the behavior of
the population is considered to be the most appropriate one. The critical question is “How to identify and get
that best representative sample?” In fact, the whole aim of the theory of sampling is to answer this question.
Parameter: It is a measurable characteristic of the population, that is, a numerical result obtained by
measuring the population.
Statistic: It is a measurable characteristic of the sample. In short, it is a sample result.
Survey: A survey or experiment is a device for obtaining the desired data.
Statistical Design: Statistical design is a process that involves a decision problem and choosing an approach
to solving the problem. It is a guide that indicates how an investigation is going to be conducted.
Frame: It is the listing of all elementary units in the population under study. Strictly speaking, one cannot
present a frame for an infinite population, as the units in an infinite population are infinite.
1.2. Descriptive Statistics (Independent Review)
1.2.2. Statistical data:
1.2.2.1. Meaning of statistical data: Statistical data are the basic raw material of statistics. Data may
relate to an activity of our interest, a phenomenon, or a problem situation under study. They derive as a
result of the process of measuring, counting and/or observing. Statistical data, therefore, refer to those
aspects of a problem situation that can be measured, quantified, counted, or classified. Any object,
subject, phenomenon, or activity that generates data through this process is termed a variable. In other
words, a variable is one that shows a degree of variability when successive measurements are recorded.
1.2.2. Organization of descriptive data: Raw data, or data that have not been summarized in any way, are
sometimes referred to as ungrouped data. Data that have been organized into a frequency distribution are
called grouped data.
1.2.2.1. Tabular presentation: The classes that we use to construct frequency distribution tables of a
categorical variable are simply the possible responses to the categorical variable.
1.2.2.2. Frequency distribution: One particularly useful tool for grouping data is the frequency
distribution, which is a summary of data presented in the form of class intervals and frequencies.
When constructing a frequency distribution, the business researcher should:
First determine the range of the raw data. The range often is defined as the difference between the largest
(a) Mean: Perhaps the most important measure of location is the mean, or average value, for a variable. The
mean provides a measure of central location for the data. If the data are for a sample, the mean is denoted
by $\bar{X}$; if the data are for a population, the mean is denoted by the Greek letter μ.
When individual observations are given.
Let there be n observations X1, X2, ..., Xn. Their mean is calculated as
$$\bar{X} = \frac{\text{Sum of all observations}}{\text{Number of observations}} = \frac{\sum x_i}{n}$$
Where:
$\bar{X}$ = mean (average)
n = the number of data values in the sample
xi = the ith data value of the random variable x
Σxi = the sum of the n data values, i.e. x1 + x2 + x3 + x4 + … + xn
Illustration: The number of seminar training days attended last year by 20 financial advisors is shown in
the table below.
16 20 13 19 24 22 18 18 15 20
21 21 18 20 18 20 15 20 18 20
Required: What is the average number of training days attended by these financial advisors?
Solution: To find the average, sum the number of days for all 20 financial advisors (Σxi = 376) and divide
this total by the number of financial advisors (n = 20).
$$\bar{X} = \frac{\sum x_i}{n} = \frac{376}{20} = 18.8 \text{ days}$$
On average, each financial advisor attended 18.8 days of seminar training last year.
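The calculation above can be checked with a short Python sketch. The variable names (`training_days`, `mean_days`) are illustrative assumptions and are not part of the original notes.

```python
# Training days attended by the 20 financial advisors (from the table above).
training_days = [16, 20, 13, 19, 24, 22, 18, 18, 15, 20,
                 21, 21, 18, 20, 18, 20, 15, 20, 18, 20]

total = sum(training_days)   # sum of all observations
n = len(training_days)       # number of observations
mean_days = total / n        # mean = sum / n

print(total, n, mean_days)
```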
When data are in the form of an ungrouped frequency distribution
Let there be n distinct values X1, X2, ..., Xn, where X1 has occurred f1 times, X2 has occurred f2 times, ..., and Xn has
occurred fn times. Because each value now carries a weight (its frequency), the arithmetic mean is computed as the
frequency-weighted average Σfx / Σf rather than by the simple formula used earlier.
Illustration: The following is the frequency distribution of age of 670 students of a school.
Age in years (X) Frequency (f)
5 25
6 45
7 90
8 165
9 112
10 96
11 81
12 26
13 18
14 12
Total 𝚺f=670
Required: Compute the arithmetic mean of the data
Solution
Age in years (X) Frequency (f) fx
5 25 125
6 45 270
7 90 630
8 165 1,320
9 112 1,008
10 96 960
11 81 891
12 26 312
13 18 234
14 12 168
Total Σf=670 Σfx=5,918
$$\bar{X} = \frac{\sum fx}{\sum f} = \frac{5{,}918}{670} = 8.83 \text{ years}$$
The average age of the students in the school is 8.83 years.
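As a rough check of the frequency-weighted mean, the following Python sketch recomputes Σf, Σfx and the mean from the age table. The list names are illustrative assumptions.

```python
# Ages (X) and their frequencies (f) from the illustration above.
ages = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
freqs = [25, 45, 90, 165, 112, 96, 81, 26, 18, 12]

sum_f = sum(freqs)                                  # total number of students
sum_fx = sum(f * x for x, f in zip(ages, freqs))    # sum of f * x
mean_age = sum_fx / sum_f                           # weighted mean

print(sum_f, sum_fx, round(mean_age, 2))
```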
When data are in the form of a grouped frequency distribution
In a grouped frequency distribution, there are classes along with their respective frequencies. Let Li be the
lower limit and Ui be the upper limit of ith class. Further, let the number of classes be n, so that i = 1, 2 ...n.
Also let fi be the frequency of ith class. It may be recalled here that, in a grouped frequency distribution, we
only know the number of observations in a particular class interval and not their individual magnitudes.
Therefore, to calculate mean, we have to make a fundamental assumption that the observations in a class are
uniformly distributed. Under this assumption, the mid-value of a class will be equal to the mean of
observations in that class and hence can be taken as their representative. Therefore, if Xi is the mid-value of ith
class with frequency fi the above assumption implies that there are fi observations each with magnitude Xi (i =
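1 to n).
Illustration: The following table gives the distribution of weekly wages of workers in a factory.
Weekly wages 240-269 270-299 300-329 330-359 360-389 390-419 420-449
No. of workers 7 19 27 15 12 12 8
Taking an assumed mean A (here A = 344.5, the mid-value of the class 330-359) and writing d = X − A for the
deviation of each class mid-value from A, the mean is
$$\bar{X} = A + \frac{\sum fd}{n}$$
With n = Σf = 100 and Σfd = −780:
$$\bar{X} = 344.5 - \frac{780}{100} = 336.7$$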
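The assumed-mean calculation can be reproduced with the sketch below. It assumes the class mid-value is the average of the stated lower and upper limits and that A = 344.5 is the mid-value of the class 330-359; the variable names are illustrative.

```python
# Class limits and frequencies from the weekly-wage table.
class_limits = [(240, 269), (270, 299), (300, 329), (330, 359),
                (360, 389), (390, 419), (420, 449)]
workers = [7, 19, 27, 15, 12, 12, 8]

# Mid-value of each class, taken as the average of its limits (assumption).
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

A = 344.5                      # assumed mean: mid-value of class 330-359
n = sum(workers)               # total frequency
sum_fd = sum(f * (x - A) for x, f in zip(midpoints, workers))
mean_wage = A + sum_fd / n     # mean = A + (sum of f*d) / n

print(n, sum_fd, mean_wage)
```

Choosing a different assumed mean A changes Σfd but not the final mean, which is why any convenient mid-value can be used.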
b) Median: The median is another measure of central location. The median is the value in the middle when
the data are arranged in ascending order (smallest value to largest value) or descending order (largest value to
smallest value). With an odd number of observations, the median is the middle value. An even number of
observations has no single middle value. In this case, we follow convention and define the median as the
average of the values for the middle two observations.
Thus, in an ungrouped frequency distribution if the n values are arranged in ascending or descending order of
magnitude, the median is the middle value if n is odd. When n is even, the median is the mean of the two
middle values.
(a) When individual observations are given: The following steps are involved in the determination
of median:
The given observations are arranged in either ascending or descending order of magnitude.
Given that there are n observations, the median is given by:
The size of the $\left(\frac{n+1}{2}\right)$th observation, when n is odd.
The mean of the sizes of the $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2}+1\right)$th observations, when n is even.
Illustration: Find median of the following observations: 20, 15, 25, 28, 18, 16, 30
Solution: Writing the observations in ascending order, we get 15, 16, 18, 20, 25, 28, 30
Since n = 7, i.e., odd, the median is the size of the $\left(\frac{7+1}{2}\right)$th, i.e., 4th observation.
Hence, the median, denoted by Md, is 20.
Illustration: Find median of the data: 245, 230, 265, 236, 220, 250
Solution: Arranging these observations in ascending order of magnitude, we get 220, 230, 236, 245, 250, 265.
Here n = 6, i.e., even.
So the median will be the arithmetic mean of the sizes of the $\left(\frac{6}{2}\right)$th, i.e., 3rd, and $\left(\frac{6}{2}+1\right)$th, i.e., 4th observations.
Hence $$Md = \frac{236 + 245}{2} = 240.5$$
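The two rules for odd and even n can be sketched in Python as follows. The function name `median` and the result names are illustrative assumptions, and the code is only a minimal sketch of the steps described above.

```python
def median(values):
    """Median of individual observations: sort, then take the middle value(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)th observation (index mid when counting from 0)
        return ordered[mid]
    # n even: mean of the (n / 2)th and (n / 2 + 1)th observations
    return (ordered[mid - 1] + ordered[mid]) / 2

md_odd = median([20, 15, 25, 28, 18, 16, 30])
md_even = median([245, 230, 265, 236, 220, 250])
print(md_odd, md_even)
```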
(b) When ungrouped frequency distribution is given: In this case, the data are already arranged in the order
of magnitude. Here, cumulative frequency is computed and the median is determined in a manner similar
to that of individual observations.
Illustration : Locate median of the following frequency distribution:
Variable (x) 10 11 12 13 14 15 16
Frequency (f) 8 15 25 20 12 10 5
Solution
x f c.f.
10 8 8
11 15 23
12 25 48
13 20 68
14 12 80
15 10 90
16 5 95
Here n = 95, which is odd. Thus, the median is the size of the $\left(\frac{95+1}{2}\right)$th, i.e., 48th observation. From the
cumulative frequency column, the 48th observation is 12.
Md = 12.
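The cumulative-frequency lookup can be sketched in Python as below. It assumes n is odd, as in this illustration, and the variable names are illustrative.

```python
# Values and frequencies from the illustration above.
values = [10, 11, 12, 13, 14, 15, 16]
freqs = [8, 15, 25, 20, 12, 10, 5]

n = sum(freqs)
position = (n + 1) // 2        # position of the median observation (n is odd here)

cumulative = 0
median_value = None
for x, f in zip(values, freqs):
    cumulative += f            # running cumulative frequency (c.f.)
    if cumulative >= position:
        median_value = x       # first value whose c.f. reaches the position
        break

print(n, position, median_value)
```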
(c) When grouped frequency distribution is given: The determination of median, in this case, will be
explained with the help of the following example.
Illustration: Suppose we wish to find the median of the following frequency distribution.
Class 0-10 10-20 20-30 30-40 40-50 50-60
Frequency 5 12 14 18 13 8
Solution: The median of a distribution is that value which divides the distribution into two equal parts. In case
of a grouped frequency distribution, this implies that the ordinate drawn at the median divides the area under
the histogram into two equal parts. Writing the given data in a tabular form, we have:
Class Frequency (f) c.f.
(1) (2) (3)
0-10 5 5
10-20 12 17
20-30 14 31
30-40 18 49
40-50 13 62
50-60 8 70
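The rest of the worked solution is not shown in this section. As a hedged sketch of the idea stated above (the median cuts the area under the histogram into two equal parts), the code below applies the usual linear interpolation inside the median class. The interpolation formula, the class boundaries 0-10, 10-20, ..., 50-60 and all names are assumptions made for illustration, not the notes' own solution.

```python
# Class boundaries and frequencies from the illustration (boundaries assumed 0-10, ..., 50-60).
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [5, 12, 14, 18, 13, 8]

def grouped_median(classes, freqs):
    """Value that splits the histogram area in half, assuming observations
    are spread evenly within each class (linear interpolation)."""
    half = sum(freqs) / 2
    cumulative = 0
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f >= half:
            # Median class found: move into it in proportion to the area still needed.
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("empty distribution")

median_grouped = grouped_median(classes, freqs)
print(round(median_grouped, 2))
```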
In the words of Croxton and Cowden, “The mode of a distribution is the value at the point around
which the items tend to be most heavily concentrated. It may be regarded as the most typical of a series
of values.”
Further, according to A.M. Tuttle, “Mode is the value which has the greatest frequency density in its
immediate neighborhood.”
The concept of mode, as a measure of central tendency, is preferable to mean and median when it is desired to
know the most typical value, e.g., the most common size of shoes, the most common size of a ready-made
garment, the most common size of income, the most common size of pocket expenditure of a college student,
the most common size of a family in a locality, the most popular candidate in an election, etc.
(a) When data are either in the form of individual observations or in the form of ungrouped frequency
distribution
Illustration : Compute mode of the following data
3, 4, 5, 10, 15, 3, 6, 7, 9, 12, 10, 16, 18,
20, 10, 9, 8, 19, 11, 14, 10, 13, 17, 9, 11
Solution: To determine the mode, we write the data in the form of a frequency distribution:
Values 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Frequency 2 1 1 1 1 1 3 4 2 1 1 1 1 1 1 1 1 1
Therefore, the mode is 10, the value with the highest frequency (4).
Note:
If the frequency of each possible value of the variable is the same, there is no mode.
If there are two values having the maximum frequency, the distribution is said to be bimodal.
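A minimal Python sketch of this tally, using the standard-library `Counter`, is given below. The names `data`, `counts` and `modes` are illustrative; a bimodal data set would simply yield two entries in `modes`.

```python
from collections import Counter

# Observations from the illustration above.
data = [3, 4, 5, 10, 15, 3, 6, 7, 9, 12, 10, 16, 18,
        20, 10, 9, 8, 19, 11, 14, 10, 13, 17, 9, 11]

counts = Counter(data)                 # frequency of each distinct value
highest = max(counts.values())         # the maximum frequency
modes = sorted(v for v, c in counts.items() if c == highest)

print(highest, modes)
```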
(b) In the case of grouped data: In the case of grouped data, mode is determined by the following formula:
$$\text{Mode} = L_1 + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times i$$
Where:
L1 = the lower value of the class in which the mode lies
f1 = the frequency of the class in which the mode lies
f0 = the frequency of the class preceding the modal class
f2 = the frequency of the class succeeding the modal class
i = the class-interval of the modal class
While applying the above formula, we should ensure that the class-intervals are uniform throughout. If the
class-intervals are not uniform, then they should be made uniform on the assumption that the frequencies are
evenly distributed throughout the class. In the case of unequal class-intervals, the application of the above
formula will give misleading results.
Illustration : Let us take the following frequency distribution:
Class intervals (1) Frequency(2)
31-40 4
41-50 6
51-60 8
61-70 12
71-80 9
81-90 7
91-100 4
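The worked solution for this table is not shown in this section. The sketch below only illustrates how the formula above could be applied to it. It assumes L1 is the stated lower limit of the modal class (61); some textbooks use the class boundary 60.5 instead, which would shift the result by 0.5. All names are illustrative.

```python
# Lower class limits and frequencies from the table above; class width is 10.
lower_limits = [31, 41, 51, 61, 71, 81, 91]
freqs = [4, 6, 8, 12, 9, 7, 4]
width = 10

k = freqs.index(max(freqs))                      # index of the modal class
f1 = freqs[k]                                    # frequency of the modal class
f0 = freqs[k - 1] if k > 0 else 0                # frequency of the preceding class
f2 = freqs[k + 1] if k < len(freqs) - 1 else 0   # frequency of the succeeding class
L1 = lower_limits[k]                             # lower value of the modal class (assumption: stated limit)

mode_grouped = L1 + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * width
print(L1, f0, f1, f2, round(mode_grouped, 2))
```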
Solution : In each of these three sets, the highest number is 15 and the lowest number is 5. Since the range is
the difference between the maximum value and the minimum value of the data, it is 10 in each case. But the
range fails to give any idea about the dispersal or spread of the series between the highest and the lowest
value. This becomes evident from the above data.
Note: In a frequency distribution, range is calculated by taking the difference between the upper limit of
the highest class and the lower limit of the lowest class.
Illustration : Find the range for the following frequency distribution:
Size of item Frequency
20-40 7
40-60 11
60-80 30
80-100 17
100-120 5
Total 70
Solution : Here, the upper limit of the highest class is 120 and the lower limit of the lowest class is 20. Hence,
the range is 120 - 20 = 100. Note that the range is not influenced by the frequencies.
b) Variance: The variance is a measure of variability that utilizes all the data. The variance is based on the
difference between the value of each observation (xi) and the mean. The difference between each xi and the
mean is called a deviation about the mean. For a sample, a deviation about the mean is written (xi-x̅ ); for a
population, it is written (xi - μ). In the computation of the variance, the deviations about the mean are squared.
If the data are for a population, the average of the squared deviations is called the population variance. The
population variance is denoted by the Greek symbol σ². For a population of N observations and with μ
denoting the population mean, the definition of the population variance is as follows.
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$
The sample variance, denoted by s², is defined as follows.
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
Illustration: The following data are the class sizes for a sample of five college classes: 46, 54, 42, 46, 32
Required: Calculate the variance.
Solution
No. of students in class (xi)   Mean class size (x̄)   Deviation about the mean (xi − x̄)   Squared deviation (xi − x̄)²
46                              44                    2                                   4
54                              44                    10                                  100
42                              44                    -2                                  4
46                              44                    2                                   4
32                              44                    -12                                 144
Total                                                 Σ(xi − x̄) = 0                       Σ(xi − x̄)² = 256
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{256}{5 - 1} = 64$$
Note: I recommend that you think of the variance as a measure useful in comparing the amount of
variability for two or more variables. In a comparison of the variables, the one with the
largest variance shows the most variability. Further interpretation of the value of the variance may
not be necessary.
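The class-size illustration can be recomputed with the sketch below, which follows the same steps as the solution table and then compares the result with `statistics.variance` from the Python standard library (which also divides by n − 1). Variable names are illustrative.

```python
import statistics

# Class sizes for the sample of five college classes.
class_sizes = [46, 54, 42, 46, 32]

n = len(class_sizes)
mean_size = sum(class_sizes) / n                       # sample mean
deviations = [x - mean_size for x in class_sizes]      # deviations about the mean
squared_dev = [d ** 2 for d in deviations]             # squared deviations
sample_variance = sum(squared_dev) / (n - 1)           # divide by n - 1 for a sample

print(mean_size, sum(deviations), sum(squared_dev), sample_variance)
print(statistics.variance(class_sizes))                # same n - 1 definition
```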
c) Standard deviation: The standard deviation is a number that summarizes how far away from the average
the data values typically are. The standard deviation is a very important concept in statistics since it is the
basic tool for summarizing the amount of randomness in a situation. Specifically, it measures the extent of
randomness of individuals about their average.
The population standard deviation:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$
The sample standard deviation:
$$s = \sqrt{s^2} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$
Illustration: The following are the marks obtained by six students: 20, 15, 19, 24, 16, 14
Solution: The mean of the marks is μ = 108/6 = 18.
Marks of students (x)   x − μ           (x − μ)²
20                      20 − 18 = 2     4
15                      15 − 18 = -3    9
19                      19 − 18 = 1     1
24                      24 − 18 = 6     36
16                      16 − 18 = -2    4
14                      14 − 18 = -4    16
Total                                   Σ(x − μ)² = 70
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} = \sqrt{\frac{70}{6}} = 3.42$$
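A short Python sketch of the same population standard deviation calculation follows. The variable names are illustrative assumptions.

```python
import math

# Marks obtained by the six students, treated as a population.
marks = [20, 15, 19, 24, 16, 14]

N = len(marks)
mu = sum(marks) / N                               # population mean
sum_sq = sum((x - mu) ** 2 for x in marks)        # sum of squared deviations
sigma = math.sqrt(sum_sq / N)                     # population standard deviation

print(mu, sum_sq, round(sigma, 2))
```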
In the case of a frequency distribution where the individual values are not known, we use the midpoints of the
class intervals. Thus, the formula used for calculating the standard deviation is as given below:
$$\sigma = \sqrt{\frac{\sum f_i d_i^2}{N} - \left(\frac{\sum f_i d_i}{N}\right)^2} \times c$$
Where
fi = the frequency of the ith class
di = the deviation of the ith class midpoint from an assumed origin, expressed in class-interval units
N = the total number of observations
c = the class interval
Illustration: The following is a sample distribution relating to marks obtained by students in an examination:
Marks 0-10 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90 90-100
No. of students 1 3 6 10 12 11 6 3 2 1
Required: calculate the standard deviation
Solution
Marks     Mid-point (mi)   Frequency (fi)   di = (mi − 55)/10   fi×di   fi×di²
0-10      5                1                -5                  -5      25
10-20     15               3                -4                  -12     48
20-30     25               6                -3                  -18     54
30-40     35               10               -2                  -20     40
40-50     45               12               -1                  -12     12
50-60     55               11               0                   0       0
60-70     65               6                1                   6       6
70-80     75               3                2                   6       12
80-90     85               2                3                   6       18
90-100    95               1                4                   4       16
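The remaining steps of this solution are not shown in this section. The sketch below applies the grouped-data formula given above to the table, using the assumed origin 55 and class interval c = 10 that appear in the table header. Variable names are illustrative, and the code is a sketch rather than the notes' own working.

```python
import math

# Class midpoints and frequencies from the marks table.
midpoints = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
freqs = [1, 3, 6, 10, 12, 11, 6, 3, 2, 1]
origin = 55     # assumed origin used in the table
c = 10          # class interval

d = [(m - origin) / c for m in midpoints]                 # step deviations d_i
N = sum(freqs)                                            # total number of observations
sum_fd = sum(f * di for f, di in zip(freqs, d))           # sum of f_i * d_i
sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))     # sum of f_i * d_i^2

sigma_grouped = math.sqrt(sum_fd2 / N - (sum_fd / N) ** 2) * c
print(N, sum_fd, sum_fd2, round(sigma_grouped, 2))
```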
$$\text{CV for B} = \frac{\sigma}{\mu} \times 100 = \frac{10}{45} \times 100 = 22.2\%$$
These calculations clearly indicate that although typist B types out more pages, there is a greater variation in
his output as compared to that of typist A. We can say this in a different way: Though typist A's daily output
is much less, he is more consistent than typist B. The usefulness of the coefficient of variation becomes clear
in comparing two groups of data having different means.
CHAPTER TWO
Probability and Probability Distribution
2.1. Basic definitions of probability: Managers often base their decisions on an analysis of uncertainties such
as the following:
1. What are the chances that sales will decrease if we increase prices?
2. What is the likelihood a new assembly method will increase productivity?
3. How likely is it that the project will be finished on time?
4. What is the chance that a new investment will be profitable?
Probability is a numerical measure of the likelihood that an event will occur. Thus, probabilities can be used
as measures of the degree of uncertainty associated with the four events previously listed. If probabilities are
available, we can determine the likelihood of each event occurring.
Probability is a number between zero and one, inclusive. A probability of 0 represents something that cannot
happen, and a probability of 1 represents something that is certain to happen. The closer a probability is to
zero, the more improbable it is that something will happen; the closer the probability is to one, the more sure
we are that it will happen. When the probability is 0.5, uncertainty reaches its maximum.
The probability of an event A is computed as
$$P(A) = \frac{m}{n}$$
Where:
m → Number of outcomes favorable to A
n → Number of exhaustive outcomes
Illustration : What is the probability of obtaining at least one head in the simultaneous toss (throw) of two
unbiased coins?
Solution: The equally likely, mutually exclusive and exhaustive outcomes of the experiment are (H, H), (H,
T), (T, H) and (T, T), where H denotes a head and T denotes a tail. Thus, n = 4.
Let A be the event that at least one head occurs. This event corresponds to the first three outcomes of the random
experiment. Therefore, m = 3.
Hence, the probability that A occurs is $P(A) = \frac{3}{4}$.
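The counting argument can be illustrated by enumerating the outcomes in Python. This is a minimal sketch that assumes the four outcomes are equally likely, as stated in the solution; the names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All outcomes of tossing two coins: (H, H), (H, T), (T, H), (T, T).
outcomes = list(product("HT", repeat=2))

# Outcomes favorable to A = "at least one head".
favorable = [o for o in outcomes if "H" in o]

n = len(outcomes)      # number of exhaustive outcomes
m = len(favorable)     # number of outcomes favorable to A
p_at_least_one_head = Fraction(m, n)

print(m, n, p_at_least_one_head)
```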
How to Interpret Probability: Mathematically, the probability that an event will occur is expressed as a
number between 0 and 1. Notationally, the probability of event A is represented by P(A).
If P(A) equals zero, there is no chance that the event A will occur.
If P(A) is close to zero, there is little likelihood that event A will occur.
If P(A) is close to one, there is a strong chance that event A will occur.
If P(A) equals one, event A will definitely occur.
The sum of the probabilities of all possible outcomes in a statistical experiment is equal to one. This means, for example, that if
an experiment can have three possible outcomes (A, B, and C), then P(A) + P(B) + P(C) = 1.
2.2. Fundamental concepts: In the study of probability, developing a language of terms and symbols is
helpful. The structure of probability provides a common framework within which the topics of probability can
be explored.
A. Experiment: “An experiment is a process that leads to one of several possible outcomes. An outcome of
an experiment is some observation or measurement.”
Any action, whether it is the drawing of a card out of a deck of 52 cards, or reading the temperature, or the
measurement of a product's dimension to ascertain quality, or the launching of a new product in the market,
constitutes an experiment in the terminology of probability theory.
The experiments in probability theory have three things in common:
there are two or more outcomes of each experiment
it is possible to specify the outcomes in advance
there is uncertainty about the outcomes
B. Sample space: The sample space is an exhaustive list of all the possible outcomes of an experiment. Each
possible result of such a study is represented by one and only one point in the sample space, which is usually
denoted by S.
Examples:
Experiment rolling a die once:
Sample space S = {1, 2, 3, 4, 5, 6}
Experiment Tossing a coin:
Sample space S = {Heads, Tails}
Experiment Measuring the height (cms) of a girl on her first day at school:
Sample space S = the set of all possible real numbers
C. Event: It is a collection of one or more outcomes of an experiment, or any experimental outcome that may
or may not occur. If the experiment is tossing a coin, the events are Head or Tail.
Set theory is used to represent relationships among events. In general, if A and B are two events in the sample
space S, then
(A ∪ B) (A union B) = 'either A or B occurs or both occur'
(A ∩ B) (A intersection B) = 'both A and B occur'
S (the sample space) = an event that is certain to occur
Example: [Figure: Venn diagram of two events A and B, with the overlapping region labelled A ∩ B]
Illustration :Suppose your chance of being offered a certain job is 0.45, your probability of getting another job is
0.55, and your probability of being offered both jobs is 0.30. What is the probability that you will be offered at
least one of the two jobs?
Solution: Let A be the event that the first job is offered and B the event that the second job is offered.
Then P(A) = 0.45, P(B) = 0.55, and P(A ∩ B) = 0.30.
So the required probability is given by:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
= 0.45 + 0.55 − 0.30
= 0.70
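The addition rule used in this solution can be written as a small helper. The function name `prob_union` is a hypothetical name chosen for illustration.

```python
def prob_union(p_a, p_b, p_a_and_b=0.0):
    """P(A or B) = P(A) + P(B) - P(A and B).

    Leaving p_a_and_b at 0 gives the special case of mutually exclusive events.
    """
    return p_a + p_b - p_a_and_b

# Job-offer illustration: P(A) = 0.45, P(B) = 0.55, P(A and B) = 0.30.
p_at_least_one_job = prob_union(0.45, 0.55, 0.30)
print(round(p_at_least_one_job, 2))
```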
2) Addition Rule for Two Mutually Exclusive Events: Two events are said to be mutually exclusive if they have
no sample space outcomes in common. In this case the events A and B cannot occur simultaneously, and thus
P(A ∩ B) = 0
Let A and B be mutually exclusive events. Then the probability that either A or B will occur is
P(A ∪ B) = P(A) + P(B)
Illustration : A card is drawn from a well-shuffled pack of playing cards. Find the probability that the card
drawn is either a king or a queen.
Solution: Let A be the event that a king is drawn and
B the event that a queen is drawn.
Since A and B are two mutually exclusive events, we have
P(A ∪ B) = P(A) + P(B)
= 4/52 + 4/52
= 8/52
= 2/13
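As a rough check, the sketch below builds a standard 52-card deck, confirms that the king and queen events share no cards, and adds their probabilities. The rank and suit labels are illustrative assumptions about how the deck is represented.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]   # 52 cards

kings = {card for card in deck if card[0] == "K"}     # event A
queens = {card for card in deck if card[0] == "Q"}    # event B

# Mutually exclusive: no card is both a king and a queen.
mutually_exclusive = kings.isdisjoint(queens)

p_king_or_queen = Fraction(len(kings), len(deck)) + Fraction(len(queens), len(deck))
print(mutually_exclusive, p_king_or_queen)
```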
2.2.1.2. Rule of Subtraction: In a previous lesson, we learned two important properties of probability:
The probability of an event ranges from 0 to 1.
The sum of probabilities of all possible events equals 1.
The rule of subtraction follows directly from these properties.
Rule of Subtraction: The probability that event A will occur is equal to 1 minus the probability that
event A will not occur.
P(A) = 1 − P(A')
Illustration: Suppose the probability that Bill will graduate from college is 0.80. What is the probability that
Bill will not graduate from college? Based on the rule of subtraction, the probability that Bill will not graduate
is 1.00 - 0.80 or 0.20.
2.2.1.3. Rule of Multiplication: The rule of multiplication applies to the situation when we want to know the
probability of the intersection of two events; that is, we want to know the probability that two events (Event A
and Event B) both occur.
The probability that Events A and B both occur is equal to the probability that Event A occurs times the
probability that Event B occurs, given that A has occurred.
P (A ∩ B) = P (A) P (B|A)
Illustration : A box contains 6 red marbles and 4 black marbles. Two marbles are drawn without replacement
from the box. What is the probability that both of the marbles are black?
Solution: Let A = the event that the first marble is black; and
B =the event that the second marble is black. We know the following:
In the beginning, there are 10 marbles in the box, 4 of which are black. Therefore, P(A) = 4/10.
After the first selection, there are 9 marbles in the box, 3 of which are black. Therefore, P(B|A) =
3/9.
Therefore, based on the rule of multiplication:
P (A ∩ B) = P (A) P (B|A)
P (A ∩ B) = (4/10)*(3/9) = 12/90 = 2/15
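The multiplication rule can be cross-checked by listing every ordered pair of draws without replacement. The sketch below assumes each ordered pair is equally likely; the names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# Box contents: 6 red and 4 black marbles.
marbles = ["red"] * 6 + ["black"] * 4

# Rule of multiplication: P(A and B) = P(A) * P(B|A).
p_rule = Fraction(4, 10) * Fraction(3, 9)

# Enumeration: all ordered draws of two distinct marbles.
draws = list(permutations(range(len(marbles)), 2))
both_black = [d for d in draws
              if marbles[d[0]] == "black" and marbles[d[1]] == "black"]
p_enumerated = Fraction(len(both_black), len(draws))

print(p_rule, p_enumerated)
```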
E) Conditional probability: A conditional probability is the probability of event A occurring, given that
event B has already occurred. It is written as P(A|B). Often, the probability of an event is influenced by
whether a related event has already occurred. Suppose we have an event A with probability P(A). If we obtain
new information and learn that a related event, denoted by B, has already occurred, we will want to take
advantage of this information by calculating a new probability for event A. This new probability of event A is
called a conditional probability and is written P(A|B). We use the vertical bar to indicate that we are
considering the probability of event A given the condition that event B has occurred. Hence, the notation
P(A|B) reads “the probability of A given B.”
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
Where
P(A|B) = the probability of A given B (defined when P(B) > 0). We often refer to such a probability as the conditional
probability of A given B.
P(A ∩ B) = the (unconditional) probability that event A and event B both occur
P(B) = the (unconditional) probability that event B occurs
Illustration: Suppose that we randomly select a household, and that the chosen household reports that it
subscribes to Herald. Given this new information, we wish to find the probability that this household
subscribes to Addis Zemen. The new probability is called a conditional probability.
The probability of the event A (the household subscribes to Addis Zemen), given the condition that the event H
(the household subscribes to Herald) has occurred, is written P(A|H).
In order to find the conditional probability that a household subscribes to Addis Zemen given that it
subscribes to Herald, we know that we are considering one of 500,000 households. Since 250,000 of these
500,000 Herald subscribers also subscribe to Addis Zemen, we have
$$P(A|H) = \frac{250{,}000}{500{,}000} = 0.5$$
i.e., 50% of the Herald subscribers also subscribe to Addis Zemen.
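The subscriber example can be tied back to the formula P(A|H) = P(A ∩ H) / P(H) with the sketch below. The total number of households is not given in this section, so `total_households` is a hypothetical value used only to show that it cancels out of the ratio.

```python
from fractions import Fraction

herald_subscribers = 500_000    # households subscribing to Herald (event H)
both_subscribers = 250_000      # households subscribing to both papers (A and H)
total_households = 1_000_000    # hypothetical total; it cancels in the ratio

p_h = Fraction(herald_subscribers, total_households)        # P(H)
p_a_and_h = Fraction(both_subscribers, total_households)    # P(A and H)
p_a_given_h = p_a_and_h / p_h                               # P(A|H)

print(p_a_given_h, float(p_a_given_h))
```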
F) Joint probability: A joint probability is the probability that both event A and event B will occur
simultaneously on a single trial of a random experiment. The joint probability of events A and B occurring is
denoted P(A ∩ B). Sometimes P(A ∩ B) is read as the probability of A and B. To qualify for the intersection,
both events must occur. An example of joint probability is the probability of a person owning both a Ford and
a Chevrolet. Owning one type of car is not sufficient. A second example of joint probability is the probability
that a person is a redhead and wears glasses.
2.3. Definitions of probability distribution: A probability distribution is a listing of all possible values of the
random variable with their corresponding probabilities. The outcome of the experiment is either a success or
failure. The number of ways to get a certain number of successes will determine the value that the random
variable will assume.
Illustration : A multinational bank is concerned about the waiting time of its customers for using their
ATMs. A study of a random sample of 500 customers reveals the following probability distribution:
x(waiting time /customer in minutes) 0 1 2 3 4 5 6 7 8
P(x) 0.20 0.18 0.16 0.12 0.10 0.09 0.08 0.04 0.03
Required:
a) What is the probability that a customer will have to wait for more than 5 minutes?
b) What is the probability that a customer will have to wait for less than 4 minutes?
Solution:
a) P(x > 5) = P(6) + P(7) + P(8) = 0.08 + 0.04 + 0.03 = 0.15
b) P(x < 4) = P(0) + P(1) + P(2) + P(3) = 0.20 + 0.18 + 0.16 + 0.12 = 0.66
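The two answers can be reproduced by storing the distribution as a dictionary and summing the relevant probabilities. This is a minimal sketch; the name `waiting_pmf` is an illustrative assumption.

```python
# Probability distribution of waiting time x (in minutes) from the table above.
waiting_pmf = {0: 0.20, 1: 0.18, 2: 0.16, 3: 0.12, 4: 0.10,
               5: 0.09, 6: 0.08, 7: 0.04, 8: 0.03}

total_probability = sum(waiting_pmf.values())                     # should be 1
p_more_than_5 = sum(p for x, p in waiting_pmf.items() if x > 5)   # P(x > 5)
p_less_than_4 = sum(p for x, p in waiting_pmf.items() if x < 4)   # P(x < 4)

print(round(total_probability, 2), round(p_more_than_5, 2), round(p_less_than_4, 2))
```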
2.4. Basic Concepts
2.4.1. Random variable: A random variable is a variable that contains the outcomes of a chance
experiment. For example, suppose an experiment is to measure the arrivals of automobiles at a parking lot
during a 30-second period. The possible outcomes are: 0 cars, 1 car, 2 cars… n cars. These numbers (0, 1, 2 . .
. n) are the values of a random variable. Suppose another experiment is to measure the time between the
completions of two tasks in a production line. The values will range from 0 seconds to n seconds. These time
measurements are the values of another random variable. The two categories of random variables are (1)
discrete random variables and (2) continuous random variables.
2.4.1.1. Discrete Random Variable: A random variable is a discrete random variable if the set of all its
possible values is finite or countably infinite. In most statistical
situations, discrete random variables produce values that are nonnegative whole numbers.
For example, if six people are randomly selected from a population and how many of the six are left-handed
is to be determined, the random variable produced is discrete. The only possible numbers of left-handed
people in the sample of six are 0, 1, 2, 3, 4, 5, and 6. There cannot be 2.75 left-handed people in a group of six
people; obtaining non-whole number values is impossible.
Examples:
The number of employees absent on a given day.
The number of heads when two coins are tossed.
The number of defective products produced in a factory in a given shift, day, or month.
The number of customers entering a bank in an hour.
It could be said that discrete random variables are usually generated from experiments in which things
are “counted” not “measured.”
2.4.2. Continuous Random Variable: A random variable that may assume any numerical value in an interval
or collection of intervals is called a continuous random variable. Experimental outcomes based on
measurement scales such as time, weight, distance, and temperature can be described by continuous random
variables.
For example, if a person is assembling a product component, the time it takes to accomplish that feat could
be any value within a reasonable range such as 3 minutes 36.4218 seconds or 5 minutes 17.5169 seconds.
It could be said that continuous random variables are generated from experiments in which things are
“measured” not “counted.”
Examples:
The distance between two cities.
The weight of a person.
The rate of return on an investment.
The time that a customer must wait to receive his change.
Discrete Probability Distributions: The values assumed by a discrete random variable depend upon the
outcome of an experiment. Since the outcome of the experiment is uncertain, the value assumed by the
random variable is also uncertain.
The probability distribution of a discrete random variable is a table, graph, or formula that gives the
probability associated with each possible value that the random variable can assume. When the values of a
discrete random variable are organized in this way, the distribution is called a discrete probability
distribution. In this unit we will discuss three types of discrete probability distribution.
Illustration: Probability distribution of the random variable X = number of boys in three births.
Illustration: A car dealer has established the following probability distribution for the number of cars he
expects to sell on a particular Saturday.
Number of cars sold (X) Probability P(x)
0 0.10
1 0.20
2 0.30
3 0.30
4 0.10
Sum=1
Required: On a typical Saturday, how many cars should the dealer expect to sell?
Solution: The expected number of cars sold is the mean of the distribution:
$\mu = E(x) = \sum [x \, p(x)]$
$\mu = 0(0.1) + 1(0.2) + 2(0.3) + 3(0.3) + 4(0.1) = 2.1$
For the car dealer, find the variance and standard deviation.
x     p(x)   (x - μ)              (x - μ)²   (x - μ)² p(x)
0     0.1    0 - 2.10 = -2.10     4.41       0.441
1     0.2    1 - 2.10 = -1.10     1.21       0.242
2     0.3    2 - 2.10 = -0.10     0.01       0.003
3     0.3    3 - 2.10 =  0.90     0.81       0.243
4     0.1    4 - 2.10 =  1.90     3.61       0.361
Sum   1.0                                    1.290
$\sigma^2 = \sum [(x - \mu)^2 \, p(x)] = 1.29$
$\sigma = \sqrt{1.29} = 1.136$ cars
On a typical Saturday, the dealer expects to sell 2.1 cars, with a standard deviation of 1.136 cars.
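The mean and variance of the car dealer's distribution can be recomputed with a few lines of Python. This is a sketch only; the function names `mean_of` and `variance_of` are illustrative assumptions.

```python
import math

# Car dealer's distribution from the illustration: cars sold -> probability.
cars_sold_dist = {0: 0.10, 1: 0.20, 2: 0.30, 3: 0.30, 4: 0.10}

def mean_of(dist):
    """Expected value: sum of x * p(x)."""
    return sum(x * p for x, p in dist.items())

def variance_of(dist):
    """Variance: sum of (x - mu)^2 * p(x)."""
    mu = mean_of(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

mu = mean_of(cars_sold_dist)
variance = variance_of(cars_sold_dist)
std_dev = math.sqrt(variance)

print(round(mu, 2), round(variance, 2), round(std_dev, 3))
```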
2.5. The Normal (continuous) probability distribution: As noted earlier in this unit, a continuous random
variable is one that can assume an infinite number of possible values within a specified range. It usually
results from measuring something.
The most important probability distribution for describing a continuous random variable is the normal
probability distribution. The normal distribution has been used in a wide variety of practical applications in
which the random variables are heights and weights of people, test scores, scientific measurements, amounts
of rainfall, and other similar values. In such applications, the normal distribution provides a description of the
likely results obtained through sampling.
CHAPTER - THREE
SAMPLING AND SAMPLING DISTRIBUTIONS
3.1. INTRODUCTION
Sampling is as common and important in statistics as salt is in food. At home, a cook takes out one teaspoonful
to check the quality of what is being cooked. In medical sciences, a few drops of blood are taken and tested
microscopically or chemically to know whether the blood contains some abnormalities or not.
Nowadays, sampling methods are extensively used in socio-economic surveys to know the living condition,
cost of living index etc. of a class of people. In biological studies, experiments are conducted on some units
(persons, animals or plants) and inferences are drawn about the breed or variety to which the units belong. In
the industries sampling procedures are predominantly used for quality control.
Sampling theory is the study of relationships existing between a population and samples drawn from the
population.
3.2. SOME CONCEPTS ASSOCIATED WITH SAMPLING
Population or study population: - the individuals, groups, communities, or societies from which you select a few
in order to find answers to your research questions. The population size is usually denoted by the letter N.
Sample: - the small group from whom you collect the required information to estimate the prevalence of the issue.
Statistic: - a measurable characteristic value of the sample.
Sampling: - the process of selecting a few (a sample) from a bigger group (the sampling population) to
become the basis for estimating or predicting a fact, situation, or outcome regarding the bigger group.
Sampling frame: - a list identifying each unit in the study population.
Parameter: - a measurable characteristic value of the population. It is
a population result.
Sampling design: - a definite plan for obtaining a sample from the sampling frame. It
refers to the technique or procedure that one would adopt in selecting the sampling units
from which inferences about the population are drawn. The sampling design is determined before any
data are collected. Sampling techniques are divided into two: random sampling and non-random
sampling.
Sampling error: - the difference between a sample statistic and its corresponding population parameter.
Population distribution: - the distribution of the individual measurements of a population.
Sampling distribution: - the probability distribution of a sample statistic.
3.3 The Need For Sampling
The following points summarize the benefits of studying samples.
a. There could be resources (time, finance, manpower etc.) limitations which would make it difficult to study
the whole population.
b. In some cases, tests may be destructive. For example, when we test the breaking strength of materials, we
must destroy them. A census would mean complete destruction of the materials. In such cases, we must
sample.
c. Sampling provides much quicker results than does a census. When the time between recognizing the
need for information and the moment it must be available is short, sampling makes it possible to obtain the
information in time.
d. Sampling is the only process possible if the population is infinite.
e. There is also an argument that the quality of a study is often better with sampling than with a census. The
basis of the argument is that sampling possesses the possibilities of better interviewing; more thorough
investigation of missing, wrong, or suspicious information, better supervision, and better processing than
is possible with complete coverage.
3.4 Types of Sampling Techniques
Sampling technique refers to the method of selecting a sample from the universe (population). The right type
of sampling technique is of paramount importance in the execution of a sample survey in accordance with the
objectives and the scope of the inquiry. The sampling methods may be classified:
1. Random/Probability sampling
2. Non – Random/Non-probability sampling
1. RANDOM (PROBABILITY) SAMPLING
Random sampling method is a method of selection of a sample such that each item within the population has
equal chance of being selected.
In this method, there is no place for investigator’s bias in sample selection since it depends on probability. It
provides more accurate estimates in the sense of greater precision. There are three commonly used types of
probability sampling.
a. Simple Random Sampling Method (SRSM):- involves very simple method of drawing a sample from a
given population. The selection of samples is random in character. The oldest method adopted in simple
random sampling is the use of lottery system.
Suppose the population size is 100 and the sample size is 10, i.e. N = 100 and n = 10. One hundred chits would be
prepared bearing the serial numbers of the units in the universe. These chits would be put together and shuffled
thoroughly, and then ten would be drawn one by one. The sampling units corresponding to the numbers on
the selected chits will form a random sample. This method gives a sample that is independent of
the nature of the universe. It is still commonly used in practice.
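The lottery procedure can be imitated on a computer. The sketch below uses Python's standard `random` module; the function name `lottery_sample` and the optional `seed` argument are illustrative assumptions, added only so that a draw can be repeated.

```python
import random

def lottery_sample(population_size, sample_size, seed=None):
    """Draw `sample_size` distinct serial numbers from 1..population_size."""
    rng = random.Random(seed)                 # a seed makes the draw repeatable
    chits = range(1, population_size + 1)     # one chit per unit in the universe
    # random.sample draws without replacement, like pulling chits one by one.
    return sorted(rng.sample(chits, sample_size))

sample = lottery_sample(100, 10, seed=1)
print(sample)
```

Because the draw is without replacement, no unit can appear twice in the sample, which matches drawing chits from a box.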
b. Stratified Random Sampling Method (STRSM)
Under this method, the whole population is divided into a number of homogeneous groups or strata. From
each of these strata, a random sample is selected. Thus, stratified random sampling means selecting a number of
random samples, one from each stratum of the universe. It is used when each group has small variation within
itself but wide variation between the groups.
The sample may be either proportionate or disproportionate. With proportionate stratified sampling, the
number of elements selected from each stratum is in proportion to that stratum's share of the total population, whereas
in disproportionate stratified sampling, consideration is not given to the size of the stratum. Suppose the
universe is divided into two groups consisting of 100 and 160 units respectively, and the sample taken from each
group is 10% of that group. Then a sample of size 10 + 16 = 26 is drawn in proportion to the number
of items in each group. In disproportionate stratified random sampling, samples are taken from each stratum regardless of the
number of units in it. Thus, in the above example, an equal number of units, i.e. 13 from each
stratum, may be drawn, so that the total number of items in the sample is again 26.
- The number of sample items to be selected from the ith stratum is denoted by $n_i$ and is given by
$n_i = \frac{n N_i}{N}$
where n = sample size
N = population size
$N_i$ = size of the ith stratum
Example: In Ethio-Smart College a survey is to be conducted on 120 students’ tendency towards Accounting
and Finance. The total number of students in each field is as indicated below. Give the sample size of each
field of study.
Stratum (Ni)      N1              N2           N3     N4           N5
Field of study    E/Engineering   Management   IT     Accounting   Automotives
No. of students   3000            2000         1500   2500         1000
Solution: n = 120
N1 = 3000, N2 = 2000, N3 = 1500, N4 = 2500, N5 = 1000
N = N1 + N2 + N3 + N4 + N5 = 10,000
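Then, $n_1 = \frac{n N_1}{N} = \frac{120 \times 3000}{10{,}000} = 36$ and $n_2 = \frac{n N_2}{N} = \frac{12}{1000} \times 2000 = 24$
Or, since $\frac{n}{N} = \frac{120}{10{,}000} = 0.012$, $n_1 = \frac{n}{N} \times N_1 = 0.012 \times 3000 = 36$
$n_3 = \frac{12}{1000} \times 1500 = 18$, $n_4 = \frac{12}{1000} \times 2500 = 30$, $n_5 = \frac{12}{1000} \times 1000 = 12$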
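The proportional allocation in this example can be reproduced programmatically. The following Python sketch assumes the stratum sizes given in the table; the function name `proportional_allocation` is illustrative.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Return n_i = n * N_i / N for each stratum."""
    total = sum(strata_sizes.values())        # N, the population size
    return {name: sample_size * size / total
            for name, size in strata_sizes.items()}

strata = {
    "E/Engineering": 3000,
    "Management": 2000,
    "IT": 1500,
    "Accounting": 2500,
    "Automotives": 1000,
}
allocation = proportional_allocation(strata, 120)
for field, n_i in allocation.items():
    print(field, n_i)
```

In this example every n_i is a whole number. When it is not, the values have to be rounded, and the rounded values should be checked so that they still add up to n.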
c. Systematic Random Sampling Method (SYRSM)
In this method, a random starting point is selected from the list representing the universe and the remaining
units are automatically selected in a definite sequence at an equal spacing from one another. This method is
recommended if the sample units are arranged in systematic order such as chronological, geographical,
alphabetical, etc. and also if the sample units in the universe are uniquely identified. Systematic sampling is
also called sampling by regular intervals or sampling by fixed intervals.
- To get a systematic sample of size n from a population of size N, draw a random number i from 1 to K,
where $K = \frac{N}{n}$, and then select the items numbered i, i + K, i + 2K, i + 3K, …
In general, the sample consists of the $\left(i + w \, \frac{N}{n}\right)$th items, where $0 \le w \le n - 1$.
Or we can use an alternative formula:
$A_i = A_1 + (i - 1)K$, where $A_1$ is the random starting point (the first sample item) and
$A_i$ is the ith item in the sample.
Example 1: - From the files of 24 cases of the federal high court, only 4 are to be examined.
The fifth file was selected randomly as the starting point. Indicate the remaining three elements of the sample.
Solution: - N = 24, n = 4, A1 = 5
$K = \frac{N}{n} = \frac{24}{4} = 6$
Then A2 = A1 + (2 - 1)K = A1 + K = 5 + 6 = 11. The 11th file is the second element.
A3 = A1 + (3 - 1)K = A1 + 2K = 5 + 2(6) = 17. The 17th file is the third element.
A4 = A1 + (4 - 1)K = A1 + 3K = 5 + 3(6) = 23. The 23rd file is the fourth element.
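The formula Ai = A1 + (i - 1)K translates directly into code. This Python sketch assumes that N is an exact multiple of n, as in the court-file example; the function name `systematic_sample` is illustrative.

```python
def systematic_sample(population_size, sample_size, start):
    """Return the item numbers A_i = A_1 + (i - 1) * K for i = 1..n."""
    k = population_size // sample_size        # sampling interval K = N / n
    if not 1 <= start <= k:
        raise ValueError("the starting point must lie between 1 and K")
    return [start + (i - 1) * k for i in range(1, sample_size + 1)]

files = systematic_sample(population_size=24, sample_size=4, start=5)
print(files)
```

In practice the starting point would itself be drawn at random from 1 to K; here it is fixed at 5 to match the example.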
CHAPTER FOUR
STATISTICAL ESTIMATIONS
4.1. Basic concepts: Inferential statistics is concerned with estimation.
In many cases values for a population parameter are unknown. If parameters are unknown it is generally not
sufficient to make some convenient assumption about their values, rather those unknown parameters should
be estimated. Estimation is the method to estimate the value of a population parameter from the value of the
corresponding sample statistic.
In business, many decisions are made without complete information.
A firm does not know exactly what its sales volume will be next year or next month. A college does not know
exactly how many students will enroll next year. Both must estimate in order to make decisions about the future. There
are two types of estimates: point and interval.
4.2. Point estimate: A point estimate is a single number used to estimate a population parameter.
A random sample of observations is taken from the population of interest and the observed values are used to
obtain a point estimate of the relevant parameter.
a. The sample mean, $\bar{x}$, is the best estimator of the population mean $\mu$.
Different samples from a population yield different point estimates of $\mu$.
b. The sample proportion $\bar{p}$ is a good estimator of the population proportion, p.
- The population proportion p is equal to the number of elements in the population belonging to the category of
interest, X, divided by the total number of elements in the population, N: $p = \frac{X}{N}$
The sample proportion is $\bar{p} = \frac{x}{n}$,
where x is the number of elements in the sample found to belong to the category of interest and n is the sample size,
or $\bar{p} = \frac{\text{number of successes in the sample}}{\text{number sampled}}$
Example: Of 2000 persons sampled, 1600 favored stricter environmental protection measures. What is the
estimated population proportion?
$\bar{p} = \frac{1600}{2000} = 0.80$
80% is an estimate of the proportion in the population that favors stricter measures.
In general:
The statistic $\bar{x}$ estimates $\mu$
S estimates $\sigma$
$S^2$ estimates $\sigma^2$
$\bar{p}$ estimates p
Estimators and their properties / Goodness of an estimator
The properties of good estimators are
a) Unbiasedness
b) Efficiency
c) Consistency and
d) Sufficiency
i) An estimator is said to be unbiased if its expected value is equal to the population parameter it estimates.
$E(\bar{x}) = \mu$. The sample mean, $\bar{x}$, is therefore an unbiased estimator of the population mean. Any systematic
deviation of the estimator away from the parameter of interest is called bias.
ii) An estimator is efficient if it has a relatively small variance (and hence a small standard deviation).
iii) An estimator is said to be consistent if its probability of being close to the parameter it estimates increases
as the sample size increases.
The sample mean is a consistent estimator of $\mu$. This is so because the standard deviation of $\bar{x}$ is
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. As the sample size n increases, the standard deviation of $\bar{x}$ decreases and hence the probability that $\bar{x}$
will be close to its expected value increases.
iv. An estimator is said to be sufficient if it contains all the information in the data about the parameter it
estimates. Other estimators like the median and mode do not consider all values. But the mean considers
all values (added and divided by the sample size).
4.3. Interval Estimates: An interval estimate states the range within which a population parameter probably lies.
The interval within which a population parameter is expected to lie is usually referred to as the confidence
interval.
The confidence interval for the population mean is the interval that has a high probability of containing the
population mean.
Two confidence intervals are used extensively.
1. 95% confidence interval and
2. 99% confidence interval
A 95% confidence interval means that about 95% of the similarly constructed intervals will contain the
parameter being estimated. If we use the 99% confidence interval we expect about 99% of the intervals to
contain the parameter being estimated.
Note that
a) Not every interval constructed includes the parameter
b) If we construct 100 intervals at the 95% level, it is not guaranteed that exactly 95 of them will include the
parameter; 95% is the proportion expected in the long run.
Another interpretation of the 95% confidence interval is that 95% of the sample means for a specified sample
size will lie within 1.96 standard deviations of the hypothesized population mean. For 99%, the sample means
will lie within 2.58 standard deviations of the hypothesized population mean.
Where do the values 1.96 and 2.58 come from?
The middle 95% of the sample means lie equally on either side of the mean, so each side holds 0.95/2 = 0.4750 or
47.5%. Thus the area to the right of the mean is 0.4750 and the area to the left of the mean is 0.4750.
The Z value for this area, read from the standard normal table, is 1.96.
The Z to the right of the mean is +1.96 and the Z to the left is -1.96.
Similarly, for 99% each side holds 0.99/2 = 0.4950, and the corresponding Z value is approximately 2.58.
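Instead of reading the table, the same Z values can be obtained from the inverse of the standard normal cumulative distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the function name `z_for_confidence` is an illustrative assumption.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Z value that leaves (1 - level) / 2 in each tail of the standard normal."""
    tail_area = (1 - level) / 2
    return NormalDist().inv_cdf(1 - tail_area)

z95 = z_for_confidence(0.95)
z99 = z_for_confidence(0.99)
print(round(z95, 2), round(z99, 2))
```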
4.4. Constructing Confidence Interval
a) Compute the standard error of the mean
The standard error of the mean is the standard deviation of the sample means:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, where $\sigma$ = population standard deviation and n = sample size.
If the population standard deviation is not known, the standard deviation of the sample, s, is used to
approximate the population standard deviation: $S_{\bar{x}} = \frac{s}{\sqrt{n}}$
This indicates that the error in estimating the population mean decreases as the sample size increases.
b) The 95% and 99% confidence intervals are constructed as follows when n > 30.
95% confidence interval: $\bar{x} \pm 1.96 \frac{S}{\sqrt{n}}$
99% confidence interval: $\bar{x} \pm 2.58 \frac{S}{\sqrt{n}}$
1.96 and 2.58 are the Z values corresponding to the middle 95% and 99% of the observations.
In general, a confidence interval for the mean is computed by $\bar{x} \pm Z \frac{S}{\sqrt{n}}$, where Z reflects the selected level of
confidence.
Example: An experiment involves selecting a random sample of 256 middle managers for studying their
annual income. The sample mean is computed to be 35,420 and the sample standard deviation is 2,050.
a. What is the estimated mean income of all middle managers (the population)?
b. What is the 95% confidence interval (rounded to the nearest 10)?
c. What are the 95% confidence limits?
d. Interpret the findings.
Solution:
A) The sample mean is 35,420, so this approximates the population mean: the estimate of $\mu$ is 35,420. It is estimated from the
sample mean.
B) The confidence interval is between 35,170 and 35,670, found by
$\bar{x} \pm 1.96 \frac{S}{\sqrt{n}} = 35420 \pm 1.96 \frac{2050}{\sqrt{256}} = 35420 \pm 251.13$, i.e. 35168.87 and 35671.13
C) The end points of the confidence interval are called the confidence limits. In this case they are rounded to
35,170 and 35,670. 35,170 is the lower limit and 35,670 is the upper limit.
D) Interpretation: If we select 100 samples of size 256 from the population of all middle managers and
compute the sample means and confidence intervals, the population mean annual income would be found in
about 95 out of the 100 confidence intervals. About 5 out of the 100 confidence intervals would not contain
the population mean annual income.
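The interval for the middle managers' income can be recomputed as follows. This is a minimal Python sketch of the large-sample formula (sample mean plus or minus Z times S over the square root of n); the function name `mean_confidence_interval` is illustrative.

```python
import math

def mean_confidence_interval(sample_mean, sample_sd, n, z):
    """Large-sample interval: sample_mean +/- z * sample_sd / sqrt(n)."""
    standard_error = sample_sd / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

lower, upper = mean_confidence_interval(35420, 2050, 256, z=1.96)
print(lower, upper)
# Rounded to the nearest 10, as the example asks.
print(round(lower, -1), round(upper, -1))
```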
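Confidence interval for a population proportion
The confidence interval for a population proportion is estimated by
$\bar{p} \pm Z \sigma_{\bar{p}}$
where $\sigma_{\bar{p}}$ is the standard error of the proportion and
$\sigma_{\bar{p}} = \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$
Therefore the confidence interval for the population proportion is constructed by
$\bar{p} \pm Z \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$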
Example: Suppose 1600 of 2000 union members sampled said they plan to vote for the proposal to merge
with a national union. Union bylaws state that at least 75% of all members must approve for the merger to be
enacted. Using the 0.95 degree of confidence, what is the interval estimate for the population proportion?
Based on the confidence interval, what conclusion can be drawn?
The interval is computed as follows.
$\bar{p} \pm Z \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.80 \pm 1.96 \sqrt{\frac{0.80(1 - 0.80)}{2000}} = 0.80 \pm 1.96 \sqrt{0.00008}$
= 0.78247 and 0.81753, rounded to 0.782 and 0.818.
Based on the sample results, when all union members vote the proposal will probably pass, because 0.75
lies below the interval between 0.782 and 0.818.
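The union example can be verified with a short Python sketch of the proportion interval. The function name `proportion_confidence_interval` and the variable `required_share` are illustrative, not part of the original text.

```python
import math

def proportion_confidence_interval(successes, n, z):
    """Interval p_bar +/- z * sqrt(p_bar * (1 - p_bar) / n)."""
    p_bar = successes / n
    standard_error = math.sqrt(p_bar * (1 - p_bar) / n)
    margin = z * standard_error
    return p_bar - margin, p_bar + margin

lower, upper = proportion_confidence_interval(1600, 2000, z=1.96)
required_share = 0.75   # approval threshold stated in the union bylaws

print(round(lower, 5), round(upper, 5))
print("whole interval above the threshold:", lower > required_share)
```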
Finite Population Correction Factor
The populations we have sampled so far have been very large, or assumed to be infinite.
If the sampled population is not infinite or not very large, we need to make some adjustments to the standard error
of the mean and the standard error of the proportion.
A population that has a fixed upper bound is said to be finite. A finite population can be small or can be very
large.
For a finite population, where the total number of objects is N and the size of the sample is n, the following
adjustment is made to the standard errors of the mean and the proportion.
Standard error of the mean:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
Standard error of the proportion:
$\sigma_{\bar{p}} = \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} \sqrt{\frac{N - n}{N - 1}}$
This adjustment is called the finite population correction factor.
Why is it necessary to apply a factor and what is its effect?
Logically, if a sample is a substantial percentage of the population, then we would expect any estimate to be
more precise than those for a smaller sample.
Suppose the population is 1000 and the sample is 100. Then the ratio $\frac{N - n}{N - 1}$ is $\frac{1000 - 100}{1000 - 1}$ or $\frac{900}{999}$. Taking the
square root gives the correction factor 0.9492. Multiplying the standard error by this factor reduces it by about
5%, since 1 - 0.9492 = 0.0508. This reduction of the size of the standard error yields a smaller range of values in
estimating the population mean. If the sample size is 200, the correction factor is 0.8949, meaning that the
standard error has been reduced by more than 10%.
The usual rule is that if n/N is less than 0.05, the finite population correction factor is ignored.
Example: There are 250 families in a small town. A poll of 40 families revealed that the mean annual church
contribution is 450 with a standard deviation of 75. Construct a 95% confidence interval for the mean annual
contribution.
Solution: - First, note that the population is finite.
Second, the sample constitutes more than 5% of the population: n/N = 40/250 = 0.16. Hence the finite population
correction factor is applied.
$\bar{x} \pm Z \frac{S}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = 450 \pm 1.96 \frac{75}{\sqrt{40}} \sqrt{\frac{250 - 40}{250 - 1}}$
$= 450 \pm 23.24 \sqrt{0.8434}$
$= 450 \pm 21.34$
The interval therefore runs from about 428.66 to 471.34.
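The correction factor and the church-contribution interval can be checked with the Python sketch below. The function names `fpc` and `mean_interval_finite` are illustrative. Because the code carries full precision through every step, its margin (about 21.345) differs in the last digit from a hand calculation that rounds intermediate values.

```python
import math

def fpc(population_size, sample_size):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

def mean_interval_finite(sample_mean, sample_sd, n, population_size, z):
    """Interval for the mean with the correction applied to the standard error."""
    standard_error = sample_sd / math.sqrt(n) * fpc(population_size, n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin, margin

print(round(fpc(1000, 100), 4), round(fpc(1000, 200), 4))

lower, upper, margin = mean_interval_finite(450, 75, 40, 250, z=1.96)
print(round(lower, 2), round(upper, 2))
```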
$\frac{E}{Z} = \frac{S}{\sqrt{n}}$, that is, $\frac{200}{1.96} = \frac{S}{\sqrt{n}} = 102.04$
Since there are two unknowns in one equation, we cannot solve for both.
C) Variation in the population. There are still two unknowns. To solve for the number to be sampled we need
to estimate the variation in the population. The standard deviation is a measure of variation. Thus the standard
deviation of the population must be estimated.
This can be done either:
a- By taking a small pilot survey and using the standard deviation of the pilot sample as an estimate of the
population standard deviation or
b- By estimating the standard deviation based on knowledge of the population.
Example: Suppose a pilot survey is conducted and the sample standard deviation is computed to be 3000. The
number to be sampled can now be estimated.
$S_{\bar{x}} = \frac{S}{\sqrt{n}} = \frac{E}{Z}$
$\frac{200}{1.96} = \frac{3000}{\sqrt{n}}$, so n = 864.36
$S_{\bar{x}}$ is the standard error of the mean, the error we commit in estimating the population mean.
A more convenient computational formula for determining n is
$n = \left(\frac{Z \cdot S}{E}\right)^2$
where E = allowable error
Z = Z value for the degree of confidence selected
S = sample standard deviation
For this example, $n = \left(\frac{1.96 \times 3000}{200}\right)^2 = 864.36$, which is rounded up to 865.
Example 1: A marketing research firm wants to conduct a survey to estimate the average amount spent on
entertainment by each person visiting a popular pub. The people who plan the survey would like to be able to
determine the average amount spent by all people visiting the pub to within br. 120, with 95% confidence.
From past operations of the pub, an estimate of the population standard deviation is $\sigma$ = br. 400. What is the
minimum required sample size?
$n = \left(\frac{1.96 \times 400}{120}\right)^2 = 42.68 \approx 43$
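The sample-size formula for a mean, n = (Z S / E)², can be written as a small Python function. The function name `sample_size_for_mean` is illustrative, and the result is rounded up because a fraction of an observation cannot be sampled.

```python
import math

def sample_size_for_mean(z, sd, allowable_error):
    """Return the raw value (z * sd / E)^2 and the same value rounded up."""
    raw = (z * sd / allowable_error) ** 2
    return raw, math.ceil(raw)

raw_pilot, n_pilot = sample_size_for_mean(1.96, 3000, 200)   # pilot-survey example
raw_pub, n_pub = sample_size_for_mean(1.96, 400, 120)        # pub example

print(round(raw_pilot, 2), n_pilot)
print(round(raw_pub, 2), n_pub)
```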
4.5.2. Sample size for proportion
The procedure used to determine the sample size for the mean is also applicable when proportions are
involved.
Three things must be specified.
- Decide on the level of confidence
- Indicate how precise the estimate of the population proportion must be
- Approximate the population proportion, p, either from past experience or from a small pilot survey, by $\bar{p}$
The formula for determining the sample size n for a proportion is
$n = \bar{p}(1 - \bar{p}) \left(\frac{Z}{E}\right)^2$
where $\bar{p}$ = estimated proportion
Z = Z value for the selected confidence level
E = the maximum tolerable error
Example: A member of parliament wants to determine her popularity in her area. She indicates that the
proportion of voters who will vote for her must be estimated within ±2 percent of the population proportion.
Further, the 95% degree of confidence is to be used. In past elections she received 40% of the popular vote in
that area. She doubts whether it has changed much. How many registered voters should be sampled?
Z = 1.96
$\bar{p}$ = 0.40
E = 0.02
$n = \bar{p}(1 - \bar{p}) \left(\frac{Z}{E}\right)^2 = 0.40(1 - 0.40) \left(\frac{1.96}{0.02}\right)^2 = 2304.96 \approx 2305$
This sample size might be too large, too small, or exactly correct depending on the accuracy of $\bar{p}$.
Note: If there is no logical estimate of p, the sample size can be estimated by letting $\bar{p}$ = 0.5, which gives the
largest required sample size.
Example: Suppose the president wants an estimate of the proportion of the population that supports his
current policy on unemployment. The president wants the estimate to be within 0.04 of the true proportion.
Assume a 95% level of confidence and the proportion supporting current policy to be 0.60.
a) How large a sample is required?
b) How large would the sample have to be if the estimate were not available?
Solution:
a) E = 0.04
Z = 1.96
$\bar{p}$ = 0.60
$n = 0.6(1 - 0.6) \left(\frac{1.96}{0.04}\right)^2 = 576.24 \approx 577$
b) E = 0.04
Z = 1.96
$\bar{p}$ = 0.50 (since there is no estimate)
$n = 0.5(1 - 0.5) \left(\frac{1.96}{0.04}\right)^2 = 600.25 \approx 601$
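The sample-size formula for a proportion, n = p̄(1 - p̄)(Z/E)², can be sketched in Python as below. The function name `sample_size_for_proportion` and the default of 0.5 when no estimate is available are illustrative choices. Results are rounded up to the next whole number, so 600.25 becomes 601.

```python
import math

def sample_size_for_proportion(z, allowable_error, p_estimate=0.5):
    """Return p(1 - p)(z / E)^2 rounded up; p defaults to 0.5 when no estimate exists."""
    raw = p_estimate * (1 - p_estimate) * (z / allowable_error) ** 2
    return math.ceil(raw)

n_mp = sample_size_for_proportion(1.96, 0.02, p_estimate=0.40)         # member of parliament
n_president = sample_size_for_proportion(1.96, 0.04, p_estimate=0.60)  # part (a)
n_no_estimate = sample_size_for_proportion(1.96, 0.04)                 # part (b), p = 0.5

print(n_mp, n_president, n_no_estimate)
```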
CHAPTER – FIVE
HYPOTHESIS TESTING
What is a Hypothesis?
A hypothesis is a statement about the value of a population parameter developed for testing. A hypothesis is an
assertion or tentative claim which requires justification.
There are two types of hypothesis:
1) The Null hypothesis: - is an assertion that a population parameter assumes a fixed value. It always
includes the equality sign, and is denoted by H0. The null hypothesis is often established in such a way
that it states ‘nothing is different’ from what it is supposed to be, is claimed to be, or has been in the past.
2) The Alternative/Research hypothesis: - describes what you will conclude if you reject the null hypothesis.
It is a statement that is accepted if the sample data provide evidence that the null hypothesis is false. It is
written as H1 and is read “H sub-one”. It is also referred to as the research hypothesis. The alternative
hypothesis is accepted if the sample data provide us with statistically significant evidence that the null
hypothesis is false.
Example 1: A soft drink bottling company’s advertisement states that a bottle of its product contains 330
milliliters (ml). But customers are complaining that the company is underfilling its products. To check
whether the complaint is true or not, an inspector may test the following hypothesis:
H0: The average content of a bottle of this product equals 330 ml, against
H1: The average content of a bottle of this product is less than 330 ml.
Or symbolically,
H0: μ = 330 ml
H1: μ < 330 ml
If the inspector takes a random sample of bottles of this product and finds that the mean content per bottle is
much less than 330 ml, then he may conclude that the complaint of the customers is correct.
Hypothesis testing: - is a procedure based on sample evidence and probability theory used to determine
whether the hypothesis is a reasonable statement and should not be rejected, or is unreasonable and should be
rejected. In other words, hypothesis testing is a procedure for checking the validity of a statistical hypothesis. It is the
process by which we decide whether the null hypothesis should be rejected or not. The value, computed from
sample information, used to determine whether or not to reject the null hypothesis is called the test statistic.
As mentioned earlier, in order to determine whether H0 is accepted or not, one computes a test statistic from
the sample data. Our decision is then based on where this figure (the test statistic) falls.
Depending on the sampling distribution of the statistic under consideration, one can identify the acceptance
region and the rejection region. If the test statistic falls in the acceptance region, then H0 is accepted, and if the
test statistic falls in the rejection region (also called the critical region), then we reject H0 and accept H1. The
value that is the borderline between acceptance and rejection is called the critical value. The critical value is
obtained from an appropriate statistical table such as the standard normal distribution table, the Student t distribution
table, and others.
The following three charts indicate the acceptance and rejection regions for a given level of significance.
Figure 1: Sampling distribution for the statistic Z for a one-tailed (right) test, 0.05 level of significance. Critical value: $Z_{\alpha} = 1.645$.
Note in the chart that:
1. The area where the null hypothesis is not rejected is the area to the left of 1.645, where the value
1.645 is $Z_{0.05}$ read from the table of the standard normal distribution.
2. The area of rejection is to the right of 1.645.
3. A one-tailed test (right) is being applied (this will be explained soon in the unit).
4. The 0.05 level of significance is chosen.
5. The statistic Z follows the standard normal distribution.
6. The value 1.645 separates the regions where the null hypothesis is rejected
and where it is not rejected.
7. The value 1.645 is called the critical value.
Figure 2: Sampling distribution for the statistic Z, one-tailed (left) test, 0.05 level of significance. Critical value: $-Z_{\alpha} = -1.645$.
Figure 3: Regions of non-rejection and rejection for a two-tailed test, Z statistic, 0.05 level of
significance. Critical values: $\pm Z_{\alpha/2} = \pm 1.96$.
We can construct a similar acceptance and rejection region while using the t-statistics.
Types of tests
Based on the form of the null and alternative hypotheses, there are two types of tests: a one-sided test and a
two-sided test.
a) A one-sided test: a test is said to be one-sided (one-tailed) when the alternative hypothesis, H1, states a
direction, such as:
H0: The mean income of males is less than or equal to the mean income of females.
H1: The mean income of males is greater than the mean income of females.
There are two kinds of one-sided test
i) A left-tailed test: This is a type of test in which the less-than sign is involved in the alternative
hypothesis. It has one rejection region at the left tail of the appropriate distribution.
Example 2: Suppose μ0 is the hypothesized (assumed) mean. A left-tailed test for μ is
H0: μ ≥ μ0
H1: μ < μ0
Figure 2 indicates the acceptance and rejection regions of a left-tailed test using the Z-distribution.
ii) A right-tailed test: This is a type of test in which the greater-than sign is involved in the alternative
hypothesis. It has one rejection region at the right tail of the appropriate distribution.
Example 3: Suppose μ0 is the assumed mean. A right-tailed test for μ is
H0: μ ≤ μ0
H1: μ > μ0
Figure 1 can be taken as the acceptance and rejection regions of a right-tailed test while using the Z-distribution.
Step 3
Identify the test statistic
$Z_{0.05} = 1.645$
The rejection region is Z > 1.645. The observed (calculated) test statistic is 7.03, which is greater than
1.645 and therefore lies in the rejection region. Thus there is evidence to reject H0, and we can
conclude that the sample data indicate that Selam’s Hotel sales have shown a considerable increase.
Example 5: The mean of a certain production process is known to be 50 with a standard deviation of 3. The
production manager may welcome any change in the mean value towards the higher side but would like to safeguard
against decreasing values of the mean.
He takes a sample of 36 items that gives a mean value of 48.5. What inference should the manager take for the
production process on the basis of sample results? Use 1 percent level of significance for the purpose.
Solution: -
Taking the mean value of the population to be 50, we may write:
H0: $\mu \ge 50$
H1: $\mu < 50$ (since the manager wants to safeguard against decreasing values of the mean)
The given information is $\bar{X} = 48.5$, $\sigma = 3$, and n = 36. Assuming the population to be normal, we can work out the test statistic Z as under:
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{48.5 - 50}{3/\sqrt{36}} = \frac{-1.5}{3/6} = \frac{-1.5}{1/2} = -3.00$$
As H1 is one-sided in the given question, we determine the rejection region by applying a one-tailed test (in
the left tail, as H1 is of the less-than type) at the 1 percent level of significance. Because the whole of $\alpha$
lies in one tail, the critical value is $-Z_{\alpha}$ rather than $-Z_{\alpha/2}$. Using the normal curve area table:
R: $Z < -Z_{0.01}$, i.e. R: $Z < -2.33$
The observed value of Z, which we call Z calculated, is –3, which is less than –2.33, i.e. it is in the rejection region, and
thus H0 is rejected at the 1 percent level of significance. We can conclude that the production process is showing
a mean which is significantly less than the population mean, and this calls for some corrective action concerning
the said process.
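The arithmetic of Example 5 can be checked with a short script. This is only a sketch under the example's own assumptions (normal population, known standard deviation); the variable names are chosen for illustration. It computes the test statistic, the left-tail critical value at the 1% level (about –2.33) and the p-value. The decision to reject H0 would be the same with a more conservative cut-off of –2.58, since Z = –3 lies beyond both.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 50        # hypothesized process mean
sigma = 3       # known population standard deviation
n = 36          # sample size
x_bar = 48.5    # observed sample mean
alpha = 0.01    # level of significance

z = (x_bar - mu0) / (sigma / sqrt(n))   # test statistic
z_crit = NormalDist().inv_cdf(alpha)    # left-tail critical value, about -2.33
p_value = NormalDist().cdf(z)           # area to the left of the observed Z
reject_h0 = z < z_crit                  # left-tailed decision rule

print(f"Z = {z:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```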
Example 6:A sample of 400 male students is found to have a mean height of 67.47 inches. Can it be
reasonably regarded as a sample from a large population with mean height 67.39 inches and standard
deviation 1.30 inches? Test at 5% level of significance.
Solution:
Taking the null hypothesis that the mean height of the population is equal to 67.39 inches, we can write:
H0: $\mu = \mu_0 = 67.39$
H1: $\mu \ne 67.39$
The given information is $\bar{X} = 67.47$ inches, $\sigma = 1.30$ inches, and n = 400. Assuming the population to be normal, we can
work out the test statistic Z as under:
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{67.47 - 67.39}{1.30/\sqrt{400}} = \frac{0.08}{0.065} = 1.231$$
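As H1 is two-sided in the given question, we shall be applying a two-tailed test. In the two-tailed test, the
rejection region at the 5% level is
R: |Z| > 1.96
The observed value Z = 1.231 lies in the acceptance region (–1.96 < Z < 1.96), so H0 is not rejected at the 5% level: the sample can reasonably be regarded as drawn from a population with mean height 67.39 inches.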
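Example 6 can be reproduced in the same way. The sketch below assumes a normal population with known standard deviation, as the example does, and uses illustrative variable names. For a two-tailed test the comparison is made on |Z|, and the p-value adds the areas in both tails.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 67.39     # hypothesized population mean height (inches)
sigma = 1.30    # known population standard deviation (inches)
n = 400         # sample size
x_bar = 67.47   # observed sample mean (inches)
alpha = 0.05    # level of significance

z = (x_bar - mu0) / (sigma / sqrt(n))            # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # two-tailed critical value, about 1.96
p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # area in both tails beyond |Z|
reject_h0 = abs(z) > z_crit                      # two-tailed decision rule

print(f"Z = {z:.3f}, critical value = {z_crit:.2f}, p-value = {p_value:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```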
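When the population is finite, of size N, the standard error $\sigma/\sqrt{n}$ is multiplied by $\sqrt{\frac{N-n}{N-1}}$, which is what we call the finite population correction.
The procedure of testing is the same as in the above three examples.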
Example 7: A workers’ union with a total population of 1,000 workers is on strike for higher wages. The union claims
that the mean salary for workers is at most Birr 8,400 per year. The legislator does not want to reject the
union’s claim, however, unless the evidence is very strong against it. Assume that salaries follow a normal
distribution and the population standard deviation is known to be Birr 3,000. A random sample of 64 workers
is obtained, and the sample mean is Birr 9,400. Test whether the state legislator accepts the union’s claim or not at the
1% significance level.
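Solution:
H0: $\mu \le 8400$
H1: $\mu > 8400$
$\sigma = 3000$
$\bar{x} = 9400$
$n = 64$, $N = 1000$, so the finite population correction is $\sqrt{\frac{N-n}{N-1}} = \sqrt{\frac{936}{999}} = 0.96795$
$\alpha = 0.01$
$z_{0.01} = 2.33$ (one-tailed critical value)
$$Z = \frac{\bar{X} - \mu_0}{\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}} = \frac{9400 - 8400}{\frac{3000}{\sqrt{64}}\sqrt{\frac{1000-64}{1000-1}}} = \frac{1000}{375 \times 0.96795} = 2.755$$
Since Z = 2.755 exceeds the critical value 2.33, H0 is rejected at the 1% level. The sample gives strong evidence that the mean salary is more than Birr 8,400, so the legislator does not accept the union’s claim.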
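The effect of the finite population correction in Example 7 is easy to get wrong by hand, because the correction multiplies the standard error in the denominator. The sketch below, with illustrative variable names, keeps the standard error as a separate step; dividing the difference of Birr 1,000 by the corrected standard error gives Z of about 2.755.

```python
from math import sqrt
from statistics import NormalDist

N = 1000        # population size (all workers)
n = 64          # sample size
mu0 = 8400      # mean salary claimed by the union (Birr)
sigma = 3000    # known population standard deviation (Birr)
x_bar = 9400    # observed sample mean (Birr)
alpha = 0.01    # level of significance

fpc = sqrt((N - n) / (N - 1))             # finite population correction
std_error = (sigma / sqrt(n)) * fpc       # corrected standard error of the mean
z = (x_bar - mu0) / std_error             # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha)  # right-tail critical value, about 2.33
reject_h0 = z > z_crit                    # right-tailed decision rule

print(f"fpc = {fpc:.5f}, standard error = {std_error:.2f}, Z = {z:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```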
Case 2. Testing for the population mean; large sample, population standard deviation is
unknown
In such a case, the population standard deviation is approximated by the sample standard deviation S,
provided the sample size is considerably greater than 30. The test statistic will be:
$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$
The testing procedure is the same as in Case 1 for different levels of significance and different types of
tests (one-tailed or two-tailed), as mentioned in the three examples in Case 1.
Example 8: Kebede’s discount store chain issues its own credit card. The credit manager wants to find
out if the mean monthly unpaid balance is more than birr 400. The level of significance is set at 0.05. A
random check of 172 unpaid balances revealed the sample mean to be birr 407 and the standard deviation of
the sample to be birr 38. Should the credit manager conclude that the population mean is greater than birr 400,
or is it reasonable to assume that the difference of birr 7 (407 - 400) is due to chance?
Solution:
The null and alternative hypotheses are stated as:
H0: $\mu \le$ birr 400
H1: $\mu >$ birr 400
Because the alternative hypothesis states a direction, a one-tailed test is applied. With the whole of $\alpha$ in the right tail, the critical value of Z is
$Z_{\alpha} = Z_{0.05} = 1.645$. The computed value of Z, using the statistic, is
$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{407 - 400}{38/\sqrt{172}} = 2.42$$
As the computed value of the test statistic (2.42) is larger than the critical value (1.645), i.e. 2.42 is in the
rejection region R: Z > 1.645, the null hypothesis is rejected and the credit manager can conclude that the mean
unpaid balance is greater than birr 400.
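Example 8 follows the same pattern, with the sample standard deviation standing in for the unknown population value. The sketch below uses illustrative variable names and relies on the large sample (n = 172) to justify the normal approximation.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 400       # hypothesized mean unpaid balance (birr)
s = 38          # sample standard deviation (birr)
n = 172         # sample size
x_bar = 407     # observed sample mean (birr)
alpha = 0.05    # level of significance

z = (x_bar - mu0) / (s / sqrt(n))          # test statistic, S replaces sigma
z_crit = NormalDist().inv_cdf(1 - alpha)   # right-tail critical value, about 1.645
p_value = 1 - NormalDist().cdf(z)          # area to the right of the observed Z
reject_h0 = z > z_crit                     # right-tailed decision rule

print(f"Z = {z:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```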
Case 3. Testing for the population mean; population normal, small sample, population standard deviation is unknown
1) If the population is infinite, we use the test statistic
$$t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$
which follows a Student t-distribution with (n – 1) degrees of freedom.
And S is the sample standard deviation given by the formula
$$S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$
2) If the population is finite, using the finite population correction, the test statistic used is modified as:
$$t = \frac{\bar{X} - \mu_0}{\frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}}$$ with (n – 1) df.
While testing, the value of the test statistic calculated from the sample result will be compared with the
tabulated value of the t-distribution at the given level of significance.
Example 9: A specimen of copper wires drawn from a large lot has the following breaking strengths (in kg
weight): 578, 572, 570, 568, 572, 578, 570, 572, 596, 544.
Test whether the mean breaking strength of the lot may be taken to be 578 kg weight at the 5% level of
significance.
Solution: -
Taking the null hypothesis that the population mean is equal to the hypothesized mean of 578 kg, we can write:
H0: $\mu = \mu_0 = 578$ kg
H1: $\mu \ne 578$ kg
As the sample size is small (n = 10) and the population standard deviation is not known, we shall use the t-test,
assuming a normal population, and shall work out the test statistic t as under:
$$t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$
Calculating the sample mean and sample standard deviation, one can obtain
$\bar{X} = 572$ kg and S = 12.72 kg.
Hence
$$t = \frac{572 - 578}{12.72/\sqrt{10}} = -1.488$$
Degrees of freedom = (n – 1) = (10 – 1) = 9.
As H1 is two-sided, we shall determine the rejection regions applying a two-tailed test at the 5% level of
significance, and it comes to as under, using the table of the t-distribution for 9 degrees of freedom.
R: |t| > 2.262, i.e. $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$
Acceptance region: –2.262 < t < 2.262
As the observed value of t (i.e. –1.488) is in the acceptance region, we accept H0 at the 5% level and conclude
that the mean breaking strength of the copper wire lot may be taken as 578 kg weight.
* For a two-tailed test in the t-distribution, if the level is $\alpha$, the area in the right tail and the area in the left tail must
add up to $\alpha$. Thus we consider each of them to be $\alpha/2$.
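The small-sample calculation of Example 9 can be verified from the raw data. The sketch below uses only Python's standard library, so the tabulated value 2.262 for 9 degrees of freedom is entered by hand from the text rather than computed; the variable names are illustrative. Carried at full precision, t comes out near –1.49; the small difference from –1.488 reflects rounding in the hand calculation and does not affect the decision.

```python
from math import sqrt
from statistics import mean, stdev

# Breaking strengths of the 10 copper wires (kg weight)
strengths = [578, 572, 570, 568, 572, 578, 570, 572, 596, 544]
mu0 = 578          # hypothesized mean breaking strength
t_table = 2.262    # tabulated t for 9 df, two-tailed 5% level (quoted in the text)

n = len(strengths)
x_bar = mean(strengths)               # sample mean
s = stdev(strengths)                  # sample standard deviation, divisor n - 1
t = (x_bar - mu0) / (s / sqrt(n))     # test statistic with n - 1 degrees of freedom
reject_h0 = abs(t) > t_table          # two-tailed decision rule

print(f"n = {n}, mean = {x_bar}, S = {s:.2f}, t = {t:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```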
Statistics for Finance: Individual Assignment
1. Show the probability distribution of the random variable X = number of girls in two births. (5 points)
c) Much of the secondary data available has been collected for many years and therefore it can be used to
plot trends
d) Secondary data is valuable (helpful) to the government in making decisions and planning future policy
e) Secondary data is valuable (helpful) to business and industry in areas such as marketing and sales, in
order to appreciate the general economic and social conditions and to provide information on
competitors
f) None of them are incorrect
9. Which of the following is an incorrect statement?
a) Business statistics is the collection, summarizing, analyzing, and reporting of numerical findings
relevant to a business decision or situation
b) In a broad sense, statistics is defined as the art and science of collecting, analyzing, presenting, and
interpreting data
c) In business and economics in particular, the information provided by collecting, analyzing,
presenting, and interpreting data gives managers and decision makers a better understanding of the
business and economic environment and thus enables them to make more informed and better decisions
d) Statistics always deals with quantitative issues
10. Which of the following cannot be an example of population?
a) All students of Ethio-Smart College in 2012 E.C.
b) All 2nd year accounting department students of Aksum University in 2012 E.C.
c) 10 employees out of the 300 employees of the CBE Adwa Branch in 2010 E.C.
d) All employees of Almeda Textile
11. Who is an effective manager according to statistics?
a) The one who has a well-prepared sales report
b) The one who has well-prepared money supply figures and interest rates
c) The one who can have market survey results
d) The one who knows investment information
12. Which one of the following is not a limitation of statistics?
a) Statistics doesn’t deal with individual items
b) Statistics deals only with quantitatively expressed items
c) Statistical results are not universally true
d) Statistics is liable to be misused
13. Which of the following cannot be an example of discrete data?
a) Number of customers visiting a bank during any given day
b) Number of cattle owned by someone
c) The price of a house
d) The number of people in any meeting
14. Which of the following cannot be an example of continuous data?
a) The height of a person
b) The time taken to complete a given exam
c) The time taken to monitor the incoming telephone calls to claims office of a given insurance
company
d) The weight of a baby
e) The number of customers in a restaurant who visited per day
15. If the given data are not numeric and we want to summarize them using a measure of central tendency,
which measure is better suited to them?