Statistics For Management I


Chapter One

Introduction to Statistics

Introduction

Statistics is the science that deals with methods of collecting, organizing, and analyzing data
and interpreting the results. The term statistics can also be defined in its plural sense. In the
plural sense, statistics are collections of numerical facts; values that are obtained from sample
results are also called statistics. The science of statistics is essential for research and decision
processes in all aspects of human life.

Statistical analysis begins with data collection, and the analysis of the data is then undertaken
for one of the following purposes:

• To summarize the findings of some inquiry.
• To obtain a better understanding of the phenomenon under study, primarily as an aid in
generalization or theory validation.
• To make a forecast of some variables, for example, the rate of price movements in the
coming ten years in a given area.
• To evaluate the performance of some program.
• To help in selecting a course of action among a number of alternatives.

Statistics is primarily concerned with how to summarize and interpret variables. A variable is
any characteristic of an object that can be represented as a number. The values that the variable
takes will vary when measurements are made on different objects or at different times.

Each time that we record information about an object we observe a case. We might include
several different variables in the same case. For example, we might measure the height, weight,
and hair color of a group of people in an experiment. We would have one case for each person,
and that case would contain that person's height, weight, and hair color values. All of our cases
put together are called our data set.

To some people, statistics means summarized data, such as unemployment figures or the number
of runs, hits, and errors in a baseball game. To others, it means a course of study. Neither
description is adequate. In presenting statistics as a method of getting information from data to
help managers make decisions, we will see that statistics comprises various techniques with a
wide range of applications to practical problems. In this section you will be introduced to the
definition of statistics and to the classification and applications of statistical methods.

The definition of statistics is dynamic and has changed from time to time.

Definitions of Statistics

Statistics means different things to different people, and there are as many definitions as there
are people who have tried to define the term. Some of these definitions are:

• Statistics is a branch of mathematics that consists of a set of analytical techniques that can be
applied to data to help in making judgments and decisions in problems involving uncertainty.
• Statistics is a scientific discipline consisting of procedures for collecting, describing,
analyzing and interpreting numerical data.
• Statistics is a body of principles and methods concerned with extracting useful information
from a set of numerical data.
• Statistics is a body of methods dealing with the collection, description, analysis, and
interpretation of information that can be given in a numerical form.

Classification of Statistics

Generally statistics can be classified as descriptive and inferential statistics based on their scope
of coverage.

Descriptive statistics deals with methods of organizing, summarizing, and presenting numerical
data in a convenient form through graphs, charts, tables, etc. It deals with the description of the
characteristics of large masses of data. E.g. the computation of average weekly sales for a
business, the average number of students in a class, the average mark of a section in an
introductory statistics course, etc. Most of the statistical information in newspapers, magazines,
company reports, and other publications consists of data that are summarized and presented in a
form that is easy for the reader to understand. Such summaries of data, which may be tabular,
graphical, or numerical, are referred to as descriptive statistics.

Inferential statistics consists of a set of procedures that help in making inferences and
predictions about a whole population (a collection of persons, objects, or items of interest)
based on information from a sample (a portion of the whole that, if properly taken, is
representative of the whole) of the population. It is a body of methods for drawing conclusions
(that is, making inferences) about characteristics of a population, based on information available
in a sample taken from the population. E.g. using the average mark of one section to estimate
the average mark of ten sections of a given course. Many situations require information about a
large group of elements (individuals, companies, voters, households, products, customers, and so
on). But, because of time, cost, and other considerations, data can be collected from only a small
portion of the group. The larger group of elements in a particular study is called the population,
and the smaller group is called the sample.

NB. While descriptive statistics describes the characteristics of the observed data and helps to
reach conclusions about that same group only, inferential statistics provides methods for making
generalizations about the whole population based on the sample of observed data.

Important Terms in Statistics

Population is the totality of items under observation. It consists of all those items falling into a
defined category, or it is a set or collection of all possible observations of some specific
characteristic (usually people, objects, transactions or events). It is frequently large and may
sometimes be indefinitely large. E.g. all students in Ethiopia, all students in Addis Ababa, all
students in the Faculty of Business, all students of a given department.

Census is the gathering of information from all elements in a population. It is the study of each
and every element in the population. It is sometimes called complete enumeration.

Parameter is the descriptive measure of the population under consideration. It is a measured
value of a population. Parameters are usually denoted by Greek or capital letters. Examples of
parameters are population mean (µ), population variance (σ²), and population standard deviation
(σ).

Sample is a finite part of a population. It is a representative subset of the population under
consideration. While taking a sample it is hoped that the characteristics of the population are
reflected in the sample so that we can generalize or infer from the latter to the former.

Sampling is the process of selecting representatives of a population from that population. It
is gathering information from a part of a population.

Statistic is a descriptive measure of a sample. It is a measured value of a sample. Statistics are
usually denoted by lower case Roman letters. Examples of statistics are sample mean (x̄),
sample variance (s²), and sample standard deviation (s).
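
The difference between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the marks are hypothetical values invented for the example, and the names `population_marks` and `sample_marks` are assumptions rather than part of the course material. In practice the sample would be drawn at random; a fixed slice is used here so that the result is reproducible.

```python
import statistics

# Hypothetical marks for a small population of ten students (illustrative values only).
population_marks = [55, 60, 62, 65, 70, 72, 75, 80, 85, 90]

# Parameters describe the whole population.
mu = statistics.mean(population_marks)            # population mean
sigma2 = statistics.pvariance(population_marks)   # population variance (divides by N)
sigma = statistics.pstdev(population_marks)       # population standard deviation

# A sample is a part of the population; here every second student is taken.
sample_marks = population_marks[::2]

# Statistics describe the sample only.
x_bar = statistics.mean(sample_marks)             # sample mean
s2 = statistics.variance(sample_marks)            # sample variance (divides by n - 1)
s = statistics.stdev(sample_marks)                # sample standard deviation

print(f"Parameters: mu = {mu}, sigma^2 = {sigma2}, sigma = {sigma:.2f}")
print(f"Statistics: x_bar = {x_bar}, s^2 = {s2}, s = {s:.2f}")
```

The sample mean differs from the population mean, which is why inferential methods are needed to generalize from the sample to the population.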

Variable

A variable is a property of an object or event that can take on different values. For example,
college major is a variable that takes on values like mathematics, computer science, English,
psychology, etc.

• Discrete Variable - a variable with a limited number of values, e.g., gender (male/female) or
college class (freshman/sophomore/junior/senior).
• Continuous Variable - a variable that can take on many different values, in theory, any
value between the lowest and highest points on the measurement scale.

Variables can also be broken down into two types:

• Quantitative variables are those for which the value has numerical meaning. The value
refers to a specific amount of some quantity. You can do mathematical operations on the
values of quantitative variables (like taking an average). A good example would be a
person's height.
• Qualitative variables are those for which the value indicates different groupings. Objects
that have the same value on the variable are the same with regard to some characteristic,
but you can't say that one group has "more" or "less" of some feature. It doesn't really
make sense to do math on qualitative (categorical) variables. A good example would be a
person's gender.

Applications of Statistics

Statistics can be used in the following major business areas:

1. Marketing: Statistical analysis is frequently used in providing information for making
decisions. In the field of marketing it is necessary first to find out what can be sold and then to
evolve a suitable strategy so that the goods reach the ultimate consumer. A skillful
analysis of data on production, purchasing power, manpower, habits of competitors, habits
of consumers, and transportation cost should be considered before any attempt to establish a
new market.
2. Production: In the field of production, statistical data and methods play a very important role.
The decisions about what to produce, how to produce, when to produce, and for whom to
produce are based largely on statistical analysis.
3. Finance: Financial organizations, in discharging their finance function effectively, depend
very heavily on statistical analysis of facts and figures.
4. Banking: Banking institutions have found it increasingly necessary to establish research
departments within their organizations for the purpose of gathering and analyzing information,
not only regarding their own business but also regarding the general economic situation and
every segment of business in which they may have an interest.
5. Investment: Statistics greatly assists investors in making clear and valid judgments in their
investment decisions, in selecting securities which are safe and have the best prospects of
yielding a good income.
6. Purchase: The purchase department, in discharging its function, makes use of statistical data
to frame suitable purchase policies such as what to buy, what quantity to buy, what time to
buy, where to buy, and from whom to buy.
7. Accounting: Statistical data are also employed in accounting. Particularly in the auditing
function, the technique of sampling and estimation is frequently used.
8. Control: The management control process combines statistical and accounting methods in
making the overall budget for the coming year, including sales, materials, labor and other
costs, net profits, and capital requirements.

CHAPTER TWO

DATA COLLECTION AND PRESENTATION

2.1 Data Collection

Statistical Data
Data are sets of values collected for some purpose. They are raw facts about a phenomenon which,
by themselves, do not convey any message. Data are records of the actual state of some
measurable aspects of the universe at a particular point in time. They are not abstract but are
concrete, tangible and countable features of a particular aspect.

2.1.1 Classification of Data


A. Based on their sources

There are two types of data based on their source. These are primary and secondary data.

Primary data – These are data which are the measurements and records of an original study. They
are collected afresh and for the first time, and thus happen to be original in character. They are
directly measured and recorded from the source and have not been collected by someone else
before.

Secondary data – In some situations it is not practical for the principal investigator to start
the study from the very beginning. In such a situation the investigator may take into
consideration what has already been collected by others.

Secondary data are those which have already been collected by someone else and which have
already been passed through some statistical process. When an investigator uses data which
have already been collected by others, such data are called secondary data. Secondary data can
be taken from journals, reports, periodicals, publications, etc.

Secondary data should be used with great care. The investigator, before using these data, must
check that they possess the following characteristics.

1. Reliability of Data: The data collected from another source should be reliable enough to be
used by the investigator. Determining and testing the reliability of secondary data is the
most important as well as the most difficult task. Reliability can be tested by answering
questions like:
• Who collected them?
• What were the sources of data?
• What methods were used to collect them?
• At what time were they collected?

2. Suitability of Data: Before using secondary data, the investigator must evaluate whether they
can serve a purpose other than the one for which they were collected. The suitability of data
can be evaluated from the point of view of the nature and scope of the investigation.

3. Adequacy of Data: Reliability and suitability of secondary data may not be sufficient for
the investigator to use these data for analysis. Besides these, they should be tested for
adequacy. Adequacy can be tested by evaluating the data in terms of area coverage, level
of accuracy, number of respondents who participated, and so on.

B. Based on their nature


Data are facts, observations, and information that come from investigations.
The field of statistics deals with measurements—some quantitative and others qualitative. The
measurements are the actual numerical values of a variable. (Qualitative variables could be
described by numbers, although such a description might be arbitrary; for example, N = 1, E = 2,
S =3, W = 4, Y = 1, N = 0.)

Scales of Measurement
The four generally used scales of measurement are listed here from weakest to strongest.

A. Nominal Scale. In the nominal scale of measurement, numbers are used simply as labels
for groups or classes. If our data set consists of blue, green, and red items, we may
designate blue as 1, green as 2, and red as 3. In this case, the numbers 1, 2, and 3 stand
only for the category to which a data point belongs. “Nominal” stands for “name” of
category. The nominal scale of measurement is used for qualitative rather than
quantitative data: blue, green, red; male, female; professional classification; geographic
classification; and so on.
B. Ordinal Scale. In the ordinal scale of measurement, data elements may be ordered
according to their relative size or quality. Four products ranked by a consumer may be
ranked as 1, 2, 3, and 4, where 4 is the best and 1 is the worst. In this scale of
measurement we do not know how much better one product is than others, only that it is
better.
C. Interval Scale. In the interval scale of measurement the value of zero is assigned
arbitrarily and therefore we cannot take ratios of two measurements. But we can take
ratios of intervals. A good example is how we measure time of day, which is in an
interval scale. We cannot say 10:00 A.M. is twice as long as 5:00 A.M. But we can say
that the interval between 0:00 A.M. (midnight) and 10:00 A.M., which is a duration of 10
hours, is twice as long as the interval between 0:00 A.M. and 5:00 A.M., which is a
duration of 5 hours. This is because 0:00 A.M. does not mean absence of any time.
Another example is temperature. When we say 0°F, we do not mean zero heat. A
temperature of 100°F is not twice as hot as 50°F.
D. Ratio Scale. If two measurements are in ratio scale, then we can take ratios of those
measurements. The zero in this scale is an absolute zero. Money, for example, is
measured in a ratio scale. A sum of $100 is twice as large as $50. A sum of $0 means
absence of any money and is thus an absolute zero. We have already seen that
measurement of duration (but not time of day) is in a ratio scale. In general, the interval
between two interval scale measurements will be in ratio scale. Other examples of the
ratio scale are measurements of weight, volume, area, or length.
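
The claim that ratios of interval-scale measurements are not meaningful, while ratios of intervals are, can be checked with a short Python sketch. The 100°F and 50°F values come from the example above; the third temperature (75°F) and the function name `fahrenheit_to_celsius` are assumptions added only for illustration.

```python
def fahrenheit_to_celsius(f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

hot_f, mild_f = 100.0, 50.0
mid_f = 75.0  # assumed extra reading, used only to form a second interval

# Ratio of two measurements: depends on where the arbitrary zero is placed.
ratio_f = hot_f / mild_f
ratio_c = fahrenheit_to_celsius(hot_f) / fahrenheit_to_celsius(mild_f)

# Ratio of two intervals: the same in both units.
interval_ratio_f = (hot_f - mild_f) / (mid_f - mild_f)
interval_ratio_c = (
    (fahrenheit_to_celsius(hot_f) - fahrenheit_to_celsius(mild_f))
    / (fahrenheit_to_celsius(mid_f) - fahrenheit_to_celsius(mild_f))
)

print(f"Ratio of measurements: {ratio_f:.2f} in Fahrenheit, {ratio_c:.2f} in Celsius")
print(f"Ratio of intervals:    {interval_ratio_f:.2f} in Fahrenheit, {interval_ratio_c:.2f} in Celsius")
```

The ratio of the two readings changes when the unit changes, so "twice as hot" has no fixed meaning, whereas the ratio of the two intervals stays the same.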

Data can also be classified as either qualitative or quantitative. Qualitative data include labels
or names used to identify an attribute of each element. Qualitative data use either the nominal or
ordinal scale of measurement and may be nonnumeric or numeric. Quantitative data require
numeric values that indicate how much or how many. Quantitative data are obtained using either
the interval or ratio scale of measurement.

Methods of Data Collection

Data are records of the actual state of some measurable aspect of the universe at a particular
point in time. Data are not abstract; they are concrete, they are measurements or the tangible and
countable features of the world. In general, data could be quantitative (expressed in numerical
form) or qualitative (expressed in the form of verbal descriptions rather than numbers).

Methods of Primary Data Collection

Primary data are those which are collected afresh and for the first time, and thus happen to be
original in character. Their advantage is their relevance to the user, but they are also likely to be
expensive to collect in terms of time and money. Primary data can be collected using the
following methods:

A. Observation

Observation is the most commonly used method of data collection, especially in behavioral
studies. This method can be used both for cross-checking information obtained using other
methods and for understanding processes which are difficult to grasp in an interview context.
This method is useful when studying subjects who are not capable of giving verbal reports of
their feelings for one reason or another.

Advantages of observation method:

1. Subjective bias is eliminated, if observation is done accurately.
2. The information obtained relates to what is currently happening; it is not complicated by
either past behavior or future intentions or attitudes.
3. It is independent of respondents’ willingness to respond and as such is relatively less
demanding of active cooperation on the part of respondents than the interview or the
questionnaire method.

Limitations:

1. It is expensive.
2. The information obtained is limited.
3. Sometimes unforeseen factors may interfere with the observational task.

B. Interview

The interview method of collecting data involves presentation of oral-verbal stimuli and reply in
terms of oral-verbal responses. This method can be used through personal interviews and, if
possible, through telephone interviews.

Personal interviews: This method requires a person (the interviewer) to ask questions in
face-to-face contact with the interviewee.

If the interview is carried out in a structured way, it is called a structured interview. This
involves the use of a set of predetermined questions and highly standardized techniques of
recording. The interviewer in a structured interview follows a rigid procedure, asking questions
in a prescribed form and order. In contrast, unstructured interviews are characterized by
flexibility of approach to questioning. In an unstructured interview, the interviewer is allowed
much greater freedom to ask supplementary questions when needed, or at times to omit certain
questions if the situation so requires. The interviewer may even change the sequence of
questions. But this sort of flexibility results in a lack of comparability of one interview with
another, and the analysis of unstructured responses becomes much more difficult and time
consuming than that of the responses obtained in structured interviews.

Advantages of personal interviews:

1. More information, and in greater depth, can be obtained.
2. The interviewer, by his or her own skill, can overcome the resistance, if any, of the respondents.
3. There is greater flexibility, especially in the case of unstructured interviews.
4. Personal information can be obtained easily.
5. Samples can be controlled effectively, as there arises no difficulty of missing returns;
non-response generally remains very low.
6. The language of the interview can be adapted to the ability or educational level of the person
interviewed.

Some of the weaknesses of the personal interview method:

1. It is very expensive, especially when a large and widely spread geographical sample is taken.
2. There is the possibility of bias on the part of the interviewer as well as the respondent.
3. Certain types of respondents may not be easily approachable (e.g. important officials or
executives, people in high income groups).
4. It is relatively more time consuming.

Telephone interviews: This method of collecting information consists of contacting respondents
by telephone. It is not a very widely used method, but it plays an important part in industrial
surveys, particularly in developed countries.

Some of the chief merits of the telephone interview are:

1. It is faster than other methods.
2. It is cheaper than the personal interview method; the cost per response is relatively low.
3. Recall is easy; callbacks are easy and economical.
4. Replies can be recorded without causing embarrassment to respondents.
5. No field staff is required.

Some of the demerits of the telephone interview are:

1. Little time is given to respondents for considered answers.
2. Surveys are restricted to respondents who have telephone facilities.
3. It is not suitable for intensive surveys where comprehensive answers are required.
4. Questions have to be short and to the point; probes are difficult to handle.

C. Questionnaire

This method is quite popular, particularly in the case of big inquiries. Service evaluations of
hotels, restaurants, transportation providers, and other service providers are good examples of
self-administered questionnaires. Often a short questionnaire is left to be completed by the
respondent in a convenient location. In a mail survey, a questionnaire can also be sent (usually by
post) to the persons concerned with a request to answer the questions and return the questionnaire.

A questionnaire consists of a number of questions printed or typed in a definite order on a form
or set of forms. The questionnaire is mailed to respondents, who are expected to read and
understand the questions and write down the reply in the space meant for the purpose in the
questionnaire itself.


The merits of this method are:
1. It is free from the bias of the interviewer; answers are in respondents’ own words.
2. Respondents have adequate time to give well thought out answers.
3. Respondents who are not easily approachable can also be reached conveniently.

The main demerits of this method can be:
1. It can be used only when respondents are educated and cooperative.
2. Control over the questionnaire may be lost once it is sent.
3. There is inbuilt inflexibility because of the difficulty of amending the approach once
questionnaires have been dispatched.
4. There is also the possibility of ambiguous replies or omission of replies altogether to certain
questions.

Methods of Secondary Data Collection

The use of existing data (secondary data) in a research activity is termed desk research, simply
because the person carrying it out can usually gather such data without leaving his/her desk. In
any type of study, it is advisable to assess the availability of secondary data before embarking
upon a primary data collection exercise, since the latter is expensive in terms of time, money and
manpower.

Sources of secondary data include:
• Historical documents, archives, maps, photographs, letters, biographies, autobiographies,
diaries, textbooks, periodicals
• On-line and electronic databases
• Different Central Statistical Authority publications
• Different publications by regional governments
• Various publications by the different ministries

Data Presentation
After data have been collected, the next step is to present them in some convenient way. The logic
behind data presentation is that statistical data in their raw form are difficult to understand and
summarize. When data are presented, the user can understand them in some meaningful form within
a short period of time. Therefore, data presentation is the process of reorganization,
classification, compilation and summarization of data to present them in a meaningful form.

2.1.2 Tabular presentation or Frequency Distributions (Absolute, Relative and
Cumulative Distributions)

Frequency distribution: A grouping of data into categories showing the number of observations
in each mutually exclusive category.

It is the organization of raw data in table form using classes and frequencies. There
are two types of frequency distributions: categorical and grouped frequency
distributions.

A. Categorical Frequency Distribution


The categorical frequency distribution is used for data which are qualitatively described. The
important thing here is to be able to classify the data into complete and non-overlapping
categories.

Example: The following are data on the employees of organization X by level of education (LOE).

No    Name       LOE
1     Abebe      Diploma
2     Hordofa    B.Sc
3     Toga       M.Sc
4     Kahsay     Ph.D
5     Ahmed      Diploma
6     Hirut      B.Sc
"     "          "
"     "          "
50    Kassech    Ph.D

There are 15 workers having a diploma, 20 workers having a B.Sc, 10 workers having an M.Sc and 5
workers having a Ph.D.

Required: Present the data using an appropriate method of presentation.

Since level of education is a qualitative variable, the appropriate method of presentation is a
categorical frequency distribution. The following is the categorical distribution of employees by
level of education.

Employees of Organization X by LOE

LOE         Number    Percentage
Diploma     15        30%
Bachelor    20        40%
Master      10        20%
Ph.D        5         10%
Total       50        100%
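
The categorical distribution above can be reproduced with a short Python sketch. The individual records are not all listed in the example, so the list `levels` is rebuilt from the counts stated in the text (15, 20, 10 and 5); the variable names are assumptions chosen for illustration.

```python
from collections import Counter

# Rebuild the 50 LOE values from the counts stated in the example.
levels = ["Diploma"] * 15 + ["B.Sc"] * 20 + ["M.Sc"] * 10 + ["Ph.D"] * 5

# Count how many employees fall in each category.
counts = Counter(levels)
total = sum(counts.values())

# Relative frequency of each category, expressed as a percentage.
percentages = {level: 100 * n / total for level, n in counts.items()}

print(f"{'LOE':<10}{'Number':>8}{'Percentage':>12}")
for level, n in counts.items():
    print(f"{level:<10}{n:>8}{percentages[level]:>11.0f}%")
print(f"{'Total':<10}{total:>8}{sum(percentages.values()):>11.0f}%")
```

Because the categories are complete and non-overlapping, the counts add up to the number of employees and the percentages add up to 100%.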

B. Grouped Frequency Distribution

This is a method of presenting data which are quantitatively measured, used when a variable
contains a large volume of raw data. It involves several important concepts such as class limits,
class width, class interval and frequencies. Class limits are classified as the lower class limit
and the upper class limit.

In constructing a frequency distribution, we have to first choose the number of classes.

The steps necessary to define the classes for a frequency distribution with quantitative data are:

1. Determine the range

2. Determine the number of non-overlapping classes.

3. Determine the width of each class.

4. Determine the class limits.

Let us demonstrate these steps by developing a frequency distribution for the audit time data.

A. Number of Classes: Classes are formed by specifying ranges that will be used to group
the data. As a general guideline, we recommend using between 5 and 20 classes. For a
small number of data items, as few as five or six classes may be used to summarize the
data. For a larger number of data items, a larger number of classes is usually required.
The goal is to use enough classes to show the variation in the data, but not so many
classes that some contain only a few data items.
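There is also a formula which helps to determine the number of classes K from the number of
observations n:
$K = 1 + 3.322 \log_{10} n$
This formula will generally not give a whole number, so the result is rounded to a convenient
whole number.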
B. Width of the Classes: The second step in constructing a frequency distribution for
quantitative data is to choose a width for the classes. As a general guideline, we
recommend that the width be the same for each class. Thus the choices of the number of
classes and the width of classes are not independent decisions. A larger number of classes
means a smaller class width, and vice versa. To determine an approximate class width,
we begin by identifying the largest and smallest data values. Then, with the desired
number of classes specified, we can use the following expression to determine the
approximate class width.

$$\text{Approximate class width} = \frac{\text{Largest data value} - \text{Smallest data value}}{\text{Number of classes}}$$

The approximate class width given by this equation can be rounded to a more convenient
value based on the preference of the person developing the frequency distribution. For
example, an approximate class width of 9.28 might be rounded to 10 simply because 10
is a more convenient class width to use in presenting a frequency distribution.

In practice, the number of classes and the appropriate class width are determined by trial
and error. Once a possible number of classes is chosen, the equation is used to find the
approximate class width. The process can be repeated for a different number of classes.
Ultimately, the analyst uses judgment to determine the combination of the number of
classes and class width that provides the best frequency distribution for summarizing the
data. For the audit time data in the table, after deciding to use five classes, each with a
width of five days, the next task is to specify the class limits for each of the classes.

C. Class Limits: Class limits must be chosen so that each data item belongs to one and only
one class. The lower class limit identifies the smallest possible data value assigned to the
class. The upper class limit identifies the largest possible data value assigned to the class.
In developing frequency distributions for qualitative data, we did not need to specify
class limits because each data item naturally fell into a separate class. But with
quantitative data, such as the audit times in the table, class limits are necessary to
determine where each data value belongs. Using the audit time data in the table, we
selected 10 days as the lower class limit and 14 days as the upper class limit for the first
class.

The smallest data value, 12, is included in the 10–14 class. We then selected 15 days as the
lower class limit and 19 days as the upper class limit of the next class. We continued defining the
lower and upper class limits to obtain a total of five classes: 10–14, 15–19, 20–24, 25–29, and
30–34. The largest data value, 33, is included in the 30–34 class. The difference between the
lower class limits of adjacent classes is the class width. Using the first two lower class limits of
10 and 15, we see that the class width is 15 - 10 = 5. With the number of classes, class width, and
class limits determined, a frequency distribution can be obtained by counting the number of data
values belonging to each class.

The most frequently occurring audit times are in the class of 15–19 days. Eight of the 20 audit
times belong to this class. Only one audit required 30 or more days. Other conclusions are
possible, depending on the interests of the person viewing the frequency distribution. The value
of a frequency distribution is that it provides insights about the data that are not easily obtained
by viewing the data in their original unorganized form.

D. Class Midpoint: In some applications, we want to know the midpoints of the classes in a
frequency distribution for quantitative data. The class midpoint is the value halfway
between the lower and upper class limits. For the audit time data, the five class midpoints
are 12, 17, 22, 27, and 32.
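
Steps A to D can be traced in a short Python sketch using only the summary figures quoted above for the audit time data (20 audit times, smallest value 12, largest value 33, five classes, first lower limit 10). The individual audit times are not reproduced here, so frequencies are not computed; the variable names are assumptions chosen for illustration.

```python
import math

n = 20                      # number of audit times, as stated in the text
smallest, largest = 12, 33  # smallest and largest data values, as stated in the text

# A. Number of classes suggested by K = 1 + 3.322 log10(n); not a whole number.
k_suggested = 1 + 3.322 * math.log10(n)

# B. Approximate class width for the five classes chosen in the text.
num_classes = 5
approx_width = (largest - smallest) / num_classes
class_width = 5             # convenient width used in the text

# C. Class limits, starting from the lower limit of 10 chosen in the text.
first_lower = 10
limits = []
for i in range(num_classes):
    lower = first_lower + i * class_width
    upper = lower + class_width - 1  # data are whole days, so classes do not overlap
    limits.append((lower, upper))

# D. Class midpoints: halfway between the lower and upper class limits.
midpoints = [(lower + upper) / 2 for lower, upper in limits]

print(f"Suggested number of classes: {k_suggested:.2f}")
print(f"Approximate class width: {approx_width}")
for (lower, upper), mid in zip(limits, midpoints):
    print(f"{lower}-{upper}: midpoint {mid}")
```

The formula suggests a little over five classes and the approximate width is a little over four days, which is consistent with the choice of five classes of width five made in the text.
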
Exercises

1. The dean of the College of Business and Economics wishes to determine the amount of
studying that business and economics students do. He selects a random sample of 30 students
and determines the number of hours each student studies per week: 15.0, 23.7, 19.7, 15.4,
18.3, 23.0, 14.2, 20.8, 13.5, 20.7, 17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6. Organize the data into a
frequency distribution.
2. Consider the following class intervals and frequencies:

Class Intervals    Frequency
10 – 15            10
16 – 21            20
22 – 27            30
28 – 33            25
Total              85

Required:
i. Determine LCL, LCB, UCL, and UCB for each class.
ii. Develop additional columns for class marks, relative frequencies, less than
cumulative frequencies, and more than cumulative frequencies.

Cumulative Frequency
There are two types of cumulative frequency:

• Less than type cumulative frequency
• Greater than type cumulative frequency

Less than type cumulative frequency

The total of the frequencies of a particular class and all classes prior to that class is called the
less than type cumulative frequency of that class, or simply the cumulative frequency of that
class.

Example

The marks of 30 students of a class, obtained in a test out of 75, are given below:
42, 21, 50, 37, 42, 37, 38, 42, 49, 52, 38, 53, 57, 47, 29, 59, 61, 33, 17, 17, 39, 44, 42, 39, 14, 7,
27, 19, 54, 51.
From these data, a frequency table and a less than type cumulative frequency table with equal class intervals can be formed.

The Greater than Type cumulative frequency: The total of the frequencies of a particular class
and all the classes after that class is called the "greater than" type cumulative frequency.
The greater than cumulative frequencies are related to the lower class limits and form a
decreasing sequence. For the same marks of 30 students, a frequency table and a "greater than" type
cumulative frequency table with equal class intervals can be formed in the same way.
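The two cumulative frequency tables for the marks data can be sketched as follows. The text does not state which classes were used, so the equal-width classes 1–10, 11–20, ..., 61–70 below are an assumption made only for illustration.

```python
marks = [42, 21, 50, 37, 42, 37, 38, 42, 49, 52, 38, 53, 57, 47, 29,
         59, 61, 33, 17, 17, 39, 44, 42, 39, 14, 7, 27, 19, 54, 51]

# Assumed classes of equal width 10: 1-10, 11-20, ..., 61-70.
classes = [(lower, lower + 9) for lower in range(1, 71, 10)]

# Frequency of each class: number of marks between its limits.
frequencies = [sum(lower <= m <= upper for m in marks)
               for lower, upper in classes]

# "Less than" type: running total of this class and all classes before it.
less_than = []
running = 0
for f in frequencies:
    running += f
    less_than.append(running)

# "Greater than" type: total of this class and all classes after it.
total = sum(frequencies)
greater_than = [total - (cf - f) for f, cf in zip(frequencies, less_than)]

for (lower, upper), f, lt, gt in zip(classes, frequencies,
                                     less_than, greater_than):
    print(f"{lower:>2}-{upper:<2}  f={f}  <cf={lt:>2}  >cf={gt:>2}")
```

With these assumed classes the less than column ends at 30 and the greater than column starts at 30, the total number of students.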

2.1.3 Diagrammatical presentation of data

Although tables convey a great deal of information to readers who are trained to read
them, they may not be easy for a general audience to understand. For this reason we introduce
other means of data presentation. Diagrammatic presentation of data has the following advantages:

• Diagrams convey the required information quickly and without
complexity.
• They are more attractive than tables of figures.
• They facilitate comparison.
Diagrammatic presentations are especially important for presenting categorical data.
Several types of diagrammatic presentation are in common use. Some of these
are discussed next.

A. Bar charts
Bar charts are one-dimensional rectangular diagrams usually used to display qualitative
distributions. Bar charts have the following common characteristics:

a. The length or height of the bar associated with a category or class interval represents
the corresponding frequency.
b. The bars are equally spaced: equal space should be left between consecutive bars.
c. Each bar has equal width.
There are different types of bar charts used for data presentation: vertical bar graph, horizontal
bar graph, grouped bar graph, stacked bar graph, etc.

Example: Consider the preceding illustration about the level of education (LOE) of the employees of a
certain organization.

LOE         No. of employees
Diploma     15
Bachelor    20
Master      10
Ph.D         5
Total       50
[Figure: vertical bar graph and horizontal bar graph of the number of employees by level of education]

Example: The following table shows the types of clothes sold by a textile factory in the year
2005 and the area of sales. Present it using a bar chart.

                  Area of Sales
Types of cloth    Local    Export    Total
Men’s             150      100       250
Women’s           125      225       350
Children’s         70      110       180
Total             345      435       780

[Figures: grouped, stacked, and horizontal bar charts titled "Total sales for 2005 by the type of clothes manufactured and areas of sales", showing Local, Export, and Total sales for Men’s, Women’s, Children’s, and Total]
B. Pie chart
A pie chart is a circle divided into component sectors according to the proportion each component
contributes to the total. It is constructed by dividing the 360° of a circle into angles, each of which is
proportional to the size of the respective component.

Example: Take the data of employees of organization X. Present it using a pie chart.

LOE         fi    fi/n            Ai (degrees)
Diploma     15    15/50 = 0.3     0.3 x 360 = 108
Bachelor    20    20/50 = 0.4     0.4 x 360 = 144
Masters     10    10/50 = 0.2     0.2 x 360 = 72
Ph.D         5    5/50 = 0.1      0.1 x 360 = 36
Total       50    1               360

In a pie chart it is better to shade each component in a different color. It also helps to
label each component with its percentage for easy comparison.
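The sector angles in the table can be reproduced with a short sketch. The dictionary name and layout are illustrative choices, not part of the original text.

```python
# Number of employees by level of education (organization X).
counts = {"Diploma": 15, "Bachelor": 20, "Masters": 10, "Ph.D": 5}
n = sum(counts.values())

# Each sector's angle is its share of the total times 360 degrees.
angles = {level: f * 360 / n for level, f in counts.items()}

for level, angle in angles.items():
    share = counts[level] / n
    print(f"{level:<9} {share:.0%}  {angle:.0f} degrees")
```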

[Figure: pie chart of employees by level of education, with sectors for Diploma, Bachelor, Masters, and Ph.D]

Graphical presentation of data

Histogram

A histogram is a type of diagrammatic presentation commonly used for frequency distributions with
continuous classes. The following are important features of a histogram.
• It cannot be used with a frequency distribution that has an open-ended class.
• No space is left between the bars.
What is a Histogram?
A histogram is used to summarize discrete or continuous data. It provides a
visual interpretation of numerical data by showing the number of data points that fall within each
specified range of values (called a “bin”). It is similar to a vertical bar graph. However, a
histogram, unlike a vertical bar graph, shows no gaps between the bars.

Parts of a Histogram

1. The title: The title describes the information included in the histogram.

2. X-axis: The X-axis shows the intervals (the scale of values) into which the measurements fall.

3. Y-axis: The Y-axis shows the number of times that the values occurred within the
intervals set by the X-axis.

4. The bars: The height of the bar shows the number of times that the values occurred
within the interval, while the width of the bar shows the interval that is covered. For a
histogram with equal bins, the width should be the same across all bars.

Frequency Polygon

A frequency polygon is a graphical form of representation of data. It is used to depict the shape
of the data and to depict trends. It is usually drawn with the help of a histogram but can be drawn
without it as well. A histogram is a series of rectangular bars with no space between them and is
used to represent frequency distributions.

Steps to Draw a Frequency Polygon

• Mark the class intervals for each class on the horizontal axis. The frequency is plotted
on the vertical axis.

• Calculate the class mark for each class interval. The formula for the class mark is:

Class mark = (Upper limit + Lower limit) / 2

• Mark all the class marks on the horizontal axis. The class mark is also known as the mid-value of the
class.
• Corresponding to each class mark, plot the given frequency. The height always
depicts the frequency. Make sure that the frequency is plotted against the class mark and
not the upper or lower limit of any class.
• Join the plotted points with line segments. The curve obtained will be kinked.
• The resulting curve is called the frequency polygon.

Note that the above method is used to draw a frequency polygon without drawing a histogram.
You can also draw a histogram first by drawing rectangular bars against the given class intervals.
After this, join the midpoints of the tops of the bars to obtain the frequency polygon. Remember
that the bars have no spaces between them in a histogram.

As an example, suppose the class marks are 54.5, 64.5, 74.5 and so on up to 94.5. We start by plotting
these class marks. We also plot the class marks just before and just after them, i.e. 44.5
and 104.5, to start and end the polygon.

Then, the frequency corresponding to each class mark is plotted against it. The frequencies
for class marks 44.5 and 104.5 are zero, so these two points touch the x-axis. They are used
only to give a closed shape to the polygon.

[Figure: Frequency Polygon]
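The points of the frequency polygon described above can be sketched as follows. The class marks come from the text, but the frequencies are hypothetical because the original table is not reproduced here.

```python
# Class marks from the example; the frequencies are HYPOTHETICAL.
class_marks = [54.5, 64.5, 74.5, 84.5, 94.5]
frequencies = [3, 7, 12, 6, 2]

# Distance between consecutive class marks (the class width).
width = class_marks[1] - class_marks[0]

# Add one zero-frequency point before the first class mark and one after
# the last, so that the polygon starts and ends on the x-axis.
points = ([(class_marks[0] - width, 0)]
          + list(zip(class_marks, frequencies))
          + [(class_marks[-1] + width, 0)])

for x, f in points:
    print(x, f)
```

Joining consecutive points with straight line segments gives the polygon.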

Cumulative frequency curve or Ogive

The word Ogive is a term used in architecture to describe curves or curved shapes. In statistics, Ogives
are graphs that are used to estimate how many observations lie below or above a particular
value in the data. To construct an Ogive, the cumulative frequencies are first
calculated using a frequency table. This is done by adding to each class frequency the frequencies of all the
previous classes. The last number in the less than cumulative
frequency table is always equal to the total frequency.

Graphs of frequency distributions are used to exhibit the
characteristics of discrete and continuous data. Such figures are more appealing to the eye
than tabulated data. They also facilitate the comparative study of two or more
frequency distributions, because the shapes and patterns of the
distributions can be compared directly.

The two types of Ogive are:

• Less than Ogive
• Greater than or more than Ogive

[Figure: less than and greater than Ogive curves] In the figure, the rising
curve (brown) represents the less than Ogive, and the falling curve (green)
represents the greater than Ogive.

Less than Ogive

The frequencies of all preceding classes are added to the frequency of a class. This series is
called the less than cumulative series. It is constructed by adding the first class frequency to
the second class frequency, then adding the third class frequency, and so on. The downward
accumulation results in the less than cumulative series.

Greater than or More than Ogive

The frequencies of the succeeding classes are added to the frequency of a class. This series is
called the more than or greater than cumulative series. It is constructed by starting from the total
frequency and subtracting the first class frequency, then subtracting the second class frequency from that result, and so on.
The upward accumulation results in the greater than or more than cumulative series.

Chapter Three
Measures of Central Tendency and Dispersion
3.1. Introduction
Data analysis in statistics begins with knowing the data, i.e., describing the data. When
exploring data, we focus on the following five issues.

1. Center: Finding a single value that represents the center of the data, e.g., the mean, median, and
mode
2. Variation: Measuring how scattered the data are, e.g., the variance, coefficient of variation, and
standard deviation
3. Distribution: Determining the shape of the data, e.g., skewness and kurtosis
4. Outliers: Detecting the presence of wild (extreme) observations in the data
5. Time: Checking whether population characteristics change over time

3.2. Objectives of Measures of Central Tendency

• To summarize a set of data by a single value
• To facilitate comparison among different datasets
• To use in further statistical analysis or manipulation. For instance, the simple arithmetic
mean is needed to calculate the variance

3.3. Parameter and Statistic


A statistic is a characteristic or measure obtained by using the data values from the sample.
Statistics are denoted by capital letters such as $\bar{X}$, $S^2$ and so on.

A parameter is a characteristic or measure obtained by using the data values from the population.
Parameters are denoted by Greek letters such as $\mu$, $\sigma^2$ and so on.

3.4. The Summation Notation


Variables are denoted by capital letters. Suppose X is a variable having n observations, i.e.
$x_1, x_2, \dots, x_n$. Let $x_i$ represent the ith observation of variable X; i is called the index or subscript. The
sum of all observations of variable X is $x_1 + x_2 + \dots + x_n$. This summation can be written compactly
using the Greek capital letter sigma ($\Sigma$) as follows:

$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n$$

The symbol $\sum_{i=1}^{n} x_i$ is used to denote the sum of all $x_i$ from i = 1 to i = n.

Example: Consider the following data

i    Age (X)    x_i
1    25         x_1
2    36         x_2
3    12         x_3
4    3          x_4

The variable Age is denoted by X. $x_1$ is the first observation of variable X, which is 25.
Here $x_1 = 25$, $x_2 = 36$, $x_3 = 12$ and $x_4 = 3$. The sum is

$$\sum_{i=1}^{n} x_i = x_1 + x_2 + x_3 + x_4 = 25 + 36 + 12 + 3 = 76$$

Notice that n = 4.

Example: Expand the following expressions

1. $\sum_{i=1}^{4} (i + x_i) = (1 + x_1) + (2 + x_2) + (3 + x_3) + (4 + x_4)$

2. $\sum_{i=1}^{3} \frac{i}{i + i^2} = \frac{1}{1 + 1^2} + \frac{2}{2 + 2^2} + \frac{3}{3 + 3^2}$

3. $\sum_{i=0}^{4} \frac{i}{i + 1} = \frac{0}{0 + 1} + \frac{1}{1 + 1} + \frac{2}{2 + 1} + \frac{3}{3 + 1} + \frac{4}{4 + 1}$

4. $\sum_{i=1}^{3} (3 + i)^i = (3 + 1) + (3 + 2)^2 + (3 + 3)^3$

Basic Properties of Summation

Suppose X and Y are variables with n observations each. Then

$$\sum_{i=1}^{n} (x_i + y_i) = (x_1 + y_1) + (x_2 + y_2) + \dots + (x_n + y_n) = (x_1 + x_2 + \dots + x_n) + (y_1 + y_2 + \dots + y_n) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i$$

Likewise, $\sum_{i=1}^{n} (x_i - y_i) = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i$.

Note also that $\left(\sum_{i=1}^{n} y_i\right)^2 = (y_1 + y_2 + \dots + y_n)^2$.

The sum of products is $\sum_{i=1}^{n} x_i y_i = x_1 y_1 + x_2 y_2 + \dots + x_n y_n$.

Suppose X has n observations and let a be an arbitrary non-zero constant. Then

$$\sum_{i=1}^{n} a x_i = a x_1 + a x_2 + \dots + a x_n = a(x_1 + x_2 + \dots + x_n) = a \sum_{i=1}^{n} x_i$$

$$\sum_{i=1}^{n} a = a + a + \dots + a = na \quad \text{(adding a n times)}$$

Suppose X is a variable with n observations and a and b are constants. Then

$$\sum_{i=1}^{n} (a x_i + b) = (a x_1 + b) + (a x_2 + b) + \dots + (a x_n + b) = a(x_1 + x_2 + \dots + x_n) + (b + b + \dots + b) = a \sum_{i=1}^{n} x_i + nb$$

Likewise, $\sum_{i=1}^{n} (a x_i - b) = a \sum_{i=1}^{n} x_i - nb$.

Finally, $\sum_{i=1}^{n} x_i^2 = x_1^2 + x_2^2 + \dots + x_n^2$.
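The summation properties can be checked numerically with a small sketch. The ages come from the example above; the second variable `y` and the constants `a` and `b` are hypothetical values chosen only for the check.

```python
x = [25, 36, 12, 3]   # ages from the summation example
y = [4, 7, 1, 9]      # hypothetical second variable
a, b = 3, 2           # hypothetical constants
n = len(x)

# Sum of a sum equals the sum of the sums.
assert sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y)
# A constant factor can be taken outside the summation.
assert sum(a * xi for xi in x) == a * sum(x)
# Summing a constant n times gives n times the constant.
assert sum(a for _ in range(n)) == n * a
# Combination of the two rules above.
assert sum(a * xi + b for xi in x) == a * sum(x) + n * b

# The square of a sum is generally NOT the sum of the squares.
print(sum(x) ** 2, sum(xi ** 2 for xi in x))  # 5776 2074
```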

3.5. Characteristics of Measures of CT


A typical average must have the following characteristics:

• It should be easy to calculate and understand
• It should be rigidly defined. In other words, it should have one and only one
interpretation so that the personal bias of the investigator does not affect its value or
usefulness
• It should be representative of the data under consideration
• It should have sampling stability, i.e., it should not be affected by sampling fluctuations
• It should not be affected by extreme values
• It should be amenable to further algebraic manipulation

Note: If a particular measure of central tendency fails to show some of these
characteristics, the failure is considered a disadvantage of using that MCT.

3.5.1. Measures of Central Tendency


The simple arithmetic mean is the most familiar measure of central tendency. It is the first MCT
that comes to mind when we think of an average.

A. Simple Arithmetic Mean

It is defined as the sum of all observations divided by the total number of observations. The
simple arithmetic mean computed from a sample is denoted by a bar over
the variable. If the variable is denoted by X, then the mean is denoted by $\bar{X}$
(pronounced X bar):

$$\bar{X} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

where n is the total number of observations in the sample for X.

The simple arithmetic mean computed from the population is denoted by $\mu$:

$$\mu = \frac{x_1 + x_2 + \dots + x_N}{N} = \frac{1}{N} \sum_{i=1}^{N} x_i$$

where N is the total number of observations in the population.


Note: $\bar{X}$ is a sample statistic and $\mu$ is a population parameter.

Example: The numbers of tourists
who visited Ethiopia from 2009 to 2014 were 427000, 468000, 523000, 597000, 681000
and 770000, respectively. Compute the average number of tourists per year.

Let X be the number of tourists. Then the average number of tourists is

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{6} \sum_{i=1}^{6} x_i = \frac{427000 + 468000 + 523000 + 597000 + 681000 + 770000}{6} = \frac{3466000}{6} \approx 577666.7$$

On average, about 577,667 tourists visited Ethiopia per year from 2009 to 2014.
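The tourist example can be checked with a few lines of code. This is a minimal sketch with illustrative variable names.

```python
# Number of tourists who visited Ethiopia in each year from 2009 to 2014.
tourists = [427000, 468000, 523000, 597000, 681000, 770000]

total = sum(tourists)           # sum of all observations
mean = total / len(tourists)    # divided by the number of observations

print(total)           # 3466000
print(round(mean, 1))  # 577666.7
```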

Example: The grades of a student on six examinations were 84, 91, 72, 68, 87 and 78. Find the
arithmetic mean of the grades. Suppose Y represents the student's grade.

$$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} y_i = \frac{1}{6} \sum_{i=1}^{6} y_i = \frac{y_1 + y_2 + y_3 + y_4 + y_5 + y_6}{6} = \frac{84 + 91 + 72 + 68 + 87 + 78}{6} = \frac{480}{6} = 80$$

The student's average grade on the six examinations is 80.

Exercise: The data represent the number of days off per year for a sample of individuals selected
from nine different countries. Find the mean. 20, 26, 40, 36, 23, 42, 35, 24, 30 Ans: 30.7

Suppose the data have k distinct values, where $x_1$ occurs $f_1$ times, $x_2$ occurs $f_2$ times, ..., and $x_k$ occurs
$f_k$ times, as in the table below.

Distinct value of x    Frequency
x_1                    f_1
x_2                    f_2
...                    ...
x_i                    f_i
...                    ...
x_k                    f_k

Then, the simple arithmetic mean is computed as follows:

$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \dots + f_k x_k}{f_1 + f_2 + \dots + f_k} = \frac{1}{\sum_{i=1}^{k} f_i} \sum_{i=1}^{k} f_i x_i$$

Example: The number of cylinders of 32 automobiles has been
recorded and the following table has been constructed.

# of cylinders      4     6    8
# of automobiles    11    7    14

Compute the average number of cylinders for the 32 automobiles.

Let X represent the number of cylinders in a car, so k = 3.

$$\bar{x} = \frac{11(4) + 7(6) + 14(8)}{11 + 7 + 14} = \frac{198}{32} \approx 6.19$$

The average number of cylinders per car is approximately 6.

Mean for Grouped Data: The class mark (class midpoint) is taken as the representative value of each class.
So, the class mark is used as the data value ($x_i$).

Class limit    Class mark (x_i)    Frequency (f_i)
5 – 9          7                   3
10 – 14        12                  4
15 – 19        17                  1
20 – 24        22                  3

The simple arithmetic mean is

$$\bar{x} = \frac{\sum_{i=1}^{4} f_i x_i}{\sum_{i=1}^{4} f_i} = \frac{3 \times 7 + 4 \times 12 + 1 \times 17 + 3 \times 22}{3 + 4 + 1 + 3} = \frac{152}{11} \approx 13.8$$
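The grouped mean above can be sketched in code by first deriving the class marks from the class limits. The variable names are illustrative.

```python
# Class limits and frequencies of the grouped data above.
class_limits = [(5, 9), (10, 14), (15, 19), (20, 24)]
frequencies = [3, 4, 1, 3]

# The class mark (midpoint) stands in for every value in its class.
class_marks = [(lower + upper) / 2 for lower, upper in class_limits]

# Grouped mean: sum of f_i * x_i divided by the total frequency.
grouped_mean = (sum(f * x for f, x in zip(frequencies, class_marks))
                / sum(frequencies))

print(class_marks)             # [7.0, 12.0, 17.0, 22.0]
print(round(grouped_mean, 1))  # 13.8
```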
Exercise: The hourly compensation costs (in birr) for production workers in selected countries
are represented below. Compute the simple arithmetic mean

Class Frequency
2-4 7
4-6 3
6-8 1
8-10 7
10-12 5
12-14 5
Exercise: Consider the grouped frequency distribution in chapter 2

Compute the mean of the grouped data

Weighted Mean

Suppose variable X has k observations where $x_1$ has a weight of $w_1$, $x_2$ has a weight
of $w_2$, ..., $x_k$ has a weight of $w_k$. Then the weighted mean of X is

$$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \dots + w_k x_k}{w_1 + w_2 + \dots + w_k} = \frac{1}{\sum_{i=1}^{k} w_i} \sum_{i=1}^{k} w_i x_i$$

• The simple arithmetic mean is a special case of the weighted arithmetic mean
• When all observations have equal weights, the simple arithmetic and weighted
means are the same

Example: A teacher attaches weights of 2 to homework (HM), 3 to the midterm exam
(MT) and 5 to the final exam (FE). If a student scores 90, 50 and 60 for HM, MT and
FE, respectively, what is his/her average academic performance?

$$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \dots + w_k x_k}{w_1 + w_2 + \dots + w_k} = \frac{2 \times 90 + 3 \times 50 + 5 \times 60}{2 + 3 + 5} = \frac{630}{10} = 63$$
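A minimal sketch of the weighted mean, using the homework, midterm and final exam example. The function name `weighted_mean` is an illustrative choice.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

scores = [90, 50, 60]   # HM, MT, FE
weights = [2, 3, 5]

print(weighted_mean(scores, weights))  # 63.0

# With equal weights the weighted mean equals the simple mean.
print(weighted_mean(scores, [1, 1, 1]) == sum(scores) / len(scores))  # True
```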

Exercise: Mr. Mamo has the following grades in the first semester. Find his
semester grade

Course code CrHr Grade


Stat3022 3 A
Econ4032 4 B
Mgmt3033 3 C

B. Mode

Mode is the most frequently observed value of a variable, i.e. the mode is the value of X with the
highest frequency. A mode may not exist; even if it does exist, it may not be unique.

Example: The set 2, 2, 5, 7, 8, 9, 9, 9, 10, 10, and 11 has mode 9. The set 3, 5, 8, 10, 12, 15, and
16 has no mode. The set 2, 3, 4, 4, 4, 5, 5, 7, 7, 7, and 9 has two modes, 4 and 7.

A distribution having only one mode is called unimodal and one having two modes is called bimodal. The mode
is denoted by a hat over the variable, as in $\hat{x}$.
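The three cases in the example (one mode, no mode, two modes) can be sketched as follows. The function `modes` is illustrative; it follows the text's convention that a data set in which every value occurs only once has no mode.

```python
from collections import Counter

def modes(data):
    """Return the list of modes; empty if every value occurs only once."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        return []  # no value repeats, so there is no mode
    return sorted(value for value, c in counts.items() if c == highest)

print(modes([2, 2, 5, 7, 8, 9, 9, 9, 10, 10, 11]))  # [9]
print(modes([3, 5, 8, 10, 12, 15, 16]))             # []
print(modes([2, 3, 4, 4, 4, 5, 5, 7, 7, 7, 9]))     # [4, 7]
```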
Exercise: Consider the following ungrouped frequency distributions. What is/are their mode?

Age F Weight F Salary f


15 2 25 6 1500 78
20 10 40 25 2500 45
25 5 56 5 3500 78
30 2 74 22 10000 32

Mode for Grouped Data: The mode can be computed as:

$$\hat{x} = LCB_m + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) w$$

where $LCB_m$ is the lower class boundary of the modal class, $w$ is the class width,
$\Delta_1 = f_{modal} - f_{modal-1}$ and $\Delta_2 = f_{modal} - f_{modal+1}$.

The modal class is the class with the highest frequency. The modal class contains the mode.

Example: Consider the following grouped frequency distribution and compute the mode

Class limit    Class boundary    f
6 – 11         5.5 – 11.5        2
12 – 17        11.5 – 17.5       2
18 – 23        17.5 – 23.5       7   (modal class)
24 – 29        23.5 – 29.5       4
30 – 35        29.5 – 35.5       3
36 – 41        35.5 – 41.5       2

Here $\Delta_1 = 7 - 2 = 5$, $\Delta_2 = 7 - 4 = 3$, $w = 11.5 - 5.5 = 6$ and $LCB_m = 17.5$, so

$$\hat{x} = LCB_m + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) w = 17.5 + \left(\frac{5}{5 + 3}\right) \times 6 = 21.25$$
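The grouped mode calculation can be sketched as follows. Treating a missing neighbouring class as having frequency zero is an assumption of this sketch; the text does not cover the case where the modal class is the first or last class.

```python
# Class boundaries and frequencies from the example above.
boundaries = [(5.5, 11.5), (11.5, 17.5), (17.5, 23.5),
              (23.5, 29.5), (29.5, 35.5), (35.5, 41.5)]
frequencies = [2, 2, 7, 4, 3, 2]

# The modal class is the class with the highest frequency.
m = frequencies.index(max(frequencies))
lcb, ucb = boundaries[m]
width = ucb - lcb

# Frequencies of the neighbouring classes (assumed 0 if there is none).
before = frequencies[m - 1] if m > 0 else 0
after = frequencies[m + 1] if m < len(frequencies) - 1 else 0

delta1 = frequencies[m] - before
delta2 = frequencies[m] - after

mode = lcb + delta1 / (delta1 + delta2) * width
print(mode)  # 21.25
```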

Exercise: Compute the mode of the following grouped frequency distributions

Age        f      Age        f      Age        f
10 – 14    5      10 – 14    12     10 – 14    12
15 – 19    12     15 – 19    10     15 – 19    9
20 – 24    6      20 – 24    6      20 – 24    6
25 – 29    10     25 – 29    10     25 – 29    12

C. Median

The median of a dataset is the measure of center that is the middle value when the original data
values are arranged in order of increasing (or decreasing) magnitude.
The median is denoted by a tilde over the variable, as in $\tilde{x}$.
Median for Raw Data: The steps are:

1. Sort the data in either ascending or descending order.
2. Determine whether n is even or odd. If n is odd, then the median is

$$\tilde{x} = x_{\frac{n+1}{2}}, \text{ i.e. the } \left(\frac{n+1}{2}\right)^{\text{th}} \text{ observation}$$

If n is even, then the median is

$$\tilde{x} = \frac{x_{n/2} + x_{(n/2)+1}}{2}$$

i.e., the average of the middle two observations

Example: The median of 2, 5, 3, 6, 9, 7, 8 is 6. The median of 2, 5, 0, 5, 6, 9 is 5.
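The steps above can be sketched as a small function. The name `median` is illustrative; positions are shifted by one because Python lists are indexed from 0.

```python
def median(data):
    ordered = sorted(data)          # step 1: sort the data
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                  # odd n: the ((n + 1) / 2)th observation
        return ordered[middle]
    # even n: average of the (n/2)th and (n/2 + 1)th observations
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([2, 5, 3, 6, 9, 7, 8]))  # 6
print(median([2, 5, 0, 5, 6, 9]))     # 5.0
```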

Exercise: What is the median of the following data sets

Salary F <f Salary F <f Salary F <f


2500 10 10 2 2 2 2.53 3 3
2800 5 15 3 6 8 2.75 25 28
3200 6 21 4 9 17 3.25 30 58
3600 3 24 5 4 21 3.52 45 103
Median for Grouped Data:

The median can be obtained by using the following formula

$$\tilde{x} = LCB_{me} + \left(\frac{\frac{n}{2} - C}{f_{me}}\right) w$$

where $LCB_{me}$ is the lower class boundary of the median class, C is the less than cumulative frequency (<cf) of the
class that comes before the median class, $f_{me}$ is the frequency of the median class, and w is the class width.

Example: Find the median for the following grouped data

Class limit    Class boundary    f    <cf
6 – 11         5.5 – 11.5        2    2
12 – 17        11.5 – 17.5       2    4
18 – 23        17.5 – 23.5       7    11   (median class)
24 – 29        23.5 – 29.5       4    15
30 – 35        29.5 – 35.5       3    18
36 – 41        35.5 – 41.5       2    20

Here n = 20, n/2 = 10, C = 4, $f_{me}$ = 7, w = 6 and $LCB_{me}$ = 17.5, so

$$\tilde{x} = LCB_{me} + \left(\frac{\frac{n}{2} - C}{f_{me}}\right) w = 17.5 + \left(\frac{\frac{20}{2} - 4}{7}\right) \times 6 \approx 22.6$$
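The grouped median calculation can be sketched as follows. The loop locates the median class as the first class whose less than cumulative frequency reaches n/2, which matches the worked example; the variable names are illustrative.

```python
# Class boundaries and frequencies from the example above.
boundaries = [(5.5, 11.5), (11.5, 17.5), (17.5, 23.5),
              (23.5, 29.5), (29.5, 35.5), (35.5, 41.5)]
frequencies = [2, 2, 7, 4, 3, 2]

n = sum(frequencies)
cumulative = 0  # less than cumulative frequency before the current class (C)

for (lcb, ucb), f in zip(boundaries, frequencies):
    if cumulative + f >= n / 2:
        # This is the median class: apply the grouped median formula.
        grouped_median = lcb + (n / 2 - cumulative) / f * (ucb - lcb)
        break
    cumulative += f

print(cumulative)                # 4
print(round(grouped_median, 2))  # 22.64
```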

Exercise: Consider the following grouped data & compute median

Grade F CF Grade F CF
10 - 19 5 5 40 – 49 6 6
20 - 29 2 7 50 – 59 5 11
30 -39 9 16 60 -69 9 20
40 - 49 0 16 70 – 79 25 45
50 - 59 10 26 80 – 89 9 54
60 -69 8 34 90 -99 3 57

Empirical Relationship between Mean, Mode and Median: For unimodal frequency curves that
are moderately skewed (asymmetrical), we have the following empirical relation

Mean – Mode = 3(Mean – Median)

Measures of Dispersion

The term dispersion is generally used in two senses. First, dispersion refers to the variation of the
items among themselves. Second, dispersion refers to the variation of items around an average. If the
differences between the values of the items and the average are large, the dispersion is high;
if the differences are small, the
dispersion is low.

Dispersion is the scatter or spread of the individual items in a given series.

Objectives of Measures of Dispersion

• To determine the reliability of an average: If the variation is small, the average
closely represents the individual values and is highly representative. On the other hand, if
the variation is large, the average will be quite unreliable.
• To compare the variability of two or more datasets: A high degree of variation
means less consistency or less uniformity compared with data having less variation.
• To facilitate the use of other statistical measures: Measures of dispersion serve as the
basis of many other statistical measures such as correlation, regression, testing of
hypotheses, etc.
• To provide a basis for statistical quality control: The extent of the dispersion indicates to the
officials whether the variation in the quality of the product is due to random factors or
to some defect in the manufacturing process.

Measures of variation are classified as absolute and relative.

Absolute Measures of Dispersion: They are expressed in the same units as the original
data, such as kilograms, tonnes, etc. These measures are suitable for comparing the
variability of two distributions whose variables are expressed in the same units and have about the same
average size. They are not suitable for comparing the variability of two
distributions whose variables are expressed in different units.

Relative Measures of Dispersion: They are the ratio of a measure of absolute dispersion to
an appropriate average or to selected items of the data. They are unitless and can be used to
compare the degree of dispersion of different datasets.

Range

The range is the simplest measure of dispersion. It is defined as the difference between the largest and
smallest values in the dataset: R = Max−Min.

Relative Range

The relative measures of range, also called coefficient of range, is defined as

RR = (Max−Min)/(Max + Min)

Note: For grouped frequency distribution Max is the upper class limit of the last class and Min is
the lower class limit of the first class

Example: Five students obtained the following marks in statistics: 20, 35, 25, 30, 15. Find the
range and coefficient of range. Max = 35, Min = 15 R = Max−Min = 35−15 = 20

RR =(Max−Min)/(Max + Min)=20 /(35 + 15) = 0.4
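The range and coefficient of range for the five marks can be sketched as follows, with illustrative variable names.

```python
marks = [20, 35, 25, 30, 15]

largest, smallest = max(marks), min(marks)

data_range = largest - smallest                     # R = Max - Min
relative_range = data_range / (largest + smallest)  # RR = (Max - Min)/(Max + Min)

print(data_range)      # 20
print(relative_range)  # 0.4
```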

Example: Compute range for the following grouped dataset

Size 5-10 11-15 16-20 21-25 26-30


Frequency 4 9 15 30 40

Max = 30, Min = 5, R = Max−Min = 30−5 = 25

Variance

Suppose variable X has n observations $x_1, x_2, \dots, x_n$. The variance computed from the
sample is defined as:

$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

where n−1 is the number of independent observations, called the degrees of freedom.

The variance computed from a population of size N is denoted by the Greek letter sigma squared ($\sigma^2$):

$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$$

Reading Assignment: In some statistics books, the sample variance is defined as the mean of the
squared deviations of each observation from the center (mean), i.e.,

$$S^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

This formula is not advised as an estimator of the population variance. Why?

The standard deviation of a set of data is defined as the positive square root of the variance,
i.e.,

$$S = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Example: Compute the variance and standard deviation of the following data: 2, 4, 6, 8

The mean is $\bar{x}$ = 1/4(2 + 4 + 6 + 8) = 1/4(20) = 5. The variance is $S^2$ = 1/(4−1)((2−5)² + (4−5)² +
(6−5)² + (8−5)²) = (9 + 1 + 1 + 9)/3 = 20/3 = 6.67. The standard deviation is S = √6.67 = 2.58

Example: Compute the variance and standard deviation of the following data 48, 36, 63, 45, 12,
32, 45, 65

The mean is 43.25, the variance is 292.5, and the standard deviation is 17.103.
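Both examples can be checked with Python's standard `statistics` module, whose `variance` and `stdev` functions use the n − 1 divisor of the sample formulas above. This is a sketch for verification, not part of the original text.

```python
import statistics

small = [2, 4, 6, 8]
larger = [48, 36, 63, 45, 12, 32, 45, 65]

for data in (small, larger):
    mean = statistics.mean(data)
    var = statistics.variance(data)  # sample variance, divides by n - 1
    sd = statistics.stdev(data)      # positive square root of the variance
    print(mean, round(var, 2), round(sd, 3))
```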

Exercise: The number of highway miles per gallon of the 10 worst vehicles is shown.

12, 15, 13, 14, 15, 16, 17, 16, 17, 18. Compute variance and standard deviation.

Variance for Grouped Data

$$S^2 = \frac{1}{\sum_{i=1}^{k} f_i - 1} \sum_{i=1}^{k} f_i (x_i - \bar{x})^2$$

where $x_i$ and $f_i$ are the class mark and frequency of the ith class, and k
is the number of classes.
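A minimal sketch of the grouped variance formula. For illustration it reuses the class marks (7, 12, 17, 22) and frequencies (3, 4, 1, 3) from the grouped mean example earlier in this chapter; the variable names are illustrative.

```python
class_marks = [7, 12, 17, 22]
frequencies = [3, 4, 1, 3]

n = sum(frequencies)  # total frequency

# Grouped mean: sum of f_i * x_i divided by the total frequency.
mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n

# Grouped sample variance: sum of f_i * (x_i - mean)^2 divided by (n - 1).
variance = (sum(f * (x - mean) ** 2
                for f, x in zip(frequencies, class_marks)) / (n - 1))
std_dev = variance ** 0.5

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```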

Exercise: Compute the variance and standard deviation for the following grouped frequency
distributions and identify which frequency distribution is more dispersed.

Weight No. of Score No. of students


(KG) managers
45.5 – 48.4 6 10 -19 24
48.5 – 51.4 4 20 – 29 16
51.5 – 54.4 26 30 – 39 13
54.5 – 57.4 1 40 – 49 15
57.5 – 60.4 25 50 – 59 43
60.5 – 63.4 3 60 – 69 18
63.5 – 66.4 8 70 – 79 2
66.5 - 69.4 12 80 - 89 4

Coefficient of Variations

The coefficient of variation, denoted by CV, is the standard deviation divided by the mean. The
result is expressed as a percentage.

For a sample, $CV = \frac{s}{\bar{x}} \times 100\%$; for a population, $CV = \frac{\sigma}{\mu} \times 100\%$.

Example: Compare the variation in heights of men to the variation in weights of men, using these
sample results obtained from a dataset. For men, the heights yield $\bar{x}$ = 68.34 in. and $S_x$ = 3.02 in.;
the weights yield $\bar{y}$ = 172.55 lb and $S_y$ = 26.33 lb.

Height: CV = $S_x / \bar{x}$ = (3.02 in. / 68.34 in.) × 100% = 4.42%

Weight: CV = $S_y / \bar{y}$ = (26.33 lb / 172.55 lb) × 100% = 15.26%

We can see that heights have considerably less variation than weights.
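The comparison can be sketched in code. The function name `coefficient_of_variation` is an illustrative choice.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

height_cv = coefficient_of_variation(3.02, 68.34)    # inches
weight_cv = coefficient_of_variation(26.33, 172.55)  # pounds

print(f"{height_cv:.2f}%")  # 4.42%
print(f"{weight_cv:.2f}%")  # 15.26%
```

Because the CV is unitless, the two values can be compared even though heights are in inches and weights in pounds.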

Exercise: The weekly income (in Birr) of 10 men and 15 women workers are listed below.
Whose weekly income is more dispersed?

Men: 14, 19, 20, 30, 100, 125, 236, 300, 142, 63

Women: 254, 250, 123, 352, 142, 22, 12, 32, 458, 100, 200, 235, 224, 162, 364, 122

Shape of Distribution

Skewness

Skewness is a measure of symmetry, or more precisely, of the lack of symmetry (departure from
symmetry) of a distribution, i.e., it measures how far the distribution is stretched horizontally to one side.

Skewness can be measured in absolute terms by taking the difference between the arithmetic mean
and the mode. The absolute measure of skewness is Sk = Mean−Mode. If the arithmetic
mean is greater than the mode, the skewness is positive, and if the mode is greater than the mean,
the skewness is negative:

Sk > 0: positively skewed
Sk = 0: symmetric
Sk < 0: negatively skewed

Empirical Rule: For a moderately skewed distribution, Mean−Mode = 3(Mean−Median)

Kurtosis

Kurtosis, from a Greek word meaning bulginess, measures the peakedness or flatness of the curve of a distribution.

Three terms are used to describe it: mesokurtic for a normal curve, leptokurtic
for a curve more peaked than normal, and platykurtic for a curve less peaked than normal.

Chapter Four
Theory of Probability
4.1. Probability Theory

Basic Concepts

Probability is a measure of the likelihood or chance that an uncertain event will occur. It is a
numerical measure of the chance of an outcome’s occurrence. It can assume a value between 0
and 1, inclusive. A probability near 0 indicates that the outcome is very unlikely to occur,
while a probability near 1 indicates that the event is almost certain to happen. At the
extremes, an event that cannot occur has probability 0 and an event that will always happen has probability 1.
Thus, probabilities are non-negative numbers that never exceed 1. Probability is the basis for inferential statistics.

Experiment

An experiment is any well-defined situation or procedure that results in one or more possible
outcomes. More simply, it is any process that generates well-defined outcomes. For
instance, tossing a coin, rolling a die, or playing a football match can be taken as experiments.

Outcome

An outcome is a particular result of an experiment. For example, getting a head or a tail is a
possible outcome of the coin-tossing experiment. Winning, losing or a tie/draw are the possible
outcomes of the football experiment, and getting 1, 2, 3, 4, 5, or 6 are the possible outcomes of the
die-rolling experiment.

Events

An event is a specific collection of basic outcomes, that is, a set containing one or more of the
basic outcomes from the sample space. An event identifies one or more outcomes of an
experiment. For example, in the die-rolling experiment, any collection of one or more of
the six possible outcomes can be taken as an event.

Sample Space

A sample space is a complete roster or listing of all possible outcomes of an experiment. The
sample space of an experiment is usually illustrated either by a list or by some type of diagram, such as
a Venn diagram or a tree diagram.

Illustration of an experiment, outcomes, events, and sample space.

Tossing/Flipping a coin twice………………. Experiment

Heads or Tails……………………………….. Outcomes/elementary events

HH, HT, TH, TT…………………………….. 4 Events

(HH, HT, TH, TT)…………………………… Sample Space
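The sample space for tossing a coin twice can be enumerated with a short sketch. The event "at least one head" and the assumption that the four outcomes are equally likely (a fair coin) are added here for illustration only.

```python
from itertools import product

# Sample space: every ordered pair of results from two tosses.
sample_space = ["".join(tosses) for tosses in product("HT", repeat=2)]
print(sample_space)  # ['HH', 'HT', 'TH', 'TT']

# An event is a collection of outcomes, e.g. "at least one head".
at_least_one_head = [outcome for outcome in sample_space if "H" in outcome]
print(at_least_one_head)  # ['HH', 'HT', 'TH']

# Assuming equally likely outcomes, the probability of the event is
# (number of outcomes in the event) / (number of outcomes in the sample space).
print(len(at_least_one_head) / len(sample_space))  # 0.75
```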

Exercise

Identify the experiment, outcomes, events and sample space for the following questions.

1. Sitting for an exam ………………………………. Experiment


Scoring A, B, C, D, F ……………………………... Possible outcomes

[A, B, C, D, F] ……………………………………... Sample Space

Scoring B and above ……………………………… Event

C and above ……………………………… Event

D or below ……………………………….. Event

2. Football game ……………………………………… Experiment


Win, Lose, Tie/Draw ……………………………… Outcomes

[W, L, T] ……………………………………………… Sample Space

Not winning (L, T), Not losing (W, T) …………… Events

Events

1. Independent events
Two or more events are independent if the occurrence or nonoccurrence of one of the events
does not affect the occurrence or nonoccurrence of the others. Certain experiments, such as
rolling dice, yield independent events; each die is independent of the other. Whether a 6 is rolled

45
on the first die has no influence on whether a 6 is rolled on the second die. Coin tosses always
are independent of each other. The possibility of getting a head on the first toss of a coin in
independent of getting a head on the second toss.

The impact of independent events on the probability is that, if two events are independent, the
probability of attaining the second event is the same regardless of the outcome of the first event.
The probability of tossing a head is always ½ regardless of what was tossed previously. Thus, if
someone tosses a coin six times and gets six heads, the probability of tossing a head on the
seventh time is ½, because coin tosses are independent. In terms of symbolic notation, if X and Y
are independent: P(X/Y) = P(X) and P(Y/X) = P(Y), where P(X/Y) denotes the probability of X
occurring given that Y has occurred, and P(Y/X) denotes the probability of Y occurring given
that X has occurred.
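The condition P(X/Y) = P(X) can be checked by brute-force enumeration. The sketch below assumes two fair dice and uses the ratio P(X n Y)/P(Y) for the conditional probability; the helper `prob` and the event names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

def first_is_six(o):
    return o[0] == 6

def second_is_six(o):
    return o[1] == 6

def both_six(o):
    return first_is_six(o) and second_is_six(o)

p_x = prob(second_is_six)                          # P(X)
p_x_given_y = prob(both_six) / prob(first_is_six)  # P(X/Y) = P(X n Y) / P(Y)

print(p_x, p_x_given_y)  # 1/6 1/6
```

Because the two values agree, rolling a 6 on the first die tells us nothing about the second die.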

2. Dependent Events
Two or more events are dependent if the occurrence or nonoccurrence of one of the events
affects the occurrence or nonoccurrence of the others. A single roll of a die yields dependent
events: the six outcomes cannot occur together, so the occurrence of one of them depends on the
nonoccurrence of the others.

The impact of dependent events on probability is that, if two events are dependent, the
probability of the second event changes according to whether or not the first event has occurred.
In terms of symbolic notation, if X and Y are dependent: P(X/Y) ≠ P(X) and P(Y/X) ≠ P(Y),
where P(X/Y) denotes the probability of X occurring given that Y has occurred, and P(Y/X)
denotes the probability of Y occurring given that X has occurred.

3. Mutually exclusive events/ Disjoint Events – opposite of Joint events

Two or more events are mutually exclusive if the occurrence of one event precludes the
occurrence of the other events. This characteristic means that mutually exclusive events cannot
occur simultaneously and therefore can have no intersection.

In the toss of a single coin, the events of heads and tails are mutually exclusive. The person
tossing the coin gets either a head or a tail but never both. The probability of two mutually
exclusive events occurring at the same time is zero. In terms of set notation, if events X and Y
are mutually exclusive, P(X n Y) = 0, or the probability of X intersecting Y is zero.

Relating the above three types of events: mutually exclusive events (each with nonzero
probability) must be dependent, but dependent events need not be mutually exclusive. Independent
events, each with nonzero probability, cannot be mutually exclusive. Therefore, mutual
exclusivity implies dependence and independence implies that the events are not mutually
exclusive, but no other simple implications among these conditions hold true.

4. Collectively exhaustive events


A list of collectively exhaustive events contains all possible elementary events for an experiment.
Thus all sample spaces are collectively exhaustive lists. The sample space for an experiment can
be described as a list of events that are mutually exclusive and collectively exhaustive.

5. Complementary events
The complement of an event A is denoted A'. All elementary events of an experiment not in A
comprise its complement. For example, if in rolling one die, event A is getting an even number,
the complement of A is getting an odd number. If event A is getting a 5 on the roll of a die, the
complement of A is getting a 1, 2, 3, 4, or 6. The complement of event A contains whatever
portion of the sample space event A does not contain.

Using the complement of an event can sometimes be helpful in solving for probabilities because
of the rule: P (A') = 1 - P (A).

Principles of counting

Counting the number of ways in which events may occur in an experiment plays a major role in
probability. Some rules for counting are presented in this section. The first of these is called the
fundamental principle of counting.

The fundamental principle of counting specifies that if one event can occur in n1 ways and
another event can occur in n2 ways, the two events can occur together in n1 × n2 ways.

Permutations

Other important counting rules pertain to the arrangement of items with regard to the order of
the items. In this case we use permutations.

Permutations are groups of items where both the composition of the groups and the order
within a group are important.

The number of permutations of n distinct items arranged x at a time is

$$ {}_{n}P_{x} = \frac{n!}{(n-x)!}, $$

where n!, read n factorial, is

$$ n! = n(n-1)(n-2)\cdots(1). $$

By definition, 0! = 1.

Combinations

Permutations concern ways in which both order and composition are important. In combinations,
what matters is the composition of the group, not the order of the items as in
permutations.

The number of combinations possible by selecting x out of n distinct items is:

$$ {}_{n}C_{x} = \frac{n!}{(n-x)!\,x!}. $$
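The two counting formulas can be evaluated directly from factorials. This is a small sketch using hypothetical values n = 5 and x = 3 (not taken from the text); `math.perm` and `math.comb` are standard-library functions available from Python 3.8.

```python
import math

n, x = 5, 3  # hypothetical: arrange or select 3 items out of 5 distinct items

# nPx = n! / (n - x)!  (order matters)
permutations = math.factorial(n) // math.factorial(n - x)

# nCx = n! / ((n - x)! x!)  (order does not matter)
combinations = math.factorial(n) // (math.factorial(n - x) * math.factorial(x))

print(permutations)  # 60
print(combinations)  # 10

# The standard library gives the same counts directly.
assert permutations == math.perm(n, x)
assert combinations == math.comb(n, x)
```

Each combination of 3 items can be ordered in 3! = 6 ways, which is why there are six times as many permutations as combinations here.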

Methods of assigning probabilities

The three general methods of assigning probabilities are:

• The classical method (the equally likely approach)
• The relative frequency method
• The subjective method

Classical Method

The classical method of assigning probabilities is based on the assumption that each outcome is
equally likely to occur. Classical probability utilizes rules and laws. It involves an experiment
and an event. The definition assumes that all N possible outcomes have the same chance of
occurring. In this method probability values are assigned as follows:

$$ P(E) = \frac{n_e}{N}, $$

where N = the total number of possible outcomes of an experiment, and
n_e = the number of outcomes in which the event occurs out of the N outcomes.

As n_e can never be greater than N (no more than N outcomes in the population could possibly
possess attribute e), the highest value of any probability is 1. If the probability of an outcome
occurring is 1, the event is certain to occur. The smallest possible probability is zero. If none of
the N possible outcomes possesses the desired characteristic, e, the probability is 0/N = 0,
and the event is certain not to occur. The range of possibilities for probabilities is: 0 ≤ P(E) ≤ 1.

Thus probabilities are non-negative proper fractions or non-negative decimal values less than
or equal to 1.
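As a minimal sketch of the classical method, the following assumes a fair six-sided die, so that all N = 6 outcomes are equally likely. The function name `classical_probability` is illustrative, not part of any library.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # N = 6 equally likely outcomes
event_even = {2, 4, 6}              # event E: an even number is rolled

def classical_probability(event, space):
    """P(E) = n_e / N, assuming every outcome in `space` is equally likely."""
    return Fraction(len(event & space), len(space))

p_even = classical_probability(event_even, sample_space)
p_seven = classical_probability({7}, sample_space)          # impossible event
p_any = classical_probability(sample_space, sample_space)   # certain event

print(p_even, p_seven, p_any)  # 1/2 0 1
```

The impossible and certain events mark the two ends of the range 0 ≤ P(E) ≤ 1.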

Relative Frequency of Occurrence Method

The relative frequency of occurrence method of assigning probabilities is based on cumulated
historical data. With this method, the probability of an event occurring is equal to the number of
times the event has occurred in the past divided by the total number of opportunities for the event
to have occurred:

$$ \text{Probability by relative frequency of occurrence} = \frac{\text{number of times an event has occurred}}{\text{total number of opportunities for the event to occur}} $$

Relative frequency of occurrence is not based on rules or laws but on what has occurred in the
past.

Subjective method

The subjective method of assigning probability is based on the feelings or insights of a person
determining the probability. Subjective probability comes from the person’s intuition or
reasoning. Although not a scientific approach to probability, the subjective method often is based
on the accumulation of knowledge, understanding, and experience stored and processed in the
human mind. At times it is merely a guess. At other times, subjective probability can potentially
yield accurate probabilities.

Subjective probability can be a potentially useful way of tapping a person’s experience,
knowledge, and insight and using them to forecast the occurrence of some event, for example in
weather forecasting.

Types of probabilities

There are four types of probabilities. These are:

 Simple probability
 Joint probability
 Marginal probability
 Conditional probability

Simple Probability

Simple probabilities are relatively straightforward and are obtained using the formula
P (A) = n (A)/n, as in the relative frequency method.

Marginal probability

Marginal probability is denoted by P (E), where E is some event. A marginal probability is
usually computed by dividing some subtotal by the whole. An example of marginal probability is
the probability that a student is infected by HIV/AIDS. This probability is computed by dividing
the number of students infected by HIV/AIDS by the total number of students. The probability of
a person wearing glasses is also a marginal probability. This probability is computed by dividing
the number of people wearing glasses by the total number of people. A marginal probability is
found in the margin of any joint probability table. It is the sum of the joint probabilities for a
single category of one attribute over all possible categories of another attribute.

Example:

ABC Company manufactures window air conditioners in both a deluxe model (D) and a standard
model (S). An auditor engaged in a compliance audit of the firm is validating the sales account
for the month April. She has collected 200 invoices for the month, some of which were sent to
wholesalers (W) and the remainders to retailers (R). Of the 140 retail invoices, 28 are for the
standard model. Only 24 of the wholesale invoices are for the standard model. If the auditor
selects one invoice at random, find the following probabilities.

a) The invoice selected is for the deluxe model.
b) The invoice selected is for the standard model.
c) The invoice selected is a wholesale invoice.
d) The invoice selected is a retail invoice.

Solution

                Wholesale, W       Retail, R          Total

Deluxe, D        36   0.18*       112   0.56*       148   0.74**

Standard, S      24   0.12*        28   0.14*        52   0.26**

Total            60   0.30**      140   0.70**      200

P (D) = 148/200 = 0.74
P (S) = 52/200 = 0.26
P (W) = 60/200 = 0.30
P (R) = 140/200 = 0.70

P (D) = P (WnD) + P (RnD) = 0.18 + 0.56 = 0.74
P (W) = P (WnD) + P (WnS) = 0.18 + 0.12 = 0.30

* Joint Probabilities

** Marginal Probabilities
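The joint and marginal probabilities in the solution can be reproduced from the raw invoice counts. The sketch below assumes the four cell counts derived in the solution (36, 112, 24, 28); the dictionary layout and the helper `marginal` are illustrative choices.

```python
# Invoice counts from the ABC Company example: key = (model, customer type).
counts = {
    ("D", "W"): 36, ("D", "R"): 112,
    ("S", "W"): 24, ("S", "R"): 28,
}
total = sum(counts.values())  # 200 invoices

# Joint probabilities: each cell divided by the grand total.
joint = {cell: n / total for cell, n in counts.items()}

def marginal(label, position):
    """Sum the joint probabilities over the other attribute.

    position 0 selects a model (D or S); position 1 selects a customer type (W or R).
    """
    return sum(p for cell, p in joint.items() if cell[position] == label)

print(round(marginal("D", 0), 2))  # 0.74
print(round(marginal("W", 1), 2))  # 0.3
```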

Union probability

A second type of probability is the union of two events. Union probability is denoted by P (E1 U
E2), where E1 and E2 are two events. P (E1 U E2) is the probability that E1 will occur or that E2
will occur or that both E1 and E2 will occur. An example of union probability is the probability
that a person is infected by HIV/AIDS or Cancer. To qualify for the union, the person has to be
infected with at least one of the diseases. Another example is the probability that a person
wears eyeglasses or is a soldier. The union includes all people who wear eyeglasses and all
people who are soldiers, including soldiers who wear eyeglasses.

Joint probability

A third type of probability is the intersection of two events or joint probability. A joint
probability shows the probability that an observation will possess two (or more) characteristics
simultaneously. That is, it measures the probability of two or more events occurring together.
The joint probability of events E1 and E2 occurring is denoted P (E1 n E2). Sometimes P (E1 n
E2) is read as the probability of E1 and E2. To qualify for the intersection, both events must
occur. Joint probability ranges from 0 to 1, inclusive [0, 1]. The sum of all joint probabilities
in a joint probability table must be equal to 1.0. An example of joint probability is the
probability that a person is infected with both HIV/AIDS and cancer; being infected with only one
of the diseases is not sufficient. A second example of joint probability is the probability that
a person is a soldier and also wears eyeglasses.

Conditional probability

The fourth type is conditional probability. Conditional probability is denoted by P (E1 / E2). This
expression is read as: the probability that E1 will occur given that E2 is known to have occurred.
The conditional probability of an event E1, given event E2, is the ratio of the joint probability of
the two events to the marginal probability of E2.

For P(Y) > 0:

$$ P(X/Y) = \frac{P(X \text{ and } Y)}{P(Y)} = \frac{P(X \cap Y)}{P(Y)} $$

For P(X) > 0:

$$ P(Y/X) = \frac{P(Y \text{ and } X)}{P(X)} = \frac{P(Y \cap X)}{P(X)} $$

Example:

Blue Nile University recently conducted a survey of undergraduate students in order to gather
information about the usage of the library. The population for this study included all 4000
undergraduate students enrolled in the university. The library officers are interested in increasing
usage, particularly among females (F) and seniors (S) at the university. Of the 4000 students, 800
students are seniors, 1800 students are females and 450 of the 1800 females are seniors.

Required:

1. What is the probability that a student selected at random is a senior given that the selected
student is female?
2. What is the probability that a student selected at random is female given that the selected student
is senior?

Solution:

                Senior, S         Non-Senior, N        Total

Female, F       450   0.1125      1350   0.3375      1800   0.45

Male, M         350   0.0875      1850   0.4625      2200   0.55

Total           800   0.20        3200   0.80        4000

1. P (S/F) = P (SnF)/ P (F) = 0.1125/.45 = 0.25


2. P (F/S) = P (SnF)/ P (S) = 0.1125/.20 = 0.5625
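The two conditional probabilities follow directly from the counts given in the problem. This is a small sketch; the variable names are illustrative.

```python
total_students = 4000
seniors = 800
females = 1800
female_seniors = 450

p_s = seniors / total_students               # P(S) = 0.20
p_f = females / total_students               # P(F) = 0.45
p_s_and_f = female_seniors / total_students  # P(S n F) = 0.1125

p_s_given_f = p_s_and_f / p_f   # P(S/F) = P(S n F) / P(F)
p_f_given_s = p_s_and_f / p_s   # P(F/S) = P(S n F) / P(S)

print(round(p_s_given_f, 4))  # 0.25
print(round(p_f_given_s, 4))  # 0.5625
```

Note that P(S/F) and P(F/S) differ because the joint probability is divided by a different marginal in each case.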

Using conditional probability, the joint probability of X and Y is calculated as:

$$ P(X \cap Y) = P(X/Y) \cdot P(Y) = P(Y/X) \cdot P(X) $$

The Bayes’ Rule

An extension to the conditional law of probabilities is Bayes’ rule, which was developed by and
named for the English clergyman Thomas Bayes (1702-1761). Bayes’ rule is a formula that
extends the use of the law of conditional probabilities to allow revision of original probabilities
when new information becomes available. The two core ideas in Bayes’ rule are the prior
probability and the posterior (revised) probability.

Prior probability – the initial probability, determined before new information is
obtained. It is the starting point for Bayes’ theorem.

Posterior probability – a probability that has been revised based on new information; it
is calculated after the new information is obtained.

Prior probability (determined subjectively) → New information (sample information) → Application of Bayes’ theorem → Posterior probability

The Bayes’ theorem simplifies the computation of P(X/Y) when P (XnY) and P(Y) are not given
directly.

 Conditional Probability Rule (The Bayes’ Theorem for One Event)

$$ P(X/Y) = \frac{P(Y/X) \cdot P(X)}{P(Y)} $$

 Bayes’ Theorem for Two Events

$$ P(X_i/Y) = \frac{P(Y/X_i) \cdot P(X_i)}{P(Y/X_1) \cdot P(X_1) + P(Y/X_2) \cdot P(X_2)} $$

 Bayes’ Theorem for Three Events

$$ P(X_i/Y) = \frac{P(Y/X_i) \cdot P(X_i)}{P(Y/X_1) \cdot P(X_1) + P(Y/X_2) \cdot P(X_2) + P(Y/X_3) \cdot P(X_3)} $$

The general Bayes’ rule is presented below.

$$ P(X_i/Y) = \frac{P(Y/X_i) \cdot P(X_i)}{P(Y/X_1) \cdot P(X_1) + P(Y/X_2) \cdot P(X_2) + \cdots + P(Y/X_n) \cdot P(X_n)} = \frac{P(X_i) \cdot P(Y/X_i)}{\sum_{j=1}^{n} P(X_j) \cdot P(Y/X_j)} $$

Example:

1. A company has three machines A, B and C which all produce the same two parts, X and Y. Of all
the parts produced, machine A produces 60%, machine B produces 30%, and machine C
produces the rest. 40% of the parts made by machine A are part X, 50% of the parts made by
machine B are part X, and 70% of the parts made by machine C are part X. A part produced by
this company is randomly sampled and is determined to be an X part. With the knowledge that it
is an X part, find the probabilities that the part came from machine A, B or C.

Solution:

P (A) = 0.6     P (X/A) = 0.4     P (A/X) = ?

P (B) = 0.3     P (X/B) = 0.5     P (B/X) = ?

P (C) = 0.1     P (X/C) = 0.7     P (C/X) = ?

Method 1

$$ P(A/X) = \frac{P(A \cap X)}{P(X)} = \frac{P(X/A) \cdot P(A)}{P(X/A) \cdot P(A) + P(X/B) \cdot P(B) + P(X/C) \cdot P(C)} = \frac{0.6 \times 0.4}{(0.6 \times 0.4) + (0.3 \times 0.5) + (0.1 \times 0.7)} = \frac{0.24}{0.46} = 0.52 $$

$$ P(B/X) = \frac{P(B \cap X)}{P(X)} = \frac{P(X/B) \cdot P(B)}{P(X/A) \cdot P(A) + P(X/B) \cdot P(B) + P(X/C) \cdot P(C)} = \frac{0.3 \times 0.5}{(0.6 \times 0.4) + (0.3 \times 0.5) + (0.1 \times 0.7)} = \frac{0.15}{0.46} = 0.33 $$

$$ P(C/X) = \frac{P(C \cap X)}{P(X)} = \frac{P(X/C) \cdot P(C)}{P(X/A) \cdot P(A) + P(X/B) \cdot P(B) + P(X/C) \cdot P(C)} = \frac{0.1 \times 0.7}{(0.6 \times 0.4) + (0.3 \times 0.5) + (0.1 \times 0.7)} = \frac{0.07}{0.46} = 0.15 $$

NB: P (A/X) + P (B/X) + P (C/X) = 1.00

P (A/X) + P (A’/X) = 1.0

Method 2

                    Machine

Product        A        B        C        Total

X              0.24     0.15     0.07     0.46

Y              0.36     0.15     0.03     0.54

Total          0.60     0.30     0.10     1.00

P (A/X) = 0.24/0.46 = 0.52

P (B/X) = 0.15/0.46 = 0.33

P (C/X) = 0.07/0.46 = 0.15

Total = 1.00

The sum of the joint probabilities, and the sum of the conditional probabilities, should each be
equal to one.

Method 3 – Bayesian Table

Event,   Prior probability,   Conditional prob.,   Joint prob.,     Posterior/Revised prob.,
Ei       P(Ei)                P(X/Ei)              P(EinX)          P(Ei/X)

A        0.60                 0.40                 0.24             0.24/0.46 = 0.52

B        0.30                 0.50                 0.15             0.15/0.46 = 0.33

C        0.10                 0.70                 0.07             0.07/0.46 = 0.15

                                                   P(X) = 0.46      1.00

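Method 4 – Tree Diagram

Machine (prior)   Part (conditional)   Joint probability

A (0.60) --+-- X/A (0.40) -- 0.24      P (A/X) = 0.24/0.46 = 0.52
           +-- Y/A (0.60) -- 0.36

B (0.30) --+-- X/B (0.50) -- 0.15      P (B/X) = 0.15/0.46 = 0.33
           +-- Y/B (0.50) -- 0.15

C (0.10) --+-- X/C (0.70) -- 0.07      P (C/X) = 0.07/0.46 = 0.15
           +-- Y/C (0.30) -- 0.03

P(X) = 0.24 + 0.15 + 0.07 = 0.46       Sum of revised probabilities = 1.00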

Find: P(Y) = 0.54

P (A/Y) = 0.36/0.54 = 0.667

P (B/Y) = 0.15/0.54 = 0.278

P (C/Y) = 0.03/0.54 = 0.055

Total = 1.000
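All four methods carry out the same computation: multiply each prior by its conditional probability, add the joint probabilities, and divide. A minimal sketch of that computation follows; the function name `posterior` and the dictionary layout are assumptions made for illustration, not a standard library API.

```python
def posterior(priors, likelihoods):
    """Return the revised probabilities P(E_i / evidence) for each event E_i.

    priors:      {event: P(E_i)}
    likelihoods: {event: P(evidence / E_i)}
    """
    joint = {e: priors[e] * likelihoods[e] for e in priors}  # P(E_i n evidence)
    evidence = sum(joint.values())                           # P(evidence)
    return {e: joint[e] / evidence for e in joint}

priors = {"A": 0.6, "B": 0.3, "C": 0.1}       # share of parts made by each machine
p_x_given = {"A": 0.4, "B": 0.5, "C": 0.7}    # P(X / machine)

revised = posterior(priors, p_x_given)
for machine, p in revised.items():
    print(machine, round(p, 2))
# A 0.52
# B 0.33
# C 0.15
```

Passing the complementary likelihoods (0.6, 0.5 and 0.3) instead gives the revised probabilities for a Y part.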

2. Bruk, Alemayehu and Yohannes fill orders in a fast food restaurant. Bruk incorrectly fills 20% of
the orders he takes, Alemayehu incorrectly fills 12% of the orders he takes, and Yohannes
incorrectly fills 5% of the orders he takes. Bruk fills 30% of all orders, Alemayehu fills 45% of all
orders, and Yohannes fills 25% of all orders. An order has just been filled.

a) What is the probability that Alemayehu filled the order? 0.45
b) If the order was filled by Yohannes, what is the probability that it was filled
correctly? 0.95
c) Who filled the order is unknown, but the order was filled correctly. What are the revised
probabilities that Bruk, Alemayehu or Yohannes filled the order? 0.2748, 0.4533 and
0.2719
d) Who filled the order is unknown, but the order was filled incorrectly. What are the revised
probabilities that Bruk, Alemayehu or Yohannes filled the order? 0.4743, 0.4269 and
0.0988

3. A major league baseball team has four starting pitchers: Girma, Robel, Solomon, and Asrat. Each
pitcher starts every fourth game. The team wins 60% of all games that Girma starts, 45% of all
games that Robel starts, 35% of all games that Solomon starts, and 40% of all games that Asrat
starts. An avid fan has just returned from a three-week vacation in the wilderness and found out
that the team played yesterday.

a) What is the probability that Girma started the game? 0.25
b) What is the probability that Solomon started the game? 0.25
c) If the team won yesterday, revise the probability of each pitcher starting the game. 0.333,
0.250, 0.194 and 0.222

Laws of Probability

Additive Law

The general law of addition is used to find the probability of the union of two events, P (E1 U
E2). The general Law of Addition is presented as follows:

$$ P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2) $$

where E1 and E2 are events and (E1 ∩ E2) is the intersection of E1 and E2.

Special Rule of Addition

If two events are mutually exclusive, the probability of the union of the two events is the
probability of the first event plus the probability of the second event. Because mutually exclusive
events do not intersect, nothing has to be subtracted out. The formula is shown below.

If E1 and E2 are mutually exclusive events,

$$ P(E_1 \cup E_2) = P(E_1) + P(E_2). $$

Example:

1. A husband and a wife, each 20 years old, are debating whether to set up a retirement program for
themselves. Benefits are paid to the man or woman at the age of 70. If both have died before
reaching age 70, no benefits are paid. Assume that the probability that a man aged 20 lives up to
age 70 is approximately 0.6 and that the probability that a woman aged 20 lives up to age 70 is
approximately 0.7. If the husband and wife join the program, what is the probability
that either the man or the woman will collect benefits? Assume that the chances of the man or
woman dying are independent of each other.

Solution:

Let M= man lives up to age 70, W = woman lives up to age 70.

P (M) = 0.60     P (W) = 0.70

P (WUM) = P (W) + P (M) – P (WnM)

Since the two events are independent, the joint probability that both the man and the woman
live up to age 70 is equal to the product of the individual marginal probabilities:

P (WnM) = P (M) * P (W) = 0.60 * 0.70 = 0.42

P (WUM) = 0.70 + 0.60 – 0.42

= 0.88
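The same answer can be cross-checked with the complement rule, since "at least one of them lives to 70" is the complement of "neither lives to 70". The sketch below uses the values from the worked solution; the function name is illustrative.

```python
def union_probability(p_a, p_b, p_a_and_b):
    """General law of addition: P(A U B) = P(A) + P(B) - P(A n B)."""
    return p_a + p_b - p_a_and_b

p_m, p_w = 0.60, 0.70   # values used in the worked solution
p_both = p_m * p_w      # independence assumed: P(M n W) = P(M) * P(W)

p_either = union_probability(p_m, p_w, p_both)
print(round(p_either, 2))  # 0.88

# Cross-check with the complement: 1 - P(neither lives to 70).
p_neither = (1 - p_m) * (1 - p_w)
assert abs(p_either - (1 - p_neither)) < 1e-12
```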

2. According to a recent study conducted by businessmen, 76% of all shareholders have some
college education. Suppose that 37% of all adults have some college education and that 22% of
all adults are shareholders. For a randomly selected adult:

a) What is the probability that the person does not own shares of stock? 0.78
b) What is the probability that the person owns shares of stock or has some college
education? 0.4228
c) What is the probability that the person has neither some college education nor owns
shares of stock? With A = owns shares of stock and B = has some college education,
P(A' n B') = 1 – P(AUB) = 0.5772
d) What is the probability that the person does not own shares of stock or has no college
education? P(A' U B') = 1 – P(AnB) = 1 – 0.1672 = 0.8328
e) What is the probability that the person owns shares of stock or has some college
education, but not both? P(AUB) – P(AnB) = 0.4228 – 0.1672 = 0.2556

3. A 1999 survey of 20,000 sales professionals conducted by Ethiopian Telecommunication
Corporation (ETC) found that 15% of all sales professionals use home fax machines and 35%
use mobile telephones. Suppose that 1% of all sales professionals both have home fax machines
and use mobile telephones.

a) What is the probability that a randomly selected sales professional has a home fax
machine or uses a mobile telephone?
b) What is the probability that a randomly selected sales professional neither has a home fax
machine nor uses a mobile telephone?
c) Suppose that no sales professional has both a home fax machine and uses a mobile
telephone. What is the probability that a randomly selected sales professional has a home
fax machine or uses a mobile telephone?

Multiplicative law

The probability of the intersection of two events (E1 ∩ E2) is called the joint probability. The
general law of multiplication is used to find the probability of the intersection of two events or
joint probability. The general law of multiplication is stated as follows:

$$ P(E_1 \cap E_2) = P(E_1) \cdot P(E_2/E_1) = P(E_2) \cdot P(E_1/E_2) $$

Special Rule of Multiplication

If events E1 and E2 are independent, a special law of multiplication can be used to find the
intersection of E1 and E2.

If E1 and E2 are independent events,

$$ P(E_1 \cap E_2) = P(E_1) \cdot P(E_2). $$

Example:

1. Test the matrix for the 200 executive responses to determine whether industry type is independent
of geographic location.

                                   Geographic Location

Industry type        North East, D   South East, E   Mid West, F   West, G   Total

Finance, A                24              10              8           14       56

Manufacturing, B          30               6             22           12       70

Communication, C          28              18             12           16       74

Total                     82              34             42           42      200

Solution:

Select one industry type and one geographic location (Say A – Finance and G – West). Does P
(A/G) = P (A)?

P (A/G) = P (AnG)/P (G) = 0.07/0.21 = 0.33

P (A) = 56/200 = 0.28

Since P (A/G) ≠P (A), industry type and geographic location are not independent.
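The independence check for one cell can be written out as follows. This is a sketch using the counts from the table; the nested-dictionary layout and helper names are illustrative.

```python
# Executive responses: rows = industry type, columns = geographic location.
table = {
    "A": {"D": 24, "E": 10, "F": 8,  "G": 14},   # Finance
    "B": {"D": 30, "E": 6,  "F": 22, "G": 12},   # Manufacturing
    "C": {"D": 28, "E": 18, "F": 12, "G": 16},   # Communication
}
grand_total = sum(sum(row.values()) for row in table.values())  # 200

def p_row(r):
    """Marginal probability of an industry type."""
    return sum(table[r].values()) / grand_total

def p_row_given_col(r, c):
    """P(r / c) = P(r n c) / P(c); the grand total cancels out."""
    column_total = sum(row[c] for row in table.values())
    return table[r][c] / column_total

print(round(p_row_given_col("A", "G"), 2))  # 0.33
print(round(p_row("A"), 2))                 # 0.28
```

Since the two values differ, this single cell is enough to show that the two variables are not independent.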

2. Considering the above problem, if a respondent is randomly selected from these data:

a) What is the probability that this executive is from the mid west? 0.21
b) What is the probability that a respondent is from the communication industry or from the north
east? 0.64
c) What is the probability that a respondent is from the south east or from the finance industry?
(34 + 56 – 10)/200 = 0.40
d) What is the probability that this executive is from the south east or the west? 0.38

3. The results of a survey asking, “Do you have a calculator and/or a computer in your home?” are
as follows:

                      Calculator

                      Yes      No

Computer    Yes       46       3

            No        11       15

Is the variable calculator independent of the variable computer? Why or why not? No, because
P (calculator/computer) = 46/49 = 0.94, whereas P (calculator) = 57/75 = 0.76.

Laws of conditional probability

Conditional probabilities are based on knowledge of one of the variables. If E1 and E2 are two
events, the conditional probability of E1 occurring given that E2 is known or has occurred is
expressed as P (E1/E2). The formula for finding a conditional probability is given below.

$$ P(E_1/E_2) = \frac{P(E_1 \cap E_2)}{P(E_2)} = \frac{P(E_1) \cdot P(E_2/E_1)}{P(E_2)} $$

Special Law of Conditional Probability

If E1 and E2 are independent events, the conditional probability and marginal probability of the
two events are equal. That is, P (E1/E2) = P (E1), and P (E2/E1) = P (E2).

Chapter Five
Probability Distributions
Basic concepts

A variable is a characteristic that can have different values or outcomes. A variable whose
numerical value is determined by an outcome of a random experiment, or, a variable whose
outcomes occur by chance is called a random variable. Depending on the values a random
variable can take, there are two types of random variables: Discrete random variables and
Continuous random variables.

Discrete Random Variables: these are random variables that can assume only non-negative
whole numbers such as 0, 1, 2, 3, …, n. For example, the number of students in a class, the
number of telephone calls received in a given hour, and the number of people living in a certain
area can take only non-negative whole numbers. As a result, there are gaps or voids between
their possible values along an interval.

Continuous Random Variables: these are random variables that can take any value over an
interval. Thus, continuous random variables have no gaps or unassumed values. They can assume
an uncountably infinite number of values. For example, the height of an individual, the distance
traveled by a truck driver in a given hour, and the temperature of a room on a given day are
continuous random variables.

NB: Continuous random variables typically record the value of a measurement such as time,
weight, volume, or length, while discrete random variables count the number of times a
particular attribute is observed.

Probability Distribution: a listing of the possible values that a random variable can assume
along with their probabilities. It is any representation of the values of a random variable and the
associated probabilities. Depending on the type of random variable we deal with,
there are two types of probability distributions: Discrete Probability Distributions and
Continuous Probability Distributions.

Discrete Probability Distribution is any representation of the values of a discrete random
variable and the associated probabilities. The most commonly used discrete probability
distributions include the binomial, hypergeometric and Poisson distributions.

The Binomial Distribution

Perhaps the most widely known of all discrete probability distributions is the binomial
distribution. The binomial distribution has the following underlying assumptions:

i. The experiment involves n identical trials, or sampling is done with replacement.
ii. Each trial has only two possible mutually exclusive outcomes [Bi = two].
iii. Each trial is independent of the previous trials.
iv. The probability of success (p) and of failure (q = 1 – p) remain constant for each trial.
v. In n trials, only X successes are possible, where X is a whole number between 0 and n [0 ≤ X ≤ n].
vi. It is applicable if the sample size n is less than 5% of the population size N, or if samples
are taken with replacement.

To compute the probability of a given number of successes in a binomial distribution we use the
Binomial Formula. It is stated as follows:

The probability of exactly x successes in n trials is

$$ P(x) = \frac{n!}{x!\,(n-x)!} \, p^{x} q^{n-x} $$

where: P(x) = probability of x successes in n trials
n = number of trials (sample size)
x = number of successes desired
p = probability of success
q = 1 – p = probability of failure

Example:

1. If we toss a coin three times, what is the probability of getting exactly two heads?

Solution:

In a single toss, the probability of getting a head or a tail is 0.5. In tossing the coin three times,
the following are the possible outcomes.

HHH, HHT, HTH, HTT, THH, THT, TTH, TTT

Exactly two heads occur in three of these eight outcomes (HHT, HTH and THH). The probability of
getting exactly two heads is, therefore, computed as

= (0.5*0.5*0.5) + (0.5*0.5*0.5) + (0.5*0.5*0.5)

= 0.125 * 3 = 0.375

Using the Binomial formula

p = 0.50     q = 1 – 0.50 = 0.50     n = 3     x = 2

P(x=2) = nCx * p^x * q^(n–x)

= 3C2 * 0.5^2 * 0.5^1

= 3(0.25*0.5) = 3(0.125) = 0.375

There are three ways of choosing exactly two heads from a total of three trials.

2. A researcher wants to test the claim that 10% of all people are left-handed by randomly selecting
forty students at a university. What is the probability of getting six left handed students among
forty?

Solution:

P = 0.10 q = 1 – 0.10 = 0.90 n = 40 x=6

P(x=6) = 0.1068

If 10% of the population is left-handed, about 10.68% of the time the researcher would get six
who are left handed in a sample of forty.

3. Based on past data, approximately 30% of the oil wells drilled in areas having a certain favorable
geological formation have struck oil. A company has identified 5 locations that possess this
formation. Assuming that the chance of striking oil at any location is independent of that at any
other, calculate the probability that exactly 2 of the 5 wells strike oil.

Solution:

P = 0.30 q = 1 – 0.30 = 0.70 n=5 x=2

P(x = 2) = 0.3087

If the probability of striking oil in areas having this favorable geological formation is 0.3, then
about 31% of the time exactly 2 of 5 wells drilled will strike oil.

4. The quality control department of a manufacturer tested the most recent batch of 1000 catalytic
converters produced and found 50 of them defective. Subsequently, an employee
unwittingly/unintentionally mixed the defective converters with the non-defective ones. If a
sample of three converters is randomly selected from the mixed batch, what is the probability
that the employee may get one defective item?

Solution:

Before we try to solve this problem we have to check whether all the assumptions of the binomial
distribution are satisfied. One of the assumptions states that the sample size n must be
less than five percent of the population size N. In our case, the sample size is less than 5% of the
population size [3/1000 = 0.003 < 0.05], so we can use the binomial distribution to solve this
specific exercise.

N = 1000     n = 3     x = 1

p = R/N, where R is the number of successes (here, defective items) in the population of size N

= 50/1000 = 0.05

q = 1 – 0.05 = 0.95

P(x=1) = 0.1354

If 5% of the converters are defective, 13.54% of the time the quality control
department would get exactly 1 defective item in a sample of three converters.

5. A town has three ambulances for emergency transportation to a hospital. The probability that any
one of these will be available at a given time is 0.75. If a person calls for an ambulance, what is
the probability that an ambulance will be available?

Solution:

n=3 p = 0.75 q = 0.25

Probability of getting (at least) an ambulance is calculated as one minus the probability of getting
no ambulance.

P(ambulance) = 1 – P(0 ambulance)

= 1 – (3C0 * 0.75^0 * 0.25^3)

= 1 – 0.0156

= 0.9844
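The answers to the five examples above can be checked with a direct implementation of the binomial formula. This is a minimal sketch; `binomial_pmf` is an illustrative name, and `math.comb` (Python 3.8+) supplies nCx.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p**x * q**(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

print(round(binomial_pmf(2, 3, 0.5), 4))       # 0.375   two heads in three tosses
print(round(binomial_pmf(6, 40, 0.10), 4))     # 0.1068  six left-handed in forty
print(round(binomial_pmf(2, 5, 0.30), 4))      # 0.3087  two of five wells strike oil
print(round(binomial_pmf(1, 3, 0.05), 4))      # 0.1354  one defective converter in three
print(round(1 - binomial_pmf(0, 3, 0.75), 4))  # 0.9844  at least one ambulance available
```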

Using Individual Binomial Probability Table

Tables have been developed that give the probability of x successes in n trials for a binomial
experiment. These tables are generally easy to use and quicker than the Binomial Formula,
especially when the number of trials involved or sample size, n is large. In order to use this table,
it is necessary to specify the values of n, p and x. (See your text Van Matre Appendix A-
Individual Terms).

Some binomial tables only show values of p up to 0.5. Thus, it would appear that these tables
cannot be used when the probability of success exceeds p = 0.5. However, such tables can be used by
noting that the probability of n-x failures is also the probability of x successes. That is, finding
the probability of x successes is equal to finding the probability of n-x failures. nCx and nC(n-x)
are always equal.

Example:
Suppose that 70% of all cola drinkers select non diet colas. If 10 cola drinkers are randomly
selected, what is the probability that 4 of them will be diet cola drinkers?

Solution:

Finding the probability of 4 diet cola drinkers is equivalent to finding the probability of 6 non
diet cola drinkers.

n = 10 p= 0.7 q= 0.3 x= 6

P(x=6) = 0.2001

Finding the Probabilities that the Number of Successes X Lie In a Given Interval
(Cumulative Probabilities)

Cumulative probabilities are the sum of individual probability values. The Binomial formula
P(X) = nCx * p^x * q^(n–x) gives us the probability of exactly x successes in n trials (sample size n).
To find cumulative probabilities such as P(x≥3), P(x≤2), P(x>10) or P(X1≤X≤X2), for example
P(10≤X≤20), we should add the respective exact/individual probability values.

Example:

1. A project manager has determined that a subcontractor fails to deliver standard orders 20% of the
time. The project manager has six orders that his subcontractor has agreed to deliver. What is the
probability that

a) The subcontractor will deliver all of the orders? 0.2621


b) The subcontractor will deliver at least four of the orders? 0.9011
c) The subcontractor will deliver exactly five orders? 0.3932
d) The subcontractor will fail to deliver at most two of the orders? 0.9011
e) What do you conclude from your answers in parts (b) and (d)? Finding the probability
of x successes is equal to finding the probability of n-x failures.
2. About 20% of all pro football players are injured during a given season. A team has four star
players. What is the probability that at least one of the star players gets injured?

Solution:

n=4 p= 0.2 q= 0.8 x≥ 1

P (x≥1) = 1 – P (X≤0) = 1 – P(x=0)

= 1 – 0.4096 = 0.5904

3. A lawyer estimates that 40% of the cases in which she represented the defendant were won. If the
lawyer is presently representing 10 defendants in different cases, what is the probability that at
least 5 of the cases will be won? What are you assuming here?

Solution:

We assume that the cases the lawyer is handling are independent of one another, and that the
probability of winning is 0.4 for each case. With this assumption:

n = 10 p= 0.4 q= 0.6 x≥ 5

P (x≥5) = P(x=5) + P(x=6) + P(x=7) + P(x=8) + P(x=9) +P(x=10)

= 0.2007 + 0.1115 + 0.0425 + 0.0106 + 0.0016 + 0.0001

= 0.3670
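
As a cross-check, the sum of individual terms can be computed directly. This is a sketch only; the function names are our own. Because it does not round each term to four decimals first, its result may differ from the table-based sum in the last digit.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_at_least(k, n, p):
    """P(X >= k): add the individual probabilities for k, k+1, ..., n."""
    return sum(binomial_pmf(x, n, p) for x in range(k, n + 1))

# Lawyer example: n = 10 cases, p = 0.4, at least 5 wins.
p_win_5_or_more = binomial_at_least(5, 10, 0.4)
print(round(p_win_5_or_more, 4))
```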

Using Cumulative Binomial Probability Table

If a cumulative probability table is given, the exact (individual) probability of X is obtained by
subtracting the cumulative probability of X-1 from the cumulative probability of X. The following
relations hold for a variable that takes whole-number values:

- P (X=a) = P (X≤a) – P (X≤a-1)

E.g. P (X=3) = P (X≤3) – P (X≤2)

- P (X≥a) = 1 - P (X≤a-1)

E.g. P (X≥3) = 1 - P (X≤2)

- P (X>a) = 1 - P (X≤a)

E.g. P (X>3) = 1 - P (X≤3)

- P (a1≤X≤a2) = P (X≤a2) - P (X≤a1-1)

E.g. P (10≤X≤20) = P (X≤20) - P (X≤9)

- P (a1<X<a2) = P (X≤a2-1) - P (X≤a1)

E.g. P (10<X<20) = P (X≤19) - P (X≤10)
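
The rules for reading a cumulative table can be sketched in Python. Here binomial_cdf plays the role of the cumulative table; the function name and the values n = 10, p = 0.4 are illustrative assumptions, not taken from the text.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial variable."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(a, n, p):
    """P(X <= a); the sum is empty (0) when a is negative."""
    return sum(binomial_pmf(x, n, p) for x in range(0, a + 1))

n, p = 10, 0.4  # illustrative values only

p_exactly_3 = binomial_cdf(3, n, p) - binomial_cdf(2, n, p)    # P(X = 3)
p_at_least_3 = 1 - binomial_cdf(2, n, p)                       # P(X >= 3)
p_more_than_3 = 1 - binomial_cdf(3, n, p)                      # P(X > 3)
p_2_to_5_incl = binomial_cdf(5, n, p) - binomial_cdf(1, n, p)  # P(2 <= X <= 5)
p_2_to_5_excl = binomial_cdf(4, n, p) - binomial_cdf(2, n, p)  # P(2 < X < 5)
```

Note that the inclusive interval subtracts the cumulative value one step below the lower end point, so that the lower end point itself is kept.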

Example:

1. According to a study, approximately 55% of all hospitals in a given town contain
100 or more beds. A researcher draws a sample of 15 hospitals by randomly selecting names
from a directory of hospitals.

a) What is the probability of selecting 10 or more hospitals that have 100 or more beds?
b) What is the probability of selecting less than five hospitals that have 100 or more beds?

c) What is the probability of selecting from six to ten hospitals, inclusive, that have 100 or more
beds?
2. A manufacturing company produces 10,000 plastic parts per week. This company supplies
plastic parts to another company, which packages the plastic parts as part of picnic sets. The
second company randomly samples 10 plastic parts sent from the supplier. If two or fewer of the
sampled plastic parts are defective, the second company accepts the lot. What is the probability
that the lot will be accepted if 10% of the parts the manufacturing company produces are
defective? 20% defective? 30% defective? 40% defective?

Computation of Mean (µ) and Variance (δ²) of a Discrete Random Variable

Expected value or mean of a random variable is a measure of the central location for the random
variable. It is a long run average of occurrences. On any one trial of a discrete random variable
there will be only one outcome. However, if the process is repeated long enough, the average of
the outcomes will tend to approach some expected value or mean. This mean or expected value is
computed as

µ = E(X) = ∑[X*P(X)] = ∑Xi*fi / ∑fi

Where: E(X) = long run average

X = an outcome

P(X) = the probability of that outcome

Variance of a discrete random variable, which measures how far the values of the variable are
dispersed around the mean, is calculated as

δ² = ∑(X-µ)²*P(X)

Where: X = an outcome

µ = mean

P(X) = the probability of that outcome

And the standard deviation of a discrete random variable is calculated simply by taking the
square root of the variance: δ = √[∑(X-µ)²*P(X)].
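
The three formulas can be applied directly in code. The following is a minimal sketch; the probability distribution in it is hypothetical and chosen only to illustrate the arithmetic.

```python
from math import sqrt

# Hypothetical distribution, for illustration only: outcome -> probability.
pmf = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

# Mean: sum of X * P(X)
mean = sum(x * p for x, p in pmf.items())
# Variance: sum of (X - mean)^2 * P(X)
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
# Standard deviation: square root of the variance
std_dev = sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```

For this distribution the mean works out to 1.7, the variance to 0.81 and the standard deviation to 0.9.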

Mean, Variance and Standard Deviation of a Binomial Distribution

The binomial probability distribution is a discrete probability distribution. Hence, the method
used to compute the mean and standard deviation for a discrete random variable also applies when
computing µ and δ for a binomial distribution.

A binomial distribution has an expected value or long run average, which is denoted by µ. The
value of µ is determined by n*p. The long run average or expected value means that if n items
are sampled over and over again for a long period of time and if p is the probability of getting a
success on one trial, the average number of successes per sample is expected to be n*p.

As for other discrete variables, the variance of the binomial distribution is calculated as δ² =
∑(X-µ)²*P(X), which is also equal to npq. The standard deviation of the binomial distribution is
calculated by taking the square root of the variance: δ = √[∑(X-µ)²*P(X)] = √(npq).
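
The shortcuts µ = np and δ² = npq can be checked against the general definitions. The sketch below uses n = 10 and p = 0.4 as illustrative values; any other valid n and p would serve equally well.

```python
from math import comb, sqrt

n, p = 10, 0.4   # illustrative values only
q = 1 - p

# Full binomial distribution: x -> P(X = x)
pmf = {x: comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)}

# General discrete formulas
mean = sum(x * prob for x, prob in pmf.items())
variance = sum((x - mean) ** 2 * prob for x, prob in pmf.items())

# Binomial shortcuts
print(round(mean, 4), n * p)          # both should equal np
print(round(variance, 4), n * p * q)  # both should equal npq
print(round(sqrt(variance), 4))       # standard deviation
```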

The Poisson distribution

The Poisson distribution is named after the French mathematician Siméon Denis Poisson (1781-1840),
who published an article in 1837 discussing the distribution. The Poisson distribution is
another discrete probability distribution which is used to describe a number of processes,
including the distribution of telephone calls going through a switchboard system, the demand of
patients for service at a health institution, the arrival of trucks and cars at a toll booth, and the
number of accidents at an intersection.

While a binomial random variable counts the number of successes that occur in a fixed number
of trials, a Poisson random variable counts the number of rare events (successes) that occur in a
specified continuous time interval or specified region.

The Poisson distribution has the following characteristics.

1. The probability of an occurrence is the same for any two intervals (of time or space) of equal size.
2. The number of occurrences in one interval is independent of the number of occurrences in
another interval.
3. The probability of two or more occurrences in a subinterval is small enough to be ignored.
4. It must be possible to divide the time interval of interest into many subintervals.
5. The expected number of occurrences in an interval is proportional to the size of the interval.

Examples of Poisson random variable

1. The number of airplanes arriving at an airport in an hour.
2. The number of accidents at a factory in a day.
3. The number of cars crossing a bridge during a five second interval.
4. The number of misprints on a page of newsprint.
5. The number of white blood cells in a blood suspension.
6. The number of typographical errors on a page.
7. The number of bacteria in an ounce of fluid.

The formula for Poisson distribution

P(X) = (µ^X * e^(-µ)) / X! = ((λt)^X * e^(-λt)) / X!

Where: λ = mean number of arrivals per unit of time or space
X = number of arrivals for which the probability is desired
e = the base of the natural logarithm
µ = expected number of occurrences in a specified interval (µ = λt)
t = the proportion of this specified interval for the question of interest
(number of units of time)

Example:

1. Assume that a bank knows from past experience that between 10 and 11 a.m. of each day, the
mean arrival rate is 60 customers per hour. Suppose that the bank wants to determine the
probability that exactly two customers will arrive in a given one-minute interval between
10 and 11 a.m. The arrival rate is assumed to be constant over the hour. Calculate the
probability.

Solution:

λ = 60 customers/hr t = 1 minute x = 2 customers

µ = λ*t = (60 customers/60 minutes) * 1 minute = 1

P(x=2) = (µ^x * e^(-µ)) / x! = (1^2 * e^(-1)) / 2! = 0.1839
The probability of getting 2 customers during the next one minute in a bank is 0.1839. Or there is
18.39% chance that exactly 2 customers will arrive in one minute at a bank.

2. Suppose that bank customers arrive randomly on weekday afternoons at an average rate of 3.2
customers every four minutes. What is the probability of getting 10 customers during an eight
minute interval?

λ = 3.2 customers/4 minutes t = 8 minutes x = 10 customers

µ = λ*t = (3.2 customers/4 minutes) * 8 minutes = 6.4 customers

P(x=10) = (µ^x * e^(-µ)) / x! = (6.4^10 * e^(-6.4)) / 10! = 0.0528

The probability of getting 10 customers during the next eight minutes in a bank is 0.0528. Or
there is 5.28% chance that exactly 10 customers will arrive in eight minutes at a bank.
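
Both bank examples follow the same two steps: scale the rate to the interval of interest, then apply the Poisson formula. A minimal sketch follows; the function name poisson_pmf and its parameter names are our own.

```python
from math import exp, factorial

def poisson_pmf(x, rate, t=1.0):
    """P(exactly x occurrences) when occurrences arrive at `rate` per unit
    of time and the interval of interest is `t` units long."""
    mu = rate * t  # expected number of occurrences in the interval
    return mu**x * exp(-mu) / factorial(x)

# Example 1: 60 customers per 60 minutes, one-minute interval, x = 2.
p_two_in_one_minute = poisson_pmf(2, rate=60 / 60, t=1)

# Example 2: 3.2 customers per 4 minutes, eight-minute interval, x = 10.
p_ten_in_eight_minutes = poisson_pmf(10, rate=3.2 / 4, t=8)

print(round(p_two_in_one_minute, 4), round(p_ten_in_eight_minutes, 4))
```

Expressing the rate and the interval in the same time unit (minutes here) is the step most easily overlooked.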

3. If a real estate office sells 1.6 houses on an average weekday and sales of houses on weekdays are
Poisson distributed, what is the probability of selling:

a) Four houses in a day? 0.0551
b) No house in a day? 0.2019
c) More than five houses in a day? 0.0060
d) Ten or more houses in a day? ≈ 0.0000
e) Four houses in two days? 0.1781
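
The answers to this exercise can be reproduced with a small sketch. The helper names are our own; part (e) assumes that the two days are independent, so that the mean for two days is 2 * 1.6 = 3.2.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson variable with mean mu."""
    return mu**x * exp(-mu) / factorial(x)

def poisson_cdf(a, mu):
    """P(X <= a): sum of the individual probabilities for 0, 1, ..., a."""
    return sum(poisson_pmf(x, mu) for x in range(a + 1))

mu_day = 1.6  # mean number of houses sold per weekday

p_four = poisson_pmf(4, mu_day)                   # a) exactly four in a day
p_none = poisson_pmf(0, mu_day)                   # b) none in a day
p_more_than_5 = 1 - poisson_cdf(5, mu_day)        # c) more than five
p_ten_or_more = 1 - poisson_cdf(9, mu_day)        # d) ten or more
p_four_in_two_days = poisson_pmf(4, 2 * mu_day)   # e) mean doubles for two days
```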

4. A secretary types 75 words per minute and averages six errors per hour of typing. Assuming error
occurrences are a Poisson process, what is the probability that a 225-word letter will be typed
without error? 0.7408

5. A pen company averages 1.2 defective pens per carton produced (200 pens). The number of
defects per carton is Poisson distributed.

a) What is the probability of selecting a carton and finding no defective pen? 0.3012
b) What is the probability of finding eight or more defective pens in a carton? 0.0000
c) Suppose that a purchaser of these pens will quit buying from the company if a carton
contains more than three defectives. What is the probability that the purchaser will quit
buying from this company? 0.0338

6. A certain manufacturer sells a machine that has numerous moving parts. A quality control
inspector counts the number of moving parts that are misaligned as the number of
nonconformities for a particular machine. It is believed that the number of nonconformities per
machine follows a Poisson distribution, with an average of three nonconformities per machine.

a) Determine the probability that the quality control inspector finds no more than one
nonconformity on a particular machine selected at random. 0.1991
b) What is the probability that three or more nonconformities may be obtained by the quality
control inspector on three machines? 0.9938
7. The number of paint blisters produced by an automated painting process at Associated Industries
is Poisson distributed with a rate of 0.06 blisters per square foot. The process is about to be used
to paint an item that measures 9 by 15 feet.

a) What is the probability that the finished surface will have no blister in it? 0.0003
b) What is the probability that the finished surface will have between 5 and 8 blisters, inclusive? 0.4846
c) What is the probability that the finished surface will have more than 2 blisters? 0.9873
8. The defects in an automated weaving process at Sharp Industries are Poisson distributed at a
mean rate of 0.00025 per square foot. The process is to be used to weave a piece of material that
is 5 by 16 yards. (Note that the rate is given per square foot while the dimensions are in yards.)

a) What is the probability that this piece will have no defects?
b) What is the probability that it will have one defect?

Mean and Variance of Poisson distribution


The Poisson distribution has only one parameter, the expected value µ = λt. Additionally, for the
Poisson distribution, the expected value and the variance are equal. That is, for a Poisson
probability distribution, E(X) = µ = λt = δ².

Poisson Approximation to Binomial Probability Distribution


The Poisson probability distribution can be used as an approximation to the binomial probability
distribution when P, the probability of success is small and n, the number of trials/sample size, is
large. Simply set µ = np and use the Poisson tables. As a rule of thumb, the approximation will
be good whenever P≤0.05 and n≥20. However, this approximation is reasonably accurate if n>20
and np≤5.

Binomial tables are often not available for large values of n, so in these cases the approximation
can be useful. So in cases where P≤0.05 and n≥20, substitute the mean of the binomial
distribution (µ = np) in place of the mean of the Poisson distribution (µ = λt), so that the formula
becomes

P(X) = ((np)^X * e^(-np)) / X!

In general, the larger n is and the smaller p is, the better the approximation will be.

Why approximation?

- The Poisson formula is easier to use than the binomial formula.
- It can be tabulated more efficiently than binomial probabilities because the Poisson distribution
has only one parameter, µ (= λt), whereas the binomial distribution has two parameters, n and p.
Example:

n = 500 p= 0.02 µ = np = 500*0.02 = 10

n = 1000 p= 0.01 µ = np = 1000*0.01 = 10

If we want to calculate P(X) for both cases, a single Poisson column (µ = 10) serves both. With
binomial tables, the two cases would require two separate columns.

1. A company sells insurance policies to a random sample of 1000 men who are 35 years of age.
The probability that a 35-year-old man dies within a year is approximately 0.002. What is the
probability that the insurance company will have to pay claims on 2 or more policies next year?

Solution:

Steps: 1. Make sure P≤0.05 and n≥20

P = 0.002 n = 1000…………. Both requirements satisfied

2. Calculate µ = np = 1000*0.002 = 2

3. Calculate P(X)

P (X≥2) = 1 – [P(X=0) + P(X=1)]

= 1 – [(2^0 * e^(-2))/0! + (2^1 * e^(-2))/1!]

= 1 – (0.1353 + 0.2707)

= 0.5940

The exact binomial probability is 0.5942.
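
How close the approximation is can be seen by computing both probabilities side by side. This is a sketch under the same assumptions as the example (independent policyholders, each with p = 0.002); the helper names are our own.

```python
from math import comb, exp, factorial

n, p = 1000, 0.002
mu = n * p  # mean used for the Poisson approximation

def poisson_pmf(x, mu):
    """Poisson probability of exactly x occurrences."""
    return mu**x * exp(-mu) / factorial(x)

def binomial_pmf(x, n, p):
    """Exact binomial probability of exactly x successes."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# P(X >= 2) = 1 - [P(0) + P(1)] under each model
approx = 1 - (poisson_pmf(0, mu) + poisson_pmf(1, mu))
exact = 1 - (binomial_pmf(0, n, p) + binomial_pmf(1, n, p))

print(f"Poisson approximation: {approx:.4f}")
print(f"Exact binomial:        {exact:.4f}")
print(f"Absolute difference:   {abs(exact - approx):.4f}")
```

The two results differ only in the fourth decimal place, which is why the approximation is acceptable for large n and small p.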

2. Suppose that the probability of a bank making a mistake processing a deposit is 0.0003. If 10,000
deposits are audited, what is the probability that exactly six mistakes were made in processing
deposits?

Solution:

Steps: 1. Make sure P≤0.05 and n≥20

P = 0.0003 n = 10,000…………. Both requirements satisfied

2. Calculate µ = np = 10,000*0.0003 = 3
3. Calculate P(X)

P (X=6) = (3^6 * e^(-3)) / 6!

= 0.0504

Continuous Probability Distribution

Up to this point, we have focused our attention on discrete random variables, which have either a
finite number of possible values (e.g. 0, 1, 2, 3, …, n) or a countably infinite number of values
(e.g. 0, 1, 2, 3, …). We can list all of the possible values of a discrete random variable, and it
is meaningful to consider the probability that a particular individual value will be assumed. In
contrast, a continuous random variable has an uncountably infinite number of possible values and
can assume any value in the interval between two points a and b (a<x<b). As a result, the only
meaningful probability to compute is the probability that the variable will fall within a
specified interval; the probability that a continuous random variable X will assume any particular
value is zero.

A continuous probability distribution is any representation of the values of a continuous random
variable and the associated probabilities. Continuous probability distributions include the normal
distribution and the exponential distribution.

The Normal Distribution

The normal distribution is a continuous distribution that has a bell shape and is determined by its
mean and standard deviation. It occupies a place of central importance in continuous probability
distribution in particular and statistics in general. It is the most important theoretical distribution
because of the following three reasons:

1. The normal distribution approximates the observed frequency distributions of many natural
and physical measurements, such as IQs, weights, heights, sales, product lifetimes, etc.
2. The normal distribution can often be used to estimate binomial probabilities when n (sample
size) is greater than 20.
3. The normal distribution is a good approximation of distributions of both sample means and
sample proportions of large samples (n > 30).

Characteristics of normal distribution

i. It is a continuous distribution.
ii. It has a bell shape and is symmetrical about its mean.
iii. It is asymptotic to the X- axis.
iv. It extends infinitely in either direction from the mean.

v. It is defined by two parameters: µ and δ. Each combination of these two parameters
specifies a unique normal distribution. The value of µ indicates where the center of the
bell lies, while δ represents how spread out (or wide) the distribution is.
vi. It is measured on a continuous scale and the probability of obtaining a precise value is
zero.
vii. The total area under the curve is equal to 1.0 or 100%; 50% of the area is above the mean
and 50% is below the mean.
viii. The probability that a random variable will have a value between any two points is
equal the area under the curve between those two points.

Each combination of µ and δ specifies a unique normal distribution, so there is an infinite family
of normal distributions. The problem of dealing with an infinite family of distributions is solved
by transforming all normal distributions to the standard normal distribution, which is the normal
distribution with a mean equal to 0 and a standard deviation equal to 1. The standard normal
variable is denoted by Z.

Any normal distribution can be converted to the standard normal distribution by standardizing
each of its observations in terms of Z- values. The Z- value measures the distance in standard
deviations between the mean of the normal curve and the X- value of interest. Any random
variable can be transformed to a standard random variable by subtracting the mean and dividing
by the standard deviation.

If a random variable X has mean µ and standard deviation δ, the standardized variable Z is
defined as:

Z = (X - µ) / δ

Where: Z = number of standard deviations from the mean
X = value of interest
µ = mean of the distribution
δ = standard deviation of the distribution

A Z-score is the number of standard deviations that a value, X, is away from the mean. If the
value of X is less than the mean, the Z-score is negative; if the value of X is greater than the
mean, the Z-score is positive. The Z-score is also known as the z-value. It is a standardized score
for which the mean is zero and the standard deviation is 1, and it is used to represent the
standard normal distribution.

The probability calculations in normal distribution are made by computing areas under the graph.
Thus, to find the probability that a random variable lies within any specific interval we must
compute the area under the normal curve over that interval.

Probabilities for some commonly used intervals are:

a) 68.26% of the time, a normal random variable assumes a value within ±1δ of its mean.
b) 95.44% of the time, a normal random variable assumes a value within ±2δ of its mean.
c) 99.74% of the time, a normal random variable assumes a value within ±3δ of its mean.
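
These three areas can be reproduced with Python's standard library. The sketch below uses statistics.NormalDist (available from Python 3.8). Its results may differ from the percentages above in the last digit, because the printed figures come from a table rounded to four decimals.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # standard normal distribution

# Area within k standard deviations of the mean: P(-k <= Z <= +k)
areas = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(f"within ±{k} standard deviations: {area:.4f}")
```

Because every normal distribution can be standardized, the same three areas apply whatever the values of the mean and standard deviation.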
Example:

1. The Graduate Management Admission Test (GMAT) is widely used by graduate schools of
business as an entrance requirement. In one particular year, the mean score for the GMAT
was 485, with a standard deviation of 105. Assuming that GMAT scores are normally
distributed, what is the probability that a randomly selected score from this administration of
the GMAT:
a) Falls between 600 and the mean, inclusive?
b) Is greater than 650?
c) Is less than 300?
d) Falls between 350 and 550, inclusive?
e) Is less than 700?
f) Is exactly 500?
g) If 500 applicants take the test, how many would you expect to score 590 or below?
Solution:

Steps to find the probability that a random variable lies within an interval:

1. Calculate the appropriate Z values
2. Find the areas (probabilities) in the table
3. Interpret your results

µ = 485 δ = 105

a) P (485≤X≤600) =?

1. First convert the X values into Z-scores using the formula Z = (X - µ)/δ

Z485 = (485 - 485)/105 = 0

Z600 = (600 - 485)/105 = +1.10

2. P(485≤X≤600) = P(0≤Z≤+1.10)

= P (0 to +1.10)

= 0.36433

b) P (X>650) =?

1. First convert the X value into a Z-score using the formula Z = (X - µ)/δ

Z650 = (650 - 485)/105 = +1.57

2. P(X>650) = P(Z>+1.57)

= 0.5 - P (0 to +1.57)

= 0.5 - 0.44179

= 0.05821

c) P (X<300) =?

1. First convert the X value into a Z-score using the formula Z = (X - µ)/δ

Z300 = (300 - 485)/105 = -1.76

2. P(X<300) = P(Z<-1.76)

= 0.5 - P (0 to -1.76)

= 0.5 - 0.46080

= 0.03920

d) P (350≤X≤550) =?

1. First convert the X values into Z-scores using the formula Z = (X - µ)/δ

Z350 = (350 - 485)/105 = -1.29

Z550 = (550 - 485)/105 = +0.62

2. P(350≤X≤550) = P (-1.29≤Z≤+0.62)

= P (0 to -1.29) + P (0 to +0.62)

= 0.40147 + 0.23237

= 0.63384

e) P (X<700) =?

1. First convert the X value into a Z-score using the formula Z = (X - µ)/δ

Z700 = (700 - 485)/105 = +2.05

2. P(X<700) = P(Z<+2.05)

= P (X<485) + P (485≤X<700)

= 0.5 + P (0 to +2.05)

= 0.5 + 0.47982

= 0.97982

f) P(X=500) = 0. The probability of an exact (single) value of a continuous random variable is
zero. Consequently, the probability of an interval is the same whether the end points are
included or not.
g) To find the expected number of applicants who score 590 or below, we first find P (X≤590)
and then multiply it by the number of applicants.

P (X≤590) =?

1. First convert the X value into a Z-score using the formula Z = (X - µ)/δ

Z590 = (590 - 485)/105 = +1.00

2. P(X≤590) = P(Z≤+1.00)

= P (X<485) + P (485≤X≤590)

= 0.5 + P (0 to +1.00)

= 0.5 + 0.34134

= 0.84134

If 500 applicants take the test, the number of applicants expected to score 590 or below is
500(0.84134) = 420.67, or about 421 applicants.
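
The table look-ups in parts (a) to (g) can be cross-checked with statistics.NormalDist from the Python standard library. This is only a sketch: it does not round Z to two decimals before finding the area, so its results can differ from the table-based answers in the third decimal place.

```python
from statistics import NormalDist

gmat = NormalDist(mu=485, sigma=105)  # GMAT scores for the year in question

p_a = gmat.cdf(600) - gmat.cdf(485)   # a) between the mean and 600
p_b = 1 - gmat.cdf(650)               # b) greater than 650
p_c = gmat.cdf(300)                   # c) less than 300
p_d = gmat.cdf(550) - gmat.cdf(350)   # d) between 350 and 550
p_e = gmat.cdf(700)                   # e) less than 700
expected_g = 500 * gmat.cdf(590)      # g) expected number scoring 590 or below

for label, value in [("a", p_a), ("b", p_b), ("c", p_c), ("d", p_d), ("e", p_e)]:
    print(f"{label}) {value:.4f}")
print(f"g) {expected_g:.1f} applicants")
```

Part (f) needs no computation: for a continuous variable the probability of any single exact value is zero.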

2. The result of an exam score for a given class is normally distributed. If the mean score is 85
points and the standard deviation is equal to 20 points, find the cutoff passing grade such that
83.4% of those taking the test will pass.

Solution:

µ = 85 probability of passing = 83.4%

δ = 20 cutoff point =?

Since 83.4% is greater than 50%, the cutoff point should be less than the mean, and hence the Z-
value is negative. The area between the cutoff point and the mean is 0.834 - 0.5 = 0.334. This
calls for the inverse use of the standard normal table.

(Z/P=0.334) = -0.97

-0.97 = (X - 85)/20

-19.4 = X - 85

X = 65.6 points, the minimum score needed to pass the test.
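
The inverse table look-up corresponds to the inverse cumulative distribution function. A minimal sketch with statistics.NormalDist follows; the variable names are our own.

```python
from statistics import NormalDist

scores = NormalDist(mu=85, sigma=20)

# 83.4% of candidates must score ABOVE the cutoff,
# so the cutoff has 1 - 0.834 = 0.166 of the area BELOW it.
cutoff = scores.inv_cdf(1 - 0.834)

print(round(cutoff, 1))
```

The key step is converting the "proportion passing" (an upper-tail area) into a lower-tail area before calling inv_cdf, which always works with the area to the left.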

3. Data accumulated by the National Climatic Data Center shows that the average wind speed in
miles per hour for Addis is 9.7mph. Suppose that wind speed measurements are normally
distributed for a given geographical location. If 22.45% of the time the wind speed
measurements are more than 11.6mph, what is the standard deviation of wind speed in Addis?

Solution:

µ = 9.7mph δ =? X > 11.6

P(X> 11.6) = 22.45%

The area between the mean and 11.6 is 0.5 - 0.2245 = 0.2755.

(Z/P = 0.2755) = +0.76

+0.76 = (11.6 - 9.7)/δ

0.76δ = 1.9

δ = 2.5mph

4. A cylinder making machine has δ = 0.5mm and µ = 25mm. Within what interval of values
centered at the mean will the diameters of 80% of the cylinders lie?

Solution:

µ = 25mm δ =0.5mm

From the statement it is clear that the interval is centered at the mean; i.e., half of the 80%
(40%) lies below the mean and the other half (40%) lies above the mean.

(Z/P=0.4) = ± 1.28

X1 = µ - Z δ X2 = µ + Z δ

-1.28 = (X1 - 25)/0.5 +1.28 = (X2 - 25)/0.5

-0.64 = X1 - 25 +0.64 = X2 - 25

X1 = 24.36mm X2 = 25.64mm

80% of the cylinder diameters lie between 24.36mm and 25.64mm.

5. The lives of light bulbs follow a normal distribution. If 90% of the bulbs have lives exceeding
2000 hrs and 3% have lives exceeding 6000 hrs. What are the mean and standard deviation of the
lives of light bulbs?

Solution:

P(X>2000) = 0.90 P(X>6000) = 0.03

µ =? δ =?

(Z/P=0.4) = -1.28 (Z/P=0.47) = +1.88

-1.28 = (2000 - µ)/δ +1.88 = (6000 - µ)/δ

-1.28δ = 2000 - µ +1.88δ = 6000 - µ

µ = 2000 + 1.28δ µ = 6000 - 1.88δ

Equating the two expressions for µ,

2000 + 1.28δ = 6000 - 1.88δ

3.16δ = 4000

δ = 1265.82 hrs

µ = 2000 + 1.28δ

= 2000 + 1.28(1265.82)

= 3620.25 hrs
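
The same pair of simultaneous equations can be solved in code. This sketch uses statistics.NormalDist to obtain the two Z values without rounding them to two decimals, so its results differ slightly from the hand calculation above.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# 90% of lives exceed 2000 hrs -> 10% of the area lies below 2000.
z_2000 = std_normal.inv_cdf(1 - 0.90)
# 3% of lives exceed 6000 hrs -> 97% of the area lies below 6000.
z_6000 = std_normal.inv_cdf(1 - 0.03)

# From 2000 = mu + z_2000 * sigma and 6000 = mu + z_6000 * sigma:
sigma = (6000 - 2000) / (z_6000 - z_2000)
mu = 2000 - z_2000 * sigma

print(round(z_2000, 2), round(z_6000, 2))
print(round(sigma, 1), round(mu, 1))
```

Subtracting one equation from the other eliminates the mean, which is the same step as the hand solution.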

6. On a civil service exam, the grades are normally distributed with µ = 70 points and δ = 10 points.
The police department hires the applicants whose grades are among the top 10% of the
population. What is the minimum grade required to be hired?

Solution:

µ = 70 points δ = 10 points

The top 10% lies in the upper tail, so the area between the mean and the cutoff is 0.5 - 0.1 = 0.4.

(Z/P=0.4) = +1.28

+1.28 = (X - µ)/δ

+1.28 = (X - 70)/10

12.8 = X - 70

X = 82.8, the minimum grade required to be hired.

7. A bakery shop sells loaves of freshly made bread. Any unsold loaves at the end of the day are
either discarded or sold elsewhere at a loss. The demand for this bread has followed a normal
distribution with µ = 35 loaves and δ = 8 loaves. How many loaves should the bakery make each
day so that they can meet the demand 90% of the time?

Solution:

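µ = 35 loaves δ = 8 loaves

To meet demand 90% of the time, 90% of the area must lie below the stocking level, so the area
between the mean and that level is 0.9 - 0.5 = 0.4.

(Z/P=0.4) = +1.28

+1.28 = (X - µ)/δ

+1.28 = (X - 35)/8

10.24 = X - 35

X = 45.24 ≈ 46. By making 46 loaves of bread each day, the bakery will meet the demand for
this product at least 90% of the time.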
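
The stocking decision can be sketched in code as well. The names below are our own. The result is rounded up rather than to the nearest whole number, because loaves are whole units and rounding down would leave the service level below 90%.

```python
from math import ceil
from statistics import NormalDist

demand = NormalDist(mu=35, sigma=8)  # daily demand for loaves

# Stocking level with 90% of the demand distribution below it
level = demand.inv_cdf(0.90)
# Loaves are whole units, so round up to keep at least 90% coverage
loaves = ceil(level)

print(round(level, 2), loaves)
```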
