Chapter One
In the plural sense: statistics are defined as collections of numerical facts or figures (that is, the raw data themselves).
E.g. 1. Vital statistics (numerical data on marriages, births, deaths, etc.).
2. The statement "the average mark of students in the statistics course is 70%" is considered statistics, whereas "Abebe scored 90% in the statistics course" is not.
Remark: statistics are aggregates of facts. Single, isolated figures are not statistics because they cannot be compared with, or related to, other figures.
In the singular sense: statistics is the subject that deals with the methods of collecting, organizing, presenting, analyzing and interpreting statistical data.
Classification of Statistics
Statistics is broadly divided into two categories based on how the collected data are used.
Descriptive Statistics: deals with describing the collected data without drawing any conclusions beyond the data themselves.
Example 1.1: Suppose that the marks of 6 Mathematics students in the Statistics course are 40, 45, 50, 60, 70 and 80. The average mark of these 6 students is 57.5, and this value is considered descriptive statistics.
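The calculation in Example 1.1 can be checked with a few lines of Python. This is only an illustrative sketch; the variable names are our own and are not part of the example.

```python
from statistics import mean

# Marks of the 6 students from Example 1.1
marks = [40, 45, 50, 60, 70, 80]

# A descriptive statistic summarizes the observed data only
average_mark = mean(marks)
print(average_mark)  # 57.5
```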
Inferential Statistics: deals with making inferences and/or drawing conclusions about a population based on data obtained from a sample of observations. It includes hypothesis testing, determining relationships among variables and making predictions.
Example 1.2: In the above example, if we conclude that the average mark in the Statistics course for all Mathematics students is 57.5, then we are using inferential statistics (drawing a conclusion about the population based on the sample observations).
1.2 Stages of Statistical Investigation
A statistical investigation passes through five stages: collection, organization, presentation, analysis and interpretation of data.
Collection of data: This is the process of obtaining measurements, counts or other raw data.
Data can be collected in a variety of ways; one of the most common is a sample survey or a census. A survey can itself be conducted in different ways, the three most common being:
Telephone survey
Mailed questionnaire
Personal interview.
Organization of data: Data collected from published sources are generally already in organized form. However, if an investigator has collected data through a survey, it is necessary to edit the data in order to correct any apparent inconsistencies, ambiguities and recording errors.
This stage also includes grouping the data into classes and tabulating them.
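Grouping data into classes and tabulating them can be sketched as follows. The marks from Example 1.1 are reused as raw data; the class width of 10 and the helper name class_label are our own assumptions, chosen only for illustration.

```python
from collections import Counter

# Marks from Example 1.1, reused here as illustrative raw data
marks = [40, 45, 50, 60, 70, 80]

CLASS_WIDTH = 10  # assumed class width, not taken from the text


def class_label(value, width=CLASS_WIDTH):
    """Return the class interval (for example '40-49') that contains value."""
    lower = (value // width) * width
    return f"{lower}-{lower + width - 1}"


# Tabulate: count how many observations fall into each class
frequency = Counter(class_label(m) for m in marks)

for interval in sorted(frequency):
    print(interval, frequency[interval])
```

For these marks the class 40-49 contains two observations and each of the other classes contains one.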
Presentation of data: After the data have been collected and organized, they can be presented in the form of tables, charts, diagrams and graphs. Presenting the data in an orderly manner facilitates both understanding and analysis.
Analysis of data: The basic purpose of data analysis is to extract useful information for decision making. The analysis may simply be a critical observation of the data to draw some meaningful conclusions, or it may involve highly complex and sophisticated mathematical techniques.
Interpretation of data: Interpretation means drawing conclusions from the data collected and analyzed. Correct interpretation leads to valid conclusions from the study and thus can aid decision making.
1.3 Definition of some statistical terms
Population: the totality of objects under study. The population is the target of an investigation, and the objective of the investigation is to draw conclusions about it; hence it is sometimes called the target population.
Examples:
All clients of a telephone company
All students of Debre Tabor University (DTU)
Population of families, etc.
A population can be finite or infinite (an infinite population is an imaginary collection of units).
Sample: a part or subset of the population under study.
Sampling frame: the list of all units of the population from which the sample can be drawn.
Example: the list of all students of DTU, the list of all residential houses in Debre Tabor town, etc.
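The relationship between a sampling frame and a sample can be sketched in Python. The frame below is hypothetical (1,000 made-up student ID numbers), and simple random sampling without replacement is assumed as the selection method; the text itself does not prescribe a particular method.

```python
import random

# Hypothetical sampling frame: ID numbers of 1,000 students
sampling_frame = [f"DTU{n:04d}" for n in range(1, 1001)]

random.seed(1)  # fixed seed so that the sketch is reproducible

# Draw a sample of 50 units without replacement (simple random sampling assumed)
sample = random.sample(sampling_frame, k=50)

print(len(sample))
# Every sampled unit comes from the frame, so the sample is a subset of it
print(set(sample) <= set(sampling_frame))
```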
1.4 Applications, uses and limitations of Statistics
Applications
Statistics can be applied in any field of study which seeks quantitative evidence, for instance engineering, economics and natural science.
a) Engineering: Statistics has wide application in engineering.
To compare the breaking strength of two types of materials
To determine the reliability of a product.
To control the quality of products in a given production process.
To compare the improvement in yield due to certain additives such as fertilizer, herbicides, etc.
b) Economics: Statistics is widely used in economic study and research.
To measure and forecast Gross National Product (GNP)
Statistical analyses of population growth, inflation rate, poverty, unemployment figures,
rural or urban population shifts and so on influence much of the economic policy making.
Financial statistics are necessary in the fields of money and banking including consumer
savings and credit availability.
c) Statistics and research: hardly any advanced research is carried out without the use of statistics in one form or another. Statistics is used extensively in medical, pharmaceutical and agricultural research.
Functions/Uses of Statistics
Today the field of statistics is recognized as a highly useful tool in the decision-making process of managers in modern business, industry and frequently changing technology. It has many functions in everyday activities. The following are some uses of statistics:
• It condenses and summarizes a mass of data: the original set of data (raw data) is normally voluminous and disorganized until it is summarized and expressed in a few presentable, understandable and precise figures.
• Statistics facilitates comparison of data: measures obtained from different sets of data can be compared to draw conclusions about those sets. Statistical values such as averages, percentages, ratios, rates, coefficients, etc., are the tools that can be used for comparing sets of data.
• Statistics helps to predict future trends: statistics is very useful for analyzing past and present data and forecasting future events.
• Statistics helps to formulate and review policies: statistics provides the basic material for framing suitable policies. The results of statistical studies in areas such as taxation, unemployment, inflation and the performance of military equipment may convince a government to review its policies and plans in order to meet national needs and aspirations.
• Formulating and testing hypotheses: statistical methods are extremely useful in formulating and testing hypotheses and in developing new theories.
Limitations of Statistics
The field of statistics, though widely used in all areas of human knowledge and widely applied in
a variety of disciplines such as engineering, economics and research, has its own limitations.
Some of these limitations are:
a) It does not deal with individual values: as discussed earlier, statistics deals with aggregates of facts. For example, the wage earned by an individual worker at any one time, taken by itself, does not constitute statistics.
b) It does not deal with qualitative characteristics directly: statistics is not directly applicable to qualitative characteristics such as beauty, honesty, poverty, standard of living and so on, since these cannot be expressed in quantitative terms. These characteristics can, however, be dealt with statistically if quantitative values can be assigned to them using a logical criterion. For example, intelligence may be compared to some degree by comparing IQs or other scores on certain intelligence tests.
c) Statistical conclusions are not universally true: since statistics, unlike the natural sciences, is not an exact science, statistical conclusions are true only under certain assumptions.
d) It can be misused: statistics cannot be used to full advantage in the absence of proper
understanding of the subject matter.
1.5 Levels of Measurement
Proper knowledge about the nature and type of data to be dealt with is essential in order to
specify and apply the proper statistical method for their analysis and inferences.
Scale Types
Measurement is the assignment of values to objects or events in a systematic fashion. Four levels of measurement scales are commonly distinguished: nominal, ordinal, interval and ratio, each possessing different properties. The first two are qualitative while the last two are quantitative.
Nominal scale: The values of a nominal attribute are just different names, i.e., nominal attributes provide only enough information to distinguish one object from another. They are qualities with no ranking or ordering and no numerical or quantitative value. Data of this type consist of names, labels and categories. This is a scale for grouping individuals into different categories.
Example 1.3: Eye color: brown, black, etc.; sex: male, female.
In this scale, one value is simply different from another (= and ≠ are meaningful).
Arithmetic operations (+, -, *, ÷) are not applicable, and order comparisons (<, >) are impossible.
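A short sketch of what can be done with nominal data: the categories can be counted and checked for being the same or different, but nothing more. The eye-color observations below are hypothetical.

```python
from collections import Counter

# Hypothetical eye-color observations (nominal data)
eye_colors = ["brown", "black", "brown", "blue", "brown", "black"]

# Counting how often each category occurs is a valid operation
counts = Counter(eye_colors)
most_common_color, most_common_count = counts.most_common(1)[0]

print(counts["brown"])
print(most_common_color)

# Categories can only be checked for being the same or different
print(eye_colors[0] == eye_colors[2])
print(eye_colors[0] != eye_colors[1])
```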
Ordinal scale: nominal data that can also be ordered or ranked.
The values can be arranged in some order, but the differences between data values are meaningless.
Data consisting of an ordering or ranking of measurements are said to be on an ordinal scale of measurement. That is, the values of an ordinal scale provide enough information to order objects.
One value is different from, and greater/better/less than, another.
Arithmetic operations (+, -, *, ÷) are impossible; comparison (<, >, ≠, etc.) is possible.
Example 1.4: Letter grading (A, B, C, D, F), rating scales (excellent, very good, good, fair, poor), military status (general, colonel, lieutenant, etc.).
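Ordering on an ordinal scale can be sketched by mapping each label of the rating scale in Example 1.4 to its position in the ranking. The names RATING_ORDER, rank and is_better are our own; the rank numbers encode order only, so their differences should not be interpreted.

```python
# Rating scale from Example 1.4, listed from lowest to highest
RATING_ORDER = ["poor", "fair", "good", "very good", "excellent"]

# Map each label to its position (0 = lowest); the numbers encode order only
rank = {label: position for position, label in enumerate(RATING_ORDER)}


def is_better(a, b):
    """Return True if rating a is ranked higher than rating b."""
    return rank[a] > rank[b]


print(is_better("excellent", "good"))  # True

# Sorting by rank is meaningful; subtracting ranks is not
ratings = ["good", "poor", "excellent", "fair"]
print(sorted(ratings, key=rank.get))  # ['poor', 'fair', 'good', 'excellent']
```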
Interval scale: ordinal data for which the differences between data values are also meaningful. However, there is no true zero, or starting point, and the ratios of data values are meaningless. Note: Celsius and Fahrenheit temperature readings have no meaningful zero, so their ratios are meaningless.
In this measurement scale:
One value is different from, and greater/better than, another by a certain amount.
It is possible to add and subtract. For example: 80°C - 50°C = 30°C and 70°C - 40°C = 30°C.
Multiplication and division are not meaningful. For example, 60°C = 3(20°C), but this does not imply that an object at 60°C is three times as hot as an object at 20°C.
The most common examples are IQ and temperature.
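One way to see why ratios are meaningless on an interval scale is to express the same two temperatures in another unit. In the sketch below (function and variable names are our own), 60 °C and 20 °C are converted to Fahrenheit: the ratio changes with the unit, while equal differences remain equal.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


hot_c, cool_c = 60.0, 20.0
hot_f = celsius_to_fahrenheit(hot_c)    # 140.0
cool_f = celsius_to_fahrenheit(cool_c)  # 68.0

# The ratio depends on the unit, so "three times as hot" has no meaning
ratio_c = hot_c / cool_c  # 3.0
ratio_f = hot_f / cool_f  # about 2.06
print(ratio_c, ratio_f)

# Differences that are equal in Celsius are also equal in Fahrenheit
diff1_f = celsius_to_fahrenheit(80) - celsius_to_fahrenheit(50)
diff2_f = celsius_to_fahrenheit(70) - celsius_to_fahrenheit(40)
print(diff1_f, diff2_f)  # 54.0 54.0
```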
Ratio scale: similar to the interval scale, except that there is a true zero (absolute absence), or starting point, and the ratios of data values are meaningful.
Arithmetic operations (+, -, *, ÷) are applicable. For ratio variables, both differences and ratios are meaningful.
One value is different from/larger/taller/better/less than another by a certain amount and by a certain multiple.
This measurement scale provides more information than the interval scale of measurement.
Example 1.5: weight, age, number of students.
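On a ratio scale, the ratio of two values does not depend on the unit of measurement. The sketch below uses two hypothetical weights and an approximate kilogram-to-pound conversion factor; both are our own illustrative choices.

```python
import math

KG_TO_LB = 2.20462  # approximate conversion factor

# Hypothetical weights of two objects, in kilograms
weight_a_kg, weight_b_kg = 60.0, 20.0

ratio_kg = weight_a_kg / weight_b_kg
ratio_lb = (weight_a_kg * KG_TO_LB) / (weight_b_kg * KG_TO_LB)

# Because zero means "no weight" in both units, the ratio is preserved
print(ratio_kg)
print(math.isclose(ratio_kg, ratio_lb))
```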