Unit 1 Introduction To Statistics
INTRODUCTION TO STATISTICS
Written By:
Aftab Ahmad
Reviewed By:
Dr. Rizwan Akram Rana
Introduction
Statistics is a broad subject with applications in a vast variety of fields. The word
“statistics” is derived from the Latin word “Status”, which means a political state.
Statistics is a branch of knowledge that deals with facts and figures. The term statistics
refers to a set of methods and rules for organizing, summarizing, and interpreting
information. It is a way of getting information from data.
[Figure: Data → Statistics → Information]
We can say that statistics is the science of collecting, organizing, interpreting and
reporting data. It is a group of methods used for collecting, displaying, analyzing, and
drawing conclusions from data.
In other words, statistics is a methodology that a researcher uses for collecting and
interpreting data and drawing conclusions from the collected data (Anderson & Sclove,
1974; Agresti & Finlay, 1997).
Objectives of Unit
After reading this unit the students will be able to:
1. demonstrate basic understanding of statistics.
2. know the characteristics of statistics.
3. explain the functions of statistics.
4. list the characteristics of statistics.
5. tell the importance and limitations of statistics.
6. briefly explain the application of statistics in educational research.
7. distinguish between descriptive and inferential statistics.
8. describe variables and their types.
9. distinguish between the levels of measurement.
10. identify various statistical notations.
1.1 Functions of Statistics
The functions of statistics are summarized under the following headings.
i) To present facts in a definite form
Every day we encounter millions of pieces of information that are often vague,
indefinite and unclear. When such information is processed with statistical
techniques and presented in the form of tables or figures, it puts things in a
perspective that is easy to comprehend. For example, the statement "some
students out of 1000 who appeared for the B.Ed. examination were declared
successful" conveys little information. But when we say that 900 students out of
1000 who appeared for the B.Ed. examination were declared successful, and
statistical techniques lead us to conclude that “90% of B.Ed. students were
successful”, the sentence becomes clearer and more meaningful.
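The arithmetic behind the B.Ed. example can be sketched in a few lines of Python. This is only an illustration; the function name `pass_percentage` is a hypothetical helper, not something defined in this unit.

```python
def pass_percentage(passed: int, appeared: int) -> float:
    """Return the share of successful candidates as a percentage."""
    if appeared <= 0:
        raise ValueError("appeared must be a positive number")
    return 100 * passed / appeared


# Figures taken from the B.Ed. example in the text.
result = pass_percentage(900, 1000)
print(f"{result:.0f}% of B.Ed. students were successful")
```

Turning the raw count into a percentage is what makes results from groups of different sizes comparable.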
department for statistical intelligence or statistical bureau, the work of which is to
collect, compare and coordinate figures for formulating future policies of the firm
regarding production and sales.
v) Statistics are collected in a Systematic Manner
To achieve a reasonable standard of accuracy, statistics/data must be collected in
a very systematic manner. A rough and haphazard method of collection is not
desirable, because it may lead to improper and wrong conclusions.
v) Print and electronic media use statistical tools to predict the winners of
elections and the incoming government.
vi) Statistics is widely used in psychology and education to determine the
reliability and validity of a test, for factor analysis, etc.
vii) Apart from the above, statistics has wide application in marketing, production,
finance, banking, investment, purchase, accounting and management control.
work at the analysis stage of the research process when data have been collected. It does
not mean that social scientists can plan and carry out entire research projects without any
knowledge of statistics. Planning and carrying out a research project and trying to analyze
data without statistical techniques will lead the researcher away from the objectives of the
study. Statistics enters the process right from the beginning of the research, when the whole
plan for the research (selection of design, population, sample, analysis tools and
techniques, etc.) is prepared.
Summarizing and organizing data is not a researcher's only purpose. He often
wishes to make inferences about a population based on data he has obtained from a
sample. For this purpose, he uses inferential statistics. Inferential statistics are techniques
that allow a researcher to study samples and then make generalizations about the
populations from which they are selected.
The population of a research study is typically too large for a researcher to
observe each individual. Therefore a sample is selected. By analyzing the results obtained
from a sample, a researcher hopes to draw general conclusions about the population. One
problem with using a sample is that it provides only limited information about the
population. To address this problem, the sample should be representative
of the population. That is, the general characteristics of the sample should be consistent
with the characteristics of the population.
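The idea of generalizing from a sample to a population can be sketched with Python's standard library. The population below is simulated, made-up data (exam scores with an assumed mean of 60 and standard deviation of 12), so the numbers are for illustration only.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: exam scores of 10,000 students (simulated data).
population = [random.gauss(60, 12) for _ in range(10_000)]

# Simple random sample of 200 students, drawn without replacement.
sample = random.sample(population, k=200)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
print(f"Difference:      {sample_mean - population_mean:.2f}")
```

Because the sample is drawn at random, its mean tends to fall close to the population mean, although the two are rarely identical. That gap is the limited information the paragraph above refers to.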
1.7 Variable
A variable is something that is likely to vary or something that is subject to variation. We
can also say that a variable is a quantity that can assume any of a set of values. In other
words, we can say that a variable is a characteristic that varies from one person or thing
to another. It is a characteristic, number or quantity that increases or decreases over time
or takes different value in different situations; or in more precise words, it is a condition
or quality that can differ from one case to another. We often measure or count it.
A variable may also be called a data item. Examples of variables for humans are height,
weight, age, number of siblings, business income and expenses, country of birth, capital
expenditure, marital status, eye color, gender, class grades, and vehicle type, etc.
On the other hand, variables such as time, height, and weight are not limited to a fixed set
of separate, indivisible categories. They are divisible in an infinite number of fractional
parts. Such variables are called continuous variables. For example, a researcher is
measuring the amount of time required to solve a particular mental arithmetic problem.
He can measure time in hours, minutes, seconds, or fractions of seconds.
[Figure: Types of variables]
Variable
    Categorical: Nominal, Ordinal
    Numeric: Discrete, Continuous
1.8 Level of Measurement
There are two basic types of variables: quantitative and categorical. Each calls for a
different type of analysis and measurement, and therefore a different type of measurement
scale. The scale of a variable gives a certain structure to the variable and also defines the
meaning of the variable. There are four types of measurement scales: nominal, ordinal,
interval, and ratio.
Nominal Scale
A nominal scale is the simplest form of measurement researchers can use. The word
nominal means “having to do with names.” Measurements made on this scale involve
merely naming things. It consists of a set of categories that have different names. Such
measurements label and categorize observations but do not make quantitative distinctions
between them. For example, if we wish to know the sex of a person responding to the
questionnaire, we would measure it on nominal scale consisting of two categories (male
or female). A researcher observing the behavior of a group of infant monkeys might
categorize responses as playing, grooming, feeding, acting aggressively or showing
submissiveness. Because the researcher merely gives a name to each category, this is a
nominal scale of measurement. The nominal scale consists of qualitative distinctions
only: it does not provide any information about quantitative differences between
individuals. Numerical values like 0 and 1 are used merely as codes for nominal
categories when entering data into computer programs.
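A short Python sketch can show why numeric codes for nominal categories are only labels. The responses and the coding scheme below are hypothetical and chosen purely for illustration.

```python
from collections import Counter

# Hypothetical questionnaire responses and coding scheme (illustrative only).
responses = ["male", "female", "female", "male", "female"]
codes = {"male": 0, "female": 1}

coded = [codes[r] for r in responses]
print(coded)  # [0, 1, 1, 0, 1]

# Counting how often each category occurs is meaningful for nominal data.
counts = Counter(responses)
print(counts["female"], counts["male"])  # 3 2

# Arithmetic on the codes is not meaningful: swapping the arbitrary labels
# changes the "mean" although the underlying data are unchanged.
swapped = [1 - c for c in coded]
print(sum(coded) / len(coded), sum(swapped) / len(swapped))  # 0.6 0.4
```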
Ordinal Scale
In an ordinal scale of measurement, the categories that make up the scale not only have
separate names but are also ranked in terms of magnitude. This scale consists of a set of
categories that are organized in an ordered sequence. For example, a manager of a
company is asked to rank employees in terms of how well they perform their duties. The
collected data will tell us whom the manager considers the best worker, the second best,
and so on. The data may reveal that the worker who is ranked second is viewed as doing
better work than the worker who is ranked third. However, we get no information
about the amount by which the workers differ in job performance, i.e. we cannot
answer the question “How much better?” Thus, an ordinal scale provides
information about the direction of difference between two measurements, but it does not
reveal the magnitude of the difference.
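The loss of magnitude information in a ranking can be sketched as follows. The worker names and performance scores are hypothetical; in practice the manager's ranking is all we would observe.

```python
# Hypothetical underlying performance scores for two teams (illustrative only).
team_a = {"Ali": 95, "Sara": 94, "Omar": 60}
team_b = {"Ali": 95, "Sara": 70, "Omar": 69}


def ranking(scores: dict[str, int]) -> list[str]:
    """Return the names ordered from best to worst performer."""
    return sorted(scores, key=scores.get, reverse=True)


# Both teams produce the same ordinal data...
print(ranking(team_a))  # ['Ali', 'Sara', 'Omar']
print(ranking(team_b))  # ['Ali', 'Sara', 'Omar']

# ...although the gap between first and second place is very different.
print(team_a["Ali"] - team_a["Sara"], team_b["Ali"] - team_b["Sara"])  # 1 25
```

The two rankings are identical, so the ranks alone cannot tell us whether the best worker is slightly or substantially better than the second best.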
Interval Scale
An interval scale possesses all the characteristics of an ordinal scale, with the additional
feature that the categories form a series of intervals that are exactly the same size. This
additional information makes it possible to compute distances between values on an
interval scale. For example, on a ruler a 1-inch interval is the same size at every location on
the ruler. Similarly, a 4-inch distance is exactly the same size no matter where it is
measured on the ruler. Likewise, the distance between the scores of 70 and 80 is
considered to be the same as the distance between scores of 80 and 90. For all practical
purposes these numbers can undergo arithmetic operations to be transformed into
meaningful results. An interval scale answers the question “How much better?” or “How
much is the difference?” But there is no intrinsic zero, or starting point. The zero point on
the interval scale does not indicate a total absence of what is being measured. For
example, 0° (zero degrees) on the Celsius or Fahrenheit scale does not indicate no
temperature.
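The temperature example can be made concrete with a small Python sketch. It uses the standard Celsius-to-Fahrenheit conversion; the function name `c_to_f` and the chosen temperatures are illustrative assumptions.

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


a_c, b_c = 10.0, 20.0
a_f, b_f = c_to_f(a_c), c_to_f(b_c)  # 50.0 and 68.0

# The ratio of two values depends on the unit, so "twice as hot" has no meaning.
print(b_c / a_c)  # 2.0
print(b_f / a_f)  # 1.36

# Equal intervals stay equal whichever unit is used.
print(c_to_f(30.0) - c_to_f(20.0) == c_to_f(20.0) - c_to_f(10.0))  # True
```

Because the zero point is arbitrary, differences are meaningful on an interval scale but ratios of values are not.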
Ratio Scale
A ratio scale has all the characteristics of an interval scale but adds an absolute zero
point. This means that on a ratio scale a value of zero indicates complete absence of the
variable being measured. The advantage of an absolute zero is that a ratio of numbers on
the scale reflects the ratio of magnitudes of the variable being measured. We can say that
one measurement is three times larger than another, or one score is only half as large as
another. Thus, a ratio scale not only enables us to measure the difference between two
individuals, but also to describe the difference in terms of ratios.
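As a rough illustration, weight is measured on a ratio scale, so the ratio between two weights does not depend on the unit. The weights and the approximate conversion factor below are assumptions made for this sketch.

```python
KG_TO_LB = 2.20462  # approximate kilograms-to-pounds conversion factor

# Hypothetical weights of a child and an adult, in kilograms.
child_kg, adult_kg = 20.0, 60.0
child_lb, adult_lb = child_kg * KG_TO_LB, adult_kg * KG_TO_LB

# The adult is three times as heavy as the child in either unit.
ratio_kg = adult_kg / child_kg
ratio_lb = adult_lb / child_lb
print(ratio_kg)            # 3.0
print(round(ratio_lb, 6))
```

Changing the unit rescales every value but keeps zero at zero, which is why statements such as "three times larger" remain valid.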
The scientific method is a process for explaining the world we see. It is a process used to
validate observations while minimizing observer bias. This method is a series of steps
that lead to answers that accurately describe the things we observe. Its goal is to conduct
research in a fair, unbiased and repeatable manner.
The scientific method is a tool for: (a) forming and framing questions, (b) collecting
information to answer those questions, and (c) revising old questions and developing new ones.
The scientific method is not the only way, but it is the best-known way, to discover how and
why the world works. It is not a formula. It is a process of sequential steps
designed to create an explainable outcome that increases our knowledge base. The
process is as follows:
i) Ask a question
Asking a question is the first step of the scientific method. Good questions come from
careful observations. Our senses are a good source of observation. Sometimes
instruments such as a microscope or a telescope are also used. These
instruments extend the range of the senses. During observation many questions
come to mind. These questions drive the scientific method.
11
1.10 Statistical Notations
Commonly used statistical notations are given in the following table.
34   ME             Margin of error
35   DF or Df       Degrees of freedom
36   Q1             Lower/first quartile (25% of population are below this value)
37   Q2             Median/second quartile (50% of population are below this value, also median of the sample)
38   Q3             Upper/third quartile (75% of population are below this value)
39   IQR            Inter-quartile range (Q3 – Q1)
40   X ~            Distribution of random variable X
41   N(µ, σ²)       Normal distribution / Gaussian distribution
42   U(a, b)        Uniform distribution (equal probability in range a, b)
43   gamma(c, λ)    Gamma distribution
44   χ²(k)          Chi-square distribution
45   Bin(n, p)      Binomial distribution
46   F(k1, k2)      F distribution
47   Poisson(λ)     Poisson distribution
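The quartile notations (Q1, Q2, Q3 and IQR) listed above can be illustrated with Python's standard `statistics` module. The data are made up, and the `"inclusive"` method is only one of several conventions for computing quartiles; other conventions may give slightly different values.

```python
import statistics

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data for illustration

# method="inclusive" treats the data as the whole population.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1  # inter-quartile range, Q3 - Q1

print(q1, q2, q3)  # 3.0 5.0 7.0
print(iqr)         # 4.0
print(q2 == statistics.median(scores))  # True
```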
1.12 Activities
1. Diagrammatically show how “data” becomes “information”.
2. Make a list of the questions that can be answered using statistics.
3. Make a list of the “functions of statistics”.
4. Think and write down any two characteristics not given in the unit.
5. Make a diagram to show the types of variables.
6. Draw a hierarchy of levels of measurement.
7. Make a list of the steps of scientific method.
1.13 Bibliography
Agresti, A., & Finlay, B. (1997). Statistical Methods for the Social Sciences (3rd ed.).
Prentice Hall.
Dietz, T., & Kalof, L. (2009). Introduction to Social Statistics. UK: Wiley-Blackwell.
Fraenkel, J. R., Wallen, N. E., & Hyun, H. H. (2012). How to Design and Evaluate Research
in Education (8th ed.). New York: McGraw-Hill.
Gravetter, F. J., & Wallnau, L. B. (2002). Essentials of Statistics for the Behavioral
Sciences (4th ed.). California, USA: Wadsworth.