Chapter 1 (2)

The document provides an introduction to statistics, defining key terms such as data, population, and sample, and differentiating between descriptive and inferential statistics. It outlines the stages of statistical investigation, including data collection, organization, presentation, analysis, and interpretation, as well as the various methods of data collection and their applications across different fields. Additionally, it discusses the limitations of statistics and the classification of variables and scales of measurement.

Uploaded by

famiwalinwalif


Chapter One

Introduction

Definition of terms

Data: figures or facts from which conclusions can be drawn. Data are the numerical results of
any scientific measurement; any value that is expressed in numbers is called data.
Population: the totality of all elements under study.
Sample: a portion or part of the population taken so that some generalization about the
population can be made. It is a subset of the population that is assumed to be
representative of the population.

Definition of Statistics

Statistics can be defined in two senses: plural (as Statistical Data) and singular (as Statistical
Methods).

Plural sense: Statistics are collections of facts (figures). This meaning of the word is widely used
when reference is made to facts and figures on sales, employment or unemployment, accidents,
weather, deaths, education, etc. Eg: Sales Statistics, Labor Statistics, Employment Statistics,
etc. In this sense the word Statistics simply means data. But not all numerical data are
statistics.

Singular sense: Statistics is the science that deals with the methods of collecting,
organizing, presenting, analyzing and interpreting data. It refers to the subject area that is
concerned with extracting relevant information from available data with the aim of making sound
decisions. According to this meaning, statistics is concerned with the development and
application of methods and techniques for collecting, organizing, presenting, analyzing and
interpreting statistical data.

According to the singular sense definition of statistics, a statistical study (statistical
investigation) involves five stages: Collection of Data, Organization of Data, Presentation of
Data, Analysis of Data and Interpretation of Data.

1. Collection of Data: This is the first stage in any statistical investigation and involves the
process of obtaining (gathering) a set of related measurements or counts to meet
predetermined objectives. The data collected may be primary data (data collected directly by
the investigator) or secondary data (data obtained from intermediate sources such
as newspapers, journals, official records, etc.).
2. Organization of Data: It is usually not possible to derive any conclusion about the main
features of the data from direct inspection of the observations. The second purpose of
statistics is describing the properties of the data in a summary form. This stage of statistical
investigation helps to have a clear understanding of the information gathered and includes
editing (correcting), classifying and tabulating the collected data in a systematic manner.
Thus the first step in the organization of data is editing, which means correcting (adjusting)
omissions, inconsistencies, irrelevant answers and wrong computations in the collected data.
The second step is classification, that is, arranging the collected
data according to some common characteristics. The last step is tabulation:
presenting the classified data in tabular form, using rows and columns.
3. Presentation of Data: The purpose of data presentation is to give an overview of what the data
actually look like and to facilitate statistical analysis. Data can be presented using
graphs and diagrams, which are easy to remember and facilitate comparison.
4. Analysis of Data: The analysis of data is the extraction of summarized and comprehensive
numerical descriptions in order to reach conclusions or provide answers to a problem. The
problem may require simple or sophisticated mathematical expressions.
5. Interpretation of Data: This is the last stage of a statistical investigation. Interpretation
involves drawing conclusions from the data collected and analyzed in order to make decisions.

Classification of Statistics

Based on the scope of the decision, statistics can be classified into two types: Descriptive and
Inferential Statistics.

Descriptive Statistics refers to the procedures used to organize and summarize masses of data.
It is concerned with describing or summarizing the most important features of the data. It deals
only with the characteristics of the collected data, without attempting to
infer (conclude) anything that goes beyond the data themselves. The methodology of descriptive
statistics includes the methods of organizing (classification, tabulation, frequency distributions)
and presenting (graphical and diagrammatic presentation) data, and the calculation of certain
indicators such as Measures of Central Tendency and Measures of Dispersion (Variation),
which summarize some important features of the data.

Inferential (Inductive) Statistics includes the methods used to find out something about a
population based on a sample. It is concerned with drawing statistically valid conclusions
about the characteristics of the population based on information obtained from a sample. In this
form of statistical analysis, descriptive statistics is linked with probability theory in order to
generalize the results of the sample to the population. Hypothesis testing, determining
relationships between variables and making predictions are also part of inferential statistics.

Ex: Classify each of the following statements as Descriptive or Inferential Statistics.

a. The average age of the students in this class is 21 years.

b. At least 5% of the killings reported last year in city X were due to tourists.
c. Of the students enrolled in Haramaya University in this year 74% are male and 26% are
female.
d. The chance of winning the Ethiopian National Lottery in any day is 1 out of 167000.

Applications of Statistics

In this modern time, statistical information plays a very important role in a wide range of fields.
Today, statistics is applied in almost all fields of human endeavor.

 In Scientific Research: Statistics is used as a tool in scientific research. Statistical
formulas and concepts are applied to data that result from an experiment.
 In Quality Control: Statistical methods help to check whether a product satisfies a given
standard.
 For Decision Making: statistics helps to enhance the power of decision making in the face
of uncertainty by providing sufficient information.
 In Agriculture: Experiments are designed and analyzed using statistical procedures.

 In Public Health and Medicine: Statistical methods are used for the computation and
interpretation of birth and death rates.
 In Economics: For modeling functional relationships between or among variables.
 In Education and Agricultural Extension: To study the effects of certain trainings.
 In Natural and Social Sciences, Business, Planning, Behavioral Sciences, etc.

Uses of Statistics

 Condenses and summarizes masses of data and presents facts in numerical and definite
form.
 Facilitates comparison: statistical devices such as averages, percentages, ratios, etc. are
used for this purpose.
 Formulating and testing hypotheses: For instance, hypotheses such as whether a new medicine
is effective in curing a disease, or whether there is an association between variables, can be
tested using statistical tools.
 Forecasting: Statistical methods help in studying past data and predicting future trends.

Limitations of Statistics

 It cannot deal with a single observation; rather, it deals with aggregates of facts.
 Statistical methods are not applicable to qualitative characteristics, i.e. statistics deals with quantitative
characteristics.
 Statistical results are true on average, i.e. for the majority of cases. Laws of statistics are not
universally true like the laws of physics, chemistry and mathematics.
 Statistics are liable to be misused or misinterpreted. This may be due to incomplete
information, inadequate or faulty procedures during data collection and sample selection,
and mainly due to ignorance (lack of knowledge).

Variable
A variable is a characteristic or an attribute that can assume different values.
Eg: Height, Family size, Gender
Based on the values that variables assume, variables can be classified as

1. Qualitative variables: do not assume numeric values.
Eg: Gender
2. Quantitative variables: assume numeric values. These variables are numeric in
nature.
Eg: Height, Family size
 Discrete variable: takes whole number values and consists of distinct
recognizable individual elements that can be counted. It is a variable that
assumes a finite or countable number of possible values. These values are
obtained by counting (0, 1, 2, ...).
Eg: Family size, Number of children in a family, number of cars at the
traffic light
 Continuous variable: takes any value including decimals. Such a variable
can theoretically assume an infinite number of possible values. These values
are obtained by measuring.
Eg: Height, Weight, Time, and Temperature
Generally the values of a variable can be obtained either by counting for discrete variables, by
measuring for continuous variables or by making categories for qualitative variables.

Ex: Classify each of the following as Qualitative or Quantitative, and if it is quantitative, classify
it as Discrete or Continuous.

a. Color of automobiles in a dealer’s show room.

b. Number of seats in a movie theater.
c. Classification of patients based on nursing care needed (complete, partial or seafarer)
d. Number of tomatoes on each plant on a field.
e. Weight of newly born babies.
Scales of Measurements/Levels of Measurements
Consider the following two cases.
 Mr A wears shirt number 5 when he plays football.
 Mr B wears shirt number 6 when he plays football.
Who plays better?
What is the average shirt number?

 Mr A scored 5 in Stat quiz.
 Mr B scored 6 in Stat quiz.
Who did better?
What is the average score?

Based on the numbers on the shirts, it is not possible to judge whether Mr B plays better. But
using the test scores, it is possible to judge that Mr B did better in the quiz. Also, it is not possible
to find a meaningful average shirt number, because the numbers
on the shirts are simply codes, but it is possible to obtain the average test score.

Therefore, scales of measurement

 Show the information contained in the value of a variable.
 Show which mathematical operations and statistical analyses are
permissible on the values of the variable.

 Nominal Scales of variables are those qualitative variables which show the category of
individuals. They reflect classification into categories (names of groups) where there is no
particular order or qualitative difference between the labels. Numbers may be assigned to the
variables simply for coding purposes. It is not possible to compare individuals based on the
numbers assigned to them. The only mathematical operation permissible on these variables is
counting.
These variables
 Have mutually exclusive (non-overlapping) and exhaustive categories.
 Have no ranking or order between (among) the values of the variable.
Eg: Gender, Religion, ID No, Ethnicity, Color

 Ordinal Scales of variables are also qualitative variables, but their values can be ordered
and ranked. Ranking and counting are the only mathematical operations that can be done on the
values of the variables. There is no precise difference between the values (categories) of
the variable.
Eg: Academic qualifications (B.Sc., M.Sc., Ph.D.), Grade Scores (A, B, C, D, F), Strength
(very weak, weak, strong, very strong), Health status (very sick, sick, cured)

 Interval Scales of variables are those quantitative variables for which a value of
zero does not show absence of the characteristic, i.e. there is no true zero. Zero indicates
a low value rather than nothing. There is a precise difference between the units of measurement (levels).
Eg: Temperature: 0 °C does not mean there is no temperature; it means that it is cold.
A student who scored “0” is not a student with no knowledge. A person having an IQ of
0 is not a person with no IQ.
 Ratio Scales of variables are those quantitative variables for which a value of
zero shows absence of the characteristic.
Eg: Height, Weight, Income, Amount of yield, Expenditure, Consumption.
All mathematical operations are allowed on the values of these variables.
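
The four scales can be summarized as a lookup of permitted operations. The sketch below is only an illustration: the operation names and the helper functions are hypothetical, and treating "precise differences" on the interval scale as permission to add and subtract (and therefore to average) is the usual textbook reading rather than something stated explicitly above.

```python
# Illustrative lookup: which operations each scale of measurement permits.
# Operation names are made up for this sketch.
PERMITTED = {
    "nominal": {"count"},
    "ordinal": {"count", "rank"},
    "interval": {"count", "rank", "add_subtract"},
    "ratio": {"count", "rank", "add_subtract", "multiply_divide"},
}


def is_permitted(scale, operation):
    """Return True if the operation is meaningful on the given scale."""
    return operation in PERMITTED[scale]


def can_average(scale):
    """An average needs addition, so it requires at least an interval scale."""
    return is_permitted(scale, "add_subtract")


# Shirt numbers are nominal codes; quiz scores are treated as interval data.
print("average shirt number meaningful?", can_average("nominal"))
print("average quiz score meaningful?", can_average("interval"))
```

This mirrors the shirt-number example: both variables hold the values 5 and 6, but only the quiz scores support an average.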

Chapter Two
Data Collection and Presentation

Classification of Data

Based on the source, data can be classified into two: Primary Data and Secondary Data.

 Primary data are data collected for the first time, either through direct observation or by
questioning individuals. The term refers to data collected either by or under the direct
supervision and instruction of the researcher.
 Secondary data are data obtained from published or unpublished sources such as
newspapers, journals, official records, etc.

Based on the role of time, data can be classified as Cross-sectional and Time series.

 Cross-sectional Data: a set of observations taken at a single point in time.
 Time series Data: a set of observations collected over a sequence of time periods, usually at
equal intervals.
Methods of Data Collection

The first and foremost task in statistical investigation is data collection. Before data collection,
four important points should be considered. These are the purpose of data collection (why we
need to collect data), the data to be collected (what kind of data to be collected), the source of
data (where we can get the data) and the methods of data collection (how can we collect this
data). These points are called the why, what, where and how of data collection.

Primary data are collected from primary sources and secondary data from secondary sources.

Primary data can be collected through experimental methods in a laboratory in the natural sciences
and through survey methods in the social sciences.

The survey methods of data collection are personal interview, telephone interview, mailed
questionnaire and personal observation.

 Observational Method: This method involves monitoring an ongoing activity and
recording data directly. It avoids incompleteness of data. However, it is rarely used, as it
is not possible to plan when the events will happen.
 Personal Interview: A trained interviewer asks a series of questions and records the
responses on a specially designed form called a questionnaire. In this approach the
enumerator is with the respondent and can explain any points that are not clear to the
respondent. The quality of the data is affected by both the design of the
questionnaire and the quality of the interviewer.
Advantage:
It obtains in-depth information from the person being interviewed, since the interviewer can
clarify the questions, and it avoids incomplete and disordered
responses.
Disadvantage:
 It is costlier than other methods, since it requires training of interviewers
and transportation costs.
 The respondent may not give the real information for sensitive questions,
since there is face-to-face interaction. Eg: When asked about salary, a respondent whose
salary is very small might report a wrong figure out of
embarrassment.
 Telephone Interview: This method involves contacting the respondent by telephone and
collecting information. It is a faster way to collect information. The absence of telephone lines
makes this approach less usable. It also cannot be used for rural surveys.
Advantage: It is less costly, since it requires fewer interviewers and the cost of
calling is less than the cost of transportation. The respondent may give his/her opinion
candidly since there is no face-to-face interaction. Because of this, the data obtained
through this method are more realistic than those from personal interviews.
Disadvantage: This method is not applicable in developing countries because of the lack
of access to telephones. The respondent might not be at home or may not respond
to the call, and in the meantime the interviewer might get bored. There is a high chance
of getting incomplete responses, since the connection can be interrupted.

 Mailed Questionnaire: The researcher sends the questionnaire to the respondents; the
respondents complete the form and send it back to the researcher.
Advantage:
Costs are low. The responses are free from the biases of the interviewer, and respondents
have more time to give well-thought-out answers.
Disadvantage: It is applicable only to educated persons, and it suffers from
non-response, partial response and low return rates. The respondents might also give inappropriate
answers: since no one is there with them, they may misunderstand a question and answer it
incorrectly.
Types of Surveys
In general there are two methods of data collection: the Census Survey method and the Sample
Survey method.

Census Survey (complete enumeration): a study that covers all the elements in the population
under consideration. In this method we resort to a 100% inspection of the population, and each and
every unit of the population is enumerated. It makes it possible to obtain information about each and every
element in the population.

Sample Survey: a survey in which some elements that are representative of the population
(a sample) are taken to make inferences about the whole population. It is a statistical process in which we
select and examine a sample instead of considering the whole population.

The sampling method has many advantages over the census method.
1. Sampling reduces the cost of data collection.
2. Greater speed, i.e. it enables us to obtain results on time.
3. Greater accuracy. It helps us to get data of good quality: as the number of
enumerators decreases, we can train and supervise them well during data
collection.
4. Greater scope (under circumstances where human and material resources are
limited).
5. A census may be destructive. Samples reduce the damage caused by some tests in
quality control. For example, when cooking, mothers check whether the food has
enough salt, spices, butter and so on by taking a small amount and testing
it. What would happen if the whole dish were used for the test?
6. Complete enumeration may be impossible or impractical (for example, when the population is
infinite), in which case sampling is the only way.
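
The contrast between a census and a sample survey can be sketched in a few lines of Python. Everything here is hypothetical: the population values are randomly generated, and simple random sampling via `random.sample` is assumed only for illustration, since the text does not prescribe a sampling technique.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 1,000 people (made-up values).
population = [random.randint(18, 60) for _ in range(1000)]

# Census: every element of the population is inspected.
census_mean = statistics.mean(population)

# Sample survey: only 50 elements, drawn without replacement, are inspected.
sample = random.sample(population, k=50)
sample_mean = statistics.mean(sample)

print(f"census mean = {census_mean:.2f} ({len(population)} units inspected)")
print(f"sample mean = {sample_mean:.2f} ({len(sample)} units inspected)")
```

The sample mean is only an estimate of the census mean, but it is obtained from a twentieth of the units, which is the cost and speed advantage listed above.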
Questionnaire

It is a form containing the cover letter that explains about the person conducting the survey and
the objectives of the survey, and a set of related questions which will be answered by the
respondents.

Great care is required in preparing a questionnaire for data collection. One of the most important
points is that all questions in it must be relevant to the objectives of the
survey.

A highly structured questionnaire is the one in which the questions to be asked and the response
permitted are completely predetermined.

Example: Why did you buy a Sony TV?

Lower price
Good quality
Better picture
Longer guarantee

This is accomplished by employing fixed alternative questions in which the respondents are
limited to the stated alternative.

A highly unstructured questionnaire is one in which the questions to be asked are only
loosely determined, and the respondent is free to respond in his/her own words and in any way s/he
sees fit.

Unstructured questionnaires have two major disadvantages.

1. They are slow, and hence costly, to administer in the field and to tabulate.
2. The data collection process and the interpretation of the results are both subjective
and hence open to bias.

Structured techniques overcome these problems, but they are difficult to use in situations where
respondents may hesitate to report their attitudes.

The disguised questionnaire attempts to hide the purpose of the study, whereas the undisguised questionnaire is
one in which the purpose of the research is obvious from the questions posed.

Structured undisguised questionnaires are the most commonly used type in practice. They are
simple to administer and easy to tabulate and analyze. Fixed-alternative questions are more
productive when the possible replies are well known, limited in number and clear cut.

The unstructured undisguised questionnaire is one in which the purpose of the study is not
concealed but the response to the question is open ended. Consider the question “How do you
express the need for democracy in developing countries?” Such a question provides complete
freedom to the respondent. However, the responses are difficult to tabulate and analyze. In the
unstructured disguised questionnaire, the respondents are not told about the purpose of the
study and the questions are framed in such a manner that there is complete freedom for the respondent
to answer. The basic philosophy underlying such a questionnaire is that the more unstructured
and ambiguous a stimulus, the more a subject can and will project his/her emotions,
needs, motivations, attitudes and values. The practical difficulties of editing, coding and
tabulating the replies impose serious limitations on the use of this method. This method is more often
used for Investigative Research. The unstructured undisguised questionnaires are also not
popular in practice. Having decided which type of questionnaire to use, the following points
should be kept in mind while designing a questionnaire.

 The person conducting the survey should introduce himself/herself, state the objective of the
survey, promise anonymity, and include any instructions necessary for giving
correct responses (in the cover letter).
 The number of questions should be as few as possible.
Once the objectives of the survey are clearly defined, only questions pertinent to the
objectives should be asked. The time of the respondent should not be wasted by asking irrelevant
questions. In general, 5 to 25 questions may be regarded as a fair number. If a lengthy questionnaire
is unavoidable, it should preferably be divided into two or more parts.

 Questions should be logically arranged. Put the questions in the appropriate sequence of
topics. Topics should not be mixed up.
The questions should be in a logical order so that a natural and spontaneous reply is
induced. They should not skip back and forth.
It is undesirable to ask a person how many children s/he has before asking whether s/he is
married or not.
Questions related to identification and description of the respondent should come first,
followed by the major information questions. If opinions are requested, such questions
should usually be placed at the end of the list.
 Questions should be simple, short and easy to understand, and each should convey one
and only one idea. Technical terms should be avoided.
 Sensitive questions (questions of a personal or financial nature) should be avoided. If they
cannot be avoided, the information should be obtained indirectly: put the questions in the
last part of the questionnaire and offer a set of ranges. Eg: Age (0-25, 26-50, 51-75, >75)
Salary (Below 200, 200-500, 500-1000, >1000)
 Leading questions should be completely avoided. If you ask a person “Don't you
smoke?” the person will automatically say “Yes, I do not.”
 Answers to the questions should not require any calculation.
 There should be instructions on how to fill in the form.
 Questions should be capable of objective answers.

Types of questions
Different types of questions that may form a questionnaire can be grouped into three
categories.
1. Dichotomous questions
2. Multiple-choice questions, and
3. Open-ended questions

Dichotomous questions are questions which have two alternative responses. Such
questions can be answered with ‘Yes’ or ‘No’.

Eg: Do you intend to purchase TV?

Yes No

Such questions are an excellent technique, but they apply only
to situations where a clear alternative exists.

However, if a question does not have clear choices like ‘Yes’ or ‘No’, it cannot be
used as a dichotomous question, or additional answers should be added as follows.

Do you drink coffee?

Yes    No
If Yes, how often?
Always    Occasionally    Seldom

Multiple-choice questions: in such types of questions the respondent is asked to select one out of
a number of alternative responses. This process not only facilitates tabulation of data but also
takes very little time of the respondent to fill the questionnaire.

Eg: Why did you purchase a Sony TV?

Lower price
Best quality
Better picture
Longer guarantee
Any other

The problem with multiple-choice questions is that the respondent may want to tick more than one
alternative. To avoid this problem, we either have to instruct the respondent to choose the
most important one or ask him/her to rank the choices.

The use of multiple-choice questions is indicated only when the investigator is confident of the
existence of a limited group of important alternatives.

Open-ended or free answer questions: In such types of questions, the respondent will have the
chance to answer the questions in his/her own words.

Eg: -What is your opinion on the teaching policy?

The difficulty with these types of questions lies in classifying the responses during tabulation and
analysis.

Pre-test: Test the questionnaire on a small number of respondents so that it can be corrected before
actual data collection. The pre-test helps the researcher to improve the language and structure of
the questionnaire. It can also help to estimate the average time taken to complete the
questionnaire and thereby to estimate the cost required for the survey.

Secondary data

Secondary data should be used with the utmost care. Before using such data, the following three
points should be considered.

1. Whether the data are suitable for the purpose of the investigation. This can be judged in the
light of the nature and scope of the investigation.
2. If the data are suitable for our purpose, whether they are
adequate for the purpose of the investigation. This can be judged in the light of the time and
geographical area covered by the available data.
3. Whether the data are reliable. The data obtained should be checked for accuracy.

Tabular Method of Data Presentation

2.1.1. Frequency Distributions

To describe situations, draw conclusions, or make inferences about events, the researcher must
organize the data in some meaningful way. The most convenient method of organizing data is to
construct a frequency distribution.
Two types of frequency distributions that are most often used are the categorical frequency
distribution and the grouped frequency distribution. The procedures for constructing these
distributions are shown now.

Categorical Frequency Distributions
The categorical frequency distribution is used for data that can be placed in specific categories,
such as nominal- or ordinal-level data. For example, data such as political affiliation, religious
affiliation, or major field of study would use categorical frequency distributions.
 Nominal Scales of variables are those qualitative variables which show the category of
individuals. They reflect classification into categories (names of groups) where there is no
particular order or qualitative difference between the labels. Numbers may be assigned to the
variables simply for coding purposes. It is not possible to compare individuals based on the
numbers assigned to them. The only mathematical operation permissible on these variables is
counting.
These variables
 Have mutually exclusive (non-overlapping) and exhaustive categories.
 Have no ranking or order between (among) the values of the variable.
Eg: Gender, Religion, ID No, Ethnicity, Color
 Ordinal Scales of variables are also qualitative variables, but their values can be ordered
and ranked. Ranking and counting are the only mathematical operations that can be done on the
values of the variables. There is no precise difference between the values (categories) of
the variable.
Eg: Academic qualifications (B.Sc., M.Sc., Ph.D.), Grade Scores (A, B, C, D, F), Strength
(very weak, weak, strong, very strong), Health status (very sick, sick, cured).
Example: Distribution of Blood Types
Twenty-five army inductees were given a blood test to determine their blood type. The data
set is
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
Construct a frequency distribution for the data.
Solution

Since the data are categorical, discrete classes can be used. There are four blood types: A, B,
O, and AB. These types will be used as the classes for the distribution.
The procedure for constructing a frequency distribution for categorical data is given next.
Step 1: Make a table as shown.
A       B       C           D
Class   Tally   Frequency   Percent
A
B
O
AB
Step 2: Tally the data and place the results in column B.
Step 3: Count the tallies and place the results in column C.
Step 4: Find the percentage of values in each class by using the formula
$$\% = \frac{f}{n} \times 100\%$$
where f = frequency of the class and n = total number of values. For example, in the class of
type A blood, the percentage is:
$$\% = \frac{5}{25} \times 100\% = 20\%$$
Percentages are not normally part of a frequency distribution, but they can be added since
they are used in certain types of graphs such as pie graphs.
Also, the decimal equivalent of a percent is called a relative frequency.
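Step 5: Find the totals for columns C (frequency) and D (percent). The completed table is
shown.
Class   Tally        Frequency   Percent
A       /////        5           20
B       ///// //     7           28
O       ///// ////   9           36
AB      ////         4           16
Total                25          100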
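
Steps 2 to 5 can also be carried out programmatically. The following is a minimal sketch, assuming Python's standard `collections.Counter` for the tally; the variable names are illustrative and the data are the 25 blood types listed in the example.

```python
from collections import Counter

# The 25 blood types from the example, row by row.
blood_types = (
    "A B B AB O "
    "O O B AB B "
    "B B O A O "
    "A O O O AB "
    "AB A O B A"
).split()

counts = Counter(blood_types)  # Steps 2 and 3: tally and count each class
n = len(blood_types)           # total number of values

# Steps 4 and 5: percentage per class (f / n * 100) and the totals row.
print(f"{'Class':<6}{'Frequency':>10}{'Percent':>9}")
for cls in ["A", "B", "O", "AB"]:
    f = counts[cls]
    print(f"{cls:<6}{f:>10}{f / n * 100:>9.0f}")
print(f"{'Total':<6}{n:>10}{100:>9}")
```

Dividing each percentage by 100 gives the relative frequency mentioned above, for example 5 / 25 = 0.20 for type A.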
Grouped Frequency Distributions
When the range of the data is large, the data must be grouped into classes that are more than
one unit in width, in what is called a grouped frequency distribution.

Steps to construct a grouped frequency distribution

Step 1: Determine the classes.
 Find the highest and lowest values.
 Find the range.

 Select the number of classes desired.
 Find the width by dividing the range by the number of classes and rounding up.
 Select a starting point (usually the lowest value or any convenient number less than the
lowest value); add the width to get the lower limits.
 Find the upper class limits.
 Find the boundaries.
Step 2: Tally the data.
Step 3: Find the numerical frequencies from the tallies, and find the cumulative frequencies.
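
The three steps can be sketched in Python as follows. This is a rough illustration under stated assumptions: the data set is made up, the values are whole numbers, the lowest value is used as the starting point, and the width is taken as the next whole number above range / classes (bumping up even when the division is exact, which is an assumption made here so that the largest value always fits in the last class).

```python
def grouped_frequency(data, num_classes):
    """Build a grouped frequency distribution for whole-number data."""
    low, high = min(data), max(data)       # highest and lowest values
    data_range = high - low                # the range
    width = data_range // num_classes + 1  # range / classes, rounded up
    rows = []
    cumulative = 0
    for i in range(num_classes):
        lower = low + i * width            # add the width to get lower limits
        upper = lower + width - 1          # upper class limit
        freq = sum(lower <= x <= upper for x in data)  # tally the class
        cumulative += freq
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "frequency": freq,
            "cumulative": cumulative,
        })
    return width, rows


# Hypothetical data set (made-up values, for illustration only).
data = [12, 15, 17, 21, 22, 22, 25, 28, 30, 31,
        33, 34, 36, 38, 40, 41, 41, 43, 45, 46]

width, rows = grouped_frequency(data, num_classes=5)
print("class width:", width)
for row in rows:
    print(row["limits"], row["boundaries"], row["frequency"], row["cumulative"])
```

With these values the range is 34, so five classes of width 7 run from 12–18 up to 40–46.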
For example, a distribution of the number of hours that boat batteries lasted is the following.

Class limits   Class boundaries   Tally        Frequency
24–30          23.5–30.5          ///          3
31–37          30.5–37.5          /            1
38–44          37.5–44.5          /////        5
45–51          44.5–51.5          ///// ////   9
52–58          51.5–58.5          ///// /      6
59–65          58.5–65.5          /            1
Total                                          25
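
Step 3 also asks for cumulative frequencies, which the table does not list. A short sketch, using the frequencies from the boat battery table and the standard `itertools.accumulate` function, shows how they follow from a running total:

```python
from itertools import accumulate

# Class limits and frequencies taken from the boat battery table.
limits = ["24-30", "31-37", "38-44", "45-51", "52-58", "59-65"]
frequencies = [3, 1, 5, 9, 6, 1]

# Each cumulative frequency is the running total up to and including a class.
cumulative = list(accumulate(frequencies))

print(f"{'Limits':<8}{'Freq':>5}{'Cum.':>6}")
for lim, f, cf in zip(limits, frequencies, cumulative):
    print(f"{lim:<8}{f:>5}{cf:>6}")
```

The last cumulative frequency equals the total number of observations, 25.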
The procedure for constructing the preceding frequency distribution is given in Example 2–2;
however, several things should be noted. In this distribution, the values 24 and 30 of the first
class are called class limits. The lower class limit is 24; it represents the smallest data value
that can be included in the class. The upper class limit is 30; it represents the largest data
value that can be included in the class. The numbers in the second column are called class
boundaries. These numbers are used to separate the classes so that there are no gaps in the
frequency distribution. The gaps are due to the limits; for example, there is a gap between 30
and 31. Students sometimes have difficulty finding class boundaries when given the class
limits. The basic rule of thumb is that the class limits should have the same decimal place
value as the data, but the class boundaries should have one additional place value and end in
a 5. For example, if the values in the data set are whole numbers, such as 24, 32, and 18, the
limits for a class might be 31–37, and the boundaries are 30.5–37.5. Find the boundaries by
subtracting 0.5 from 31 (the lower class limit) and adding 0.5 to 37 (the upper class limit).
Lower limit - 0.5 = 31 - 0.5 = 30.5 = Lower boundary

Upper limit + 0.5 = 37 + 0.5 = 37.5 = Upper boundary
If the data are in tenths, such as 6.2, 7.8, and 12.6, the limits for a class hypothetically might
be 7.8–8.8, and the boundaries for that class would be 7.75–8.85. Find these values by
subtracting 0.05 from 7.8 and adding 0.05 to 8.8.
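The rule of thumb for boundaries (one additional place value, ending in a 5) can be sketched in Python as follows. The function name class_boundaries is hypothetical, and passing the limits as strings is a choice made here so that the decimal module keeps the exact place value of the data.

```python
from decimal import Decimal

def class_boundaries(lower_limit, upper_limit):
    """Return (lower boundary, upper boundary) for class limits given as strings.

    Limits are passed as strings (e.g. "7.8") so that Decimal keeps the
    exact place value of the data.
    """
    lower = Decimal(lower_limit)
    upper = Decimal(upper_limit)
    # Number of decimal places in the limits (0 for whole numbers).
    places = max(0, -lower.as_tuple().exponent)
    # Half of one unit in the last recorded place: 0.5, 0.05, 0.005, ...
    half_unit = Decimal(5) / Decimal(10) ** (places + 1)
    return lower - half_unit, upper + half_unit

print(class_boundaries("31", "37"))    # whole-number data
print(class_boundaries("7.8", "8.8"))  # data in tenths
```

The two calls reproduce the worked values above: 30.5 and 37.5 for the whole-number class, 7.75 and 8.85 for the class in tenths.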
Finally, the class width for a class in a frequency distribution is found by subtracting the
lower (or upper) class limit of one class from the lower (or upper) class limit of the next
class. For example, the class width in the preceding distribution on the duration of boat
batteries is 7, found from 31 - 24 = 7.
The class width can also be found by subtracting the lower boundary from the upper
boundary for any given class. In this case, 30.5 - 23.5 = 7
Note: Do not subtract the limits of a single class. It will result in an incorrect answer.
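A short Python check of the two valid ways to find the class width, and of the mistake the note warns about, using the class limits of the boat-battery distribution. The variable names are illustrative.

```python
# Class limits of the boat-battery distribution above.
limits = [(24, 30), (31, 37), (38, 44), (45, 51), (52, 58), (59, 65)]

# Width from consecutive lower limits: 31 - 24.
width_from_limits = limits[1][0] - limits[0][0]

# Width from the boundaries of one class: 30.5 - 23.5.
lower_boundary = limits[0][0] - 0.5
upper_boundary = limits[0][1] + 0.5
width_from_boundaries = upper_boundary - lower_boundary

# Subtracting the limits of a single class understates the width: 30 - 24 = 6.
wrong_width = limits[0][1] - limits[0][0]

print(width_from_limits, width_from_boundaries, wrong_width)
```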
The researcher must decide how many classes to use and the width of each class. To
construct a frequency distribution, follow these rules:
1. There should be between 5 and 20 classes. Although there is no hard-and-fast rule for the
number of classes contained in a frequency distribution, it is of the utmost importance to
have enough classes to present a clear description of the collected data.
2. It is preferable but not absolutely necessary that the class width be an odd number.
This ensures that the midpoint of each class has the same place value as the data.
The class midpoint Xm is obtained by adding the lower and upper boundaries and
dividing by 2, or by adding the lower and upper limits and dividing by 2:
Xm = (Lower boundary + Upper boundary) / 2
or
Xm = (Lower limit + Upper limit) / 2
For example, the midpoint of the first class in the example with boat batteries is:
(24 + 30) / 2 = 27 or (23.5 + 30.5) / 2 = 27

The midpoint is the numeric location of the center of the class. Midpoints are necessary for
graphing (see Section 2–2). If the class width is an even number, the midpoint is in tenths.
For example, if the class width is 6 and the boundaries are 5.5 and 11.5, the midpoint is:
(5.5 + 11.5) / 2 = 8.5
Rule 2 is only a suggestion, and it is not rigorously followed, especially when a computer is
used to group data.
3. The classes must be mutually exclusive. Mutually exclusive classes have non-overlapping
class limits so that data cannot be placed into two classes. Many times, frequency
distributions such as:
Age
10–20
20–30
30–40
40–50
are found in the literature or in surveys. If a person is 40 years old, into which class should
she or he be placed? A better way to construct a frequency distribution is to use classes such
as:
Age
10–20
21–31
32–42
43–53
4. The classes must be continuous. Even if there are no values in a class, the class must be
included in the frequency distribution. There should be no gaps in a frequency distribution.
The only exception occurs when the class with a zero frequency is the first or last class. A
class with a zero frequency at either end can be omitted without affecting the distribution.
5. The classes must be exhaustive. There should be enough classes to accommodate all the
data.
6. The classes must be equal in width. This avoids a distorted view of the data.

One exception occurs when a distribution has a class that is open-ended. That is, the class has
no specific beginning value or no specific ending value. A frequency distribution with an
open-ended class is called an open-ended distribution. Here are two examples of
distributions with open-ended classes.
Age Frequency Minutes Frequency
10–20 3 below 110 16
21–31 6 110–114 24
32–42 4 115–119 38
43–53 10 120–124 14
54 and above 8 125–129 5
The frequency distribution for age is open-ended for the last class, which means that anybody
who is 54 years or older will be tallied in the last class. The distribution for minutes is open-
ended for the first class, meaning that any minute values below 110 will be tallied in that
class.
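One way to tally data into an open-ended distribution is sketched below in Python. Representing the missing upper limit with math.inf is a modelling choice made for this sketch, and the list of ages is hypothetical.

```python
import math

# Age classes from the open-ended example: the last class has no upper limit,
# represented here by math.inf (a modelling choice made for this sketch).
age_classes = [(10, 20), (21, 31), (32, 42), (43, 53), (54, math.inf)]

def tally(values, classes):
    """Count how many values fall inside each (lower, upper) class."""
    counts = [0] * len(classes)
    for v in values:
        for i, (lower, upper) in enumerate(classes):
            if lower <= v <= upper:
                counts[i] += 1
                break
    return counts

# Hypothetical ages (illustrative only).
ages = [12, 25, 33, 47, 54, 61, 89, 19, 40]
print(tally(ages, age_classes))
```

A class that is open at the lower end, such as "below 110" in the minutes example, could be written the same way with -math.inf as its lower limit.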
Example 2–2 shows the procedure for constructing a grouped frequency distribution, i.e.,
when the classes contain more than one data value.
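Rules 2, 3 and 6 above lend themselves to a mechanical check. The Python sketch below is one possible way to do it, with a hypothetical helper check_classes applied to the two age groupings from rule 3.

```python
def check_classes(limits):
    """Check a list of (lower limit, upper limit) pairs against rules 3 and 6.

    Returns a dict with two booleans and the class midpoints.
    """
    # Rule 3: each class must start above the previous class's upper limit.
    mutually_exclusive = all(
        prev_upper < next_lower
        for (_, prev_upper), (next_lower, _) in zip(limits, limits[1:])
    )
    # Rule 6: consecutive lower limits must be the same distance apart.
    widths = {b[0] - a[0] for a, b in zip(limits, limits[1:])}
    equal_width = len(widths) == 1
    # Rule 2: midpoint = (lower limit + upper limit) / 2.
    midpoints = [(lower + upper) / 2 for lower, upper in limits]
    return {
        "mutually_exclusive": mutually_exclusive,
        "equal_width": equal_width,
        "midpoints": midpoints,
    }

overlapping = [(10, 20), (20, 30), (30, 40), (40, 50)]
separated = [(10, 20), (21, 31), (32, 42), (43, 53)]
print(check_classes(overlapping))
print(check_classes(separated))
```

The first grouping fails the mutually exclusive check because 20, 30 and 40 each belong to two classes; the second passes, and its classes have a width of 11, an odd number, so the midpoints are whole numbers.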

Reasons for constructing a frequency distribution

1. To organize the data in a meaningful, intelligible way.
2. To enable the reader to determine the nature or shape of the distribution.
3. To facilitate computational procedures for measures of average and spread.
4. To enable the researcher to draw charts and graphs for the presentation of data.
5. To enable the reader to make comparisons among different data sets.
Class Work:
These data represent the record high temperatures in degrees Fahrenheit (°F) for each of the
50 states. Construct a grouped frequency distribution for the data using 7 classes.

112 100 127 120 134 118 105 110 109 112

110 118 117 116 118 122 114 114 105 109
107 112 114 115 118 117 118 122 106 110
116 108 110 121 113 120 119 111 104 111
120 113 120 117 105 110 118 112 114 114
Page: 43 (Bluman)
Required:
i. Determine the classes.
ii. Tally the data.
iii. Find the numerical frequencies from the tallies, and find the cumulative frequencies.
