What Is Statistics1
1 Description of Samples and Populations
Objectives
After completing this chapter, you should be able to
➢ identify the different types of variables;
➢ show and illustrate the relationship between populations and
samples;
➢ illustrate how to construct a frequency distribution for making charts
and graphs.
INTRODUCTION
1. Like professional people, you must be able to read and understand the
various statistical studies performed in your fields. To have this
understanding, you must be knowledgeable about the vocabulary,
symbols, concepts, and statistical procedures used in these studies.
3. You can also use the knowledge gained from studying statistics to
become better consumers and citizens. For example, you can make
intelligent decisions about what products to purchase based on consumer
studies, about government spending based on utilization studies, and so
on.
These reasons can be considered the goals for studying statistics. It is the
purpose of this chapter to introduce the goals for studying statistics by answering
questions such as the following:
What are the branches of statistics?
What are data?
How are samples selected?
2 STATISTICS
The second area of statistics is called inferential statistics.
If the subjects of a sample are properly selected, most of the time they
should possess the same or similar characteristics as the subjects in the
population.
A discrete variable is a numeric variable for which we can list the possible
values. For example, the number of eggs in a bird’s nest is a discrete variable
because only the values 0, 1, 2, 3, . . . are possible. Another example of a
discrete variable is the number of bacteria colonies in a petri dish.
In summary, variables can be classified as follows:
DATA
QUALITATIVE QUANTITATIVE
The next level of measurement is called the ordinal level. Data measured
at this level can be placed into categories, and these categories can be ordered,
or ranked. For example, from student evaluations, guest speakers might be
ranked as superior, average, or poor. Floats in a homecoming parade might be
ranked as first place, second place, etc. Note that precise measurement of
differences in the ordinal level of measurement does not exist. For instance,
when people are classified according to their build (small, medium, or large), a
large variation exists among the individuals in each class. Other examples of
ordinal data are letter grades (A, B, C, D, F).
The third level of measurement is called the interval level. This level differs
from the ordinal level in that precise differences do exist between units. For
example, many standardized psychological tests yield values measured on an
interval scale. IQ is an example of such a variable. There is a meaningful
difference of 1 point between an IQ of 109 and an IQ of 110. Temperature is
another example of interval measurement, since there is a meaningful difference
of 1 degree Fahrenheit between each unit, such as between 72 and 73 degrees F. One
property is lacking in the interval scale: There is no true zero. For example, IQ
tests do not measure people who have no intelligence.
The final level of measurement is called the ratio level. Examples of ratio
scales are those used to measure height, weight, area, and number of phone
calls received. Ratio scales have differences between units (1 inch, 1 pound,
etc.) and a true zero. In addition, the ratio scale contains a true ratio between
values. For example, if one person can lift 200 pounds and another can lift 100
pounds, then the ratio between them is 2 to 1. Put another way, the first person
can lift twice as much as the second person.
In addition, true ratios exist when the same variable is measured on two
different members of the population. There is not complete agreement among
statisticians about the classification of data into one of the four categories. For
example, some researchers classify IQ data as ratio data rather than interval.
Also, data can be altered so that they fit into a different category. For instance, if
the incomes of all professors of a college are classified into the three categories
of low, average, and high, then a ratio variable becomes an ordinal variable.
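The income example can be sketched in a few lines of Python. This is only an illustration: the cutoffs of 40,000 and 80,000, the function name `income_category`, and the four incomes are all hypothetical and do not come from the text.

```python
# Hypothetical cutoffs, chosen only for illustration.
LOW_CUTOFF = 40_000
HIGH_CUTOFF = 80_000

def income_category(income):
    """Map a ratio-level income to one of three ordered categories."""
    if income < LOW_CUTOFF:
        return "low"
    if income < HIGH_CUTOFF:
        return "average"
    return "high"

incomes = [35_000, 52_000, 91_000, 78_500]  # made-up data
categories = [income_category(x) for x in incomes]
print(categories)
```

After the conversion the categories can still be ranked, but the exact differences and ratios between incomes are no longer available, which is what makes the new variable ordinal.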
1.3. Data Collection and Sampling Techniques
Sampling Techniques
EXERCISES 2.1
H. Experts say that mortgage rates may soon hit bottom (Source: USA
TODAY ).
2. Classify each as nominal-level, ordinal-level, interval-level, or ratio-level
measurement.
A. Pages in the city of Cleveland telephone book.
B. Rankings of tennis players.
C. Weights of air conditioners.
D. Temperatures inside 10 refrigerators.
E. Salaries of the top five CEOs in the United States.
F. Ratings of eight local plays (poor, fair, good, excellent).
G. Times required for mechanics to do a tune-up.
H. Ages of students in a classroom.
I. Marital status of patients in a physician’s office.
J. Horsepower of tractor engines.
3. Classify each variable as qualitative or quantitative.
A. Number of bicycles sold in 1 year by a large sporting goods store.
B. Colors of baseball caps in a store.
C. Times it takes to cut a lawn.
D. Capacity in cubic feet of six truck beds.
E. Classification of children in a day care center (infant, toddler,
preschool).
F. Weights of fish caught in Lake George.
G. Marital status of faculty members in a large university.
4. Classify each variable as discrete or continuous.
A. Number of doughnuts sold each day by Doughnut Heaven.
B. Water temperatures of six swimming pools in Pittsburgh on a given
day.
C. Weights of cats in a pet shelter.
D. Lifetime (in hours) of 12 flashlight batteries.
E. Number of cheeseburgers sold each day by a hamburger stand on a
college campus.
F. Number of DVDs rented each day by a video store.
G. Capacity (in gallons) of six reservoirs in Jefferson County.
5. Give three examples each of nominal, ordinal, interval, and ratio data.
6. For each of these statements, define a population and state how a sample
might be obtained.
A. The average cost of an airline meal is $4.55 (Source: Everything Has
Its Price, Richard E. Donley, Simon and Schuster).
B. More than 1 in 4 United States children have cholesterol levels of 180
milligrams or higher (Source: The American Health Foundation).
C. Every 10 minutes, 2 people die in car crashes and 170 are injured
(Source: National Safety Council estimates).
D. When older people with mild to moderate hypertension were given
mineral salt for 6 months, the average blood pressure reading dropped
by 8 points systolic and 3 points diastolic (Source: Prevention).
E. The average amount spent per gift for Mom on Mother’s Day is $25.95
(Source: The Gallup Organization).
7. Select a newspaper or magazine article that involves a statistical study,
and write a paper answering these questions.
A. Is this study descriptive or inferential? Explain your answer.
B. What are the variables used in the study? In your opinion, what level of
measurement was used to obtain the data from the variables?
C. Does the article define the population? If so, how is it defined? If not,
how could it be defined?
D. Does the article state the sample size and how the sample was
obtained? If so, determine the size of the sample and explain how it
was selected. If not, suggest a way it could have been obtained.
E. Explain in your own words what procedure (survey, comparison of
groups, etc.) might have been used to determine the study’s
conclusions.
F. Do you agree or disagree with the conclusions? State your reasons.
8. Information from research studies is sometimes taken out of context.
Explain why the claims of these studies might be suspect.
A. The average salary of the graduates of the class of 1980 is $32,500.
B. It is estimated that in Podunk there are 27,256 cats.
C. Only 3% of the men surveyed read Cosmopolitan magazine.
D. Based on a recent mail survey, 85% of the respondents favored gun
control.
E. A recent study showed that high school dropouts drink more coffee
than students who graduated; therefore, coffee dulls the brain.
F. Since most automobile accidents occur within 15 miles of a person’s
residence, it is safer to make long trips.
17. Identify each study as being either observational or experimental.
a. Subjects were randomly assigned to two groups, and one group was
given an herb and the other group a placebo. After 6 months, the
numbers of respiratory tract infections each group had were compared.
b. A researcher stood at a busy intersection to see if the color of the
automobile that a person drives is related to running red lights.
1 2 6 7 12 13 2 6 9 5
18 7 3 15 15 4 17 1 14 5
4 16 4 5 8 6 6 18 5 2
9 11 12 1 9 2 10 11 4 10
9 18 8 8 4 14 7 3 2 6
To construct the frequency distribution, we first arrange the set of data into
an array. An array is an arrangement of the data from highest to lowest or from
lowest to highest. A frequency distribution consists of classes and their
corresponding frequencies. Each raw data value is placed into a category called
a class. The class frequency is the number of observations belonging to a class
interval, that is, the number of items within a category.
The frequency distribution of the above set of data is shown below.
Class Limits (in miles)    Frequency
 1 - 3                     10
 4 - 6                     14
 7 - 9                     10
10 - 12                     6
13 - 15                     5
16 - 18                     5
Total = 50
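The tally behind this table can be reproduced with a short Python sketch. The 50 values are the ones listed in the data set above; the names `miles`, `CLASS_WIDTH`, `LOWEST_LIMIT`, and `class_limits` are introduced here for illustration only.

```python
from collections import Counter

# Distances (in miles) copied from the data set above.
miles = [
    1, 2, 6, 7, 12, 13, 2, 6, 9, 5,
    18, 7, 3, 15, 15, 4, 17, 1, 14, 5,
    4, 16, 4, 5, 8, 6, 6, 18, 5, 2,
    9, 11, 12, 1, 9, 2, 10, 11, 4, 10,
    9, 18, 8, 8, 4, 14, 7, 3, 2, 6,
]

CLASS_WIDTH = 3   # each class covers three whole-number values
LOWEST_LIMIT = 1  # lower limit of the first class

def class_limits(value):
    """Return the (lower, upper) class limits of the class containing value."""
    index = (value - LOWEST_LIMIT) // CLASS_WIDTH
    lower = LOWEST_LIMIT + index * CLASS_WIDTH
    return (lower, lower + CLASS_WIDTH - 1)

# Count how many observations fall into each class.
frequency = Counter(class_limits(v) for v in miles)

for (lower, upper), count in sorted(frequency.items()):
    print(f"{lower:>2} - {upper:<2}  {count}")
print("Total =", sum(frequency.values()))
```

Sorting the raw values first, as the text describes, is not required for counting, but `sorted(miles)` gives the array if you want to inspect it.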
Using this table, general observations can be made. For example, it can
be gleaned from the table that the majority of the employees live within 9 miles
of the company.
Find the range of the scores in the given data. The range is the difference
between the highest and the lowest number: $R = H - L$.
Note that if a series contains fewer than 50 cases, 10 classes or fewer are
enough. If a series contains 50 to 100 cases, 10 to 15 classes are
recommended. If it contains more than 100 cases, 15 or more classes are suitable.
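As a rough sketch, the range and the rule of thumb for the number of classes can be written as two small Python functions. The function names and the seven scores are hypothetical; only the range definition and the class-count guideline come from the text.

```python
def data_range(values):
    """Range: the highest value minus the lowest value."""
    return max(values) - min(values)

def suggested_class_count(n_cases):
    """Rule of thumb from the text, as a (minimum, maximum) pair.

    None means the guideline gives no bound on that side.
    """
    if n_cases < 50:
        return (None, 10)
    if n_cases <= 100:
        return (10, 15)
    return (15, None)

scores = [12, 27, 45, 9, 33, 27, 18]  # made-up scores for illustration
print(data_range(scores))                    # highest minus lowest
print(suggested_class_count(len(scores)))    # fewer than 50 cases
print(suggested_class_count(50))             # 50 to 100 cases
```

The guideline is only a starting point; the final number of classes also depends on how the data are spread.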
3. Tally the data and find the numerical frequencies from the tallies.
Class Intervals (Scores)    Frequency
45 - 47                      3
42 - 44                      4
39 - 41                      4
36 - 38                      4
33 - 35                      2
30 - 32                      3
27 - 29                     13
24 - 26                      8
21 - 23                      3
18 - 20                      3
15 - 17                      0
12 - 14                      2
 9 - 11                      1
N = 50
After all the data have been organized into a frequency distribution, they
can be presented in graphical form. Graphical representations of data are
helpful tools for conveying the mathematical relation of one variable to another.
[Figure: histogram of the scores. Vertical axis: Frequency (1 to 14).
Horizontal axis: Class Boundaries (8.5, 11.5, 14.5, . . . , 47.5).]
B. Frequency Polygon.
Steps in making Frequency Polygon
1. Label the points on the base line.
2. Plot the midpoints. Scores within the interval are concentrated on the
midpoint.
3. When all points are plotted, join them with a series of short line segments.
The figure above shows the frequency polygon. Notice that the class boundaries
are plotted along the x-axis and the frequencies along the y-axis. Each
point on the line is plotted at the class mark, or midpoint, of each class interval.
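The class boundaries and class marks for the score distribution can be computed as in the Python sketch below. It assumes the scores are whole numbers, so each boundary lies half a unit outside the class limits; the function names are introduced here for illustration.

```python
# Class limits of the score distribution, lowest class first:
# (9, 11), (12, 14), ..., (45, 47).
class_limits = [(low, low + 2) for low in range(9, 46, 3)]

def boundaries(lower, upper):
    """Class boundaries, assuming whole-number scores (half-unit rule)."""
    return (lower - 0.5, upper + 0.5)

def midpoint(lower, upper):
    """Class mark: the average of the two class limits."""
    return (lower + upper) / 2

for lower, upper in class_limits:
    print((lower, upper), boundaries(lower, upper), midpoint(lower, upper))
```

For the lowest class, 9 - 11, this gives boundaries of 8.5 and 11.5 and a class mark of 10.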
Example 2: The data below show the record high temperatures observed for
each of the 50 provinces in the country. Construct the histogram, frequency
polygon, and cumulative frequency graph (ogive).
Class Boundaries (in degrees Celsius)    Frequency
 99.5 - 104.5                             2
104.5 - 109.5                             8
109.5 - 114.5                            18
114.5 - 119.5                            13
119.5 - 124.5                             7
124.5 - 129.5                             1
129.5 - 134.5                             1
N = 50
Solution:
A. Histogram
1. Draw and label the x and y axes.
2. Represent the frequency on the y-axis and the class boundaries on the
x-axis.
3. Using the frequencies as the heights, draw vertical bars for each class.
B. Frequency Polygon.
1. Find the midpoints for each class.
Class Boundaries (in degrees Celsius)    Midpoint    Frequency
 99.5 - 104.5                            102          2
104.5 - 109.5                            107          8
109.5 - 114.5                            112         18
114.5 - 119.5                            117         13
119.5 - 124.5                            122          7
124.5 - 129.5                            127          1
129.5 - 134.5                            132          1
N = 50
2. Draw and label the x and y axes. Label the x-axis with the midpoints
of each class, and use a suitable scale on the y-axis for the
frequencies.
3. Using the midpoint for the x value and the frequencies as the y
values, plot the points.
4. Connect the adjacent points with line segments.
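The points of the frequency polygon can be computed with the Python sketch below, which follows the steps above for the temperature data. The variable names are illustrative, and 129.5 is used as the upper boundary of the sixth class so that it meets the lower boundary of the seventh.

```python
# Class boundaries and frequencies from Example 2.
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Step 1: the midpoint of each class is the average of its two boundaries.
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

# Step 3: the points to plot, as (x, y) = (midpoint, frequency).
points = list(zip(midpoints, frequencies))

for x, y in points:
    print(x, y)
```

Joining consecutive entries of `points` with line segments gives the polygon; any plotting tool can be used for the drawing itself.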
C. The Cumulative Frequency (Ogive) Graph
1. Find the cumulative frequency for each class.
Class Boundaries (in degrees Celsius)    Midpoint    Frequency    Cumulative Frequency
 99.5 - 104.5                            102          2            2
104.5 - 109.5                            107          8           10
109.5 - 114.5                            112         18           28
114.5 - 119.5                            117         13           41
119.5 - 124.5                            122          7           48
124.5 - 129.5                            127          1           49
129.5 - 134.5                            132          1           50
N = 50
2. Draw and label the x and y axes. Label the x-axis with the class
boundaries. Use an appropriate scale on the y-axis to represent the
cumulative frequencies.
3. Plot the cumulative frequency at each upper class boundary. Upper
boundaries are used since the cumulative frequencies represent the
number of data values accumulated up to the upper boundary of each
class.
4. Connect the adjacent points with line segments.
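The cumulative frequencies and the points of the ogive can be obtained as in the following Python sketch. The variable names are illustrative; the boundaries and frequencies are those of Example 2, with 129.5 taken as the upper boundary of the sixth class.

```python
from itertools import accumulate

# Upper class boundaries and class frequencies from Example 2.
upper_boundaries = [104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Step 1: running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Step 3: plot each cumulative frequency at its upper class boundary.
ogive_points = list(zip(upper_boundaries, cumulative))

for x, y in ogive_points:
    print(x, y)
```

The last cumulative frequency should always equal N, which is a quick check on the arithmetic.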
EXERCISES 2.2
4. How many classes should frequency distributions have? Why should the
class width be an odd number?
5. Shown here are four frequency distributions. Each is incorrectly
constructed. State the reason why.
A. Class Frequency
27–32 1
33–38 0
39–44 6
45–49 4
50–55 2
B. Class Frequency
5–9 1
9–13 2
13–17 5
17–20 6
20–24 3
C. Class Frequency
123–127 3
128–132 7
138–142 2
143–147 19
D. Class Frequency
9–13 1
14–19 6
20–25 2
26–28 5
29–32 9
7. State Gasoline Tax The state gas tax in cents per gallon for 25 states is
given below. Construct a grouped frequency distribution and a cumulative
frequency distribution with 5 classes.
8. Weights of the NBA’s Top 50 Players Listed are the weights of the NBA’s
top 50 players. Construct a grouped frequency distribution and a
cumulative frequency distribution with 8 classes. Analyze the results in
terms of peaks, extreme values, etc.
240 210 220 260 250 195 230 270 325 225 165
295 205 230 250 210 220 210 230 202 250 265
230 210 240 245 225 180 175 215 215 235 245
250 215 210 195 240 240 225 260 210 190 260
230 190 210 230 185 260
Source: www.msn.foxsports.com
88 88 110 88 80 69 102 78 70 55 79
85 80 100 60 90 77 55 75 55 54 60
75 64 105 56 71 70 65 72
Source: New York Times Almanac.
767 770 761 760 771 768 776 771 756 770 763
760 747 766 754 771 771 778 766 762 780 750
746 764 769 759 757 753 758 746
Source: U.S. News & World Report Best Graduate Schools.