Descriptive Statistics: Tabular and Graphical Methods: McGraw-Hill/Irwin


Chapter 2

Descriptive Statistics: Tabular and Graphical Methods
McGraw-Hill/Irwin Copyright 2011 by The McGraw-Hill Companies, Inc. All rights reserved.
Descriptive Statistics
2.1 Graphically Summarizing Qualitative Data
2.2 Graphically Summarizing Quantitative
Data
2.3 Dot Plots
2.4 Stem-and-Leaf Displays
2.5 Crosstabulation Tables (Optional)
2.6 Scatter Plots (Optional)
2.7 Misleading Graphs and Charts (Optional)
2-2
2.1 Graphically Summarizing
Qualitative Data
With qualitative data, names identify the
different categories
This data can be summarized using a
frequency distribution
Frequency distribution: A table that
summarizes the number of items in each of
several non-overlapping classes
LO 1: Summarize
qualitative data by using
frequency distributions,
bar charts, and pie
charts.
2-3
Example 2.1: Describing 2006
Jeep Purchasing Patterns
Table 2.1 lists all 251 vehicles sold in 2006
by the greater Cincinnati Jeep dealers
Table 2.1 does not reveal much useful
information
A frequency distribution is a useful summary
Simply count the number of times each model
appears in Table 2.1
LO1
2-4
Relative Frequency and
Percent Frequency
Relative frequency summarizes the
proportion of items in each class
For each class, divide the frequency of the
class by the total number of observations
Multiply by 100 to obtain the percent
frequency
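As a minimal sketch of the counting and dividing described above, the snippet below builds frequency, relative frequency, and percent frequency tables. The list `models` is a small hypothetical sample, not the actual Table 2.1 data.

```python
from collections import Counter

# Hypothetical sample of purchased models (NOT the real Table 2.1 data).
models = ["Wrangler", "Liberty", "Commander", "Wrangler", "Grand Cherokee",
          "Liberty", "Wrangler", "Grand Cherokee", "Liberty", "Wrangler"]

# Frequency: count how many times each model appears.
frequency = Counter(models)
n = len(models)

# Relative frequency: class frequency divided by the total number of observations.
relative_frequency = {model: count / n for model, count in frequency.items()}

# Percent frequency: relative frequency multiplied by 100.
percent_frequency = {model: 100 * rf for model, rf in relative_frequency.items()}

for model in frequency:
    print(f"{model:15s} {frequency[model]:3d} "
          f"{relative_frequency[model]:.2f} {percent_frequency[model]:.1f}%")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on the tally.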
LO1
2-5
Bar Charts and Pie Charts
Bar chart: A vertical or horizontal rectangle
represents the frequency for each category
Height can be frequency, relative frequency, or
percent frequency
Pie chart: A circle divided into slices where
the size of each slice represents its relative
frequency or percent frequency
LO1
2-6
Excel Bar and Pie Chart of
the Jeep Sales Data
LO1
2-7
Pareto Chart
Pareto chart: A bar chart having the different
kinds of defects listed on the horizontal scale
Bar height represents the frequency of
occurrence
Bars are arranged in decreasing height from left
to right
Sometimes augmented by plotting a cumulative
percentage point for each bar
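The ordering and the cumulative percentage points could be computed along these lines. The defect names and counts in `defects` are made up for illustration and are not the labeling data shown on the next slide.

```python
# Hypothetical defect counts (illustrative only).
defects = {
    "Smeared print": 22,
    "Crooked label": 78,
    "Wrinkled label": 14,
    "Missing label": 36,
}

# Arrange the bars in decreasing height from left to right.
ordered = sorted(defects.items(), key=lambda item: item[1], reverse=True)
total = sum(defects.values())

# For each bar, record the cumulative percentage up to and including it.
pareto = []
running = 0
for name, count in ordered:
    running += count
    pareto.append((name, count, 100 * running / total))

for name, count, cumulative_percent in pareto:
    print(f"{name:15s} {count:4d} {cumulative_percent:6.1f}%")
```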
LO2: Construct and
interpret Pareto
charts.
2-8
Excel Frequency Table and Pareto
Chart of Labeling Defects
LO2
2-9
2.2 Graphically Summarizing
Quantitative Data
We often need to summarize and describe the
shape of the distribution
One way is to group the measurements into
classes of a frequency distribution and then
display the data in the form of a histogram
LO3 Summarize quantitative
data by using frequency
distributions, histograms,
frequency polygons, and
ogives.
2-10
Frequency Distribution
A frequency distribution is a list of data
classes with the count of values that belong
to each class
Classify and count
The frequency distribution is a table
Show the frequency distribution in a
histogram
The histogram is a picture of the frequency
distribution
LO3
2-11
Constructing a Frequency
Distribution
Steps in making a frequency distribution:
1. Find the number of classes
2. Find the class length
3. Form non-overlapping classes of equal width
4. Tally and count
5. Graph the histogram
LO3
2-12
Example 2.2 The Payment Time
Case: A Sample of Payment Times
22 29 16 15 18 17 12 13 17 16 15
19 17 10 21 15 14 17 18 12 20 14
16 15 16 20 22 14 25 19 23 15 19
18 23 22 16 16 19 13 18 24 24 26
13 18 17 15 24 15 17 14 18 17 21
16 21 25 19 20 27 16 17 16 21
LO3
2-13
Number of Classes
Group all n measurements into K classes
K is the smallest whole number for which $2^K > n$
In Example 2.2, n = 65
For K = 6, $2^6 = 64 < n$
For K = 7, $2^7 = 128 > n$
So use K = 7 classes
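A small sketch of this rule, assuming it is the strict inequality 2^K > n (which agrees with the worked numbers for K = 6 and K = 7). The function name `number_of_classes` is illustrative.

```python
def number_of_classes(n):
    """Return the smallest whole number K with 2**K > n (assumed strict inequality)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


print(number_of_classes(65))  # 2**6 = 64 is not greater than 65, 2**7 = 128 is
```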
LO3
2-14
Class Length
Find the length of each class as the largest
measurement minus the smallest divided by
the number of classes found earlier (K)
For Example 2.2, (29 - 10)/7 = 2.7143
Because payment times are measured in days,
round up to three days
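The class length calculation can be checked directly on the 65 payment times of Example 2.2. Rounding upward with `math.ceil` is an assumption here; it is chosen so that K classes of this length are sure to cover the whole range of the data.

```python
import math

# The 65 payment times (in days) from Example 2.2.
payment_times = [
    22, 29, 16, 15, 18, 17, 12, 13, 17, 16, 15,
    19, 17, 10, 21, 15, 14, 17, 18, 12, 20, 14,
    16, 15, 16, 20, 22, 14, 25, 19, 23, 15, 19,
    18, 23, 22, 16, 16, 19, 13, 18, 24, 24, 26,
    13, 18, 17, 15, 24, 15, 17, 14, 18, 17, 21,
    16, 21, 25, 19, 20, 27, 16, 17, 16, 21,
]

k = 7  # number of classes found earlier

# (largest measurement - smallest measurement) / number of classes
raw_length = (max(payment_times) - min(payment_times)) / k

# Payment times are whole days, so round up to a whole number of days.
class_length = math.ceil(raw_length)

print(round(raw_length, 4), class_length)
```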
LO3
2-15
Form Non-Overlapping
Classes of Equal Width
The classes start on the smallest value
This is the lower limit of the first class
The upper limit of the first class is smallest
value + class length
In the example, the first class starts at 10 days
and goes up to 13 days
The next class starts at this upper limit and
goes up by class length
And so on
LO3
2-16
Tally and Count the Number of
Measurements in Each Class
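The tally for Example 2.2 can be sketched as follows. It assumes each class includes its lower limit and excludes its upper limit (10 up to but not including 13, and so on), which keeps the classes non-overlapping.

```python
# The 65 payment times (in days) from Example 2.2.
payment_times = [
    22, 29, 16, 15, 18, 17, 12, 13, 17, 16, 15,
    19, 17, 10, 21, 15, 14, 17, 18, 12, 20, 14,
    16, 15, 16, 20, 22, 14, 25, 19, 23, 15, 19,
    18, 23, 22, 16, 16, 19, 13, 18, 24, 24, 26,
    13, 18, 17, 15, 24, 15, 17, 14, 18, 17, 21,
    16, 21, 25, 19, 20, 27, 16, 17, 16, 21,
]

start, class_length, k = 10, 3, 7

# Classes of equal width: (10, 13), (13, 16), ..., (28, 31).
classes = [(start + i * class_length, start + (i + 1) * class_length)
           for i in range(k)]

# Count the measurements with lower <= value < upper (assumed convention).
frequencies = [sum(lower <= t < upper for t in payment_times)
               for lower, upper in classes]

for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower} < {upper}: {f:2d} {'*' * f}")
```

The row of asterisks printed for each class is a rough text version of the frequency histogram.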
LO3
2-17
Histogram
Rectangles represent the classes
The base represents the class length
The height represents
the frequency in a frequency histogram, or
the relative frequency in a relative frequency
histogram
LO3
2-18
Histograms
LO3
2-19
Some Common Distribution
Shapes
Skewed to the right: The right tail of the
histogram is longer than the left tail
Skewed to the left: The left tail of the
histogram is longer than the right tail
Symmetrical: The right and left tails of the
histogram appear to be mirror images of each
other
LO3
2-20
Skewed Distribution
LO3
[Figure panels: Right Skewed; Left Skewed]
2-21
Frequency Polygons
Plot a point above each class midpoint at a
height equal to the frequency of the class
Useful when comparing two or more
distributions
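A sketch of the points a frequency polygon would plot for the payment time data. The class limits and frequencies below are those obtained by tallying the Example 2.2 payment times into seven three-day classes starting at 10 days, with each class assumed to exclude its upper limit.

```python
# Seven classes of width 3 starting at 10, with their frequencies.
classes = [(10 + 3 * i, 13 + 3 * i) for i in range(7)]
frequencies = [3, 14, 23, 12, 8, 4, 1]

# One point per class: x = class midpoint, y = class frequency.
polygon_points = [((lower + upper) / 2, f)
                  for (lower, upper), f in zip(classes, frequencies)]

print(polygon_points)
```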
LO3
2-22
Cumulative Distributions
Another way to summarize a distribution is to
construct a cumulative distribution
To do this, use the same number of classes, class
lengths, and class boundaries used for the
frequency distribution
Rather than a count, we record the number of
measurements that are less than the upper
boundary of that class
In other words, a running total
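The running total can be sketched with `itertools.accumulate`. The frequencies are those obtained by tallying the Example 2.2 payment times into seven three-day classes starting at 10 days; the upper boundaries assume each class excludes its upper limit.

```python
from itertools import accumulate

upper_boundaries = [13, 16, 19, 22, 25, 28, 31]
frequencies = [3, 14, 23, 12, 8, 4, 1]

# Cumulative frequency: number of measurements below each upper boundary.
cumulative = list(accumulate(frequencies))

# Cumulative percent frequency, as a share of all measurements.
n = cumulative[-1]
cumulative_percent = [100 * c / n for c in cumulative]

for boundary, c, pct in zip(upper_boundaries, cumulative, cumulative_percent):
    print(f"less than {boundary}: {c:2d} ({pct:.1f}%)")
```

Pairing each upper boundary with its cumulative value gives the points that an ogive connects.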
LO3
2-23
Various Frequency
Distribution
LO3
2-24
Ogive
Ogive: A graph of a cumulative distribution
Plot a point above each upper class boundary at
height of cumulative frequency
Connect points with line segments
Can also be drawn using
Cumulative relative frequencies
Cumulative percent frequencies
LO3
2-25
2.3 Dot Plots
LO4 Construct and
interpret dot plots.
2-26
2.4 Stem-and-Leaf
Displays
Purpose is to see the overall pattern of the
data, by grouping the data into classes
the variation from class to class
the amount of data in each class
the distribution of the data within each class
Best for small to moderately sized data
distributions
LO5 Construct and
interpret stem-and-
leaf displays.
2-27
Car Mileage Example
Refer to the Car Mileage Case
Data in Table 2.14
The stem-and-leaf display:

29 8
30 13455677888
31 0012334444455667778899
32 01112334455778
33 03
33 + 0.3 = 33.3
29 + 0.8 = 29.8
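A minimal sketch of how such a display can be built, assuming values recorded to one decimal place so that the stem is the whole-number part and the leaf is the tenths digit. The function name `stem_and_leaf` is illustrative, and `mileages` holds only a few of the values readable from the display above, not the full Table 2.14 data.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each whole-number stem to a string of its sorted tenths-digit leaves."""
    display = defaultdict(list)
    for v in sorted(values):
        tenths = round(v * 10)           # work in integer tenths to avoid float noise
        stem, leaf = divmod(tenths, 10)  # e.g. 298 -> stem 29, leaf 8
        display[stem].append(leaf)
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in display.items()}


# A few mileages read off the display (a subset, for illustration).
mileages = [30.8, 29.8, 33.3, 30.1, 31.0, 33.0, 30.3, 31.0, 32.0]

for stem, leaves in stem_and_leaf(mileages).items():
    print(stem, leaves)
```

This sketch lists only stems that actually occur; a stem with no data would need to be added by hand to keep the display's shape honest.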
LO5
2-28
Car Mileage: Results
Looking at the stem-and-leaf display, the
distribution appears almost symmetrical
The upper portion (29, 30, 31) is almost a mirror
image of the lower portion of the display (31, 32,
33)
Stems 31, 32*, 32, and 33*
But not exactly a mirror reflection
LO5
2-29
Constructing a Stem-and-
Leaf Display
There are no rules that dictate the number of
stem values
Can split the stems as needed
LO5
2-30
2.5 Crosstabulation
Tables (Optional)
Classifies data on two dimensions
Rows classify according to one dimension
Columns classify according to a second
dimension
Requires three variables
1. The row variable
2. The column variable
3. The variable counted in the cells
LO6 Examine the
relationships between
variables by using cross-
tabulation tables.
(Optional)
2-31
Bond Fund Satisfaction
Survey
Raw data in Table 2.16
Fund Type              High  Medium  Low  Total
Bond Fund                15      12    3     30
Stock Fund               24       4    2     30
Tax Deferred Annuity      1      24   15     40
Total                    40      40   20    100
LO6
2-32
More on Crosstabulation
Tables
Row totals provide a frequency distribution
for the different fund types
Column totals provide a frequency
distribution for the different satisfaction levels
Main purpose is to investigate possible
relationships between variables
LO6
2-33
Percentages
One way to investigate relationships is to
compute row and column percentages
Compute row percentages by dividing each cell's
frequency by its row total and expressing as a
percentage
Compute column percentages by dividing by the
column total
LO6
2-34
Row Percentage for Each
Fund Type
Raw data in Table 2.16
Fund Type              High   Medium  Low    Total
Bond Fund              50.0%  40.0%   10.0%  100%
Stock Fund             80.0%  13.3%    6.7%  100%
Tax Deferred Annuity    2.5%  60.0%   37.5%  100%
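The row percentages above can be reproduced from the cell counts of the bond fund satisfaction crosstabulation, and the column percentages follow the same pattern. This is a sketch; the nested-dictionary layout and the variable names are illustrative choices.

```python
# Cell counts from the bond fund satisfaction crosstabulation.
counts = {
    "Bond Fund": {"High": 15, "Medium": 12, "Low": 3},
    "Stock Fund": {"High": 24, "Medium": 4, "Low": 2},
    "Tax Deferred Annuity": {"High": 1, "Medium": 24, "Low": 15},
}

# Row percentage: cell frequency divided by its row total, times 100.
row_percent = {
    fund: {level: round(100 * c / sum(row.values()), 1) for level, c in row.items()}
    for fund, row in counts.items()
}

# Column percentage: cell frequency divided by its column total, times 100.
levels = ["High", "Medium", "Low"]
column_totals = {level: sum(row[level] for row in counts.values()) for level in levels}
column_percent = {
    fund: {level: round(100 * row[level] / column_totals[level], 1) for level in levels}
    for fund, row in counts.items()
}

for fund, row in row_percent.items():
    print(fund, row)
```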
LO6
2-35
Types of Variables
In the bond fund example, we crosstabulated
two qualitative variables
Can use a quantitative variable versus a
qualitative variable or two quantitative
variables
With quantitative variables, often define
categories
LO6
2-36
2.6 Scatter Plots (Optional)
Used to study relationships between two
variables
Place one variable on the x-axis
Place a second variable on the y-axis
Place a dot at the coordinates of each pair of values
LO7 Examine the
relationships between
variables by using scatter
plots (Optional).
2-37
Types of Relationships
Linear: A straight line relationship between
the two variables
Positive: When one variable goes up, the other
variable goes up
Negative: When one variable goes up, the other
variable goes down
No Linear Relationship: There is
no coordinated linear movement
between the two variables
LO7
2-38
2.7 Misleading Graphs
and Charts: (Optional)
Mean Salaries at a Major University, 2004 - 2007
Break the vertical scale to exaggerate effect
LO8 Recognize misleading
graphs and charts (optional).
2-39
Horizontal Scale Effects
Mean Salary Increases at a Major University, 2004 - 2007
Compress vs. stretch the horizontal scales to exaggerate
or minimize the effect
LO8
2-40
