Applied Statistics and Data Analysis For Engineers (2143B) : Lecture 2: Graphical Summaries


Applied Statistics and Data Analysis for Engineers

(2143B)
Lecture 2 : Graphical Summaries
January 9, 2017
Three Branches of Probability and Statistics …

• Descriptive Statistics

• Inferential Statistics

• Probability
Descriptive Statistics
• Sample data are the numeric observations of a phenomenon of interest.

• We gain an understanding of this collection by describing it numerically and graphically.

• We describe the collection in terms of shape, center, spread, and outliers.
Graphical Summaries

• Quantitative data
– Stem-and-Leaf Plots
– Dot Plots
– Histograms
– Pareto Charts
– Box Plots (Later)

• Qualitative/Categorical data
– Bar Charts
What is a Dot Plot?
A dot plot is a graph that shows the distribution of a quantitative
variable above a number line using small periods, dots, circles, or
x’s. It displays a single quantitative variable: each mark is one
observation, and the height of a stack shows how often that value occurs.
Axes on a dot plot
A dot plot only has an x-axis.
The y-axis is never drawn.
Advantage of a dot plot
Moderate amounts of quantitative data can be quickly visualized

Fuel Consumption—Data

Fuel Consumption for 2009 Passenger Fords


30 27 22 25 24 25 24 15
35 35 33 49 49 10 27 18
20 23 24 25 30 24 24 24
18 20 25 27 24 32 29 27
24 27 26 25 24 28 33 30
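
As a rough illustration, the dot plot for this data can be sketched in plain text with a few lines of Python. The function name `dot_plot_rows` and the text layout are assumptions made for this sketch, not part of the lecture; the values are the 40 observations listed above.

```python
from collections import Counter

# Fuel consumption values (miles per gallon) copied from the table above.
mpg = [
    30, 27, 22, 25, 24, 25, 24, 15,
    35, 35, 33, 49, 49, 10, 27, 18,
    20, 23, 24, 25, 30, 24, 24, 24,
    18, 20, 25, 27, 24, 32, 29, 27,
    24, 27, 26, 25, 24, 28, 33, 30,
]

def dot_plot_rows(values):
    """Return (counts, rows): value counts and the lines of a text dot plot.

    Hypothetical helper: one text column per integer value between the
    minimum and the maximum, one dot per observation.
    """
    counts = Counter(values)
    lo, hi = min(values), max(values)
    width = hi - lo + 1
    rows = []
    # Build the stacks from the tallest level down to the number line.
    for level in range(max(counts.values()), 0, -1):
        row = "".join("." if counts[v] >= level else " "
                      for v in range(lo, hi + 1))
        rows.append(row.rstrip())
    rows.append("-" * width)  # the number line (x-axis only)
    gap = width - len(str(lo)) - len(str(hi))
    rows.append(f"{lo}{' ' * gap}{hi}")  # label both ends of the axis
    return counts, rows

counts, rows = dot_plot_rows(mpg)
print("\n".join(rows))
```

The tallest stack sits above 24 miles per gallon, and the two isolated dots at 49 stand out as possible outliers.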

Fuel Consumption—Dot Plot
Fuel Consumption for 2009 Passenger Fords

[Figure: dot plot of the fuel consumption data; x-axis: miles per gallon, 10 to 50 in steps of 2. Source: Consumer Reports]
Histograms
• Provides a visual display of the distribution of quantitative variables by reducing (or
summarizing) the variable to a categorical one
• We must divide the range of the data into intervals, which are usually called class
intervals, cells, or bins. If possible, the bins should be of equal width to enhance
the visual information in the histogram.
• A histogram is a visual display of a frequency distribution, similar to a bar chart or a
stem-and-leaf diagram.
• Steps to construct a histogram with equal bin widths:
1) Label the bin boundaries on the horizontal scale.
2) Mark and label the vertical scale with the frequencies or relative frequencies.
3) Above each bin, draw a rectangle whose height is equal to the frequency
corresponding to that bin.
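
The three steps can be sketched as a small text-based drawing routine. This is only an illustration: the function `text_histogram` and its `col_width` parameter are hypothetical, and the bin boundaries and frequencies are the ones listed on the Histogram Construction slide of this lecture (20 to 80 in classes of width 10).

```python
def text_histogram(edges, freqs, col_width=6):
    """Return the lines of a vertical text histogram with equal-width bins."""
    if len(edges) != len(freqs) + 1:
        raise ValueError("need one more bin boundary than frequencies")
    lines = []
    # Steps 2 and 3: one printed row per frequency level, tallest first.
    # A bin gets a block on this row if its frequency reaches the level,
    # so the height of each rectangle equals the bin's frequency.
    for level in range(max(freqs), 0, -1):
        bars = "".join(
            ("#" * (col_width - 1) + " ") if f >= level else " " * col_width
            for f in freqs
        )
        lines.append(f"{level:>3} |" + bars.rstrip())
    # Step 1: bin boundaries labelled along the horizontal scale.
    lines.append("    +" + "-" * (col_width * len(freqs)))
    lines.append("     " + "".join(f"{e:<{col_width}}" for e in edges).rstrip())
    return lines

edges = [20, 30, 40, 50, 60, 70, 80]   # bin boundaries
freqs = [6, 18, 11, 11, 3, 1]          # frequency of each bin
lines = text_histogram(edges, freqs)
print("\n".join(lines))
```

A plotting library would normally be used instead; the text version just makes each step explicit.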
Frequency Distributions
• A frequency distribution is a compact summary of data,
expressed as a table.
• The data is gathered into bins or cells, defined by class
intervals.
• The number of classes, multiplied by the class width,
should exceed the range of the data. The square root of
the sample size is a rough guide to the number of classes.
• The boundaries of the class intervals should be convenient
values, as should the class width.
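
A minimal sketch of building such a table, using the fuel consumption data from earlier in the lecture. The helper `frequency_distribution` is hypothetical, and the choice of 8 classes of width 5 starting at 10 is an assumption made here for convenient boundaries: 8 × 5 = 40 exceeds the data range of 39, and 8 is close to the square-root guide of about 6.3.

```python
import math

# Fuel consumption values (miles per gallon) from the earlier slide.
mpg = [
    30, 27, 22, 25, 24, 25, 24, 15,
    35, 35, 33, 49, 49, 10, 27, 18,
    20, 23, 24, 25, 30, 24, 24, 24,
    18, 20, 25, 27, 24, 32, 29, 27,
    24, 27, 26, 25, 24, 28, 33, 30,
]

def frequency_distribution(values, start, width, num_classes):
    """Count values in equal-width classes [lower, upper)."""
    # The classes together must cover every observation.
    if start > min(values) or start + width * num_classes <= max(values):
        raise ValueError("classes do not cover the data")
    counts = [0] * num_classes
    for v in values:
        counts[int((v - start) // width)] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

guide = math.sqrt(len(mpg))  # rough guide to the number of classes
table = frequency_distribution(mpg, start=10, width=5, num_classes=8)
for lower, upper, count in table:
    print(f"{lower}-under {upper}: {count}")
```

Each class includes its lower boundary and excludes its upper one, matching the "20-under 30" style of labelling used later in the lecture.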
Unemployment Rate Data
Frequency Distribution of Unemployment Rate Data
Constructing a Frequency Distribution
• The number of classes should be between 5 and 15.
– Fewer than 5 classes cause excessive summarization.
– More than 15 classes leave too much detail.
• Class Width
– Divide the range by the number of classes for an approximate class width
– Round up to a convenient number
– So if the number of classes is 6, then
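
As a rough illustration of the class-width rule, the sketch below divides the range by the number of classes and rounds up to a convenient multiple. It does not reproduce the slide's own example: the inputs are the minimum (10) and maximum (49) of the fuel consumption data from earlier in the lecture, and the function `class_width` with its `step` parameter is a hypothetical helper.

```python
import math

def class_width(low, high, num_classes, step=1):
    """Approximate class width, rounded up to a multiple of `step`."""
    data_range = high - low
    width = math.ceil(data_range / num_classes / step) * step
    # The classes together should exceed the range, so widen once more
    # if the rounded width only just matches it.
    if width * num_classes <= data_range:
        width += step
    return width

# Fuel consumption data: minimum 10, maximum 49, so the range is 39.
print(class_width(10, 49, num_classes=6))          # 39 / 6 = 6.5, rounded up to 7
print(class_width(10, 49, num_classes=6, step=5))  # rounded up to a multiple of 5
```

Which multiple counts as "convenient" is a judgment call; the `step` argument only makes that choice explicit.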
Histogram Construction
Class Interval    Frequency
20-under 30       6
30-under 40       18
40-under 50       11
50-under 60       11
60-under 70       3
70-under 80       1

[Figure: histogram of the frequency distribution above; x-axis: Years (20 to 80), y-axis: Frequency]
Relative Frequency
The relative frequency of a class interval is the proportion of the total frequency that falls in that
interval. Relative frequency may be used instead of frequency on the vertical axis of a histogram.
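
A short sketch of the calculation, using the class intervals and frequencies from the Histogram Construction slide. The variable names are chosen here for illustration only.

```python
# Class intervals and frequencies from the Histogram Construction slide.
classes = ["20-under 30", "30-under 40", "40-under 50",
           "50-under 60", "60-under 70", "70-under 80"]
freqs = [6, 18, 11, 11, 3, 1]

total = sum(freqs)  # total frequency (number of observations)
# Relative frequency = class frequency divided by total frequency.
rel_freqs = [f / total for f in freqs]

for label, f, rf in zip(classes, freqs, rel_freqs):
    print(f"{label:<12} {f:>3} {rf:.2f}")
```

The relative frequencies sum to 1, so a histogram drawn with them has the same shape as the frequency histogram and only the vertical scale changes.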
Histogram Example : Criminal Finger Lengths

8000
• Length of left middle
7000

finger collected on 6000

3000 criminals in 1902 5000

Frequency
4000

3000

• Distribution is nearly 2000

symmetric 1000

0
9.6 10.2 10.8 11.4 12.0 12.6 13.2 13.8
Finger Length
700

600

Frequency 500

400

300

200

100

0
9.6 10.2 10.8 11.4 12.0 12.6 13.2 13.8
Finger Length
Coal Mining Disasters
• Time in days between coal mining disasters in England from 1851 to 1896

• Distribution is skewed to the right

[Figure: histogram; x-axis: Time in Days to the Next Disaster (0 to 750), y-axis: Frequency (0 to 50)]
Pareto Chart

An important variation of the histogram is the Pareto
chart. This chart is widely used in quality and process
improvement studies where the data usually represent
different types of defects, failure modes, or other categories
of interest to the analyst. The categories are ordered so that
the category with the largest frequency is on the left,
followed by the category with the second largest frequency,
and so forth.
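
The ordering can be sketched as follows. The defect categories and counts are made up for illustration and do not come from the lecture, and `pareto_table` is a hypothetical helper. The cumulative percentage column is an addition commonly drawn as a line on Pareto charts; the text above does not mention it.

```python
def pareto_table(counts):
    """Order categories by decreasing frequency, with cumulative percent."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(counts.values())
    running = 0
    rows = []
    for category, count in ordered:
        running += count
        rows.append((category, count, 100 * running / total))
    return rows

# Hypothetical defect counts, invented for this sketch only.
defects = {"scratch": 12, "dent": 30, "misalignment": 5, "discoloration": 3}

rows = pareto_table(defects)
for category, count, cumulative in rows:
    print(f"{category:<14} {count:>3} {cumulative:6.1f}%")
```

Reading the rows from the top corresponds to reading the bars of the chart from left to right.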
Histogram (Pareto Chart)
Bar Charts

• Used for categorical variables

• Provides a visual display of the relative sizes of each category

UM – Undergraduate male
UF – Undergraduate female
MM – Masters male
MF – Masters female
PM – PhD male
PF – PhD female
Pie Charts
• Used for categorical variables

• Provides a visual display of what fraction of the whole each category takes up

UM – Undergraduate male
UF – Undergraduate female
MM – Masters male
MF – Masters female
PM – PhD male
PF – PhD female
Second Quarter Truck Production (Hypothetical values)
Pie Chart Calculations for Company A
Second Quarter : Truck Production
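
The usual pie chart calculation turns each category's count into a proportion of the whole and then into a slice angle out of 360 degrees. The sketch below assumes made-up production figures for four companies; they are not the values from the slide, and `pie_slices` is a hypothetical helper.

```python
def pie_slices(counts):
    """Map each category to (proportion of the whole, angle in degrees)."""
    total = sum(counts.values())
    return {name: (n / total, 360 * n / total) for name, n in counts.items()}

# Made-up truck production counts, invented for this sketch only.
production = {"A": 300, "B": 200, "C": 400, "D": 100}

slices = pie_slices(production)
for company, (proportion, degrees) in slices.items():
    print(f"Company {company}: {proportion:.1%} of total, {degrees:.1f} degrees")
```

Because the proportions must add up to the whole, the slice angles always sum to 360 degrees.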
Pie Charts Versus Bar Charts
Pie charts:
(1) must include all the categories that make up the whole; and
(2) are useful only when you want to emphasize each category’s relation
to the whole.
[Figure: side-by-side comparison; panels: Pie Charts, Bar Charts]

Recommendation: almost always use bar charts
