4 - Stat - Measures of Variation 2021
Measures of Variation
Range
The range of a data set is the difference between the maximum and minimum data entries
in the set.
Range = (Maximum data entry) – (Minimum data entry)
Example:
The following data are the closing prices for a certain stock on ten successive
Fridays. Find the range.
Stock: 56 56 57 58 61 63 63 67 67 67
Range = 67 – 56 = 11
Deviation
The deviation of an entry x in a population data set is the difference between the entry and
the mean μ of the data set.
Deviation of x = x – μ
Example:
The following data are the closing prices for a certain stock on five successive Fridays.
Find the deviation of each price.
The mean stock price is μ = 305/5 = 61.
Stock x     Deviation x – μ
56          56 – 61 = –5
58          58 – 61 = –3
61          61 – 61 = 0
63          63 – 61 = 2
67          67 – 61 = 6
Σx = 305    Σ(x – μ) = 0
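The deviation table above can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative and do not come from the slides.

```python
# Closing prices from the example (five successive Fridays).
prices = [56, 58, 61, 63, 67]

# Population mean: sum of the entries divided by the number of entries.
mu = sum(prices) / len(prices)

# Deviation of each entry from the mean: x - mu.
deviations = [x - mu for x in prices]

print(mu)               # 61.0
print(deviations)       # [-5.0, -3.0, 0.0, 2.0, 6.0]
print(sum(deviations))  # 0.0
```

The deviations sum to zero, which is why they are squared before being averaged in the variance.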
Variance and Standard Deviation
The population variance of a population data set of N entries is
\[ \text{Population variance} = \sigma^2 = \dfrac{\sum (x - \mu)^2}{N}. \]
The symbol \( \sigma^2 \) is read “sigma squared.”
The population standard deviation of a population data set of N entries is the square root
of the population variance.
Finding the Population Standard Deviation
Guidelines
In Words                                       In Symbols
1. Find the mean of the population data set.   \( \mu = \dfrac{\sum x}{N} \)
Finding the Sample Standard Deviation
Guidelines
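In Words                                     In Symbols
1. Find the mean of the sample data set.     \( \bar{x} = \dfrac{\sum x}{n} \)
2. Find the deviation of each entry.         \( x - \bar{x} \)
3. Square each deviation.                    \( (x - \bar{x})^2 \)
4. Add to get the sum of squares.            \( SS_x = \sum (x - \bar{x})^2 \)
5. Divide by n – 1 to get the sample         \( s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1} \)
   variance.
6. Find the square root of the variance to   \( s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} \)
   get the sample standard deviation.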
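The six guideline steps above can be sketched in Python as follows. The function name `sample_std` is hypothetical, and treating the five stock prices as a sample is an assumption made only for illustration.

```python
import math
import statistics


def sample_std(data):
    """Sample standard deviation, following the six guideline steps."""
    n = len(data)
    x_bar = sum(data) / n                    # 1. mean of the sample
    deviations = [x - x_bar for x in data]   # 2. deviation of each entry
    squares = [d ** 2 for d in deviations]   # 3. square each deviation
    ss_x = sum(squares)                      # 4. sum of squares
    variance = ss_x / (n - 1)                # 5. divide by n - 1
    return math.sqrt(variance)               # 6. square root of the variance


# Assumption: the five closing prices are treated as a sample here.
data = [56, 58, 61, 63, 67]
s = sample_std(data)
print(round(s, 2))  # 4.3

# Cross-check against the standard library's sample standard deviation.
print(math.isclose(s, statistics.stdev(data)))  # True
```

Dividing by n – 1 rather than n is the only difference from the population formula.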
Finding the Population Standard Deviation
Example:
The following data are the closing prices for a certain stock on five successive Fridays.
The population mean is 61. Find the population standard deviation.
Note: the squared deviations, and therefore the variance and the standard deviation, are never negative.
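The worked table for this example is not reproduced in the text, so the following Python sketch works it through from the five prices used in the earlier deviation example (56, 58, 61, 63, 67) and the stated population mean of 61. Variable names are illustrative.

```python
import math

# Closing prices on five successive Fridays; the population mean is given as 61.
prices = [56, 58, 61, 63, 67]
mu = 61

# Squared deviation of each entry from the population mean.
squared_deviations = [(x - mu) ** 2 for x in prices]

# Sum of squares, population variance (divide by N), and standard deviation.
ss = sum(squared_deviations)
variance = ss / len(prices)
sigma = math.sqrt(variance)

print(squared_deviations)  # [25, 9, 0, 4, 36]
print(ss, variance)        # 74 14.8
print(round(sigma, 2))     # 3.85
```

Under these inputs the population standard deviation is about $3.85.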
Interpreting Standard Deviation
When interpreting standard deviation, remember that it is a measure of the typical amount an
entry deviates from the mean. The more the entries are spread out, the greater the standard
deviation.
[Figure: two frequency histograms (Frequency vs. Data value). Left panel: x̄ = 4, s = 1.18. Right panel: x̄ = 4, s = 0.]
Empirical Rule (68-95-99.7%)
For data with a (symmetric) bell-shaped distribution, the standard
deviation has the following characteristics.
1. About 68% of the data lie within one standard deviation of the
mean.
2. About 95% of the data lie within two standard deviations of the
mean.
3. About 99.7% of the data lie within three standard deviations of
the mean.
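The three percentages in the rule can be compared with the exact proportions for a normal distribution. This is a minimal sketch that assumes the data are exactly normal (real bell-shaped data will only be close); it uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later), and the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

The rounded values 68%, 95%, and 99.7% in the rule are approximations of these proportions.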
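[Figure: bell-shaped curve with the horizontal axis marked from –4 to 4 standard deviations from the mean. 34% of the data lie between the mean and one standard deviation on each side (68% within 1 standard deviation), 13.5% lie between one and two standard deviations on each side, and 2.35% lie between two and three standard deviations on each side (99.7% within 3 standard deviations).]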
Using the Empirical Rule
Example:
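The mean value of homes on a street is $125 thousand with a
standard deviation of $5 thousand. The data set has a bell-shaped
distribution. Estimate the percent of homes valued between $120 thousand and
$130 thousand.
$120 thousand is one standard deviation below the mean and $130 thousand is one
standard deviation above it, so by the Empirical Rule about 68% of the homes are
valued between $120 thousand and $130 thousand.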
Chebychev’s Theorem
The portion of any data set lying within k standard
deviations (k > 1) of the mean is at least
\[ 1 - \dfrac{1}{k^2}. \]
For k = 2: In any data set, at least \( 1 - \dfrac{1}{2^2} = 1 - \dfrac{1}{4} = \dfrac{3}{4} \), or 75%, of the data lie within 2 standard
deviations of the mean.
For k = 3: In any data set, at least \( 1 - \dfrac{1}{3^2} = 1 - \dfrac{1}{9} = \dfrac{8}{9} \), or 88.9%, of the data lie within 3 standard
deviations of the mean.
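The bound in the theorem is simple to evaluate for any k > 1. The following is a small Python sketch; the function name `chebychev_bound` is hypothetical.

```python
def chebychev_bound(k):
    """Minimum portion of any data set within k standard deviations of the mean."""
    if k <= 1:
        # The theorem is stated only for k > 1.
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


print(chebychev_bound(2))            # 0.75
print(round(chebychev_bound(3), 3))  # 0.889
```

Unlike the Empirical Rule, this bound does not assume a bell-shaped distribution, which is why its guarantees are weaker.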
Using Chebychev’s Theorem
Example:
The mean time in a women’s 400-meter dash is 52.4
seconds with a standard deviation of 2.2 sec. At least 75%
of the women’s times will fall between what two values?
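By Chebychev’s Theorem, at least 75% of any data set lies within k = 2 standard deviations
of the mean, so the bounds are 52.4 – 2(2.2) = 48 and 52.4 + 2(2.2) = 56.8.
At least 75% of the women’s 400-meter dash times will fall between 48 and 56.8
seconds.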
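The interval in this example can be reproduced with a short Python sketch. The variable names are illustrative; k = 2 is used because 1 – 1/2² = 75%.

```python
# Values from the example: mean and standard deviation of the dash times (seconds).
mean = 52.4
std_dev = 2.2
k = 2  # number of standard deviations that guarantees at least 75%

# Interval that extends k standard deviations on each side of the mean.
lower = mean - k * std_dev
upper = mean + k * std_dev

print(round(lower, 1), round(upper, 1))  # 48.0 56.8
```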
Standard Deviation for Grouped Data
\[ \text{Sample standard deviation} = s = \sqrt{\dfrac{\sum (x - \bar{x})^2 f}{n - 1}} \]
where n = Σf is the number of entries in the data set, and x is the data value or the
midpoint of an interval.
Example:
The following frequency distribution represents the ages of 30 students in a statistics
class. The mean age of the students is 30.3 years. Find the standard deviation of the
frequency distribution.
\[ s = \sqrt{\dfrac{\sum (x - \bar{x})^2 f}{n - 1}} = \sqrt{\dfrac{2988.8}{29}} \approx \sqrt{103.06} \approx 10.2 \]
The standard deviation of the ages is 10.2 years.
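The grouped-data formula can be sketched in Python as follows. The frequency table for this example is not reproduced in the text, so the class midpoints and frequencies below are an assumption: a hypothetical table chosen because it is consistent with the stated values (30 students, mean 30.3, sum of squares 2988.8). It should not be read as the original table.

```python
import math

# ASSUMED table (not preserved in the text): class midpoints and their frequencies.
midpoints = [21.5, 29.5, 37.5, 45.5, 53.5]
frequencies = [13, 8, 4, 3, 2]

# n is the sum of the frequencies.
n = sum(frequencies)

# Mean of grouped data: sum of (midpoint * frequency) divided by n.
x_bar = sum(x * f for x, f in zip(midpoints, frequencies)) / n

# Weighted sum of squared deviations, then the sample standard deviation.
ss = sum((x - x_bar) ** 2 * f for x, f in zip(midpoints, frequencies))
s = math.sqrt(ss / (n - 1))

print(n, round(x_bar, 1), round(ss, 1), round(s, 1))  # 30 30.3 2988.8 10.2
```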
§ 2.5
Measures of Position
Quartiles
The three quartiles, Q1, Q2, and Q3, approximately divide an ordered data set into four
equal parts.
[Figure: ordered data set on a scale from 0 to 100, with Q1 at 25, Q2 (the median) at 50, and Q3 at 75.]
Finding Quartiles
Example:
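The quiz scores for 15 students are listed below. Find the first, second, and third quartiles
of the scores.
28 43 48 51 43 30 55 44 48 33 45 37 37 42 38
Order the data:
28 30 33 37 37 38 42 43 43 44 45 48 48 51 55
Q2 = 43 is the median (the 8th of the 15 ordered entries). Q1 = 37 is the median of the
seven entries below Q2, and Q3 = 48 is the median of the seven entries above Q2.
About one fourth of the students score 37 or less; about one half score
43 or less; and about three fourths score 48 or less.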
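The quartiles in this example can be reproduced with the median-of-halves method. The sketch below assumes that, when the number of entries is odd, the median is excluded from both halves; this convention matches the example's results, but other conventions exist and can give slightly different quartiles. The function name `quartiles` is hypothetical.

```python
from statistics import median

# Quiz scores from the example, in their original (unordered) form.
scores = [28, 43, 48, 51, 43, 30, 55, 44, 48, 33, 45, 37, 37, 42, 38]


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd number of entries, skip the median itself in the upper half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


q1, q2, q3 = quartiles(scores)
print(q1, q2, q3)  # 37 43 48
```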
Interquartile Range
The interquartile range (IQR) of a data set is the difference between the third and first
quartiles.
Interquartile range (IQR) = Q3 – Q1.
Example:
The quartiles for 15 quiz scores are listed below. Find the interquartile range.
Q1 = 37, Q2 = 43, Q3 = 48
IQR = Q3 – Q1 = 48 – 37 = 11
Box and Whisker Plot
A box-and-whisker plot is an exploratory data analysis tool that highlights the important
features of a data set.
28 30 33 37 37 38 42 43 43 44 45 48 48 51 55
Five-number summary
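• The minimum entry: 28
• Q1: 37
• Q2 (median): 43
• Q3: 48
• The maximum entry: 55
[Figure: box-and-whisker plot titled “Quiz Scores” on an axis from 28 to 56. The whiskers end at 28 and 55, the box spans 37 to 48, and the median line is at 43.]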
Percentiles and Deciles
Fractiles are numbers that partition, or divide, an ordered data set.
Percentiles divide an ordered data set into 100 parts. There are 99 percentiles: P1, P2,
P3…P99.
Deciles divide an ordered data set into 10 parts. There are 9 deciles: D1, D2, D3…D9.
A test score at the 80th percentile (P80) indicates that the test score is greater than 80% of
all other test scores and less than or equal to 20% of the scores.
Standard Scores
The standard score, or z-score, represents the number of standard deviations that a data
value, x, falls from the mean, μ.
\[ z = \dfrac{\text{value} - \text{mean}}{\text{standard deviation}} = \dfrac{x - \mu}{\sigma} \]
Example:
The test scores for all statistics finals at Union College have a mean of 78 and standard
deviation of 7. Find the z-score for
a.) a test score of 85,
b.) a test score of 70,
c.) a test score of 78.
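a.) μ = 78, σ = 7, x = 85
\[ z = \dfrac{x - \mu}{\sigma} = \dfrac{85 - 78}{7} = 1.0 \]
This score is 1 standard deviation higher than the mean.
b.) μ = 78, σ = 7, x = 70
\[ z = \dfrac{x - \mu}{\sigma} = \dfrac{70 - 78}{7} \approx -1.14 \]
This score is 1.14 standard deviations lower than the mean.
c.) μ = 78, σ = 7, x = 78
\[ z = \dfrac{x - \mu}{\sigma} = \dfrac{78 - 78}{7} = 0 \]
This score is the same as the mean.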
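The three z-scores in this example can be computed with a one-line function. This is a minimal Python sketch; the function name `z_score` is hypothetical.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


# Test scores from parts a, b, and c, with mean 78 and standard deviation 7.
for score in (85, 70, 78):
    print(score, round(z_score(score, 78, 7), 2))
# 85 1.0
# 70 -1.14
# 78 0.0
```

A positive z-score lies above the mean, a negative one below it, and a z-score of 0 is equal to the mean.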
Relative Z-Scores
Example:
John received a 75 on a test whose class mean was 73.2 with a standard deviation of 4.5.
Samantha received a 68.6 on a test whose class mean was 65 with a standard deviation of
3.9. Which student had the better test score?
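John: \( z = \dfrac{x - \mu}{\sigma} = \dfrac{75 - 73.2}{4.5} = 0.4 \)
Samantha: \( z = \dfrac{x - \mu}{\sigma} = \dfrac{68.6 - 65}{3.9} \approx 0.92 \)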
John’s score was 0.4 standard deviations higher than the mean, while Samantha’s
score was 0.92 standard deviations higher than the mean. Samantha’s test score was
better than John’s.