Unit 19
Table of Contents
Introduction
Test Your Prerequisite Skills
Objectives
Lesson 1: Pearson’s Sample Correlation Coefficient
- Warm Up!
- Learn about It!
- Let’s Practice!
- Check Your Understanding!
Lesson 2: Practice Problems
- Warm Up!
- Learn about It!
- Let’s Practice!
- Check Your Understanding!
Lesson 3: Solving Problems Involving Pearson’s Correlation Coefficient
- Warm Up!
- Learn about It!
- Let’s Practice!
- Check Your Understanding!
Challenge Yourself!
Performance Task
Wrap-up
Key to Let’s Practice!
References
These are some of the many interesting questions that can be answered by correlation
analysis. In the previous unit, we learned that a scatter plot gives a graphical estimate of
the strength of correlation that exists between two variables. But there are times when a
scatter plot cannot clearly show that a correlation exists between two variables,
especially when there is a very weak relationship between them. Thus, we need an
accurate quantitative interpretation of the scatter plot.
In this unit, we are going to explore how to quantitatively measure the correlation
between two variables using Pearson’s sample correlation coefficient.
Before you get started, answer the following items on a separate sheet of paper. This will
help you assess your prior knowledge and practice some skills that you will need in
studying the lessons in this unit. Show your complete solution.
Objectives
Warm Up!
Scatter Plot
Instructions:
Read the following situation then answer the questions that follow.
A random sample of five students from a statistics class was asked about the
number of hours they spent in studying and the score they obtained in a 10-point
exam. The scatter plot for the data obtained in the study is illustrated below.
[Scatter plot: Exam Score (vertical axis, 0 to 12) versus Number of Hours Spent on Studying (horizontal axis, 0 to 7)]
1. Based on the scatter plot, what type of relationship exists between the two
variables?
2. How would you describe the strength of relationship between the two variables?
3. What conclusion can you make on the relationship between the number of hours
spent in studying and the exam scores obtained by the students?
By examining the scatter plot, we can roughly estimate if a relationship exists between
two variables. The scatter plot shown in Warm Up! suggests a positive correlation; it
indicates that scores in the exam are associated with the number of hours spent in studying.
However, the type of relationship that exists between two variables does not indicate
much about the strength or degree of their correlation. Fortunately, we have a statistical
measure that will help us determine the degree to which two variables are related.
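If $x$ and $y$ are the two variables being observed, then the Pearson’s sample correlation
coefficient (or simply correlation coefficient or Pearson’s $r$) is given by

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

where
$n$ = number of paired observations;
$\sum xy$ = sum of products of paired $x$ and $y$ values;
$\sum x^2$ = sum of squared $x$ values;
$\sum y^2$ = sum of squared $y$ values;
$\sum x$ = sum of $x$ values; and
$\sum y$ = sum of $y$ values.
The correlation coefficient satisfies $-1 \le r \le 1$, that is, $r$ is neither greater than 1 nor less
than $-1$.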
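The formula can be sketched in Python as follows. This is only a minimal illustration, not part of the original lesson: the function name `pearson_r`, the variable names, and the sample data are assumptions made for the example.

```python
import math

def pearson_r(x, y):
    """Pearson's sample correlation coefficient via the computational formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and hold at least two observations")
    n = len(x)                                  # number of paired observations
    sum_x = sum(x)                              # sum of x values
    sum_y = sum(y)                              # sum of y values
    sum_xy = sum(a * b for a, b in zip(x, y))   # sum of products of paired values
    sum_x2 = sum(a * a for a in x)              # sum of squared x values
    sum_y2 = sum(b * b for b in y)              # sum of squared y values
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when x or y has no variation")
    return numerator / denominator

# Hypothetical data (not from the lesson): y rises by exactly 2 per unit of x.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 6, 8, 10]
print(pearson_r(hours, scores))        # perfectly linear, increasing
print(pearson_r(hours, scores[::-1]))  # perfectly linear, decreasing
```

The two calls illustrate the extreme cases: points that fall exactly on a rising line and points that fall exactly on a falling line.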
The sign of the correlation coefficient indicates the type of relationship between two
variables. A positive value indicates a positive relationship; a negative value indicates a
negative relationship. $r = 1$ indicates a perfect positive correlation. In contrast,
$r = -1$ indicates a perfect negative correlation. If $r = 0$, then there is no linear correlation
between two variables.
In terms of the degree or strength of correlation, the closer the absolute value of $r$ to 1,
the greater the strength of correlation. On the other hand, the closer the absolute value of
$r$ to 0, the weaker the strength of correlation. Thus, we can say that 0.95 and $-0.95$ are
equal in strength (both are strong) and so are 0.15 and $-0.15$ (both are weak).
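The separation of direction (sign) from strength (absolute value) can be sketched as below. The helper name `direction_and_strength` is hypothetical and introduced only for illustration.

```python
def direction_and_strength(r):
    """Split a correlation coefficient into its direction and its strength."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, abs(r)  # strength depends only on the absolute value

# 0.95 and -0.95 differ in direction but not in strength.
print(direction_and_strength(0.95))
print(direction_and_strength(-0.95))
```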
To interpret the value of the correlation coefficient $r$, the following correlation scale is
used.
Correlation Coefficient Qualitative Interpretation
Perfect
Very high
Moderately high
Moderately low
Very low
No correlation
For instance, if , then using the correlation scale, we can conclude that there is a
moderately low positive correlation between the two variables.
Note that Pearson’s sample correlation coefficient can only be used if the given variables
are measured at the interval or ratio level. However, the two variables need not be measured
at the same level, that is, one variable can be at the ratio level and the other one can be at
the interval level.
Let’s Practice!
Example 1: Suppose age and height are two variables under study. Can we use Pearson’s
sample correlation coefficient to determine the relationship between the two
variables? Explain.
Solution: Age and height are both measured at the ratio level. Thus, we can calculate
the correlation between the two variables using Pearson’s sample correlation
coefficient.
Try It Yourself!
Suppose savings and expenditures are two variables under study. Can we use
Pearson’s sample correlation coefficient to determine the relationship between the
two variables? Explain.
Example 2: The amount of pain experienced by a patient and the amount of anesthesia
injected are two variables under study. Is it appropriate to use Pearson’s
sample correlation coefficient to determine the relationship between the two
variables? Explain.
Solution: The amount of pain experienced by a patient is measured at the ordinal level
because pain can be ranked on a scale of 1 to 10. Meanwhile, the amount of
anesthesia injected is measured at the ratio level. Since one of the variables
is not measured at the interval or ratio level, we cannot use Pearson’s sample
correlation coefficient to find the correlation between the two variables.
Try It Yourself!
The ranking of honor students in class and IQ scores are two variables under study.
Is it appropriate to use Pearson’s sample correlation coefficient to determine the
relationship between the two variables? Explain.
Example 3: Suppose . Describe the correlation between the two variables under
study in terms of direction and strength.
Solution: The sign of $r$ indicates the direction of correlation while the absolute value of
$r$ indicates the strength of correlation. Thus, for the given value of $r$, there is a
moderately low positive correlation between the two variables under study.
Try It Yourself!
Suppose . Describe the correlation between the two variables under study
in terms of direction and strength.
Real-World Problems
Try It Yourself!
2. The correlation coefficient between the age of a car and its mileage per liter of
gasoline is 0.65. What does this value mean?
3. A team of researchers wants to determine if grades in college are related to first job
salary. If the correlation coefficient of the two variables is 0.55, describe the
direction and strength of correlation between the two variables.
Warm Up!
Fear of Pearson
Instructions:
Read the following situation then complete the table that follows by calculating the
required values.
A random sample of five students from a statistics class was asked about the
number of hours they spent in studying and the score they obtained in a 10-point
exam. The results are shown in the following table.
In Warm Up!, you just demonstrated the first step in obtaining the Pearson’s sample
correlation coefficient: constructing a table. The formula for Pearson’s $r$ looks somewhat
complicated, but using a table to compute the values makes it easier to determine the
value of $r$.
As stated in the previous lesson, the scatter plot gives us a rough estimate of the
relationship that exists between two variables. With the aid of the Pearson’s sample
correlation coefficient, we can determine specifically the direction and strength of
correlation between the $x$ and $y$ variables.
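Recall that the formula for finding Pearson’s sample correlation coefficient is given by

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

where
$n$ = number of paired observations;
$\sum xy$ = sum of products of paired $x$ and $y$ values;
$\sum x^2$ = sum of squared $x$ values;
$\sum y^2$ = sum of squared $y$ values;
$\sum x$ = sum of $x$ values; and
$\sum y$ = sum of $y$ values.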
To apply our formula, we must first find the sums of $x$, $y$, $x^2$, $y^2$, and $xy$. The values in the
last row of the table represent these sums: $\sum x$, $\sum y$, $\sum x^2$, $\sum y^2$, and $\sum xy$.
Substitute the obtained values into the formula for $r$. Then round the value of $r$ to three
decimal places.
The result is the Pearson’s sample correlation coefficient. Notice that there are no units
associated with $r$, and its value will remain unchanged if the $x$ and $y$ values are switched.
Using the correlation scale, we can conclude that there is a very high positive correlation
between the number of hours spent in studying and exam scores obtained by the
students.
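The two properties noted above (no units, and the same value when the variables are switched) can be checked numerically. The sketch below uses hypothetical data, not the Warm Up! table, and the function name `pearson_r` is an assumption made for the example. Converting hours to minutes stands in for a change of units.

```python
import math

def pearson_r(x, y):
    """Computational formula for Pearson's sample correlation coefficient."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    return (n * sum_xy - sum_x * sum_y) / math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )

# Hypothetical study hours and exam scores (illustration only).
hours = [1, 2, 4, 5, 6]
scores = [3, 4, 6, 9, 10]
minutes = [60 * h for h in hours]      # same quantity, different unit

r_xy = pearson_r(hours, scores)
r_yx = pearson_r(scores, hours)        # variables switched
r_minutes = pearson_r(minutes, scores)  # units changed

print(math.isclose(r_xy, r_yx))
print(math.isclose(r_xy, r_minutes))
```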
Let’s Practice!
Example 1: The following table shows the grades of five students in Mathematics and
English. Determine the correlation coefficient between the two variables.
Solution: To solve for the correlation coefficient, construct the following table.
Complete the table by calculating the values required in each column. Square
all entries in the $x$ column then write the answers under the $x^2$ column.
Similarly, square all entries in the $y$ column then write the answers under the
$y^2$ column.
Finally, multiply the paired entries in the $x$ and $y$ columns then write the
answers under the $xy$ column.
Student   Grade in Mathematics ($x$)   Grade in English ($y$)   $x^2$   $y^2$   $xy$
A         90                           94                       8100    8836    8460
B         85                           82                       7225    6724    6970
C         79                           88                       6241    7744    6952
D         83                           85                       6889    7225    7055
E         86                           85                       7396    7225    7310
Substitute the sums into the formula for the correlation coefficient. Then,
round off the answer to three decimal places.
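The table-based computation for this example can be sketched in Python. The grades are taken from the example; the variable names are assumptions made for illustration, and the sketch is a check on the arithmetic rather than the guide's own worked solution.

```python
import math

# Grades of students A to E from Example 1.
math_grades = [90, 85, 79, 83, 86]      # x
english_grades = [94, 82, 88, 85, 85]   # y

# One row per student: x, y, x^2, y^2, xy.
rows = [(x, y, x * x, y * y, x * y) for x, y in zip(math_grades, english_grades)]

# The "sum" row adds up each column of the table.
sum_x, sum_y, sum_x2, sum_y2, sum_xy = (sum(col) for col in zip(*rows))
n = len(rows)

r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

for row in rows:
    print(*row)
print("Sum:", sum_x, sum_y, sum_x2, sum_y2, sum_xy)
print("r =", round(r, 3))
```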
Try It Yourself!
The following table shows the heights (in inches) and weights (in pounds) of five
children. Determine the correlation coefficient of the two variables. Then, interpret
the result.
Example 2: A teacher wants to find out if the number of absences and the final grades of
eight randomly selected students from a physics class are correlated. Using
the following data, calculate the correlation coefficient between the two
variables. Then, interpret the result.
Solution: Construct the additional three columns needed to solve for $r$, then complete
the table.
Substitute the sums into the formula for the correlation coefficient.
Try It Yourself!
Example 3: The operations manager of Rowell’s Café tabulated the monthly advertising
expenses and sales (in ten thousands) of the company for the last 12
months. Is there a correlation between the monthly advertising expenses
and sales of the restaurant?
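Month   Advertising Expenses   Sales
1       15                     123
2       12                     111
3       18                     188
4       20                     210
5       12                     112
6       10                     105
7       13                     166
8       21                     196
9       9                      103
10      11                     119
11      14                     153
12      24                     238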
Solution: Construct the additional three columns needed to solve for $r$, then complete
the table.
Month   Advertising Expenses ($x$)   Sales ($y$)   $x^2$   $y^2$   $xy$
1       15                           123           225     15129   1845
2       12                           111           144     12321   1332
3       18                           188           324     35344   3384
4       20                           210           400     44100   4200
5       12                           112           144     12544   1344
6       10                           105           100     11025   1050
7       13                           166           169     27556   2158
8       21                           196           441     38416   4116
9       9                            103           81      10609   927
10      11                           119           121     14161   1309
11      14                           153           196     23409   2142
12      24                           238           576     56644   5712
Substitute the sums into the formula for the correlation coefficient.
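As a check on the substitution, the sketch below computes $r$ for the advertising and sales data in two ways: with the computational formula used in this unit, and with the algebraically equivalent deviation-from-the-mean form. The deviation form is not given in this guide and is named here only as a cross-check; the variable names are assumptions made for the example.

```python
import math

# Monthly advertising expenses (x) and sales (y) from the example table.
ads = [15, 12, 18, 20, 12, 10, 13, 21, 9, 11, 14, 24]
sales = [123, 111, 188, 210, 112, 105, 166, 196, 103, 119, 153, 238]
n = len(ads)

# Computational formula (the one used in this unit).
sum_x, sum_y = sum(ads), sum(sales)
sum_x2 = sum(x * x for x in ads)
sum_y2 = sum(y * y for y in sales)
sum_xy = sum(x * y for x, y in zip(ads, sales))
r_computational = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

# Equivalent deviation form (a cross-check, not the guide's method).
mean_x, mean_y = sum_x / n, sum_y / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ads, sales))
s_xx = sum((x - mean_x) ** 2 for x in ads)
s_yy = sum((y - mean_y) ** 2 for y in sales)
r_deviation = s_xy / math.sqrt(s_xx * s_yy)

print("Sums:", sum_x, sum_y, sum_x2, sum_y2, sum_xy)
print("r =", round(r_computational, 3), round(r_deviation, 3))
```

Both forms should agree up to floating-point rounding; the computational form is convenient for hand calculation because it needs only the column sums.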
Try It Yourself!
Solution: Construct the additional three columns needed to solve for $r$, then complete
the table.
Student   Reading Ability ($x$)   Attention Ability ($y$)   $x^2$   $y^2$   $xy$
A         26                      32                        676     1024    832
B         15                      24                        225     576     360
C         50                      31                        2500    961     1550
D         42                      22                        1764    484     924
E         23                      41                        529     1681    943
F         17                      33                        289     1089    561
G         22                      35                        484     1225    770
Substitute the sums into the formula for the correlation coefficient.
Try It Yourself!
Building   Height   Number of Floors
A          483      38
B          509      35
C          518      39
D          533      40
E          579      36
F          612      43
G          613      30
H          632      38
1. A sociologist collected data from a sample of five adults on how many years they
have lived in their neighborhood and how many of their neighbors they consider as
friends. Determine the correlation coefficient between the two variables. Then
interpret the result.
2. The following table shows the number of weeks six persons have worked at an
automobile inspection station and the number of cars they have inspected on a
given day. Is there a correlation between the two variables?
Person   Number of Weeks Worked   Number of Cars Inspected
A        3                        14
B        8                        22
C        10                       24
D        2                        15
E        6                        16
F        13                       22
following data for a given day, calculate the correlation coefficient between the two
variables. Then interpret the result.
Warm Up!
Instructions:
Determine which among the following statements are appropriate for use of the
Pearson’s sample correlation coefficient.
Two out of the seven statements in Warm Up! represent an opportunity to examine the
correlation between two variables using Pearson’s sample correlation coefficient. In the
previous lessons, we learned how to use Pearson’s $r$ in determining the direction and
strength of relationship between two variables. Before performing the procedures that we
have discussed, it is important to first recognize when a situation calls for the use of
Pearson’s $r$ as a measure of association.
Let us recall the steps in obtaining the Pearson’s sample correlation coefficient of two
variables $x$ and $y$. These can be summarized as follows:
Let’s Practice!
Example 1: The following table shows the prices per kilogram of chicken and fish for the
past 7 months in Q Market. Is there a correlation between the prices of
chicken and fish in the market?
Solution: Construct the additional three columns needed to solve for $r$, then complete
the table.
Substitute the sums into the formula for the correlation coefficient.
Try It Yourself!
The following table shows the ages and systolic pressures of 7 adults. Is there a
correlation between the two variables?
Player   Time Spent Practicing   Free Throw Percentage
A        60                      63
B        30                      36
C        45                      20
D        80                      90
E        75                      10
F        20                      10
G        50                      67
H        90                      40
I        75                      20
J        45                      12
Solution: Construct the additional three columns needed to solve for $r$, then complete
the table.
Player   Time Spent Practicing ($x$)   Free Throw Percentage ($y$)   $x^2$   $y^2$   $xy$
A        60                            63                            3600    3969    3780
B        30                            36                            900     1296    1080
C        45                            20                            2025    400     900
D        80                            90                            6400    8100    7200
E        75                            10                            5625    100     750
F        20                            10                            400     100     200
G        50                            67                            2500    4489    3350
H        90                            40                            8100    1600    3600
I        75                            20                            5625    400     1500
J        45                            12                            2025    144     540
Sum      570                           368                           37200   20598   22900
Substitute the sums into the formula for the correlation coefficient.
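The column sums and the substitution for this example can be checked with a short Python sketch. The data come from the table of players A to J; the dictionary layout and variable names are assumptions made for illustration.

```python
import math

# Time spent practicing (x) and free throw percentage (y) for players A to J.
data = {
    "A": (60, 63), "B": (30, 36), "C": (45, 20), "D": (80, 90), "E": (75, 10),
    "F": (20, 10), "G": (50, 67), "H": (90, 40), "I": (75, 20), "J": (45, 12),
}

n = len(data)
sum_x = sum(x for x, _ in data.values())
sum_y = sum(y for _, y in data.values())
sum_x2 = sum(x * x for x, _ in data.values())
sum_y2 = sum(y * y for _, y in data.values())
sum_xy = sum(x * y for x, y in data.values())

# Substitute the sums into the computational formula for r.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print("Sum row:", sum_x, sum_y, sum_x2, sum_y2, sum_xy)
print("r =", round(r, 3))
```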
Try It Yourself!
Example 3: The following table shows the average daily temperature (in degrees Celsius)
in Quipper City and the sales (in thousand pesos) of Elijah’s Coolers Store. Is
there a correlation between the temperature and sales?
4 36.9 10.4
5 34.4 12.0
6 35.5 11.5
7 36.8 12.9
8 34.2 10.0
9 36.0 13.1
10 35.3 13.0
11 37.0 10.6
12 38.1 17.1
13 36.7 12.4
14 35.5 12.0
Solution: Construct the additional three columns needed to solve for $r$, then complete
the table.
Day   Average Temperature   Sales
Substitute the sums into the formula for the correlation coefficient.
Try It Yourself!
Solution: Construct the additional three columns needed to solve for $r$, then complete
the table.
Student   Height ($x$)   Weight ($y$)   $x^2$   $y^2$       $xy$
A         134            53.28          17956   2838.7584   7139.52
B         145            60.90          21025   3708.8100   8830.50
C         133            55.86          17689   3120.3396   7429.38
D         120            55.40          14400   3069.1600   6648.00
E         112            47.04          12544   2212.7616   5268.48
F         129            53.48          16641   2860.1104   6898.92
G         138            57.56          19044   3313.1536   7943.28
H         126            51.12          15876   2613.2544   6441.12
Substitute the sums into the formula for the correlation coefficient.
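Because the weights in this example have decimal places, the column sums are easy to get slightly wrong by hand. The sketch below, which is an illustration and not the guide's own solution, keeps the sums exact with Python's `fractions.Fraction` and converts to floating point only for the square root. The variable names are assumptions made for the example.

```python
import math
from fractions import Fraction

# Heights (x) and weights (y) of students A to H from the example table.
heights = [134, 145, 133, 120, 112, 129, 138, 126]
weights = [Fraction(w) for w in
           ("53.28", "60.90", "55.86", "55.40", "47.04", "53.48", "57.56", "51.12")]

n = len(heights)
sum_x = sum(heights)
sum_y = sum(weights)
sum_x2 = sum(x * x for x in heights)
sum_y2 = sum(y * y for y in weights)
sum_xy = sum(x * y for x, y in zip(heights, weights))

# Exact numerator and radicand; only the final step uses floating point.
numerator = n * sum_xy - sum_x * sum_y
radicand = (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
r = float(numerator) / math.sqrt(radicand)

print("Sums:", sum_x, float(sum_y), sum_x2, float(sum_y2), float(sum_xy))
print("r =", round(r, 3))
```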
Try It Yourself!
3. A researcher set out to determine whether suicide and homicide rates in a province
are correlated. Using available data for a recent year, he compared the following
sample of 10 cities with respect to the rates (per 100 000) of suicide and homicide.
What are the strength and direction of correlation between suicide and homicide
rates?
Challenge Yourself!
1. Which of the values 0.76 and 0.45 indicates a stronger correlation? Explain.
2. Give examples of two variables that are positively correlated and two variables that
are negatively correlated.
3. A study is conducted to determine the relationship between a driver’s age and the
number of accidents he/she had over a 2-year period. What is meant when the
relationship between the two variables is negative?
4. The table below shows the number of hours eight employees have spent working
and the number of defective products they have made.
Employee   Number of Hours Worked   Number of Defective Products
A          1.0                      13
B          1.5                      14
C          2.5                      16
D          2.1                      4
E          3.5                      15
F          4.5                      20
G          4.0                      18
H          5.5                      18
Performance Task
Obesity among adolescents is a major concern because it puts them at risk for serious
medical problems. You are a bariatric physician and you believe that a major issue related
to this is that adolescents these days spend too much time on social media and not
enough time being active. To prove this, use a sample of at least 20 students and collect
data regarding the number of hours they spend on social media per day and their weight
in pounds. Use the following table:
Use the results you obtained to determine if a correlation exists between the number of
hours spent on social media and weight of adolescents. Do the following:
You are to present your study in a medical forum so make sure it is organized and shows
accurate information and correct computations.
Wrap-up
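Key Terms/Formulas: Pearson’s Correlation Coefficient
Formula:

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

Descriptions:
$n$ = number of paired observations
$\sum xy$ = sum of products of paired $x$ and $y$ values
$\sum x^2$ = sum of squared $x$ values
$\sum y^2$ = sum of squared $y$ values
$\sum x$ = sum of $x$ values
$\sum y$ = sum of $y$ values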
Key to Let’s Practice!
Lesson 2
1. (very high positive correlation)
2. (moderately high positive correlation)
3. (very high positive correlation)
4. (very low negative correlation)
Lesson 3
1. (very high positive correlation)
2. (moderately low positive correlation)
3. (very high positive correlation)
4. (very high positive correlation)
References
Acelajado, Maxima J., Rene R. Belecina, and Basilia E. Blay. Mathematics for the New
Millennium. Makati: Diwa Scholastic Press, Inc. 1999.
Bluman, Allan G. Elementary Statistics: A Step By Step Approach. New York: The McGraw-Hill
Companies, Inc., 2000.
Levin, Jack and James Alan Fox. Elementary Statistics in Social Research. Boston, MA:
Pearson Education, Inc., 2011.
Runyon, Richard P., Kay A. Coleman, and David J. Pittenger. Fundamentals of Behavioral
Statistics. USA: The McGraw-Hill Companies, Inc., 2000.