Chapter 11. Simple Linear Regression and Correlation
Correlation
Hanoi, 2023
1 Empirical Models
4 Correlation
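$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}$$
where $\bar{x} = \left(\sum_{i=1}^{n} x_i\right)/n$ and $\bar{y} = \left(\sum_{i=1}^{n} y_i\right)/n$.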
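The two estimator formulas can be evaluated directly from the raw sums. The sketch below is one possible way to do this in Python; the function name `least_squares_fit` and the sample data are illustrative assumptions, not part of the slides.

```python
def least_squares_fit(xs, ys):
    """Return (beta0_hat, beta1_hat) for the fitted line y = beta0 + beta1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with n >= 2")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # Numerator and denominator of the slope formula.
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sum_xx - sum_x ** 2 / n
    if denominator == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    beta1 = numerator / denominator
    # Intercept: y-bar minus slope times x-bar.
    beta0 = sum_y / n - beta1 * sum_x / n
    return beta0, beta1


# Hypothetical data lying exactly on y = 1 + 2x.
beta0, beta1 = least_squares_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(beta0, beta1)
```

Because the illustrative points lie exactly on a line, the fit should recover an intercept of 1 and a slope of 2.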
$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$
An unbiased estimator of $\sigma^2$ is $\hat{\sigma}^2 = \dfrac{SSE}{n-2}$.
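As a rough illustration of this estimator, the sketch below computes SSE as the sum of squared residuals (the usual definition, assumed here because the slides do not spell it out) and divides by n − 2. The function name `residual_variance` and the data are hypothetical; the intercept 0.9 and slope 1.4 are the least squares estimates for that data.

```python
def residual_variance(xs, ys, beta0, beta1):
    """Return (SSE, sigma2_hat) for the fitted line y = beta0 + beta1 * x."""
    n = len(xs)
    if n != len(ys) or n <= 2:
        raise ValueError("need two equally long samples with n > 2")
    # SSE: sum of squared differences between observed and fitted values.
    sse = sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))
    # Dividing by n - 2 accounts for the two estimated coefficients.
    return sse, sse / (n - 2)


# Hypothetical data; 0.9 and 1.4 are its least squares intercept and slope.
sse, sigma2_hat = residual_variance([0, 1, 2, 3], [1, 2, 4, 5], 0.9, 1.4)
print(round(sse, 4), round(sigma2_hat, 4))
```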
where $\hat{\sigma}^2 = \dfrac{SSE}{n-2}$.
$H_0 : \beta_1 = 0$
$H_1 : \beta_1 \neq 0$
Example 1
Suppose we have the following information from a simple regression:
$$R = \frac{S_{xy}}{\sqrt{S_{xx}\,SST}}$$
The coefficient of determination
$$R^2 = \frac{S_{xy}^2}{S_{xx}\,SST} = \frac{\hat{\beta}_1 S_{xy}}{SST} = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
Note
$-1 \le R \le 1$, $0 \le R^2 \le 1$
The correlation coefficient measures the strength and direction of the
linear relationship between two variables.
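The chain of equalities for the coefficient of determination can be unpacked step by step. The derivation below is a sketch that relies on two standard identities not written out on the slide: $\hat{\beta}_1 = S_{xy}/S_{xx}$ and $SSR = \hat{\beta}_1 S_{xy}$, together with the decomposition $SST = SSR + SSE$.

```latex
\begin{align*}
R^2 &= \frac{S_{xy}^2}{S_{xx}\,SST}
     = \frac{S_{xy}}{S_{xx}} \cdot \frac{S_{xy}}{SST}
     = \frac{\hat{\beta}_1 S_{xy}}{SST}
     && \text{since } \hat{\beta}_1 = S_{xy}/S_{xx}, \\
    &= \frac{SSR}{SST}
     && \text{since } SSR = \hat{\beta}_1 S_{xy}, \\
    &= \frac{SST - SSE}{SST} = 1 - \frac{SSE}{SST}
     && \text{since } SST = SSR + SSE.
\end{align*}
```

Because $S_{xx} > 0$ and $SST > 0$, the sign of $R$ is the sign of $S_{xy}$, which is also the sign of $\hat{\beta}_1$.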
Example 2
Given the least squares regression line ŷ = −2.87 − 1.6x and a coefficient
of determination of 0.36, what is the coefficient of correlation?
Answer: We have $R^2 = 0.36$. Since $\hat{\beta}_1 = -1.6 < 0$, the coefficient of
correlation is $-\sqrt{0.36} = -0.6$.
Example 3
Suppose we have the following information from a simple regression:
β̂0 = 116, β̂1 = 10.2, n = 148, x̄ = 3.9, SST = 15600, SSE = 9200
What is the correlation coefficient?
Answer: The coefficient of determination is
$$R^2 = 1 - \frac{SSE}{SST} = 1 - \frac{9200}{15600} = 0.41$$
Since $\hat{\beta}_1 > 0$, the coefficient of correlation is $\sqrt{0.41} = 0.64$.
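Examples 2 and 3 follow the same recipe: obtain R² and give its square root the sign of the slope. A minimal Python sketch of that recipe is shown below; the function name `correlation_from_sums` is an illustrative choice, not something defined in the slides.

```python
import math


def correlation_from_sums(sst, sse, slope):
    """Return (R^2, R) given SST, SSE and the fitted slope."""
    if sst <= 0 or not 0 <= sse <= sst:
        raise ValueError("expected SST > 0 and 0 <= SSE <= SST")
    r_squared = 1.0 - sse / sst
    # R carries the same sign as the estimated slope.
    r = math.copysign(math.sqrt(r_squared), slope)
    return r_squared, r


# Example 3: SST = 15600, SSE = 9200, slope = 10.2.
r_squared, r = correlation_from_sums(15600, 9200, 10.2)
print(round(r_squared, 2), round(r, 2))  # 0.41 0.64
```

For Example 2, where R² = 0.36 is given directly, the same sign rule gives `math.copysign(math.sqrt(0.36), -1.6)`, which is −0.6 up to floating-point rounding.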
Example 4
In a regression problem the following pairs of (x, y) are given:
(−4, 8), (−1, 3), (0, 0), (1, −3) and (4, −7). What does this indicate about
the value of the coefficient of determination?
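Answer:
$$\sum x_i = 0, \quad \bar{x} = 0, \quad \sum y_i = 1, \quad \bar{y} = 0.2$$
$$\sum x_i^2 = 34, \quad \sum x_i y_i = -66, \quad \sum y_i^2 = 131$$
$$S_{xy} = -66 - \frac{0 \cdot 1}{5} = -66$$
$$S_{xx} = 34 - \frac{0^2}{5} = 34$$
$$SST = 131 - \frac{1^2}{5} = 130.8$$
Hence $R^2 = S_{xy}^2/(S_{xx}\,SST) = (-66)^2/(34 \cdot 130.8) \approx 0.98$, which is close to 1:
the fitted line explains almost all of the variation in y.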
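The sums in this answer are easy to check numerically. The short script below recomputes them from the five data pairs of Example 4 and then evaluates R² = Sxy²/(Sxx · SST); the variable names are illustrative.

```python
xs = [-4, -1, 0, 1, 4]
ys = [8, 3, 0, -3, -7]
n = len(xs)

sum_x, sum_y = sum(xs), sum(ys)

# Corrected sums of squares and cross-products (shortcut formulas).
s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
sst = sum(y * y for y in ys) - sum_y ** 2 / n

r_squared = s_xy ** 2 / (s_xx * sst)
print(s_xy, s_xx, round(sst, 1), round(r_squared, 3))
```

The resulting R² is about 0.979, so the points lie very close to a straight line with negative slope.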
Example 5
You want to explore the relationship between the grades students receive
on their first quiz (X) and their first exam (Y). The first quiz and test
scores for a sample of 16 students reveal the following summary statistics:
$$\sum (x_i - \bar{x})(y_i - \bar{y}) = 320.5, \quad s_x = 2.05, \quad s_y = 16.9$$
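The slide lists only the summary statistics. One quantity they determine is the sample correlation coefficient, sketched below under the assumption that $s_x$ and $s_y$ are sample standard deviations computed with divisor n − 1, so that $S_{xx} = (n-1)s_x^2$ and $SST = (n-1)s_y^2$. That assumption, and the choice of this particular quantity, are not stated in the source.

```python
import math

n = 16        # sample size from Example 5
s_xy = 320.5  # sum of (x_i - x_bar)(y_i - y_bar)
s_x = 2.05    # sample standard deviation of x (assumed n - 1 divisor)
s_y = 16.9    # sample standard deviation of y (assumed n - 1 divisor)

# Recover the corrected sums of squares from the standard deviations.
s_xx = (n - 1) * s_x ** 2
sst = (n - 1) * s_y ** 2

r = s_xy / math.sqrt(s_xx * sst)
print(round(r, 3))  # 0.617
```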
$H_0 : \rho = 0$
Test statistic
$$T_0 = \frac{R\sqrt{n-2}}{\sqrt{1-R^2}}$$
has a t-distribution with n − 2 degrees of freedom if $H_0$ is true.
Example 6
You want to explore the relationship between the grades students
receive on their first two exams. For a sample of 25 students, you
find a correlation coefficient of 0.45. What is the value of the test
statistic for testing $H_0 : \rho = 0$ and $H_1 : \rho \neq 0$?
Answer: We know that n = 25 and R = 0.45. Hence, the test statistic is
$$t_0 = \frac{R\sqrt{n-2}}{\sqrt{1-R^2}} = \frac{0.45\sqrt{23}}{\sqrt{1-0.45^2}} = 2.417$$
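The arithmetic in this answer can be reproduced with a few lines of Python. The helper name `correlation_t_statistic` is an illustrative assumption; it simply evaluates $R\sqrt{n-2}/\sqrt{1-R^2}$.

```python
import math


def correlation_t_statistic(r, n):
    """Return t0 = r * sqrt(n - 2) / sqrt(1 - r^2) for testing rho = 0."""
    if n <= 2:
        raise ValueError("need n > 2")
    if not -1 < r < 1:
        raise ValueError("need -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Example 6: n = 25, R = 0.45.
print(round(correlation_t_statistic(0.45, 25), 3))  # 2.417
```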
Example 7
For a random sample of 22 professionals, the correlation between
their age and their income was found to be 0.3. You are interested
in testing the null hypothesis that there is no linear relationship
between these two variables against the alternative that there is a
positive relationship. What is your conclusion in testing H0 : ρ = 0
and $H_1 : \rho > 0$ at α = 0.1? Let $t_{0.1,20} = 1.325$, $t_{0.05,20} = 1.725$.
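Answer: We know that n = 22 and R = 0.3. The test statistic value is
$$t_0 = \frac{0.3\sqrt{20}}{\sqrt{1-0.3^2}} = 1.4$$
Since $t_0 > t_{0.1,20} = 1.325$, we reject $H_0$ at α = 0.1 and conclude that there is a positive linear relationship.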
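The decision rule for this upper-tailed test can be sketched in code. The function name `upper_tail_decision` is hypothetical, and the critical values are taken from the example rather than computed from a t-distribution.

```python
import math


def upper_tail_decision(r, n, critical_value):
    """Return (t0, reject) for H0: rho = 0 against H1: rho > 0."""
    if n <= 2 or not -1 < r < 1:
        raise ValueError("need n > 2 and -1 < r < 1")
    t0 = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    # Upper-tailed test: reject H0 when t0 exceeds the critical value.
    return t0, t0 > critical_value


# Example 7: n = 22, R = 0.3, with the critical values quoted in the example.
t0, reject_at_10 = upper_tail_decision(0.3, 22, 1.325)   # alpha = 0.10
_, reject_at_05 = upper_tail_decision(0.3, 22, 1.725)    # alpha = 0.05
print(round(t0, 3), reject_at_10, reject_at_05)
```

With these inputs the statistic is about 1.406, so H0 is rejected at α = 0.1 but would not be rejected at α = 0.05, where the critical value is 1.725.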