Correlation and Regression Analysis

This document provides an overview of simple linear regression and correlation analysis, explaining key concepts such as dependent and independent variables, regression equations, and the purpose of regression analysis. It details the assumptions, estimation of parameters, interpretation of slope and intercept, and includes examples and exercises for practical application. Additionally, it discusses correlation coefficients and their significance in measuring the strength of relationships between variables.

Uploaded by

sudarionejie80

UNIT 10.

SIMPLE LINEAR REGRESSION AND CORRELATION

10.1. Things to know

1. Regression analysis is a statistical method that uses the relationship between two
or more quantitative variables so that one variable, called the dependent variable or
response variable, can be predicted from the values of the other variable or variables,
called the independent variables or explanatory variables.

2. A mathematical equation that allows us to predict values of one dependent variable from
known values of one or more independent variables is called a regression equation.

3. Purposes of Regression Analysis

i. Predicts the value of a dependent variable based on the value of at least one
independent variable.
ii. Explains the effect of the independent variables on the dependent variable.

4. Types of Regression Models

[Figure: four scatter-plot panels labeled Positive Linear Relationship, Negative Linear Relationship, Relationship NOT Linear, and No Relationship.]

5. This chapter focuses on the problem of estimating or predicting the value of a dependent
variable Y on the basis of a known measurement of an independent variable X.

6. A scatter diagram is a graphical presentation of the independent variable (plotted on the
horizontal axis) and the dependent variable (plotted on the vertical axis). Plotting the data
this way is the easiest way to see whether a relationship exists between the two variables.

7. A linear relationship between two variables is one in which the relationship can be most
accurately represented by a straight line.

8. In this section, the problem of estimating or predicting the value of a dependent variable on
the basis of a known measurement of an independent variable will be given consideration.

9. Although a graphic solution is sometimes used for prediction, it is much more common to
predict Y from the equation of a straight line. The general form of the equation, called the
linear regression line equation or simple linear regression equation, is
Y = a + bX

10. For each X, the equation Y = a + bX predicts a value of Y. The estimated regression
line is defined by the equation

$$\hat{Y} = a + bX$$

where:
$\hat{Y}$ = the predicted value of the dependent variable
a = Y intercept (value of $\hat{Y}$ when X = 0)
b = slope of the line
a and b are the estimates of the regression parameters, calculated from the available
sample points.

Remark: Through the estimated regression line equation we can now predict any Y value
just by knowing the corresponding X value.

11. Assumptions on Regression Analysis

i. The values of the independent variable X may be “fixed”, that is, X values may be
selected in advance by the researcher, or they may be obtained without the imposition
of any restriction, in which case, X is a random variable.
ii. The values of X are measured without error.
iii. The variances of the subpopulations of the dependent variable, given different values
of the independent variable, are equal.
iv. The subpopulation of the dependent variable Y, given each value of the
independent variable X, is normally distributed.
v. The means of the subpopulations of Y all lie on the same straight line (assumption of
linearity).

12. Estimation of Parameters

Given the sample $\{(x_i, y_i),\ i = 1, 2, \ldots, n\}$, the least squares estimates of the parameters in
the regression line are:

$$b = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}, \qquad a = \bar{y} - b\bar{x}$$

where

$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} \quad\text{and}\quad \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

are the means of the sample values.
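
As a minimal sketch (not part of the original handout), the raw-sum formulas for b and a can be written as a small Python function. The name `least_squares_line` and the toy data are illustrative assumptions.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the fitted line y-hat = a + b*x, using the raw-sum formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length, with at least two points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denom   # slope
    a = sum_y / n - b * (sum_x / n)            # intercept: y-bar minus b times x-bar
    return a, b


# Toy data lying exactly on the line y = 1 + 2x
a, b = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # 1.0 2.0
```

Because the toy points lie exactly on a line, the function recovers the intercept 1 and slope 2 without error; real data will leave residuals around the fitted line.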

12. a is the estimate of the population Y intercept $\beta_0$, and b is the estimate of the population
slope coefficient $\beta_1$.

13. Interpretation of the Slope and the Intercept

$\beta_0$ is the average value of Y when the value of X is zero.
$\beta_1$ measures the change in the average value of Y as a result of a one-unit change
in X.
a is the estimated average value of Y when the value of X is zero.
b is the estimated change in the average value of Y as a result of a one-unit
change in X.

14. Example: Given the data in Table 10.1.1, do the following:
a. Find the equation of the regression line.
b. Draw the scatter diagram.
c. Find the point estimate of Y when x = 113.

Table 10.1.1: IQ Scores and Math 15 Midterm Scores of 12 College Students

Student Number    IQ (X)    Math 15 Score (Y)
1                 110       50
2                 112       56
3                 118       52
4                 119       59
5                 122       61
6                 125       53
7                 127       61
8                 130       58
9                 132       65
10                134       59
11                136       64
12                138       68

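Solution:

n = 12

$$\sum_{i=1}^{12} x_i = 110 + 112 + \cdots + 138 = 1{,}503, \qquad \sum_{i=1}^{12} x_i^2 = 110^2 + 112^2 + \cdots + 138^2 = 189{,}187$$

$$\sum_{i=1}^{12} y_i = 50 + 56 + \cdots + 68 = 706, \qquad \sum_{i=1}^{12} y_i^2 = 50^2 + 56^2 + \cdots + 68^2 = 41{,}862$$

$$\sum_{i=1}^{12} x_i y_i = 110(50) + 112(56) + \cdots + 138(68) = 88{,}857$$

$$\bar{x} = 125.25, \qquad \bar{y} = 58.833$$

$$b = \frac{12(88{,}857) - (1{,}503)(706)}{12(189{,}187) - (1{,}503)^2} = \frac{5{,}166}{11{,}235} = 0.4598$$

$$a = \bar{y} - b\bar{x} = 58.8333 - 0.459813(125.25) = 1.2417$$

a. $\hat{Y} = 1.2417 + 0.4598X$

b. Scatter plot: [Figure: scatter plot of SCORE (vertical axis, ticks from 40 to 70) against IQ (horizontal axis, ticks from 100 to 140).]

c. $\hat{Y} = 1.2417 + 0.4598(113) = 53.20$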
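
The arithmetic in this solution can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the data are the 12 pairs from Table 10.1.1.

```python
# IQ (X) and Math 15 midterm score (Y) for the 12 students in Table 10.1.1
iq = [110, 112, 118, 119, 122, 125, 127, 130, 132, 134, 136, 138]
score = [50, 56, 52, 59, 61, 53, 61, 58, 65, 59, 64, 68]

n = len(iq)
sum_x = sum(iq)
sum_x2 = sum(x * x for x in iq)
sum_y = sum(score)
sum_y2 = sum(y * y for y in score)
sum_xy = sum(x * y for x, y in zip(iq, score))

# Least squares slope and intercept from the raw sums
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * (sum_x / n)

# Point estimate of Y at x = 113
y_hat_113 = a + b * 113

print(sum_x, sum_x2, sum_y, sum_y2, sum_xy)           # 1503 189187 706 41862 88857
print(round(b, 4), round(a, 4), round(y_hat_113, 2))  # 0.4598 1.2417 53.2
```

Carrying the unrounded slope through to the intercept avoids the small drift that appears when b is rounded to four decimals before multiplying by the mean of X.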

15. Inference about the slope ($\beta_1$): t Test

To test whether there is a significant linear dependency of Y on X, we have to show
that $\beta_1 \neq 0$.
To test the null hypothesis $H_0$ that $\beta_1 = 0$ (no linear dependency) against a suitable
alternative, we use the t-distribution with n - 2 degrees of freedom to establish the critical
region and then base our decision on the value of

$$t = \frac{b - \beta_1}{s_b} = \frac{s_x\sqrt{n-1}\,(b - \beta_1)}{s_{yx}}$$

where

$$s_b = \frac{s_{yx}}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}} \quad\text{and}\quad s_{yx}^2 = \frac{n-1}{n-2}\left(s_y^2 - b^2 s_x^2\right)$$

Here $s_x^2$ and $s_y^2$ are the sample variances of X and Y.
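
A minimal Python sketch of the slope test statistic follows, assuming the usual definitions (the sample variances of X and Y use the divisor n - 1). The function name `slope_t_statistic` is hypothetical, and the data are the 12 pairs from Table 10.1.1.

```python
from math import sqrt
from statistics import mean, variance


def slope_t_statistic(xs, ys, beta1=0.0):
    """Return (b, s_b, t) for testing H0: slope = beta1 in simple linear regression."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx  # deviation form of the slope, algebraically equal to the raw-sum form
    # Standard error of estimate: s_yx^2 = (n-1)/(n-2) * (s_y^2 - b^2 * s_x^2)
    s_yx = sqrt((n - 1) / (n - 2) * (variance(ys) - b ** 2 * variance(xs)))
    s_b = s_yx / sqrt(sxx)  # standard error of the slope
    t = (b - beta1) / s_b
    return b, s_b, t


iq = [110, 112, 118, 119, 122, 125, 127, 130, 132, 134, 136, 138]
score = [50, 56, 52, 59, 61, 53, 61, 58, 65, 59, 64, 68]

b, s_b, t = slope_t_statistic(iq, score)
print(round(b, 4), round(s_b, 4), round(t, 3))  # 0.4598 0.1168 3.937
```

For the example data these values agree with the IQ Score row of the PHStat2 output quoted in item 16 (coefficient, standard error, and t Stat).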

16. Is there a linear dependency of Y on X in the example above? Test at the 0.05 level of
significance.
Solution: Step 1. $H_0: \beta_1 = 0$
$H_1: \beta_1 > 0$
Step 2. $\alpha = 0.05$
Step 3. Appropriate test statistic: t test
Step 4. Reject $H_0$ if $t_c > t_{0.05,\,df}$ with df = n - 2 = 10, that is, $t_c > 1.812$
Using the PHStat2 output for succeeding steps, we have

             Coefficients    Standard Error    t Stat         P-value
Intercept    1.241744548     14.66505476       0.084673707    0.934191994
IQ Score     0.459813084     0.116796188       3.936884361    0.002788923

Since $t_c = 3.937 > 1.812$, reject $H_0$. Thus, $\beta_1 > 0$, which means that there is a significant
positive linear dependency of Math 15 midterm scores on IQ score.

17. Correlation analysis attempts to measure the strength of the relationship between two random
variables by means of a single number called the correlation coefficient. It is concerned only
with the strength of the relationship; no causal effect is implied.

18. The Pearson correlation coefficient ($\rho$) measures the strength of the linear relationship
between two variables X and Y. The estimated sample correlation coefficient, denoted by
r, is given by:

$$r = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sqrt{\left[n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right]\left[n\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2\right]}}$$

where n is the sample size.

19. Sample Observations for Various r Values

[Figure: five scatter plots of Y against X, one for each of r = -1, r = -0.6, r = 0, r = 0.6, and r = 1.]

20. Features of $\rho$ and r

- unit free
- ranges from -1 to 1
- the closer to -1 the stronger the negative linear relationship
- the closer to 1 the stronger the positive linear relationship
- the closer to 0, the weaker the linear relationship
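
The sample correlation coefficient of item 18 and some of the features listed above can be checked numerically. The sketch below is illustrative only: `pearson_r` is a hypothetical helper name, the rescaling constants are arbitrary, and the data are the 12 pairs from Table 10.1.1.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample correlation coefficient computed from the raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


iq = [110, 112, 118, 119, 122, 125, 127, 130, 132, 134, 136, 138]
score = [50, 56, 52, 59, 61, 53, 61, 58, 65, 59, 64, 68]

r = pearson_r(iq, score)
print(round(r, 4))  # 0.7796

# Unit free: changing the units of X and Y (positive rescaling, shifting) leaves r unchanged
r_scaled = pearson_r([x / 100 for x in iq], [10 * y + 3 for y in score])
print(abs(r_scaled - r) < 1e-9)  # True

# Points exactly on a rising or falling line give the extreme values
print(pearson_r([1, 2, 3], [2, 4, 6]))  # 1.0
print(pearson_r([1, 2, 3], [6, 4, 2]))  # -1.0
```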

21. The sample coefficient of determination, $r^2$, is the proportion of the total variation
in the values of variable Y that can be accounted for or explained by the linear relationship
with the values of the variable X.

22. Example: For the example above, find the sample correlation coefficient and the sample
coefficient of determination, and interpret the results.
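
One way to work this example numerically is sketched below; it is not the handout's own solution, and the variable names are illustrative. It computes r for the Table 10.1.1 data and obtains the coefficient of determination in two ways: by squaring r, and as the share of the total variation in Y that the fitted line explains.

```python
from math import sqrt

iq = [110, 112, 118, 119, 122, 125, 127, 130, 132, 134, 136, 138]
score = [50, 56, 52, 59, 61, 53, 61, 58, 65, 59, 64, 68]

n = len(iq)
x_bar = sum(iq) / n
y_bar = sum(score) / n

# Sums of squared deviations and cross-products
sxx = sum((x - x_bar) ** 2 for x in iq)
syy = sum((y - y_bar) ** 2 for y in score)  # total variation in Y
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(iq, score))

# Fitted line and the variation it leaves unexplained
b = sxy / sxx
a = y_bar - b * x_bar
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(iq, score))

r = sxy / sqrt(sxx * syy)
r_squared = r ** 2
explained_share = 1 - sse / syy  # equals r^2 for a least squares line

print(round(r, 4), round(r_squared, 4), round(explained_share, 4))  # 0.7796 0.6078 0.6078
```

Read this way, r = 0.7796 indicates a fairly strong positive linear relationship, and about 60.8% of the variation in Math 15 midterm scores is accounted for by the linear relationship with IQ.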

23. Test for a Linear Relationship

Hypotheses: $H_0: \rho = 0$ (no correlation)
$H_1: \rho \neq 0$ (correlation)

Test statistic:

$$t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

24. Using the above example, is there evidence of a linear relationship between the students'
Math 15 midterm scores and IQ scores at the 0.05 level of significance?

Solution: Step 1. $H_0: \rho = 0$ (no association)
$H_1: \rho \neq 0$ (association)
Step 2. $\alpha = 0.05$
Step 3. Appropriate test statistic: t test
Step 4. Critical region: Reject $H_0$ if $t_c > t_{\alpha/2,\,n-2}$ or $t_c < -t_{\alpha/2,\,n-2}$,
that is, if $t_c > t_{0.025,\,10}$ or $t_c < -t_{0.025,\,10}$,
or equivalently if $t_c > 2.228$ or $t_c < -2.228$.
Step 5. Computation:

$$t_c = \frac{0.7796 - 0}{\sqrt{\dfrac{1 - 0.7796^2}{12 - 2}}} = 3.94$$

Step 6. Reject $H_0$ since $t_c > 2.228$.
Step 7. There is sufficient sample evidence of a significant
linear relationship between students' IQ scores and their Math 15
midterm scores.
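
The test statistic in Step 5 can be evaluated with a short Python sketch. The helper name `correlation_t_statistic` is hypothetical, and the critical value 2.228 is taken from the text (a t-table value for 10 degrees of freedom), not computed by the code.

```python
from math import sqrt


def correlation_t_statistic(r, n, rho0=0.0):
    """t statistic for testing H0: rho = rho0, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return (r - rho0) / sqrt((1 - r ** 2) / (n - 2))


t_c = correlation_t_statistic(0.7796, 12)
critical = 2.228  # t(0.025, 10), quoted from the text

print(round(t_c, 2))        # 3.94
print(abs(t_c) > critical)  # True, so H0 is rejected
```

In simple linear regression this statistic is algebraically the same as the t statistic for the slope, which is why it matches the t Stat of 3.937 for IQ Score up to the rounding of r.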

10.2. Problems/Exercises

I. True/False. Write True if the statement is true and False if otherwise.

1. Once you have computed the linear regression line equation, the intercept is completely
determined.
2. The slope of the regression line can be a negative value.
3. If the Y − intercept is –2.5, the X − score must be 1.00.
4. All things being equal, the higher the correlation, the more accurate the prediction.
5. In the regression equation $\hat{Y} = a + bX$, Y is used to predict the value of X.
6. A direct proportion line indicates a positive correlation.
7. The correlation value ranges from –1 to +1.
8. A correlation of +0.45 will have the same standard deviation as –0.45.
9. When the value of r = 1, it denotes a perfect positive correlation.
10. The sum of all the errors in the regression line will always add to zero.
11. When the correlation coefficient r is squared, it is called the coefficient of
determination.
12. If the correlation of two variables is close to zero, it indicates that no relationship exists
between the two variables.
13. Testing the significance of r using the t-distribution depends only on the variables
X and Y.

II. Solve what is indicated in the problem. Show your solutions legibly.

1. The Kryplium Junior School Board is trying to anticipate building needs on the basis of
past student enrollment. From previous years, they have collected and recorded data
on enrollment and the community population. The data are presented below:

Year    Enrollment    Community Population
1993    740           20,050
1994    750           20,940
1995    772           22,050
1996    792           23,160
1997    810           24,310
1998    825           25,540
1999    890           26,830

Questions:

a. Derive the regression line equation for:

1. Predicting enrollment from the community population.
2. Predicting community population from year.
3. Predicting enrollment from year.

b. Using the prediction equations in (a), solve for:

1. What is the predicted enrollment in the year 2003?
2. What is the predicted population in the year 2003?

3. Give your best estimate of the enrollment when the community population
was 18,000.
4. Give your best estimate for the year in which the population was 18,000.

c. Solve for the correlation coefficients in (a).

d. Test the correlation coefficient in letter c and interpret your results.

2. Listed below are the IQ scores (X) and the final grades (Y) of 10 students.
X 90 90 100 100 100 110 110 115 120 120
Y 2.0 3.0 2.5 3.0 3.5 3.0 4.0 3.5 3.5 4.0
Answer the following:
a. Draw a scatter plot.
b. Solve for r and $r^2$.
c. Test $H_0: \beta_1 = 0$ using the 0.05 level of significance.
d. Test $H_0: \rho = 0$ using the 0.05 level of significance.
e. Find the linear regression line equation.

3. Listed below are the undergraduate grade point averages, GPA (X), and first-semester
graduate GPAs (Y) of 10 senior students.
X 90 90 100 100 100 110 110 115 120 120
Y 2.0 3.0 2.5 3.0 3.5 3.0 4.0 3.5 3.5 4.0
Answer the following:
a. Draw a scatter plot.
b. Solve for r and $r^2$.
c. Test $H_0: \beta_1 = 0$ using the 0.05 level of significance.
d. Test $H_0: \rho = 0$ using the 0.05 level of significance.
e. What is the estimated graduate GPA for a student whose undergraduate GPA
is 3.5?
