Correlation Notes
etc.
more variables.
Types of correlation
direction.
Chennai - independent
Scatter Diagram
forms.
relatively strong.
moment correlation.
Symbolically,
$$ r = \frac{\frac{1}{n-1}\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\frac{1}{n-1}\sum (x-\bar{x})^2 \cdot \frac{1}{n-1}\sum (y-\bar{y})^2}} $$
Where:
2. Calculate the means (averages) x̅ for the x-variable and ȳ for the
y-variable.
3. For the x-variable, subtract the mean from each value of the x-
variable (let’s call this new variable “a”). Do the same for the y-
the formula).
6. Find the square root of the value obtained in the previous step
$$ r = \frac{\operatorname{cov}(x,y)}{\sqrt{\operatorname{var}(x) \times \operatorname{var}(y)}} $$
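The covariance form of r can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `sample_cov` and `pearson_r` are made up for this example, and the n - 1 divisor is assumed to match the symbolic definition earlier in these notes. The data are the X and Y values from the worked problem in these notes.

```python
from math import sqrt

def sample_cov(x, y):
    """Sample covariance with the n - 1 divisor (hypothetical helper)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def pearson_r(x, y):
    """r = cov(x, y) / sqrt(var(x) * var(y)), where var(x) = cov(x, x)."""
    return sample_cov(x, y) / sqrt(sample_cov(x, x) * sample_cov(y, y))

# X and Y values from the worked problem in these notes
x = [39, 65, 62, 90, 82, 75, 25, 98, 36, 78]
y = [47, 53, 58, 86, 62, 68, 60, 91, 51, 84]

r = pearson_r(x, y)
print(round(r, 4))  # 0.7804
```

The 1/(n - 1) factors in the numerator and denominator cancel, which is why the same r is obtained from the sums of squares and products alone.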
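$$ r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}} $$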
$$ r = \frac{SP(x,y)}{\sqrt{SS(x) \times SS(y)}} $$
This correlation coefficient r is known as Pearson's product-moment correlation coefficient.
SS(Y)
$$ r = \frac{SP(XY)}{\sqrt{SS(X)\,SS(Y)}} $$
pair (independent)
Problem:
Compute Pearson's coefficient of correlation between plant height (cm)
X 39 65 62 90 82 75 25 98 36 78
Y 47 53 58 86 62 68 60 91 51 84
          X     Y   (x-65)   (y-66)   (x-65)^2   (y-66)^2   (x-65)(y-66)
         39    47      -26      -19        676        361            494
         65    53        0      -13          0        169              0
         62    58       -3       -8          9         64             24
         90    86       25       20        625        400            500
         82    62       17       -4        289         16            -68
         75    68       10        2        100          4             20
         25    60      -40       -6       1600         36            240
         98    91       33       25       1089        625            825
         36    51      -29      -15        841        225            435
         78    84       13       18        169        324            234
Total   650   660                         5398       2224           2704
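As a cross-check on the deviation table, the following Python sketch recomputes the deviation columns and their totals from the raw X and Y values. The variable names (`ss_x`, `ss_y`, `sp_xy`) are chosen here for illustration, following the SS and SP notation of these notes.

```python
from math import sqrt

x = [39, 65, 62, 90, 82, 75, 25, 98, 36, 78]
y = [47, 53, 58, 86, 62, 68, 60, 91, 51, 84]

n = len(x)
mean_x = sum(x) / n  # 650 / 10 = 65
mean_y = sum(y) / n  # 660 / 10 = 66

# Deviations from the means: the (x-65) and (y-66) columns
dx = [xi - mean_x for xi in x]
dy = [yi - mean_y for yi in y]

ss_x = sum(d * d for d in dx)               # total of the (x-65)^2 column
ss_y = sum(d * d for d in dy)               # total of the (y-66)^2 column
sp_xy = sum(a * b for a, b in zip(dx, dy))  # total of the (x-65)(y-66) column

r = sp_xy / sqrt(ss_x * ss_y)
print(ss_x, ss_y, sp_xy, round(r, 4))
```

The three totals should agree with the last row of the table (5398, 2224 and 2704), giving r of about 0.7804.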
n = 10
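$$ r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}} $$

$$ = \frac{45604 - \frac{(650)(660)}{10}}{\sqrt{\left(47648 - \frac{(650)^2}{10}\right)\left(45784 - \frac{(660)^2}{10}\right)}} = \frac{45604 - 42900}{(73.47)(47.16)} = 0.7804 $$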
The correlation coefficient is positive (r = 0.7804), so X and Y are positively correlated.
Both methods give the same result.
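The agreement between the two methods can be checked numerically. The sketch below computes r once from the raw sums (Σx, Σy, Σxy, Σx², Σy²) and once from deviations about the means; all variable names are illustrative choices, not part of the original notes.

```python
from math import sqrt, isclose

x = [39, 65, 62, 90, 82, 75, 25, 98, 36, 78]
y = [47, 53, 58, 86, 62, 68, 60, 91, 51, 84]
n = len(x)

# Method 1: raw sums (computational formula)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)
sp = sum_xy - sum_x * sum_y / n
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n
r_raw = sp / sqrt(ss_x * ss_y)

# Method 2: deviations from the means
mean_x, mean_y = sum_x / n, sum_y / n
sp_dev = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
ss_x_dev = sum((a - mean_x) ** 2 for a in x)
ss_y_dev = sum((b - mean_y) ** 2 for b in y)
r_dev = sp_dev / sqrt(ss_x_dev * ss_y_dev)

print(sum_xy, sum_x2, sum_y2)            # 45604 47648 45784
print(round(r_raw, 4), round(r_dev, 4))  # 0.7804 0.7804
print(isclose(r_raw, r_dev))             # True
```

The raw-sum form avoids computing each deviation, which is convenient by hand; with very large values a deviation-based computation is usually the numerically safer choice.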
          Y       X   u = y-5.13   v = x-79.41      u^2      v^2      u*v
       5.22    94.2         0.09         14.79   0.0081   218.74   1.3311
       8.13    69.3         3.00        -10.11   9.0000   102.21  -30.33
       6.52   114.3         1.39         34.89   1.9321   1217.3   48.497
       4.16    83.3        -0.97          3.89   0.9409   15.132   -3.773
       8.98    85.4         3.85          5.99   14.823    35.88   23.062
       3.05    68.1        -2.08        -11.31   4.3264   127.92   23.525
       3.49    50.7        -1.64        -28.71   2.6896   824.26   47.084
       5.40    96.2         0.27         16.79   0.0729    281.9   4.5333
       2.39    76.1        -2.74         -3.31   7.5076   10.956   9.0694
       2.71    52.0        -2.42        -27.41   5.8564   751.31   66.332
       3.97    82.1        -1.16          2.69   1.3456   7.2361   -3.12
       7.56    81.3         2.43          1.89   5.9049   3.5721   4.5927
Total 61.58   953.0         0.02          0.08   54.407   3596.4   190.8

r = 0.4313
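The totals and the coefficient for this second data set can be reproduced with a short Python sketch. It assumes, as the table header indicates, that the deviations are taken from the rounded means 5.13 and 79.41; the variable names are illustrative.

```python
from math import sqrt

# Y and X columns as listed in the table
y = [5.22, 8.13, 6.52, 4.16, 8.98, 3.05, 3.49, 5.40, 2.39, 2.71, 3.97, 7.56]
x = [94.2, 69.3, 114.3, 83.3, 85.4, 68.1, 50.7, 96.2, 76.1, 52.0, 82.1, 81.3]

# Rounded means used in the table (61.58 / 12 and 953 / 12)
mean_y, mean_x = 5.13, 79.41

u = [yi - mean_y for yi in y]  # deviations of Y
v = [xi - mean_x for xi in x]  # deviations of X

sum_u2 = sum(a * a for a in u)
sum_v2 = sum(b * b for b in v)
sum_uv = sum(a * b for a, b in zip(u, v))

r = sum_uv / sqrt(sum_u2 * sum_v2)
print(round(r, 4))  # 0.4313
```

Because the means are rounded, the deviation columns sum to 0.02 and 0.08 rather than exactly zero; the effect on r is negligible at four decimal places.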
     Y       X     y-3    x-20
  5.22    94.2    2.22    74.2
  8.13    69.3    5.13    49.3
  6.52   114.3    3.52    94.3
  4.16    83.3    1.16    63.3
  8.98    85.4    5.98    65.4
  3.05    68.1    0.05    48.1
  3.49    50.7    0.49    30.7
  5.40    96.2    2.40    76.2
  2.39    76.1   -0.61    56.1
  2.71    52.0   -0.29    32.0
  3.97    82.1    0.97    62.1
  7.56    81.3    4.56    61.3

corr = 0.4313
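The table above subtracts 3 from every Y value and 20 from every X value and still reports a correlation of 0.4313. A minimal Python sketch of that check follows; `pearson_r` is a helper written for this example, not a library function.

```python
from math import sqrt, isclose

def pearson_r(x, y):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

y = [5.22, 8.13, 6.52, 4.16, 8.98, 3.05, 3.49, 5.40, 2.39, 2.71, 3.97, 7.56]
x = [94.2, 69.3, 114.3, 83.3, 85.4, 68.1, 50.7, 96.2, 76.1, 52.0, 82.1, 81.3]

# Change of origin: subtract a constant from each variable
y_shifted = [v - 3 for v in y]
x_shifted = [v - 20 for v in x]

r_original = pearson_r(x, y)
r_shifted = pearson_r(x_shifted, y_shifted)
print(round(r_original, 4), round(r_shifted, 4))  # 0.4313 0.4313
```

Subtracting a constant shifts the mean by the same constant, so every deviation from the mean, and therefore r, is unchanged.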
Properties
Consider
X 2 4 6 8 10
Y 5 8 11 14 17
For an increase in X value of 2 there is an increase of 3 units
in Y. The rate of change is constant and we say they are
linearly related. The correlation calculated for such data is
called simple linear correlation.
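For the X and Y values above, a short Python sketch can confirm the constant rate of change and show that a perfectly linear relation gives r = 1. The value r = 1 is not stated in the notes; it follows from the formula, and the variable names here are illustrative.

```python
from math import sqrt, isclose

x = [2, 4, 6, 8, 10]
y = [5, 8, 11, 14, 17]

# Successive differences: constant steps mean a linear relation
steps_x = [b - a for a, b in zip(x, x[1:])]
steps_y = [b - a for a, b in zip(y, y[1:])]
print(steps_x, steps_y)  # [2, 2, 2, 2] [3, 3, 3, 3]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
ss_x = sum((a - mean_x) ** 2 for a in x)
ss_y = sum((b - mean_y) ** 2 for b in y)

r = sp / sqrt(ss_x * ss_y)
print(isclose(r, 1.0))  # True
```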
Rank correlation
$$ r_s = 1 - \frac{6\sum d^2}{n(n^2-1)} $$

$$ r_s = 1 - \frac{6 \times 52}{10(10^2-1)} $$

$$ r_s = 1 - \frac{312}{990} = 0.68 $$
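The arithmetic above can be reproduced with a one-line function. This is a sketch only; the name `spearman_from_d2` is made up for this example, and the formula assumes there are no tied ranks.

```python
def spearman_from_d2(sum_d2, n):
    """Rank correlation r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), no ties assumed."""
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Values used in the worked calculation: sum of d^2 = 52, n = 10
rs = spearman_from_d2(52, 10)
print(round(rs, 2))  # 0.68
```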
Judge A 40, 48, 25, 32, 50, 41, 15, 18, 27, 36, 43, 49
Judge B 35, 40, 22, 25, 47, 38, 17, 19, 26, 33, 41, 39
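One way to apply the rank correlation formula to the two judges' scores is sketched below. It assumes rank 1 goes to the highest score and that there are no tied scores (true for this data); `ranks_desc` and `spearman_rs` are hypothetical helper names, not library functions.

```python
def ranks_desc(scores):
    """Rank 1 = highest score. Assumes no ties (illustrative helper)."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]

def spearman_rs(a, b):
    """r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the rank difference."""
    n = len(a)
    ra, rb = ranks_desc(a), ranks_desc(b)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(ra, rb))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

judge_a = [40, 48, 25, 32, 50, 41, 15, 18, 27, 36, 43, 49]
judge_b = [35, 40, 22, 25, 47, 38, 17, 19, 26, 33, 41, 39]

sum_d2, rs = spearman_rs(judge_a, judge_b)
print(sum_d2, round(rs, 3))  # 10 0.965
```

Ranking from lowest to highest instead would give the same result, as long as both judges are ranked in the same direction.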