
UNIT - V

MULTIVARIATE
ANALYSIS
BY
E.MATHIVADHANA, M.Sc.,M.PHIL.
ASSISTANT PROFESSOR
DEPARTMENT OF MATHEMATICS
IFETCE/H&S-II/MATHS/MATHIVADHANA/I YEAR/M.E.(CSE)/I-SEM/MA7155/APPLIED PROBABILITY AND STATISTICS/UNIT–V/PPT/VER1.1
SYLLABUS

Random Vectors and Matrices - Mean vectors and Covariance matrices - Multivariate Normal density and its properties - Principal components - Population principal components - Principal components from standardized variables.

Random Vectors & Matrices
A random vector is a vector whose elements are random variables.
Similarly, a random matrix is a matrix whose elements are random variables.

Expected Value of a Random Matrix
The expected value of a random matrix (or vector) is the matrix
(vector) consisting of the expected values of each of the elements.

Mean Vectors

Covariance Matrices

Covariance Matrix
⚫ The covariance matrix captures the variance and linear
correlation in multivariate/multidimensional data.
⚫ If the data form an n × p matrix, the covariance matrix is a p × p
square matrix.
⚫ Think of n as the number of data instances (rows) and p as
the number of attributes (columns).
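As a sketch of this definition (pure Python, no libraries; the 4 × 2 data matrix and the population divisor n are my own illustrative assumptions — the slides do not state a divisor):

```python
# Build the p x p covariance matrix from an n x p data matrix.
# Divisor n (population form) is assumed; use n - 1 for the sample form.

def covariance_matrix(data):
    n = len(data)          # number of data instances (rows)
    p = len(data[0])       # number of attributes (columns)
    means = [sum(row[j] for row in data) / n for j in range(p)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / n
             for j in range(p)]
            for i in range(p)]

# A 4 x 2 data matrix yields a 2 x 2 covariance matrix.
S = covariance_matrix([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]])
```

The result is symmetric, with the variances of the two columns on the diagonal.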

Covariance
⚫ The covariance of the return is

⚫ It is always true that

⚫ i.

⚫ ii.

Mean Matrix

Covariance Matrix

Covariance Matrix

Example
Find the mean and covariance matrix for the two random variables X1 and X2
with the given joint probability function P12(x1, x2).

Soln:
Marginal Distribution of X1

X1      -1    0    1
P(X1)  0.3  0.3  0.4

Example

Marginal Distribution of X2

X2 0 1
P(X2) 0.8 0.2

Example

Example

Sample Covariance
⚫ Example. The table provides the returns on three assets
over three years

Asset  Year 1  Year 2  Year 3
A        10      12      11
B        10      14      12
C        12       6       9

⚫ Mean returns

Sample Covariance
⚫ Covariance between A and B is

⚫ Covariance between A and C is

Variance-Covariance Matrix
⚫ Covariance between B and C is

⚫ The matrix is symmetric
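The whole asset example can be checked numerically; a hedged sketch (the population divisor n is an assumption of mine — with the sample divisor n − 1 every covariance below is simply multiplied by 3/2):

```python
# Mean returns and pairwise covariances for the three assets, divisor n assumed.
returns = {"A": [10, 12, 11], "B": [10, 14, 12], "C": [12, 6, 9]}

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

names = ["A", "B", "C"]
# The full variance-covariance matrix; it comes out symmetric.
V = [[cov(returns[a], returns[b]) for b in names] for a in names]
```

The symmetry cov(A, B) = cov(B, A) holds directly from the formula, since the summand is unchanged when the roles of the two assets are swapped.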

Variance-Covariance Matrix
⚫ For the example the variance-covariance matrix is

Correlation Coefficient
Let the population correlation coefficient matrix be the p × p symmetric
matrix

Standard Deviation
Let the p × p standard deviation matrix be

Then it is verified that
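A small numeric check of the relation between Σ, the standard deviation matrix V^(1/2), and the correlation matrix ρ — namely ρik = σik/(√σii √σkk) and Σ = V^(1/2) ρ V^(1/2) (the 2 × 2 covariance matrix below is illustrative, not from the slides):

```python
import math

Sigma = [[4.0, 2.0], [2.0, 9.0]]            # illustrative covariance matrix
sd = [math.sqrt(Sigma[i][i]) for i in range(2)]

# Correlation matrix: divide each covariance by both standard deviations.
rho = [[Sigma[i][k] / (sd[i] * sd[k]) for k in range(2)] for i in range(2)]

# Reassemble Sigma from rho and the standard deviations.
rebuilt = [[sd[i] * rho[i][k] * sd[k] for k in range(2)] for i in range(2)]
```

The diagonal of ρ is all ones, and the reassembled matrix matches Σ entry by entry.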

Example

Linear Combination of Random
Variables
Prove that the linear combination cʹX = aX1 + bX2 has
Mean = E(cʹX) = cʹμ
Var = Var(cʹX) = cʹΣc
where μ = E(X) and Σ = cov(X)
Soln:

The previous result can be extended to a linear combination of
p random variables:
The linear combination cʹX = c1 X1 + c2 X2 +… + cpXp has
Mean = E(cʹX) = cʹμ
Var = Var(cʹX) = cʹΣc

In general, consider q linear combinations Z = CX of the p
random variables X1, X2, …, Xp

μZ = E(Z) = E(CX) = C μX
ΣZ = cov(Z) = cov(CX) = CΣXCʹ
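A minimal sketch of the mean and variance of a single linear combination, expanding c′μ and the quadratic form c′Σc directly (all numbers are my own illustration):

```python
# Verify E(c'X) = c'mu and Var(c'X) = c' Sigma c on a small 2-variable example.
mu = [1.0, 2.0]
Sigma = [[4.0, 1.0], [1.0, 9.0]]
c = [3.0, -2.0]

mean_cX = sum(ci * mi for ci, mi in zip(c, mu))                 # c' mu
# Double sum c_i * sigma_ij * c_j; for c' = [a, b] this is
# a^2*s11 + 2ab*s12 + b^2*s22, matching the aX1 + bX2 case above.
var_cX = sum(c[i] * Sigma[i][j] * c[j] for i in range(2) for j in range(2))
```

Because Σ is positive semi-definite, the quadratic form is guaranteed non-negative, as a variance must be.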

Example

Multivariate Normal Distribution
The multivariate normal density is a generalization of the univariate
normal density to p ≥ 2 dimensions.

The p-dimensional normal density for the random vector
X′ = [X1, X2, …, Xp] has the form

Bivariate Normal Distribution
The Bivariate normal density

The Multivariate Normal Distribution
The univariate normal distribution has a generalized form in p dimensions
– the p-dimensional normal density function is

f(x) = (2π)^(-p/2) |Σ|^(-1/2) exp[ -(x − μ)′ Σ⁻¹ (x − μ) / 2 ]

where -∞ < xi < ∞, i = 1, …, p; the exponent contains the squared
generalized distance from x to μ.

This p-dimensional normal density function is denoted by X ~ Np(μ, Σ),
where μ = E(X) and Σ = cov(X).
The simplest multivariate normal distribution is the bivariate
(2-dimensional) normal distribution, which has the density function

f(x1, x2) = (2π)^(-1) |Σ|^(-1/2) exp[ -(x − μ)′ Σ⁻¹ (x − μ) / 2 ]

where -∞ < xi < ∞, i = 1, 2; the exponent is again the squared
generalized distance from x to μ.

This 2-dimensional normal density function is denoted by N2(μ, Σ), where

μ′ = [μ1, μ2] and Σ = [ σ11  σ12 ; σ12  σ22 ].
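The bivariate density can be evaluated directly in its scalar form; a minimal pure-Python sketch (the parameter values in the check are illustrative):

```python
import math

def bivariate_normal_pdf(x1, x2, mu1, mu2, s11, s22, rho):
    # Standardize each coordinate, then form the squared generalized distance
    # (z1^2 - 2*rho*z1*z2 + z2^2) / (1 - rho^2).
    z1 = (x1 - mu1) / math.sqrt(s11)
    z2 = (x2 - mu2) / math.sqrt(s22)
    q = (z1 * z1 - 2 * rho * z1 * z2 + z2 * z2) / (1 - rho * rho)
    norm = 2 * math.pi * math.sqrt(s11 * s22 * (1 - rho * rho))
    return math.exp(-q / 2) / norm

# With rho = 0 the density factors into the two univariate normal densities.
```

Setting ρ12 = 0 in the formula makes the exponent split into a sum, so the joint density is exactly the product of the two marginals — the independence case discussed later.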
We can easily find the inverse of the covariance matrix (by using
Gauss-Jordan elimination or some other technique):

Σ⁻¹ = 1/(σ11σ22 − σ12²) [ σ22  −σ12 ; −σ12  σ11 ]

Now we use the previously established relationship σ12 = ρ12 √σ11 √σ22

to establish that |Σ| = σ11σ22 − σ12² = σ11σ22(1 − ρ12²).
By substitution we can now write the squared distance as

(x − μ)′ Σ⁻¹ (x − μ) = [ z1² − 2ρ12 z1 z2 + z2² ] / (1 − ρ12²),
where z1 = (x1 − μ1)/√σ11 and z2 = (x2 − μ2)/√σ22.
which means that we can rewrite the bivariate normal probability density
function as

f(x1, x2) = 1/(2π√(σ11σ22(1 − ρ12²))) · exp{ −[ z1² − 2ρ12 z1 z2 + z2² ] / (2(1 − ρ12²)) }

Graphically, the bivariate normal probability density function looks like
this:

[Figure: bell-shaped density surface over the (X1, X2) plane with
elliptical contours.]

All points of equal density are called a contour, defined for p-dimensions
as all x such that

(x − μ)′ Σ⁻¹ (x − μ) = c²
The contours

{ x : (x − μ)′ Σ⁻¹ (x − μ) = c² }

form concentric ellipsoids centered at μ with axes ±c√λi ei,
where Σei = λiei, i = 1, …, p.

[Figure: density surface f(X1, X2) with the elliptical contour for
constant c drawn in the (X1, X2) plane.]
The general form of contours for a bivariate normal probability
distribution where the variables have equal variance (σ11 = σ22) is
relatively easy to derive:
First we need the eigenvalues of Σ — solving |Σ − λI| = 0 for
Σ = [ σ11  σ12 ; σ12  σ11 ] gives λ1 = σ11 + σ12 and λ2 = σ11 − σ12.

Next we need the eigenvectors of Σ:

e1′ = [ 1/√2, 1/√2 ] and e2′ = [ 1/√2, −1/√2 ].
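A numeric check of the equal-variance case (s and c below are illustrative values for σ11 and σ12, my own choice):

```python
import math

s, c = 2.0, 0.5                               # sigma11 = sigma22 = s, sigma12 = c
Sigma = [[s, c], [c, s]]
lam1, lam2 = s + c, s - c                     # claimed eigenvalues
e1 = [1 / math.sqrt(2), 1 / math.sqrt(2)]     # 45-degree direction
e2 = [1 / math.sqrt(2), -1 / math.sqrt(2)]    # perpendicular direction

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Sigma e1 should equal lam1 * e1, and Sigma e2 should equal lam2 * e2.
```

For positive σ12, λ1 = σ11 + σ12 is the larger eigenvalue, so the major axis of the contour ellipse indeed runs along the 45° direction e1.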
- for a positive covariance σ12, the first eigenvalue and its associated
eigenvector lie along the 45° line running through the centroid μ:

[Figure: elliptical contour for constant c of f(X1, X2) with its major
axis along the 45° line through μ.]

- for a negative covariance σ12, the second eigenvalue and its associated
eigenvector lie at right angles to the 45° line running through the
centroid μ:

[Figure: elliptical contour for constant c of f(X1, X2) with its major
axis perpendicular to the 45° line through μ.]

What do you suppose happens when the two random variables X1 and X2
are uncorrelated (i.e., ρ12 = 0)?

[Figure: the joint density factors into the product of the marginal
densities f(X1) and f(X2).]
- for a covariance σ12 of zero, the two eigenvalues are equal and the
eigenvectors can be chosen (up to sign) so that one runs along the 45°
line through the centroid μ and the other is perpendicular to it:

[Figure: circular contour for constant c of f(X1, X2) centered at μ.]

Contours also have an important probability interpretation – the solid
ellipsoid of x values satisfying

(x − μ)′ Σ⁻¹ (x − μ) ≤ χ²p(α)

has probability 1 – α, i.e.,

P[ (X − μ)′ Σ⁻¹ (X − μ) ≤ χ²p(α) ] = 1 − α

where χ²p(α) is the upper (100α)th percentile of the chi-square
distribution with p degrees of freedom.
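For p = 2 this statement can be checked by Monte Carlo, because χ²₂(α) has the closed form −2 ln α (the sampling scheme and parameter values below are my own illustration, not from the slides):

```python
import math
import random

random.seed(0)
rho, alpha = 0.6, 0.05
c2 = -2.0 * math.log(alpha)                  # chi-square(2) upper quantile

inside, trials = 0, 20000
for _ in range(trials):
    # Correlated standard bivariate normal via a Cholesky-style construction.
    u, v = random.gauss(0, 1), random.gauss(0, 1)
    x1, x2 = u, rho * u + math.sqrt(1 - rho * rho) * v
    # Squared generalized distance for unit variances and correlation rho.
    d2 = (x1 * x1 - 2 * rho * x1 * x2 + x2 * x2) / (1 - rho * rho)
    if d2 <= c2:
        inside += 1

coverage = inside / trials                   # should be close to 1 - alpha = 0.95
```

The empirical coverage lands near 0.95, matching the claimed probability of the ellipsoid.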

Bivariate Normal Distribution

Bivariate Normal Distribution

Properties

Properties

Properties

Properties

Principal Components Analysis
A. The Basic Principle
We wish to explain/summarize the underlying variance-covariance structure
of a large set of variables through a few linear combinations of these
variables. The objectives of principal components analysis are

- data reduction

- interpretation

The results of principal components analysis are often used as inputs to

- regression analysis

- cluster analysis

B. Population Principal Components
Suppose we have a population measured on p random variables X1,…,Xp.
Note that these random variables represent the p-axes of the Cartesian
coordinate system in which the population resides. Our goal is to develop a
new set of p axes (linear combinations of the original p axes) in the directions
of greatest variability:

[Figure: data scatter in the (X1, X2) plane with new axes drawn along
the directions of greatest variability.]
This is accomplished by rotating the axes.

Consider our random vector

X′ = [X1, X2, …, Xp]

with covariance matrix Σ and eigenvalues λ1 ≥ λ2 ≥ ⋯ ≥ λp ≥ 0.
We can construct p linear combinations

Yi = ai′X = ai1X1 + ai2X2 + ⋯ + aipXp, i = 1, 2, …, p.
It is easy to show that

E(Yi) = ai′μ, Var(Yi) = ai′Σai, and Cov(Yi, Yk) = ai′Σak.

The principal components are those uncorrelated linear combinations Y1,…,Yp
whose variances are as large as possible.

Thus the first principal component is the linear combination of maximum
variance, i.e., we wish to solve the nonlinear optimization problem

max a1′Σa1 subject to a1′a1 = 1

(the quadratic objective is the source of nonlinearity; the constraint
restricts attention to coefficient vectors of unit length).
The second principal component is the linear combination of maximum
variance that is uncorrelated with the first principal component, i.e., we wish
to solve the nonlinear optimization problem

max a2′Σa2 subject to a2′a2 = 1 and a2′Σa1 = 0

(the second constraint restricts the covariance with Y1 to zero).

The third principal component is the solution to the nonlinear optimization
problem

max a3′Σa3 subject to a3′a3 = 1, a3′Σa1 = 0, and a3′Σa2 = 0

(the last two constraints restrict the covariances with Y1 and Y2 to zero).
Generally, the ith principal component is the linear combination of maximum
variance that is uncorrelated with all previous principal components, i.e., we
wish to solve the nonlinear optimization problem

max ai′Σai subject to ai′ai = 1 and ai′Σak = 0 for all k < i.

We can show that, for random vector X with covariance matrix Σ and
eigenvalues λ1 ≥ λ2 ≥ ⋯ ≥ λp ≥ 0, the ith principal component is given by

Yi = ei′X = ei1X1 + ei2X2 + ⋯ + eipXp, with Var(Yi) = λi and
Cov(Yi, Yk) = 0 for i ≠ k.

Note that the principal components are not unique if some eigenvalues are
equal.
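One way to sketch the first principal component computationally is power iteration on Σ — an assumption of mine for illustration, not the slides' procedure, and it presumes a strictly dominant eigenvalue (the 2 × 2 matrix here is illustrative):

```python
import math

def power_iteration(M, iters=200):
    # Repeatedly apply M and renormalize; converges to the top eigenvector
    # when the largest eigenvalue is strictly dominant.
    v = [1.0] * len(M)
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' M v gives the associated eigenvalue.
    lam = sum(v[i] * sum(M[i][j] * v[j] for j in range(len(v)))
              for i in range(len(M)))
    return lam, v

Sigma = [[2.0, 0.5], [0.5, 2.0]]
lam1, e1 = power_iteration(Sigma)
# Y1 = e1' X then has variance lam1, the largest eigenvalue of Sigma.
```

For this matrix the largest eigenvalue is 2.5 with eigenvector along the 45° direction, consistent with the equal-variance discussion earlier.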

We can also show for random vector X with covariance matrix Σ and
eigenvalue-eigenvector pairs (λ1 , e1), …, (λp , ep), where λ1 ≥ λ2 ≥ ⋯ ≥ λp,

σ11 + σ22 + ⋯ + σpp = λ1 + λ2 + ⋯ + λp,

so we can assess how well a subset of the principal components Yi
summarizes the original random variables Xi – one common method of doing
so is

proportion of total population variance due to the kth principal
component = λk / (λ1 + λ2 + ⋯ + λp), k = 1, 2, …, p.

If a large proportion of the total population variance can be attributed to
relatively few principal components, we can replace the original p variables
with these principal components without loss of much information!
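A sketch of the proportion-of-variance computation (the eigenvalues below are illustrative, not those of the slide example):

```python
# Proportion of total population variance explained by each principal
# component: lambda_k / (lambda_1 + ... + lambda_p).
eigenvalues = [6.0, 3.0, 1.0]                 # lambda_1 >= lambda_2 >= lambda_3
total = sum(eigenvalues)
proportions = [lam / total for lam in eigenvalues]
cumulative = [sum(proportions[: k + 1]) for k in range(len(proportions))]
```

Here the first two components already account for 90% of the total variance, so dropping the third loses little information.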
We can also easily find the correlations between the original random
variables Xk and the principal components Yi:

ρ(Yi, Xk) = eik √λi / √σkk.

These values are often used in interpreting the principal components Yi.

Example: Suppose we have the following population of four observations
made on three random variables X1, X2, and X3:

Find the three population principal components Y1, Y2, and Y3:

First we need the covariance matrix Σ:

and the corresponding eigenvalue-eigenvector pairs:

so the principal components are:

Note that

and the proportion of total population variance due to each principal
component is

Note that the third principal component is relatively irrelevant!

Next we obtain the correlations between the original random variables Xi
and the principal components Yi:

We can display these results in a correlation matrix:

Here we can easily see that

- the first principal component (Y1) is a mixture of all three random variables
(X1, X2, and X3)

- the second principal component (Y2) is a trade-off between X1 and X3


- the third principal component (Y3) is a residual of X1

When the principal components are derived from an X ~ Np(μ,Σ) distributed
population, the density of X is constant on the μ-centered ellipsoids

(x − μ)′ Σ⁻¹ (x − μ) = c²

which have axes ±c√λi ei, i = 1, …, p,

where (λi, ei) are the eigenvalue-eigenvector pairs of Σ.

We can set μ = 0 w.l.o.g. – we can then write

c² = x′Σ⁻¹x

where the yi = ei′x are the principal components of x.

Setting yi = ei′x and substituting into the previous expression yields

c² = y1²/λ1 + y2²/λ2 + ⋯ + yp²/λp

which defines an ellipsoid (note that λi > 0 ∀ i) in a coordinate system with
axes y1,…,yp lying in the directions of e1,…,ep, respectively.
The major axis lies in the direction determined by the eigenvector e1
associated with the largest eigenvalue λ1 - the remaining minor axes lie in
the directions determined by the other eigenvectors.

Example: For the principal components derived from the following
population of four observations made on three random variables X1, X2, and
X3:

plot the major and minor axes.

We will need the centroid μ:

The direction of the major axis is given by

while the directions of the two minor axes are given by

We first graph the centroid:

[Figure: the centroid (3.0, 10.0, 15.0) plotted in the (X1, X2, X3)
coordinate system.]
…then use the first eigenvector to find a second point on the first principal
axis:

[Figure: the centroid and the second point plotted in the (X1, X2, X3)
coordinate system.]

The line connecting these two points is the Y1 axis.
…then do the same thing with the second eigenvector:

[Figure: the Y1 axis and a second pair of points plotted in the
(X1, X2, X3) coordinate system.]

The line connecting these two points is the Y2 axis.
…and do the same thing with the third eigenvector:

[Figure: the Y1 and Y2 axes and a third pair of points plotted in the
(X1, X2, X3) coordinate system.]

The line connecting these two points is the Y3 axis.
What we have done is a rotation…

[Figure: the rotated axes Y1, Y2, Y3 overlaid on the original X1, X2, X3
axes.]
…and a translation in p = 3 dimensions.

[Figure: the translated and rotated Y1, Y2, Y3 axes in the (X1, X2, X3)
coordinate system.]

Note that the rotated axes remain orthogonal!
Note that we can also construct principal components for the standardized
variables Zi:

Zi = (Xi − μi) / √σii, i = 1, 2, …, p,

which in matrix notation is

Z = (V^(1/2))⁻¹ (X − μ)

where V^(1/2) is the diagonal standard deviation matrix.

Obviously E(Z) = 0 and cov(Z) = (V^(1/2))⁻¹ Σ (V^(1/2))⁻¹ = ρ.
This suggests that the principal components for the standardized variables
Zi may be obtained from the eigenvectors of the correlation matrix ρ! The
operations are analogous to those used in conjunction with the covariance
matrix.
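A sketch of why this works: standardizing the data (subtract each mean, divide by each standard deviation) turns its covariance matrix into its correlation matrix, so correlation-based PCA is just covariance-based PCA on the Zi (illustrative data; the population divisor n is my assumption):

```python
import math

data = [[1.0, 10.0], [2.0, 14.0], [3.0, 12.0], [4.0, 20.0]]
n, p = len(data), len(data[0])
means = [sum(row[j] for row in data) / n for j in range(p)]
sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / n)
       for j in range(p)]

# Standardized data: z_ij = (x_ij - mean_j) / sd_j.
z = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in data]

# Covariance matrix of the standardized data (means are now zero).
cov_z = [[sum(z[r][i] * z[r][j] for r in range(n)) / n for j in range(p)]
         for i in range(p)]
# cov_z has ones on the diagonal: it is the correlation matrix of the data.
```

Each diagonal entry comes out exactly 1 and each off-diagonal entry is a correlation, bounded by 1 in absolute value.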
We can show that, for random vector Z of standardized variables with
covariance matrix ρ and eigenvalues λ1 ≥ λ2 ≥ ⋯ ≥ λp ≥ 0, the ith principal
component is given by

Yi = ei′Z

where (λi, ei) are now the eigenvalue-eigenvector pairs of ρ.

Note again that the principal components are not unique if some eigenvalues
are equal.

We can also show for random vector Z with covariance matrix ρ and
eigenvalue-eigenvector pairs (λ1 , e1), …, (λp , ep), where λ1 ≥ λ2 ≥ ⋯ ≥ λp,

λ1 + λ2 + ⋯ + λp = p,

and we can again assess how well a subset of the principal components Yi
summarizes the original random variables Xi by using

proportion of total population variance due to the kth principal
component = λk / p, k = 1, 2, …, p.

If a large proportion of the total population variance can be attributed to
relatively few principal components, we can replace the original p variables
with these principal components without loss of much information!

Example: Suppose we have the following population of four observations
made on three random variables X1, X2, and X3:

Find the three population principal components variables Y1, Y2, and Y3 for
the standardized random variables Z1, Z2, and Z3:

We could standardize the variables X1, X2, and X3, then work with the
resulting covariance matrix Σ, but it is much easier to proceed directly with
the correlation matrix ρ:

and the corresponding eigenvalue-eigenvector pairs:

These results differ from the covariance-based principal components!

so the principal components are:

Note that

and the proportion of total population variance due to each principal
component is

Note that the third principal component is again relatively irrelevant!


Next we obtain the correlations between the original random variables Xi
and the principal components Yi:

We can display these results in a correlation matrix:

Here we can easily see that

- the first principal component (Y1) is a mixture of all three random variables
(X1, X2, and X3)

- the second principal component (Y2) is a trade-off between X1 and X3


- the third principal component (Y3) is a trade-off between X1 and X2

