
1

MULTIPLE DISCRIMINANT ANALYSIS


DR. HEMAL PANDYA
2 INTRODUCTION

• The original dichotomous discriminant analysis was developed by Sir Ronald Fisher in 1936.
• Discriminant Analysis is a Dependence technique.
• Discriminant Analysis is used to predict group membership.
• This technique is used to classify individuals/objects into one of several alternative groups on the basis of a set of predictor variables (independent variables).
• The dependent variable in discriminant analysis is categorical and on a nominal scale, whereas the independent
variables are either interval or ratio scale in nature.
• When there are two groups (categories) of dependent variable, it is a case of two group discriminant analysis.
• When there are more than two groups (categories) of dependent variable, it is a case of multiple discriminant
analysis.
3 INTRODUCTION

• Discriminant Analysis is applicable in situations in which the total sample can be divided into groups based on a non-metric dependent variable.
• Examples: male-female; high-medium-low.
• The primary objectives of multiple discriminant analysis are to understand group differences and to predict the likelihood that an entity (individual or object) will belong to a particular class or group based on several independent variables.
4 ASSUMPTIONS OF DISCRIMINANT ANALYSIS

• NO Multicollinearity
• Multivariate Normality
• Independence of Observations
• Homoscedasticity
• No Outliers
• Adequate Sample Size
• Linearity (for LDA)
5 ASSUMPTIONS OF DISCRIMINANT ANALYSIS

• The assumptions of discriminant analysis are as follows. The analysis is quite sensitive to outliers, and the size of the smallest group must be larger than the number of predictor variables.
• No multicollinearity: If one of the independent variables is very highly correlated with another, or one is a function (e.g., the sum) of other independents, then the tolerance value for that variable will approach 0 and the matrix will not have a unique discriminant solution. There must also be low multicollinearity among the independents. To the extent that independents are correlated, the standardized discriminant function coefficients will not reliably assess the relative importance of the predictor variables. Predictive power can decrease with an increased correlation between predictor variables. Logistic regression may offer an alternative to DA, as it usually involves fewer violations of assumptions.
6

Multivariate Normality: Independent variables are normal for each level of the grouping variable. It is assumed that the data (for the variables) represent a sample from a multivariate normal distribution. You can examine whether or not variables are normally distributed with histograms of frequency distributions. However, note that violations of the normality assumption are not "fatal" and the resulting significance tests are still reliable as long as non-normality is caused by skewness and not outliers (Tabachnick and Fidell 1996).

Independence: Participants are assumed to be randomly sampled, and a participant's score on one variable is assumed to be independent of scores on that variable for all other participants.

It has been suggested that discriminant analysis is relatively robust to slight violations of these assumptions, and it has also been shown that discriminant analysis may still be reliable when using dichotomous variables (where multivariate normality is often violated).
7 ASSUMPTIONS UNDERLYING DISCRIMINANT
ANALYSIS
• Sample size: Unequal sample sizes are acceptable, but the sample size of the smallest group needs to exceed the number of predictor variables. As a "rule of thumb", the smallest group should have at least 20 observations for a few (4 or 5) predictors. The maximum number of independent variables is n - 2, where n is the sample size. While this minimum may work, it is not encouraged; generally it is best to have 4 or 5 times as many observations as independent variables.
• Homogeneity of variance/covariance (homoscedasticity): The variance/covariance matrices of the predictors are the same across groups. This can be tested with Box's M statistic. It has been suggested, however, that linear discriminant analysis be used when covariances are equal, and that quadratic discriminant analysis may be used when covariances are not equal. DA is very sensitive to heterogeneity of variance-covariance matrices. Before accepting final conclusions for an important study, it is a good idea to review the within-groups variances and correlation matrices. Homoscedasticity is evaluated through scatterplots and corrected by transformation of variables.
8 ASSUMPTIONS UNDERLYING DISCRIMINANT
ANALYSIS
• No outliers: DA is highly sensitive to the inclusion of outliers. Run a test for univariate and multivariate outliers for each group, and transform or eliminate them. If one group in the study contains extreme outliers that impact the mean, they will also increase variability. Overall significance tests are based on pooled variances, that is, the average variance across all groups. Thus, the significance tests of the relatively larger means (with the large variances) would be based on the relatively smaller pooled variances, resulting erroneously in statistical significance.

• DA is fairly robust to violations of most of these assumptions, but it is highly sensitive to violations of multivariate normality and to outliers.
9 PROCESS FLOW CHART

Research Problem

STAGE 1 Select objectives:
• Evaluate group differences on a multivariate profile
• Classify observations into groups
• Identify dimensions of discrimination between groups

STAGE 2 Research design issues:
• Selection of independent variables
• Sample size considerations
• Creation of analysis and holdout samples

(To Stage 3)
10 PROCESS FLOW CHART
(From Stage 2)

STAGE 3 Assumptions:
• Normality of independent variables
• Linearity of relationships
• Lack of multicollinearity
• Equal dispersion matrices

STAGE 4 Estimation of discriminant function(s):
• Significance of discriminant function
• Determine optimal cutting score
• Specify criteria for assessing hit ratio
• Statistical significance of predictive accuracy

(To Stage 5)
11 PROCESS FLOW CHART
(From Stage 4)

STAGE 5 Interpretation of discriminant function(s)

STAGE 6 Validation of discriminant results:
• Split-sample or cross-validation
• Profiling group differences
12 EXAMPLE

• Heavy product users from light users
• Males from females
• National brand buyers from private label buyers
• Good credit risks from poor credit risks
13 OBJECTIVES

• To find the linear combinations of variables that discriminate between categories of the dependent variable in the best possible manner.
• To find out which independent variables are relatively better at discriminating between groups.
• To determine the statistical significance of the discriminant function and whether any statistical difference exists among groups in terms of the predictor variables.
• To evaluate the accuracy of classification, i.e., the percentage of cases the model is able to classify correctly.
14 DISCRIMINANT ANALYSIS MODEL

• The mathematical form of the discriminant analysis model is:

Y = b0 + b1X1 + b2X2 + b3X3 + … + bkXk

where, Y = dependent variable
bs = coefficients of the independent variables
Xs = predictor or independent variables
• Y should be a categorical variable, coded 0, 1 (or 1, 2, 3) in the manner of dummy variables.
• The Xs should be continuous.
• The coefficients should maximize the separation between the groups of the dependent variable.
15 ACCURACY OF CLASSIFICATION

• The classification of the existing data points is done using the equation, and the accuracy
of the model is determined.
• This output is given by the classification matrix (also called confusion matrix), which tells
what percentage of the existing data points is correctly classified by the model.
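A minimal sketch of this step in Python (assuming scikit-learn is available; y_actual and y_predicted are hypothetical labels, not the case-study data):

    # Build the classification (confusion) matrix and the share correctly classified.
    from sklearn.metrics import confusion_matrix, accuracy_score

    y_actual    = [1, 1, 2, 2, 1, 2]   # hypothetical true group labels
    y_predicted = [1, 2, 2, 2, 1, 2]   # hypothetical model predictions

    print(confusion_matrix(y_actual, y_predicted))  # rows = actual, columns = predicted
    print(accuracy_score(y_actual, y_predicted))    # proportion correctly classified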
16 RELATIVE IMPORTANCE OF INDEPENDENT
VARIABLE
• Suppose we have two independent variables X1 and X2.
• How do we know which one is more important in discriminating between groups?
• The coefficients of the two variables provide the answer: the larger a variable's standardized coefficient in absolute value, the greater its discriminating power.
17 PREDICTING THE GROUP MEMBERSHIP FOR A NEW
DATA POINT
• For any new data point that we want to classify into one of the groups, the coefficients of the equation are used to calculate its discriminant score Y.
• A decision rule is formulated by choosing a cutoff score, which is usually the midpoint of the mean discriminant scores of the two groups (see the sketch below).
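As a sketch of that decision rule (hypothetical scores, equal group sizes assumed):

    import numpy as np

    scores_g1 = np.array([0.8, 1.2, 1.0])     # hypothetical discriminant scores, group 1
    scores_g2 = np.array([-0.9, -1.1, -1.0])  # hypothetical discriminant scores, group 2

    cutoff = (scores_g1.mean() + scores_g2.mean()) / 2  # midpoint of the two group means
    new_score = 0.4
    group = 1 if new_score > cutoff else 2    # assign the new case by its side of the cutoff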
18 COEFFICIENTS

• There are two types of coefficients:
1. Standardized coefficients
2. Unstandardized coefficients
• Main difference➔ the standardized discriminant function has no constant term 'a'.
19 A PRIORI PROBABILITY OF CLASSIFICATION INTO GROUPS
• The discriminant analysis algorithm requires assigning an a priori (before analysis) probability of a given case belonging to each of the groups. Either:
1. We can assign equal probabilities to all groups, or
2. We can assign probabilities proportional to the group sizes in the sample data.
20 TERMS

• Classification matrix: A means of assessing the predictive ability of the discriminant functions.
• Hit ratio: The percentage of objects (individuals, respondents, firms, etc.) correctly classified by the discriminant function, i.e.

Hit ratio = (Number correctly classified / Total number of observations) × 100
21 TERMS

• Cutoff score for the two-group discriminant function: The criterion against which an individual's discriminant Z score is compared to determine predicted group membership.

C = (n2·ȳ1 + n1·ȳ2) / (n1 + n2)

where ni is the size of group i and ȳi is its centroid (mean discriminant score). This is the basic formula for computing the optimal cutoff score between any two groups; for equal group sizes it reduces to the midpoint of the two centroids.
22 TERMS

• Discriminant loadings: Discriminant loadings are calculated whether or not an independent variable is included in the discriminant function(s).
• Discriminant weights (discriminant coefficients): Independent variables with large discriminatory power usually have large weights, and those with little discriminatory power usually have small weights.
• Centroid: The mean value of the discriminant Z scores of all objects within a particular category or group.
23 Eigen Values and Multivariate Tests Interpretations

• Function – This indicates the first or second canonical linear discriminant function. The number of functions is the smaller of the number of discriminating variables and the number of groups minus one. In this example, job has three levels and three discriminating variables were used, so two functions are calculated. Each function acts as a projection of the data onto a dimension that best separates or discriminates between the groups.

• Eigenvalue – These are the eigenvalues of the matrix product of the inverse of the within-group sums-of-squares and cross-product matrix and the between-groups sums-of-squares and cross-product matrix. These eigenvalues are related to the canonical correlations and describe how much discriminating ability a function possesses. The magnitudes of the eigenvalues are indicative of the functions' discriminating abilities.

• % of Variance – This is the proportion of the discriminating ability of the three continuous variables found in a given function, calculated as the function's eigenvalue divided by the sum of all the eigenvalues. In this analysis, the first function accounts for 77% of the discriminating ability of the discriminating variables and the second function accounts for 23%. We can verify this by noting that the sum of the eigenvalues is 1.081 + 0.321 = 1.402. Then 1.081/1.402 = 0.771 and 0.321/1.402 = 0.229.

• Cumulative % – This is the cumulative proportion of discriminating ability . For any analysis, the proportions of discriminating ability will sum to one. Thus, the
last entry in the cumulative column will also be one.

• Canonical Correlation – These are the canonical correlations of our predictor variables (outdoor, social and conservative) and the groupings in job. If we
consider our discriminating variables to be one set of variables and the set of dummies generated from our grouping variable to be another set of variables, we can
perform a canonical correlation analysis on these two sets. From this analysis, we would arrive at these canonical correlations.
24 EIGEN VALUES AND MULTIVARIATE TESTS

• Test of Function(s) – These are the functions included in a given test with the null hypothesis that the canonical correlations
associated with the functions are all equal to zero. In this example, we have two functions. Thus, the first test presented in this table
tests both canonical correlations (“1 through 2”) and the second test presented tests the second canonical correlation alone.
• Wilks’ Lambda – Wilks’ lambda is one of the multivariate statistics calculated by SPSS. It is the product of the values of (1 − canonical correlation²). In this example, our canonical correlations are 0.721 and 0.493, so the Wilks’ lambda testing both canonical correlations is (1 − 0.721²)(1 − 0.493²) = 0.364, and the Wilks’ lambda testing the second canonical correlation alone is (1 − 0.493²) = 0.757 (see the arithmetic check after this list).
• Chi-square – This is the Chi-square statistic testing that the canonical correlation of the given function is equal to zero. In other
words, the null hypothesis is that the function, and all functions that follow, have no discriminating ability. This hypothesis is tested
using this Chi-square statistic.
• df – This is the effect degrees of freedom for the given function. It is based on the number of groups present in the categorical
variable and the number of continuous discriminant variables. The Chi-square statistic is compared to a Chi-square distribution with
the degrees of freedom stated here.
• Sig. – This is the p-value associated with the Chi-square statistic of a given test. The null hypothesis that a given function’s
canonical correlation and all smaller canonical correlations are equal to zero is evaluated with regard to this p-value. For a given
alpha level, such as 0.05, if the p-value is less than alpha, the null hypothesis is rejected. If not, then we fail to reject the null
hypothesis.
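The Wilks' lambda arithmetic above can be checked in a few lines (a sketch; only the two canonical correlations quoted above are needed):

    r1, r2 = 0.721, 0.493                      # canonical correlations from the output
    lambda_both   = (1 - r1**2) * (1 - r2**2)  # tests functions 1 through 2 -> ≈ 0.364
    lambda_second = 1 - r2**2                  # tests function 2 alone      -> ≈ 0.757
    print(round(lambda_both, 3), round(lambda_second, 3))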
25 TWO GROUPS DISCRIMINANT ANALYSIS

• When the Dependent variable possesses only Two Categories.


26 CASE STUDY:

• A retail outlet wants to know the consumer behavior pattern of the purchase of products in two categories, national brands and local brands, which would help it to place orders matching customer demand and requirements. The outlet uses data from a retail outlet in another location to arrive at a decision about its own customers.
• The outlet wants to use discriminant analysis to screen the responsiveness of customers towards the national brand and local brand categories and find out the following:
1. The percentage of customers that it is able to classify correctly.
2. The statistical significance of the discriminant function.
3. Which variables (annual income in lakh rupees and household size) are relatively better at discriminating between consumers of national and local brands.
4. The classification of new customers into one of the two groups, namely national brand (group 1) and local brand (group 2) acceptors.
27 DATA

Sr. No.   Brand   Annual income   Household size
1         1       16.8            3
2         1       21.4            2
3         2       17.3            4
4         2       15.4            3
5         1       17.3            4
6         1       18.4            1
7         2       14.3            4
8         2       14.5            5
9         1       23.2            2
10        1       21.1            5
11        2       17.4            2
12        2       16.7            6
13        1       14.5            4
14        1       18.9            1
15        2       13.9            7
16        2       12.4            7
17        1       17.8            2
18        1       19.3            1
19        2       15.3            6
20        2       13.3            4
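The same analysis can be sketched outside SPSS with scikit-learn's LinearDiscriminantAnalysis (an assumption of this illustration; its coefficients are scaled differently from SPSS's canonical coefficients, but the classifications and group separation should agree):

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    # Brand: 1 = national, 2 = local; predictors: annual income (lakh Rs.), household size
    brand  = np.array([1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2])
    income = np.array([16.8, 21.4, 17.3, 15.4, 17.3, 18.4, 14.3, 14.5, 23.2, 21.1,
                       17.4, 16.7, 14.5, 18.9, 13.9, 12.4, 17.8, 19.3, 15.3, 13.3])
    size   = np.array([3, 2, 4, 3, 4, 1, 4, 5, 2, 5, 2, 6, 4, 1, 7, 7, 2, 1, 6, 4])

    X = np.column_stack([income, size])
    lda = LinearDiscriminantAnalysis().fit(X, brand)
    print(lda.coef_, lda.intercept_)  # direction and constant of the discriminant rule
    print(lda.score(X, brand))        # proportion of the 20 cases correctly classified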
28 SPSS COMMANDS FOR DISCRIMINANT ANALYSIS

• After the input data has been typed along with the variable labels and value labels in an SPSS file, to get the output for a Discriminant Analysis
problem proceed as mentioned below:

• 1. Click on ANALYSE at the SPSS menu bar.

• 2. Click on CLASSIFY, followed by DISCRIMINANT.

• 3. On the dialogue box which appears, select the GROUPING VARIABLE (dependent categorical variable in discriminant analysis) by clicking on the
right arrow to transfer it from the variable list on the left to the grouping variable box on the right.

• 4. Define the range of values of the grouping variable by clicking on DEFINE RANGE just below the grouping variable box. Fill in the minimum and maximum values (the codes used in our problem are 1 and 2) of the variable in the box which appears. Then click CONTINUE.

• 5. Select all the independent variables for discriminant analysis from the variable list by clicking on the arrow which transfers them to the
INDEPENDENTS box on the right.

• 6. Just below the INDEPENDENTS box select ‘Enter independents together’ if you want all the selected independent variables (that are in the box)
in the discriminant model. (Here you have an option to use a STEPWISE discriminant analysis by selecting ‘Use Stepwise Method’ instead of ‘Enter
independents together’).
29 SPSS COMMANDS FOR DISCRIMINANT ANALYSIS

• 7. Click on STATISTICS on the lower part of the main dialog box. This opens up a smaller dialog box. Under
STATISTICS, click on MEANS and UNIVARIATE ANOVAS. Under the title FUNCTION COEFFICIENTS,
choose UNSTANDARDIZED to obtain the unstandardized coefficients of the discriminant function. These are
used to classify a new object in a discriminant analysis. Under MATRICES click on WITHIN GROUP
CORRELATION. Click on CONTINUE to return to the main dialog box.
• 8. Click on CLASSIFY on the lower part of the main dialog box. Select SUMMARY TABLE and LEAVE-ONE-
OUT CLASSIFICATION under the heading DISPLAY in the smaller dialog box that appears. This gives you the
classification table (also called the confusion matrix) that judges the accuracy of the discriminant model when
applied to the input data points. Click on CONTINUE to return to the main dialog box.
• 9. Click on SAVE and then select PREDICTED GROUP MEMBERSHIP and DISCRIMINANT SCORES.
• 10. Click OK to get the discriminant analysis output.
30 NORMALITY TESTING
31 NORMALITY TESTING

Tests of Normality

                 Kolmogorov-Smirnov(a)         Shapiro-Wilk
                 Statistic   df   Sig.         Statistic   df   Sig.
ANNUALINCOME     .107        20   .200*        .967        20   .692
HOUSEHOLDSIZE    .154        20   .200*        .931        20   .163
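A sketch of the Shapiro-Wilk test with SciPy, reusing the income and size arrays from the data-slide sketch above:

    from scipy import stats

    for name, x in [("ANNUALINCOME", income), ("HOUSEHOLDSIZE", size)]:
        w, p = stats.shapiro(x)
        print(name, round(w, 3), round(p, 3))  # p > .05 here: no evidence against normality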


32 TESTING FOR OUTLIERS
33 TESTING FOR MULTICOLLINEARITY

Coefficients(a)

Model               Collinearity Statistics
                    Tolerance    VIF
1  ANNUALINCOME     .624         1.603
   HOUSEHOLDSIZE    .624         1.603

a. Dependent Variable: BRAND
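A sketch of the same tolerance/VIF check with statsmodels (again reusing the income and size arrays):

    import numpy as np
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    X = np.column_stack([income, size])
    Xc = np.column_stack([np.ones(len(X)), X])         # add a constant to the design
    for i, name in enumerate(["ANNUALINCOME", "HOUSEHOLDSIZE"], start=1):
        vif = variance_inflation_factor(Xc, i)
        print(name, round(vif, 3), round(1 / vif, 3))  # VIF and tolerance = 1/VIF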
34 TESTING FOR LINEARITY
35 TESTING FOR HOMOSCEDASTICITY: BOX M TEST

Box’s test is used to determine whether two or more covariance matrices are equal. (Bartlett’s test for homogeneity of variance is derived from Box’s test.) One caution: Box’s test is sensitive to departures from normality. If the samples come from non-normal distributions, then Box’s test may simply be testing for non-normality. Suppose that we have m independent populations and we want to test the null hypothesis that the population covariance matrices are all equal, i.e.

H0: Σ1 = Σ2 = ⋯ = Σm
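A minimal NumPy sketch of Box's M with its chi-square approximation (SPSS reports an F approximation instead, so small numeric differences from the output on the next slide are expected):

    import numpy as np
    from scipy import stats

    def box_m(groups):
        # groups: list of (n_i x p) arrays, one per group
        p, m = groups[0].shape[1], len(groups)
        ns = np.array([g.shape[0] for g in groups])
        covs = [np.cov(g, rowvar=False) for g in groups]   # unbiased S_i per group
        pooled = sum((n - 1) * S for n, S in zip(ns, covs)) / (ns.sum() - m)
        M = ((ns.sum() - m) * np.log(np.linalg.det(pooled))
             - sum((n - 1) * np.log(np.linalg.det(S)) for n, S in zip(ns, covs)))
        c = ((2 * p**2 + 3 * p - 1) / (6 * (p + 1) * (m - 1))) \
            * (np.sum(1.0 / (ns - 1)) - 1.0 / (ns.sum() - m))
        chi2 = M * (1 - c)                                  # chi-square approximation
        df = p * (p + 1) * (m - 1) / 2
        return M, chi2, df, stats.chi2.sf(chi2, df)         # last value is the p-value

For the case study, box_m([X[brand == 1], X[brand == 2]]) would test equality of the two groups' covariance matrices.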
36 TEST OF HOMOSCEDASTICITY: BOX-M TEST

Test Results
Box's M      2.691
Approx. F    .789
df1          3
df2          58320.000
Sig.         .500
Tests the null hypothesis of equal population covariance matrices.

Log Determinants
BRAND                   Rank   Log Determinant
National Brand          2      2.525
Local Brand             2      1.790
Pooled within-groups    2      2.307
37 TEST OF HOMOSCEDASTICITY: BOX-M TEST

• 1. Box's M tests the assumption of homogeneity of covariance matrices. This test is very sensitive to meeting the assumption of multivariate normality.
• 2. Discriminant function analysis is robust even when the homogeneity of variances assumption is not met, provided the data do not contain important outliers.
• 3. For our data (Sig. = .500), we conclude the groups do not differ in their covariance matrices, satisfying the assumption of DA.
• 4. When n is large, small deviations from homogeneity will be found significant, which is why Box's M must be interpreted in conjunction with inspection of the log determinants. Log determinants are a measure of the variability of the groups. Larger log determinants correspond to more variable groups. Large differences in log determinants indicate groups that have different covariance matrices.
38 DISCRIMINANT ANALYSIS: CORE OUTPUT: GROUP
STATISTICS

Descriptive Statistics

                     N    Minimum   Maximum   Mean      Std.        Skewness           Kurtosis
                                                        Deviation   Stat.   Std. Err.  Stat.   Std. Err.
ANNUALINCOME         20   12.40     23.20     16.9600   2.86768     .486    .512       -.245   .992
HOUSEHOLDSIZE        20   1.00      7.00      3.6500    1.92696     .261    .512       -.932   .992
Valid N (listwise)   20
39 DISCRIMINANT ANALYSIS: CORE OUTPUT: GROUP
STATISTICS
Group Statistics

BRAND                            Mean      Std. Deviation   Valid N (unweighted)   Valid N (weighted)
National Brand   ANNUALINCOME    18.8700   2.52809          10                     10.000
                 HOUSEHOLDSIZE   2.5000    1.43372          10                     10.000
Local Brand      ANNUALINCOME    15.0500   1.69197          10                     10.000
                 HOUSEHOLDSIZE   4.8000    1.68655          10                     10.000
Total            ANNUALINCOME    16.9600   2.86768          20                     20.000
                 HOUSEHOLDSIZE   3.6500    1.92696          20                     20.000

As the two groups (National Brand/Local Brand) are to be compared on the basis of two characteristics of the respondents, namely Annual Income and Household Size, it is useful to compute their mean values to get an idea of the differences in mean scores. The mean Annual Income for the NB group is 18.87, whereas for the LB group it is 15.05. The absolute difference in Annual Income is (18.87 − 15.05) = 3.82, whereas it is (4.8 − 2.5) = 2.3 for Household Size. Hence, initially we can expect Annual Income to discriminate better between the two groups.
40 DISCRIMINANT ANALYSIS: CORE OUTPUT: TESTING
EQUALITY OF GROUP MEANS
Tests of Equality of Group Means

                 Wilks' Lambda   F        df1   df2   Sig.
ANNUALINCOME     .533            15.769   1     18    .001
HOUSEHOLDSIZE    .625            10.796   1     18    .004

In the ANOVA table, the smaller the Wilks' lambda, the more important the independent variable is to the discriminant function. Wilks' lambda is significant by the F test for all independent variables. Here both F-statistics are significant, so we can include both independent variables in our discriminant model, but Annual Income, with the smaller Wilks' lambda, is more important in discrimination. (Wilks' lambda represents the proportion of unexplained variability in the dependent variable, i.e., the opposite of the R-square of multiple regression.)
41 TO CHECK THE MODEL FIT

Run One- way ANOVA in SPSS from Compare Means with Discriminant Scores as
Dependent Variable and Brand as a Factor Variable

ANOVA
Discriminant Scores from Function 1 for Analysis 1

                 Sum of Squares   df   Mean Square   F        Sig.
Between Groups   20.041           1    20.041        20.041   .000
Within Groups    18.000           18   1.000
Total            38.041           19
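The same check can be sketched with SciPy, reusing the lda, X, and brand objects from the earlier sketch (scikit-learn's transform scales the scores so that the pooled within-group variance is 1, so the results should closely reproduce the SPSS table):

    from scipy import stats

    z = lda.transform(X).ravel()                  # discriminant scores, function 1
    f, p = stats.f_oneway(z[brand == 1], z[brand == 2])
    print(round(f, 3), round(p, 4))               # compare with F = 20.041, Sig. = .000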
42 TO CHECK THE MODEL FIT : WILKS' LAMBDA.

• Since Wilks’ lambda is the ratio of the within-group sum of squares to the total sum of squares, its value should equal 18.000/38.041 = 0.473. (In stepwise discriminant analysis, a variable selection method chooses variables for entry into the equation on the basis of how much they lower Wilks' lambda; at each step, the variable that minimizes the overall Wilks' lambda is entered.)

Wilks' Lambda

Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
1                     .473            12.721       2    .002
43 SPSS CORE OUTPUT: Wilks’ Lambda

▪ Wilks’ lambda is the ratio of the within-groups sum of squares to the total sum of squares. This is the proportion of the total variance in the discriminant scores not explained by differences among groups.
▪ Wilks’ lambda takes a value between 0 and 1, and the lower the value of Wilks’ lambda, the higher the significance of the discriminant function.
▪ Therefore, a 0 (zero) value would be the most preferred one. A lambda of 1.00 occurs when observed group means are equal. A small lambda indicates that group means appear to differ. The associated significance value indicates whether the difference is significant.
44 SPSS CORE OUTPUT: Wilks’ Lambda

• The statistical test of significance for Wilks’ lambda is carried out with the chi-squared transformed statistic, which in our case is 12.721 with 2 degrees of freedom (the degrees of freedom equal the number of predictor variables) and a p value of 0.002. Since the p value is less than 0.05, the assumed level of significance, it is inferred that the discriminant function is significant and can be used for further interpretation of the results.
• Here, the lambda of 0.473 has a significant value (Sig. = 0.002); thus, the group means appear to differ.
45 SPSS OUTPUT: Eigen Values

Eigenvalues

Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          1.113(a)     100.0           100.0          .726

• The basic principle in the estimation of a discriminant function is that the variance between the groups relative to the variance within the groups should be maximized. The ratio of between-group variance to within-group variance is given by the eigenvalue, so a higher eigenvalue is always desirable. An eigenvalue indicates the proportion of variance explained, and a large eigenvalue is associated with a strong function (see the check below).
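A one-line arithmetic check (a sketch): for a single function the eigenvalue equals ρ²/(1 − ρ²), where ρ is the canonical correlation reported in the same row:

    rho = 0.726
    print(rho**2 / (1 - rho**2))   # ≈ 1.11, matching the reported eigenvalue of 1.113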
46 CANONICAL CORRELATION

• The last column of the table above indicates the canonical correlation, which is the simple correlation coefficient between the discriminant scores and the corresponding group membership (NB/LB). Its value is 0.726, which readers may verify. The square of the canonical correlation is (0.726)² = 0.527, which means 52.7 per cent of the variance in the model discriminating between national-brand and local-brand buyers is due to changes in the two predictor variables, namely Annual Income and Household Size.
47 SPSS OUTPUT STRUCTURE MATRIX

• The structure matrix table shows the correlation of each variable with each discriminant function. The correlations serve like factor loadings in factor analysis; that is, variables are interpreted by identifying the largest absolute correlations associated with each discriminant function.

Structure Matrix

                 Function 1
ANNUALINCOME     .887
HOUSEHOLDSIZE    -.734
48 SPSS OUTPUT: Functions at Group Centroids

Functions at Group Centroids (means of canonical variables)

BRAND   Function 1
1.00    1.001
2.00    -1.001

• 'Functions at Group Centroids' indicates the average discriminant score for subjects in the two groups.
• More specifically, it is the discriminant score for each group when the variable means (rather than individual values for each subject) are entered into the discriminant equation.
• Note that the two scores are equal in absolute value but have opposite signs.
49 SPSS OUTPUT: Unstandardized Coefficients

Canonical Discriminant Function Coefficients (unstandardized)

                 Function 1
INCOME           .335
HOUSEHOLD SIZE   -.313
(Constant)       -4.545

• The canonical discriminant function coefficients indicate the unstandardized scores concerning the independent variables. This is the list of coefficients of the unstandardized discriminant equation.
• Each subject's discriminant score would be computed by entering his or her variable values (raw data) for each of the variables in the equation.

Discriminant function:
Y = -4.545 + 0.335X1 - 0.313X2
50 CUT-OFF SCORE FOR TWO GROUP
CLASSIFICATION
51 USING DISCRIMINANT FUNCTION

• Y = -4.545 + 0.335X1 - 0.313X2
• If X1 = 15 and X2 = 2,
then Y = -4.545 + 0.335(15) - 0.313(2)
       = -4.545 + 5.025 - 0.626
       = -0.146
• Cut-off score = (n2ȳ1 + n1ȳ2) / (n1 + n2) = (10(1.001) + 10(-1.001)) / 20 = 0
• Since Y = -0.146 is below the cut-off score, this respondent belongs to Group 2, i.e., is a local-brand buyer.
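The hand calculation can be verified with plain arithmetic (a sketch using the unstandardized coefficients from slide 49):

    b0, b1, b2 = -4.545, 0.335, -0.313    # constant, income, household size
    x1, x2 = 15, 2                        # new respondent's values

    y = b0 + b1 * x1 + b2 * x2            # -4.545 + 5.025 - 0.626 = -0.146
    group = 2 if y < 0 else 1             # cutoff score is 0 for equal group sizes
    print(round(y, 3), group)             # negative score -> Group 2 (local brand)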
52 SPSS OUTPUT: Standardized Coefficients

Standardized Canonical Discriminant Function Coefficients

         Function 1
INCOME   .722
SIZE     -.490

• The coefficients of the standardized discriminant function are independent of the units of measurement.
• The absolute value of a coefficient in the standardized discriminant function indicates the relative contribution of the variable in discriminating between the two groups.
53 CLASSIFICATION MATRIX AND CROSS VALIDATION

• This method is known as the leave-one-out classification method in SPSS. In our example, we had 20 observations. First, observation 1 is deleted and the discriminant model is estimated on the remaining 19 observations; based on this model, the excluded case is predicted to belong to a specific category. In the same way, observation 2 is set aside (with observation 1 restored) and the model is estimated using the other 19 observations; again the excluded case is predicted to belong to a specific category. This process is repeated 20 times, which is why the method is called leave-one-out classification.
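A sketch of the same procedure with scikit-learn, reusing X and brand from the earlier sketch:

    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    loo_acc = cross_val_score(LinearDiscriminantAnalysis(), X, brand,
                              cv=LeaveOneOut()).mean()   # one fold per observation
    print(loo_acc)  # compare with the 80% cross-validated hit ratio reported by SPSS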
54 SPSS OUTPUT: Classification Matrix

Classification Results(a,c)

                                  Predicted Group Membership
                          BRAND   1.00    2.00    Total
Original           Count  1.00    9       1       10
                          2.00    2       8       10
                   %      1.00    90.0    10.0    100.0
                          2.00    20.0    80.0    100.0
Cross-validated(b) Count  1.00    8       2       10
                          2.00    2       8       10
                   %      1.00    80.0    20.0    100.0
                          2.00    20.0    80.0    100.0

a. 85.0% of original grouped cases correctly classified.
b. Cross validation is done only for those cases in the analysis. In cross validation, each case is classified by the functions derived from all cases other than that case.
c. 80.0% of cross-validated grouped cases correctly classified.

• 'Classification Results' is a simple summary of the number and percent of subjects classified correctly and incorrectly.
• The 'leave-one-out classification' is a cross-validation method, whose results are also presented.
55 SPSS OUTPUT: Group Membership
56 ASSESSING THE CLASSIFICATION ACCURACY: HIT RATIO

• The overall classificatory ability of the model is measured by the hit ratio, which is computed on the next slide.
57 RESULTS

• In the last table, the starred values indicate that respondents 3 and 11 were wrongly classified into group 1; they actually belong to group 2.
• Respondent 13 was wrongly classified into group 2; it originally belongs to group 1.
• Hit ratio = (no. of correct predictions / total no. of cases) × 100
  = (17/20) × 100
  = 85%
58 THREE GROUPS DISCRIMINANT ANALYSIS

• We are interested in the relationship between the three continuous variables and our
categorical variable. Specifically, we would like to know how many dimensions we would
need to express this relationship. Using this relationship, we can predict a classification
based on the continuous variables or assess how well the continuous variables separate
the categories in the classification. We will be discussing the degree to which the
continuous variables can be used to discriminate between the groups.
59 THREE GROUP LDA AND QDA
60 CASE STUDY FOR THREE GROUP LDA

• A large international air carrier has collected data on employees in three different job classifications: 1) customer service personnel, 2) mechanics and 3) dispatchers. The director of Human Resources wants to know if these three job classifications appeal to different personality types. Each employee is administered a battery of psychological tests which includes measures of interest in outdoor activity, sociability and conservativeness.
61 DATA DESCRIPTION

• The data used in this example are from a data file, Discrim.xls, with 244 observations on
four variables. The variables include three continuous, numeric variables
(outdoor, social and conservative) and one categorical variable (job) with three
levels: 1) customer service, 2) mechanic and 3) dispatcher.
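A sketch of the three-group analysis in Python, assuming Discrim.xls sits in the working directory with columns named outdoor, social, conservative and job (path and column names are assumptions of this illustration):

    import pandas as pd
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    df = pd.read_excel("Discrim.xls")   # hypothetical path
    X3, y3 = df[["outdoor", "social", "conservative"]], df["job"]

    lda3 = LinearDiscriminantAnalysis(n_components=2).fit(X3, y3)  # 2 = groups - 1
    print(lda3.explained_variance_ratio_)  # compare with the 77% / 23% split reported earlier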
62 SPSS OUTPUT: Descriptive Statistics

We are interested in how job relates to outdoor, social and conservative. Let's look at summary statistics of these three continuous variables for each job category.
63 SPSS OUTPUT: Descriptive Statistics
64 Descriptive Statistics Interpretation

• From this output, we can see that some of the means of outdoor,
social and conservative differ noticeably from group to group in job. These differences
will hopefully allow us to use these predictors to distinguish observations in
one job group from observations in another job group. Next, we can look at the
correlations between these three predictors. These correlations will give us some
indication of how much unique information each predictor will contribute to the
analysis. If two predictor variables are very highly correlated, then they will be
contributing shared information to the analysis. Uncorrelated variables are likely
preferable in this respect. We will also look at the frequency of each job group.
65 SPSS OUTPUT: Correlations
66 SPSS COMMANDS

• The discriminant command in SPSS performs canonical linear discriminant analysis, which is the classical form of discriminant analysis. In this example, we specify in the groups subcommand that we are interested in the variable job, and we list in parentheses the minimum and maximum values seen in job. We next list the discriminating variables, or predictors, in the variables subcommand. In this example, we have selected three predictors: outdoor, social and conservative. We will be interested in comparing the actual groupings in job to the predicted groupings generated by the discriminant analysis. For this, we use the statistics subcommand, which will provide us with classification statistics in our output.
67 SPSS OUTPUT: Data Summary

Analysis Case Processing Summary – This table summarizes the analysis dataset in terms of
valid and excluded cases. The reasons why SPSS might exclude an observation from the
analysis are listed here, and the number (“N”) and percent of cases falling into each category
(valid or one of the exclusions) are presented. In this example, all of the observations in the
dataset are valid.
68 SPSS OUTPUT: Group Statistics

Group Statistics – This table presents the distribution of observations into the three groups
within job. We can see the number of observations falling into each of the three groups. In this example,
we are using the default weight of 1 for each observation in the dataset, so the weighted number of
observations in each group is equal to the unweighted number of observations in each group.
69 UNSTANDARDIZED COEFFICIENTS

Canonical Discriminant Function Coefficients

               Function 1   Function 2
OUTDOOR        .092         .225
SOCIAL         -.194        .050
CONSERVATIVE   .155         -.087
(Constant)     .937         -3.623

70 SPSS OUTPUT: Eigen Values and Multivariate Tests

In this example, there are two discriminant dimensions, both of which are statistically significant.
71 SPSS OUTPUT: Discriminant Function Output
72 CLASSIFYING THROUGH FISHER’S LINEAR
DISCRIMINANT FUNCTIONS
73 DISCRIMINANT FUNCTION INTERPRETATIONS:

• Standardized Canonical Discriminant Function Coefficients – These coefficients can be used to calculate the discriminant score
for a given case. The score is calculated in the same manner as a predicted value from a linear regression, using the standardized
coefficients and the standardized variables. For example, let zoutdoor, zsocial and zconservative be the variables created by
standardizing our discriminating variables. Then, for each case, the function scores would be calculated using the following equations:
• Score1 = 0.379*zoutdoor – 0.831*zsocial + 0.517*zconservative
• Score2 = 0.926*zoutdoor + 0.213*zsocial – 0.291*zconservative
• The distribution of the scores from each function is standardized to have a mean of zero and a standard deviation of one. The magnitudes of these coefficients indicate how strongly the discriminating variables affect the score. For example, we can see that the standardized coefficient for zsocial in the first function is greater in magnitude than the coefficients for the other two variables. Thus, social will have the greatest impact of the three on the first discriminant score.
• Structure Matrix – This is the canonical structure, also known as canonical loading or discriminant loading, of the discriminant
functions. It represents the correlations between the observed variables (the three continuous discriminating variables) and the
dimensions created with the unobserved discriminant functions (dimensions).
74 SPSS OUTPUT: Discriminant Function Output: Functions
at Group Centroids

Functions at Group Centroids – These are the means of the discriminant function scores by group for each function calculated. If we calculated the scores of the first function for each case in our dataset and then looked at the means of the scores by group, we would find that the customer service group has a mean of -1.219, the mechanic group has a mean of 0.107, and the dispatch group has a mean of 1.420. We know that the function scores have a weighted mean of zero, and we can check this by summing the group means multiplied by the number of cases in each group: (85 × -1.219) + (93 × 0.107) + (66 × 1.420) ≈ 0, up to rounding.
75 SPSS OUTPUT: Predicted Classifications: Classification
Processing Summary

Classification Processing Summary – This is similar to the Analysis Case Processing Summary, but in this table, "Processed" cases are those that were successfully classified based on the analysis. The reasons why an observation may not have been processed are listed here. We can see that in this example, all of the observations in the dataset were successfully classified.
76 SPSS OUTPUT: Predicted Classifications Prior
Probabilities

Prior Probabilities for Groups – This is the distribution of observations into


the job groups used as a starting point in the analysis. The default prior
distribution is an equal allocation into the groups, as seen in this example.
SPSS allows users to specify different priors with the priors subcommand.
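In the scikit-learn sketch this corresponds to the priors argument (scikit-learn's default is the sample proportions, so equal priors must be requested explicitly; X3 and y3 are from the three-group sketch):

    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    lda_eq = LinearDiscriminantAnalysis(priors=[1/3, 1/3, 1/3]).fit(X3, y3)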
77 SPSS OUTPUT: Predicted Classifications: Classification
Matrix
78 SPSS OUTPUT: Predicted Classifications Interpretation

• Predicted Group Membership – These are the predicted frequencies of groups from the analysis. The numbers going down each column indicate how many were correctly and incorrectly classified. For example, of the 89 cases that were predicted to be in the customer service group, 70 were correctly predicted, and 19 were incorrectly predicted (16 cases were in the mechanic group and three cases were in the dispatch group).
• Original – These are the frequencies of groups found in the data. We can see from the row totals that 85 cases fall into the customer service group, 93 fall into the mechanic group, and 66 fall into the dispatch group. These match the results we saw earlier in the output for the frequencies command. Across each row, we see how many of the cases in the group are classified by our analysis into each of the different groups. For example, of the 85 cases that are in the customer service group, 70 were predicted correctly and 15 were predicted incorrectly (11 were predicted to be in the mechanic group and four were predicted to be in the dispatch group).
79 SPSS OUTPUT: Predicted Classifications Interpretation

• Count – This portion of the table presents the number of observations falling into the given
intersection of original and predicted group membership. For example, we can see in this
portion of the table that the number of observations originally in the customer
service group, but predicted to fall into the mechanic group is 11. The row totals of these
counts are presented, but column totals are not.
• % – This portion of the table presents the percent of observations originally in a given group
(listed in the rows) predicted to be in a given group (listed in the columns). For example, we
can see that the percent of observations in the mechanic group that were predicted to be in
the dispatch group is 16.1%. This is NOT the same as the percent of observations predicted
to be in the dispatch group that were in the mechanic group. The latter is not presented
in this table.
80 FINAL NOTES

• Note that the Standardized Canonical Discriminant Function Coefficients table and the Structure Matrix table are listed in different orders.
• The number of discriminant dimensions is the number of groups minus 1. However, some discriminant dimensions may not be statistically significant.
• In this example, there are two discriminant dimensions, both of which are statistically significant.
81 FINAL NOTES

• The canonical correlations for the dimensions one and two are 0.72 and 0.49,
respectively.
• The standardized discriminant coefficients function in a manner analogous to
standardized regression coefficients in OLS regression.
• For example, per the standardized coefficients shown earlier, a one standard deviation increase on the outdoor variable will result in a 0.379 standard deviation increase in the predicted values on discriminant function 1.
• The canonical structure, also known as canonical loading or discriminant loadings,
represent correlations between observed variables and the unobserved discriminant
functions (dimensions).
• The discriminant functions are a kind of latent variable and the correlations are loadings
analogous to factor loadings.
• Group centroids are the class (i.e., group) means of canonical variables.
82 THINGS TO CONSIDER

• The multivariate normal distribution assumption holds for the response variables. This means that each of the dependent variables is normally distributed within groups, that any linear combination of the dependent variables is normally distributed, and that all subsets of the variables must be multivariate normal.
• Each group must have a sufficiently large number of cases.
• Different classification methods may be used depending on whether the variance-covariance
matrices are equal (or very similar) across groups.
• Non-parametric discriminant function analysis, called kth nearest neighbor, can also be
performed.
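A sketch of that non-parametric alternative with scikit-learn, reusing X3 and y3 from the three-group sketch (k = 5 is an arbitrary choice):

    from sklearn.neighbors import KNeighborsClassifier

    knn = KNeighborsClassifier(n_neighbors=5).fit(X3, y3)
    print(knn.score(X3, y3))   # resubstitution accuracy; cross-validate in practice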
83

THANK YOU
