
MBR Lab Week 10 - 12

Missing data

 If there are missing values in the data and they are no more than 5% of the cases,
go to Transform – Replace Missing Values
SPSS - Mean, Median, Mode

 Analyze
 Descriptive Statistics
 Frequencies

 Statistics
 Select Quartile, mean,
median, mode,
skewness, kurtosis etc.
SPSS – Graphs (to check frequencies, normal
data…)

 Analyze
 Descriptive Statistics
 Frequencies

 Charts
 To see normal
distribution curve,
bar charts and pie
charts
SPSS- Skewness and Kurtosis

 SPSS
 Analyze
 Descriptive Statistics
 Descriptives
 Options
 Check Skewness and Kurtosis
Value of Skewness and Kurtosis

 The values of skewness and kurtosis should be zero in a


normal distribution.
 Skewness
 Positive values of skewness indicate a pile-up of scores on
the left of the distribution.
 Negative values indicate a pile-up on the right.
 Kurtosis
 Positive values of kurtosis indicate a pointy and heavy-tailed
distribution
 Negative values indicate a flat and light-tailed distribution.
 The further the value is from zero, the more likely it is that
the data are not normally distributed.
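The sign conventions above can be illustrated with the plain moment formulas; a minimal pure-Python sketch on hypothetical data (note SPSS's Descriptives applies a small-sample correction, so its printed values differ slightly from these):

```python
import math

# Simple population-moment formulas; SPSS applies a small-sample
# correction, so its printed values will differ slightly.
def skewness(xs):
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return sum((x - m) ** 3 for x in xs) / (n * s ** 3)

def excess_kurtosis(xs):
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / n
    return sum((x - m) ** 4 for x in xs) / (n * s2 ** 2) - 3

# hypothetical scores piled up on the left (low end) with a long right tail
data = [1, 1, 2, 2, 2, 3, 3, 4, 7, 10]
print(skewness(data))         # positive: pile-up of scores on the left
print(excess_kurtosis(data))  # positive: heavy-tailed
```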
Removing Outliers

 Analyze – Descriptive Statistics – Explore


 Keep all the variables you want to check for
outliers in Dependent list
 Use "Label cases by" if you have a unique
identifier for a case, such as the respondent's
serial number
Removing Outliers - Boxplot

 This will show a boxplot
 Case number 11 is problematic
 Cases 2, 50, 606 and 25 are also outliers
 How many you remove depends on your judgement
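Boxplots flag points beyond 1.5 × IQR from the quartiles; a minimal sketch of that rule on hypothetical data (quantile conventions vary slightly between packages, so exact cutoffs may differ from SPSS's Explore output):

```python
def iqr_outliers(xs):
    s = sorted(xs)
    n = len(s)
    def q(p):  # linear-interpolation quantile
        k = p * (n - 1)
        f = int(k)
        return s[f] + (k - f) * (s[min(f + 1, n - 1)] - s[f])
    q1, q3 = q(0.25), q(0.75)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # points outside the 1.5*IQR fences are flagged as outliers
    return [x for x in xs if x < lo or x > hi]

print(iqr_outliers([12, 14, 15, 15, 16, 17, 18, 19, 20, 95]))  # -> [95]
```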
Removing Outliers

 Data – Select Cases – If condition is satisfied,
to exclude extreme values
 E.g. exclude the cases whose serial numbers
are 23, 25 and 100
Split File – Analyze results according to
groups

 Data – Split File


 Click Compare groups or organize output by
groups
 Select the group you want to divide the data
into.

 To reverse the split file function, select Analyze all
cases, do not create groups
Split File – Analyze results according to
groups - Output for Descriptive stats
Merging Files

 Data – Merge Files – Add cases or Variables


Reliability Analysis
 Reliability analysis refers to the fact that a scale
should consistently reflect the construct it is
measuring. In simple words, reliability analysis
checks the internal consistency of the items.
 Cronbach's alpha is the most common measure of
internal consistency ("reliability").
 To perform: Analyze – Scale – Reliability Analysis

Cont…

 Check the Cronbach's Alpha value; it should be
more than .7
 If the value of Cronbach's Alpha is less than .7:
 Check "Cronbach's Alpha if Item Deleted" in the
"Item-Total Statistics" table.
 If deleting an item would raise alpha above .7,
delete that item.
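The alpha that SPSS reports comes from the standard formula α = (k/(k−1))·(1 − Σ item variances / variance of totals); a pure-Python sketch with hypothetical item responses:

```python
def cronbach_alpha(items):
    # items: one list per scale item, all the same length (one entry per respondent)
    k = len(items)
    n = len(items[0])
    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    totals = [sum(col[i] for col in items) for i in range(n)]
    return (k / (k - 1)) * (1 - sum(var(c) for c in items) / var(totals))

# three hypothetical items answered by five respondents
items = [[4, 5, 3, 4, 2],
         [4, 4, 3, 5, 2],
         [5, 5, 2, 4, 1]]
print(cronbach_alpha(items))  # well above the .7 rule of thumb
```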
Transforming Variables

 Compute Variables
 (For example, create a mean for 5 items of Job satisfaction)
 Transform – Compute Variable
 Assign a name for the new variable and in numerical expressions write
Mean(variable1,variable2,variable3,variable4,variable5).
 Recode into Same Variables
 Assign old and new values
 Recode into Different Variables
 Assign old and new values
 Define a new Variable
 Replace Missing Values
 If you want to replace all the missing values with the means, median or interpolation of the values
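SPSS's MEAN() function skips missing values rather than failing; a quick sketch of that behaviour, with None standing in for a system-missing value:

```python
def spss_mean(*vals):
    # SPSS MEAN() averages only the non-missing values (None = missing here)
    present = [v for v in vals if v is not None]
    return sum(present) / len(present) if present else None

print(spss_mean(4, 5, None, 3, 4))  # -> 4.0 (mean of the four present values)
```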
Cross Tab

 To compare results between two groups

 Analyze – Descriptive Statistics – Crosstabs
 Specify Rows and Columns

 The variables should have a limited number of
categories (categorical, non-metric data)
Cross Tab (Continued)

 Analyze – Descriptive Statistics – Crosstabs
 Specify Rows and Columns
 Click Statistics – Chi-square and Phi and
Cramer's V
 If the value is significant (less than 0.05), it
means there is a difference on the basis of the
grouping variable
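The chi-square statistic and Cramér's V that SPSS prints can be reproduced by hand from the observed and expected counts; a sketch with a hypothetical 2×2 gender-by-smoking table (SPSS additionally reports the p value):

```python
import math

def chi_square(table):
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    # sum of (observed - expected)^2 / expected over all cells
    chi2 = sum((table[i][j] - rows[i] * cols[j] / total) ** 2
               / (rows[i] * cols[j] / total)
               for i in range(len(rows)) for j in range(len(cols)))
    # Cramér's V normalises chi-square to the 0..1 range
    v = math.sqrt(chi2 / (total * (min(len(rows), len(cols)) - 1)))
    return chi2, v

# hypothetical counts: gender (rows) vs. smoker yes/no (columns)
chi2, v = chi_square([[30, 10], [15, 25]])
print(chi2, v)
```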
T Test – Independent

 We want to see whether a dichotomous grouping variable such as gender (male/female),
children (yes/no) or smoking (yes/no) affects a dependent variable.
 For that we conduct an independent-samples t test.
 "Independent" because the groups are independent of each other: the samples of males
and females do not depend on each other.
 The Sig. (2-tailed) value of the t test for equality of means (equal variances assumed)
should be less than 0.05 for the dependent variable to be significantly affected by the
grouping variable.
T Test – Independent (Continued)

 Analyze- Compare means –


Independent Sample T Test
 Grouping variable : Dichotomous
control variable such as gender
 Test Variable: Dependent Variable
 Click define groups and specify values
for male and female (e.g. 1 and 2 for
this data)
T Test – Independent (Continued)

 The Sig. (2-tailed) value of the t test
for equality of means (equal variances
assumed) should be less than 0.05 for
the dependent variable to be
significantly affected by the grouping
variable.
 As a rule of thumb, |t| should be
greater than 2 to be significant
 Variables Pss_mean and sd_mean
are significantly controlled by
variable gender
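The equal-variance (pooled) t statistic behind this table can be computed directly; a minimal sketch with hypothetical scores for two gender groups (SPSS also reports the degrees of freedom and p value):

```python
import math

def t_independent(a, b):
    # pooled-variance t statistic (equal variances assumed)
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

# hypothetical dependent-variable scores for the two groups
male = [72, 75, 70, 78, 74]
female = [65, 68, 66, 70, 64]
print(t_independent(male, female))  # > 2, significant by the rule of thumb
```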
One way - ANOVA

 Use one way ANOVA if your grouping


variable has more than 2 categories
 E.g. marital status (single, divorced,
remarried, widowed etc.)
 Analyze - compare means - one-way ANOVA
 Specify dependent variable and grouping
variable/ factor
One way – ANOVA (Continued)

 Sig should be less than 0.05


 The marital status variable significantly
affects lifesat_mean, sest_mean, sd_mean
and pc_mean
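The F statistic behind a one-way ANOVA table is the between-group mean square divided by the within-group mean square; a pure-Python sketch with hypothetical scores for three groups:

```python
def one_way_anova_f(groups):
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    # between-groups sum of squares: group means vs. grand mean
    ss_between = sum(len(g) * ((sum(g) / len(g)) - grand) ** 2 for g in groups)
    # within-groups sum of squares: scores vs. their own group mean
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# hypothetical life-satisfaction scores for three marital-status groups
f = one_way_anova_f([[5, 6, 7], [3, 4, 4], [8, 9, 7]])
print(f)
```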
Regression and Correlation on SPSS

 Details already covered in questionnaire analysis


 Linear Regression
 Analyze – Regression – Linear
 Correlation
 Analyze – Correlate – Bivariate
 Continuous Variable: Pearson
 Ordinal or Nominal Variable: Spearman
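Pearson's r can be cross-checked against its definition (covariance over the product of standard deviations); a small sketch with made-up data:

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - my) ** 2 for b in y))
    return num / den

print(pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))  # moderately strong positive r
```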
Partial Correlation

 Partial correlation is a measure of the strength and direction of a linear relationship


between two continuous variables whilst controlling for the effect of one or more other
continuous variables (also known as 'covariates' or 'control' variables).
 Although partial correlation does not make the distinction between independent and
dependent variables, the two variables are often considered in such a manner (i.e., you
have one continuous dependent variable and one continuous independent variable, as
well as one or more continuous control variables).
Partial Correlation (Continued)

 For example, you could use partial correlation to understand whether there is a linear relationship
between ice cream sales and price, whilst controlling for daily temperature
 Continuous dependent variable = "ice cream sales", measured in US dollars,
 Continuous independent variable = "price", also measured in US dollars
 Single control variable – that is, the continuous variable you are adjusting
for – daily temperature, measured in °C
 You may believe that there is a relationship between ice cream sales and prices (i.e., sales go
down as price goes up), but you would like to know if this relationship is affected by daily
temperature (e.g., if the relationship changes when taking into account daily temperature since
you suspect customers are more willing to buy ice creams, irrespective of price, when it is a
really nice, hot day).
Assumptions for correlation (Bivariate –
Partial)

 Pearson – Variables should be continuous


 Spearman, Kendall's tau – Variables should be ranked
 There needs to be a linear relationship between all variables. (Make a scatter plot and
check if the relationship is linear or not)
 There should be no significant outliers.
 Your variables should be approximately normally distributed
Analysis of Partial Correlation

 Steps – Analyze – Correlate – Partial


 Variables = Weight, VO2max
 Control = Age

 The Correlations table is split into two main parts: (a) the


Pearson product-moment correlation coefficients for all your
variables – that is, your dependent variable, independent
variable, and one or more control variables – as highlighted
by the blue rectangle
 You can get this by clicking Options and ticking Zero-order
correlations (ticking Means and standard deviations is also advisable)
 (b) the results from the partial correlation where the Pearson
product-moment correlation coefficient between the
dependent and independent variable has been adjusted to take
into account the control variable(s), as highlighted by the red
rectangle.
Analysis of Partial Correlation

 The results of the partial correlation highlighted by


the red rectangle show that there was a moderate,
negative partial correlation between the dependent
variable, "VO2max", and independent variable,
"weight", whilst controlling for "age", which was
statistically significant (r(97) = -.314, n = 100, p = .002).
 However, when we refer to the Pearson's product-
moment correlation – also known as the zero-order
correlation – between "VO2max" and "weight",
without controlling for "age", as highlighted by
the blue rectangle, we can see that there was also a
statistically significant, moderate, negative
correlation between "VO2max" and "weight" (r(98) =
-.307, n = 100, p = .002). This suggests that "age" had
very little influence in controlling for the
relationship between "VO2max" and "weight".
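The first-order partial correlation can be computed from the three pairwise Pearson correlations via r_xy.z = (r_xy − r_xz·r_yz) / √((1 − r_xz²)(1 − r_yz²)); a pure-Python sketch with hypothetical weight/VO2max/age data (not the dataset from the slides):

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return num / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))

def partial_r(x, y, z):
    # first-order partial correlation of x and y, controlling for z
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# hypothetical data (NOT the slides' dataset)
vo2max = [50, 46, 44, 41, 38]
weight = [60, 70, 80, 90, 100]
age = [25, 32, 30, 42, 40]
print(partial_r(vo2max, weight, age))  # still negative after controlling for age
```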
Regression Interpretation (Exercise)

 Finding Regression between two variables


 Where IV = Advertising Spending
 DV = Sales
 Sample N= 24

 SPSS will give Model Summary Table, ANOVA and Regression Coefficients
Regression Interpretation – R Square

Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .916   .839       .832                .73875
a. Predictors: (Constant), Advertising spending

ANOVA
Model        Sum of Squares   df   Mean Square   F         Sig.
Regression   62.514           1    62.514        114.548   .000
Residual     12.006           22   .546
Total        74.520           23
a. Dependent Variable: Detrended sales
b. Predictors: (Constant), Advertising spending

Coefficients
Model                  B       Std. Error   Beta   t        Sig.
(Constant)             6.584   .402                16.391   .000
Advertising spending   1.071   .100         .916   10.703   .000
a. Dependent Variable: Detrended sales

 R Square interpretation
 0.839, i.e. about 84% (R^2) of the variation in the DV (sales) is
explained by the IV (advertising spending)
 The remaining 16% (100% - 84%) of the variation in sales is due to
factors other than advertising spending (stochastic factors)
Regression Interpretation – P Value

 P value or Sig. – it should be less than 0.05 or 5%
 In this case, the value of .000 is less than 0.05, which means the
relationship between advertising spending and sales is significant.
Regression Interpretation – Constant & Slope

 Interpretation of beta (slope): a 1-unit increase in advertising
spending will bring, on average, a 1.071-unit increase in sales
 Interpretation of the constant: even if advertising spending = 0,
sales will be 6.584 due to other factors
Regression Interpretation – Equation

 Equation: Y = a + bX + e
 Y = Sales
 X = Advertising Spending
 a = constant or intercept = 6.584
 b = slope = 1.071
 e = the residual (error term). Note that 1 - R square = 1 - 0.839 = 0.161
is the proportion of variance left unexplained; it is not a constant
added to every prediction.
 Sales = 6.584 + 1.071 (Advertising Spending) + e
Regression Interpretation – SS MS

 Sum of square / df = Mean Square


 62.514/1 = 62.514
 12.006/22 = 0.546

 F = Mean square Regression/ Mean Square


Residual
 F= 62.514/0.546 = 114.548
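The arithmetic above can be reproduced in a few lines, using the values from the ANOVA table (the tiny discrepancy against the printed 114.548 comes from rounding in the printed sums of squares):

```python
# values taken from the ANOVA table on the previous slides
ss_reg, df_reg = 62.514, 1
ss_res, df_res = 12.006, 22

ms_reg = ss_reg / df_reg   # mean square regression = 62.514
ms_res = ss_res / df_res   # mean square residual ~ 0.546
f = ms_reg / ms_res        # F ~ 114.5 (table prints 114.548)
print(ms_reg, ms_res, f)
```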
Regression Interpretation – F and P Value

 Significant:
 P value <= 0.05
 If the F statistic is greater than the F critical
value, we reject H0 and accept H1
 F statistic (the calculated value) = 114.548
 F critical value ≈ 4 (for df = 1 and 22 at the 5% level)
 F stat > F critical: 114.548 > 4, so the results
are significant
Regression Interpretation – T Value

 If |t stat| > t critical, the results are significant

 t stat (constant) = 16.391
 t stat (advertising spending) = 10.703

 t critical ≈ 2 (rule of thumb)
 16.391 and 10.703 are both greater than 2,
so the results are significant
Regression Equation

 Alpha a = intercept
 Beta b = slope
 E = Residual or Error in the equation
 Y= Dependent Variable
 X= Independent Variable
 Alpha a = 6.584
 Beta b = 1.071
 Y= Sales
 X= Advertisement
 Sales=6.584 + 1.071 (advertisement)
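With those coefficients, prediction is just plugging into the line; a quick sketch (values from the slides; the error term e has mean zero, so it is dropped for a point prediction):

```python
# fitted coefficients from the slides' regression output
a, b = 6.584, 1.071

def predict_sales(ad_spend):
    # point prediction from the fitted line; e averages to zero
    return a + b * ad_spend

print(predict_sales(10))  # -> 17.294
```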
Step Wise Linear Regression

 Stepwise linear regression is a method of regressing multiple variables


while simultaneously removing those that aren't important.
 Stepwise regression is also used, when the control variable/s is identified.
 Stepwise regression essentially does multiple regression a number of times, each
time removing the weakest correlated variable. At the end you are left with the
variables that explain the distribution best.
Step Wise Linear Regression

 Enter the variables (e.g. control variables first, then press Next)
 Choose Enter or Stepwise in Method
 The results show R2 with just the controls, then with controls plus the
main IVs; you can compare the R2 values
Step Wise Linear Regression

 In linear regression, statistics, tick R square


change and collinearity diagnostic to see the
effect of new variables on R2 and their effect
on collinearity.
Difference between Correlation and
Regression?

 Correlation – Actual Data
 Regression – Line of best fit


Least Square Method

 Fitting the best line in the scatter plot


 The "least squares" method is a form of mathematical regression analysis used to
determine the line of best fit for a set of data, providing a visual demonstration of the
relationship between the data points. Each point of data represents the relationship
between a known independent variable and an unknown dependent variable.
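The least-squares line minimises the squared vertical distances to the points; for a simple regression the slope and intercept have closed forms, sketched here on hypothetical, perfectly linear data (so the fit is exact):

```python
def least_squares(x, y):
    # closed-form slope and intercept for simple linear regression
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) \
        / sum((a - mx) ** 2 for a in x)
    return my - b * mx, b  # intercept, slope

a, b = least_squares([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # -> 1.0 2.0 (the data lie exactly on y = 1 + 2x)
```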
Non- Parametric Tests
Parametric Test and Non-parametric tests

Parametric Test
 Many statistical procedures are parametric tests based on the normal
distribution.
 Types of tests:
 t-test (paired or unpaired)
 ANOVA (one-way non-repeated or repeated; two-way; three-way)
 Linear regression
 Pearson correlation

Non-parametric Test
 If you use a parametric test when your data are not parametric, the
results are likely to be inaccurate.
 Used when continuous data are not normally distributed or when dealing
with categorical/qualitative variables.
 Types of tests:
 Chi-squared, Fisher's exact test, Wilcoxon matched pairs,
Mann–Whitney U, Kruskal–Wallis and Spearman rank correlation
Non-Parametric Tests

 Steps – Analyze – Non Parametric Test


Mann Whitney Test – Substitute Independent
Sample T Test

 The Mann–Whitney test is the non-parametric substitute
for the independent-samples t test (for data that are
not normally distributed).
 Steps – Analyze – Non Parametric Test –
Independent samples
 Test field: Current Salary
 Group: Gender
 Go to settings – tick Customize tests and select
Mann–Whitney
 SPSS provides the results with the analysis
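The U statistic itself is just a count of pairwise "wins" between the two groups; a brute-force sketch on hypothetical salary data (SPSS additionally supplies the exact or asymptotic p value):

```python
def mann_whitney_u(a, b):
    # U_a = number of times an 'a' value beats a 'b' value (ties count 0.5);
    # report the smaller of U_a and U_b, as is conventional
    u_a = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    return min(u_a, len(a) * len(b) - u_a)

# hypothetical salaries (in thousands) for two gender groups
print(mann_whitney_u([12, 15, 18], [10, 11, 14]))  # -> 1.0
```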
Kruskal Wallis Test – Substitute of ANOVA

 A rank-based non-parametric test that can be used to determine whether there are
statistically significant differences between two or more groups of an independent
variable on a continuous or ordinal dependent variable.
 Alternative to the one-way ANOVA.
 For example, a test to understand whether exam performance, measured on a continuous
scale from 0-100, differs based on test anxiety level, where students are grouped into
"low", "medium" and "high" test anxiety levels
 DV: exam performance (continuous)
 IV: test anxiety level, which has three independent groups
Kruskal Wallis – Substitute of ANOVA

 When we have more than two groups to test field, we


use substitute of ANOVA test, the Kruskal Wallis test
 Taking the last example forward, we want to test
whether the distribution of current salary differs
across employment category, where employment category
has three options: clerical, custodial and manager
 Steps – Analyze – Non Parametric Test – Independent
samples
 Test field: Current Salary
Group: Employment category
 Go to settings – tick customize tests and select
Kruskal Wallis
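The Kruskal–Wallis H statistic is computed from the ranks of the pooled data: H = 12/(N(N+1)) · Σ R_i²/n_i − 3(N+1); a pure-Python sketch on hypothetical data (SPSS also provides the p value from the chi-square approximation):

```python
def kruskal_h(groups):
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # average rank for each distinct value (handles ties)
    rank = {}
    for v in set(pooled):
        idx = [i + 1 for i, x in enumerate(pooled) if x == v]
        rank[v] = sum(idx) / len(idx)
    # sum of (rank-sum squared / group size) over groups
    r = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * r - 3 * (n + 1)

# hypothetical salaries for three employment categories
print(kruskal_h([[1, 2], [3, 4], [5, 6]]))
```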
Wilcoxon Signed-Rank Test - Substitute of
Paired Sample T Test

 General test to compare distributions in paired samples.
 This test is usually the preferred alternative to the
paired t-test when the assumptions of the parametric test
are not satisfied.
 In a paired-sample test, each subject or entity is
measured twice, resulting in pairs of observations.
 E.g. you are interested in evaluating the effectiveness of a
company training program: measure the performance of a sample
of employees before and after completing the program, and
analyse the differences using a paired-sample test. If the
data are non-parametric, use the steps below.
 Steps – Analyze – Non Parametric Test – Related samples
 Go to settings – tick Customize tests and select Wilcoxon
 Specify the two items in the test fields. In this case, we
want to know whether there are differences in beginning
salary and current salary of the employees.
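The signed-rank W statistic ranks the absolute before/after differences and sums the ranks of the positive and negative differences separately; a sketch on hypothetical salary data (assumes no ties among the absolute differences, which SPSS handles by averaging ranks):

```python
def wilcoxon_w(before, after):
    # drop zero differences, rank the rest by absolute size
    diffs = [b - a for a, b in zip(before, after) if b - a != 0]
    ranked = sorted(diffs, key=abs)
    # assumes no ties in |d| for this sketch (ties get averaged ranks in SPSS)
    w_pos = sum(i + 1 for i, d in enumerate(ranked) if d > 0)
    w_neg = sum(i + 1 for i, d in enumerate(ranked) if d < 0)
    return min(w_pos, w_neg)

# hypothetical beginning vs. current salaries for five employees
print(wilcoxon_w([50, 48, 52, 55, 49], [54, 50, 51, 60, 52]))  # -> 1
```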
The Friedman test - Substitute of One way
repeated ANOVA

 Steps – Analyze – Non Parametric Test –
Related samples
 Go to settings – tick Customize tests and select
Friedman
 Specify the repeated measures in the test fields.
In this case, we want to know whether there
are differences in beginning salary and current
salary of the employees
 This test compares the distributions, whereas
Wilcoxon tests the median difference
Moderation and Mediation
OPTIONAL
Moderation

 A moderator variable, commonly denoted as just M, is a
third variable that affects the strength of the relationship
between a dependent and an independent variable.
 The moderator variable, if found to be significant, can
cause an amplifying or weakening effect between X and Y.
 Example:
 IV: Stress
 DV: Health Status
 Stress has a bigger impact on men than women.
 Gender is a qualitative variable that moderates the
strength of the effect between stress and health status.
Process Macro

 Download Process Macro by Andrew Hayes for
easy moderation and mediation analysis.
 Go to http://processmacro.org/download.html
and download the zip file (also available on
Slate, week 13)
 After opening SPSS, under the "Utilities"
menu, choose "Custom Dialogs" and then
"Install Custom Dialog". Then locate the
PROCESS dialog builder file and click "Open."
 Process file is on the following path:
Process Macro

 After installing Process Macro – go to
Analyze – Regression and Process v3
 The model templates for Process Macro are
available at the following link:
http://www.personal.psu.edu/jxb14/M554/specreg/templates.pdf
 Also available as the Week 13 Templates Process Macro
file on Slate
Moderation on Process Macro

 If you have 1 moderator in the relationship


between X on Y, You can select model 1
Manual Moderation without Process Macro

 Create an interaction variable for the IV and the moderating
variable by multiplying the two variables:
 Transform – Compute Variable
 Assign a name for the target variable
 Numeric Expression: IV * Moderator
(where IV and Moderator are the names of the variables
representing your IV and moderator)
 Regress the IV, the moderator and the new interaction
variable as IVs on your DV.
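The Compute Variable step amounts to elementwise multiplication of the two columns; a minimal sketch with hypothetical IV and moderator values:

```python
# hypothetical IV and moderator columns, one value per respondent
iv = [2, 3, 4, 5]
mod = [1, 1, 2, 2]

# the interaction term is just IV * Moderator, row by row
interaction = [a * b for a, b in zip(iv, mod)]
print(interaction)  # -> [2, 3, 8, 10]
```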
Moderation

 If the result of interaction variable as IV on


your DV is significant and positive, it means
the moderator positively and significantly
affected the strength of the relationship
between your IV and DV such that the
relationship is stronger when moderating
variable is high.

 The first figure shows process macro results


and the second figure shows regression results
Moderation Analysis

 IV: Op mean
 DV Lifesat_mean
 Moderator: Sest
 Moderation Interaction: Op_sest

 In this case the interaction between the IV and the moderator
(int_1, or op_sest) has a value of -.122, which shows that the
relationship between the IV and the DV is positive and is
stronger when sest is low (low because op_sest has a negative
coefficient)
 However, the p value of the interaction (0.785) is greater
than 0.05, which means the moderation results are not
significant
Significance

 Either you can check p value for significance or LLCI and ULCI values
 LLCI and ULCI range should not have a ZERO in between to be significant
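The LLCI/ULCI rule is easy to encode: the effect is significant when the confidence interval excludes zero; a minimal sketch:

```python
def ci_significant(llci, ulci):
    # significant when zero is NOT inside the [LLCI, ULCI] interval
    return not (llci <= 0 <= ulci)

print(ci_significant(0.12, 0.48))   # -> True  (zero outside the interval)
print(ci_significant(-0.05, 0.30))  # -> False (zero inside the interval)
```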
Mediation Analysis

 A mediator mediates the relationship between the independent and dependent variables


 i.e. explaining the reason for such a relationship to exist.
 Another way to think about a mediator variable is that it carries an effect.
 In a perfect mediation, an independent variable leads to some kind of change to the
mediator variable, which then leads to a change in the dependent variable.

Without the mediator: Turning on the stove → Boil water
With the mediator: Turning on the stove → Heat on the stove → Boil water
Mediation Analysis

 Mi is Mediator in Process Macro Template


 Whereas M is Moderator
 X on Y = Direct Effect
 X on M and M on Y = Indirect Effect
 Use Model 4 for this template
 IV: Op_Mean
 Med: SEST_Mean
 DV: Lifesat_Mean

 Outcome variable: sest_mean
 Shows the relationship: Op_mean → Sest_mean
Mediation Analysis

 Op_mean → Lifesat_mean
 Sest_mean → Lifesat_mean

 Path: Op_mean →(0.0752*) Sest_mean →(-0.417) Lifesat_mean
 Conclusion: NO mediation
