
MBR Lab Week 10 - 12

Missing data

 If there are missing values in the data and they are no more than 5% of the cases,
go to Transform – Replace Missing Values
SPSS - Mean, Median, Mode

 Analyze
 Descriptive Statistics
 Frequencies

 Statistics
 Select Quartile, mean,
median, mode,
skewness, kurtosis etc.
SPSS – Graphs (to check frequencies, normal
data…)

 Analyze
 Descriptive Statistics
 Frequencies

 Charts
 To see normal
distribution curve,
bar charts and pie
charts
SPSS- Skewness and Kurtosis

 SPSS
 Analyze
 Descriptive Statistics
 Descriptives
 Options
 Check Skewness and Kurtosis
Value of Skewness and Kurtosis

 The values of skewness and kurtosis should be zero in a


normal distribution.
 Skewness
 Positive values of skewness indicate a pile-up of scores on
the left of the distribution.
 Negative values indicate a pile-up on the right.
 Kurtosis
 Positive values of kurtosis indicate a pointy and heavy-tailed
distribution
 Negative values indicate a flat and light-tailed distribution.
 The further the value is from zero, the more likely it is that
the data are not normally distributed.
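The sign conventions above can be illustrated with the plain moment formulas; a minimal pure-Python sketch on hypothetical data (note SPSS's Descriptives applies a small-sample correction, so its printed values differ slightly from these):

```python
import math

# Simple population-moment formulas; SPSS applies a small-sample
# correction, so its printed values will differ slightly.
def skewness(xs):
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return sum((x - m) ** 3 for x in xs) / (n * s ** 3)

def excess_kurtosis(xs):
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / n
    return sum((x - m) ** 4 for x in xs) / (n * s2 ** 2) - 3

# hypothetical scores piled up on the left (low end) with a long right tail
data = [1, 1, 2, 2, 2, 3, 3, 4, 7, 10]
print(skewness(data))         # positive: pile-up of scores on the left
print(excess_kurtosis(data))  # positive: heavy-tailed
```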
Removing Outliers

 Analyze – Descriptive Statistics – Explore


 Keep all the variables you want to check for
outliers in Dependent list
 Use "Label cases by" if you have a unique
identifier for a case, such as the respondent's
serial number
Removing Outliers - Boxplot

 This will show a boxplot
 Case number 11 is problematic
 Cases 2, 50, 606 and 25 are also outliers
 How many you remove depends on your judgement
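Boxplots flag points beyond 1.5 × IQR from the quartiles; a minimal sketch of that rule on hypothetical data (quantile conventions vary slightly between packages, so exact cutoffs may differ from SPSS's Explore output):

```python
def iqr_outliers(xs):
    s = sorted(xs)
    n = len(s)
    def q(p):  # linear-interpolation quantile
        k = p * (n - 1)
        f = int(k)
        return s[f] + (k - f) * (s[min(f + 1, n - 1)] - s[f])
    q1, q3 = q(0.25), q(0.75)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # points outside the 1.5*IQR fences are flagged as outliers
    return [x for x in xs if x < lo or x > hi]

print(iqr_outliers([12, 14, 15, 15, 16, 17, 18, 19, 20, 95]))  # -> [95]
```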
Removing Outliers

 Data – Select Cases – If condition is satisfied,
to exclude extreme values
 E.g. exclude the cases whose serial numbers
are 23, 25 and 100
Split File – Analyze results according to
groups

 Data – Split File


 Click Compare groups or organize output by
groups
 Select the group you want to divide the data
into.

 To reverse the split file function, select Analyze all
cases, do not create groups
Split File – Analyze results according to
groups - Output for Descriptive stats
Merging Files

 Data – Merge Files – Add cases or Variables


Reliability Analysis
 Reliability analysis refers to the fact that a scale
should consistently reflect the construct it is
measuring. In simple words, reliability analysis
checks the internal consistency of the items.
 Cronbach's alpha is the most common measure of
internal consistency ("reliability").
 To perform: Analyze – Scale – Reliability Analysis

Cont…

 Check the Cronbach's Alpha value; it should be
more than .7
 If the value of Cronbach's Alpha is less than .7:
 Check "Cronbach's Alpha if Item Deleted" in the
"Item-Total Statistics" table.
 If deleting an item would raise alpha above .7,
delete that item.
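The alpha that SPSS reports comes from the standard formula α = (k/(k−1))·(1 − Σ item variances / variance of totals); a pure-Python sketch with hypothetical item responses:

```python
def cronbach_alpha(items):
    # items: one list per scale item, all the same length (one entry per respondent)
    k = len(items)
    n = len(items[0])
    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    totals = [sum(col[i] for col in items) for i in range(n)]
    return (k / (k - 1)) * (1 - sum(var(c) for c in items) / var(totals))

# three hypothetical items answered by five respondents
items = [[4, 5, 3, 4, 2],
         [4, 4, 3, 5, 2],
         [5, 5, 2, 4, 1]]
print(cronbach_alpha(items))  # well above the .7 rule of thumb
```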
Transforming Variables

 Compute Variables
 (For example, create a mean for 5 items of Job satisfaction)
 Transform – Compute Variable
 Assign a name for the new variable and in numerical expressions write
Mean(variable1,variable2,variable3,variable4,variable5).
 Recode into Same Variables
 Assign old and new values
 Recode into Different Variables
 Assign old and new values
 Define a new Variable
 Replace Missing Values
 If you want to replace all the missing values with the means, median or interpolation of the values
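SPSS's MEAN() function skips missing values rather than failing; a quick sketch of that behaviour, with None standing in for a system-missing value:

```python
def spss_mean(*vals):
    # SPSS MEAN() averages only the non-missing values (None = missing here)
    present = [v for v in vals if v is not None]
    return sum(present) / len(present) if present else None

print(spss_mean(4, 5, None, 3, 4))  # -> 4.0 (mean of the four present values)
```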
Cross Tab

 To compare results between two groups

 Analyze – Descriptive Statistics – Crosstabs
 Specify Rows and Columns

 The variables should have a limited number of
categories (categorical, non-metric data)
Cross Tab (Continued)

 Analyze – Descriptive Statistics – Crosstabs
 Specify Rows and Columns
 Click Statistics – Chi-square and Phi and
Cramer's V
 If the value is significant (less than 0.05), it
means there is a difference on the basis of the
grouping variable
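The chi-square statistic and Cramér's V that SPSS prints can be reproduced by hand from the observed and expected counts; a sketch with a hypothetical 2×2 gender-by-smoking table (SPSS additionally reports the p value):

```python
import math

def chi_square(table):
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    # sum of (observed - expected)^2 / expected over all cells
    chi2 = sum((table[i][j] - rows[i] * cols[j] / total) ** 2
               / (rows[i] * cols[j] / total)
               for i in range(len(rows)) for j in range(len(cols)))
    # Cramér's V normalises chi-square to the 0..1 range
    v = math.sqrt(chi2 / (total * (min(len(rows), len(cols)) - 1)))
    return chi2, v

# hypothetical counts: gender (rows) vs. smoker yes/no (columns)
chi2, v = chi_square([[30, 10], [15, 25]])
print(chi2, v)
```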
T Test – Independent

 We want to see whether a dichotomous grouping variable such as gender (male/female),
children (yes/no) or smoking (yes/no) affects a dependent variable.
 For that we conduct an independent-samples t test.
 "Independent" because the groups are independent of each other: the samples of males
and females do not depend on each other.
 The Sig. (2-tailed) value of the t test for equality of means (equal variances assumed)
should be less than 0.05 for the dependent variable to be significantly affected by the
grouping variable.
T Test – Independent (Continued)

 Analyze- Compare means –


Independent Sample T Test
 Grouping variable : Dichotomous
control variable such as gender
 Test Variable: Dependent Variable
 Click define groups and specify values
for male and female (e.g. 1 and 2 for
this data)
T Test – Independent (Continued)

 The Sig. (2-tailed) value of the t test
for equality of means (equal variances
assumed) should be less than 0.05 for
the dependent variable to be
significantly affected by the grouping
variable.
 As a rule of thumb, |t| should be
greater than 2 to be significant
 Variables Pss_mean and sd_mean
are significantly controlled by
variable gender
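The equal-variance (pooled) t statistic behind this table can be computed directly; a minimal sketch with hypothetical scores for two gender groups (SPSS also reports the degrees of freedom and p value):

```python
import math

def t_independent(a, b):
    # pooled-variance t statistic (equal variances assumed)
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

# hypothetical dependent-variable scores for the two groups
male = [72, 75, 70, 78, 74]
female = [65, 68, 66, 70, 64]
print(t_independent(male, female))  # > 2, significant by the rule of thumb
```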
One way - ANOVA

 Use one way ANOVA if your grouping


variable has more than 2 categories
 E.g. marital status (single, divorced,
remarried, widowed etc.)
 Analyze - compare means - one-way ANOVA
 Specify dependent variable and grouping
variable/ factor
One way – ANOVA (Continued)

 Sig should be less than 0.05


 The marital status variable significantly
affects lifesat_mean, sest_mean, sd_mean
and pc_mean
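The F statistic behind a one-way ANOVA table is the between-group mean square divided by the within-group mean square; a pure-Python sketch with hypothetical scores for three groups:

```python
def one_way_anova_f(groups):
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    # between-groups sum of squares: group means vs. grand mean
    ss_between = sum(len(g) * ((sum(g) / len(g)) - grand) ** 2 for g in groups)
    # within-groups sum of squares: scores vs. their own group mean
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# hypothetical life-satisfaction scores for three marital-status groups
f = one_way_anova_f([[5, 6, 7], [3, 4, 4], [8, 9, 7]])
print(f)
```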
Regression and Correlation on SPSS

 Details already covered in questionnaire analysis


 Linear Regression
 Analyze – Regression – Linear
 Correlation
 Analyze – Correlate – Bivariate
 Continuous Variable: Pearson
 Ordinal or Nominal Variable: Spearman
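Pearson's r can be cross-checked against its definition (covariance over the product of standard deviations); a small sketch with made-up data:

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - my) ** 2 for b in y))
    return num / den

print(pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))  # moderately strong positive r
```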
Partial Correlation

 Partial correlation is a measure of the strength and direction of a linear relationship


between two continuous variables whilst controlling for the effect of one or more other
continuous variables (also known as 'covariates' or 'control' variables).
 Although partial correlation does not make the distinction between independent and
dependent variables, the two variables are often considered in such a manner (i.e., you
have one continuous dependent variable and one continuous independent variable, as
well as one or more continuous control variables).
Partial Correlation (Continued)

 For example, you could use partial correlation to understand whether there is a linear relationship
between ice cream sales and price, whilst controlling for daily temperature
 Continuous dependent variable = "ice cream sales", measured in US dollars,
 Continuous independent variable = "price", also measured in US dollars
 Single control variable – that is, the continuous variable you are adjusting
for – daily temperature, measured in °C
 You may believe that there is a relationship between ice cream sales and prices (i.e., sales go
down as price goes up), but you would like to know if this relationship is affected by daily
temperature (e.g., if the relationship changes when taking into account daily temperature since
you suspect customers are more willing to buy ice creams, irrespective of price, when it is a
really nice, hot day).
Assumptions for correlation (Bivariate –
Partial)

 Pearson – Variables should be continuous


 Spearman, Kendall's tau – Variables should be ranked
 There needs to be a linear relationship between all variables. (Make a scatter plot and
check if the relationship is linear or not)
 There should be no significant outliers.
 Your variables should be approximately normally distributed
Analysis of Partial Correlation

 Steps – Analyze – Correlate – Partial


 Variables = Weight, VO2max
 Control = Age

 The Correlations table is split into two main parts: (a) the


Pearson product-moment correlation coefficients for all your
variables – that is, your dependent variable, independent
variable, and one or more control variables – as highlighted
by the blue rectangle
 You can get this by clicking Options and ticking Zero-order
correlations (ticking Means and standard deviations is also advisable)
 (b) the results from the partial correlation where the Pearson
product-moment correlation coefficient between the
dependent and independent variable has been adjusted to take
into account the control variable(s), as highlighted by the red
rectangle.
Analysis of Partial Correlation

 The results of the partial correlation highlighted by


the red rectangle show that there was a moderate,
negative partial correlation between the dependent
variable, "VO2max", and independent variable,
"weight", whilst controlling for "age", which was
statistically significant (r(97) = -.314, n = 100, p = .002).
 However, when we refer to the Pearson's product-
moment correlation – also known as the zero-order
correlation – between "VO2max" and "weight",
without controlling for "age", as highlighted by
the blue rectangle, we can see that there was also a
statistically significant, moderate, negative
correlation between "VO2max" and "weight" (r(98) =
-.307, n = 100, p = .002). This suggests that "age" had
very little influence in controlling for the
relationship between "VO2max" and "weight".
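The first-order partial correlation can be computed from the three pairwise Pearson correlations via r_xy.z = (r_xy − r_xz·r_yz) / √((1 − r_xz²)(1 − r_yz²)); a pure-Python sketch with hypothetical weight/VO2max/age data (not the dataset from the slides):

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return num / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))

def partial_r(x, y, z):
    # first-order partial correlation of x and y, controlling for z
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# hypothetical data (NOT the slides' dataset)
vo2max = [50, 46, 44, 41, 38]
weight = [60, 70, 80, 90, 100]
age = [25, 32, 30, 42, 40]
print(partial_r(vo2max, weight, age))  # still negative after controlling for age
```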
Regression Interpretation (Exercise)

 Finding Regression between two variables


 Where IV = Advertising Spending
 DV = Sales
 Sample N= 24

 SPSS will give Model Summary Table, ANOVA and Regression Coefficients
Regression Interpretation – R Square

Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .916   .839       .832                .73875
a. Predictors: (Constant), Advertising spending

ANOVA
Model        Sum of Squares   df   Mean Square   F         Sig.
Regression   62.514           1    62.514        114.548   .000
Residual     12.006           22   .546
Total        74.520           23
a. Dependent Variable: Detrended sales
b. Predictors: (Constant), Advertising spending

Coefficients
Model                  B       Std. Error   Beta   t        Sig.
(Constant)             6.584   .402                16.391   .000
Advertising spending   1.071   .100         .916   10.703   .000
a. Dependent Variable: Detrended sales

 R Square interpretation
 0.839, i.e. about 84% (R^2) of the variation in the DV (sales) is
explained by the IV (advertising spending)
 The remaining 16% (100% - 84%) of the variation in sales is due to
factors other than advertising spending (stochastic factors)
Regression Interpretation – P Value

 P value or Sig. – it should be less than 0.05 or 5%
 In this case, the value of .000 is less than 0.05, which means the
relationship between advertising spending and sales is significant.
Regression Interpretation – Constant & Slope

 Interpretation of beta (slope): a 1-unit increase in advertising
spending will bring, on average, a 1.071-unit increase in sales
 Interpretation of the constant: even if advertising spending = 0,
sales will be 6.584 due to other factors
Regression Interpretation – Equation

 Equation: Y = a + bX + e
 Y = Sales
 X = Advertising Spending
 a = constant or intercept = 6.584
 b = slope = 1.071
 e = the residual (error term). Note that 1 - R square = 1 - 0.839 = 0.161
is the proportion of variance left unexplained; it is not a constant
added to every prediction.
 Sales = 6.584 + 1.071 (Advertising Spending) + e
Regression Interpretation – SS MS

 Sum of square / df = Mean Square


 62.514/1 = 62.514
 12.006/22 = 0.546

 F = Mean square Regression/ Mean Square


Residual
 F= 62.514/0.546 = 114.548
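The arithmetic above can be reproduced in a few lines, using the values from the ANOVA table (the tiny discrepancy against the printed 114.548 comes from rounding in the printed sums of squares):

```python
# values taken from the ANOVA table on the previous slides
ss_reg, df_reg = 62.514, 1
ss_res, df_res = 12.006, 22

ms_reg = ss_reg / df_reg   # mean square regression = 62.514
ms_res = ss_res / df_res   # mean square residual ~ 0.546
f = ms_reg / ms_res        # F ~ 114.5 (table prints 114.548)
print(ms_reg, ms_res, f)
```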
Regression Interpretation – F and P Value

 Significant:
 P value <= 0.05
 If the F statistic is greater than the F critical
value, we reject H0 and accept H1
 F statistic (the calculated value) = 114.548
 F critical value ≈ 4 (for df = 1 and 22 at the 5% level)
 F stat > F critical: 114.548 > 4, so the results
are significant
Regression Interpretation – T Value

 If |t stat| > t critical, the results are significant

 t stat (constant) = 16.391
 t stat (advertising spending) = 10.703

 t critical ≈ 2 (rule of thumb)
 16.391 and 10.703 are both greater than 2,
so the results are significant
Regression Equation

 Alpha a = intercept
 Beta b = slope
 E = Residual or Error in the equation
 Y= Dependent Variable
 X= Independent Variable
 Alpha a = 6.584
 Beta b = 1.071
 Y= Sales
 X= Advertisement
 Sales=6.584 + 1.071 (advertisement)
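With those coefficients, prediction is just plugging into the line; a quick sketch (values from the slides; the error term e has mean zero, so it is dropped for a point prediction):

```python
# fitted coefficients from the slides' regression output
a, b = 6.584, 1.071

def predict_sales(ad_spend):
    # point prediction from the fitted line; e averages to zero
    return a + b * ad_spend

print(predict_sales(10))  # -> 17.294
```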
Step Wise Linear Regression

 Stepwise linear regression is a method of regressing multiple variables


while simultaneously removing those that aren't important.
 Stepwise regression is also used, when the control variable/s is identified.
 Stepwise regression essentially does multiple regression a number of times, each
time removing the weakest correlated variable. At the end you are left with the
variables that explain the distribution best.
Step Wise Linear Regression

 Enter the variables (e.g. control variables first, then press Next)
 Choose Enter or Stepwise in Method
 The results show R2 with just the controls, then with controls plus the
main IVs; you can compare the R2 values
Step Wise Linear Regression

 In linear regression, statistics, tick R square


change and collinearity diagnostic to see the
effect of new variables on R2 and their effect
on collinearity.
Difference between Correlation and
Regression?

 Correlation – Actual Data
 Regression – Line of best fit


Least Square Method

 Fitting the best line in the scatter plot


 The "least squares" method is a form of mathematical regression analysis used to
determine the line of best fit for a set of data, providing a visual demonstration of the
relationship between the data points. Each point of data represents the relationship
between a known independent variable and an unknown dependent variable.
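The least-squares line minimises the squared vertical distances to the points; for a simple regression the slope and intercept have closed forms, sketched here on hypothetical, perfectly linear data (so the fit is exact):

```python
def least_squares(x, y):
    # closed-form slope and intercept for simple linear regression
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) \
        / sum((a - mx) ** 2 for a in x)
    return my - b * mx, b  # intercept, slope

a, b = least_squares([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # -> 1.0 2.0 (the data lie exactly on y = 1 + 2x)
```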
Non- Parametric Tests
Parametric Test and Non-parametric tests

Parametric Test
 Many statistical procedures are parametric tests based on the normal
distribution.
 Types of tests:
 t-test (paired or unpaired)
 ANOVA (one-way non-repeated or repeated; two-way; three-way)
 Linear regression
 Pearson correlation

Non-parametric Test
 If you use a parametric test when your data are not parametric, the
results are likely to be inaccurate.
 Used when continuous data are not normally distributed or when dealing
with categorical/qualitative variables.
 Types of tests:
 Chi-squared, Fisher's exact test, Wilcoxon matched pairs,
Mann–Whitney U, Kruskal–Wallis and Spearman rank correlation
Non-Parametric Tests

 Steps – Analyze – Non Parametric Test


Mann Whitney Test – Substitute Independent
Sample T Test

 The Mann–Whitney test is the non-parametric substitute
for the independent-samples t test (for data that are
not normally distributed).
 Steps – Analyze – Non Parametric Test –
Independent samples
 Test field: Current Salary
 Group: Gender
 Go to settings – tick Customize tests and select
Mann–Whitney
 SPSS provides the results with the analysis
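The U statistic itself is just a count of pairwise "wins" between the two groups; a brute-force sketch on hypothetical salary data (SPSS additionally supplies the exact or asymptotic p value):

```python
def mann_whitney_u(a, b):
    # U_a = number of times an 'a' value beats a 'b' value (ties count 0.5);
    # report the smaller of U_a and U_b, as is conventional
    u_a = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    return min(u_a, len(a) * len(b) - u_a)

# hypothetical salaries (in thousands) for two gender groups
print(mann_whitney_u([12, 15, 18], [10, 11, 14]))  # -> 1.0
```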
Kruskal Wallis Test – Substitute of ANOVA

 A rank-based non-parametric test that can be used to determine whether there are
statistically significant differences between two or more groups of an independent
variable on a continuous or ordinal dependent variable.
 Alternative to the one-way ANOVA.
 For example, a test to understand whether exam performance, measured on a continuous
scale from 0-100, differs based on test anxiety level, where students are grouped into
"low", "medium" and "high" test anxiety levels
 DV: exam performance (continuous)
 IV: test anxiety level, which has three independent groups
Kruskal Wallis – Substitute of ANOVA

 When we have more than two groups to test field, we


use substitute of ANOVA test, the Kruskal Wallis test
 Taking the last example forward, we want to test
whether the distribution of current salary differs
across employment category, where employment category
has three options: clerical, custodial and manager
 Steps – Analyze – Non Parametric Test – Independent
samples
 Test field: Current Salary
Group: Employment category
 Go to settings – tick customize tests and select
Kruskal Wallis
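The Kruskal–Wallis H statistic is computed from the ranks of the pooled data: H = 12/(N(N+1)) · Σ R_i²/n_i − 3(N+1); a pure-Python sketch on hypothetical data (SPSS also provides the p value from the chi-square approximation):

```python
def kruskal_h(groups):
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # average rank for each distinct value (handles ties)
    rank = {}
    for v in set(pooled):
        idx = [i + 1 for i, x in enumerate(pooled) if x == v]
        rank[v] = sum(idx) / len(idx)
    # sum of (rank-sum squared / group size) over groups
    r = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * r - 3 * (n + 1)

# hypothetical salaries for three employment categories
print(kruskal_h([[1, 2], [3, 4], [5, 6]]))
```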
Wilcoxon Signed-Rank Test - Substitute of
Paired Sample T Test

 General test to compare distributions in paired samples.
 This test is usually the preferred alternative to the
paired t-test when the assumptions of the parametric test
are not satisfied.
 In a paired-sample test, each subject or entity is
measured twice, resulting in pairs of observations.
 E.g. you are interested in evaluating the effectiveness of a
company training program: measure the performance of a sample
of employees before and after completing the program, and
analyse the differences using a paired-sample test. If the
data are non-parametric, use the steps below.
 Steps – Analyze – Non Parametric Test – Related samples
 Go to settings – tick Customize tests and select Wilcoxon
 Specify the two items in the test fields. In this case, we
want to know whether there are differences in beginning
salary and current salary of the employees.
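The signed-rank W statistic ranks the absolute before/after differences and sums the ranks of the positive and negative differences separately; a sketch on hypothetical salary data (assumes no ties among the absolute differences, which SPSS handles by averaging ranks):

```python
def wilcoxon_w(before, after):
    # drop zero differences, rank the rest by absolute size
    diffs = [b - a for a, b in zip(before, after) if b - a != 0]
    ranked = sorted(diffs, key=abs)
    # assumes no ties in |d| for this sketch (ties get averaged ranks in SPSS)
    w_pos = sum(i + 1 for i, d in enumerate(ranked) if d > 0)
    w_neg = sum(i + 1 for i, d in enumerate(ranked) if d < 0)
    return min(w_pos, w_neg)

# hypothetical beginning vs. current salaries for five employees
print(wilcoxon_w([50, 48, 52, 55, 49], [54, 50, 51, 60, 52]))  # -> 1
```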
The Friedman test - Substitute of One way
repeated ANOVA

 Steps – Analyze – Non Parametric Test –
Related samples
 Go to settings – tick Customize tests and select
Friedman
 Specify the repeated measures in the test fields.
In this case, we want to know whether there
are differences in beginning salary and current
salary of the employees
 This test compares the distributions, whereas
Wilcoxon tests the median difference
Moderation and Mediation
OPTIONAL
Moderation

 A moderator variable, commonly denoted as just M, is a
third variable that affects the strength of the relationship
between a dependent and an independent variable.
 The moderator variable, if found to be significant, can
cause an amplifying or weakening effect between X and Y.
 Example:
 IV: Stress
 DV: Health Status
 Stress has a bigger impact on men than women.
 Gender is a qualitative variable that moderates the
strength of the effect between stress and health status.
Process Macro

 Download Process Macro by Andrew Hayes for
easy moderation and mediation analysis.
 Go to http://processmacro.org/download.html
and download the zip file (also available on
Slate, week 13)
 After opening SPSS, under the "Utilities"
menu, choose "Custom Dialogs" and then
"Install Custom Dialog". Then locate the
PROCESS dialog builder file and click "Open."
 Process file is on the following path:
Process Macro

 After installing Process Macro – go to
Analyze – Regression and Process v3
 The model templates for Process Macro are
available at the following link:
http://www.personal.psu.edu/jxb14/M554/specreg/templates.pdf
 Also available as the Week 13 Templates Process Macro
file on Slate
Moderation on Process Macro

 If you have 1 moderator in the relationship


between X on Y, You can select model 1
Manual Moderation without Process Macro

 Create an interaction variable for the IV and the moderating
variable by multiplying the two variables:
 Transform – Compute Variable
 Assign a name for the target variable
 Numeric Expression: IV * Moderator
(where IV and Moderator are the names of the variables
representing your IV and moderator)
 Regress the IV, the moderator and the new interaction
variable as IVs on your DV.
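The Compute Variable step amounts to elementwise multiplication of the two columns; a minimal sketch with hypothetical IV and moderator values:

```python
# hypothetical IV and moderator columns, one value per respondent
iv = [2, 3, 4, 5]
mod = [1, 1, 2, 2]

# the interaction term is just IV * Moderator, row by row
interaction = [a * b for a, b in zip(iv, mod)]
print(interaction)  # -> [2, 3, 8, 10]
```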
Moderation

 If the result of interaction variable as IV on


your DV is significant and positive, it means
the moderator positively and significantly
affected the strength of the relationship
between your IV and DV such that the
relationship is stronger when moderating
variable is high.

 The first figure shows process macro results


and the second figure shows regression results
Moderation Analysis

 IV: Op mean
 DV Lifesat_mean
 Moderator: Sest
 Moderation Interaction: Op_sest

 In this case the interaction between the IV and the moderator
(int_1, or op_sest) has a value of -.122, which shows that the
relationship between the IV and the DV is positive and is
stronger when sest is low (low because op_sest has a negative
coefficient)
 However, the p value of the interaction (0.785) is greater
than 0.05, which means the moderation results are not
significant
Significance

 Either you can check p value for significance or LLCI and ULCI values
 LLCI and ULCI range should not have a ZERO in between to be significant
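The LLCI/ULCI rule is easy to encode: the effect is significant when the confidence interval excludes zero; a minimal sketch:

```python
def ci_significant(llci, ulci):
    # significant when zero is NOT inside the [LLCI, ULCI] interval
    return not (llci <= 0 <= ulci)

print(ci_significant(0.12, 0.48))   # -> True  (zero outside the interval)
print(ci_significant(-0.05, 0.30))  # -> False (zero inside the interval)
```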
Mediation Analysis

 A mediator mediates the relationship between the independent and dependent variables


 i.e. explaining the reason for such a relationship to exist.
 Another way to think about a mediator variable is that it carries an effect.
 In a perfect mediation, an independent variable leads to some kind of change to the
mediator variable, which then leads to a change in the dependent variable.

Without the mediator: Turning on the stove → Boil water
With the mediator: Turning on the stove → Heat on the stove → Boil water
Mediation Analysis

 Mi is Mediator in Process Macro Template


 Whereas M is Moderator
 X on Y = Direct Effect
 X on M and M on Y = Indirect Effect
 Use Model 4 for this template
 IV: Op_Mean
 Med: SEST_Mean
 DV: Lifesat_Mean

 Outcome variable: sest_mean
 Shows the relationship: Op_mean → Sest_mean
Mediation Analysis

 Op_mean → Lifesat_mean
 Sest_mean → Lifesat_mean

 Path: Op_mean →(0.0752*) Sest_mean →(-0.417) Lifesat_mean
 Conclusion: NO mediation
