Independent Sample T Test


Independent Samples t Test

Richard Kofi Asravor (PhD)


rasravor@gctu.edu.gh
Objectives

⚫ Understanding the Independent Samples t Test

⚫ Common Uses

⚫ Data requirement
⚫ Sample data to be used in class is Sample_Dataset_2014.sav

⚫ SPSS procedures

⚫ Interpretation of the results


Definition
The Independent Samples t Test:
⚫ Compares the means of two independent groups in order to
determine whether there is statistical evidence that the
associated population means are significantly different.

⚫ Also known as:


 Independent t Test; Independent Measures t Test; Independent Two-sample t Test; Student t Test; Two-Sample t Test; Uncorrelated Scores t Test; Unpaired t Test; Unrelated t Test.

⚫ Variables used in this test are known as:


 Dependent variable, or test variable

 Independent variable, or grouping variable


Common Uses
The Independent Samples t Test is commonly used to:
⚫ Test for a statistical difference between

 the means of two groups

 the means of two interventions

 the means of two change scores

 Note: The Independent Samples t Test can only compare the means for two (and only two) groups.

 It cannot make comparisons among more than two groups.

Data Requirement
⚫ Your data must meet the following requirements:
 A dependent variable that is continuous (interval or ratio level)
 An independent variable that is categorical and has exactly two categories
 Cases that have values on both the dependent and independent variables
 Independent samples/groups (i.e., independence of observations)
⚫ There is no relationship between the subjects in each sample. This means that:
 Subjects in the first group cannot also be in the second group
 No subject in either group can influence subjects in the other group
 No group can influence the other group


Data Requirement
⚫ Violation of the independence assumption will yield an inaccurate p value

⚫ Random sample of data from the population.

⚫ Normal distribution (approximately) of the dependent variable for each group
 Non-normal population distributions, especially those that are thick-tailed or heavily skewed, considerably reduce the power of the test
 Among moderate or large samples, a violation of normality may still yield accurate p values

⚫ No outliers
Data Requirement
⚫ Homogeneity of variances (i.e., variances approximately equal
across groups)
 When this assumption is violated and the sample sizes for
each group differ, the p value is not trustworthy. However, the
Independent Samples t Test output also includes an
approximate t statistic that is not based on assuming equal
population variances.

 This alternative statistic, called the Welch t Test statistic, may be used when equal variances among populations cannot be assumed. The Welch t Test is also known as the Unequal Variance t Test or Separate Variances t Test.
⚫ Note: When one or more of the assumptions for the
Independent Samples t Test are not met, you may want to run the
nonparametric Mann-Whitney U Test instead.
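
As a rough illustration of the Welch statistic mentioned above, the following Python sketch computes the unequal-variance t value and its approximate (Welch-Satterthwaite) degrees of freedom. The function name and the mile times are made up for illustration and are not taken from Sample_Dataset_2014.sav.

```python
import math
from statistics import mean, variance

def welch_t(group1, group2):
    """Welch (unequal variances) t statistic and approximate df."""
    n1, n2 = len(group1), len(group2)
    # Each group's squared standard error of the mean.
    a = variance(group1) / n1
    b = variance(group2) / n2
    t = (mean(group1) - mean(group2)) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Hypothetical mile times in minutes (illustrative values only).
non_athletes = [9.5, 8.0, 10.5, 9.0, 8.5]
athletes = [6.5, 7.0, 6.0, 7.5]

t, df = welch_t(non_athletes, athletes)
print(round(t, 3), round(df, 3))
```

The non-integer degrees of freedom are the reason the "Equal variances not assumed" row in SPSS output shows a fractional df.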
Hypothesis Testing
The hypotheses to be tested:
⚫ Can be expressed in two different ways:

 H0: µ1 = µ2 ("the two population means are equal")

 H1: µ1 ≠ µ2 ("the two population means are not equal")

OR
 H0: µ1 - µ2 = 0 ("the difference between the two population
means is equal to 0")
 H1: µ1 - µ2 ≠ 0 ("the difference between the two population
means is not 0")

⚫ Where
 µ1 is the population mean of group 1, and

 µ2 is the population mean of group 2.


Levene’s Test for Equality of Variances
⚫ The Independent Samples t Test requires the assumption
of homogeneity of variance -- i.e., both groups have the same
variance.
⚫ SPSS includes a test for the homogeneity of variance, called Levene's Test, whenever you run an independent samples t test.

⚫ The hypotheses for Levene’s test are:
 H0: σ₁² - σ₂² = 0 ("the population variances of group 1 and 2 are equal")
 H1: σ₁² - σ₂² ≠ 0 ("the population variances of group 1 and 2 are not equal")

⚫ If we reject the null hypothesis of Levene's Test, it suggests that the variances of the two groups are not equal; i.e., that the homogeneity of variances assumption is violated.
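
To make the idea behind Levene's Test concrete, here is a minimal Python sketch of the F statistic for two groups. It assumes the mean-centred form of the test (absolute deviations from each group's own mean); that choice, the function name, and the data are assumptions for illustration, not something stated on these slides.

```python
from statistics import mean

def levene_f(group1, group2):
    """Levene's F statistic using absolute deviations from group means."""
    groups = [group1, group2]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each observation from its own group mean.
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_bar_i = [mean(zi) for zi in z]
    z_bar = mean([v for zi in z for v in zi])
    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(zi) * (zb - z_bar) ** 2 for zi, zb in zip(z, z_bar_i))
    within = sum((v - zb) ** 2 for zi, zb in zip(z, z_bar_i) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical mile times in minutes (illustrative values only).
non_athletes = [9.5, 8.0, 10.5, 9.0, 8.5]
athletes = [6.5, 7.0, 6.0, 7.5]

f_stat = levene_f(non_athletes, athletes)
# Two groups with identical spread give F = 0, whatever their means are.
f_equal_spread = levene_f([1, 2, 3], [11, 12, 13])
```

A large F (with a small Sig. value) indicates that the typical distance from the group mean differs between the groups, which is what a violation of homogeneity of variances looks like.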
Test Statistic

⚫ The calculated t value is then compared to the critical t value with df = n1 + n2 - 2 (when equal variances are assumed) from the t distribution table for a chosen confidence level.

⚫ If the absolute value of the calculated t value is greater than the critical t value, then we reject the null hypothesis (and conclude that the means are significantly different).
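
The comparison described above can be sketched in Python for the equal-variances-assumed case. This is only an illustration: the function name and data are hypothetical, and the critical value 2.365 is the two-tailed 5% entry for df = 7 from a standard t table.

```python
import math
from statistics import mean, variance

def pooled_t(group1, group2):
    """Equal-variances-assumed t statistic and its degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their df.
    sp2 = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se, df

# Hypothetical mile times in minutes (illustrative values only).
non_athletes = [9.5, 8.0, 10.5, 9.0, 8.5]
athletes = [6.5, 7.0, 6.0, 7.5]

t, df = pooled_t(non_athletes, athletes)
critical_t = 2.365  # two-tailed, alpha = 0.05, df = 7 (from a t table)
reject_null = abs(t) > critical_t
print(round(t, 3), df, reject_null)
```

In practice SPSS reports the p-value directly, so the table lookup is not needed; the sketch only shows what the comparison amounts to.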
About the DataSet
⚫ In our sample dataset, students reported their typical time to
run a mile, and whether or not they were an athlete. Suppose
we want to know if the average time to run a mile is different
for athletes versus non-athletes.

⚫ This involves testing whether the sample means for mile time
among athletes and non-athletes in your sample are statistically
different (and by extension, inferring whether the means for
mile times in the population are significantly different between
these two groups).

⚫ You can use an Independent Samples t Test to compare the mean mile time for athletes and non-athletes.
BEFORE THE TEST
⚫ Analyze > Descriptive Statistics > Explore:
 Use this procedure to obtain a comparative boxplot of mile time for athletes and non-athletes.

⚫ If the variances were indeed equal, we would expect the total length of the boxplots to be about the same for both groups.

⚫ From the boxplot, the spread of observations for non-athletes is much greater than the spread of observations for athletes.
RUNNING THE TEST
⚫ Click Analyze > Compare Means > Independent-Samples
T Test.
⚫ Move the variable Athlete to the Grouping Variable field, and
the variable MileMinDur to the Test Variable(s) area.
⚫ Athlete is defined as the independent variable
and MileMinDur is defined as the dependent variable.

⚫ Click Define Groups, which opens a new window.
 Use specified values is selected by default. Since our grouping variable is numerically coded (0 = "Non-athlete", 1 = "Athlete"), type “0” in the first text box, and “1” in the second text box.
⚫ Click Continue when finished.
⚫ Click OK to run the Independent Samples t Test. Output for the analysis will display in the Output Viewer window.
Run an Independent Samples t Test

⚫ (A) Test Variable(s): The dependent variable(s). This is the continuous variable whose means will be compared between the two groups. You may run multiple t tests simultaneously by selecting more than one test variable.
Run an Independent Samples t Test
⚫ (B) Grouping Variable:
 The independent variable. It must have at least two categories (groups); it may have more than two categories, but a t test can only compare two groups, so you will need to specify which two groups to compare.
 You can also use a continuous variable by specifying a cut point to
create two groups (i.e., values at or above the cut point and values
below the cut point).
⚫ (C) Define Groups:
 Click Define Groups to define the category indicators (groups) to
use in the t test. If the button is not active, make sure that you have
already moved your independent variable to the right in
the Grouping Variable field.
 You must define the categories of your grouping variable before you
can run the Independent Samples t Test procedure.
Run an Independent Samples t Test
⚫ (D) Options: The Options section is where you can set your desired confidence level for the confidence interval for the mean difference, and specify how SPSS should handle missing values.

⚫ Click OK to run the Independent Samples t Test, or click Paste to have the syntax corresponding to your specified settings written to an open syntax window.

⚫ Clicking the Define Groups button (C) opens the Define Groups window:
Define Groups

⚫ (1) Use specified values:
 If your grouping variable is categorical, select Use specified values. Enter the values for the categories you wish to compare in the Group 1 and Group 2 fields.
 If your categories are numerically coded, enter the numeric codes. If your grouping variable is a string, enter the exact text strings of the categories.
 If your grouping variable has more than two categories (e.g., takes on values of 1, 2, 3, 4), specify two of the categories to be compared (SPSS will disregard the other categories in this case).
Define Groups
⚫ (2) Cut point: If your grouping variable is numeric and
continuous, you can designate a cut point for dichotomizing the
variable.
 This will separate the cases into two categories based on the cut
point. Specifically, for a given cut point x, the new categories will
be:
⚫ Group 1: All cases where grouping variable ≥ x
⚫ Group 2: All cases where grouping variable < x
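
The cut point rule above can be sketched in a few lines of Python. The function name, the (grouping value, test value) pairs, and the cut point of 21 are all hypothetical and serve only to show how cases at or above the cut point go to Group 1 and the rest to Group 2.

```python
def split_by_cut_point(cases, cut_point):
    """Split (grouping_value, test_value) pairs into two groups."""
    # Group 1: grouping value at or above the cut point.
    group1 = [y for g, y in cases if g >= cut_point]
    # Group 2: grouping value below the cut point.
    group2 = [y for g, y in cases if g < cut_point]
    return group1, group2

# Hypothetical cases: (age in years, mile time in minutes).
cases = [(18, 8.5), (21, 7.0), (25, 9.0), (19, 6.5), (21, 8.0)]

group1, group2 = split_by_cut_point(cases, cut_point=21)
print(group1)
print(group2)
```

Note that a case exactly equal to the cut point lands in Group 1.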
⚫ Clicking the Options button (D) opens the Options window:

⚫ (D) Options: Clicking Options will open a window where you
can specify the Confidence Interval Percentage and how the
analysis will address Missing Values (i.e., Exclude cases
analysis by analysis or Exclude cases listwise).

⚫ Click Continue when you are finished making specifications

Group Statistics

⚫ Provides basic information about the group comparisons, including the sample size (n), mean, standard deviation, and standard error for mile times by group.

⚫ There are 166 athletes and 226 non-athletes. The mean mile
time for athletes is 6 minutes 51 seconds, and the mean mile
time for non-athletes is 9 minutes 6 seconds.
Independent Samples Test
⚫ (A) Levene's Test for Equality of Variances: This section has the test results for Levene's Test. From left to right:
 F is the test statistic of Levene's test

 Sig. is the p-value corresponding to this test statistic.

 The p-value of Levene's test is printed as ".000" (but should be read as p < 0.001 -- i.e., p very small), so we reject the null hypothesis of Levene's test and conclude that the variance in mile time of athletes is significantly different from that of non-athletes.

 This tells us that we should look at the "Equal variances not assumed" row for the t test (and corresponding confidence interval) results. (If this test result had not been significant -- that is, if we had observed p > α -- then we would have used the "Equal variances assumed" output.)

OUTPUT
⚫ (B) t-test for Equality of Means provides the results for the
actual Independent Samples t Test. From left to right:
 t is the computed test statistic, using the formula for the equal-variance-assumed test statistic (first row of table) or the formula for the equal-variance-not-assumed test statistic (second row of table).

 df is the degrees of freedom, using the equal-variance-assumed degrees of freedom formula (first row of table) or the equal-variance-not-assumed degrees of freedom formula (second row of table).
OUTPUT
 Sig (2-tailed) is the p-value corresponding to the given test statistic and
degrees of freedom
 Mean Difference is the difference between the sample means,
i.e. x1 − x2; it also corresponds to the numerator of the test statistic
for that test
 Std. Error Difference is the standard error of the mean difference
estimate; it also corresponds to the denominator of the test statistic
for that test

OUTPUT

⚫ (C) Confidence Interval of the Difference:
 Typically, if the CI for the mean difference contains 0 within the interval -- i.e., if the lower boundary of the CI is a negative number and the upper boundary of the CI is a positive number -- the results are not significant at the chosen significance level.
 In this example, the 95% CI is [01:57, 02:32], which does not contain zero; this agrees with the small p-value of the significance test.
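
The confidence interval check can be written out as a small Python sketch. It converts the reported bounds from mm:ss to seconds and tests whether zero lies inside the interval; the helper names are made up for illustration.

```python
def mmss_to_seconds(mmss):
    """Convert a 'mm:ss' string to a whole number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

def contains_zero(lower, upper):
    """True when the interval [lower, upper] includes 0."""
    return lower <= 0 <= upper

# 95% CI bounds for the mean difference, as reported in the output.
lower = mmss_to_seconds("01:57")
upper = mmss_to_seconds("02:32")

# Zero outside the interval corresponds to a significant difference.
significant = not contains_zero(lower, upper)
print(lower, upper, significant)
```

Both bounds are positive, so the interval excludes zero and the conclusion matches the significance test.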
DECISION AND CONCLUSIONS
⚫ Recall that our hypothesized population value was 66.5 inches, the [approximate] average height of the overall adult population in the U.S.

⚫ Since p < 0.001, we reject the null hypothesis that the mean height of students at this college is equal to the hypothesized population mean of 66.5 inches and conclude that the mean height is significantly different from 66.5 inches.

DECISION AND CONCLUSIONS
⚫ Since p < .001 is less than our chosen significance level α = 0.05, we can reject the null hypothesis, and conclude that the mean mile time for athletes and non-athletes is significantly different.

⚫ Based on the results, we can state the following:
 There was a significant difference in mean mile time between non-athletes and athletes (t(315.846) = 15.047, p < .001).
 The average mile time for athletes was 2 minutes and 14 seconds lower than the average mile time for non-athletes.
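
As a quick arithmetic check, the following Python sketch subtracts the rounded group means reported in the Group Statistics table (9 minutes 6 seconds and 6 minutes 51 seconds). The rounded means give 2 minutes 15 seconds, one second more than the 2 minutes 14 seconds reported; the likely explanation, which is an assumption here because the unrounded means are not shown, is that the reported difference was computed before rounding.

```python
def to_seconds(minutes, seconds):
    """Convert minutes and seconds to a whole number of seconds."""
    return minutes * 60 + seconds

# Rounded group means from the Group Statistics table.
non_athlete_mean = to_seconds(9, 6)
athlete_mean = to_seconds(6, 51)

difference = non_athlete_mean - athlete_mean
diff_minutes, diff_seconds = divmod(difference, 60)
print(difference, diff_minutes, diff_seconds)
```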

⚫The End

⚫Thanks
