
Unit 5: Test of Significance/Hypothesis Testing (Topics 20, 22, 23)

This document outlines the steps for performing a hypothesis test on data from an experiment measuring students' perceptions of elapsed time. It describes the experimental data, defines the null and alternative hypotheses, checks that conditions for a t-test are met, calculates the t-statistic and p-value, and concludes that the null hypothesis is rejected at the 0.01 significance level, providing evidence that the mean estimated time differed from the actual 10 seconds.

Uploaded by

Riddhiman Pal
Copyright
© © All Rights Reserved

Unit 5: Test of Significance/Hypothesis Testing - Key

(Topics 20, 22, 23)

In this unit we will add to what we have learned about statistical inference by studying tests of
significance. These tests will assess the degree to which sample data provide evidence against a
particular conjecture or hypothesis about the value of the population mean. We will study the
formal structure of the tests, starting with the hypotheses and ending with a conclusion. Just as a
confidence interval allows us to infer something about a population mean, so does a significance
test.

Estimating Elapsed Time (from Introduction to Statistical Investigations by Tintle et al.)
Does it ever seem like time drags on or time flies by? Perception, including that of time, is one of the
things that psychologists study. Students in a statistics class collected data on 48 other students’
perception of time. They told their subjects that they would be listening to some music and then after it
was over, they would be asked some questions. They played 10 seconds of the Jackson 5’s song “ABC”.
Afterward, they simply asked the subjects how long they thought the song clip lasted. They wanted to see
whether students could accurately estimate the length of this short song segment. Below is a frequency
table for the data:
Time (sec)   5   6   7   8  10  12  13  15  20  21  22  30
Frequency    1   1   3   6  11   3   3  10   4   1   1   4

Let’s explore this study using a formal, step-by-step process called a test of significance. We will outline
the six main steps in such a test throughout this activity and you will use these same steps for the next two
topics as well. Since we will be working with samples of quantitative data, we will be conducting t-tests.

a. Write out a definition of the parameter of interest in the elapsed time study and indicate what symbol
you will use to represent it. The parameter of interest is the average time in seconds that a student
estimated elapsed while the clip of “ABC” was played. Use μ.

b. In this elapsed time study, the null hypothesis is that the mean time elapsed is 10 seconds. Restate this
using the symbol from part a and the appropriate hypothesized value, instead of words: H0: μ = 10

c. In the elapsed time study, the students think the other students’ estimates will differ from the actual
time since the perception of time is inaccurate. Restate this conjecture (with symbols and a number) as an
alternative hypothesis. Ha: μ ≠ 10

Note that the symbol and hypothesized value don’t change between the null and the alternative
hypothesis; you are just selecting between less than, greater than, or not equal to based on the research
conjecture.

d. What conditions need to be met for the Central Limit Theorem to apply?
We assume the sample was random, we know it is large (n = 48 > 30), and we assume the observations
are independent (n < 10% of the overall student population).

Note that you will often check these conditions assuming the null hypothesis is true. If random sampling
isn’t mentioned, it can usually be assumed because it is part of a well-designed experiment. If the sample
is not large enough (n < 30), look at displays of the distribution and/or a normal probability plot to make
sure the data appear to be normally distributed. If not, proceed with caution through the rest of the test. If
independence isn’t clear (if you don’t know how large the population is), you can usually assume a
sample is independent, but you should note you are assuming this when you state the conditions.

At this point, it is very good practice to draw a well-labeled sketch of the sampling distribution of the
sample mean using the mean of 10 seconds and the standard deviation of the sampling distribution,
$s/\sqrt{n} = 6.500/\sqrt{48} = 0.938$.

You should mark the observed value on the graph. In this case the observed value is the sample mean of
13.708 seconds, which is essentially off the graph.

[Figure: sketch of the sampling distribution of the sample mean. Horizontal axis: Sample Mean, Time
Elapsed in seconds, with tick marks at 7.185, 8.124, 9.062, 10, 10.938, 11.876, 12.815.]

e. Your calculator can find the test statistic, or t-statistic since this is a t-test, but you can find it as well
by calculating how many standard deviations of the sampling distribution the sample mean is away from
the population mean (very similar to the z-score calculation):
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{13.708 - 10}{6.500/\sqrt{48}} = 3.952$
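
As a cross-check on the hand calculation, the summary statistics and the t-statistic can be recomputed from the frequency table. This is a minimal Python sketch using only the standard library; the variable names are illustrative and are not part of the original activity.

```python
import math
import statistics

# Frequency table from the elapsed time study: estimated time (sec) -> count
freq = {5: 1, 6: 1, 7: 3, 8: 6, 10: 11, 12: 3, 13: 3, 15: 10, 20: 4, 21: 1, 22: 1, 30: 4}

# Expand the table into the 48 individual estimates
estimates = [time for time, count in freq.items() for _ in range(count)]

n = len(estimates)
x_bar = statistics.mean(estimates)
s = statistics.stdev(estimates)       # sample standard deviation (n - 1 divisor)
std_error = s / math.sqrt(n)          # standard deviation of the sampling distribution
t_stat = (x_bar - 10) / std_error     # hypothesized mean is 10 seconds

print(n, round(x_bar, 3), round(s, 3), round(std_error, 3), round(t_stat, 3))
```

The printed values should agree, up to rounding, with the sample mean, standard deviation, standard error, and t-statistic used in this activity.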

f. Since our alternative hypothesis is “not equal”, we will need the probability in both tails. This is a
two-sided or two-tailed test. Since our t-statistic is positive, we will find the probability above it in the t-
distribution with the correct degrees of freedom (47 in this case) and then we will multiply that
probability by 2 to calculate the total probability in both tails. You can do this using a t-table or using
your calculator (if you use your calculator with the correct alternative hypothesis, it will multiply by 2 for
you).

p-value = 0.000258 (your calculator might display something like 2.586E-4, which is scientific notation
for $2.586 \times 10^{-4}$; look carefully since the p-value is a probability and should be a number from 0 to 1)
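
If no calculator or t-table is at hand, the two-tailed area can be approximated numerically. The sketch below is not the method used in this activity; it assumes a plain Simpson's rule integration of the t-distribution density, and the function names `t_pdf` and `two_sided_p_value` are made up for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t_stat, df, steps=10_000):
    """Twice the area beyond |t_stat|, using Simpson's rule (steps must be even)."""
    t = abs(t_stat)
    h = t / steps
    area = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3                 # area under the curve between 0 and |t|
    upper_tail = 0.5 - area       # area above |t|
    return 2 * upper_tail         # both tails

p_value = two_sided_p_value(3.952, 47)
print(f"{p_value:.6f}")
```

The result should be close to the calculator's p-value; small differences come from rounding the t-statistic to 3.952. Subtracting from 0.5 loses precision for extremely small tail areas, so this sketch is only suitable for p-values of ordinary size.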

Note that the smaller the p-value, the stronger the evidence against the null hypothesis and in favor of the
alternative hypothesis. Typical evaluations are

• A p-value above .10 constitutes little or no evidence against the null hypothesis.
• A p-value below .10 but above .05 constitutes moderately strong evidence against the null
hypothesis.
• A p-value below .05 but above .01 constitutes reasonably strong evidence against the null
hypothesis.
• A p-value below .01 constitutes very strong evidence against the null hypothesis.

In some studies, the researcher decides in advance how small the p-value must be to provide convincing
evidence against the null hypothesis. This cutoff value is called a significance level, denoted by α (alpha).
Common values are α = .10, α = .05, and α = .01. A smaller significance level indicates a stricter
standard for deciding if the null hypothesis can be rejected. If the researcher specifies a level of
significance in advance, you would then say you reject or fail to reject at that particular level. Another
common expression is to say that the data are statistically significant if it is unlikely to have occurred by
chance or sampling variability alone (assuming that the null hypothesis is true).
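
The verbal scale and the significance-level rule above can be written as two small functions. This is only an illustrative sketch: the function names are hypothetical, and how to treat a p-value exactly equal to a cutoff is an assumption, since the text does not say.

```python
def evidence_against_null(p_value):
    """Verbal description taken from the list of typical evaluations."""
    if p_value > 0.10:
        return "little or no evidence"
    if p_value > 0.05:
        return "moderately strong evidence"
    if p_value > 0.01:
        return "reasonably strong evidence"
    return "very strong evidence"

def decision(p_value, alpha):
    """Reject the null hypothesis when the p-value is below the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# p-values reported in this unit: elapsed time, textbook bills, personality quiz
for p in (0.000258, 0.07313, 0.387):
    print(p, evidence_against_null(p), decision(p, alpha=0.05))
```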

g. Does the p-value for the Elapsed Time study lead to rejecting or failing to reject the null hypothesis at the
.01 level?
We reject the null hypothesis at the .01 level because our p-value of 0.000258 is less than 0.01.

h. Does this study provide convincing evidence that the mean estimate for time elapsed was different
than 10 seconds? Explain in context. Yes. It would be surprising to obtain a sample mean of 13.708
seconds if the true mean were 10 seconds, as we first saw in the graph of the sampling distribution. A
second piece of evidence is the large t-statistic, over 3, showing our sample mean is unlikely to occur
in a random sample when the true mean estimate is 10 seconds. The probability of obtaining a
t-statistic at least this extreme by random chance alone is 0.000258. We have strong evidence that the
mean time perceived by students as the time elapsed while a 10 sec clip of the song “ABC” is played
differs from 10 seconds.

Practice Exercise:
Use the 6-step process outlined above to complete a test of significance for the following situation (from
AMSCO’s AP Statistics): An association of college bookstores reported that the average amount of
money spent by students on textbooks was $325.16 with a standard deviation of $76.42. A random
sample of 75 students at the local campus of the state university indicated that an average bill for
textbooks for the semester in question to be $312.34. Do these data provide significant evidence at a 95%
confidence level (same as a 0.05 significance level) that the actual bill will be less than the $325.16 that
was reported? Show all steps.

1. The parameter is the mean bill for textbooks for college students.
2. H0: μ = 325.16
Ha: μ < 325.16
3. Because the sample size is large (75 > 30) and randomly selected, and the sample values are
independent of each other, we can apply the Central Limit Theorem.

4. $t = \frac{312.34 - 325.16}{76.42/\sqrt{75}} = -1.4528$
5. p-value = .07313
6. This is not significant at the .05 level. We fail to reject the null hypothesis because our p-
value of 0.07313 is greater than .05. There is insufficient evidence to show that the average
bill for college textbooks is less than $325.16.
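
The arithmetic in steps 4 and 5 can be checked with a few lines of Python. One assumption is made here and should be read as such: the reported p-value of .07313 agrees with a lower-tail area under the standard normal curve, which treats the reported $76.42 as a known standard deviation, so the sketch uses `statistics.NormalDist` rather than a t-distribution with 74 degrees of freedom.

```python
import math
from statistics import NormalDist

claimed_mean = 325.16   # mean reported by the bookstore association
sigma = 76.42           # reported standard deviation
n = 75
sample_mean = 312.34

std_error = sigma / math.sqrt(n)
test_stat = (sample_mean - claimed_mean) / std_error

# Lower-tail probability, because the alternative hypothesis is "less than"
p_value = NormalDist().cdf(test_stat)

print(round(test_stat, 4), round(p_value, 4))
```

A t-distribution with 74 degrees of freedom would give a slightly larger p-value, and the decision at the .05 level would be the same.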

Walk the Line Experiment

You’ll need someone to help you with this data collection. We want to test how much eyesight helps you
keep walking in a straight line. We assume without any eye covering, you could walk a straight line
without any issues. We will have you blindfolded, either inside or outside depending on your own
situation, and try to walk down a straight line for 10 ft. Hopefully there is already a line on the floor or
the sidewalk you use. At the end of the 10 ft, measure how far off the line (to either side) you are in
inches.
What is the parameter we wish to study?
The mean number of inches away from a line after walking blind-folded for 10 ft,
represented by μ

What is the null hypothesis? H0: μ = 0

What is the alternative hypothesis? Ha: μ > 0

Materials: blind-fold, helper to make sure you don’t run into anything, ruler

Procedure:
1. Find a spot, inside or outside, with 10 ft of a straight line.

2. Have someone blindfold you and place you at the start of the 10 foot segment.
3. Walk 10 ft ahead. Have someone stop you at the end of 10 ft.

4. Measure how far off the line you are at the end of the 10 ft in inches. Count either side as
positive.

Record the class data: (sample data given below; results based on sample)

student  dist off line (in)    student  dist off line (in)    student  dist off line (in)
1        1.5                   11       1.5                   21       1.9
2        15                    12       15.5                  22       1.8
3        1.7                   13       1.8                   23       1.7
4        2.3                   14       1.8                   24       1.5
5        1.9                   15       1.7                   25
6        9                     16       1.6                   26
7        1                     17       2                     27
8        1.5                   18       1.5                   28
9        15.5                  19       1.9                   29
10       1.1                   20       1.3                   30

Calculate the sample mean and standard deviation:

$\bar{x} = 3.675$, $s_x = 4.751$

In order to complete a t-test, verify the technical conditions:
We are not sure if participants were randomly sampled from any population in particular, maybe the TJ
Class of 2025? They are probably not any better or worse at walking in a straight line blind-folded than
most students their age, though. If the population is the freshman class, our sample is less than 10% of
it, so the observations can be treated as independent. The sample is smaller than 30, so we must check
normality. The normal probability plot (distance in inches on horizontal axis, z-scores on vertical axis)
is mostly a straight line with 3 points clearly off to the side. We will proceed with caution through the
rest of the test.

Calculate the test statistic:

We have one sample, but we don’t know the population standard deviation, so we will run a one
sample t-test. Since we are using the sample standard deviation, the test statistic will be a t-statistic
and can be found using $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$. The calculator also finds the test statistic for us.
Test statistic: t = 3.789

Calculate the p-value and find a confidence interval using a 95% confidence level:

(95% confidence level is the same as a 5% significance level or α=5% )


p-value = 0.0004, df = 23
95% confidence interval: (1.669, 5.681)

Summarize your conclusion:

The p-value = 0.0004 < 0.05 so we have evidence to reject the null hypothesis. The confidence
interval gives us further evidence: we are 95% confident that students end up, on average, between
1.669 and 5.681 inches off a straight line when they try to walk 10 ft blindfolded.
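
The test statistic and the confidence interval can be reproduced from the summary statistics alone. In this sketch the critical value `t_star = 2.0687` is an assumption taken from a t-table (df = 23, 95% confidence) rather than something computed in the activity.

```python
import math

n = 24            # number of students with a recorded distance
x_bar = 3.675     # sample mean (inches)
s = 4.751         # sample standard deviation (inches)
mu_0 = 0          # hypothesized mean under H0
t_star = 2.0687   # assumed t-table critical value, df = 23, 95% confidence

std_error = s / math.sqrt(n)
t_stat = (x_bar - mu_0) / std_error
ci_low = x_bar - t_star * std_error
ci_high = x_bar + t_star * std_error

print(round(t_stat, 3), round(ci_low, 3), round(ci_high, 3))
```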

Now complete Activity 20-1 starting on page 422 and Activity 20-2 starting on page 426 in your
textbook.

Topic 20 Summaries

1. From the Watch-Out on page 426,


a. Is a hypothesis about a parameter or a statistic? parameter

b. When should the alternative hypothesis be formulated? Before collecting the sample data,
based on the research question

c. What is the denominator of the test statistic? The standard error of the sample mean.

d. When you calculate the p-value for a two-sided alternative, what area is included? The total
area in both tails of the t-distribution beyond the value of the test statistic.

e. What does a small p-value indicate? You are unlikely to obtain such extreme sample data if
the null hypothesis is true, which provides evidence against the null hypothesis and in favor
of the alternative hypothesis.

2. From the Watch-Out on page 430, when should you be cautious about generalizing to a larger
population? When a sample is not chosen randomly.

Complete Activity 22-1 starting on page 472 in your textbook.

Personality Quiz
Students will take a personality-style quiz using the link: https://forms.gle/6UafB96FAD3njkga7. After
answering the quiz questions, students will calculate a score. The data collection will include the score
and whether the student was born before July 1 or not. The question is whether there is a difference in
score on the personality-style quiz based on time of year you were born.

1. Identify the parameter(s) to be measured:

μ_early is the mean score of students who are born before July 1 and μ_late is the mean score of
students born on or after July 1

2. State the hypotheses:


H0: μ_early = μ_late

HA: μ_early ≠ μ_late


Data Collection

Take the personality-style quiz using the link above. It is an FCPS Google link. Report your score to the
data collection spreadsheet and then use data from there to complete this problem.
(Sample data given. Results based on sample data.)
Scores for Students born before July 1 (points from quiz):
46 30 31 18 17 20 35 41 21 16
32 56 43 34 41 24 62 38 20 50
52 20 33 57 64 63 43 58 16 22

Scores for Students born on or after July 1 (points from quiz):


49 46 63 40 21 31 42 35 54 44
53 55 28 30 20 46 26 24 42 53
43 33 44 58 28 47 30 22 46 36

3. Check Assumptions (technical conditions):


Students are from TJ Class of 2025 and are a somewhat random sample, sample size is 30 (if you
use scores from students in different classes; otherwise check normality using a display and show
your display with analysis), sample is less than 10% of the population of the TJ freshmen class so
sample is independent. Students are either born before July 1 or born on or after July 1, so
samples are independent from each other as well.

Testing the results


Let n_E be the number of students born before July 1.
Let n_L be the number of students born on or after July 1.
Fill in the summary statistics for each sample:

n_E = 30   x̄_E = 36.4333   s_E = 16.211

n_L = 30   x̄_L = 39.633   s_L = 11.877

Which distribution will you use? Why? Are there any assumptions to be made that you haven't already
stated? Explain.
Since we don’t have the population standard deviation for either group, we will use the t-
distribution. Since our sample size might not be 30, checking that the samples are normal using a
normal probability plot would be a good idea. Either way, we will proceed with the test (possibly
with caution if samples are not normal or large enough!).

4. Test Statistic: We will run a two-sample t-test since the “treatment” is time of year born. The
observational units are the students.
t = -0.872

5. p-value = 0.387
d.f. (if applicable) = 53.169
95% C.I. = (-10.56, 4.159)

6. Test decision and conclusion in context:


With p = 0.387 > 0.05, we fail to reject the null hypothesis. We do not have evidence that the mean
score on a personality-style quiz differs by time of year students were born, before July 1 or later.
The 95% confidence interval is consistent with failing to reject the null hypothesis since it includes
0, which means a difference of 0 in mean personality quiz score by time of year born is plausible.
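
The two-sample t-statistic and the fractional degrees of freedom reported in steps 4 and 5 can be reproduced from the summary statistics. The sketch assumes the calculator uses the unpooled (Welch) two-sample procedure with the Welch-Satterthwaite degrees of freedom, which is consistent with d.f. = 53.169; the function name `welch_t` is illustrative.

```python
import math

def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Unpooled two-sample t-statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = s1 ** 2 / n1    # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2    # estimated variance of the second sample mean
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df

# Early-born group first, late-born group second
t_stat, df = welch_t(36.4333, 16.211, 30, 39.633, 11.877, 30)
print(round(t_stat, 3), round(df, 3))
```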

Topic 22 Summaries

1. From the Notes on page 475


a. How large sample sizes need to be depends on how non-normal the original populations are. If
both are symmetric with similar shapes, then the sample sizes may be smaller than 30, but if the
populations are skewed, we prefer sample sizes be even larger than 30.

b. What does it mean that the degrees of freedom convention is a conservative approximation?

The degrees of freedom is on the low side, so the critical value will be slightly greater than it
needs to be; thus, the interval will be slightly wider and, therefore, will succeed in capturing
µ1 - µ2 slightly more often than the confidence level indicates.

2. From page 479, all else being the same, a test result becomes more statistically significant as

a. The difference in sample means increases.

b. The sample standard deviations decrease.

c. The sample sizes increase.

3. From the Watch-Out on page 479, is failing to reject a null hypothesis the same as accepting it? No.

4. From the Watch-Out on page 483,


a. What is a good first step to choose a correct procedure? Identify the observational or
experimental units and variables.

b. Before performing a test, what must you check? Technical conditions.

c. What should you always relate your conclusion to? The context of the study.

Complete Activity 23-1 starting on page 498 in your textbook.
Complete Activity 23-3 on page 504 in your textbook.

Heart Rate Experiment: Is there a difference between your resting and active heart rate?

For most people, their resting heart rate is lower than their active heart rate. Heart rate is measured in
beats per minute and can be calculated by finding your pulse in your wrist, counting the beats in 10
seconds, and multiplying by 6 to get beats per minute.
In a matched pairs experiment, we can test that idea. We will test the same person’s heart rate at rest and
after 1 minute of jumping jacks.

1. Parameter:
µr is the mean resting heart rate in beats per minute (bpm), µa is the mean active heart rate in beats
per minute (bpm), and µd is the mean difference in heart rate in beats per minute (bpm), active
minus resting (µa- µr), for the population of RS 1 students participating in this experiment.

2. Hypotheses: H0: µd = 0
Ha: µd > 0

Materials: timer

Procedure:
1. After sitting (at rest) for at least one minute, find your resting heart rate. The pulse in your wrist
is usually easiest to find. Using two fingers (not your thumb), gently press on your wrist until you feel
your pulse. Count the number of beats in 10 seconds. Multiply that by 6 to get beats per minute.

2. Do jumping jacks for 1 minute.


3. Find your active heart rate. Immediately following your minute of jumping jacks, find your pulse
and count the number of beats in 10 seconds. Multiply that by 6 to get beats per minute.
Difference in heart rate in bpm (active – resting) = _________________________________

You will record this in the data collection spreadsheet.


Record the class data: (Sample data given. Results based on sample data.)

Student  Difference in Heart Rate (bpm)   Student  Difference in Heart Rate (bpm)   Student  Difference in Heart Rate (bpm)
1        76                               11       71                               21       67
2        40                               12       66                               22       17
3        49                               13       63                               23       51
4        21                               14       77                               24       88
5        41                               15       30                               25       74
6        76                               16       56                               26       66
7        100                              17       73                               27       71
8        65                               18       67                               28       62
9        19                               19       97                               29       34
10       83                               20       49                               30       87
Note: You will be conducting a one-sample t-test using the distribution of differences.

3. Technical Conditions:
Random sampling can be assumed as students were basically randomly placed in classes by a computer to
start the school year. Since sample size may not be 30 and since the sample size is close to 10% of the
population, samples should be checked to see that they are approximately normal using a normal
probability plot. A normal probability plot with the differences in bpm on the horizontal axis and the
corresponding z-scores on the vertical axis is shown:

[Figure: normal probability plot of the differences in heart rate]

Since the data in the plot are approximately linear, we can assume the sample is approximately normal.

4. Test statistic: t = 15.090

5. p-value: p-value ≈ 0 (calculator gave 1.440E−15, which is scientific notation for $1.440 \times 10^{-15}$, a very small number)


df = 29
95% Confidence Interval:
(52.905, 69.495)

6. Test decision (Significance Level: α = 0.05): With p ≈ 0 < 0.05 we have evidence to reject the
null hypothesis. There is strong evidence that freshmen in RS1 have a higher active heart rate than
resting heart rate measured in beats per minute. The 95% confidence interval gives further
evidence; we are 95% confident that, on average, the active heart rate of a freshman in RS1 is 52.905
to 69.495 beats per minute higher than the resting heart rate.
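
Because a matched pairs test is a one-sample t-test on the differences, the whole calculation can be run directly on the 30 recorded differences. The following sketch uses only the Python standard library; the critical value `t_star = 2.0452` is an assumed t-table value (df = 29, 95% confidence), and the variable names are illustrative.

```python
import math
import statistics

# Differences in heart rate (active - resting, bpm) for students 1 through 30
differences = [76, 40, 49, 21, 41, 76, 100, 65, 19, 83,
               71, 66, 63, 77, 30, 56, 73, 67, 97, 49,
               67, 17, 51, 88, 74, 66, 71, 62, 34, 87]

n = len(differences)
d_bar = statistics.mean(differences)
s_d = statistics.stdev(differences)      # sample standard deviation of the differences
std_error = s_d / math.sqrt(n)
t_stat = (d_bar - 0) / std_error         # H0: mean difference is 0

t_star = 2.0452                          # assumed t-table value, df = 29
ci = (d_bar - t_star * std_error, d_bar + t_star * std_error)

print(n, d_bar, round(t_stat, 3), round(ci[0], 3), round(ci[1], 3))
```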

Topic 23 Summaries

From the Watch-Out on page 502,


a. How do you determine which t-procedure to use? It depends on how the data are collected.
If the sampling or experimental design is paired, then use paired t-procedures. But if
the samples are drawn independently for the two groups, or if randomization is used to
separate treatment groups, then use two-sample t-procedures.

b. What can help you decide whether data are collected with a paired design? Ask whether
there is a link between each observation in one group with a specific observation in the
other group.

c. In which type of design does mixing up the order of values in one group create a problem?
Paired design.

d. If a study has a different number of observations between two groups, what type of design
can it not be? Paired design.

Summary

Hypothesis Testing – t-Test: Procedure to test if there is any statistical significance to your data.

Step 1: Collect data either through a sampling plan or experimental design.


Define your parameter of interest in context, using words and symbols.

Step 2: State competing claims concerning the parameter of interest. Write the null and alternative
hypothesis in words and symbols.

Step 3: Determine if the conditions are met to conduct an appropriate test.


Determine the sample size. Use it to check for independence.
Plot the data to check for normality if sample size is too small.
Check for randomness.

Step 4: Calculate test statistic, some measure of difference between values (z-score, t-score, etc.). Calculate
degrees of freedom.

Step 5: Calculate the p-value associated with the test statistic.

Step 6: Draw a conclusion by deciding whether to reject or fail to reject the null hypothesis. The
p-value from Step 5 is the probability, assuming that the null hypothesis is true, of obtaining the
observed sample statistic or a more extreme one. If this probability is smaller than the level of
significance, α, you should reject the null hypothesis. If this probability is larger than the level of
significance, α, you should fail to reject the null hypothesis.

Your conclusion has two parts. The first part (a) uses the p-value to state whether you
reject or fail to reject the null hypothesis. The second part (b) states, in the context of the
problem, what you are rejecting or failing to reject.

Another thing to consider:

A level of significance, α, is the maximum probability you are willing to allow of rejecting the null
hypothesis when it is actually true.

Unit 5 In-Class Review

1. Perform a complete hypothesis test: A state university is concerned that there is a difference in the
writing abilities of their male and female students. To test this assertion, the university took a random
sample of 60 of their first-year students and recorded their genders and SAT Writing scores. The data
appears below.

SAT Writing Scores of Female Students


480 540 620 590 530 620 580 530 530 560 510 560
560 550 520 480 560 510 500 540 490 430 610 620
510

SAT Writing Scores of Male Students


480 560 400 580 480 460 430 430 490 610 540 500
540 400 530 640 350 470 600 610 530 580 430 510
520 380 540 460 640 520 570 560 490 440 480

Use an appropriate t-test to compare these data sets and their population means.
1. μ_M is the mean SAT Writing score for males from the state university; μ_F is the mean SAT Writing
score for females from the state university

2. H0: μ_M = μ_F
Ha: μ_M ≠ μ_F

3. Random sample is stated. We can assume there are more than 600 students at a state university and the
samples are male and female, so the samples are independent and independent of each other. The size of
sample of males is 35 which is large enough that we do not need to check normality. The size of the
female sample is 25 so we will construct a normal probability plot to check normality. The female SAT
scores will be the x-values with each z-score as the y-value:

[Figure: normal probability plot of the female SAT Writing scores]

The data are close to forming a line, especially for the majority of the values, so we will proceed
as if the sample is normal.

4. Since we have two separate sample groups, male and female, we will run a two-sample t-test:
Test statistic: t = 2.155

5. p-value = 0.035
df = 57.699 or 57.670
95% confidence interval: (2.425, 65.689)

6. With the p-value = 0.035 < 0.05 we can reject the null hypothesis. We have evidence that there
is a difference in mean SAT Writing score for females and males at this state university. Also, the 95%
confidence interval does not contain 0; we are 95% confident that the difference in mean SAT Writing
scores (female minus male) is between 2.425 and 65.689 points at this state university.
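
The test statistic and degrees of freedom for this problem can be recomputed from the raw scores. The sketch below assumes the unpooled (Welch) two-sample procedure, which matches the fractional degrees of freedom in step 5, and takes the difference as female minus male so that the sign agrees with the reported t and confidence interval; the variable names are illustrative.

```python
import math
import statistics

female = [480, 540, 620, 590, 530, 620, 580, 530, 530, 560, 510, 560,
          560, 550, 520, 480, 560, 510, 500, 540, 490, 430, 610, 620,
          510]
male = [480, 560, 400, 580, 480, 460, 430, 430, 490, 610, 540, 500,
        540, 400, 530, 640, 350, 470, 600, 610, 530, 580, 430, 510,
        520, 380, 540, 460, 640, 520, 570, 560, 490, 440, 480]

# Estimated variance of each sample mean: s^2 / n
v_f = statistics.variance(female) / len(female)
v_m = statistics.variance(male) / len(male)

diff = statistics.mean(female) - statistics.mean(male)   # female minus male
t_stat = diff / math.sqrt(v_f + v_m)

# Welch-Satterthwaite degrees of freedom
df = (v_f + v_m) ** 2 / (v_f ** 2 / (len(female) - 1) + v_m ** 2 / (len(male) - 1))

print(round(diff, 3), round(t_stat, 3), round(df, 2))
```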

2. A study on children’s television viewing was conducted by Stanford researchers (Robinson, 1999). At
the beginning of the study, parents of third- and fourth-grade students at two public elementary schools in
San Jose were asked to report how many hours of television the child watched in a typical week. The 198
responses had a mean of 15.41 hours and a standard deviation of 14.16 hours.

Conduct a test of whether or not these sample data provide evidence at the .05 level for concluding that
third- and fourth-grade children watch an average of more than two hours of television per day. Include
all the components of a significance test, and explain what each component reveals. Start by identifying
the observational units, variable, sample, and population.

This is Self-Check 20-4. The solution is on pages 429-430 of the textbook.

3. Police trainees were seated in a darkened room facing a projector screen. Ten different license plates
were projected on the screen, one at a time, for 5 seconds each, separated by 15-second intervals.
After the last 15-second interval, the lights were turned on and the police trainees were asked to write
down as many of the 10 license plate numbers as possible, in any order at all.

A random sample of 15 trainees who took this test were then given a week-long memory training course.
They were then retested. The results are shown in the table (A is after training, B is before).

A  B
6  6
8  5
6  6
7  5
9  7
8  5
9  4
6  6
7  7
5  8
9  4
8  5
6  4
8  6
6  7

Test, at the 5% level of significance, that the memory course improves the ability of the trainees to
correctly identify license plates.

1. μ_A is the mean number of license plates remembered after the week-long memory course; μ_B
is the mean number of license plates remembered before the week-long memory course; μ_D is the
difference in the means, after minus before

2. H0: μ_D = 0
Ha: μ_D > 0
3. Apparently, all police trainees were given the original test, but then a random sample of 15 were
given the week-long memory course, so there is a random sample. We can assume there are more
than 150 police trainees, so we have an independent sample. Since the sample size, 15, is less than
30, we will construct a normal probability plot. The difference in number of license plates
memorized will be the x-values and the z-score for each will be the y-value:

[Figure: normal probability plot of the differences in number of license plates memorized]

Since the data values are mostly linear, we can assume that the sample is approximately
normal to meet the technical condition.

4. Since data were collected from each trainee before and after the memory course, we will run a
matched pairs one-sample t-test.
Test statistic: t = 2.699 or 2.700

5. p-value = 0.008 or 0.009


df = 14
95% confidence interval: (0.315, 2.751)

6. The p-value = 0.008 < 0.05 so we reject the null hypothesis. We have strong evidence that the
week-long memory course increases the mean number of license plates police trainees are able to
memorize. Since the 95% confidence interval doesn’t contain 0, we are 95% confident that the mean
increase in number of license plates memorized after the memory course is between 0.315 and 2.751.
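
The matched pairs calculation for this problem can be checked by pairing each trainee's after and before scores. In this sketch the two lists are read row by row from the results table, the critical value `t_star = 2.1448` is an assumed t-table value (df = 14, 95% confidence), and the variable names are illustrative.

```python
import math
import statistics

# Scores for the 15 trainees, in table order (A = after training, B = before)
after = [6, 8, 6, 7, 9, 8, 9, 6, 7, 5, 9, 8, 6, 8, 6]
before = [6, 5, 6, 5, 7, 5, 4, 6, 7, 8, 4, 5, 4, 6, 7]

# Pairing matters: each difference uses the same trainee's two scores
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
d_bar = statistics.mean(differences)
std_error = statistics.stdev(differences) / math.sqrt(n)
t_stat = d_bar / std_error               # H0: mean difference is 0

t_star = 2.1448                          # assumed t-table value, df = 14
ci_low = d_bar - t_star * std_error
ci_high = d_bar + t_star * std_error

print(n, round(d_bar, 3), round(t_stat, 3), round(ci_low, 3), round(ci_high, 3))
```

Shuffling one of the two lists would change the differences and therefore the test statistic, which is why the order of values cannot be mixed up in a paired design.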

Extra Practice: Unit 5 Review

1. Which type of sampling must be used to select the samples used for constructing confidence
intervals and performing hypothesis tests?
2. The null hypothesis is a claim about a:
a) parameter, where the claim is assumed to be false until it is declared true
b) parameter, where the claim is assumed to be true until it is declared false
c) statistic, where the claim is assumed to be false until it is declared true
d) statistic, where the claim is assumed to be true until it is declared false
3. If we want to calculate a confidence interval or perform a hypothesis test for a population mean,
when will we use the t-distribution rather than the z-distribution in the formulas and procedures?
4. The mean federal income tax paid last year by a random sample of 19 persons selected from a city
was $4275 with a standard deviation of $766. If we want to use this information to test at a 5%
significance level that the mean income tax of all persons in this city is more than $4000, we
a) could construct a Z-interval
b) could construct a T-interval
c) could perform a Z-test
d) could perform a T-test
5. In a hypothesis test, if we REJECT the null hypothesis at a 5% significance level, then it must be
that
a) P-value > 0.05
b) P-value < 0.05
c) P-value = 0.05
d) P-value > 0.025
6. A two-tailed hypothesis test using the normal distribution reveals that the area under the sampling
distribution curve of the mean and located in the tail to the right of the sample mean equals 0.028.
Consequently, the p-value for this test equals:
7. We want to know if there is a difference in pay among females and males at a large corporation. We
draw two random samples, one from the population of female employees and one from the
population of male employees at this corporation. The two samples are
a) independent
b) dependent
c) matched samples
d) paired samples
8. Drug A was given to 132 patients and Drug B was given to 127 patients in a Phase 3 clinical trial for
efficacy. Each drug claims to reduce patients' diastolic blood pressure. Blood pressure readings were
taken before and after administration of the drug to test parameters μA and μB . What type of T-test is
appropriate and how many degrees of freedom would you use in the following T-tests, using the
textbook's conservative approximations?
a) H0 : The average diastolic of patients given Drug A was 85, before administration.

b) H0 : The average change in diastolic of patients given Drug A was -5.
c) H0 : The average diastolic of patients given Drug A was the same as patients given Drug B,
before administration.
d) H0 : The average change in diastolic of patients given Drug B was -5.
e) H0 : Drugs A and B are equally effective because the average change in diastolic for the two
groups is identical.
9. A soft-drink manufacturer claims that its 12-ounce cans do not contain, on average, more than 30
calories. A random sample of 64 cans of this soft drink, which were checked for calories, contained
a mean of 32 calories with a standard deviation of 3 calories. Does the sample information support
the alternative hypothesis that the manufacturer's claim is false? Use a significance level of 5%.

10. Listed below are temperatures (in °F) of subjects measured at 8:00 am and then again at 12:00 am.
a) Construct a 95% confidence interval estimate of the difference between the 8:00 am
temperatures and the 12:00 am temperatures.
b) Test at 5% significance level the claim that the body temperature is the same at both times.
c) Explain the relationship between your answers to the above 2 parts.
8:00 AM    97.0  96.2  97.6  96.4  97.8  99.9
12:00 AM   98.0  98.6  98.8  98.0  98.6  97.6

11. John read that farmers in Japan routinely subject plants to stress before transplanting from the
greenhouse to the field. Methods of stress induction included pulling on the plants and hitting them with
straw rakes. John decided to investigate this phenomenon by growing two groups of bean plants
(10/group) in a greenhouse for 15 days during which time the plants in one group were pulled on three
times daily at 8:00 in the morning and at 4:00 in the afternoon. The plants were then transplanted to a
field. John hypothesized that stressed plants would exhibit significantly larger mean plant heights after
transplanting than the non-stressed plants (control). Use α = .05 and complete a hypothesis test showing
all work.
Plant heights (in cm.) after 30 days were:
Stressed Plants: 55, 65, 50, 57, 59, 73, 57, 54, 62, 68
Non-stressed Plants: 48, 65, 59, 57, 51, 63, 65, 58, 44, 50

12. Car emissions on highway and in-town


Claim: Mean level of emissions is less for highway driving than for stop-and-go in-town driving.
Data: Each car is driven both on the highway and in-town

Formulate and test an appropriate hypothesis.

Car   Stop-and-Go   Highway
1     1500          941
2     870           456
3     1120          893
4     1250          1060
5     3460          3107
6     1110          1339
7     1120          1346
8     880           644

Unit 5 Review Key
1. Random sampling of independent items; 2. B;
3. Use t when you don't know σ, the population standard deviation.
4. B or D; 5. B; 6. p = 2(0.028) = 0.056
7. A. If the corporation is sufficiently large, it is safe to assume the samples are independent.
8. a. 1-Sample T-test with 131 degrees of freedom, b. 1-Sample Matched Pairs T-test with 131 degrees
of freedom, c. 2-Sample T-test with 126 degrees of freedom, d. 1-Sample Matched Pairs T-test with
126 degrees of freedom, e. 2-Sample T-test with 126 degrees of freedom

9. The test rejects H0 in favor of HA : μ > 30 with a t-statistic of 5.3 and a p-value less than 10^−6.
10. a) A 95% confidence interval is (−0.9094,2.476) using the TI-84.
b) We fail to reject H0 at the 5% level because p = 0.2876
c) We are 95% confident that the difference in means is between -0.9094 and 2.476. Since 0 is
within this confidence interval, we cannot reject H0 at the 5% level. The 95% central probability
associated with the confidence interval is the complement of the 5% alpha-region which would
allow us to reject H0 .
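The interval in part (a) can be reproduced by hand from the paired differences. The sketch below is only an illustration: it assumes the differences are taken as 12:00 minus 8:00 (the order that matches the interval in the key), and it copies the critical value t* = 2.5706 for df = 5 from a t-table rather than computing it.

```python
import math
import statistics

# Temperatures (°F) from question 10, paired by subject.
temps_8am = [97.0, 96.2, 97.6, 96.4, 97.8, 99.9]
temps_12 = [98.0, 98.6, 98.8, 98.0, 98.6, 97.6]

# Assumed order of subtraction: 12:00 reading minus 8:00 reading.
diffs = [later - earlier for earlier, later in zip(temps_8am, temps_12)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
standard_error = statistics.stdev(diffs) / math.sqrt(n)

# Critical value for 95% confidence with df = n - 1 = 5, copied from a t-table.
T_STAR_DF5 = 2.5706

t_stat = mean_diff / standard_error
margin = T_STAR_DF5 * standard_error
ci = (mean_diff - margin, mean_diff + margin)

print(f"t = {t_stat:.3f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

The interval contains 0 exactly when |t| is smaller than t*, which is the relationship part (c) describes.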
11. Two-sample t-test for the stressed and non-stressed plants:
1. μS is the mean height in cm after 30 days for the stressed plants; μN is the mean height in cm
after 30 days for the non-stressed plants.
2. H0 : μS = μN
Ha : μS > μN
3. Simple random sampling isn’t stated, but we can assume that the 20 plants used were
randomly sampled from a larger population. The two samples are independent of each other,
and there are far more than 200 bean plants in the population (more than 10 times the 20
sampled), so the observations can be treated as independent. Since the sample size, 10, is
less than 30, we will check normality by constructing a normal probability plot for each
sample. The sample heights in cm are on the x-axis and the z-score for each height is the
y-value (the red squares are the stressed plants and the blue crosses are the non-stressed
plants):
The red squares from the stressed plants appear to form a line, so we can assume the sample is
normal. The blue crosses from the non-stressed plants are not quite as linear, but we will
proceed with caution.

4. Since the two groups of plants were part of different treatment groups, either being stressed
before moving or non-stressed, we will run a two-sample t-test.

Test statistic: t = 1.240
5. p-value = 0.115, df = 17.944 or 17.945
95% confidence interval: (-2.777, 10.777)
6. With p = 0.115 > 0.05, we fail to reject the null hypothesis. We do not have sufficient
evidence to say that the mean height in cm of the stressed plants is greater than the mean
height of the non-stressed plants. The 95% confidence interval includes the value 0, which
is consistent with there being no difference in the plant heights in cm for the two groups
on average.
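The test statistic and the fractional degrees of freedom for question 11 can be checked directly from the data. This is a minimal sketch of the unpooled (Welch) two-sample calculation, which is assumed here to be what the calculator used because it reproduces df ≈ 17.94. The function name `welch_t` is made up for this illustration.

```python
import math
import statistics

# Plant heights (cm) after 30 days, from question 11.
stressed = [55, 65, 50, 57, 59, 73, 57, 54, 62, 68]
non_stressed = [48, 65, 59, 57, 51, 63, 65, 58, 44, 50]


def welch_t(sample_a, sample_b):
    """Return (t, df) for the unpooled two-sample t-test of mean_a - mean_b."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Each term is a sample variance divided by its sample size.
    term_a = statistics.variance(sample_a) / n_a
    term_b = statistics.variance(sample_b) / n_b
    standard_error = math.sqrt(term_a + term_b)
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (term_a + term_b) ** 2 / (
        term_a ** 2 / (n_a - 1) + term_b ** 2 / (n_b - 1)
    )
    return t, df


t_stat, df = welch_t(stressed, non_stressed)
print(f"t = {t_stat:.3f}, df = {df:.3f}")
```

The p-value itself needs a t-distribution table or a calculator, so it is not computed here.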
12. Matched pairs t-test because emissions values were taken from each car, once when driven on the
highway and once when driven off the highway.
1. μN is the mean emissions value (no units given) for non-highway driving; μH is the
mean emissions value for highway driving; μD is the mean of the differences, non-highway
minus highway.
2. H0 : μD = 0
Ha : μD > 0
Note: if you set up the differences the opposite way, you would choose the
opposite alternative hypothesis.
3. Simple random sampling isn’t stated, but we can assume that the 8 cars were randomly
sampled from a larger population. We also know that there are far more than 80 cars, so
the observations can be treated as independent. Since the sample size, 8, is less than 30,
we will check normality by constructing a normal probability plot for the sample of
differences in emissions for each car. The difference in emissions for each car is on the
x-axis and the z-score for each car is the y-value:
From the normal probability plot we are not convinced that the sample data are normal, since
the points do not look approximately linear, so we will proceed with caution.

4. Since each car was tested for emissions after highway and non-highway driving, we will
conduct a matched pairs one-sample t-test:
Test statistic: t = 1.896 or 1.897
5. p-value = 0.049 or 0.050; df = 7
95% confidence interval: (-47.02, 428.02)
6. The p-value = 0.049 is just at or slightly under 0.05, so we reject the null hypothesis. We
have some evidence that emissions from cars driven off the highway are higher than
emissions from cars driven on the highway. The 95% confidence interval includes 0, so
the evidence that there is a difference is minimal, but the center of the confidence interval
is 190.5 (with a margin of error of 237.52), well above 0, showing that emissions from cars
driven off the highway tend to be much higher.
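The matched pairs numbers for question 12 can be checked from the table of emissions. The sketch below assumes differences of stop-and-go minus highway and copies t* = 2.3646 (95% confidence, df = 7) from a t-table. It shows that the midpoint of the interval is the mean difference, while the half-width is the margin of error.

```python
import math
import statistics

# Emissions for the 8 cars in question 12 (no units given).
stop_and_go = [1500, 870, 1120, 1250, 3460, 1110, 1120, 880]
highway = [941, 456, 893, 1060, 3107, 1339, 1346, 644]

# Assumed order of subtraction: stop-and-go minus highway, car by car.
diffs = [s - h for s, h in zip(stop_and_go, highway)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
standard_error = statistics.stdev(diffs) / math.sqrt(n)
t_stat = mean_diff / standard_error

# Critical value for 95% confidence with df = n - 1 = 7, copied from a t-table.
T_STAR_DF7 = 2.3646
margin = T_STAR_DF7 * standard_error
ci = (mean_diff - margin, mean_diff + margin)
center = (ci[0] + ci[1]) / 2

print(f"mean difference = {mean_diff}, t = {t_stat:.3f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), center = {center:.1f}, margin = {margin:.2f}")
```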

Glossary
Alternative hypothesis – a statement of what researchers suspect or hope to be true about the
parameter. It will take one of these three forms:
• Ha: parameter < hypothesized value
• Ha: parameter > hypothesized value
• Ha: parameter ≠ hypothesized value
The specific form (direction) of the alternative is determined by the research question, before
the sample data are collected.

Comparing two means – common inference procedure used when the response variable is
quantitative; procedure attempts to distinguish between an observed difference due to sampling
variability and one too large to have occurred by chance.

Conservative - describes decisions and/or calculations that may underestimate statistical
significance in order to avoid the worse error of saying an observed value or observed difference is
more unusual than it actually is. When calculating a confidence interval, a conservative approach
may yield a slightly wider interval. (see p.475) A test is conservative if, when conducted at a given
significance level, the true probability of incorrectly declaring significance is no larger than
that level.
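One common conservative choice in two-sample t procedures is to use the smaller sample size minus one as the degrees of freedom. The review key's answers to question 8 (126 degrees of freedom for samples of 132 and 127) are consistent with that rule, although the textbook's exact wording is not quoted here. A small sketch, with a made-up function name:

```python
def conservative_df(n1, n2):
    """Conservative degrees of freedom for a two-sample t procedure.

    Uses the smaller sample size minus one (an assumed rule, see lead-in).
    """
    return min(n1, n2) - 1


# Question 8: Drug A had 132 patients and Drug B had 127 patients.
df_drugs = conservative_df(132, 127)

# Question 11: two groups of 10 plants each.
df_plants = conservative_df(10, 10)

print(df_drugs, df_plants)
```

For question 11 this rule gives fewer degrees of freedom than the calculator value of about 17.94, which leads to a larger critical value and a slightly wider interval.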

Matched-pairs Experiment – Experiment that incorporates blocking, where the block size is 2. The
pairs may arise naturally, and they may not be independent.

Null hypothesis – A statement about the parameter of interest. Typically a statement of no effect or
no difference, the null states the parameter of interest is equal to a specific value:
H0: parameter = hypothesized value
One-tailed test – significance test conducted when the alternative hypothesis is one-sided. For
example, Ha: μ > μ0 or Ha: μ < μ0.

Practical significance – When large samples are available, even tiny deviations from the null
hypothesis will be statistically significant. But a tiny deviation may not have practical importance,
so use your common sense and look at the size of an observed difference. Ask yourself whether the
observed difference is important.
p-value – The probability, assuming the null hypothesis to be true, of obtaining a test statistic at
least as extreme as the one actually observed. Extreme means in the direction of the alternative
hypothesis.
Robust – Describes a procedure that tends to give reasonable results even for small sample sizes as
long as the population is not severely skewed and does not have extreme outliers

Significance Level – The cutoff for the p-value, chosen by the researcher in advance. A p-value
below this cutoff is regarded as convincing evidence against the null hypothesis.

Technical conditions for t-test – The t-test requires a simple random sample from a population of
interest. The t-test also requires either a large sample size or a normally distributed population.
You can generally regard a sample of at least 30 as large enough for the procedure to be valid. If
the sample size is less than 30, examine visual displays of the sample data to see whether they
appear to follow a normal distribution.

Test decision – A comment evaluating the strength of evidence against the null hypothesis. Where a
test decision needs to be made:
If the p-value is small, reject the null hypothesis.
If the p-value is not small, fail to reject the null hypothesis.
The decision should respond to the research question, stating that you either have evidence for the
alternative hypothesis (in context) or you do not. In other words, restate your final conclusions in
the language of the research question.

Test of Significance – A significance test is a formal procedure for comparing observed data with a
claim (hypothesis) whose truth we want to assess. The claim is a statement about a parameter, like
the population mean μ. We express the results of a significance test in terms of a probability that
measures how well the data and the claim agree.

Test statistic – This is a measure of the discrepancy between our observed statistic and the
hypothesized value of the parameter. If the discrepancy is large, we have evidence against the null
hypothesis.
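As an illustration, the one-sample t-statistic for review question 9 (sample mean 32, hypothesized mean 30, standard deviation 3, n = 64) can be computed as below. The function name is made up for this sketch. The tail probability uses the standard normal curve as a stand-in for the t-distribution with 63 degrees of freedom, which is an assumption that understates the t tail somewhat.

```python
import math
from statistics import NormalDist


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Discrepancy between the sample mean and the hypothesized mean, in standard errors."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Review question 9: calories in a random sample of 64 cans.
t_stat = one_sample_t(sample_mean=32, hypothesized_mean=30, sample_sd=3, n=64)

# Rough upper-tail probability using the normal curve (approximation, see lead-in).
approx_p = 1 - NormalDist().cdf(t_stat)

print(f"t = {t_stat:.2f}")
```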

Two-sample t-tests – Inference procedure to compare two populations or two treatments. We
examine the difference x̄1 − x̄2 and compare it to the hypothesized difference μ1 − μ2.

Two-tailed test – When we look for results at least as extreme as the sample result in both
directions. When the alternative hypothesis is two-sided (not equal to), we find the p-value by
computing
2 ∙ P(Z > |z|)
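The formula can be tried out with Python's standard library. This sketch uses review question 6 as the example: a right-tail area of 0.028 is first turned back into a z-score, and the two-tailed p-value is then twice that tail area. The function name `two_tailed_p` is made up for this illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_tailed_p(z):
    """Return 2 * P(Z > |z|) for a standard normal Z."""
    upper_tail = 1 - STANDARD_NORMAL.cdf(abs(z))
    return 2 * upper_tail


# Review question 6: the area to the right of the sample mean is 0.028.
z = STANDARD_NORMAL.inv_cdf(1 - 0.028)
p_value = two_tailed_p(z)

print(f"z = {z:.3f}, two-tailed p-value = {p_value:.3f}")
```

Because the absolute value is used, a z-score of the same size in the left tail gives the same p-value.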
