Statistics II

Exercises Chapter 2

1. A pharmaceutical manufacturer is concerned about the impurity concentration in pills, and it is
anxious that this concentration does not exceed 3%. It is known that for a particular production
run, the concentration of impurities follows a normal distribution with standard deviation 0.4%. A
random sample of sixty-four pills from a production run was checked and the sample mean impurity
concentration was found to be 3.07%.
(a) Test at the 5% level that the population mean impurity concentration is 3% against the
alternative that it is more than 3%.
(b) Find the lowest level of significance at which the null hypothesis can be rejected.
(c) Suppose that the alternative hypothesis had been two-sided rather than one-sided. State,
without doing the calculations, whether the p-value of the test would be higher than, lower
than, or the same as that found in 1b. Sketch a graph to illustrate your reasoning.
(d) In the context of this problem, explain why a one-sided alternative hypothesis is more appropriate than a two-sided alternative.
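As an illustration of the mechanics behind exercise 1, the upper-tail z statistic and its p-value can be sketched with Python's standard library. This is only a sketch under the stated assumptions (known standard deviation, normal population); the function name `upper_tail_z_test` is hypothetical and not part of the exercise sheet.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(sample_mean, mu0, sigma, n):
    """Return (z, p-value) for H0: mu = mu0 against H1: mu > mu0, sigma known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 1.0 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value


# Values taken from exercise 1: mean 3.07%, sigma 0.4%, n = 64, mu0 = 3%.
z, p_value = upper_tail_z_test(sample_mean=3.07, mu0=3.0, sigma=0.4, n=64)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

The p-value returned here is the lowest significance level at which the null hypothesis would be rejected, which is what part (b) asks for.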
2. A statistics instructor is interested in the ability of students to assess the difficulty of a test they
have taken. A test was taken by a large group of students, and the average score was 78.5. A
random sample of eight students was asked to predict this average score. Their predictions were:
72 83 78 65 69 77 81 71
Assuming a normal population:
(a) Test the null hypothesis that the population mean prediction would be 78.5. Use a two-sided
alternative and a 10% significance level.
(b) If the same sample results had been obtained from a random sample of sixteen students, would
the conclusion of the test be different from that in 2a?
(c) Without doing any calculations and based on your answer to 2a, decide if a 90% confidence
interval for the population mean would include the value of 78.5 or not. What about a 95%
confidence interval for the mean?
3. A mayor in a large city claims that in one particularly depressed neighborhood, at least 20% of all
males between ages 18 and 65 are unemployed. A random sample of 120 people from this population
included twenty unemployed.
(a) Test the mayor's claim at a 5% significance level.
(b) Calculate the power of the test.
(c) Plot the power curve from 3b in Excel.
(d) Looking at the graph from 3c, for what values of p (roughly) is the power at least 0.8? For
those values of p, what is the upper bound on the probability of Type II error?
(e) What is the upper bound on the power for $p \ge 0.2$?
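The exercise asks for the power curve in Excel; as a cross-check, the same curve can be sketched in Python. The sketch assumes the usual normal approximation to the sample proportion and reads the mayor's claim as the null hypothesis that p is at least 0.2, rejected for small sample proportions. The helper names `critical_value` and `power` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def critical_value(p0=0.2, n=120, alpha=0.05):
    """Reject H0 (p >= p0) when the sample proportion is below this value."""
    return p0 - STD_NORMAL.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)


def power(p, p0=0.2, n=120, alpha=0.05):
    """Probability of rejecting H0 when the true proportion is p (normal approx.)."""
    c = critical_value(p0, n, alpha)
    return STD_NORMAL.cdf((c - p) / sqrt(p * (1 - p) / n))


# Tabulate the power function on a coarse grid of true proportions.
for p in (0.05, 0.10, 0.15, 0.20, 0.25):
    print(f"p = {p:.2f}  power = {power(p):.4f}")
```

At the boundary value p = 0.2 the power equals the significance level, and it decreases as p grows, which is the property part (e) relies on.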
4. One way to evaluate the effectiveness of a course instructor is to examine the scores achieved by his
or her students in an examination at the end of the course. Obviously, the mean score is of interest.
However, the variance also contains useful information - some teachers have a style that works very
well with more able students but is unsuccessful with less able or poorly motivated students. A
professor sets a standard examination at the end of each semester for all sections of a course. The
variance of the scores on this test is typically very close to 300. A new instructor has a class of
thirty students, whose test scores had a sample quasi-variance of 480. Regarding these students'
test scores as a random sample from a normal population:

(a) At a 5% significance level, test against a two-sided alternative the null hypothesis that the
population variance of their scores is 300.
(b) Based on your answer to 4a, decide if a 95% confidence interval for the population variance
would include the value of 300.
(c) Calculate the power of the test.
(d) Draw the power function from 4c in Excel.
(e) Looking at the graph from 4d, what is the probability of Type II error for $\sigma^2 = 500$ (roughly)?
5. State whether each of the following statements is true or false:
(a) The significance level of a test is the probability that the null hypothesis is false.
(b) A Type I error occurs when a true null hypothesis is rejected.
(c) A null hypothesis is rejected at the 0.025 level, but is accepted at the 0.01 level. This means
that the p-value of the test is between 0.01 and 0.025.
(d) The power of a test is the probability of accepting the null hypothesis that is true.
(e) If a null hypothesis is rejected against an alternative at a 5% level, then using the same data
it must be rejected against that alternative at the 1% level.
(f) If a null hypothesis is rejected against an alternative at a 1% level, then using the same data
it must be rejected against that alternative at the 5% level.
(g) The p-value of a test is the probability that the null hypothesis is true.
6. An insurance company employs agents on a bonus basis. It claims that in their first year, agents
will earn a mean bonus of at least 40000 euros and that the population standard deviation is no
more than 6000 euros. A random sample of nine agents found, for the bonuses in their first year,
$\sum_{i=1}^{9} x_i = 333$ and $\sum_{i=1}^{9} (x_i - \bar{x})^2 = 312$,
where the $x_i$ are measured in thousands of euros and the population distribution can be assumed to be
normal.
(a) Test at the 5% level the null hypothesis that the population mean is at least 40000 euros (use
a p-value approach).
(b) Test at the 10% level the null hypothesis that the population standard deviation is at most
6000 euros (use a p-value approach).
7. A manufacturer claims that a new windmill in a certain location can generate an average of at least
800 kWh of energy per day. Daily energy generation for the windmill is assumed to be normally
distributed with a standard deviation of 120 kWh. A random sample of 100 days is taken to test
this claim against the alternative hypothesis that the true mean is less than 800 kWh. The claim
will be accepted if the sample mean is 776 kWh or more and rejected otherwise.
(a) What is the probability of a Type I error using the decision rule if the population mean is
in fact 800 kWh per day?
(b) What is the probability of a Type II error using this decision rule if the population mean is
in fact 740 kWh per day?
(c) Suppose that the same decision rule is used, but with a sample of 200 days rather than 100
days.
i. Would the value of $\alpha$ be larger than, smaller than, or the same as that found in 7a?
ii. Would the value of $\beta$ be larger than, smaller than, or the same as that found in 7b?
Suppose now that a sample of 100 observations was taken but that the decision rule was changed
so that the claim would be accepted if the sample mean was at least 765 kWh.
(d) For this new decision rule:
i. Would the value of $\alpha$ be larger than, smaller than, or the same as that found in 7a?
ii. Would the value of $\beta$ be larger than, smaller than, or the same as that found in 7b?
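The Type I and Type II error probabilities of a fixed-cutoff decision rule such as the one in exercise 7 can be sketched as follows. This assumes a normally distributed sample mean with known standard deviation, as the exercise states; the function name `error_probabilities` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def error_probabilities(cutoff, mu0, mu1, sigma, n):
    """Error probabilities for the rule 'accept the claim if the sample mean >= cutoff'.

    alpha: probability of rejecting when the true mean is mu0.
    beta:  probability of accepting when the true mean is mu1.
    """
    sampling_sd = sigma / sqrt(n)  # standard deviation of the sample mean
    alpha = NormalDist(mu0, sampling_sd).cdf(cutoff)
    beta = 1.0 - NormalDist(mu1, sampling_sd).cdf(cutoff)
    return alpha, beta


# Values from exercise 7: cutoff 776 kWh, sigma 120 kWh, n = 100 days.
alpha, beta = error_probabilities(cutoff=776, mu0=800, mu1=740, sigma=120, n=100)
print(f"alpha = {alpha:.5f}, beta = {beta:.5f}")
```

Calling the same function with `n=200` or `cutoff=765` shows numerically how each error probability moves, which can be used to check the qualitative answers to parts (c) and (d).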
8. We want to test the null hypothesis that a population proportion is 0.5 against a two-sided alternative. One dilemma faced in such circumstances is that for any given significance level, the
larger the number of sample observations taken, the more likely it is that the null hypothesis will
be rejected. Why is this so, and why might the fact present a dilemma to an investigator planning
to use standard hypothesis testing techniques in these circumstances?
9. Consider the population of all residents of Getafe. We are interested in the weekly amount of money
Getafe residents spend on bread (assumed to be normally distributed). For a simple random sample
of ten residents, the following data (in euros) was obtained:
4.6 4.2 5.1 3.8 4.4 4.5 3.8 3.1 5.0 4.0
yielding $\sum_{i=1}^{n} x_i = 42.5$ and $\sum_{i=1}^{n} x_i^2 = 183.91$.

(a) Use a confidence interval approach to test the null hypothesis that the population variance $\sigma^2$
is 2 against a two-sided alternative at a 5% level.
(b) Now we switch our attention to another population parameter, the proportion of Getafenses
who spend no more than 4 euros on bread per week, p. We wish to perform an upper-tail test
of $H_0 : p \le 0.75$ at a 5% level. To do so, we need a large sample, so we collect an additional 25
observations, which combined with the earlier data give:
4.6 4.2 5.1 3.8 4.4 4.5 3.8 3.1 5.0 4.0
3.0 4.6 3.3 3.5 4.4 4.2 4.2 3.4 3.8 4.0
4.1 4.5 4.4 2.3 3.7 4.4 4.4 4.1 3.5 3.6
3.5 3.8 4.2 4.4 4.0
Find the sample proportion of Getafenses who spent no more than 4 euros on bread per week
and perform the desired test.
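A minimal sketch of the counting step and the large-sample z statistic for exercise 9b, using only the standard library. It assumes the normal approximation to the sample proportion and an upper-tail rejection region; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# The 35 observations listed in exercise 9b (euros per week).
spending = [
    4.6, 4.2, 5.1, 3.8, 4.4, 4.5, 3.8, 3.1, 5.0, 4.0,
    3.0, 4.6, 3.3, 3.5, 4.4, 4.2, 4.2, 3.4, 3.8, 4.0,
    4.1, 4.5, 4.4, 2.3, 3.7, 4.4, 4.4, 4.1, 3.5, 3.6,
    3.5, 3.8, 4.2, 4.4, 4.0,
]

n = len(spending)
successes = sum(x <= 4.0 for x in spending)  # residents spending no more than 4 euros
p_hat = successes / n

p0 = 0.75  # boundary value of the null hypothesis
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1.0 - NormalDist().cdf(z)  # upper-tail p-value

print(f"n = {n}, successes = {successes}, p_hat = {p_hat:.4f}")
print(f"z = {z:.3f}, upper-tail p-value = {p_value:.4f}")
```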
10. The data below represent one-way commuting times (in minutes) for a simple random sample of 15
people who work at a large assembly plant:
21.7 26.8 33.1 27.9 23.5 39.0 28.0 24.7 28.4 28.9 30.0 33.6 33.3 34.1 35.1

Assuming a normal distribution for the commuting times of those who work at the plant, a one-sample upper-tail t-test (population mean exceeds 28 minutes) at a 5% level was performed in
Excel. Was the null hypothesis rejected or not? What was the final conclusion? (Note: Excel does
not have a function to perform a one-sample t-test, so we actually use a two-sample t-test for paired
samples, where the first sample is our data and the second sample consists of n repetitions of $\mu_0$.)

11. Below you can find an Excel sheet corresponding to problem 9b. Interpret its contents and explain
to your friend how to carry out the test of problem 9b in Excel. Hint: in the dialog window (and
in blue), the value of the population variance is $0.09375 = \frac{1}{2} p_0 (1 - p_0) = \frac{1}{2} \cdot 0.75 \cdot (1 - 0.75)$.

12. You wish to conduct the following hypothesis test: $H_0 : \mu = 10$ vs. $H_1 : \mu \neq 10$. A simple
random sample with 700 observations has been collected, yielding a (standardized) value of the test
statistic equal to 1.96. The p-value associated to this test is:
(a) 0.025

(b) 0.05

(c) 1.96

(d) None of the preceding values.

13. You wish to check if a die is fair. To do so you conduct the following experiment: you throw the die
three times, and you conclude that the die is not fair if the sum of all the throws is smaller than 5.
(a) Define the null and alternative hypothesis for this test.
(b) Define the critical region of the test in terms of the outcomes of the three throws.
(c) Determine the significance level of the test.


(d) Give an expression for the power of the test in terms of the probabilities $p_i$, $i = 1, \dots, 6$, where
$p_i = P(X = i)$ and X represents the number on the upper face of the die after a single throw.
(e) Find the probability of a Type II error for $p_1 = 1/2$ and all the other probabilities being equal
to 1/10.
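For a test this small, the rejection probability in exercise 13 can be checked by brute-force enumeration of all 216 outcomes of three throws. The sketch below uses exact fractions; the function name `rejection_probability` is hypothetical, and the throws are assumed independent.

```python
from fractions import Fraction
from itertools import product


def rejection_probability(probs):
    """P(sum of three independent throws < 5), where probs[i-1] = P(face i)."""
    total = Fraction(0)
    for throws in product(range(1, 7), repeat=3):
        if sum(throws) < 5:  # outcome lies in the critical region
            prob = Fraction(1)
            for face in throws:
                prob *= probs[face - 1]
            total += prob
    return total


fair = [Fraction(1, 6)] * 6
loaded = [Fraction(1, 2)] + [Fraction(1, 10)] * 5  # the alternative in part (e)

significance_level = rejection_probability(fair)
type_ii_error = 1 - rejection_probability(loaded)
print(f"significance level = {significance_level}")
print(f"Type II error      = {type_ii_error}")
```

Evaluating the function under the fair die gives the significance level, and one minus its value under any other set of probabilities gives the Type II error for that alternative.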
14. A certain emergency automated response procedure to apply when manufacturing problems are
detected in a production line has been shown to have undesirable consequences in certain low-probability cases. You wish to determine if the procedure should be modified due to these consequences, as it in general has worked quite well in the past. You have selected a quality measurement
as your reference for this decision. This quantity is assumed to follow a normal distribution with
variance equal to 1. The reference value was measured on a sample of n = 25 emergencies, and
the sample mean was 15.4. If the mean value for the process (in the absence of emergencies) is 15,
answer the following questions:
(a) Do we have sufficient evidence to conclude that this procedure increases the mean value of the
indicator, at an $\alpha = 0.05$ level?
(b) The sample quasi-standard deviation for the 25 observations was 1.2. Does this imply that
the population variance has increased? Conduct this test for a significance level of 10%.
(c) Would you reach the same conclusions for the preceding tests if they were conducted at a 1%
significance level?
15. As part of a survey on employment length in a certain economic sector, you have collected information from a sample of 86 employees (after they left the sector). These data are available in the
file data ex2.xlsx, measured in years and months of employment.
Use Excel to answer the following questions:
(a) Would it be reasonable to assume that these data follow a normal distribution?
(b) Does this sample data provide enough evidence to accept the claim (at a 5% level) that the
mean time spent in employment in this sector is shorter than five and a half years?
16. As a way of improving its salesmen's skills and thus their sales, a company is considering the
possibility of asking them to take Sales Techniques training courses. As this training is expensive,
9 randomly selected salesmen participate in a trial version of the program. Using the skills and
techniques learnt during the course, these nine employees average 115 units sold in a given time
period.
The company's objective is to have sales larger, on average, than 100 units per employee. Assume
that the variable of interest follows a normal distribution with a standard deviation of 20 units.
(a) At a 5% significance level, assess the effectiveness of the course, computing the critical region
for this test.
(b) Obtain the p-value of the test and give a recommendation regarding the course participation
in terms of the desired significance level.
(c) Assume that $\mu = 120$ and the significance level is 5%. Determine the probability of a type II
error for this test.
(d) Assume you would like to have a type II error probability smaller than 1% for the test, when
$\mu = 120$ and the significance level remains at 5%. Find the smallest value of n that would
achieve that probability.
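The Type II error and sample-size questions in exercise 16 can be explored numerically. The sketch below assumes an upper-tail z test of a mean of 100 with known standard deviation 20, evaluated at an alternative mean of 120; the function names and default arguments are hypothetical and simply mirror the numbers in the exercise.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def type_ii_error(n, mu0=100.0, mu1=120.0, sigma=20.0, alpha=0.05):
    """P(fail to reject H0: mu <= mu0) when the true mean is mu1."""
    sampling_sd = sigma / sqrt(n)
    cutoff = mu0 + STD_NORMAL.inv_cdf(1 - alpha) * sampling_sd  # reject above this
    return NormalDist(mu1, sampling_sd).cdf(cutoff)


def smallest_n(target_beta, **kwargs):
    """Smallest sample size whose Type II error is strictly below target_beta."""
    n = 1
    while type_ii_error(n, **kwargs) >= target_beta:
        n += 1
    return n


print(f"beta with n = 9: {type_ii_error(9):.4f}")
print(f"smallest n with beta < 0.01: {smallest_n(0.01)}")
```

A simple linear search is used because the Type II error decreases monotonically in n for this test, so the first n that meets the target is the smallest one.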
