Week9


DSA 210

Introduction to Data Science

Özgür Asar
ozgur.asar@sabanciuniv.edu
Weldon's dice
• Walter Frank Raphael Weldon (1860-1906) was an
English evolutionary biologist and a founder of
biometry. He was the joint founding editor of
Biometrika, with Francis Galton and Karl Pearson.

• In 1894, he rolled 12 dice 26,306 times and
recorded the number of 5s or 6s (which he
considered to be a success).

• It was observed that 5s or 6s occurred more often than expected, and
Pearson hypothesized that this was probably due to the construction of the
dice. Most inexpensive dice have hollowed-out pips, and since opposite
sides add to 7, the face with 6 pips is lighter than its opposing face, which
has only 1 pip.

Labby's dice

• In 2009, Zacariah Labby (U of Chicago)
repeated Weldon's experiment using a
homemade dice-throwing, pip-counting
machine.
www.youtube.com/watch?v=95EErdouO2w

• The rolling-imaging process took about 20
seconds per roll.
• Each day there were ~150 images to process manually.
• At this rate Weldon's experiment was repeated in a little more than six full
days.
• Recommended reading:
galton.uchicago.edu/about/docs/labby09dice.pdf
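
The "little more than six full days" figure can be checked with a short calculation. This is only a sketch of the arithmetic, assuming the machine ran continuously at the quoted 20 seconds per roll; the variable names are illustrative.

```r
# Figures quoted on the slide
n_rolls <- 26306          # number of times the 12 dice were rolled
seconds_per_roll <- 20    # approximate rolling-imaging time per roll

# Assumption: the machine runs around the clock with no pauses
total_seconds <- n_rolls * seconds_per_roll
total_days <- total_seconds / (60 * 60 * 24)

round(total_days, 2)      # a little more than 6
```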

Labby's dice (cont.)
• Labby did not actually observe the same phenomenon that
Weldon observed (higher frequency of 5s and 6s).
• Automation allowed Labby to collect more data than Weldon
did in 1894: instead of recording "successes" and "failures",
Labby recorded the individual number of pips on each die.

Expected counts
• Labby rolled 12 dice 26,306 times. If each side is equally
likely to come up, how many 1s, 2s, ..., 6s would he
expect to have observed?
a) 1/6
b) 12/6
c) 26,306 / 6
d) 12 x 26,306 / 6

Answer: d) 12 x 26,306 / 6 = 52,612. There are 12 x 26,306 = 315,672
individual die outcomes in total, and under the assumption of fair dice
each of the six faces is expected in one sixth of them.
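
The expected count per face can be reproduced in R. This is a minimal sketch under the fair-die assumption stated in the question; the variable names are illustrative and not part of the slides.

```r
n_dice  <- 12       # dice per roll
n_rolls <- 26306    # number of rolls
n_faces <- 6        # faces on a die

# Total number of individual die outcomes recorded
total_outcomes <- n_dice * n_rolls

# Under a fair die, every face is expected in 1/6 of the outcomes
expected_per_face <- total_outcomes / n_faces
expected <- rep(expected_per_face, n_faces)
names(expected) <- 1:n_faces

expected
```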

Summarizing Labby's results
• The table below shows the observed and expected
counts from Labby's experiment.


Why are the expected counts the same for all outcomes but the
observed counts are different? At first glance, does there appear to be
an inconsistency between the observed and expected counts?
Setting the hypotheses
• Do these data provide convincing evidence of an
inconsistency between the observed and expected counts?


• H0: There is no inconsistency between the observed and the
expected counts. The observed counts follow the same
distribution as the expected counts.

• HA: There is an inconsistency between the observed and the
expected counts. The observed counts do not follow the same
distribution as the expected counts. There is a bias in which
side comes up on the roll of a die.

Evaluating the hypotheses
• To evaluate these hypotheses, we quantify how different
the observed counts are from the expected counts.
• Large deviations from what would be expected based on
sampling variation (chance) alone provide strong
evidence for the alternative hypothesis.
• This is called a goodness of fit test since we're evaluating
how well the observed data fit the expected distribution.

Chi-square statistic
When dealing with counts and investigating how far the
observed counts are from the expected counts, we use a
new test statistic called the chi-square (χ2) statistic.

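χ2 statistic

\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}

where O_i is the observed count in cell i, E_i is the expected count in
cell i, and k is the total number of cells.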

Calculating the chi-square statistic
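
The calculation can be sketched in R as the sum, over cells, of the squared difference between observed and expected counts divided by the expected count. The counts below are HYPOTHETICAL (120 rolls of a single die), chosen only to keep the arithmetic easy to follow; they are not Labby's data. The helper name `chi_square_stat` is also illustrative.

```r
# Chi-square statistic: sum over cells of (observed - expected)^2 / expected
chi_square_stat <- function(observed, expected) {
  stopifnot(length(observed) == length(expected), all(expected > 0))
  sum((observed - expected)^2 / expected)
}

# HYPOTHETICAL counts for faces 1..6 (not taken from the slides)
observed <- c(18, 22, 20, 19, 21, 20)
expected <- rep(sum(observed) / 6, 6)   # 20 per face under a fair die

stat <- chi_square_stat(observed, expected)
stat

# Cross-check with base R's goodness of fit test
builtin <- unname(chisq.test(observed, p = rep(1 / 6, 6))$statistic)
builtin
```

Each cell contributes its own term, so the per-cell values `(observed - expected)^2 / expected` show which outcomes drive the statistic.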


Why square?
• Squaring the difference between the observed and the
expected outcome does two things:
• Any standardized difference that is squared will now be
positive.
• Differences that already looked unusual will become much
larger after being squared.
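
A small sketch of both effects, using HYPOTHETICAL counts (120 rolls, not from the slides). The standardized difference is taken here as (observed - expected) / sqrt(expected), so that the squared values sum to the chi-square statistic.

```r
# HYPOTHETICAL counts for faces 1..6
observed <- c(10, 22, 20, 19, 21, 28)
expected <- rep(20, 6)

raw_diff <- observed - expected
sum(raw_diff)                 # positive and negative differences cancel out

# Standardized differences and their squares
z <- raw_diff / sqrt(expected)
z_squared <- z^2

# Squaring makes every term non-negative and inflates the unusual cells
data.frame(face = 1:6, raw_diff = raw_diff, z = round(z, 2),
           z_squared = round(z_squared, 2))

sum(z_squared)                # the chi-square statistic for these counts
```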

The chi-square distribution
• In order to determine if the χ2 statistic we calculated is
considered unusually high or not we need to first describe
its distribution.

• The chi-square distribution has just one parameter called
degrees of freedom (df), which influences the shape,
center, and spread of the distribution.

The chi-square distribution

Which of the following is false?


• As the df increases,
• the center of the χ2 distribution increases as well
• the variability of the χ2 distribution increases as well
• the shape of the χ2 distribution becomes more skewed (less like a
normal)

Answer: the last statement is false. As the df increases, the center and
the variability of the χ2 distribution both increase, but the shape
becomes less skewed (more like a normal).
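
The effect of df can be tabulated from the standard moments of the chi-square distribution: mean df, standard deviation sqrt(2 df), and skewness sqrt(8 / df). The df values below are arbitrary choices for illustration.

```r
# Arbitrary df values for illustration
df_values <- c(2, 4, 9, 50)

chisq_mean <- df_values               # center grows with df
chisq_sd   <- sqrt(2 * df_values)     # spread grows with df
chisq_skew <- sqrt(8 / df_values)     # skewness shrinks with df

data.frame(df = df_values,
           mean = chisq_mean,
           sd = round(chisq_sd, 2),
           skewness = round(chisq_skew, 2))
```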

Finding areas under the chi-square curve
• p-value = tail area under the chi-square distribution (as usual)

• For this we can use technology, or a chi-square probability table.

> pchisq(q = 17, df = 9, lower.tail = FALSE)
[1] 0.04871598

Back to Labby's dice
• The research question was: Do these data provide convincing evidence
of an inconsistency between the observed and expected counts?

• The hypotheses were:
• H0: There is no inconsistency between the observed and the expected
counts. The observed counts follow the same distribution as the expected
counts.
• HA: There is an inconsistency between the observed and the expected
counts. The observed counts do not follow the same distribution as the
expected counts. There is a bias in which side comes up on the roll of a
die.

• We had calculated a test statistic of χ2 = 24.67.

• All we need is the df and we can calculate the tail area (the p-value) and
make a decision on the hypotheses.

Degrees of freedom for a goodness of fit test
• When conducting a goodness of fit test to evaluate how
well the observed data follow an expected distribution,
the degrees of freedom are calculated as the number of
cells (k) minus 1.

df = k – 1

• For dice outcomes, k = 6, therefore

df = 6 - 1 = 5
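
With df = 5 and the test statistic χ2 = 24.67 reported earlier, the upper tail area can be computed with `pchisq`, the same function used on the earlier slide. This is a minimal sketch; the variable names are illustrative.

```r
k  <- 6            # number of cells (faces of a die)
df <- k - 1        # degrees of freedom for a goodness of fit test

chi_sq <- 24.67    # test statistic reported on the slides

# Upper tail area above the test statistic
p_value <- pchisq(q = chi_sq, df = df, lower.tail = FALSE)
p_value

p_value < 0.001    # TRUE: the tail area is very small
```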

The p-value for a chi-square test is defined as
the tail area above the calculated test statistic.

Conclusion of the hypothesis test
• We calculated a p-value less than 0.001. At 5%
significance level, what is the conclusion of the
hypothesis test?

a) Reject H0, the data provide convincing evidence that the dice are
fair.
b) Reject H0, the data provide convincing evidence that the dice are
biased.
c) Fail to reject H0, the data provide convincing evidence that the dice
are fair.
d) Fail to reject H0, the data provide convincing evidence that the dice
are biased.

Answer: b) Reject H0, the data provide convincing evidence that the dice
are biased. The p-value is less than 0.001, which is below the 5%
significance level.

Recap: p-value for a chi-square test

• The p-value for a chi-square test is defined as the tail
area above the calculated test statistic.

• This is because the test statistic is always positive, and a
higher test statistic means a stronger deviation from the
null hypothesis.

Conditions for the chi-square test
1. Independence: Each case that contributes a count to
the table must be independent of all the other cases in
the table.
2. Sample size: Each particular scenario (i.e. cell) must
have at least 5 expected cases.
3. df > 1: Degrees of freedom must be greater than 1.

Failing to check conditions may unintentionally affect the
test's error rates.
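
The two numerical conditions can be checked mechanically; independence depends on how the data were collected and cannot be verified from the counts. The sketch below applies the checks to the dice example, using a hypothetical helper named `check_chisq_conditions` that is not part of base R.

```r
# Hypothetical helper: checks the sample size and df conditions only.
# Independence must be judged from the study design, not from the counts.
check_chisq_conditions <- function(expected, df) {
  c(sample_size = all(expected >= 5),
    df_gt_1     = df > 1)
}

# Dice example: 12 x 26,306 outcomes spread evenly over 6 faces
expected <- rep(12 * 26306 / 6, 6)
conditions <- check_chisq_conditions(expected, df = 6 - 1)
conditions
```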

Popular kids
• In the dataset popular, students in grades 4-6 were asked whether
good grades, athletic ability, or popularity was most important to
them. A two-way table separating the students by grade and by
choice of most important factor is shown below. Do these data
provide evidence to suggest that goals vary by grade?

• The hypotheses are:
• H0: Grade and goals are independent. Goals do not vary by
grade.
• HA: Grade and goals are dependent. Goals vary by grade.
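• The test statistic is calculated as

\chi^2_{df} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \quad df = (R - 1) \times (C - 1)

where O_i and E_i are the observed and expected counts in cell i, k is the
number of cells, R is the number of rows, and C is the number of columns.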
_______________
Note: we calculate df differently for one-way and two-way tables.

Expected counts in two-way tables
Expected count = (row total) x (column total) / (table total)


a) 176 x 141 / 478
b) 119 x 141 / 478
c) 176 x 247 / 478
d) 176 x 478 / 478

Answer: a) 176 x 141 / 478 ≈ 52. More 5th graders than expected have a
goal of being popular.
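
The expected count for the 5th grade / popular cell can be computed from the margins quoted in the answer options. The extracted text does not say which of 176 and 141 is the row total; the sketch below assumes 176 is the number of 5th graders and 141 the number of students who chose popularity.

```r
# Margins quoted in the answer options (roles are an assumption, see above)
row_total   <- 176   # assumed: 5th graders
col_total   <- 141   # assumed: students choosing popularity
table_total <- 478   # all students

# Expected count under independence of grade and goal
expected_count <- row_total * col_total / table_total
round(expected_count)   # about 52
```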

Calculating the test statistic in two-way tables
Expected counts are shown in blue next to the observed counts.
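
The whole calculation can be sketched with base R's `chisq.test`. The two-way table did not survive in this text, so the cell counts below are an ASSUMPTION: they are recalled from the OpenIntro `popular` example, and they agree with the margins quoted on these slides (119, 176, 141, 247, 478). Treat them as illustrative rather than as the table shown in the lecture.

```r
# ASSUMED cell counts (not present in the extracted slides)
popular <- matrix(
  c(63, 31, 25,
    88, 55, 33,
    96, 55, 32),
  nrow = 3, byrow = TRUE,
  dimnames = list(grade = c("4th", "5th", "6th"),
                  goal  = c("Grades", "Popular", "Sports"))
)

result <- chisq.test(popular)

round(result$expected, 1)   # expected counts: row total x column total / n
result$statistic            # chi-square test statistic
result$parameter            # df = (R - 1) x (C - 1)
result$p.value              # upper tail area
```

Under these assumed counts, `result$expected` holds one expected count per cell, each computed as row total times column total divided by the table total.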

Calculating the p-value
Which of the following is the correct p-value for this hypothesis test?

χ2 = 1.3121, df = 4, p-value > 0.3
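
The tail area for these values can be computed with `pchisq`, as on the earlier slide. This is a minimal sketch using only the statistic and df reported above; the result is roughly 0.86, well above 0.3.

```r
chi_sq <- 1.3121   # test statistic reported on the slide
df     <- 4        # (3 - 1) x (3 - 1) for a 3 x 3 table

# Upper tail area above the test statistic
p_value <- pchisq(q = chi_sq, df = df, lower.tail = FALSE)
p_value

p_value > 0.3      # TRUE
```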

Conclusion
• Do these data provide evidence to suggest that goals
vary by grade?
• H0: Grade and goals are independent.
Goals do not vary by grade.
• HA: Grade and goals are dependent.
Goals vary by grade.


Since the p-value is large, we fail to reject H0.
The data do not provide convincing evidence that grade and goals are
dependent. It doesn't appear that goals vary by grade.
