Notes 10

The document discusses two strains of mice (A and B) and their levels of the cytokine IL10 after treatment. It measured IL10 levels in males of the same age from each strain. The graphs show the distributions of IL10 levels differ between the two strains. The document is primarily interested in comparing aspects of the underlying distributions in the two mouse populations.

Uploaded by

farah

Example

Two strains of mice: A and B.


Measure cytokine IL10 (in males all the same age) after treatment.

[Figure: dot plots of IL10 for strains A and B, shown on the original scale and on a log scale; axis ticks from 500 to 2500.]

Point: We're not interested in these particular mice, but in aspects of the distributions of IL10 values in the two strains.

Populations and samples

We are interested in the distribution of measurements in the underlying (possibly hypothetical) population.

Examples:
• Infinite number of mice from strain A; cytokine response to treatment.
• All T cells in a person; respond or not to an antigen.
• All possible samples from the Baltimore water supply; concentration of Cryptosporidium.
• All possible samples of a particular type of cancer tissue; expression of a certain gene.

We can't see the entire population (whether it is real or hypothetical), but we can see a random sample of the population (perhaps a set of independent, replicated measurements).

Parameters

The object of our interest is the population distribution or, in particular, certain numerical attributes of the population distribution (called parameters).

Examples:
• mean
• median
• SD
• proportion = 1
• proportion > 40
• geometric mean
• 95th percentile

[Figure: population distribution with mean µ and SD σ marked; x-axis from 0 to 60.]

Parameters are usually assigned Greek letters (like θ, µ, and σ).
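
As a minimal sketch of what such parameters are, the Python below computes several of them for a small, made-up finite population. The five values are an assumption for illustration, not data from these notes; in practice the whole population is not observed, so these quantities are unknown.

```python
from statistics import mean, median, pstdev, geometric_mean

# Hypothetical finite population (illustrative values only).
population = [5, 10, 20, 40, 50]

mu = mean(population)                  # population mean
med = median(population)               # population median
sigma = pstdev(population)             # population SD (divides by N, not N - 1)
prop_gt_40 = sum(x > 40 for x in population) / len(population)  # proportion > 40
geo_mean = geometric_mean(population)  # geometric mean

print(mu, med, round(sigma, 2), prop_gt_40, round(geo_mean, 2))
```

Each printed number is a fixed attribute of this population, which is what distinguishes a parameter from a statistic computed on a sample.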



Sample data

We make n independent measurements (or draw a random sample of size n).

This gives X₁, X₂, ..., Xₙ independent and identically distributed (iid), following the population distribution.

Statistic: A numerical summary (function) of the X's (that is, of the data). For example, the sample mean, sample SD, etc.

Estimator (estimate): A statistic, viewed as estimating some population parameter.

We write:
θ̂, an estimator of θ
X̄ = µ̂, an estimator of µ
p̂, an estimator of p
S = σ̂, an estimator of σ
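
To make the notation concrete, here is a short Python sketch that computes X̄, S, and p̂ from one sample. The data values and the event "X > 4" used for p̂ are assumptions chosen for illustration.

```python
from statistics import mean, stdev

# Hypothetical sample of n = 8 measurements (illustrative values only).
x = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(x)

x_bar = mean(x)                      # sample mean, an estimate of mu
s = stdev(x)                         # sample SD (n - 1 denominator), an estimate of sigma
p_hat = sum(xi > 4 for xi in x) / n  # sample proportion with X > 4, an estimate of p

print(x_bar, round(s, 3), p_hat)
```

A different sample would give different values of all three, which is why the estimators themselves are treated as random.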

Parameters, estimators, estimates

µ
• The population mean
• A parameter
• A fixed quantity
• Unknown, but what we want to know

X̄
• The sample mean
• An estimator of µ
• A function of the data (the X's)
• A random quantity

x̄
• The observed sample mean
• An estimate of µ
• A particular realization of the estimator, X̄
• A fixed quantity, but the result of a random process.

Estimators are random variables

Estimators have distributions, means, SDs, etc.


Population distribution → X₁, X₂, ..., X₁₀ → X̄

[Figure: population distribution with mean µ and SD σ marked; x-axis from 0 to 60.]

Five samples of size 10 and their sample means:

3.8 8.0 9.9 13.1 15.5 16.6 22.3 25.4 31.0 40.0 → 18.6
6.0 10.6 13.8 17.1 20.2 22.5 22.9 28.6 33.1 36.7 → 21.2
8.1 9.0 9.5 12.2 13.3 20.5 20.8 30.3 31.6 34.6 → 19.0
4.2 10.3 11.0 13.9 16.5 18.2 18.9 20.4 28.4 34.4 → 17.6
8.4 15.2 17.1 17.2 21.2 23.0 26.7 28.2 32.8 38.0 → 22.8
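
The sample means listed for the five samples can be recomputed directly. This Python sketch takes the five samples exactly as given and averages each one; the results agree with the listed means up to rounding.

```python
from statistics import mean

# The five samples of size 10, copied from the notes.
samples = [
    [3.8, 8.0, 9.9, 13.1, 15.5, 16.6, 22.3, 25.4, 31.0, 40.0],
    [6.0, 10.6, 13.8, 17.1, 20.2, 22.5, 22.9, 28.6, 33.1, 36.7],
    [8.1, 9.0, 9.5, 12.2, 13.3, 20.5, 20.8, 30.3, 31.6, 34.6],
    [4.2, 10.3, 11.0, 13.9, 16.5, 18.2, 18.9, 20.4, 28.4, 34.4],
    [8.4, 15.2, 17.1, 17.2, 21.2, 23.0, 26.7, 28.2, 32.8, 38.0],
]

# Each sample gives a different realization of the estimator X-bar.
sample_means = [mean(s) for s in samples]
for m in sample_means:
    print(f"{m:.2f}")
```

Same population, same sample size, five different values: this is the sense in which X̄ is a random variable.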
Sampling distribution

[Figure: the population distribution (mean µ, SD σ; x-axis from 0 to 60) alongside the distribution of X̄ for n = 5, 10, 25, and 100 (x-axis from 10 to 40).]

Sampling distribution depends on:
• The type of statistic
• The population distribution
• The sample size
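
A sampling distribution can be approximated by simulation. The sketch below is only an illustration: the gamma population (shape 4, scale 5, so µ = 20 and σ = 10), the seed, and the number of replicates are all assumptions, not the population behind the figure.

```python
import random
from statistics import mean, stdev

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical right-skewed population: gamma with shape 4 and scale 5,
# so mu = 4 * 5 = 20 and sigma = sqrt(4) * 5 = 10.
SHAPE, SCALE = 4.0, 5.0
MU, SIGMA = SHAPE * SCALE, (SHAPE ** 0.5) * SCALE

def sample_mean(n):
    """Draw one random sample of size n and return its mean."""
    return sum(rng.gammavariate(SHAPE, SCALE) for _ in range(n)) / n

REPS = 2000  # number of simulated samples per sample size
sd_of_means = {}
for n in (5, 10, 25, 100):
    means = [sample_mean(n) for _ in range(REPS)]
    sd_of_means[n] = stdev(means)
    print(f"n={n:3d}  mean of X-bar={mean(means):6.2f}  SD of X-bar={sd_of_means[n]:5.2f}")
```

Swapping `sample_mean` for another statistic (say, the sample median), changing the population, or changing n alters the resulting distribution, which mirrors the three items in the list above.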

Bias, SE, RMSE

[Figure: population distribution (mean µ, SD σ; x-axis from 0 to 60) and the distribution of the sample SD for n = 10 (x-axis from 5 to 15).]

Consider θ̂, an estimator of the parameter θ.

Bias: E(θ̂ − θ) = E(θ̂) − θ

Standard error (SE): SE(θ̂) = SD(θ̂).


RMS error (RMSE): $\sqrt{E\{(\hat\theta - \theta)^2\}} = \sqrt{(\text{bias})^2 + (\text{SE})^2}$.
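
The RMSE identity follows by adding and subtracting E(θ̂) inside the square; the cross term drops out because E{θ̂ − E(θ̂)} = 0. A short derivation, written in LaTeX:

```latex
\begin{align*}
E\{(\hat\theta - \theta)^2\}
  &= E\{(\hat\theta - E\hat\theta + E\hat\theta - \theta)^2\} \\
  &= E\{(\hat\theta - E\hat\theta)^2\}
     + 2\,(E\hat\theta - \theta)\,E\{\hat\theta - E\hat\theta\}
     + (E\hat\theta - \theta)^2 \\
  &= \{\mathrm{SE}(\hat\theta)\}^2 + (\mathrm{bias})^2 .
\end{align*}
```

Taking square roots of both sides gives the RMSE expression.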

Example: heights of students

[Figure: histogram of student heights; x-axis: Height (inches), 60 to 75; y-axis: Frequency, 0 to 14. mean = 67.1, SD = 3.9.]

Example: heights of students


Population distribution: mean = 67.1, SD = 3.9
Dist'n of sample ave (n=5): mean = 67.1, SD = 1.7
Dist'n of sample ave (n=10): mean = 67.1, SD = 1.2
Dist'n of sample ave (n=25): mean = 67.1, SD = 0.68

[Figure: four histograms, one per row above; x-axis: Height (inches), 60 to 75.]
The sample mean

Assume X₁, X₂, ..., Xₙ are iid with mean µ and SD σ.

[Figure: population distribution with mean µ and SD σ marked; x-axis from 0 to 60.]

Mean of X̄ = E(X̄) = µ.

Bias = E(X̄) − µ = 0.

SE of X̄ = SD(X̄) = σ/√n.

RMS error of X̄ = $\sqrt{(\text{bias})^2 + (\text{SE})^2} = \sigma/\sqrt{n}$.
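
The SE formula relies on independence: variances of independent terms add. A brief derivation in LaTeX, using only the iid assumption stated above:

```latex
\begin{align*}
\mathrm{Var}(\bar X)
  &= \mathrm{Var}\Bigl(\frac{1}{n}\sum_{i=1}^{n} X_i\Bigr)
   = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(X_i)
   = \frac{n\sigma^2}{n^2}
   = \frac{\sigma^2}{n}, \\
\mathrm{SD}(\bar X) &= \frac{\sigma}{\sqrt{n}} .
\end{align*}
```

Because the bias is 0, the RMS error of X̄ equals its SE.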


If the population is normally distributed

If X₁, X₂, ..., Xₙ are iid normal(µ, σ), then

X̄ ∼ normal(µ, σ/√n).

[Figure: population distribution (mean µ, SD σ) and the distribution of X̄ (SD σ/√n).]
Example
Suppose X₁, X₂, ..., X₁₀ are iid normal(mean=10, SD=4).
Then X̄ ∼ normal(mean=10, SD ≈ 1.26); let Z = (X̄ − 10)/1.26.

Pr(X̄ > 12)?
Pr(X̄ > 12) ≈ Pr(Z > 1.58) = Pr(Z < −1.58) ≈ 5.7%

Pr(9.5 < X̄ < 10.5)?
Pr(9.5 < X̄ < 10.5) ≈ Pr(−0.40 < Z < 0.40) ≈ 31%

Pr(|X̄ − 10| > 1)?
Pr(|X̄ − 10| > 1) ≈ Pr(|Z| > 0.80) ≈ 43%
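
These three probabilities can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module, with the unrounded SE 4/√10 rather than 1.26, so the results may differ slightly from hand calculations with a rounded Z.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 10, 4, 10
xbar = NormalDist(mu, sigma / sqrt(n))  # sampling distribution of X-bar

p_gt_12 = 1 - xbar.cdf(12)                    # Pr(X-bar > 12)
p_between = xbar.cdf(10.5) - xbar.cdf(9.5)    # Pr(9.5 < X-bar < 10.5)
p_outside = 1 - (xbar.cdf(11) - xbar.cdf(9))  # Pr(|X-bar - 10| > 1)

print(f"{p_gt_12:.3f} {p_between:.3f} {p_outside:.3f}")
```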


Central limit theorem

If X₁, X₂, ..., Xₙ are iid with mean µ and SD σ, and the sample size (n) is large, then X̄ is approximately normal(µ, σ/√n).

How large is large?

It depends on the population distribution.
(But, generally, not too large.)
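
One rough way to see the approach to normality is to track the skewness of X̄ as n grows, since a normal distribution has skewness 0. The sketch below is a simulation under assumed settings: an exponential population with mean 1, a fixed seed, and 4000 replicates per sample size are all illustrative choices.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable

def skewness(values):
    """Sample skewness: the average cubed z-score."""
    m = sum(values) / len(values)
    sd = (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

REPS = 4000  # simulated samples per sample size
skew_by_n = {}
for n in (5, 25, 100):
    # Hypothetical skewed population: exponential with mean 1.
    means = [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(REPS)]
    skew_by_n[n] = skewness(means)
    print(f"n={n:3d}  skewness of X-bar = {skew_by_n[n]:.2f}")
```

A more strongly skewed population would need a larger n before the skewness of X̄ becomes negligible, which is the sense of "it depends on the population distribution".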

Example 1

[Figure: population distribution (mean µ, SD σ; x-axis from 0 to 60) and the distribution of X̄ for n = 5, 10, 25, and 100 (x-axis from 10 to 40).]

Example 2

[Figure: population distribution (x-axis from 0 to 200, µ marked) and the distribution of X̄ for n = 10, 25, 100, and 500, all drawn on a common x-axis.]
Example 2 (rescaled)

[Figure: the same population distribution (x-axis from 0 to 200, µ marked) and the distribution of X̄ for n = 10, 25, 100, and 500, each panel drawn on its own x-axis scale.]

Example 3
{Xᵢ} iid
Pr(Xᵢ = 0) = 90%
Pr(Xᵢ = 1) = 10%

E(Xᵢ) = 0.1; SD(Xᵢ) = 0.3

ΣXᵢ ∼ binomial(n, p)
X̄ = proportion of 1's

[Figure: population distribution (bars at 0 and 1) and the distribution of X̄ for n = 10, 25, 100, and 500.]
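
For this 0/1 population the numbers can be worked out exactly. The Python sketch below computes E(Xᵢ) and SD(Xᵢ) from p = 0.1 and builds the exact distribution of X̄ from the binomial distribution of the sum; the helper name `xbar_pmf` is introduced here for illustration.

```python
from math import comb, sqrt

p = 0.1  # Pr(X_i = 1), as in the example

mean_x = p                # E(X_i)
sd_x = sqrt(p * (1 - p))  # SD(X_i) = sqrt(0.1 * 0.9)

def xbar_pmf(n):
    """Exact distribution of X-bar = k/n, using sum(X_i) ~ binomial(n, p)."""
    return {k / n: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

pmf10 = xbar_pmf(10)
print(mean_x, round(sd_x, 3))
print(round(pmf10[0.0], 4), round(pmf10[0.1], 4))
```

With n = 10 the distribution of X̄ is still far from symmetric, since about a third of all samples contain no 1's at all.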

The sample SD

Why use (n – 1) in the sample SD?


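$s = \sqrt{\dfrac{\sum_i (x_i - \bar{x})^2}{n-1}}$

If {Xᵢ} are iid with mean µ and SD σ, then

$E(s^2) = \sigma^2$
but $E\{\frac{n-1}{n} s^2\} = \frac{n-1}{n} \sigma^2 < \sigma^2$

In other words:
$\text{Bias}(s^2) = 0$
but $\text{Bias}(\frac{n-1}{n} s^2) = \frac{n-1}{n} \sigma^2 - \sigma^2 = -\frac{1}{n} \sigma^2$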
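
The unbiasedness of s² can be verified exactly on a small case by enumerating every possible sample. In the sketch below the population (the six faces of a fair die) and the sample size n = 3 are assumptions for illustration; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]  # hypothetical population: a fair die
n = 3                            # sample size

N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample)

# All equally likely iid samples of size n (sampling with replacement).
samples = list(product(population, repeat=n))
mean_s2 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
mean_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print(sigma2, mean_s2, mean_div_n, mean_div_n - sigma2)
```

Dividing by n − 1 averages out to σ² exactly, while dividing by n falls short by σ²/n, matching the bias formula above.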

The distribution of the sample SD

If X₁, X₂, ..., Xₙ are iid normal(µ, σ),

then the sample SD, s, satisfies $(n-1)\,s^2/\sigma^2 \sim \chi^2_{n-1}$.

When the Xᵢ are not normally distributed, this is not true.
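
A quick simulation can make the χ² result plausible. A χ² distribution with n − 1 degrees of freedom has mean n − 1 and variance 2(n − 1), so the scaled sample variances should show roughly those moments. The settings below (µ = 10, σ = 4, n = 10, the seed, and 10000 replicates) are assumptions for illustration.

```python
import random
from statistics import mean, variance

rng = random.Random(3)  # fixed seed so the run is repeatable

MU, SIGMA, n = 10.0, 4.0, 10  # hypothetical normal population and sample size
REPS = 10000                  # number of simulated samples

scaled = []
for _ in range(REPS):
    x = [rng.gauss(MU, SIGMA) for _ in range(n)]
    s2 = variance(x)  # sample variance, n - 1 denominator
    scaled.append((n - 1) * s2 / SIGMA**2)

# Compare with the chi-square(n - 1) moments: mean 9, variance 18.
print(round(mean(scaled), 2), round(variance(scaled), 2))
```

Matching two moments is only a sanity check, not a proof of the distributional claim.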

[Figure: χ² distributions for df = 9, 19, and 29; x-axis from 0 to 50.]
Distribution of sample SD
(based on normal data)

[Figure: distribution of the sample SD for n = 3, 5, 10, and 25; x-axis from 0 to 30.]

A non-normal example

[Figure: a non-normal population distribution (mean µ, SD σ; x-axis from 0 to 60) and the distribution of the sample SD for n = 3, 5, 10, and 25 (x-axis from 0 to 30).]
