
LIST OF FORMULAS FOR STK1100 AND STK1110

(Version of 11. November 2015)

1. Probability
Let A, B, A1 , A2 , . . . , B1 , B2 , . . . be events, that is, subsets of a sample space Ω.

a) Axioms:
A probability function P is a function from subsets of the sample space Ω to real
numbers, satisfying

      P(Ω) = 1

      P(A) ≥ 0

      P(A1 ∪ A2) = P(A1) + P(A2)   if A1 ∩ A2 = ∅

      P(∪_{i=1}^∞ Ai) = ∑_{i=1}^∞ P(Ai)   if Ai ∩ Aj = ∅ for i ≠ j

b) P(A′) = 1 − P(A)

c) P (∅) = 0

d) A ⊂ B ⇒ P (A) ≤ P (B)

e) The addition law of probability/ the sum rule:

P (A ∪ B) = P (A) + P (B) − P (A ∩ B)

f) Conditional probability:

      P(A|B) = P(A ∩ B)/P(B)   if P(B) > 0

g) Total probability:

      P(A) = ∑_{i=1}^n P(A|Bi) P(Bi)   if ∪_{i=1}^n Bi = Ω and Bi ∩ Bj = ∅ for i ≠ j

h) Bayes’ Rule:

      P(Bj|A) = P(A|Bj) P(Bj) / ∑_{i=1}^n P(A|Bi) P(Bi)   under the same conditions as in g)

i) A and B are (statistically) independent events if P (A ∩ B) = P (A)P (B)

j) A1 , . . . , An are (statistically) independent events if

P (Ai1 ∩ · · · ∩ Aim ) = P (Ai1 )P (Ai2 ) · · · P (Aim )

for any subset of indexes i1 , i2 , . . . , im

k) The product rule:

P (A1 ∩ · · · ∩ An )
= P (A1 )P (A2 |A1 )P (A3 |A1 ∩ A2 ) · · · P (An |A1 ∩ A2 ∩ · · · ∩ An−1 )
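
As a quick numerical illustration of g) and h), here is a minimal Python sketch; the partition B1, B2, B3 and all probability values are made-up assumptions.

    # Illustrative partition B1, B2, B3 of the sample space and an event A
    P_B = [0.5, 0.3, 0.2]              # P(B1), P(B2), P(B3); they sum to 1
    P_A_given_B = [0.10, 0.40, 0.80]   # P(A|B1), P(A|B2), P(A|B3)

    # Total probability (item g): P(A) = sum_i P(A|Bi) P(Bi)
    P_A = sum(pa * pb for pa, pb in zip(P_A_given_B, P_B))

    # Bayes' rule (item h): P(Bj|A) = P(A|Bj) P(Bj) / P(A)
    P_B_given_A = [pa * pb / P_A for pa, pb in zip(P_A_given_B, P_B)]

    print(P_A)            # 0.33
    print(P_B_given_A)    # posterior probabilities, they sum to 1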

2. Combinatorics
a) Two operations that can be done in respectively n and m different ways can be
combined in n · m ways.

b) The number of ordered subsets of r elements drawn with replacement from a set
   of n elements is n^r

c) The number of ordered subsets of r elements drawn without replacement from a
   set of n elements is n(n − 1) · · · (n − r + 1)

d) Number of permutations of n elements is n! = 1 · 2 · 3 · · · (n − 1) · n

e) The number of unordered subsets of r elements drawn from a set of n elements
   is the binomial coefficient

      C(n, r) = n(n − 1) · · · (n − r + 1) / r! = n! / [r! (n − r)!]

f) Number of ways a set of n elements can be divided into r subsets with ni elements
   in the ith subset is

      n! / (n1! n2! · · · nr!)
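
The counts in b) to f) can be checked with Python's standard library (math.comb, math.perm and math.factorial are available from Python 3.8); the numbers used below are arbitrary examples.

    import math

    n, r = 10, 3
    print(n ** r)             # ordered, with replacement: 1000
    print(math.perm(n, r))    # ordered, without replacement: 10*9*8 = 720
    print(math.factorial(n))  # permutations of n elements: 3628800
    print(math.comb(n, r))    # unordered subsets: 120

    # Multinomial coefficient for dividing n elements into groups of sizes n1, n2, n3
    n1, n2, n3 = 5, 3, 2
    multinom = math.factorial(n) // (math.factorial(n1) * math.factorial(n2) * math.factorial(n3))
    print(multinom)           # 2520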

3. Probability distributions
a) For a random variable X (discrete or continuous), F (x) = P (X ≤ x) is the
cumulative distribution function (cdf).

b) For a discrete random variable X which can take the values x1, x2, x3, . . . we have

      p(xj) = P(X = xj)

      F(x) = ∑_{xj ≤ x} p(xj)

   p(xj) is a point probability if

      p(xj) ≥ 0 for all j

      ∑_j p(xj) = 1

c) For a continuous random variable X we have

      P(a < X < b) = ∫_a^b f(x) dx

      F(x) = ∫_{−∞}^x f(u) du

      f(x) = F′(x)

   f(x) is a probability density function if

      f(x) ≥ 0

      ∫_{−∞}^∞ f(x) dx = 1

d) For two random variables X and Y (discrete or continuous) the joint cumulative
   distribution function is F(x, y) = P(X ≤ x, Y ≤ y)

e) For discrete random variables X and Y which can take the values x1, x2, . . . and
   y1, y2, . . . respectively, we have

      p(xi, yj) = P(X = xi, Y = yj)

      F(x, y) = ∑_{xi ≤ x} ∑_{yj ≤ y} p(xi, yj)

   p(xi, yj) is a joint point probability if it fulfills the same conditions as in b)

f) For continuous random variables X and Y we have

      P((X, Y) ∈ A) = ∫∫_A f(u, v) dv du

      F(x, y) = ∫_{−∞}^x ∫_{−∞}^y f(u, v) dv du

      f(x, y) = ∂²F(x, y)/∂x∂y

   f(x, y) is a joint probability density function if it fulfills the same conditions as in c)

g) Marginal point probabilities:

      pX(xi) = ∑_j p(xi, yj)   (for X)

      pY(yj) = ∑_i p(xi, yj)   (for Y)

h) Marginal probability densities:

      fX(x) = ∫_{−∞}^∞ f(x, y) dy   (for X)

      fY(y) = ∫_{−∞}^∞ f(x, y) dx   (for Y)

i) Independence:
The random variables X and Y are independent if

p(xi , yj ) = pX (xi )pY (yj ) (discrete)


f (x, y) = fX (x)fY (y) (continuous)

j) Conditional point probabilities:

      pX|Y(xi|yj) = p(xi, yj)/pY(yj)   (for X given Y = yj)

      pY|X(yj|xi) = p(xi, yj)/pX(xi)   (for Y given X = xi)

   Assuming pY(yj) > 0 and pX(xi) > 0, respectively. Conditional point probabilities
   can be treated as regular point probabilities.

k) Conditional probability densities:

      fX|Y(x|y) = f(x, y)/fY(y)   (for X given Y = y)

      fY|X(y|x) = f(x, y)/fX(x)   (for Y given X = x)

   Assuming fY(y) > 0 and fX(x) > 0, respectively. Conditional probability densities
   can be treated as regular probability densities.
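
A minimal Python sketch of e), g) and j), using an invented 2 × 2 joint point probability table; all values are illustrative assumptions.

    # Joint point probabilities p(xi, yj) for x in {0, 1} and y in {0, 1}
    p = {(0, 0): 0.10, (0, 1): 0.30,
         (1, 0): 0.20, (1, 1): 0.40}

    # Marginals (item g)
    pX = {x: sum(v for (xi, yj), v in p.items() if xi == x) for x in (0, 1)}
    pY = {y: sum(v for (xi, yj), v in p.items() if yj == y) for y in (0, 1)}

    # Conditional point probabilities of X given Y = 1 (item j)
    pX_given_Y1 = {x: p[(x, 1)] / pY[1] for x in (0, 1)}

    print(pX)           # {0: 0.4, 1: 0.6}
    print(pY)           # {0: 0.3, 1: 0.7}
    print(pX_given_Y1)  # {0: 0.3/0.7, 1: 0.4/0.7}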

4. Expectation
a) The expected value of a random variable X is defined as

      E(X) = ∑_j xj p(xj)   (discrete)

      E(X) = ∫_{−∞}^∞ x f(x) dx   (continuous)

b) For a real function g(X) of a random variable X, the expected value is

      E[g(X)] = ∑_j g(xj) p(xj)   (discrete)

      E[g(X)] = ∫_{−∞}^∞ g(x) f(x) dx   (continuous)

c) E(a + bX) = a + bE(X)

d) For a real function g(X, Y) of two random variables X and Y, the expected value
   is

      E[g(X, Y)] = ∑_i ∑_j g(xi, yj) p(xi, yj)   (discrete)

      E[g(X, Y)] = ∫_{−∞}^∞ ∫_{−∞}^∞ g(x, y) f(x, y) dy dx   (continuous)

     
e) If X and Y are independent, E[g(X)h(Y)] = E[g(X)] · E[h(Y)]

f) If X and Y are independent, E(XY) = E(X) · E(Y)


g) E(a + ∑_{i=1}^n bi Xi) = a + ∑_{i=1}^n bi E(Xi)

h) Conditional expectation:

      E(Y|X = xi) = ∑_j yj pY|X(yj|xi)   (discrete)

      E(Y|X = x) = ∫_{−∞}^∞ y fY|X(y|x) dy   (continuous)
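
Items a), b) and c) for a discrete variable, as a minimal Python sketch; the values and point probabilities are made up for illustration.

    # A discrete random variable X with illustrative values and point probabilities
    xs = [0, 1, 2, 3]
    ps = [0.1, 0.2, 0.3, 0.4]

    E_X = sum(x * p for x, p in zip(xs, ps))          # item a)
    E_gX = sum(x ** 2 * p for x, p in zip(xs, ps))    # item b) with g(x) = x^2
    E_lin = 5 + 2 * E_X                               # item c): E(5 + 2X) = 5 + 2 E(X)

    print(E_X)    # 2.0
    print(E_gX)   # 5.0
    print(E_lin)  # 9.0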

5. Variance and standard deviation


a) The variance and standard deviation of a random variable X are defined as

      V(X) = E[(X − µ)²]   where µ = E(X)

      sd(X) = √V(X)

b) V(X) = E(X²) − [E(X)]²

c) V(a + bX) = b2 V(X)

d) If X1, . . . , Xn are independent we have

      V(a + ∑_{i=1}^n bi Xi) = ∑_{i=1}^n bi² V(Xi)

e)
      V(a + ∑_{i=1}^n bi Xi) = ∑_{i=1}^n bi² V(Xi) + ∑_{i=1}^n ∑_{j≠i} bi bj Cov(Xi, Xj)

f) Chebyshev’s inequality:
Let X be a random variable with µ = E(X) and σ 2 = V(X).
For all t > 0 we have
      P(|X − µ| > t) ≤ σ²/t²
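
A minimal Python check of a), b) and Chebyshev's inequality f), using a small made-up distribution; all numbers are illustrative.

    xs = [0, 1, 2, 3]
    ps = [0.1, 0.2, 0.3, 0.4]

    mu = sum(x * p for x, p in zip(xs, ps))
    E_X2 = sum(x ** 2 * p for x, p in zip(xs, ps))
    var = E_X2 - mu ** 2                    # item b): V(X) = E(X^2) - [E(X)]^2
    sd = var ** 0.5                         # item a): sd(X) = sqrt(V(X))

    # Chebyshev (item f): P(|X - mu| > t) <= var / t^2, here with t = 1.5
    t = 1.5
    prob_tail = sum(p for x, p in zip(xs, ps) if abs(x - mu) > t)
    print(var, sd)                  # 1.0, 1.0
    print(prob_tail, var / t ** 2)  # 0.1 <= 0.444...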

6. Covariance and correlation


a) Let X and Y be random variables with µX = E(X), σX² = V(X), µY = E(Y)
   and σY² = V(Y). The covariance and correlation of X and Y are then defined as

      Cov(X, Y) = E[(X − µX)(Y − µY)]

      ρ = Corr(X, Y) = Cov(X, Y) / (σX σY)

b) Cov(X, X) = V(X)

c) Cov(X, Y ) = E(XY ) − E(X)E(Y )

d) X, Y independent ⇒ Cov(X, Y ) = 0

e)
      Cov(a + ∑_{i=1}^n bi Xi, c + ∑_{j=1}^m dj Yj) = ∑_{i=1}^n ∑_{j=1}^m bi dj Cov(Xi, Yj)

f) −1 ≤ Corr(X, Y) ≤ 1, and Corr(X, Y) = ±1 if and only if there exist two
   numbers a, b such that Y = a + bX (except, possibly, on sets of zero probability)
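
Covariance and correlation as in a) and c), computed in Python for a small invented joint point probability table; the values are illustrative.

    # Joint point probabilities p(x, y), illustrative table
    p = {(0, 0): 0.10, (0, 1): 0.30,
         (1, 0): 0.20, (1, 1): 0.40}

    E_X = sum(x * v for (x, y), v in p.items())
    E_Y = sum(y * v for (x, y), v in p.items())
    E_XY = sum(x * y * v for (x, y), v in p.items())

    cov = E_XY - E_X * E_Y                          # item c)
    V_X = sum(x ** 2 * v for (x, y), v in p.items()) - E_X ** 2
    V_Y = sum(y ** 2 * v for (x, y), v in p.items()) - E_Y ** 2
    corr = cov / (V_X ** 0.5 * V_Y ** 0.5)          # item a)

    print(cov, corr)   # -0.02 and roughly -0.089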

7. Moment generating functions


a) For a random variable X (discrete or continuous) the moment generating function
is MX (t) = E(etX )

b) If the moment generating function MX (t) exists for t in an open interval contain-
ing 0, then it uniquely determines the distribution of X.

c) If the moment generating function MX(t) exists for t in an open interval containing 0,
   then all moments of X exist, and we can find the rth moment by
   E(X^r) = MX^(r)(0), the rth derivative of MX evaluated at 0

d) Ma+bX (t) = eat MX (bt)

e) If X and Y are independent: MX+Y (t) = MX (t)MY (t)
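
Point c) can be checked symbolically. The sketch below assumes the sympy package is available and uses the Poisson moment generating function from section 8 e).

    import sympy as sp

    t, lam = sp.symbols('t lam', positive=True)
    M = sp.exp(lam * (sp.exp(t) - 1))       # MGF of a Poisson(lam) variable

    m1 = sp.diff(M, t, 1).subs(t, 0)        # E(X)   = M'(0)
    m2 = sp.diff(M, t, 2).subs(t, 0)        # E(X^2) = M''(0)

    print(sp.simplify(m1))           # lam
    print(sp.simplify(m2 - m1**2))   # variance = lam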

8. Some discrete probability distributions


a) Binomial distribution:
   Point probability: P(X = k) = C(n, k) p^k (1 − p)^{n−k},  k = 0, 1, . . . , n
   Moment generating function: MX(t) = (1 − p + pe^t)^n
   Expectation: E(X) = np
   Variance: V(X) = np(1 − p)

   Approximation 1: Z = (X − np)/√(np(1 − p)) is approximately normally distributed
   when np and n(1 − p) both are sufficiently big (at least 10)

   Approximation 2: X is approximately Poisson distributed with parameter λ = np
   when n is big and p is small

Sum rule: X ∼ binomial (n, p), Y ∼ binomial (m, p)
and X, Y independent ⇒ X + Y ∼ binomial (n + m, p)

b) Geometric distribution:
   Point probability: P(X = k) = (1 − p)^{k−1} p,  k = 1, 2, . . .
   Moment generating function: MX(t) = pe^t/[1 − (1 − p)e^t]
   Expectation: E(X) = 1/p
   Variance: V(X) = (1 − p)/p²

   Sum rule: If X is geometrically distributed with parameter p, then X − 1 is
   negative binomial (1, p). If X and Y are independent and geometrically distributed
   with the same p, then X + Y − 2 is negative binomial (2, p)

c) Negative binomial distribution:
   Point probability: P(X = k) = C(k + r − 1, r − 1) p^r (1 − p)^k,  k = 0, 1, 2, . . .
   Moment generating function: MX(t) = {p/[1 − (1 − p)e^t]}^r
   Expectation: E(X) = r(1 − p)/p
   Variance: V(X) = r(1 − p)/p²

   Sum rule: X ∼ negative binomial (r1, p), Y ∼ negative binomial (r2, p)
   and X, Y independent ⇒ X + Y ∼ negative binomial (r1 + r2, p)

d) Hypergeometric distribution:
   Point probability: P(X = k) = C(M, k) C(N − M, n − k) / C(N, n)
   Expectation: E(X) = n · M/N
   Variance: V(X) = n · (M/N)(1 − M/N) · (N − n)/(N − 1)

   Approximation: X is approximately binomial (n, M/N) when n is much smaller than N

e) Poisson distribution:
   Point probability: P(X = k) = (λ^k/k!) e^{−λ},  k = 0, 1, . . .
   Moment generating function: MX(t) = e^{λ(e^t − 1)}
   Expectation: E(X) = λ
   Variance: V(X) = λ

   Approximation: Z = (X − λ)/√λ is approximately normally distributed
   when λ is sufficiently big (at least 10)

   Sum rule: X ∼ Poisson(λ1), Y ∼ Poisson(λ2)
   and X, Y independent ⇒ X + Y ∼ Poisson(λ1 + λ2)

f) Multinomial distribution:
   Point probability: P(N1 = n1, . . . , Nr = nr) = [n!/(n1! · · · nr!)] p1^{n1} · · · pr^{nr}

   Here ∑_{i=1}^r pi = 1 and ∑_{i=1}^r ni = n

   Marginal distribution: Ni ∼ binomial(n, pi)
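
The point probabilities and the approximations in a) and e) can be compared numerically. The sketch assumes scipy is installed (scipy.stats.binom, poisson and norm); the parameter values are arbitrary examples.

    from scipy.stats import binom, poisson, norm

    n, p = 1000, 0.004
    k = 5
    print(binom.pmf(k, n, p))       # exact binomial point probability
    print(poisson.pmf(k, n * p))    # Poisson approximation (n big, p small)

    n, p = 100, 0.4                 # here np = 40 and n(1-p) = 60, both at least 10
    k = 45
    z = (k - n * p) / (n * p * (1 - p)) ** 0.5
    print(binom.cdf(k, n, p))       # exact P(X <= 45)
    print(norm.cdf(z))              # normal approximation of the same probability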

9. Some continuous probability distributions


a) Normal distribution:
   Density: f(x) = [1/(√(2π) σ)] e^{−(x−µ)²/(2σ²)},  −∞ < x < ∞
   Moment generating function: MX(t) = e^{µt + σ²t²/2}
   Expectation: E(X) = µ
   Variance: V(X) = σ²
   Transformation: X ∼ N(µ, σ²) ⇒ a + bX ∼ N(a + bµ, b²σ²)
                   X ∼ N(µ, σ²) ⇒ Z = (X − µ)/σ ∼ N(0, 1)

   Sum rule: X ∼ N(µX, σX²), Y ∼ N(µY, σY²), X, Y independent
   ⇒ X + Y ∼ N(µX + µY, σX² + σY²)

b) Exponential distribution:
Density: f (x) = λe−λx x>0
Moment generating function: MX (t) = λ/(λ − t) for t < λ
Expectation: E(X) = 1/λ
Variance: V(X) = 1/λ2
Sum rule: X ∼ exp(λ), Y ∼ exp(λ), X and Y independent
⇒ X + Y ∼ gamma(2, 1/λ)

c) Gamma distribution:
   Density: f(x) = [1/(β^α Γ(α))] x^{α−1} e^{−x/β},  x > 0
   Gamma function: Γ(α) = ∫_0^∞ u^{α−1} e^{−u} du
                   Γ(α + 1) = αΓ(α)
                   Γ(n) = (n − 1)! when n is an integer
                   Γ(1/2) = √π,  Γ(1) = 1
   Moment generating function: MX(t) = [1/(1 − βt)]^α
   Expectation: E(X) = αβ
   Variance: V(X) = αβ²
   Sum rule: X ∼ gamma(α, β), Y ∼ gamma(δ, β),
   X and Y independent ⇒ X + Y ∼ gamma(α + δ, β)

d) Chi-squared distribution:
   Density: f(v) = [1/(2^{n/2} Γ(n/2))] v^{(n/2)−1} e^{−v/2},  v > 0
   n degrees of freedom
   Expectation: E(V) = n
   Variance: V(V) = 2n
   Sum rule: U ∼ χ²_n, V ∼ χ²_m, U and V independent ⇒ U + V ∼ χ²_{n+m}
   Result: Z ∼ N(0, 1) ⇒ Z² ∼ χ²_1

e) Student's t-distribution:
   Density: f(t) = [Γ((n + 1)/2)/(√(nπ) Γ(n/2))] (1 + t²/n)^{−(n+1)/2},  −∞ < t < ∞
   n degrees of freedom
   Expectation: E(T) = 0 (n ≥ 2)
   Variance: V(T) = n/(n − 2) (n ≥ 3)
   Result: Z ∼ N(0, 1), U ∼ χ²_n, Z, U independent ⇒ Z/√(U/n) ∼ t_n

f) Binormal distribution:
   Density:

      f(x, y) = [1/(2π σX σY √(1 − ρ²))]
                × exp{ −[1/(2(1 − ρ²))] [ (x − µX)²/σX² + (y − µY)²/σY² − 2ρ(x − µX)(y − µY)/(σX σY) ] }

   Marginal distribution: X ∼ N(µX, σX²), Y ∼ N(µY, σY²)
   Correlation: Corr(X, Y) = ρ
   Conditional distribution: Given X = x, Y is normally distributed with
   expectation E(Y|X = x) = µY + ρ(σY/σX)(x − µX)
   and variance V(Y|X = x) = σY²(1 − ρ²)
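
A simulation-based check of the conditional distribution in f). The sketch assumes numpy is available; the parameter values are arbitrary.

    import numpy as np

    mu_X, mu_Y, sd_X, sd_Y, rho = 1.0, 2.0, 1.5, 0.5, 0.6
    cov = [[sd_X**2, rho*sd_X*sd_Y], [rho*sd_X*sd_Y, sd_Y**2]]

    rng = np.random.default_rng(0)
    X, Y = rng.multivariate_normal([mu_X, mu_Y], cov, size=200_000).T

    # Condition on X close to x = 2 and compare with the formula sheet values
    x = 2.0
    sel = np.abs(X - x) < 0.05
    print(Y[sel].mean(), mu_Y + rho * sd_Y / sd_X * (x - mu_X))   # conditional mean
    print(Y[sel].var(),  sd_Y**2 * (1 - rho**2))                  # conditional variance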

10. One normally distributed sample
If X1, X2, . . . , Xn are independent and N(µ, σ²) distributed then we have that:

a) X̄ = (1/n) ∑_{i=1}^n Xi and S² = [1/(n − 1)] ∑_{i=1}^n (Xi − X̄)² are independent

b) X̄ ∼ N(µ, σ²/n)

c) (n − 1)S²/σ² ∼ χ²_{n−1}

d) (X̄ − µ)/(S/√n) ∼ t_{n−1}
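
A numpy sketch of the quantities in a) to d), on simulated data with assumed values µ = 5 and σ = 2.

    import numpy as np

    rng = np.random.default_rng(1)
    mu, sigma, n = 5.0, 2.0, 30
    x = rng.normal(mu, sigma, size=n)

    xbar = x.mean()
    s2 = x.var(ddof=1)                               # S^2 with the 1/(n-1) factor from a)
    t_stat = (xbar - mu) / (s2 ** 0.5 / n ** 0.5)    # the t_{n-1} variable in d)
    chi2_stat = (n - 1) * s2 / sigma ** 2            # the chi^2_{n-1} variable in c)

    print(xbar, s2, t_stat, chi2_stat)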

11. Two normally distributed samples


Let X1, X2, . . . , Xn be independent and N(µX, σ²) distributed, and Y1, Y2, . . . , Ym in-
dependent and N(µY, σ²) distributed. The two samples are independent of each other.
Let X̄, Ȳ, SX² and SY² be defined as in 10a). Then we have that:

a) Sp² = [(n − 1)SX² + (m − 1)SY²]/(n + m − 2) is a weighted estimator for σ²

b) X̄ − Ȳ ∼ N(µX − µY, σ²(1/n + 1/m))

c) (n + m − 2)Sp²/σ² ∼ χ²_{n+m−2}

d) [X̄ − Ȳ − (µX − µY)] / [Sp √(1/n + 1/m)] ∼ t_{n+m−2}
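
The pooled estimator in a) and the t variable in d), sketched in numpy on simulated samples with an assumed common σ = 2.

    import numpy as np

    rng = np.random.default_rng(2)
    n, m, sigma = 25, 30, 2.0
    x = rng.normal(10.0, sigma, size=n)
    y = rng.normal(9.0,  sigma, size=m)

    sx2, sy2 = x.var(ddof=1), y.var(ddof=1)
    sp2 = ((n - 1) * sx2 + (m - 1) * sy2) / (n + m - 2)   # item a)

    # item d) with the true difference mu_X - mu_Y = 1 plugged in
    t_stat = (x.mean() - y.mean() - 1.0) / (sp2 ** 0.5 * (1 / n + 1 / m) ** 0.5)
    print(sp2, t_stat)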

12. Regression analysis


Assume Yi = β0 + β1 xi + εi, i = 1, 2, . . . , n, where the xi-s are given numbers and the εi-s
are independent and N(0, σ²) distributed. Then we have that:

a) The least squares estimators for β0 and β1 are

      β̂0 = Ȳ − β̂1 x̄   and   β̂1 = ∑_{i=1}^n (xi − x̄)(Yi − Ȳ) / ∑_{i=1}^n (xi − x̄)²

b) The estimators in a) are normally distributed and unbiased, and

      Var(β̂0) = σ² ∑_{i=1}^n xi² / [n ∑_{i=1}^n (xi − x̄)²]   and   Var(β̂1) = σ² / ∑_{i=1}^n (xi − x̄)²

c) Let SSE = ∑_{i=1}^n (Yi − β̂0 − β̂1 xi)². Then S² = SSE/(n − 2) is an unbiased estimator
   for σ², and (n − 2)S²/σ² ∼ χ²_{n−2}
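
The least squares formulas in a) to c), checked on simulated data in numpy; the true values β0 = 1, β1 = 2 and σ = 0.5 are assumptions of the example.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 50
    x = np.linspace(0, 10, n)
    y = 1.0 + 2.0 * x + rng.normal(0, 0.5, size=n)

    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)  # item a)
    b0 = y.mean() - b1 * x.mean()

    sse = np.sum((y - b0 - b1 * x) ** 2)
    s2 = sse / (n - 2)                          # item c), estimator for sigma^2

    var_b1 = s2 / np.sum((x - x.mean()) ** 2)   # item b) with sigma^2 replaced by S^2
    print(b0, b1, s2, var_b1)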

13. Multiple linear regression
Assume Yi = β0 + β1 xi1 + · · · + βk xik + εi, i = 1, 2, . . . , n, where the xij-s are given numbers
and the εi-s are independent and N(0, σ²) distributed. The model can be written in matrix
form as Y = Xβ + ε, where Y = (Y1, . . . , Yn)^T and β = (β0, . . . , βk)^T are n- and (k + 1)-
dimensional vectors, and X = {xij} (with xi0 = 1) is an n × (k + 1)-dimensional matrix.
Then:

a) The least squares estimator for β is β̂ = (X^T X)^{−1} X^T Y.

b) Let β̂ = (β̂0, . . . , β̂k)^T. Then the β̂j-s are normally distributed and unbiased, and

      Var(β̂j) = σ² cjj   and   Cov(β̂j, β̂l) = σ² cjl

   where cjl is element (j, l) in the (k + 1) × (k + 1) matrix C = (X^T X)^{−1}.

c) Let Ŷi = β̂0 + β̂1 xi1 + · · · + β̂k xik, and let SSE = ∑_{i=1}^n (Yi − Ŷi)². Then S² =
   SSE/[n − (k + 1)] is an unbiased estimator for σ², and [n − (k + 1)]S²/σ² ∼ χ²_{n−(k+1)}.
   Also, S² and β̂ are independent.

d) Let S²_{β̂j} be the variance estimator for β̂j obtained by replacing σ² with S² in the
   formula for Var(β̂j) (in b). Then (β̂j − βj)/S_{β̂j} ∼ t_{n−(k+1)}.
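
The matrix formulas in a) to d), as a numpy sketch with two illustrative covariates (k = 2); the coefficients and noise level are assumptions of the example.

    import numpy as np

    rng = np.random.default_rng(4)
    n, k = 60, 2
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # x_{i0} = 1
    beta = np.array([1.0, 2.0, -0.5])
    y = X @ beta + rng.normal(0, 0.3, size=n)

    C = np.linalg.inv(X.T @ X)
    beta_hat = C @ X.T @ y                  # item a)

    sse = np.sum((y - X @ beta_hat) ** 2)
    s2 = sse / (n - (k + 1))                # item c)

    se = np.sqrt(s2 * np.diag(C))           # item d): estimated sd of each beta_hat_j
    t = (beta_hat - beta) / se              # each entry ~ t_{n-(k+1)}
    print(beta_hat, s2, t)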

