
Bayesian Inference

by Hoai Nam Nguyen


September 9, 2017

The setting is the same. Given a population that follows a distribution $P$, where $P$ contains one or more unknown parameters, we want to construct an estimator for each of them. In this course, I consider the simple case, where there is only one unknown parameter $\theta$. To do this, we proceed by collecting an i.i.d. sample $X_1, \dots, X_n \sim P$.

Similar to Maximum Likelihood Estimation, we first find the likelihood function $L(\theta)$:

$$L(\theta) = f_{X_1,\dots,X_n}(x_1,\dots,x_n \mid \theta)$$

In Bayesian inference, we treat the parameter $\theta$ as a random variable. That is, $\theta$ follows a probability distribution with pdf $\pi(\theta)$. We call $\pi(\theta)$ the prior distribution of $\theta$.

By Bayes's formula, we have

$$\pi(\theta \mid x_1,\dots,x_n) = \frac{f_{X_1,\dots,X_n}(x_1,\dots,x_n \mid \theta)\,\pi(\theta)}{f_{X_1,\dots,X_n}(x_1,\dots,x_n)} \propto f_{X_1,\dots,X_n}(x_1,\dots,x_n \mid \theta)\,\pi(\theta)$$

where $\pi(\theta \mid x_1,\dots,x_n)$ is the pdf of $\theta$ given the sample data. This is called the posterior distribution of $\theta$.

Let me clarify the last step further. The symbol $\propto$ means "proportional to". Since the left-hand side is the distribution of $\theta$ conditional on the sample data $\{x_1,\dots,x_n\}$, all the $x_i$ are assumed to be known, and the denominator $f_{X_1,\dots,X_n}(x_1,\dots,x_n) = \int f_{X_1,\dots,X_n}(x_1,\dots,x_n \mid \theta)\,\pi(\theta)\,d\theta$ does not involve $\theta$; it is, therefore, no more than a constant.

In this setting, we are given the population distribution $P$ and the prior distribution $\pi(\theta)$. We have to find the posterior distribution $\pi(\theta \mid x_1,\dots,x_n)$. We then use the posterior mean $E[\theta \mid x_1,\dots,x_n]$ to estimate the unknown parameter $\theta$. That is,

$$\hat{\theta} = E[\theta \mid x_1,\dots,x_n]$$

NOTE: when calculating $\pi(\theta \mid x_1,\dots,x_n)$, always use proportionality by removing constants, because this will simplify the calculation a lot.
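This whole recipe can be checked numerically. The sketch below is my own illustration, not part of the original notes: it evaluates an unnormalised posterior on a grid (so the constant we drop genuinely does not matter), normalises it, and takes the grid mean. The Bernoulli likelihood, uniform prior, and sample are all assumed purely for concreteness.

```python
import numpy as np

# Grid of candidate parameter values on (0, 1)
theta = np.linspace(0.001, 0.999, 1000)
dtheta = theta[1] - theta[0]

# Hypothetical observed Bernoulli sample (assumed for illustration)
x = np.array([1, 0, 1, 1, 0, 1, 1, 1])
n, s = len(x), x.sum()

# Unnormalised posterior: likelihood times prior, with constants dropped
unnorm_post = theta**s * (1 - theta)**(n - s)   # Uniform(0, 1) prior = 1

# Normalise on the grid, then take the posterior mean
post = unnorm_post / (unnorm_post.sum() * dtheta)
theta_hat = (theta * post).sum() * dtheta

print(theta_hat)   # approximately (s + 1) / (n + 2) = 0.7
```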

Example 1

The population distribution is $\mathrm{Bernoulli}(p)$, where $p \sim \mathrm{Uniform}(0,1)$. Use Bayesian inference to construct an estimator $\hat{p}$.

The likelihood function is given by:

$$L(p) = \prod_{i=1}^{n} f_{X_i}(x_i \mid p) = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i} = p^{\sum x_i}(1-p)^{n-\sum x_i}$$

The pdf of the prior distribution is $\pi(p) = 1$, for $0 < p < 1$.

Therefore, the posterior distribution is given by:

$$\pi(p \mid x_1,\dots,x_n) \propto f_{X_1,\dots,X_n}(x_1,\dots,x_n \mid p)\,\pi(p) = p^{\sum x_i}(1-p)^{n-\sum x_i}, \quad \text{for } 0 < p < 1$$

Recall the pdf of $\mathrm{Beta}(\alpha, \beta)$:

$$f_X(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha-1}(1-x)^{\beta-1}, \quad \text{for } 0 < x < 1$$

By comparing exponents ($\alpha - 1 = \sum x_i$ and $\beta - 1 = n - \sum x_i$), we can see that the posterior distribution of $p$ is $\mathrm{Beta}\!\left(\sum x_i + 1,\; n - \sum x_i + 1\right)$.


We know that the expectation of $\mathrm{Beta}(\alpha, \beta)$ is $\frac{\alpha}{\alpha+\beta}$. Therefore, the posterior mean is given by:

$$E[p \mid x_1,\dots,x_n] = \frac{\sum x_i + 1}{n+2}$$

Thus, $\hat{p} = \frac{\sum X_i + 1}{n+2}$ is the Bayesian estimator for $p$.

Note that we used proportionality when calculating the posterior distribution. By comparing with the pdf of $\mathrm{Beta}(\alpha, \beta)$, we can easily recover the missing constant:

$$c = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} = \frac{\Gamma(n+2)}{\Gamma\!\left(\sum x_i + 1\right)\Gamma\!\left(n - \sum x_i + 1\right)}$$
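As a quick check on Example 1, here is a small sketch of mine (the sample is assumed): it computes $\hat{p}$ from the closed form and compares it with the mean of the $\mathrm{Beta}(\sum x_i + 1,\, n - \sum x_i + 1)$ posterior as computed by scipy.

```python
import numpy as np
from scipy import stats

# Hypothetical Bernoulli sample (assumed for illustration)
x = np.array([1, 1, 0, 1, 0, 1, 1, 1, 0, 1])
n, s = len(x), x.sum()

# Closed-form Bayesian estimator under the Uniform(0, 1) prior
p_hat = (s + 1) / (n + 2)

# The same quantity via the Beta(s + 1, n - s + 1) posterior
posterior = stats.beta(s + 1, n - s + 1)

print(p_hat, posterior.mean())   # both equal (s + 1) / (n + 2)
```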

Example 2

Same as Example 1, except that $p \sim \mathrm{Beta}(a, b)$, where both $a$ and $b$ are given constants.

The likelihood function stays unchanged:

$$L(p) = p^{\sum x_i}(1-p)^{n-\sum x_i}$$

The pdf of the prior distribution is given by:

$$\pi(p) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, p^{a-1}(1-p)^{b-1}, \quad \text{for } 0 < p < 1$$

Therefore, the pdf of the posterior distribution is given by:

$$\pi(p \mid x_1,\dots,x_n) \propto f_{X_1,\dots,X_n}(x_1,\dots,x_n \mid p)\,\pi(p) \propto p^{\sum x_i}(1-p)^{n-\sum x_i} \cdot p^{a-1}(1-p)^{b-1} = p^{\sum x_i + a - 1}(1-p)^{n-\sum x_i + b - 1}, \quad \text{for } 0 < p < 1$$

We recognise this as $\mathrm{Beta}\!\left(\sum x_i + a,\; n - \sum x_i + b\right)$.
The posterior mean is $E[p \mid x_1,\dots,x_n] = \frac{\sum x_i + a}{n+a+b}$. The Bayesian estimator for $p$ is given by:

$$\hat{p} = \frac{\sum X_i + a}{n+a+b}$$

Again, you can recover the normalising constant in the pdf of the posterior distribution:

$$c = \frac{\Gamma(n+a+b)}{\Gamma\!\left(\sum x_i + a\right)\Gamma\!\left(n - \sum x_i + b\right)}$$
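A useful way to read this result: the $\mathrm{Beta}(a, b)$ prior behaves like $a$ pseudo-successes and $b$ pseudo-failures added to the data, so the posterior mean sits between the sample mean and the prior mean. A small sketch of mine (sample and hyperparameters assumed) makes the shrinkage visible.

```python
import numpy as np

# Hypothetical data and prior hyperparameters (assumed for illustration)
x = np.array([1, 1, 1, 0, 1])
a, b = 2.0, 2.0                   # Beta(2, 2) prior, prior mean 0.5
n, s = len(x), x.sum()

sample_mean = s / n               # MLE: 0.8
prior_mean = a / (a + b)          # 0.5
p_hat = (s + a) / (n + a + b)     # Bayesian estimator: 6/9 = 0.667

# The posterior mean lies between the prior mean and the sample mean
print(prior_mean, p_hat, sample_mean)
```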

Example 3

The population distribution is $N(\theta, \sigma^2)$, where $\theta$ is unknown and $\sigma^2$ is known. The parameter $\theta$ follows a prior distribution $N(\mu, \tau^2)$, where both $\mu$ and $\tau^2$ are given constants. Use Bayesian inference to construct an estimator $\hat{\theta}$.

The likelihood function is given by:

$$L(\theta) = \prod_{i=1}^{n} f_{X_i}(x_i \mid \theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x_i-\theta)^2}{2\sigma^2}\right) \propto \prod_{i=1}^{n} \exp\!\left(-\frac{(x_i-\theta)^2}{2\sigma^2}\right), \ \text{because } \sigma^2 \text{ is known}$$

Also, the pdf of the prior distribution is given by:

$$\pi(\theta) = \frac{1}{\sqrt{2\pi\tau^2}} \exp\!\left(-\frac{(\theta-\mu)^2}{2\tau^2}\right) \propto \exp\!\left(-\frac{(\theta-\mu)^2}{2\tau^2}\right), \ \text{because } \tau^2 \text{ is known}$$
Then, calculate the pdf of the posterior distribution:

$$\pi(\theta \mid x_1,\dots,x_n) \propto \prod_{i=1}^{n} \exp\!\left(-\frac{(x_i-\theta)^2}{2\sigma^2}\right) \cdot \exp\!\left(-\frac{(\theta-\mu)^2}{2\tau^2}\right)$$

$$= \exp\!\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\theta)^2\right) \exp\!\left(-\frac{1}{2\tau^2}(\theta^2 - 2\theta\mu + \mu^2)\right)$$

$$\propto \exp\!\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\theta)^2\right) \exp\!\left(-\frac{1}{2\tau^2}(\theta^2 - 2\theta\mu)\right), \ \text{by removing } \exp\!\left(-\frac{\mu^2}{2\tau^2}\right)$$

$$= \exp\!\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i^2 - 2\theta x_i + \theta^2)\right) \exp\!\left(-\frac{1}{2\tau^2}(\theta^2 - 2\theta\mu)\right)$$

$$\propto \exp\!\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(-2\theta x_i + \theta^2)\right) \exp\!\left(-\frac{1}{2\tau^2}(\theta^2 - 2\theta\mu)\right), \ \text{by removing } \exp\!\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n} x_i^2\right)$$

$$= \exp\!\left(\frac{\theta}{\sigma^2}\sum_{i=1}^{n} x_i - \frac{n\theta^2}{2\sigma^2}\right) \exp\!\left(-\frac{1}{2\tau^2}(\theta^2 - 2\theta\mu)\right)$$

$$= \exp\!\left[-\frac{1}{2}\left(\frac{n}{\sigma^2} + \frac{1}{\tau^2}\right)\theta^2 + \left(\frac{1}{\sigma^2}\sum_{i=1}^{n} x_i + \frac{\mu}{\tau^2}\right)\theta\right]$$

$$= \exp(A\theta^2 + B\theta), \quad \text{where } A = -\frac{1}{2}\left(\frac{n}{\sigma^2} + \frac{1}{\tau^2}\right) < 0 \ \text{ and } \ B = \frac{1}{\sigma^2}\sum_{i=1}^{n} x_i + \frac{\mu}{\tau^2}$$

Completing the square in $\theta$:

$$\exp(A\theta^2 + B\theta) = \exp\!\left(\frac{\theta^2 + (B/A)\theta}{1/A}\right) \propto \exp\!\left(\frac{\theta^2 + (B/A)\theta + B^2/(4A^2)}{1/A}\right) = \exp\!\left(\frac{\left(\theta + B/(2A)\right)^2}{1/A}\right)$$

Comparing with the pdf of a Normal distribution, which is proportional to $\exp\!\left(-\frac{(\theta - m)^2}{2s^2}\right)$, we deduce that the posterior distribution of $\theta$ is given by:

$$\theta \mid x_1,\dots,x_n \sim N\!\left(-\frac{B}{2A},\; -\frac{1}{2A}\right)$$

Clearly, $E[\theta \mid x_1,\dots,x_n] = -\frac{B}{2A}$. Therefore, $\hat{\theta} = -\frac{B}{2A}$ is the Bayesian estimator for $\theta$.
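Written out, $\hat{\theta} = -\frac{B}{2A} = \dfrac{\sum x_i/\sigma^2 + \mu/\tau^2}{n/\sigma^2 + 1/\tau^2}$, a precision-weighted average of the data and the prior mean. The sketch below, with an assumed sample and assumed hyperparameters, computes the estimator both ways.

```python
import numpy as np

# Hypothetical data and known constants (assumed for illustration)
x = np.array([4.8, 5.3, 5.1, 4.6, 5.2])
sigma2 = 1.0            # known population variance
mu, tau2 = 4.0, 2.0     # prior mean and prior variance
n = len(x)

# A and B exactly as defined in the derivation above
A = -0.5 * (n / sigma2 + 1 / tau2)
B = x.sum() / sigma2 + mu / tau2

theta_hat = -B / (2 * A)

# Equivalent precision-weighted-average form
theta_hat2 = (x.sum() / sigma2 + mu / tau2) / (n / sigma2 + 1 / tau2)

print(theta_hat, theta_hat2)   # identical values
```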

Example 4

Consider the following types of treatment:

Treatment 1: 100% of the patients are cured (3 out of 3)

Treatment 2: 95% of the patients are cured (19 out of 20)

Treatment 3: 90% of the patients are cured (90,000 out of 100,000)

Which one is the best?

Treatment 1 cured 100% of the patients, but the sample was so small that we should cast doubt on the result. On the other hand, Treatment 3's sample size was very reassuring, but the cure rate was a bit lower.

Let $p$ be the probability that a patient is cured. Then, the probability that a patient is not cured is $1 - p$.

Therefore, the population follows $\mathrm{Bernoulli}(p)$, where $p$ is an unknown parameter.

In Example 1, we found that $\hat{p} = \frac{\sum x_i + 1}{n+2}$ provided an estimate for $p$.
Treatment 1: $\hat{p} = \frac{3+1}{3+2} = \frac{4}{5} = 0.8$

Treatment 2: $\hat{p} = \frac{19+1}{20+2} = \frac{20}{22} \approx 0.909$

Treatment 3: $\hat{p} = \frac{90000+1}{100000+2} = \frac{90001}{100002} \approx 0.9$

We can see that $\hat{p}$ for Treatment 2 is the highest. Therefore, we predict that Treatment 2 is the best one. Treatment 1, despite curing everyone in the sample, is predicted to be the worst due to its small sample size.
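These three estimates are easy to reproduce with the formula from Example 1; nothing below is assumed beyond the counts given above.

```python
# Bayesian estimate from Example 1: (cured + 1) / (n + 2)
treatments = {"Treatment 1": (3, 3),
              "Treatment 2": (19, 20),
              "Treatment 3": (90_000, 100_000)}

for name, (cured, n) in treatments.items():
    p_hat = (cured + 1) / (n + 2)
    print(f"{name}: p_hat = {p_hat:.4f}")

# Treatment 2 comes out highest, matching the conclusion above
```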
