
MATH3871

Assignment 2

Robert Tan
School of Mathematics and Statistics
z5059256@student.unsw.edu.au

Question 1
Part (a)
A DAG of the data and model parameters is:

[Figure: DAG with parameter nodes ln α, β and σ² as parents of each observation ln y_i, with covariate ln x_i, on a plate over i = 1, …, N.]

Figure 1: DAG for data and model parameters.

Part (b)
The priors used are Normal distributions for ln α and β, since both can be either positive or negative, and a Gamma distribution for the precision τ (equivalently, an inverse Gamma distribution for σ²). I chose the parameters using the coefficients and residuals from a linear regression performed on the log-linear model ln y_i = ln α + β ln x_i + ε_i. Varying these parameters did not significantly change the results, as seen below where I tested diffuse Normal priors; varying the prior on the precision did produce small changes in the final samples for the standard error.
log_alpha ~ dnorm(0.16851, 0.0003958169)
beta ~ dnorm(0.41400, 0.0002758921)
tau ~ dgamma(0.001, 0.001)
mean sd MC_error val2.5pc median val97.5pc start sample
beta 0.41400 0.01783 0.0004102 0.37850 0.41400 0.4489 1001 3000
log_alpha 0.16860 0.02030 0.0004506 0.12750 0.16830 0.2089 1001 3000
sigma 0.09165 0.01477 0.0002629 0.06769 0.09016 0.1252 1001 3000

log_alpha ~ dnorm(0.0, 0.000001)
beta ~ dnorm(0.0, 0.000001)
tau ~ dgamma(0.001, 0.001)
mean sd MC_error val2.5pc median val97.5pc start sample
beta 0.41400 0.01783 0.0004102 0.37850 0.41400 0.4489 1001 3000
log_alpha 0.16860 0.02030 0.0004506 0.12750 0.16830 0.2089 1001 3000
sigma 0.09165 0.01477 0.0002629 0.06769 0.09016 0.1252 1001 3000

log_alpha ~ dnorm(0.0, 0.000001)
beta ~ dnorm(0.0, 0.000001)
tau ~ dgamma(0.01, 0.01)
mean sd MC_error val2.5pc median val97.5pc start sample
beta 0.41400 0.01873 0.0004308 0.3767 0.41400 0.4506 1001 3000
log_alpha 0.16860 0.02132 0.0004732 0.1254 0.16830 0.2109 1001 3000
sigma 0.09625 0.01550 0.0002760 0.0711 0.09469 0.1315 1001 3000


Part (c)
The likelihood function of our data is given by
$$L\left(\mathbf{y} \mid \mathbf{x}, \alpha, \beta, \sigma^2\right) = \prod_{i=1}^{24} \left(2\pi\sigma^2\right)^{-1/2} \exp\left(-\frac{(\ln y_i - \ln\alpha - \beta\ln x_i)^2}{2\sigma^2}\right)$$

$$= \left(2\pi\sigma^2\right)^{-12} \exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{24}\left(\ln y_i - \ln\alpha - \beta\ln x_i\right)^2\right).$$
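This log-likelihood is straightforward to evaluate directly; a minimal R sketch (the function and argument names are mine, chosen to match the node names used in part (d)):

# Gaussian log-likelihood of the log-linear model
# ln(y_i) = ln(alpha) + beta * ln(x_i) + eps_i, eps_i ~ N(0, sigma2)
log.likelihood <- function(log_alpha, beta, sigma2, log_x, log_y) {
  resid <- log_y - log_alpha - beta * log_x
  -(length(log_y) / 2) * log(2 * pi * sigma2) - sum(resid^2) / (2 * sigma2)
}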

Part (d)
library(BRugs)
library(coda)
# Read the data (x in row 1, y in row 2) and take logs
data <- read.table("dome-data.txt", sep = "", header = F, nrows = 2)
data[1,] <- log(data[1,])
data[2,] <- log(data[2,])
dataset <- list(log_x = unlist(data[1,]), log_y = unlist(data[2,]), N = 24)
# Classical fit, used to generate over-dispersed initial values for three chains
linearmodel <- lm(unlist(data[2,]) ~ unlist(data[1,]))
log_alpha_0 <- as.numeric(linearmodel$coefficients[1])
beta_0 <- as.numeric(linearmodel$coefficients[2])
tau_0 <- 1/(summary(linearmodel)$sigma)^2
codafit <- BRugsFit(modelFile = "H:/R/dome-model.txt", data = dataset,
  inits = list(list(log_alpha = log_alpha_0, beta = beta_0, tau = tau_0),
    list(log_alpha = 0.25 * log_alpha_0, beta = 0.25 * beta_0, tau = 0.25 * tau_0),
    list(log_alpha = 4 * log_alpha_0, beta = 4 * beta_0, tau = 4 * tau_0)),
  numChains = 3, parametersToSave = c("log_alpha", "beta", "sigma2"),
  nBurnin = 1000, nIter = 1000, coda = TRUE)
# Posterior summaries, traceplots, densities and BGR diagnostics
samplesStats("*")
a <- samplesHistory("*", mfrow = c(3, 1))
b <- samplesDensity("*", mfrow = c(3, 1))
bgr <- samplesBgr("*", mfrow = c(3, 1))
HPDinterval(codafit, 0.95)
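The model file dome-model.txt itself is not shown above; the following is a sketch of what it plausibly contains, assembled from the priors quoted in part (b) and the node names used in the R code, so treat it as an assumption rather than the original file:

model {
  for (i in 1:N) {
    # log-linear regression: ln(y_i) ~ Normal(ln(alpha) + beta * ln(x_i), 1/tau)
    mu[i] <- log_alpha + beta * log_x[i]
    log_y[i] ~ dnorm(mu[i], tau)
  }
  log_alpha ~ dnorm(0.16851, 0.0003958169)
  beta ~ dnorm(0.41400, 0.0002758921)
  tau ~ dgamma(0.001, 0.001)
  sigma2 <- 1 / tau        # monitored as "sigma2"
  sigma <- sqrt(sigma2)    # reported as "sigma" in the tables of part (b)
}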

Part (e)
The traceplots, posterior density plots and Brooks-Gelman-Rubin statistic plots are as follows.
[Figure: for each of beta, log_alpha and sigma2 — a traceplot over iterations 1000–2000, a posterior density plot, and a Brooks-Gelman-Rubin statistic plot over iterations 1000–2000.]


Table 1: The 95% HDR regions for each parameter.

Parameter    Lower HDR    HDR Centre    Upper HDR
β            0.378477     0.4133130    0.448149
ln α         0.125292     0.1668455    0.208399
σ²           0.003980     0.0091880    0.014396

Question 2
Part (a)
Claim. By assuming a multinomial likelihood and Dirichlet prior with parameter vector $\mathbf{a} = (a_1, \ldots, a_9)^\top$, the Bayes factor for testing the conjecture that observed counts $\mathbf{n} = (n_1, \ldots, n_9)^\top$ are consistent with Benford's law is given by

$$B_{01} = \frac{B(\mathbf{a})}{B(\mathbf{a} + \mathbf{n})} \prod_{j=1}^{9} p_{0j}^{n_j}$$

where $\mathbf{p}_0 = (p_{01}, \ldots, p_{09})^\top$ are Benford's hypothesised proportions and

$$B(\mathbf{a}) = \frac{\prod_{j=1}^{9} \Gamma(a_j)}{\Gamma\!\left(\sum_{j=1}^{9} a_j\right)}.$$

Proof. The density of the Dirichlet distribution of order 9 is

$$f(\mathbf{x}; \mathbf{a}) = \frac{1}{B(\mathbf{a})} \prod_{i=1}^{9} x_i^{a_i - 1}.$$

Calculating the Bayes factor:

$$B_{01} = \frac{L_0(\mathbf{n})}{L_1(\mathbf{n})} = \frac{\dfrac{\Gamma\!\left(\sum_{j=1}^{9} n_j + 1\right)}{\prod_{j=1}^{9} \Gamma(n_j + 1)} \displaystyle\prod_{j=1}^{9} p_{0j}^{n_j}}{\displaystyle\int_\Delta \frac{\Gamma\!\left(\sum_{j=1}^{9} n_j + 1\right)}{\prod_{j=1}^{9} \Gamma(n_j + 1)} \prod_{j=1}^{9} p_j^{n_j} \cdot \frac{1}{B(\mathbf{a})} \prod_{i=1}^{9} p_i^{a_i - 1}\, d\mathbf{p}}$$

where $\Delta$ is the support of a Dirichlet distribution of order 9,

$$= \frac{\dfrac{B(\mathbf{a})}{B(\mathbf{a} + \mathbf{n})} \displaystyle\prod_{j=1}^{9} p_{0j}^{n_j}}{\displaystyle\int_\Delta \frac{1}{B(\mathbf{a} + \mathbf{n})} \prod_{i=1}^{9} p_i^{a_i + n_i - 1}\, d\mathbf{p}} = \frac{B(\mathbf{a})}{B(\mathbf{a} + \mathbf{n})} \prod_{j=1}^{9} p_{0j}^{n_j}$$

since $\frac{1}{B(\mathbf{a} + \mathbf{n})} \prod_{i=1}^{9} p_i^{a_i + n_i - 1}$ is the density of a Dirichlet(a + n) distribution, so the final integral evaluates to 1. ∎
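On the log scale this is convenient to compute via lgamma; a small R helper (the name bf01 is mine) matching the computation that appears in the part (b) code below:

# Bayes factor B01 = B(a)/B(a + n) * prod_j p0j^nj, via log-gamma for stability
bf01 <- function(a, n, p0) {
  exp((sum(lgamma(a)) - lgamma(sum(a))) -
      (sum(lgamma(a + n)) - lgamma(sum(a + n))) +
      sum(n * log(p0)))
}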


Part (b)
We now derive the fractional Bayes factor with training fraction b. The fractional Bayes factor is given by

$$B_{01}^{b} = \frac{m_0(\mathbf{n})}{m_1(\mathbf{n})}.$$

We derive $m_0(\mathbf{n})$ and $m_1(\mathbf{n})$ as follows:

$$m_0 = \frac{\dfrac{\Gamma\!\left(\sum_{j=1}^{9} n_j + 1\right)}{\prod_{j=1}^{9}\Gamma(n_j + 1)} \displaystyle\prod_{j=1}^{9} p_{0j}^{n_j}}{\left[\dfrac{\Gamma\!\left(\sum_{j=1}^{9} n_j + 1\right)}{\prod_{j=1}^{9}\Gamma(n_j + 1)} \displaystyle\prod_{j=1}^{9} p_{0j}^{n_j}\right]^{b}} = \left[\frac{\Gamma\!\left(\sum_{j=1}^{9} n_j + 1\right)}{\prod_{j=1}^{9}\Gamma(n_j + 1)} \prod_{j=1}^{9} p_{0j}^{n_j}\right]^{1-b}$$

$$m_1 = \frac{\displaystyle\int_\Delta \frac{\Gamma\!\left(\sum_{j=1}^{9} n_j + 1\right)}{\prod_{j=1}^{9}\Gamma(n_j + 1)} \prod_{j=1}^{9} p_j^{n_j} \cdot \frac{1}{B(\mathbf{a})}\prod_{i=1}^{9} p_i^{a_i - 1}\, d\mathbf{p}}{\displaystyle\int_\Delta \left[\frac{\Gamma\!\left(\sum_{j=1}^{9} n_j + 1\right)}{\prod_{j=1}^{9}\Gamma(n_j + 1)}\right]^{b} \prod_{j=1}^{9} p_j^{b n_j} \cdot \frac{1}{B(\mathbf{a})}\prod_{i=1}^{9} p_i^{a_i - 1}\, d\mathbf{p}}$$

where $\Delta$ is the support of a Dirichlet distribution of order 9,

$$= \left[\frac{\Gamma\!\left(\sum_{j=1}^{9} n_j + 1\right)}{\prod_{j=1}^{9}\Gamma(n_j + 1)}\right]^{1-b} \frac{B(\mathbf{a} + \mathbf{n})}{B(\mathbf{a} + b\mathbf{n})} \cdot \frac{\displaystyle\int_\Delta \frac{1}{B(\mathbf{a} + \mathbf{n})}\prod_{i=1}^{9} p_i^{a_i + n_i - 1}\, d\mathbf{p}}{\displaystyle\int_\Delta \frac{1}{B(\mathbf{a} + b\mathbf{n})}\prod_{i=1}^{9} p_i^{a_i + b n_i - 1}\, d\mathbf{p}} = \left[\frac{\Gamma\!\left(\sum_{j=1}^{9} n_j + 1\right)}{\prod_{j=1}^{9}\Gamma(n_j + 1)}\right]^{1-b} \frac{B(\mathbf{a} + \mathbf{n})}{B(\mathbf{a} + b\mathbf{n})}$$

since $\frac{1}{B(\mathbf{a}+\mathbf{n})}\prod_{i=1}^{9} p_i^{a_i + n_i - 1}$ and $\frac{1}{B(\mathbf{a}+b\mathbf{n})}\prod_{i=1}^{9} p_i^{a_i + b n_i - 1}$ are the densities of Dirichlet(a + n) and Dirichlet(a + bn) distributions respectively, so the ratio of the final two integrals evaluates to 1. Therefore we have

$$B_{01}^{b} = \frac{m_0(\mathbf{n})}{m_1(\mathbf{n})} = \left[\prod_{j=1}^{9} p_{0j}^{n_j}\right]^{1-b} \frac{B(\mathbf{a} + b\mathbf{n})}{B(\mathbf{a} + \mathbf{n})}.$$
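On the log scale this is compact to compute; a small R function (the name fbf01 is mine; the same expression appears inside the loop in the code below):

# log fractional Bayes factor: (1 - b) * sum(n * log(p0)) + log B(a + b*n) - log B(a + n)
fbf01 <- function(b, a, n, p0) {
  exp((1 - b) * sum(n * log(p0)) +
      (sum(lgamma(a + b * n)) - lgamma(sum(a + b * n))) -
      (sum(lgamma(a + n)) - lgamma(sum(a + n))))
}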


Figure 2: A plot of the Fractional Bayes Factor for varying values of b.

The code used to produce the figure is as follows. We select a uniform prior, setting a = (1, …, 1)⊤, so that every vector of digit proportions is equally likely a priori (the marginals are Beta(1, 8) with mean 1/9, so the expected occurrence of each individual digit is 1/9). Note that we set the minimum value of b to n_min/n.
# Benford proportions, observed digit counts, and uniform Dirichlet prior
p <- c(0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046)
n <- c(31, 32, 29, 20, 18, 18, 21, 13, 10)
a <- rep(1, 9)
# log Bayes factor: log B(a) - log B(a + n) + sum(n * log(p))
log.bayes.factor <- ((sum(lgamma(a)) - lgamma(sum(a))) -
  (sum(lgamma(a + n)) - lgamma(sum(a + n))) + sum(n * log(p)))
# log fractional Bayes factor over a grid of 948 training fractions
log.bayes.fractional <- rep(0, 948)
bmin <- min(n)/sum(n)
for (i in 0:947) {
  log.bayes.fractional[i + 1] <- {(1 - (bmin + i/1000)) * sum(n * log(p)) +
    ((sum(lgamma(a + (bmin + i/1000) * n)) -
      lgamma(sum(a + (bmin + i/1000) * n))) -
      (sum(lgamma(a + n)) - lgamma(sum(a + n))))}
}
bayes.factor <- exp(log.bayes.factor)
bayes.fractional <- exp(log.bayes.fractional)
x <- seq(bmin, bmin + 0.947, 0.001)
plot(x, bayes.fractional, type = "l", xlim = c(0, 1), ylim = c(0, 1.2),
  xlab = "Training fraction", ylab = "Fractional Bayes Factor",
  main = "Fractional Bayes Factor vs. Training Fraction")
abline(1, 0, col = "red")

Part (c)
The Bayes factor returned by the above code is 1.177189, which is greater than 1, slightly favouring the first model, i.e. that the election counts adhered to Benford's law. However, the fractional Bayes factor is below 1 for every meaningful training fraction, so it favours the second model, that the election counts did not adhere to Benford's law. In this case I would use the regular Bayes factor, since neither model uses an improper prior, and so conclude that there is insufficient evidence to suggest that the election counts do not follow Benford's law. However, we cannot really draw any meaningful conclusions about the validity of the Venezuelan election itself, since election data often does not follow Benford's law (see Deckert, Myagkov and Ordeshook 2010, "The Irrelevance of Benford's Law for Detecting Fraud in Elections").


Question 3
Part (a)
The minimal training samples for the Bayes factor contain 2 observations, one 0 and one 1, i.e. they are the sets {0, 1} and {1, 0} (these always exist since we assume 0 < r < n). Here model 1 places the improper prior π₁(θ) = c₁θ⁻¹(1 − θ)⁻¹ on θ, while model 2 fixes θ = θ₀. We derive the partial Bayes factors as follows:

$$B_{12}^{R|T} = \frac{B_{12}}{B_{12}^{T}} = \frac{\dfrac{\displaystyle\int_0^1 \binom{n}{r} c_1 \theta^{r-1}(1-\theta)^{n-r-1}\, d\theta}{\dbinom{n}{r} \theta_0^{r} (1-\theta_0)^{n-r}}}{\dfrac{\displaystyle\int_0^1 \binom{2}{1} c_1\, d\theta}{\dbinom{2}{1} \theta_0 (1-\theta_0)}}$$

$$= \frac{B(r, n-r) \displaystyle\int_0^1 \frac{1}{B(r, n-r)}\, \theta^{r-1}(1-\theta)^{n-r-1}\, d\theta}{\theta_0^{r-1}(1-\theta_0)^{n-r-1}} \qquad \text{where } B(r, n-r) = \frac{\Gamma(r)\,\Gamma(n-r)}{\Gamma(n)}$$

$$= B(r, n-r)\, \theta_0^{1-r}(1-\theta_0)^{1-n+r}$$

for either training set {0, 1} or {1, 0}, by symmetry. The arithmetic and geometric means of these two (identical) partial Bayes factors are then the same expression, so the arithmetic and geometric intrinsic Bayes factors both take the value

$$B_{12}^{AI} = B_{12}^{GI} = B(r, n-r)\, \theta_0^{1-r}(1-\theta_0)^{1-n+r}.$$
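A quick numerical sketch of this result in R (the helper name intrinsic.bf and the example values are mine):

# arithmetic/geometric intrinsic Bayes factor:
# B(r, n - r) * theta0^(1 - r) * (1 - theta0)^(1 - n + r)
intrinsic.bf <- function(r, n, theta0) {
  beta(r, n - r) * theta0^(1 - r) * (1 - theta0)^(1 - n + r)
}
intrinsic.bf(r = 5, n = 20, theta0 = 0.3)  # hypothetical values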

Part (b)
For a given fraction b, we derive the fractional Bayes factor as follows:

$$B_{12}^{F} = \frac{m_1(x)}{m_2(x)} = \frac{\dfrac{\displaystyle\int_0^1 \binom{n}{r} c_1 \theta^{r-1}(1-\theta)^{n-r-1}\, d\theta}{\displaystyle\int_0^1 \binom{n}{r}^{b} c_1 \theta^{br-1}(1-\theta)^{bn-br-1}\, d\theta}}{\dfrac{\dbinom{n}{r} \theta_0^{r}(1-\theta_0)^{n-r}}{\left[\dbinom{n}{r} \theta_0^{r}(1-\theta_0)^{n-r}\right]^{b}}}$$

and, after the binomial coefficients and c₁ cancel,

$$= \frac{B(r, n-r)}{B(br, bn-br)}\, \theta_0^{r(b-1)}(1-\theta_0)^{(n-r)(b-1)}.$$
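The corresponding R sketch (again, the helper name and example values are mine):

# fractional Bayes factor:
# B(r, n - r)/B(b*r, b*(n - r)) * theta0^(r*(b - 1)) * (1 - theta0)^((n - r)*(b - 1))
fractional.bf <- function(b, r, n, theta0) {
  (beta(r, n - r) / beta(b * r, b * (n - r))) *
    theta0^(r * (b - 1)) * (1 - theta0)^((n - r) * (b - 1))
}
fractional.bf(b = 0.5, r = 5, n = 20, theta0 = 0.3)  # hypothetical values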


Part (c)
For θ₀ = 0 and r = 1, we have

$$B_{12}^{I} = B(1, n-1) = \frac{1}{n-1}, \qquad \text{noting that } \theta_0^{1-r} = \theta_0^{0} = 1 \text{ since we have fixed } r = 1,$$

and

$$B_{12}^{F} \to \infty \text{ as } \theta_0 \to 0, \qquad \text{since } b - 1 < 0 \text{ and so } \theta_0^{r(b-1)} = \theta_0^{b-1} \to \infty.$$

This means the fractional Bayes factor always supports model 1, no matter what the fraction is, whereas the intrinsic Bayes factor supports model 2 for n > 2 but is inconclusive for n = 2. So the two are not consistent with each other for large n.

The fractional Bayes factor is correct here: a single non-zero observation guarantees that we cannot have θ₀ = 0, so model 1 must be preferred, and the intrinsic Bayes factor fails to reach this conclusion.
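This behaviour is easy to see numerically with the helpers sketched in parts (a) and (b), taking θ₀ small but nonzero to stand in for the limit (values are mine):

intrinsic.bf(r = 1, n = 10, theta0 = 1e-6)            # about 1/(n - 1) = 0.111: supports model 2
fractional.bf(b = 0.2, r = 1, n = 10, theta0 = 1e-6)  # very large: supports model 1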
