Lecture 27: Accept/Reject Sampling
Email: nzabaras@gmail.com
URL: https://www.zabaras.com/
October 9, 2020
Following closely:
C. Robert, G. Casella, Monte Carlo Statistical Methods (Ch. 1, 2, 3.1 & 3.2) (Google Books, slides, video)
J. S. Liu, Monte Carlo Strategies in Scientific Computing (Chapters 1 & 2)
J-M Marin and C. P. Robert, Bayesian Core (Chapter 2)
Statistical Computing & Monte Carlo Methods, A. Doucet (course notes, 2007)
Goals
The goals for today's lecture include:
The accept/reject (rejection sampling) algorithm and its acceptance probability
Rejection sampling for the Gamma distribution using a Cauchy-type envelope
An alternative rejection sampling scheme (Beskos et al., 2005)
Mixture methods for the generation of random variables
Rejection sampling in Bayesian inference and its behavior in high dimensions
Adaptive rejection sampling and Monahan's accept/reject method
The Accept/Reject algorithm proceeds as follows:

Set i = 1.
Repeat until i = N:
  Sample y ~ q(y) and u ~ U(0,1).
  If u ≤ π*(y) / (M′ q*(y)), then accept (set x^(i) = y) and increment the counter i.
  Otherwise, reject.

The probability that a proposed sample is accepted is

Pr(Y is accepted) = ∫_X π*(y) dy / (M′ ∫_X q*(y) dy) ≡ γ
The number of trials until the first acceptance is geometric with mean 1/γ; thus the observed number of trials is an unbiased estimate of 1/Pr(Y is accepted).
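As a concrete illustration, here is a minimal Python sketch of this loop; the standard-normal target, Cauchy proposal, bound M′ = 1.6, and the helper names are illustrative assumptions, not from the slides.

```python
import numpy as np

def accept_reject(pi_star, q_star, sample_q, M, N, rng=None):
    """Minimal accept/reject sampler: draws N samples from pi(x) ∝ pi_star(x)
    using a proposal with (unnormalized) density q_star and sampler sample_q,
    assuming pi_star(x) <= M * q_star(x) for all x."""
    rng = np.random.default_rng() if rng is None else rng
    samples, trials = [], 0
    while len(samples) < N:
        y = sample_q(rng)                       # y ~ q
        u = rng.random()                        # u ~ U(0,1)
        trials += 1
        if u <= pi_star(y) / (M * q_star(y)):   # accept with prob pi*/(M' q*)
            samples.append(y)
    return np.asarray(samples), trials

# Example: standard normal target with a standard Cauchy proposal.
# sup_x pi*(x)/q*(x) = sqrt(2*pi/e) ≈ 1.52, so M = 1.6 is a valid bound.
pi_star = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)
q_star = lambda x: 1.0 / (np.pi * (1.0 + x**2))
sample_q = lambda rng: rng.standard_cauchy()
x, trials = accept_reject(pi_star, q_star, sample_q, M=1.6, N=10_000)
print(trials / len(x))   # ≈ 1.6, i.e. M' (expected number of trials per sample)
```

The printed ratio of trials to accepted samples should be close to M′, in line with the geometric argument above.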
Example (Gamma distribution): to sample from 𝒢𝒶(a, λ) with non-integer a > 1, one may use a 𝒢𝒶(k, λ−1) proposal with k = ⌊a⌋ and consider the ratio 𝒢𝒶(x | a, λ) / 𝒢𝒶(x | k, λ−1). This ratio is maximized at x = a − k; thus M = 𝒢𝒶(a−k | a, λ) / 𝒢𝒶(a−k | k, λ−1), with k = ⌊a⌋.
Since we know the CDF of h(x) (a Cauchy-type density), we can easily sample from it. To use it as a proposal distribution, we scale it so that it is nowhere less than the Gamma density. One can show that for a > 1 and x ≥ 0 the following inequality holds:
In our notation:

π*(x) = x^{a−1} e^{−x},   f(x) = (1/Γ(a)) x^{a−1} e^{−x} ≤ (1/Γ(a)) e^{−(a−1)} (a−1)^{a−1} / [1 + (x − (a−1))² / (2a−1)] ≡ g(x)

q*(x) = 1 / [1 + (x − (a−1))² / (2a−1)],   M* = e^{−(a−1)} (a−1)^{a−1},

so that π*(x) ≤ M* q*(x), with g(x) = (1/Γ(a)) e^{−(a−1)} (a−1)^{a−1} π √(2a−1) · h(x | a−1, c = 1/√(2a−1)).

The expected number of trials is

1/γ = M* ∫ q*(x) dx / ∫ π*(x) dx = e^{−(a−1)} (a−1)^{a−1} π √(2a−1) / Γ(a)
Thus we have shown that:
f(x) ≤ K h(x | b = a−1, c = 1/√(2a−1)),   where   K = (1/Γ(a)) e^{−(a−1)} (a−1)^{a−1} π √(2a−1).
U. Dieter & J. Ahrens, Acceptance Rejection Techniques for Sampling from the Beta and Gamma Distributions, 1974
1. Set b ← a − 1, A ← a + b, and s ← √A.   // b = a−1, A = 2a−1, s = √(2a−1)
2. Generate u ~ U(0,1) and set t ← s·tan(π(u − 1/2)) and x ← b + t.
3. If x < 0, go to 2.
4. Generate u′. If u′ > exp(b ln(x/b) − t + ln(1 + t²/A)), go to Step 2. Otherwise deliver x.
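A possible Python sketch of these four steps (the function name and the 𝒢𝒶(3.7, 1) test case are illustrative assumptions, not part of the original algorithm):

```python
import numpy as np

def gamma_cauchy_ar(a, n, rng=None):
    """Draws n samples from Ga(a, 1), a > 1, by rejection from the Cauchy-type
    envelope above (steps 1-4 of the Ahrens/Dieter-style algorithm)."""
    rng = np.random.default_rng() if rng is None else rng
    b, A = a - 1.0, 2.0 * a - 1.0
    s = np.sqrt(A)
    out = []
    while len(out) < n:
        t = s * np.tan(np.pi * (rng.random() - 0.5))   # Cauchy(0, s) variate
        x = b + t
        if x < 0:
            continue                                   # step 3: reject negative x
        u = rng.random()
        # step 4: accept with probability exp(b*ln(x/b) - t) * (1 + t^2/A)
        if np.log(u) <= b * np.log(x / b) - t + np.log1p(t * t / A):
            out.append(x)
    return np.asarray(out)

x = gamma_cauchy_ar(a=3.7, n=50_000)
print(x.mean(), x.var())   # both should be close to a for Ga(a, 1)
```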
[Figure: the Gamma density 𝒢𝒶(x | a, λ = 1) = (1/Γ(a)) x^{a−1} e^{−x} together with its dominating envelope K h(x | b = a−1, c = 1/√(2a−1)).]
The expected number of trials per sample is 1/γ = (1/Γ(a)) e^{−(a−1)} (a−1)^{a−1} π √(2a−1).
It decreases from π = 3.14159 for a = 1 to √π = 1.77245 as a → ∞.
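A quick numeric check of this formula (evaluated in log space for stability):

```python
import numpy as np
from math import lgamma, pi

# Expected trials 1/gamma = e^{-(a-1)} (a-1)^{a-1} * pi * sqrt(2a-1) / Gamma(a),
# which falls from pi toward sqrt(pi) as a grows.
def expected_trials(a):
    return np.exp(-(a - 1) + (a - 1) * np.log(a - 1) - lgamma(a)) * pi * np.sqrt(2 * a - 1)

for a in [1.001, 2.0, 5.0, 20.0, 100.0]:
    print(a, expected_trials(a))
print(np.sqrt(pi))   # limiting value as a -> infinity
```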
Alternative Rejection Sampling (RS) Algorithm
In the standard accept/reject algorithm, the candidate is sampled before 𝑢.
This is not necessary.
(Beskos et al., 2005): Let (Yₙ, Iₙ)_{n≥1} be a sequence of i.i.d. random variables in X × {0,1} such that Y₁ ~ q and

Pr(I₁ = 1 | Y₁ = y) = π*(y) / (C q*(y)),   ∀ y ∈ X.

Define τ = min{i ≥ 1 : Iᵢ = 1}; then Y_τ ~ π.
This scheme does not assume any order for the simulation of 𝑌 and 𝐼 and,
besides the conditional property given in the proposition, does not restrict the
construction of 𝐼.
This result is useful if we can construct conditions for the acceptance or
rejection of the current proposed element 𝑌 from minimal information about it.
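As a toy illustration of this idea, the sketch below (the specific target and all helper names are assumptions, not from Beskos et al.) samples from π*(y) = exp(−y²/2) on [0,1] with a uniform proposal (q* = 1, C = 1); the indicator I is resolved by refining alternating-series lower/upper bounds on exp(−y²/2) only as far as needed, so the decision uses partial information about Y while Pr(I = 1 | Y = y) remains exactly π*(y)/(C q*(y)).

```python
import numpy as np

def accept_lazily(y, u, max_terms=50):
    """Resolve the Bernoulli event {u < exp(-y^2/2)} from partial information:
    for t = y^2/2 <= 1/2 the alternating partial sums of exp(-t) bracket the
    true value, so we refine only until u falls outside the current bracket."""
    t = 0.5 * y * y
    s, term = 1.0, 1.0                     # s = running partial sum of exp(-t)
    for k in range(1, max_terms):
        term *= t / k
        s += -term if k % 2 == 1 else term
        if k % 2 == 1 and u <= s:          # s is a lower bound on exp(-t): accept
            return True
        if k % 2 == 0 and u >= s:          # s is an upper bound on exp(-t): reject
            return False
    return u < np.exp(-t)                  # fallback, essentially never reached

def truncated_normal_01(n, rng=None):
    """Rejection sampler for pi*(y) = exp(-y^2/2) on [0, 1] with a uniform
    proposal (q* = 1, C = 1); Pr(I = 1 | Y = y) = pi*(y)/(C q*(y)) as required."""
    rng = np.random.default_rng() if rng is None else rng
    out = []
    while len(out) < n:
        y, u = rng.random(), rng.random()
        if accept_lazily(y, u):
            out.append(y)
        # on rejection we simply move on to the next i.i.d. pair (Y, I)
    return np.asarray(out)

draws = truncated_normal_01(20_000)
print(draws.mean())   # ≈ 0.46 for a standard normal truncated to [0, 1]
```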
A. Beskos and G. Roberts, The Annals of Applied Probability, Vol 15(4) (2005) pp. 2422–2444.
Alternative Rejection Sampling (RS) Algorithm
The proof is given in (Beskos et al., 2005) and can be summarized in the
following steps. Let (S, 𝒮) be a sufficiently regular measurable space.

Step 1.  P(I₁ = 1) = ∫_S P(I₁ = 1 | Y₁ = y) q(y) dy = ∫_S [π*(y) / (C q*(y))] q(y) dy = ∫_S π*(y) dy / (C ∫_S q*(y) dy) ≡ γ
Step 2. For any 𝐹 ∈ 𝒮, we have:
P(Y_τ ∈ F) = P(Y_τ ∈ F, I₁ = 1) + P(Y_τ ∈ F | I₁ = 0) P(I₁ = 0)
P(Y_τ ∈ F) = ∫_F P(I₁ = 1 | Y₁ = y) q(y) dy + P(Y_τ ∈ F)(1 − γ)
P(Y_τ ∈ F) = γ π(F) + P(Y_τ ∈ F)(1 − γ)   ⟹   P(Y_τ ∈ F) = π(F)
Mixture Methods for the Generation of Random Variables
Consider an infinite mixture with weights pᵢ given by geometric probabilities and with mixture components πᵢ that are all equal to π(∙):
π(x) = Σ_{i=1}^∞ pᵢ πᵢ(x),   pᵢ = p(1 − p)^{i−1},   and   p = ∫_𝒳 π*(y) dy / (M′ ∫_𝒳 q*(y) dy)
The element identifier I ~ 𝒢ℯℴ(p) is generated not by discrete sampling but by a sequential search that tests {I = 1}, {I = 2}, … until a test is accepted. The draw x ~ π_I(x) = π(x) is then obtained automatically as a by-product of the determination of I.
Instead of simulating from the geometric distribution 𝒢ℯℴ(p) directly, which is impossible since p is unknown, one simulates an event which admits this probability distribution (see Peterson and Kronmal, 1982).
Arthur V. Peterson, Jr. and Richard A. Kronmal, On Mixture Methods for the Computer Generation of Random
Variables, The American Statistician, Vol. 36, No. 3, Part 1 (Aug., 1982), pp. 184-191
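The geometric structure of the trial index can be checked empirically; in the sketch below (the Beta(2,2) target and uniform proposal are assumptions chosen for illustration), the index of the first accepted trial is recorded over many runs and compared with 𝒢ℯℴ(p).

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: Beta(2, 2) with density 6x(1-x) on [0, 1]; proposal: Uniform(0, 1).
# The tightest bound is M' = 1.5, so p = 1/M' = 2/3 (both densities normalized).
pi_star = lambda x: 6.0 * x * (1.0 - x)
M = 1.5

def accepted_index(rng):
    """Trial number I at which the first acceptance occurs (one rejection run)."""
    i = 1
    while True:
        y = rng.random()                          # y ~ q = Uniform(0, 1)
        if rng.random() <= pi_star(y) / M:        # q*(y) = 1
            return i
        i += 1

idx = np.array([accepted_index(rng) for _ in range(20_000)])
print(idx.mean(), M)                      # mean of Geo(p) is 1/p = M'
print((idx == 1).mean(), 1.0 / M)         # P(I = 1) = p
print((idx == 2).mean(), (1 - 1 / M) / M) # P(I = 2) = (1 - p) p
```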
If we use q(x) = q*(x) = (1/√(2π)) e^{−x²/2} (normalized):
The likelihood is often bounded, so one can use the rejection procedure with the prior π(θ) as the proposal. Samples are accepted with probability

∫_X π*(x) dx / (M ∫_X q*(x) dx) = ∫ π(θ) f(x|θ) dθ / (M ∫ π(θ) dθ) = ∫ π(θ) f(x|θ) dθ / M
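A minimal sketch of this idea on an assumed conjugate toy model (θ ~ 𝒩(0,1) prior, x | θ ~ 𝒩(θ,1) likelihood, one observation; none of this is taken from the slide): the prior is the proposal and M = sup_θ f(x|θ) bounds the likelihood.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model (assumed for illustration): theta ~ N(0, 1), x | theta ~ N(theta, 1).
x_obs = 1.3
likelihood = lambda th: np.exp(-0.5 * (x_obs - th) ** 2) / np.sqrt(2 * np.pi)
M = 1.0 / np.sqrt(2 * np.pi)              # sup_theta f(x_obs | theta)

def posterior_sample(n):
    """Rejection sampling from p(theta | x) ∝ p(theta) f(x | theta), using the
    prior as proposal and accepting with probability f(x | theta) / M."""
    out = []
    while len(out) < n:
        th = rng.standard_normal()            # theta ~ prior
        if rng.random() <= likelihood(th) / M:
            out.append(th)
    return np.asarray(out)

th = posterior_sample(20_000)
# For this model the exact posterior is N(x_obs/2, 1/2), and the acceptance
# rate equals the marginal likelihood divided by M.
print(th.mean(), th.var())                # ≈ 0.65 and 0.5
```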
Example: the target is

π(x) = π*(x) = 𝒩(0, I_d) = (1/(2π)^{d/2}) exp(−(1/2) Σ_{i=1}^d xᵢ²)

and the proposal is

q*(x) = 𝒩(0, σ² I_d) = (1/(2πσ²)^{d/2}) exp(−(1/(2σ²)) Σ_{i=1}^d xᵢ²).

Note that:

π*(x) / q*(x) = σ^d exp(−(1/2) Σ_{i=1}^d xᵢ² (1 − 1/σ²)) ≤ σ^d   for σ ≥ 1,   so we take M = σ^d.

Pr(Proposal Accepted) = Z / (M ∫_X q*(y) dy) = σ^{−d} → 0 as d → ∞   (here Z = ∫ π*(x) dx = 1):

the efficiency of rejection sampling decreases exponentially with the dimensionality d.
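A quick empirical check of this exponential degradation (the value σ = 1.2 is an arbitrary choice): the observed acceptance rate should track σ^{−d}.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma, n_trials = 1.2, 200_000

for d in [1, 2, 5, 10, 20, 50]:
    y = sigma * rng.standard_normal((n_trials, d))        # y ~ N(0, sigma^2 I_d)
    # log of pi*(y) / (M q*(y)) with M = sigma^d, so the ratio is <= 1:
    log_ratio = -0.5 * np.sum(y**2, axis=1) * (1.0 - 1.0 / sigma**2)
    accept = np.log(rng.random(n_trials)) <= log_ratio
    print(d, accept.mean(), sigma ** (-d))                # empirical vs sigma^{-d}
```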
Adaptive Rejection Sampling (Gilks & Wild): rejection sampling for a log-concave target f(x), using bounds on h(x) = log f(x) built from a current set of support points Sₙ:

h̲_n(x) ≤ h(x) = log f(x) ≤ h̄_n(x)
f̲_n(x) = e^{h̲_n(x)} ≤ f(x) ≤ e^{h̄_n(x)} = f̄_n(x)
f̲_n(x) ≤ f(x) ≤ f̄_n(x) = ω_n g_n(x),

where ω_n is the normalization constant of f̄_n(x) and g_n(x) is a density easy to sample from.
At iteration n ≥ 1: sample Y ~ g_n and U ~ U(0,1); if U ≤ f(Y)/f̄_n(Y), accept Y. Otherwise, update the set of support points, S_{n+1} = S_n ∪ {Y}, rebuild the envelopes, and repeat.
Gilks, W.R. and Wild, P., Adaptive rejection sampling for Gibbs sampling, Applied Statistics, Vol. 41 (1992), pp. 337-348.
Thus π(a | x_{1:n}, y_{1:n}, b) and similarly π(b | x_{1:n}, y_{1:n}, a) are log-concave, and adaptive rejection sampling can be applied.
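Below is a simplified, tangent-based sketch of adaptive rejection sampling on a bounded support (not the full Gilks-Wild implementation; the helper name, the bounded support, and the truncated-normal test case are assumptions): tangents of h at the current support points form the upper hull, candidates are drawn from the resulting piecewise-exponential envelope, and rejected points are added to the support set.

```python
import numpy as np

def ars_sample(h, dh, L, R, x_init, n_samples, rng=None):
    """Simplified tangent-based adaptive rejection sampler on a bounded support
    [L, R] for a strictly log-concave (possibly unnormalized) log-density h
    with derivative dh. Illustrative helper, not the Gilks-Wild code."""
    rng = np.random.default_rng() if rng is None else rng
    S = sorted(x_init)                                # current support points S_n
    samples = []
    while len(samples) < n_samples:
        xs = np.asarray(S)
        hs, ds = h(xs), dh(xs)
        # Intersections z_j of consecutive tangents delimit the hull segments.
        z = (hs[1:] - hs[:-1] + ds[:-1] * xs[:-1] - ds[1:] * xs[1:]) / (ds[:-1] - ds[1:])
        knots = np.concatenate(([L], z, [R]))
        a, b = hs - ds * xs, ds                       # hull on segment j: a_j + b_j x
        # Area under exp(a_j + b_j x) on each segment (piecewise exponential g_n).
        areas = np.empty(len(xs))
        for j in range(len(xs)):
            lo, hi = knots[j], knots[j + 1]
            if abs(b[j]) < 1e-12:
                areas[j] = np.exp(a[j]) * (hi - lo)
            else:
                areas[j] = np.exp(a[j]) * (np.exp(b[j] * hi) - np.exp(b[j] * lo)) / b[j]
        j = rng.choice(len(xs), p=areas / areas.sum())  # pick a hull segment
        lo, hi, u = knots[j], knots[j + 1], rng.random()
        if abs(b[j]) < 1e-12:                           # sample within the segment
            y = lo + u * (hi - lo)
        else:                                           # inverse CDF of exp(b_j x)
            y = np.log(np.exp(b[j] * lo) + u * (np.exp(b[j] * hi) - np.exp(b[j] * lo))) / b[j]
        if np.log(rng.random()) <= h(y) - (a[j] + b[j] * y):
            samples.append(y)                           # accept
        else:
            S = sorted(S + [y])                         # reject: refine the envelope
    return np.asarray(samples)

# Example: standard normal truncated to [-4, 4] (log-concave up to a constant).
draws = ars_sample(lambda x: -0.5 * x**2, lambda x: -x, -4.0, 4.0,
                   x_init=[-2.0, 0.5, 2.0], n_samples=2000)
print(draws.mean(), draws.std())   # roughly 0 and 1
```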
Monahan's Accept/Reject Method

Monahan's method generates a random variable X with CDF F(x) = H(G(x)) / H(1), where G is a CDF and H(x) = Σ_{n≥1} (−1)^{n+1} aₙ xⁿ, with 1 ≥ a₁ ≥ a₂ ≥ ⋯ ≥ 0.
J. F. Monahan, Extension of von Neumann's method for generating random variables, Mathematics of Computation, 33(147) (1979), 1065-1069.
Example: F(x) = 1 − cos(πx/2), 0 ≤ x ≤ 1.

To derive this note that:

cos x = Σ_{i=0}^∞ (−1)^i x^{2i} / (2i)!

so that

1 − cos(πx/2) = 1 − Σ_{i=0}^∞ (−1)^i (πx/2)^{2i} / (2i)! = Σ_{i=1}^∞ (−1)^{i+1} (π/2)^{2i} x^{2i} / (2i)!
= (π²/8) [ x² − (π²/48) x⁴ + ⋯ + (−1)^{i+1} (π^{2i−2} / (2^{2i−3} (2i)!)) x^{2i} + ⋯ ]

(for the denominator H(1) use x = 1: H(1) = 1 − cos(π/2) = 1).

Thus:   G(x) = x²,   H(x) = (π²/8) [ x − (π²/48) x² + ⋯ + (−1)^{i+1} (π^{2i−2} / (2^{2i−3} (2i)!)) xⁱ + ⋯ ].
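The rearrangement above is easy to verify numerically; the short check below compares partial sums of H(x²) against 1 − cos(πx/2):

```python
import numpy as np
from math import factorial, pi

def H(u, n_terms=12):
    """Partial sum of H(u) = (pi^2/8) * sum_i (-1)^{i+1} pi^{2i-2}/(2^{2i-3}(2i)!) u^i."""
    s = 0.0
    for i in range(1, n_terms + 1):
        s += (-1) ** (i + 1) * pi ** (2 * i - 2) / (2 ** (2 * i - 3) * factorial(2 * i)) * u**i
    return (pi**2 / 8.0) * s

x = np.linspace(0.0, 1.0, 5)
print(H(x**2))                    # should match F(x) = 1 - cos(pi x / 2)
print(1.0 - np.cos(pi * x / 2))   # note also H(1) = 1 - cos(pi/2) = 1
```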
Monahan’s Accept/Reject Method
F(x) = P(X ≤ x) = H(G(x)) / H(1),   where G is a CDF and H(x) = Σ_{n≥1} (−1)^{n+1} aₙ xⁿ, such that 1 ≥ a₁ ≥ a₂ ≥ ⋯ ≥ 0.
Repeat
  Generate X ~ G and set K = 1
  Repeat
    Generate U ~ G and V ~ U[0,1]
    If U ≤ X and V ≤ a_{K+1}/a_K, then K ← K + 1; otherwise stop (exit the inner loop)
Until K odd; return X
This can be shown simply using P(X ≤ x, Aₙ, A^c_{n+1}) = aₙ G(x)ⁿ − a_{n+1} G(x)^{n+1}, which at x = ∞ gives aₙ − a_{n+1}, so that

P(Accept X) = P(K odd) = a₁ − a₂ + a₃ − a₄ + ⋯ = Σ_{n≥1} aₙ (−1)^{n−1} = H(1).

Moreover,

P(X ≤ x | X returned) = P(X ≤ x, X returned) / P(X returned) = Σ_{n=1,3,5,…} P(X ≤ x, Aₙ, A^c_{n+1}) / P(X returned)

= [a₁ G(x) − a₂ G(x)² + a₃ G(x)³ − a₄ G(x)⁴ + ⋯] / H(1) = Σ_{n≥1} (−1)^{n+1} aₙ G(x)ⁿ / H(1),

so that

F(x) = P(X ≤ x | X returned) = H(G(x)) / H(1).