Chapter Two
GENERAL PRINCIPLES
IN RANDOM VARIATE GENERATION
1. INTRODUCTION.
In this chapter we introduce the reader to the fundamental principles in non-uniform random variate generation. This chapter is a must for the serious reader. On its own it can be used as part of a course in simulation.

These basic principles apply often, but not always, to both continuous and discrete random variables. For a structured development it is perhaps best to develop the material according to the guiding principle rather than according to the type of random variable involved. The reader is also cautioned that we do not make any recommendations at this point about generators for various distributions. All the examples found in this chapter are of a didactical nature; the most important families of distributions will be studied in more detail in chapters IX, X and XI.
Theorem 2.1.
Let F be a continuous distribution function on R with inverse F^{-1} defined by

    F^{-1}(u) = inf { x : F(x) = u } ,    0 < u < 1 .

If U is a uniform [0,1] random variable, then F^{-1}(U) has distribution function F. Also, if X has distribution function F, then F(X) is uniformly distributed on [0,1].
The second statement follows from the fact that for all 0 < u < 1,

    P(F(X) ≤ u) = P(X ≤ F^{-1}(u)) = F(F^{-1}(u)) = u .
Theorem 2.1 can be used to generate random variates with an arbitrary continuous distribution function F provided that F^{-1} is explicitly known. The faster the inverse can be computed, the faster we can compute X from a given uniform [0,1] random variate U. Formally, we have:

Generate a uniform [0,1] random variate U.
RETURN X ← F^{-1}(U)
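As a small illustration (our own, not part of the text), take the exponential distribution, where F(x) = 1 − e^{−x} and therefore F^{-1}(u) = −log(1−u):

```python
import math
import random

def exponential_inversion(rng: random.Random) -> float:
    """Inversion method: F(x) = 1 - exp(-x), so F^{-1}(u) = -log(1 - u)."""
    u = rng.random()           # uniform [0,1] random variate U
    return -math.log(1.0 - u)  # X <- F^{-1}(U)

rng = random.Random(0)
sample = [exponential_inversion(rng) for _ in range(100000)]
print(sum(sample) / len(sample))  # close to E(X) = 1
```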
In the next table, we give a few important examples. Often, the formulas can be simplified by replacing 1−U by U, since both are uniformly distributed on [0,1].
II.2. INVERSION METHOD
Cauchy(σ):
    density f(x) = σ/(π(σ²+x²))
    F(x) = 1/2 + (1/π) arctan(x/σ)
    solution of F(X)=U: X = σ tan(π(U−1/2)); simplified: X = σ tan(πU)

Rayleigh(σ):
    density f(x) = (x/σ²) e^{−x²/(2σ²)}, x ≥ 0
    F(x) = 1 − e^{−x²/(2σ²)}
    X = σ √(−2 log(1−U)); simplified: X = σ √(−2 log U)

Triangular on (0,a):
    density f(x) = (2/a)(1 − x/a), 0 ≤ x ≤ a
    F(x) = (x/a)(2 − x/a)
    X = a(1 − √(1−U)); simplified: X = a(1 − √U)

Tail of Rayleigh:
    density f(x) = x e^{−(x²−a²)/2}, x ≥ a
    F(x) = 1 − e^{−(x²−a²)/2}
    X = √(a² − 2 log(1−U)); simplified: X = √(a² − 2 log U)

Pareto(a,b):
    density f(x) = a bᵃ / x^{a+1}, x ≥ b
    F(x) = 1 − (b/x)ᵃ
    X = b (1−U)^{−1/a}; simplified: X = b U^{−1/a}
There are many areas in random variate generation where the inversion method is of particular importance. We cite four examples:
Unfortunately, the best transformation h, i.e. the one that minimizes V, depends upon the distribution of X. We can give the reader some insight in how to choose h by an example. Consider for example the class of transformations

    h(x) = (x−m) / (s + |x−m|) ,

where s > 0 and m ∈ R are constants. Thus, we have h^{-1}(y) = m + sy/(1−|y|), and

    V = E( (1/s)(s + |X−m|)² ) = s + 2 E(|X−m|) + (1/s) E((X−m)²) .

For symmetric random variables X, this expression is minimized by setting m = 0 and s = √(E(X²)). For asymmetric X, the minimization problem is very difficult. The next best thing we could do is minimize a good upper bound for V, such as the one provided by applying the Cauchy-Schwarz inequality,
    IF F(X) < U
        THEN a ← X
        ELSE b ← X
UNTIL b − a ≤ δ
RETURN X
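The bisection loop above can be sketched in full as follows (our own illustration, with an assumed example distribution F(x) = x² on [0,1], whose exact inverse is √u):

```python
def bisection_inversion(F, u, a, b, delta=1e-9):
    """Solve F(X) = u by bisection on an initial interval [a, b]
    known to contain the solution; stop when b - a <= delta."""
    while b - a > delta:
        x = (a + b) / 2.0
        if F(x) < u:
            a = x   # solution lies to the right of x
        else:
            b = x   # solution lies to the left of x
    return (a + b) / 2.0

# Example: F(x) = x^2 on [0,1]; the exact answer for u = 0.25 is 0.5.
x = bisection_inversion(lambda t: t * t, 0.25, 0.0, 1.0)
print(x)
```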
When u < 1/2, we argue by symmetry. Thus, information about moments and quantiles of F can be valuable for initial guesswork. For the Newton-Raphson method, we can often take an arbitrary point such as 0 as our initial guess.
The actual choice of an algorithm depends upon many factors such as:
(i) Guaranteed convergence.
(ii) Speed of convergence.
(iii) A priori information.
(iv) Knowledge of the density f.
If f is not explicitly known, then the Newton-Raphson method should be avoided, because the approximation of f(x) by (1/δ)(F(x+δ) − F(x)) is rather inaccurate because of cancellation errors.

Only the bisection method is guaranteed to converge in all cases. If F(X) = U has a unique solution, then the secant method converges too. By "convergence" we mean of course that the returned variable X* would approach the exact solution X if we let the number of iterations tend to ∞. The Newton-Raphson method converges when F is convex or concave. Often, the density f is unimodal with peak at m. Then, clearly, F is convex on (−∞, m], and concave on [m, ∞), and the Newton-Raphson method started at m converges.
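A sketch of Newton-Raphson inversion started at the mode (our own illustration, for the logistic distribution F(x) = 1/(1+e^{−x}), which has density f = F(1−F) and mode 0; the function names are ours):

```python
import math

def newton_inversion(F, f, u, x0=0.0, tol=1e-12, max_iter=100):
    """Solve F(X) = u by Newton-Raphson iterations
    X <- X - (F(X) - u)/f(X), starting from x0 (e.g. the mode)."""
    x = x0
    for _ in range(max_iter):
        step = (F(x) - u) / f(x)
        x -= step
        if abs(step) < tol:
            break
    return x

F = lambda x: 1.0 / (1.0 + math.exp(-x))  # logistic distribution function
f = lambda x: F(x) * (1.0 - F(x))         # logistic density
x = newton_inversion(F, f, 0.9)
print(x)  # exact answer is log(0.9/0.1) = log 9
```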
Let us consider the speed of convergence now. For the bisection method started at [a,b] = [g₁(U), g₂(U)] (where g₁, g₂ are given functions), we need N iterations if and only if

    2^{N−1} < (g₂(U) − g₁(U))/δ ≤ 2^N ,

i.e. N = ⌈log⁺((g₂(U) − g₁(U))/δ)⌉, where log⁺ is the positive part of the logarithm with base 2. From this expression, we retain that E(N) can be infinite for some long-tailed distributions. If the solution is known to belong to [−1,1], then we have deterministically

    N ≤ 1 + log⁺(1/δ) .

And in all cases in which E(N) < ∞, we have E(N) ~ log(1/δ) as δ↓0. Essentially, adding one bit of accuracy to the solution is equivalent to adding one iteration. As an example, let us take δ = 10⁻⁷, which corresponds to the standard choice for problems with solutions in [−1,1] when a 32-bit computer is used. The value of N in that case is in the neighborhood of 24, and this is often unacceptable.
The secant and Newton-Raphson methods are both faster, albeit less robust, than the bisection method. For a good discussion of the convergence and rate of convergence of the given methods, we refer to Ostrowski (1973). Let us merely state one of the results for E(N), the quantity of interest to us, where N is the number of iterations needed to get to within δ of the solution (note that this is impossible to verify while an algorithm is running!). Also, let F be the distribution function corresponding to a unimodal density with absolutely bounded derivative f'. The Newton-Raphson method started at the mode converges, and for some number N₀ depending only upon F (but possibly ∞) we have

    E(N) ≤ N₀ + log log(1/δ) ,

where all logarithms are base 2. For the secant method, a similar statement can be made, but the base should be replaced by the golden ratio, (1+√5)/2. In both cases, the influence of δ on the average number of iterations is practically nil, and the asymptotic expression for E(N) is smaller than in the bisection method (when δ↓0). Obviously, the secant and Newton-Raphson methods are not universally faster than the bisection method. For ways of accelerating these methods, see for example Ostrowski (1973, Appendix I, Appendix G).
where A(x) = Σ_{i=0}^{4} a_i x^i and B(x) = Σ_{i=0}^{4} b_i x^i, and the coefficients are as shown in the table below:

    i    a_i                       b_i
    0   -0.322232431088            0.0993484626060
    1   -1.0                       0.588581570495
    2   -0.342242088547            0.531103462366
    3   -0.0204231210245           0.103537752850
    4   -0.0000453642210148        0.0038560700634

For u in the range [1/2, 1−10⁻²⁰], we take −g(1−u), and for u in the two tiny leftover intervals near 0 and 1, the approximation should not be used. Rougher approximations can be found in Hastings (1955) and Bailey (1981). Bailey's approximation requires fewer constants and is very fast. The approximation of Beasley and Springer (1977) is also very fast, although not as accurate as the Odeh-Evans approximation given here. Similar methods exist for the inversion of beta and gamma distribution functions.
2.4. Exercises.
1. Most stopping rules for the numerical iterative solution of F(X) = U are of the type b − a ≤ δ, where [a,b] is an interval containing the solution X, and δ > 0 is a small number. These algorithms may never halt if for some u there is an interval of solutions of F(X) = u (this applies especially to the secant method). Let A be the set of all u for which we have F(x) = F(y) = u for some x < y. Show that P(U ∈ A) = 0, i.e. the probability of ending up in an infinite loop is zero. Thus, we can safely lift the restriction imposed throughout this section that F(X) = u has one solution for all u.
2. Show that the secant method converges if F(X) = U has one solution for the given value of U.
3. Show that if F(0) = 0 and F is concave on [0,∞), then the Newton-Raphson method started at 0 converges.
4. The solution of F(X) = U is contained in the interval

    [ tan((π/2)(U−1/2)) , tan(π(U−1/2)) ] .

Using this interval as a starting interval, compare and time the bisection method, the secant method and the Newton-Raphson method (in the latter method, start at 0 and keep iterating until X does not change in value any further). Finally, assume that we have an efficient Cauchy random variate generator at our disposal. Recalling that a Cauchy random variable C is distributed as tan(π(U−1/2)), show that we can generate X by solving the equation

    arctan X + X/(1+X²) = arctan C ,

and by starting with a suitable initial interval when C > 0 (use symmetry in the other case). Prove that this is a valid method.
5. Develop a general purpose random variate generator which is based upon inversion by the Newton-Raphson method, and which assumes only that F and the corresponding density f can be computed at all points, and that f is unimodal. Verify that your method is convergent. Allow the user to specify the mode if this information is available.
6. Write general purpose generators for the bisection and secant methods in which the user specifies an initial interval [g₁(U), g₂(U)].
7. Discuss how you would solve F(X) = U for X by the bisection method if no initial interval is available. In a first stage, you could look for an interval [a,b] which contains the solution X. In a second stage, you proceed by ordinary bisection until the interval's length drops below δ. Show that regardless of how you organize the original search (this could be by looking at adjacent intervals of equal length, or adjacent intervals with geometrically increasing lengths, or adjacent intervals growing as 2, 2², 2^{2²}, ...), the expected time taken by the entire algorithm is ∞ whenever E(log⁺|X|) = ∞. Show that for extrapolatory search, it is not a bad strategy to double the interval sizes. Finally, exhibit a distribution for which the given expected search time is ∞. (Note that for such distributions, the expected number of bits needed to represent the integer portion is infinite.)
8. An exponential class of distributions. Consider the distribution function F(x) = 1 − e^{−A_n(x)}, where A_n(x) = Σ_{i=1}^{n} a_i x^i for x ≥ 0 and A_n(x) = 0 for x < 0. Assume that all coefficients a_i are nonnegative and that a₁ > 0. If U is a uniform [0,1] random variate, and E is an exponential random variate, then it is easy to see that the solution of 1 − e^{−A_n(X)} = U is distributed as the solution of A_n(X) = E. The basic Newton-Raphson step for the solution of the second equation is

    X ← X − (A_n(X) − E) / A_n'(X) .

Since a₁ > 0 and A_n is convex, any starting point X ≥ 0 will yield a convergent sequence of values. We can thus start at X = 0 or at X = E/a₁ (which is the first value obtained in the Newton-Raphson sequence started at 0). Compare this algorithm with the algorithm in which X is generated as
function

    G(x) = 0 for x < a ,    G(x) = 1 for x > b .

B. U/(1−U) is distributed as the ratio of two iid exponential random variables.
C. We say that a random variable Z has the extremal value distribution with parameter a when F(x) = e^{−a e^{−x}}. If X is distributed as Z with parameter Y, where Y is exponentially distributed, then X has the standard logistic distribution.
D. E(X²) = π²/3, and E(X⁴) = 7π⁴/15.
E. If X₁, X₂ are independent extremal value distributed random variables with the same parameter a, then X₁ − X₂ has a logistic distribution.
II.3. REJECTION METHOD
3.1. Definition.
The rejection method is based upon the following fundamental property of densities:

Theorem 3.1.
Let X be a random vector with density f on R^d, and let U be an independent uniform [0,1] random variable. Then (X, cUf(X)) is uniformly distributed on A = {(x,u): x ∈ R^d, 0 ≤ u ≤ cf(x)}, where c > 0 is an arbitrary constant. Vice versa, if (X,U) is a random vector in R^{d+1} uniformly distributed on A, then X has density f on R^d.

Since the area of A is c, we have shown the first part of the theorem. The second part follows if we can show that for all Borel sets B of R^d, P(X ∈ B) = ∫_B f(x) dx (recall the definition of a density). But

    P(X ∈ B) = P( (X,U) ∈ B* ) ,    B* = {(x,u): x ∈ B, 0 ≤ u ≤ cf(x)} ,

and since (X,U) is uniformly distributed on A, this equals (1/c) ∫_B ∫_0^{cf(x)} du dx = ∫_B f(x) dx .
Theorem 3.2.
Let X₁, X₂, ... be a sequence of iid random vectors taking values in R^d, and let A ⊆ R^d be a Borel set such that P(X₁ ∈ A) = p > 0. Let Y be the first X_i taking values in A. Then Y has a distribution that is determined by

    P(Y ∈ B) = P(X₁ ∈ A ∩ B) / p ,    B Borel set of R^d .

In particular, if X₁ is uniformly distributed in A₀, where A₀ ⊇ A, then Y is uniformly distributed in A.
The basic version of the rejection algorithm assumes the existence of a density g and the knowledge of a constant c ≥ 1 such that

    f(x) ≤ c g(x)    (all x) .

REPEAT
    Generate two independent random variates X (with density g on R^d) and U (uniformly distributed on [0,1]).
UNTIL U c g(X) ≤ f(X)
RETURN X
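The basic rejection loop can be sketched as follows (our own didactic example, not from the text: f(x) = 2x on [0,1], dominated by the uniform density g with c = 2):

```python
import random

def rejection_sample(f, g_sample, g_pdf, c, rng):
    """Basic rejection: repeat generating X with density g and U uniform
    [0,1] until U * c * g(X) <= f(X), then return X."""
    while True:
        x = g_sample(rng)
        u = rng.random()
        if u * c * g_pdf(x) <= f(x):
            return x

rng = random.Random(1)
# f(x) = 2x on [0,1]; dominating density g = uniform [0,1], c = 2.
sample = [rejection_sample(lambda x: 2.0 * x, lambda r: r.random(),
                           lambda x: 1.0, 2.0, rng) for _ in range(50000)]
print(sum(sample) / len(sample))  # E(X) = 2/3 for the density 2x
```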
where p = 1/c is the probability of acceptance in a single iteration. Thus, E(N) = 1/p = c, E(N²) = (2−p)/p², and Var(N) = (1−p)/p² = c² − c. In other words, E(N) is one over the probability of accepting X. From this we conclude that we should keep c as small as possible. Note that the distribution of N is geometric with parameter p = 1/c. This is good, because the probabilities P(N = n) decrease geometrically in n.
REPEAT
    Generate two independent uniform [0,1] random variates U and V.
    Set X ← a + (b−a)V.
UNTIL UM ≤ f(X)
RETURN X

The reader should be warned here that this algorithm can be horribly inefficient, and that the choice of a constant dominating curve should be avoided except in a few cases.
Although the first approach often gives quick results (see Example 3.2 below), it is ad hoc, and depends a lot on the mathematical background and insight of the designer. In a second approach, which is also illustrated in this section, one starts with a family of dominating densities g and chooses the density within that class for which c is smallest. This approach is more structured, but could sometimes lead to difficult optimization problems.
Thus,

    (1/√(2π)) e^{−x²/2} ≤ (1/√(2π)) e^{1/2 − |x|} = √(2e/π) · (1/2) e^{−|x|} = c g(x) ,

where g(x) = (1/2) e^{−|x|} is the Laplace density and c = √(2e/π).
REPEAT
    Generate an exponential random variate X and two independent uniform [0,1] random variates U and V. If U < 1/2, set X ← −X (X is now distributed as a Laplace random variate).
UNTIL V √(e/(2π)) e^{−|X|} ≤ (1/√(2π)) e^{−X²/2}
RETURN X
The condition in the UNTIL statement can be cleaned up. The constant 1/√(2π) cancels out on the left and right hand sides. It is also better to take logarithms on both sides. Finally, we can move the sign change to the RETURN statement, because there is no need for a sign change of a random variate that will be rejected. The random variate U can also be avoided by the trick implemented in the algorithm given below.
REPEAT
    Generate an exponential random variate X and an independent uniform [−1,1] random variate V.
UNTIL (X−1)² ≤ −2 log |V|
RETURN X ← X sign(V)
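This cleaned-up normal generator translates directly; a sketch (our own, with the exponential variate obtained by inversion):

```python
import math
import random

def normal_via_exponential(rng):
    """Normal generator by rejection from the Laplace density:
    accept an exponential X when (X-1)^2 <= -2 log|V|, with V uniform
    on [-1,1]; the sign of V gives the sign of the returned variate."""
    while True:
        x = -math.log(1.0 - rng.random())  # exponential random variate
        v = 2.0 * rng.random() - 1.0       # uniform [-1,1] random variate
        if v == 0.0:
            continue                       # avoid log(0)
        if (x - 1.0) ** 2 <= -2.0 * math.log(abs(v)):
            return math.copysign(x, v)

rng = random.Random(2)
sample = [normal_via_exponential(rng) for _ in range(50000)]
m = sum(sample) / len(sample)
s2 = sum(v * v for v in sample) / len(sample)
print(m, s2)  # mean near 0, second moment near 1
```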
The optimal θ is that for which c_θ is minimal, i.e. for which c_θ is closest to 1. We will now illustrate this optimization process with an example. For the sake of argument, we take once again the normal density f. The family of dominating densities is the Cauchy family with scale parameter θ:
This gives the values x = 0 and x = ±√(2−θ²) (the latter case can only happen when θ ≤ √2). The minimum is attained at θ = 1, for which c_θ = √(2π/e). The acceptance condition U c g(X) ≤ f(X) can then be written as

    U √(2π/e) (1/π) 1/(1+X²) ≤ (1/√(2π)) e^{−X²/2} ,

or as

    U ≤ (√e/2) (1+X²) e^{−X²/2} .
[SET-UP]
a ← √e/2

[GENERATOR]
REPEAT
    Generate two independent uniform [0,1] random variates U and V.
    Set X ← tan(πV), S ← X² (X is now Cauchy distributed).
UNTIL U ≤ a (1+S) e^{−S/2}
RETURN X
REPEAT
    Generate independent random variates X, U, where X has density g and U is uniformly distributed on [0,1].
UNTIL U ≤ ψ(X)
RETURN X
Vaduva (1977) observed that for special forms of ψ there is another way of proceeding. This occurs when ψ = 1 − Ψ, where Ψ is the distribution function of an easy density.

REPEAT
    Generate two independent random variates X, Y, where X has density g and Y has distribution function Ψ.
UNTIL X < Y
RETURN X

The choice between generating U and computing 1 − Ψ(X) on the one hand (the original rejection algorithm) and generating Y with distribution function Ψ on the other hand (Vaduva's method) depends mainly upon the relative speeds of computing a distribution function and generating a random variate with that distribution.
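A sketch contrasting the two variants (our own example, not from the text): take ψ(x) = e^{−x}, so Ψ is the standard exponential distribution function, and let g be the uniform density on [0,1]; the target density is then proportional to e^{−x} on [0,1]:

```python
import math
import random

def thinned_by_uniform(rng):
    """Original form: accept X ~ g when U <= psi(X), psi(x) = exp(-x)."""
    while True:
        x = rng.random()                 # X with density g = uniform [0,1]
        if rng.random() <= math.exp(-x):
            return x

def thinned_by_vaduva(rng):
    """Vaduva's form: psi = 1 - Psi with Psi exponential, so accept
    X < Y where Y is exponential (no evaluation of exp at X)."""
    while True:
        x = rng.random()
        y = -math.log(1.0 - rng.random())  # Y with distribution function Psi
        if x < y:
            return x

rng = random.Random(3)
a = [thinned_by_uniform(rng) for _ in range(50000)]
b = [thinned_by_vaduva(rng) for _ in range(50000)]
print(sum(a) / len(a), sum(b) / len(b))  # both near (1-2/e)/(1-1/e)
```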
Example 3.3.
Consider the density

REPEAT
    Generate two iid uniform [0,1] random variates U, V, and set X ← V^{−1}.
UNTIL U ≤ e^{−X}
RETURN X
Theorem 3.4.
Assume that a density f on R^d can be decomposed as follows:

    f(x) = ∫ g(y,x) h(y,x) dy ,

where g(y,x) is a density in y for all x, and there exists a function H(x) such that 0 ≤ h(y,x) ≤ H(x) for all y, and H/∫H is an easy density. Then the following algorithm produces a random variate with density f, and takes N iterations, where N is geometrically distributed with parameter 1/∫H (and thus E(N) = ∫H).

REPEAT
    Generate X with density H/∫H (on R^d).
    Generate Y with density g(y,X) (X is fixed).
    Generate a uniform [0,1] random variate U.
UNTIL U H(X) ≤ h(Y,X)
RETURN X

We note first that the probability of acceptance in a single iteration is p = 1/∫H. But then, clearly, the variate produced by the algorithm has density f as required.
Consider sums of the form

    Σ_{i=1}^{N} φ(W_i) ,

where W_i is the collection of all random variables used in the i-th iteration of the rejection algorithm, φ is some function, and N is the number of iterations of the rejection method. The random variable N is known as a stopping rule because the probabilities P(N = n) are equal to the probabilities that W₁, ..., W_n belong to some set B_n. The interesting fact is that, regardless of which stopping rule is used (i.e., whether we use the one suggested in the rejection method or not), as long as the W_i's are iid random variables, the following remains true (Wald's equation):

    E( Σ_{i=1}^{N} φ(W_i) ) = E(N) E(φ(W₁)) .
Writing Z_i for φ(W_i), the left hand side is

    Σ_{i=1}^{∞} E( Z_i I_{[N ≥ i]} )
    = Σ_{i=1}^{∞} E(Z_i) P(N ≥ i)
    = E(Z₁) Σ_{i=1}^{∞} P(N ≥ i)
    = E(Z₁) E(N) .

The exchange of the expectation and infinite sum is allowed by the monotone convergence theorem: just note that for any sequence of nonnegative random variables Z_i, the partial sums Σ_{i=1}^{n} Z_i increase monotonically to Σ_{i=1}^{∞} Z_i.
It should be noted that for the rejection method we have a special case for which a shorter proof can be given, because our stopping rule N is an instantaneous stopping rule: we define a number of decisions D_i, all 0 or 1 valued and dependent upon W_i only: D₁ = 0 indicates that we "reject" based upon W₁, etcetera. A 1 denotes acceptance. Thus, N is equal to n if and only if D_n = 1 and D_i = 0 for all i < n. Now,

    E( Σ_{i=1}^{N} φ(W_i) )
    = E( Σ_{i<N} φ(W_i) ) + E( φ(W_N) )
    = E(N−1) E( φ(W₁) | D₁ = 0 ) + E( φ(W₁) | D₁ = 1 ) .

In particular, for φ(W_i) = I_{[U_i ∈ B]},

    Σ_{n=1}^{∞} P(N ≥ n) P(U₁ ∈ B) = E(N) P(U₁ ∈ B) .
There are quite a few algorithms that fall into this category. In particular, if we use rejection with a constant dominating curve on [0,1], then we use N uniform random variates, where for continuous f,

    E(N) ≥ sup_x f(x) .

We have seen that in the rejection algorithm, we come within a factor of 2 of this lower bound. If the U_i's have density g on the real line, then we can construct stopping times for all densities f that are absolutely continuous with respect to g, and the lower bound reads

    E(N) ≥ ess sup (f/g) .

For continuous f/g, the lower bound is equal to sup f/g, of course. Again, with the rejection method with g as dominating density, we come within a factor of 2 of the lower bound.
There is another class of algorithms that fits the description given here, notably the Forsythe-von Neumann algorithms, which will be presented in section IV.2.
REPEAT
    Generate a uniform [0,1] random variate U.
    Generate a random variate X with density g.
    Set W ← U c g(X).
    Accept ← [W ≤ h₁(X)].
    IF NOT Accept
        THEN IF W ≤ h₂(X) THEN Accept ← [W ≤ f(X)].
UNTIL Accept
RETURN X
Here we used the fact that we have proper sandwiching, i.e. 0 ≤ h₁ ≤ f ≤ h₂ ≤ cg. If h₁ = 0 and h₂ = cg (i.e., we have no squeezing), then we obtain the result E(N₁) = c for the rejection method. With only a quick acceptance step (i.e. h₂ = cg), we have E(N₁) = c − ∫h₁. When the conditions h₁ ≥ 0 and/or h₂ ≤ cg are violated, the equality in the expression for E(N₁) should be replaced by an inequality (exercise 3.13).
    e^{−x} = 1 − x/1! + x²/2! − ... + (−x)^{n−1}/(n−1)! + e^{−ξ} (−x)^n/n! ,

where ξ is a number in the interval [0,x] (or [x,0], depending upon the sign of x). From this, by inspection of the last term, one can obtain inequalities which are polynomials, and thus prime candidates for h₁ and h₂. For example, we see that for x ≥ 0, e^{−x} is sandwiched between consecutive partial sums of the well-known expansion

    e^{−x} = Σ_{i=0}^{∞} (−1)^i x^i / i! .

In particular,
f ≤ cg, where c = √(2π/e) and

    g(x) = 1/(π(1+x²)) .

For h₁ and h₂ we should look for simple functions of x. Applying the Taylor series technique described above, we see that

    1 − x²/2 ≤ √(2π) f(x) ≤ 1 − x²/2 + x⁴/8 .
Using the lower bound for h₁, we can now accelerate our normal random variate generator somewhat:

REPEAT
    Generate a uniform [0,1] random variate U.
    Generate a Cauchy random variate X.
    Set W ← 2U/(√e (1+X²)). (Note: W ← U c g(X) √(2π).)
    Accept ← [W ≤ 1 − X²/2].
    IF NOT Accept THEN Accept ← [W ≤ e^{−X²/2}].
UNTIL Accept
RETURN X
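A sketch of this squeezed generator (our own translation of the steps above; the Cauchy variate is obtained as tan(πV)):

```python
import math
import random

def normal_via_cauchy_squeeze(rng):
    """Normal generator by rejection from the Cauchy density, with a
    quick acceptance step based on 1 - x^2/2 <= exp(-x^2/2)."""
    while True:
        u = rng.random()
        x = math.tan(math.pi * rng.random())            # Cauchy random variate
        w = 2.0 * u / (math.sqrt(math.e) * (1.0 + x * x))  # W <- U c g(X) sqrt(2 pi)
        if w <= 1.0 - x * x / 2.0:                      # squeeze: no exp() needed
            return x
        if w <= math.exp(-x * x / 2.0):                 # full acceptance test
            return x

rng = random.Random(4)
sample = [normal_via_cauchy_squeeze(rng) for _ in range(50000)]
m = sum(sample) / len(sample)
s2 = sum(v * v for v in sample) / len(sample)
print(m, s2)  # mean near 0, second moment near 1
```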
This algorithm can be improved in many directions. We have already got rid of the annoying normalization constant √(2π). For |X| > √2, the quick acceptance step is useless in view of h₁(X) < 0. Some further savings in computer time result if we work with Y ← X²/2 throughout. The expected number of computations of f is
REPEAT
    Generate a uniform [0,1] random variate U.
    Generate a random variate X with density g.
    Accept ← [U ≤ b/c].
    IF NOT Accept THEN Accept ← [U ≤ f(X)/(c g(X))].
UNTIL Accept
RETURN X
Here g is another function, typically with small integral. We could implement the rejection method with h+g as dominating curve, and apply a squeeze step based upon f ≥ h−g. After some simplifications, this leads to the following algorithm:
REPEAT
    Generate a random variate X with density proportional to h+g, and a uniform [0,1] random variate U.
    Accept ← [ g(X)/h(X) ≤ (1−U)/(1+U) ].
    IF NOT Accept THEN Accept ← [ U (g(X)+h(X)) ≤ f(X) ].
UNTIL Accept
RETURN X
This algorithm has rejection constant 1 + ∫g, and the expected number of evaluations of f is at most 2∫g. Algorithms of this type are mainly used when g has a very small integral. One instance is when the starting absolute deviation inequality is known from the study of limit theorems in mathematical statistics. For example, when f is the gamma(n) density normalized to have zero mean and unit variance, it is known that f tends to the normal density as n→∞. This convergence is studied in more detail in local central limit theorems (see e.g. Petrov (1975)). One of the by-products of this theory is an inequality of the form needed by us, where g is a function depending upon n, with integral decreasing at the rate 1/√n as n→∞. The rejection algorithm would thus have improved performance as n→∞. What is intriguing here is that this sort of inequality is not limited to the gamma density, but applies to densities of sums of iid random variables satisfying certain regularity conditions. In one sweep, one could thus design general algorithms for this class of densities. See also sections XIV.3.3 and XIV.4.
In this example, we merely want to make a point about our idealized model. Recycling can be (and usually is) dangerous on finite-precision computers. When f is close to cg, as in most good rejection algorithms, the upper portion of U (i.e. (V−1)/(T−1) in the notation of the algorithm) should not be recycled, since T−1 is close to 0. The bottom part is more useful, but this is at the expense of less readable algorithms. All programs should be set up as follows: a uniform random variate should be provided upon input, and the output consists of the returned random variate and another uniform random variate. The input and output random variates are dependent, but it should be stressed that the returned random variate X and the recycled uniform random variate are independent! Another argument against recycling is that it requires a few multiplications and/or divisions. Typically, the time taken by these operations is longer than the time needed to generate one good uniform [0,1] random variate. For all these reasons, we do not pursue the recycling principle any further.
3.8. Exercises.
1. Let f and g be easy densities for which we have subprograms for computing f(x) and g(x) at all x ∈ R^d. These densities can be combined into other densities in several manners, e.g.

    h = c max(f,g)
    h = c min(f,g)
    h = c fg
    h = c |f−g| ,

where c is the appropriate normalization constant.
2. Consider the density h formed from f(x) = (3/4)(1−x²) and g(x) = 1/2, both for |x| ≤ 1. Thus, h is in one of the forms specified in exercise 3.1. Give a complete algorithm and analysis for generating random variates with density h by the general method of exercise 3.1.
3. The algorithm

REPEAT
    Generate X with density g.
    Generate an exponential random variate E.
UNTIL h(X) ≤ E
RETURN X
Lux's algorithm:

REPEAT
    Generate a random variate X with density g.
    Generate a random variate Y with distribution function F.
UNTIL Y < r(X)
RETURN X

The probability of acceptance in one iteration of this algorithm is ∫ F(r(x)) g(x) dx .
6. The following density on [0,∞) has both an infinite peak at 0 and a heavy tail:

    f(x) = 1/(π √x (1+x))    (x > 0) .

Consider as a possible candidate for a dominating curve c_θ g_θ, where

    Cauchy(θ):    g_θ(x) = θ/(π(θ²+x²))
    Logistic(θ):  g_θ(x) = θ e^{−θx}/(1+e^{−θx})²
and

    c = Γ(a+3/2) / (√π Γ(a+1)) ,

and the inequalities

    c e^{−ax²/(1−x²)} ≤ f(x) ≤ c e^{−ax²}    (|x| ≤ 1) .

The following rejection algorithm with squeezing can be used:

REPEAT
    REPEAT
        Generate a normal random variate X.
        Set X ← X/√(2a), Y ← X².
    UNTIL Y ≤ 1
    Generate an exponential random variate E.
    Accept ← [1 − Y(1 + (a/E)Y) ≥ 0].
    IF NOT Accept THEN Accept ← [aY + E + a log(1−Y) ≥ 0].
UNTIL Accept
RETURN X
D. Write a generator which works for all a > −1. (This requires yet another solution for a in the range (−1,0).)
E. Random variates from f can also be obtained in other ways. Show that all of the following recipes are valid:
(i) S√B, where B is beta(1/2, a+1) and S is a random sign.
(ii) S√(Y/(Y+Z)), where Y, Z are independent gamma(1/2,1) and gamma(a+1,1) random variates, and S is a random sign.

Show that the algorithm is valid (relate it to the rejection method). Relate the expected number of X's generated before halting to ||f||_∞, the essential supremum of f. Among other things, conclude that the expected time is ∞ for every unbounded density. Compare the expected number of X's with Letac's lower bound. Show also that if inversion by sequential search is used for generating Z, then the expected number of iterations in the search before halting is finite if and only if ∫f² < ∞. A final note: usually, one does not have a cumulative mass function for an arbitrary density f.
II.4. DISCRETE MIXTURES
4.1. Definition.
If our target density f can be decomposed into a discrete mixture

    f(x) = Σ_{i=1}^{∞} p_i f_i(x) ,

where the f_i's are given densities and the p_i's form a probability vector (i.e., p_i ≥ 0 for all i and Σ_i p_i = 1), then random variates can be obtained as follows:

Generate a discrete random variate Z with P(Z = i) = p_i.
RETURN X, where X is a random variate with density f_Z.

This algorithm is incomplete, because it does not specify just how Z and X are generated. Every time we use the general form of the algorithm, we will say that the composition method is used.
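The composition method can be sketched as follows (our own two-component example, not from the text: f = 0.3·uniform[0,1] + 0.7·exponential):

```python
import math
import random

def composition_sample(rng):
    """Composition method: pick component Z with probability p_Z,
    then return X with density f_Z."""
    p = [0.3, 0.7]
    if rng.random() < p[0]:
        return rng.random()                  # f_1: uniform [0,1] density
    return -math.log(1.0 - rng.random())     # f_2: exponential density

rng = random.Random(5)
sample = [composition_sample(rng) for _ in range(100000)]
print(sum(sample) / len(sample))  # E(X) = 0.3*0.5 + 0.7*1.0 = 0.85
```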
We will show in this section how the decomposition method can be applied in the design of good generators, but we will not at this stage address the problem of the generation of the discrete random variate Z. Rather, we are interested in the decomposition itself. It should be noted, however, that in many, if not most, practical situations we have a finite mixture with K components.
Thus, with f₁(x) = 1/(2θ), |x| ≤ θ, where θ is the width of the centered rectangle, we see that at best we can set

    p₁ = 2θ (1/√(2π)) e^{−θ²/2} .

The function p₁ is maximal (as a function of θ) when θ = 1, and the corresponding value is √(2/(πe)) ≈ 0.48. Of course, this weight is not close to 1, and the present decomposition seems hardly useful. The work involved when we decompose in terms of several rectangles and triangles is basically not different from the short analysis done here.
REPEAT
    Generate a discrete random variate Z with probability vector proportional to p₁, ..., p_K on {1, ..., K}.

In the second algorithm we use the rejection method with h₁ I_{A₁} + ... + h_K I_{A_K} as dominating curve, and use the composition method for random variates from the dominating density. In contrast, the first algorithm uses true decomposition. After having selected a component with the correct probability, we then use the rejection method. A brief comparison of both algorithms is in order here. This can be done in terms of four quantities: N_Z, N_U, N_h and N_f, where N_• is the number of random variates required of the type specified by the index, with the understanding that N_h refers to Σ_{i=1}^{K} h_i, i.e. it is the total number of random variates needed from any one of the K dominating densities.
Theorem 4.1.
Let q = Σ_{i=1}^{K} q_i, and let N be the number of iterations in the second algorithm. For the second algorithm we have N_U = N_Z = N_h = N, and N is geometrically distributed with parameter 1/q. In particular,

    E(N) = q ;    E(N²) = 2q² − q .

For the first algorithm, we have N_Z = 1. Also, N_U = N_h satisfy

    E(N_U) = q ;    E(N_U²) = 2 Σ_{i=1}^{K} q_i²/p_i − q .

To show that the last expression is always greater than or equal to 2q² − q, we use the Cauchy-Schwarz inequality:

    q² = ( Σ_i (q_i/√p_i) √p_i )² ≤ ( Σ_i q_i²/p_i ) ( Σ_i p_i ) = Σ_i q_i²/p_i .

Finally, we consider E(N_f). For the first algorithm, its expected value is Σ_i p_i (q_i/p_i) = q. For the second algorithm, we employ Wald's equality after noting
where the c_i's are constants and K is a positive integer. Densities with polynomial forms are important further on as building blocks for constructing piecewise polynomial approximations of more general densities. If K is 0 or 1, we have the uniform and triangular densities, and random variate generation is no problem. There is also no problem when the c_i's are all nonnegative. To see this, we observe that the distribution function F is a mixture of the form

    F(x) = Σ_{i=1}^{K+1} p_i x^i ,    p_i = c_{i−1}/i ,

where of course Σ_{i=1}^{K+1} p_i = 1. Since x^i is the distribution function of the maximum of i iid uniform [0,1] random variables, we can proceed as follows:
Generate a discrete random variate Z where P(Z = i) = c_{i−1}/i , 1 ≤ i ≤ K+1.
RETURN X, where X is generated as max(U₁, ..., U_Z) and the U_i's are iid uniform [0,1] random variates.
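For nonnegative coefficients this recipe translates directly; a sketch (our own illustration, for the assumed example f(x) = 0.5 + x on [0,1], so p₁ = p₂ = 1/2):

```python
import random

def poly_density_sample(rng, c):
    """Generate from f(x) = sum_i c[i] x^i on [0,1] (all c[i] >= 0):
    pick Z = i+1 with probability c[i]/(i+1), return max of Z uniforms."""
    p = [ci / (i + 1) for i, ci in enumerate(c)]   # probability vector; sums to 1
    r, z = rng.random(), len(p)
    for i, pi in enumerate(p):
        r -= pi
        if r <= 0:
            z = i + 1
            break
    return max(rng.random() for _ in range(z))

rng = random.Random(6)
# f(x) = 0.5 + x on [0,1]; E(X) = integral of x(0.5+x) = 1/4 + 1/3 = 7/12.
sample = [poly_density_sample(rng, [0.5, 1.0]) for _ in range(100000)]
print(sum(sample) / len(sample))
```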
We have a nontrivial problem on our hands when one or more of the c_i's are negative. The solution given here is due to Ahrens and Dieter (1974), and can be applied whenever c₀ + Σ_{i: c_i<0} c_i ≥ 0. They decompose f as follows: let A be the collection of integers in {0, ..., K} for which c_i ≥ 0, and let B be the collection of indices in {0, ..., K} for which c_i < 0. Then we have

    f(x) = Σ_{i=0}^{K} c_i x^i
Lemma 4.1.
Let U₁, U₂, ... be iid uniform [0,1] random variables.
A. For a > 1, U₁^{1/a} U₂ has density

    (a/(a−1)) (1 − x^{a−1})    (0 ≤ x ≤ 1) .

B. Let L be the index of the first U_i not equal to max(U₁, ..., U_n), for n ≥ 2. Then U_L has density

    (n/(n−1)) (1 − x^{n−1})    (0 ≤ x ≤ 1) .
We are now in a position to give more details of the polynomial density algorithm of Ahrens and Dieter.

[SET-UP]
Compute the probability vector p₀, p₁, ..., p_K from c₀, ..., c_K according to the formulas given above. For each i ∈ {0, 1, ..., K}, store the membership of i (i ∈ A if c_i ≥ 0 and i ∈ B otherwise).

[GENERATOR]
Generate a discrete random variate Z with probability vector p₀, p₁, ..., p_K.
IF Z ∈ A
    THEN RETURN X ← U^{1/(Z+1)} (or X ← max(U₁, ..., U_{Z+1}), where U, U₁, ... are iid uniform [0,1] random variates).
    ELSE RETURN X ← U₁^{1/(Z+2)} U₂ (or X ← U_L, where L is the U_i with the lowest index not equal to max(U₁, ..., U_{Z+2})).
    f(x) = Σ_{i=1}^{∞} p_i f_i(x) ,

where the f_i's are densities, but the p_i's are real numbers summing to one. A general algorithm for these densities was given by Bignami and de Matteis (1971). It uses the fact that if p_i is decomposed into its positive and negative parts, p_i = p_i⁺ − p_i⁻, then

    f(x) ≤ Σ_{i=1}^{∞} p_i⁺ f_i(x) .

REPEAT
    Generate a random variate X with density Σ_{i=1}^{∞} p_i⁺ f_i / Σ_{i=1}^{∞} p_i⁺.
    Generate a uniform [0,1] random variate U.
UNTIL U Σ_{i=1}^{∞} p_i⁺ f_i(X) ≤ f(X)
RETURN X
The rejection constant here is c = Σ_{i=1}^∞ p_i^+. The algorithm is thus not valid when this constant is ∞. One should observe that for this algorithm, the rejection constant is probably not a good measure of the expected time taken by it. This is due to the fact that the time needed to verify the acceptance condition can be very large. For finite mixtures, or mixtures that are such that for every x, only a finite number of the f_i(x)'s are nonzero, we are in good shape. In all cases, it is often possible to accept or reject after having computed just a few terms in the series, provided that we have good analytical estimates of the tail sums of the series. Since this is the main idea of the series method of section IV.5, it will not be pursued here any further.
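For a finite signed mixture, the rejection scheme above can be sketched in Python as follows. The helpers `samplers` (draws from f_i), `densities` (evaluates f_i), and `f` (evaluates the target), as well as the function name, are our own caller-supplied conventions, not the book's:

```python
import random

def signed_mixture_variate(p, samplers, densities, f):
    # Rejection sketch for f(x) = sum p[i] f_i(x) with some p[i] < 0.
    # Dominating density: sum p[i]^+ f_i / c, where c = sum p[i]^+ is
    # the rejection constant.
    pplus = [max(q, 0.0) for q in p]
    c = sum(pplus)
    while True:
        # pick component i with probability p[i]^+ / c, then draw X from f_i
        u, i, acc = random.random() * c, 0, pplus[0]
        while u > acc:
            i += 1
            acc += pplus[i]
        x = samplers[i]()
        # accept X with probability f(x) / (sum p[i]^+ f_i(x))
        dom = sum(q * d(x) for q, d in zip(pplus, densities))
        if random.random() * dom <= f(x):
            return x
```

With p = [3/2, -1/2], f_1 uniform on [-1,1] and f_2(x) = 3x^2/2, this reduces to the algorithm of Example 4.1 below.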
Example 4.1.
The density f(x) = (3/4)(1 - x^2), |x| ≤ 1, can be written as

    f(x) = (3/2) ((1/2) I_[-1,1](x)) - (1/2) ((3x^2/2) I_[-1,1](x)) .

The algorithm given above is then
REPEAT
    Generate a uniform [-1,1] random variate X.
    Generate a uniform [0,1] random variate U.
UNTIL U ≤ 1 - X^2
RETURN X
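In Python, the loop of Example 4.1 reads as follows (a direct transcription; the function name is ours):

```python
import random

def parabola_variate():
    # f(x) = (3/4)(1 - x^2) on [-1,1]; the dominating density is uniform
    # on [-1,1] with rejection constant 3/2, so the acceptance test
    # reduces to U <= 1 - X^2.
    while True:
        x = random.uniform(-1.0, 1.0)
        u = random.random()
        if u <= 1.0 - x * x:
            return x
```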
5.1. Definition.
Let f be a given density on R^d which can be decomposed into a sum of two nonnegative functions:

    f(x) = f_1(x) + f_2(x) .

Assume furthermore that there exists an easy density g such that f_1 ≤ g. Then the following algorithm can be used to generate a random variate X with density f:

    Generate a random variate X with density g.
    Generate a uniform [0,1] random variate U.
    IF U g(X) > f_1(X)
        THEN generate a random variate X with density f_2/p, where p = ∫ f_2(x) dx.
    RETURN X
In general, we gain if we can RETURN the first X generated in the algorithm. Thus, it seems that we should try to maximize its probability of acceptance, 1 - p. Deak (1981) calls this the economical method. Usually, g is an easy density close to f. It should be obvious that generation from the leftover density (f - g)^+/p can be problematic. If there is some freedom in the design (i.e. in the choice of g), we should try to minimize p. This simple acceptance-complement method has been used for generating gamma and t variates (see Ahrens and Dieter (1981, 1983) and Stadlober (1981) respectively). One of the main technical obstacles encountered (and overcome) by these authors was the determination of the set on which f(x) ≥ g(x). If we have two densities that are very close, we must first verify where they cross. Often this leads to complicated equations whose solutions can only be determined numerically. These problems can be sidestepped by exploiting the added flexibility of the general acceptance-complement method.
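The basic acceptance-complement step can be sketched in Python as follows, with caller-supplied helpers for g and for the normalized leftover density f_2/p (all four argument names are our own conventions):

```python
import random

def acceptance_complement(g_sampler, g_density, f1, f2_sampler):
    # Sketch of the acceptance-complement method: f = f1 + f2, f1 <= g.
    # g_sampler/g_density describe the easy density g, f1 evaluates f1(x),
    # and f2_sampler draws from f2 / p with p = integral of f2.
    x = g_sampler()
    u = random.random()
    if u * g_density(x) <= f1(x):
        return x            # accepted: no loop in this method
    return f2_sampler()     # complement: X from the normalized leftover f2
```

When g is uniform on [-1,1] and f_2 is constant, as for the truncated Cauchy density, the complement draw is again just a uniform on [-1,1].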
II.5. ACCEPTANCE-COMPLEMENT METHOD
IF U > h(X)
    THEN IF U ≤ h*(X)
        RETURN X

is large.

The condition imposed on the class of densities follows from the fact that we must ask that f_2 be nonnegative. The algorithm now becomes:
    Generate a uniform [-1,1] random variate X.
    Generate a uniform [0,1] random variate U.
    IF U > 2 f(X) + 1 - 2 f_max
        THEN generate a new uniform [-1,1] random variate X.
    RETURN X
Our conditions are satisfied because f_max = 2/π and the infimum of f is 1/π, the difference being smaller than 1/2. In this case, the expected number of uniform random variates needed is 1 + 4/π. Next, note that if we can generate a random variate X with density f, then a standard Cauchy random variate can be obtained by exploiting the property that the random variate Y defined by
    Y = X     with probability 1/2
        1/X   with probability 1/2
is Cauchy distributed. For this, we need an extra coin flip. Usually, extra coin flips are generated by borrowing a random bit from U. For example, in the universal algorithm shown above, we could have started from a uniform [-1,1] random variate U, and used |U| in the acceptance condition. Since sign(U) is independent of |U|, sign(U) can be used to decide whether to replace X by 1/X, so that the returned random variate has the standard Cauchy density. The Cauchy generator thus obtained was first developed by Kronmal and Peterson (1981).
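Putting the two steps together gives the following Python sketch. It is ours, not the Kronmal-Peterson algorithm itself: the truncated Cauchy variate is produced here by plain rejection from the uniform rather than by their refined scheme, and the coin flip uses a fresh uniform instead of a borrowed sign bit:

```python
import random

def cauchy_variate():
    # Step 1: X with the truncated Cauchy density 2/(pi (1 + x^2)) on
    # [-1,1], by rejection from the uniform: accept when U <= 1/(1+X^2).
    # (x == 0 is excluded, an event of probability zero, so 1/x is safe.)
    while True:
        x = random.uniform(-1.0, 1.0)
        if x != 0.0 and random.random() <= 1.0 / (1.0 + x * x):
            break
    # Step 2: Y = X or 1/X with probability 1/2 each is standard Cauchy.
    return x if random.random() < 0.5 else 1.0 / x
```

Since |X| ≤ 1 and |1/X| ≥ 1, the returned variate lands in [-1,1] exactly when the coin chooses X, consistent with P(|Y| ≤ 1) = 1/2 for the standard Cauchy.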
We were forced by technical considerations to limit the densities somewhat. The rejection method can be used on all bounded densities with compact support. This typifies the situation in general. In the acceptance-complement method, once we choose the general form of g and f_2, we lose in terms of universality. For example, if both f_2 and g are constant on [-1,1], then f = f_1 + f_2 ≤ g + f_2 ≤ 1. Thus, no density f with a peak higher than 1 can be treated by the method. If universality is a prime concern, then the rejection method has little competition.
5.5. Exercises.
1. Kronmal and Peterson (1981) developed yet another Cauchy generator based upon the acceptance-complement method. It is based upon the following decomposition of the truncated Cauchy density f (see text for the definition) into f_1 + f_2. We have:
The first two IF's are not required for the algorithm to be correct: they correspond to squeeze steps. Verify that the algorithm generates standard Cauchy random variates. Prove also that the acceleration steps are valid. The constant 0.7225 is but an approximation of an irrational number, which should be determined.