Unit 2 Applied Probability II
Structure
2.0 " Objectives
2.1 Introduction
2.2 Working with Mean Deviation
2.2.1 Deviation below the Mean
2.2.2 Variance of a Binomial Distribution
2.3 Applications of Chebyshev's Theorem
2.3.1 Deviation of Repeated Trials
2.3.2 Proof of the Weak Law
2.4 Random Walks and Gamblers' Ruin
2.4.1 The Probability Space
2.4.2 The Probability of Winning
2.4.3 A Recurrence for the Probability of Winning
2.4.4 Intuition
2.4.5 Duration of a Walk
2.4.6 Right Time to Quit
2.5 Chernoff Bounds
2.6 Strong Law of Large Numbers
2.7 Let Us Sum Up
2.8 Keywords
2.9 Some Useful Books
2.10 Answer or Hints to Check Your Progress
2.11 Exercises
2.1 INTRODUCTION
In this unit we initiate preparatory work for understanding the basic tools used in stochastic processes as applied to finance and insurance. Basically, themes covered in the theory of probability have been extended to a state where information availability varies over time. Our main concern is to highlight the results obtained in simple random walks. The following study material is adopted from the lecture notes of Prof. Albert Mayer and Dr. Radhika Nagpal (2002), borrowing freely the theorems, definitions and examples without mentioning the source. Our reliance on the above-mentioned work is heavy, indeed. Credit goes to the authors for presenting the theme for a non-technical learner. We have included select themes of the notes keeping in view the overall requirement of the course. Perhaps it will be useful to go back to the lecture notes for better appreciation of the themes dealt with.
then by letting n grow large enough, we will be as certain as we want that the average will be close to p. In other words, given any positive tolerance, ε, the probability that the average A_n is within the tolerance of p approaches 1 as n grows. Effectively, we are looking for the limit

lim_{n→∞} Pr{|A_n − p| ≤ ε}.

According to Bernoulli, this limit equals one. This result is called the Weak Law of Large Numbers.

Theorem 2.1 (Weak Law of Large Numbers). Let G_1, ..., G_n, ... be independent variables with the same distribution and the same expectation, μ. For any ε > 0,

lim_{n→∞} Pr{|(G_1 + G_2 + ... + G_n)/n − μ| ≤ ε} = 1.
See, however, that the law does not say anything about the rate at which the limit is approached. That is, how big must n be to be within a given tolerance of the expected value with a specific desired probability?
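The question can be explored numerically. The following sketch (not part of the original notes; the tolerance of 0.05 and the sample sizes are arbitrary illustrative choices) estimates, for a fair coin, how often the observed average lies within the tolerance of p:

    import random

    def fraction_within_tolerance(p, n, eps, trials=10000):
        # Estimate Pr{ |S_n/n - p| <= eps } for the average of n Bernoulli(p) trials.
        hits = 0
        for _ in range(trials):
            successes = sum(1 for _ in range(n) if random.random() < p)
            if abs(successes / n - p) <= eps:
                hits += 1
        return hits / trials

    # The estimated probability climbs toward 1 as n grows.
    for n in (10, 100, 1000):
        print(n, fraction_within_tolerance(p=0.5, n=n, eps=0.05))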
In the following we will develop three basic results concerning the above problem:
Markov's Theorem

We want to consider the problem of bounding the probability that the value of a random variable is far away from the mean. Markov's theorem gives a very rough estimate, based only on the value of the mean.

Theorem 2.2 (Markov's Theorem). If R is a nonnegative random variable, then for all x > 0,

Pr{R ≥ x} ≤ E[R]/x.
Consider the Chinese Appetizer problem, where N people are eating Chinese appetizers arranged on a circular, rotating tray. Someone then spins the tray so that each person receives a random appetizer. We need to find the probability that everyone gets the same appetizer as before.

See that there are N equally likely orientations for the tray after it stops spinning, and everyone gets the right appetizer in just one of these orientations. Therefore, the probability is 1/N.
Let us use Markov's Theorem to get the probability. Let the random variable, R, be the number of people who get the right appetizer. We know that E[R] = 1. Applying Markov's Theorem, we find:

Pr{R ≥ N} ≤ E[R]/N = 1/N.
Markov's Theorem also gives the same 1/N bound for the probability that everyone gets their own hat back in the hatcheck problem. But in reality, the probability of this event is 1/N!. So Markov's Theorem in this case gives probability bounds that are way off.
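For a concrete sense of how loose the bound can be (a small illustration, not from the original notes): with N = 5 people in the hatcheck problem, Markov's Theorem gives Pr{R ≥ 5} ≤ E[R]/5 = 1/5, while the exact probability that all five get their own hats is 1/5! = 1/120, so the bound is too large by a factor of 24.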
Note that the proof of Markov's Theorem requires that the random variable, R, be nonnegative.
2.2.1 Deviation below the Mean
According to Markov's Theorem, a random variable is unlikely to greatly exceed its mean. There is another theorem which points out that a random variable is also not likely to be much smaller than its mean.

Theorem 2.2.1. Let L be a real number and let R be a random variable such that R ≤ L. For all x < L, we have

Pr{R ≤ x} ≤ (L − E[R])/(L − x).
Take k = 2 to get a special case of the above corollary. Then we can apply it to bound the random variable, R − E[R], that measures R's deviation from its mean. Thus,

Pr{|R − E[R]| ≥ x} ≤ E[(R − E[R])²]/x² = Var[R]/x²,   (2)

where the inequality (2) follows from Corollary 2.2.1 applied to the random variable, R − E[R]. That is to say, we can bound the probability that the random variable R deviates from its mean by more than x by an expression decreasing as 1/x², multiplied by the constant Var[R]. This is Chebyshev's Theorem.
To see what the variance tells us, consider the following two games. In Game A you win $2 with probability 2/3 and lose $1 with probability 1/3; in Game B you win $1002 with probability 2/3 and lose $2001 with probability 1/3. In both games the expected win is $1, but the deviations from the mean behave very differently:

A − E[A] = 1 with probability 2/3, and −2 with probability 1/3;
(A − E[A])² = 1 with probability 2/3, and 4 with probability 1/3;
Var[A] = 2.

B − E[B] = 1001 with probability 2/3, and −2002 with probability 1/3;
(B − E[B])² = 1,002,001 with probability 2/3, and 4,008,004 with probability 1/3;
Var[B] = 2,004,002.
We may note that the variance of Game A is 2, whereas the variance of Game B is more than two million. This means the payoff in Game A remains close to the expected value of $1. On the contrary, the payoff in Game B can deviate very far from this expected value.

In economics, high variance is often associated with high risk. For example, in ten rounds of Game A, we expect to make $10, but could conceivably lose $10 instead. On the other hand, in ten rounds of Game B, we also expect to make $10, but could actually lose more than $20,000.
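A small simulation sketch (not part of the original notes; it simply encodes the payoffs of Games A and B as given above) makes the difference in risk visible:

    import random

    def simulate(win, lose, rounds=10, trials=100000):
        # Play `rounds` independent rounds of a game paying +win with probability 2/3
        # and -lose with probability 1/3; report the average and worst total payoff.
        totals = []
        for _ in range(trials):
            totals.append(sum(win if random.random() < 2 / 3 else -lose
                              for _ in range(rounds)))
        return sum(totals) / trials, min(totals)

    print("Game A:", simulate(win=2, lose=1))        # average near +10, never below -10
    print("Game B:", simulate(win=1002, lose=2001))  # average near +10, worst cases run tens of thousands below zero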
Theorem 2.2.3 (Pairwise Independent Additivity of Variance).
If R_1, R_2, ..., R_n are pairwise independent random variables, then

Var[R_1 + R_2 + ... + R_n] = Var[R_1] + Var[R_2] + ... + Var[R_n].
To see this, write

Var[R_1 + ... + R_n]
= E[(R_1 + ... + R_n)²] − (E[R_1 + ... + R_n])²
= Σ_{i=1}^{n} E[R_i²] + 2 Σ_{1≤i<j≤n} E[R_i R_j] − Σ_{i=1}^{n} (E[R_i])² − 2 Σ_{1≤i<j≤n} E[R_i]·E[R_j]
= Σ_{i=1}^{n} Var[R_i] + 2 Σ_{1≤i<j≤n} (E[R_i R_j] − E[R_i]·E[R_j])
= Σ_{i=1}^{n} Var[R_i].   (pairwise independence)   (3)

In (3), we use the fact that the expectation of the product of two independent variables is the product of their expectations.
2.2.2 Variance of a Binomial Distribution

Let R be the number of successes in n mutually independent trials, so that R = R_1 + R_2 + ... + R_n, where

R_i = 1 with probability p, and 0 with probability 1 − p.

Moreover, each indicator has variance Var[R_i] = E[R_i²] − (E[R_i])² = p − p² = p(1 − p). Therefore the variance of the binomially distributed variable R is given by

Var[R] = Var[R_1] + ... + Var[R_n] = p(1 − p)n.
From the above result we see that the binomial distribution has variance p(1 − p)n and standard deviation √(p(1 − p)n). If we consider a special case, an unbiased binomial distribution with p = 1/2, then the variance is n/4 and the standard deviation is √n/2.
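For instance, with n = 100 tosses of a fair coin the variance of the number of heads is 100/4 = 25 and the standard deviation is √100/2 = 5, so the count of heads typically falls within a few units of its mean of 50.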
2.3 APPLICATIONS OF CHEBYSHEV'S THEOREM

2.3.1 Deviation of Repeated Trials

Suppose we want to estimate the probability, p, of success of a Bernoulli variable G.
To estimate p, we take a large number of trials, n, and record the fraction of successes. Thus, we are taking Bernoulli variables G_1, G_2, ..., G_n which are independent, each with the same expectation as G, computing their sum

S_n = Σ_{i=1}^{n} G_i,

and then using the average, S_n/n, as our estimate of p.
Speaking more generally, we may take any set of random variables G_1, G_2, ..., G_n with the same mean, μ, and use the average, S_n/n, to estimate μ. A critical property of S_n/n is that it has the same expectation as the G_i's, namely,

E[S_n/n] = μ.   (6)

Note that the random variables G_i need not be either Bernoulli or independent for (6) to hold.
To derive its variance, suppose that the G_i's also have the same deviation, σ. In that case, we get the second critical property of S_n/n, that is,

Var[S_n/n] = Var[S_n]/n².

Thus, if the variance of S_n is the sum of the variances of the G_i's — for example, by Theorem 2.2.3, the variances can be summed if the G_i's are pairwise independent — then we derive

Var[S_n/n] = nσ²/n² = σ²/n.
With this result, we apply Chebyshev's Bound and conclude as follows:

Theorem 2.3.1 (Pairwise Independent Sampling). Let S_n = G_1 + G_2 + ... + G_n, where G_1, ..., G_n are pairwise independent variables with the same mean, μ, and deviation, σ. Then

Pr{|S_n/n − μ| ≥ x} ≤ (1/n)(σ/x)².
Note that Theorem 2.3.1 provides a precise statement about the average of independent samples of a random variable approaching the mean. We can generalise it to many cases where S_n is the sum of independent variables whose mean and deviation are not necessarily all the same.
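As a worked illustration of the theorem (the tolerance and confidence level below are arbitrary choices, not from the original notes): to estimate an unknown Bernoulli probability to within x = 0.05 with probability at least 95%, note that σ² = p(1 − p) ≤ 1/4, so the bound (1/n)(σ/x)² is at most 0.05 once n ≥ (1/4)/(0.05² × 0.05) = 2,000 pairwise independent samples.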
2.3.2 Proof of the Weak Law
The conclusion of the Weak Law of Large Numbers, Theorem 2.1, is that the probability that the average differs from the expectation by more than any given tolerance approaches zero.

Theorem 2.3.2 (Weak Law of Large Numbers). Let S_n = G_1 + G_2 + ... + G_n, where G_1, ..., G_n, ... are pairwise independent variables with the same expectation, μ, and standard deviation, σ. For any ε > 0,

lim_{n→∞} Pr{|S_n/n − μ| ≥ ε} = 0.

The proof is immediate from Theorem 2.3.1, since the bound (1/n)(σ/ε)² tends to zero as n grows.
Check Your Progress 1

1) What is the result of the Weak Law of Large Numbers?

2) Write the meaning of (i) Markov's theorem and (ii) Chebyshev's theorem.
[Figure: a sample random walk of the gambler's capital between 0 and T = n + m, for the bet outcomes W L L W L W W L L L.]
Example 4. Suppose John starts with $100, and Sita starts with $10. They flip a fair coin, and every time a head appears, John wins $1 from Sita, and vice versa for tails. They play this game until one person goes bankrupt. What is the probability of John winning?

Frame this problem as in the Gambler's Ruin problem with n = 100 and T = 100 + 10 = 110. The probability of John winning is 100/110 = 10/11, namely, the ratio of his wealth to the combined wealth. Sita's chances of winning are 1/11.
It is easy to see that although John will win most of the time, the game is still fair. When John wins, he wins $10 only; when he loses, he loses big: $100. The expected win of both players is zero.
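The 10/11 figure is easy to check by simulation. The sketch below (not part of the original notes; the number of trials is an arbitrary choice) replays the fair game many times:

    import random

    def fair_ruin_win_probability(n, T, trials=10000):
        # Unbiased gambler's ruin: start with n dollars, bet $1 on a fair coin,
        # stop at 0 (ruin) or at T dollars (win); return the fraction of wins.
        wins = 0
        for _ in range(trials):
            capital = n
            while 0 < capital < T:
                capital += 1 if random.random() < 0.5 else -1
            if capital == T:
                wins += 1
        return wins / trials

    # John starts with $100 of the combined $110; theory says 100/110 = 10/11 ~ 0.909.
    print(fair_ruin_win_probability(100, 110))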
Another result of the analysis is: the larger the gambler's initial stake, the larger the probability that he will win a fixed amount.

Example 5. If the gambler started with one million dollars instead of 500, but aimed to win the same 100 dollars as in Example 3, the probability of winning would increase to 1M/(1M + 100) > 0.9999.
2.4.3 A Recurrence for the Probability of Winning

Let p and T be fixed, and let w_n be the gambler's probability of winning when his initial capital is n dollars. For example, w_0 is the probability that the gambler will win given that he starts off broke; clearly, w_0 = 0. Similarly, w_T = 1.

Suppose that the gambler starts with n dollars, where 0 < n < T. Consider the outcome of his first bet. The gambler wins the first bet with probability p. In that case, he is left with n + 1 dollars and becomes a winner with probability w_{n+1}. On the other hand, he loses the first bet with probability q = 1 − p. Now he is left with n − 1 dollars and becomes a winner with probability w_{n−1}. Thus, he wins with probability w_n = p·w_{n+1} + q·w_{n−1}. Solving for w_{n+1}, we have

w_{n+1} = w_n/p − (q/p)·w_{n−1}.   (10)
Homogeneous Linear Recurrence

From (10) we have the homogeneous linear recurrence

p·w_{n+1} − w_n + q·w_{n−1} = 0,   (11)

so that any sequence satisfying (11), and hence (10), will do. Taking w_n = (q/p)^n or w_n = 1, condition (11) can be satisfied. Since the left-hand side of (11) is zero using either definition, it follows that any definition of the form

w_n = A·(q/p)^n + B

will also satisfy (11). For the values of w_0 = 0 and w_T = 1, the boundary conditions, we solve for A and B to get

A + B = 0, so B = −A,   (12)

and

A·((q/p)^T − 1) = 1, so A = 1/((q/p)^T − 1).

Thus, for p ≠ 1/2,

w_n = ((q/p)^n − 1)/((q/p)^T − 1).
Corollary 10.7. In the Gambler's Ruin game biased against the gambler, that is, with probability p < 1/2 of winning each bet, with initial capital, n, and goal, T,

Pr{the gambler is a winner} < (p/q)^m,   (15)

where m = T − n is the gambler's intended profit. Thus, the gambler gains his intended profit, m, before going broke with probability at most (p/q)^m. In this result, the upper bound does not depend on the gambler's starting capital. To see the consequences of this, we give the following examples:
Example 6. Suppose that the gambler starts with $500, aiming to profit $100, this time by making $1 bets on red in roulette. By (15), the probability, w_n, that he is a winner is less than (9/10)^100 < 1/37,648.

This is in sharp contrast to the unbiased game, where we saw (in Example 3) that his probability of winning was 5/6.
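As a quick numerical check (not in the original text): (9/10)^100 = e^{100·ln(9/10)} ≈ e^{−10.54} ≈ 2.66 × 10⁻⁵, which is indeed about 1 in 37,648.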
Example 7. Consider Example 3 and take the capital to be $1,000,000 to start in the unbiased game. The gambler was almost certain to win $100. But betting against the "slightly" unfair roulette wheel, even starting with $1,000,000, his chance of winning $100 remains less than 1 in 37,648. He will almost surely lose all his $1,000,000. In fact, because the bound (15) depends only on his intended profit, his chance of going up a mere $100 is less than 1 in 37,648 no matter how much money he starts with.
The bound (15) is exponential in m. So, for example, doubling his intended profit will square his probability of winning.
Example 8. Suppose that the gambler's stake goes up 200 dollars in the above example. What is the probability that he wins before going broke in the roulette game?

See that this is somewhat worse than the 1 in 2 chance in the fair game.

What is the lesson of the fair and unfair game formulations? In the case of a fair game, it helps to have a large capital, whereas in the unfair case, it does not.
2.4.4 Intuition
It will be useful to see the intuition behind the gambler's failure to make money when the game is slightly biased against him. Seen in terms of the roulette game, the gambler wins a dollar with probability 9/19 and loses a dollar with probability 10/19.

Therefore, his expected return on each bet is

9/19 − 10/19 = −1/19 ≈ −0.053 dollars.

That is, on each bet his capital is expected to drift downward by a little over 5 cents.
If the gambler starts with a trillion dollars, then he will play for a very long time, so at some point there should be a lucky outcome. Perhaps an upward swing will put him $100 ahead. However, there is a problem in this formulation: his capital drifts downward steadily. If the gambler does not have the luck of an early upward swing, then he is doomed. The logic is simple: after his capital drifts downward a few hundred dollars, he needs a huge upward swing to save himself, and such a huge swing is extremely unlikely.
To quantify these drifts and swings, consider the position after k rounds. Let the number of wins by the player follow a binomial distribution with parameters p < 1/2 and k. Then his expected win on any single bet is p − q = 2p − 1 dollars. Thus, his expected capital after k rounds is n − k(1 − 2p). To be a winner, the actual number of wins must exceed the expected number by enough to make up both the intended profit m and the expected loss k(1 − 2p). Since the binomial distribution has a standard deviation of only √(kp(1 − p)), for the gambler to win, he needs his number of wins to deviate from its mean by a growing multiple of its standard deviation, which is extremely unlikely.
E[G] = (2p − 1)·E[Q].   (16)

In an unbiased game (16) is a trivial result, as both 2p − 1 and the expected overall winnings, E[G], are zero. In the unfair case, however, 2p − 1 ≠ 0.
Moreover, since G = Σ_{i=1}^{Q} G_i is not a sum of nonnegative variables (if the gambler loses the i-th bet, then the random variable G_i equals −1), we do not have a special case of Wald's theorem. To confirm that (16) applies in general, note that the random variable G_i + 1 is nonnegative, and

E[G_i + 1 | Q ≥ i] = E[G_i | Q ≥ i] + 1 = 2p,

so by Wald's Theorem

E[Σ_{i=1}^{Q} (G_i + 1)] = 2p·E[Q].   (18)

On the other hand,

E[Σ_{i=1}^{Q} (G_i + 1)] = E[G] + E[Q].   (19)

Combining (18) and (19) we get the result stated in (16).
Example 10. If the gambler aims to profit $100 playing roulette with n dollars to start, he can expect to make ((n + 100)/37,648 − n)/(2(18/38) − 1) ≈ 19n bets before the game ends. So he can enjoy playing for a good while before almost surely going broke.
Duration of an Unbiased Walk

Consider the expected number of bets as a function of the gambler's initial capital. That is, for fixed p and T, let e_n be the expected number of bets until the game ends when the gambler's initial capital is n dollars. Since the game is over in no steps if n = 0 or T, the corresponding boundary conditions are e_0 = e_T = 0.
Alternatively, let the gambler start with n dollars, where 0 < n < T. By conditional expectation, the expected number of steps can be broken down, so that

e_n = p·E[Q | gambler wins first bet] + q·E[Q | gambler loses first bet].

When the gambler wins the first bet, his capital is n + 1, so he can expect to make another e_{n+1} bets. That is,

E[Q | gambler wins first bet] = 1 + e_{n+1},

and similarly,

E[Q | gambler loses first bet] = 1 + e_{n−1}.

So we have

e_n = p(1 + e_{n+1}) + q(1 + e_{n−1}),

which yields the linear recurrence

p·e_{n+1} − e_n + q·e_{n−1} + 1 = 0.
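With p = q = 1/2 the recurrence, together with e_0 = e_T = 0, is solved by e_n = n(T − n), the standard closed form for the unbiased walk (quoted here as a check rather than derived in this extract). A short simulation sketch, not part of the original notes, confirms it:

    import random

    def average_duration(n, T, trials=5000):
        # Empirical mean number of $1 bets in an unbiased walk started at n,
        # absorbed at 0 or at T.
        total_steps = 0
        for _ in range(trials):
            capital, steps = n, 0
            while 0 < capital < T:
                capital += 1 if random.random() < 0.5 else -1
                steps += 1
            total_steps += steps
        return total_steps / trials

    # For n = 10, T = 20 the closed form n*(T - n) gives 100 expected bets.
    print(average_duration(10, 20), 10 * (20 - 10))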
Two useful special cases of Theorem 2.4.6 are given in the following:

Corollary 15.2. Suppose an event has probability 1/m. Then the probability that the event will occur at least once in m independent trials is at least approximately 1 − 1/e ≈ 63%. There is at least a 50% chance the event will occur in n = m·log 2 ≈ 0.69m trials.
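As a check of these figures (for large m; this worked step is not in the original text): the chance that an event of probability 1/m fails in all of m independent trials is (1 − 1/m)^m ≈ e⁻¹ ≈ 0.37, so it occurs at least once with probability ≈ 0.63; similarly (1 − 1/m)^{m·log 2} ≈ e^{−log 2} = 1/2, which gives the 50% figure at about 0.69m trials.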
Example B. Suppose that we want the probability of a student answering at least k questions correctly on an exam with N questions. In this case, A_i is the event that the student answers the i-th question correctly, T is the total number of questions answered correctly, and Pr{T ≥ k} is the probability that the student answers at least k questions correctly.
The difference between these two examples is that in the first, all events A_i have equal probability, i.e., the coin is as likely to come up heads on one flip as on another, so T has a binomial distribution whose tail bounds we have already characterised above. In the second, on the other hand, some examination questions might be more difficult than others. If Question 1 is easier than Question 2, then the probability of event A_1 is greater than the probability of event A_2. In the following we develop a method to handle this more general situation, in which the events A_i may have different probabilities.
Using the Chernoff bound, we will show that the number of events that occur is almost never much greater than the expectation. For example, if we toss N coins, the expected number of heads is N/2. The Chernoff Bound implies that for sufficiently large N, the number of heads is almost always not much greater than N/2.
Statement of the Bound

Chernoff's Theorem can be stated in terms of Bernoulli variables instead of events; we can take T_i as an indicator for the event A_i.

Theorem 2.5 (Chernoff Bound). Let T_1, T_2, ..., T_n be mutually independent Bernoulli variables, and let

T = T_1 + T_2 + ... + T_n.

Then for all c ≥ 1, we have

Pr{T ≥ c·E[T]} ≤ exp(−(c log c − c + 1)·E[T]).

The formula for the exponent in the bound is a little awkward. The situation is simpler when c = e = 2.718...; in this case c log c − c + 1 = 1, so

Pr{T ≥ e·E[T]} ≤ exp(−E[T]).
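A rough numerical sketch (the parameters below are arbitrary illustrative choices, not taken from the original notes) compares the empirical tail with the bound exp(−E[T]) for c = e:

    import math
    import random

    n, p = 100, 0.02              # 100 mutually independent Bernoulli(0.02) indicators
    mean = n * p                  # E[T] = 2
    threshold = math.e * mean     # c * E[T] with c = e

    trials = 200000
    exceed = sum(1 for _ in range(trials)
                 if sum(1 for _ in range(n) if random.random() < p) >= threshold)

    print("empirical tail :", exceed / trials)  # roughly 0.016
    print("Chernoff bound :", math.exp(-mean))  # exp(-2) ~ 0.135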
Although Theorem 2.6 can be proven without this assumption, we will assume
for simplicity that the random variables Xi have a finite fourth moment. That
is, we will suppose that
E[X_i^4] = K < ∞.   (29)
2) If a gambler is playing with the aim of winning a fixed amount, what are
the chances of his success?
3) Suppose an event has probability 1/m. Find the probability that the event will occur at least once in m independent trials.
4) Take T_1, T_2, ..., T_n to be mutually independent Bernoulli variables and let T = T_1 + T_2 + ... + T_n. Show that, for all c ≥ 1,

Pr{T ≥ c·E[T]} ≤ exp(−(c log c − c + 1)·E[T]).
2.8 KEYWORDS

Chernoff Bound: A lower bound for the success of majority agreement for n independent, equally likely events. The Chernoff bound is a special case of Chernoff's inequality.
Chernoff's Inequality: It states the following: let X_1, X_2, ..., X_n be independent random variables such that E[X_i] = 0 and |X_i| ≤ 1 for all i.
Gambler's Ruin: A gambler's loss of the last of her units of gambling money and consequent inability to continue gambling. It is also sometimes used to refer to a final large losing bet placed in the hope of winning back all the gambler has lost during a gambling session. More generally, the phrase refers to the ever-decreasing expected value of a gambler's initial capital as she continues to gamble with her winnings.
Markov Property: A property of a stochastic process in which, given the present state, the past and the future are independent.
Random Walk: Taking successive steps, each in a random direction.
Recurrence: A relation which defines a sequence recursively, i.e., each term of the sequence is defined as a function of the preceding terms.
Stopping Time: With respect to a sequence of random variables X_1, X_2, ..., a random variable τ with the property that for each t, the occurrence or non-occurrence of the event τ = t depends only on the values of X_1, X_2, ..., X_t. Thus, at any particular time t, you can look at the sequence so far and tell whether it is time to stop.
Strong Law of Large Numbers: As the sample size grows larger, the sample mean converges to the population mean with probability 1.
Weak Law of Large Numbers: As the sample size grows larger, the probability that the difference between the sample mean and the population mean exceeds any given tolerance approaches zero.
2.11 EXERCISES
1) Suppose you roll n dice that are 6-sided, fair and mutually independent.
What is the expected value of the largest number that comes up?
1" + 2" + 3" + 4".c5"
Ans. 6--
6" I
Consider the following two games:

Game 1: You win $2 with probability 2/3 and lose $1 with probability 1/3.

Game 2: You win $1002 with probability 2/3 and lose $2001 with probability 1/3.

i) What is the expected win in each case? (Ans. $1 in each case)

ii) What is the variance in each case? (Ans. 2 and 2,004,002)
5) Suppose you have learned that the average MEC student's total number of marks is 200.

a) Use Markov's inequality to find the best possible upper bound for the fraction of MEC students with at least 235 marks. (Ans. 200/235)

b) Suppose you are told that no student can pass with less than 170 marks. How does this help to improve your previous bound? Show that this is the best possible bound. (Ans. 30/65)

c) Suppose that you further learn that the standard deviation of the total marks per student is 7. Give the best possible bound on the fraction of students who can graduate with at least 235 marks. (Ans. 1/25)
6) Let X, Y be independent Binomial random variables with parameters (n, p) and (m, p), respectively. What is Pr{X + Y = k}?
Appendix to Unit 2

Wald's Theorem

Let C_1, C_2, ..., be a sequence of nonnegative random variables, and let Q be a positive integer-valued random variable, all with finite expectations. Suppose that