Introduction To Probability Theory
K. Suresh Kumar
Department of Mathematics
Indian Institute of Technology Bombay
October 8, 2017
LECTURES 20-21
Proof. By using the simple functions given in the proof of Theorem 7.4, we get
\begin{align*}
EX &= \lim_{n \to \infty} \sum_{k=0}^{n2^n - 1} \frac{k}{2^n} \, P\Big( \Big\{ \frac{k}{2^n} \le X(\omega) < \frac{k+1}{2^n} \Big\} \Big) \\
&= \lim_{n \to \infty} \sum_{k=0}^{n2^n - 1} \frac{k}{2^n} \Big[ F\Big( \frac{k+1}{2^n} \Big) - F\Big( \frac{k}{2^n} \Big) \Big] \\
&= \lim_{n \to \infty} \sum_{k=0}^{n2^n - 1} \frac{k}{2^n} \, f(t_k) \Big( \frac{k+1}{2^n} - \frac{k}{2^n} \Big) \\
&= \lim_{n \to \infty} \sum_{k=0}^{n2^n - 1} t_k f(t_k) \Big( \frac{k+1}{2^n} - \frac{k}{2^n} \Big) + \lim_{n \to \infty} \sum_{k=0}^{n2^n - 1} \Big( \frac{k}{2^n} - t_k \Big) f(t_k) \Big( \frac{k+1}{2^n} - \frac{k}{2^n} \Big), \tag{0.1}
\end{align*}
where $t_k \in \big( \frac{k}{2^n}, \frac{k+1}{2^n} \big)$ is the point given by the mean value theorem.
Now
\begin{align*}
0 \le \lim_{n \to \infty} \Big| \sum_{k=0}^{n2^n - 1} \Big( \frac{k}{2^n} - t_k \Big) f(t_k) \Big( \frac{k+1}{2^n} - \frac{k}{2^n} \Big) \Big|
&\le \lim_{n \to \infty} \frac{1}{2^n} \sum_{k=0}^{n2^n - 1} f(t_k) \Big( \frac{k+1}{2^n} - \frac{k}{2^n} \Big) \\
&\le \lim_{n \to \infty} \frac{1}{2^n} \int_0^{\infty} f(x) \, dx = 0 ,
\end{align*}
so the second limit in (0.1) vanishes. Hence
\[
EX = \int_0^{\infty} x f(x) \, dx .
\]
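The dyadic sums above are easy to evaluate numerically. Below is a minimal Python sketch (the choice X ~ Exp(1), with F(x) = 1 - e^{-x} and EX = 1, is just an assumed example) that computes the sum in the first line of (0.1) for increasing n and watches it approach EX.

import numpy as np

# Assumed example: X ~ Exp(1), so F(x) = 1 - exp(-x) and EX = 1.
F = lambda x: 1.0 - np.exp(-x)

for n in (2, 4, 8, 12):
    k = np.arange(n * 2**n)                        # k = 0, 1, ..., n*2^n - 1
    left, right = k / 2.0**n, (k + 1) / 2.0**n
    approx = np.sum(left * (F(right) - F(left)))   # sum of k/2^n * P(k/2^n <= X < (k+1)/2^n)
    print(n, approx)                               # tends to EX = 1 as n grows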
For a general random variable $X$, the expectation is given by
\[
EX = EX^+ - EX^- ,
\]
where
\[
X^+ = \max\{X, 0\}, \qquad X^- = \max\{-X, 0\} .
\]
Note that $X^+$ is the positive part and $X^-$ is the negative part of $X$.
Theorem 0.2 Let $X$ be a continuous random variable with finite mean and pdf $f$. Then
\[
EX = \int_{-\infty}^{\infty} x f(x) \, dx .
\]
Proof. Set
\[
Y_n(\omega) =
\begin{cases}
\dfrac{k}{2^n} & \text{if } \dfrac{k}{2^n} \le X^+(\omega) < \dfrac{k+1}{2^n}, \ k = 0, \cdots, n2^n - 1 \\[2mm]
0 & \text{if } X^+(\omega) \ge n .
\end{cases}
\]
Hence
\[
Y_n(\omega) =
\begin{cases}
\dfrac{k}{2^n} & \text{if } \dfrac{k}{2^n} \le X(\omega) < \dfrac{k+1}{2^n}, \ k = 0, \cdots, n2^n - 1 \\[2mm]
0 & \text{if } X(\omega) \ge n \text{ or } X(\omega) \le 0 .
\end{cases}
\]
Then
\[
EX^+ = \lim_{n \to \infty} E Y_n .
\]
Similarly, set
\[
Z_n(\omega) =
\begin{cases}
\dfrac{k}{2^n} & \text{if } \dfrac{k}{2^n} \le X^-(\omega) < \dfrac{k+1}{2^n}, \ k = 0, \cdots, n2^n - 1 \\[2mm]
0 & \text{if } X^-(\omega) \ge n .
\end{cases}
\]
Hence
\[
Z_n(\omega) =
\begin{cases}
\dfrac{k}{2^n} & \text{if } \dfrac{k}{2^n} \le -X(\omega) < \dfrac{k+1}{2^n}, \ k = 0, \cdots, n2^n - 1 \\[2mm]
0 & \text{if } X(\omega) \le -n \text{ or } X(\omega) \ge 0 .
\end{cases}
\]
Then
\[
EX^- = \lim_{n \to \infty} E Z_n .
\]
Now
\[
\lim_{n \to \infty} E Y_n = \lim_{n \to \infty} \sum_{k=0}^{n2^n - 1} \frac{k}{2^n} \, P\Big( \Big\{ \frac{k}{2^n} \le X(\omega) < \frac{k+1}{2^n} \Big\} \Big) \tag{0.2}
\]
and
\begin{align*}
- \lim_{n \to \infty} E Z_n &= - \lim_{n \to \infty} \sum_{k=0}^{n2^n - 1} \frac{k}{2^n} \, P\Big( \Big\{ \frac{k}{2^n} \le -X(\omega) < \frac{k+1}{2^n} \Big\} \Big) \\
&= - \lim_{n \to \infty} \sum_{k=0}^{n2^n - 1} \frac{k}{2^n} \, P\Big( \Big\{ \frac{-k-1}{2^n} < X(\omega) \le \frac{-k}{2^n} \Big\} \Big) \\
&= \lim_{n \to \infty} \sum_{k=-n2^n + 1}^{0} \frac{k}{2^n} \, P\Big( \Big\{ \frac{k-1}{2^n} < X(\omega) \le \frac{k}{2^n} \Big\} \Big) \\
&= \lim_{n \to \infty} \sum_{k=-n2^n + 1}^{0} \frac{k-1}{2^n} \, P\Big( \Big\{ \frac{k-1}{2^n} < X(\omega) \le \frac{k}{2^n} \Big\} \Big) . \tag{0.3}
\end{align*}
The last equality follows by the arguments from the proof of Theorem 6.0.26.
Combining (0.2) and (0.3), we get
\[
EX = \lim_{n \to \infty} \sum_{k=-n2^n}^{n2^n - 1} \frac{k}{2^n} \, P\Big( \Big\{ \frac{k}{2^n} \le X(\omega) < \frac{k+1}{2^n} \Big\} \Big) .
\]
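The same two-sided dyadic sum can be evaluated numerically. A minimal Python sketch (taking X ~ N(0.5, 1) purely as an assumed example, so EX = 0.5 and F is the corresponding cdf):

import math

# Assumed example: X ~ N(0.5, 1), so EX = 0.5; F below is its cdf.
def F(x):
    return 0.5 * (1.0 + math.erf((x - 0.5) / math.sqrt(2.0)))

for n in (2, 4, 8, 10):
    total = 0.0
    for k in range(-n * 2**n, n * 2**n):
        total += (k / 2.0**n) * (F((k + 1) / 2.0**n) - F(k / 2.0**n))
    print(n, total)   # approaches EX = 0.5 as n grows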
i.e., to take the limit inside the integral, one needs uniform convergence of the functions. In many situations it is highly unlikely to get uniform convergence.
\[
\lim_{n \to \infty} E X_n = EX .
\]
(ii) $|X_n| \le Y$.
(iii) $\lim_{n \to \infty} X_n = X$.
Then
\[
\lim_{n \to \infty} E X_n = EX .
\]
Now we state the following theorem, which provides a useful tool to compute expectations of random variables which are mixed in nature.
Proof: First, I will give a proof for a special case, i.e. when $\varphi : \mathbb{R} \to \mathbb{R}$ is strictly increasing and differentiable. Set $Y = \varphi(X)$. Then $Y$ has a density $g$ given by
\[
g(y) = f(\varphi^{-1}(y)) \, \frac{1}{\varphi'(\varphi^{-1}(y))} , \quad y \in \varphi(\mathbb{R}), \qquad g(y) = 0 \ \text{otherwise.}
\]
Hence
\begin{align*}
E[\varphi(X)] &= \int_{-\infty}^{\infty} y \, g(y) \, dy \\
&= \int_{\varphi(\mathbb{R})} y \, f(\varphi^{-1}(y)) \, \frac{1}{\varphi'(\varphi^{-1}(y))} \, dy \\
&= \int_{-\infty}^{\infty} \varphi(x) f(x) \, \frac{1}{\varphi'(x)} \, \varphi'(x) \, dx
\qquad \text{(use } y = \varphi(x), \text{ the Jacobian is } \tfrac{dy}{dx} = \varphi'(x)) \\
&= \int_{-\infty}^{\infty} \varphi(x) f(x) \, dx .
\end{align*}
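This change-of-variable identity is easy to check numerically. A minimal Python sketch (the choices X ~ Exp(1) and phi(x) = sqrt(x), strictly increasing on (0, infinity), are assumed examples; both integrals should be close to E[sqrt(X)] = sqrt(pi)/2, about 0.8862):

import numpy as np

# Assumed example: X ~ Exp(1) with pdf f(t) = exp(-t) on (0, inf), and phi(x) = sqrt(x),
# strictly increasing and differentiable there.  Both integrals below approximate
# E[sqrt(X)] = sqrt(pi)/2 ~ 0.8862.
f = lambda t: np.exp(-t)

dx = 1e-4
x = np.arange(dx, 40.0, dx)
lhs = np.sum(np.sqrt(x) * f(x)) * dx     # integral of phi(x) f(x) dx

dy = 1e-4
y = np.arange(dy, np.sqrt(40.0), dy)
g = f(y**2) * 2.0 * y                    # g(y) = f(phi^{-1}(y)) / phi'(phi^{-1}(y)), phi^{-1}(y) = y^2
rhs = np.sum(y * g) * dy                 # integral of y g(y) dy

print(lhs, rhs)                          # both ~ 0.8862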
When $\varphi(x) = x^{2n+1}$, $n \ge 0$, and $E[X^{2n+1}]$ exists, note that $\varphi$ is strictly increasing and differentiable. Though one can use the proof given above to conclude the result, we will give a direct proof. Here
\[
\varphi^{-1}(y) = y^{\frac{1}{2n+1}}, \qquad \varphi'(x) = (2n+1)x^{2n} .
\]
Now consider the even powers, $\varphi(x) = x^{2n}$, which is not monotone, so the special case above does not apply directly. Set $Y = X^{2n}$. For $y > 0$,
\begin{align*}
G(y) = P\{Y \le y\} &= P\{-y^{\frac{1}{2n}} \le X \le y^{\frac{1}{2n}}\} \\
&= F(y^{\frac{1}{2n}}) - F(-y^{\frac{1}{2n}}) .
\end{align*}
Hence
\[
g(y) = \frac{1}{2n} \, y^{\frac{1}{2n} - 1} \Big( f(y^{\frac{1}{2n}}) + f(-y^{\frac{1}{2n}}) \Big), \quad y > 0, \qquad g(y) = 0 \ \text{for } y \le 0 .
\]
Therefore
\begin{align*}
E[X^{2n}] &= \int_{-\infty}^{\infty} y \, g(y) \, dy \\
&= \frac{1}{2n} \int_0^{\infty} y^{\frac{1}{2n}} \Big( f(y^{\frac{1}{2n}}) + f(-y^{\frac{1}{2n}}) \Big) \, dy .
\end{align*}
Consider
\[
\frac{1}{2n} \int_0^{\infty} y^{\frac{1}{2n}} f(y^{\frac{1}{2n}}) \, dy .
\]
We use the following change of variable argument. Set $y = x^{2n} := \psi(x)$, $x > 0$. Then note that $\psi : (0, \infty) \to (0, \infty)$ is a bijective map and the Jacobian is $\psi'(x) = 2n x^{2n-1}$, $x > 0$. Hence
\begin{align*}
\frac{1}{2n} \int_0^{\infty} y^{\frac{1}{2n}} f(y^{\frac{1}{2n}}) \, dy
&= \frac{1}{2n} \int_{\psi^{-1}((0,\infty))} x f(x) \, |\psi'(x)| \, dx \\
&= \int_0^{\infty} x^{2n} f(x) \, dx .
\end{align*}
Similarly consider
\[
\frac{1}{2n} \int_0^{\infty} y^{\frac{1}{2n}} f(-y^{\frac{1}{2n}}) \, dy .
\]
Set $y = x^{2n} := \psi(x)$, $x < 0$, i.e. $\psi : (-\infty, 0) \to (0, \infty)$ is a bijective map with Jacobian $\psi'(x) = 2n x^{2n-1}$, $x < 0$. We write it as $\psi'(x) = \frac{2n x^{2n}}{x}$. Also note that $y^{1/2n} = |x|$, $x < 0$. Hence
\begin{align*}
\frac{1}{2n} \int_0^{\infty} y^{\frac{1}{2n}} f(-y^{\frac{1}{2n}}) \, dy
&= \frac{1}{2n} \int_{\psi^{-1}((0,\infty))} |x| f(x) \, |\psi'(x)| \, dx \\
&= \int_{-\infty}^{0} x^{2n} f(x) \, dx .
\end{align*}
Adding the two integrals, we get
\[
E[X^{2n}] = \int_{-\infty}^{\infty} x^{2n} f(x) \, dx .
\]
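A short numerical check of the even-moment formula (taking X ~ N(0, 1) as an assumed example, for which E[X^2] = 1 and E[X^4] = 3):

import numpy as np

# Assumed example: X ~ N(0, 1), so E[X^2] = 1 and E[X^4] = 3.
f = lambda t: np.exp(-t**2 / 2.0) / np.sqrt(2.0 * np.pi)

dx = 1e-3
x = np.arange(-10.0, 10.0, dx)
samples = np.random.default_rng(0).standard_normal(1_000_000)

for n in (1, 2):
    quadrature = np.sum(x**(2 * n) * f(x)) * dx   # integral of x^(2n) f(x) dx
    monte_carlo = np.mean(samples**(2 * n))       # direct estimate of E[X^(2n)]
    print(2 * n, quadrature, monte_carlo)         # ~ 1 and ~ 3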
Example 0.1 Let $X \sim U(0, 2)$ and $Y = \max\{1, X\}$. Then for $\varphi(x) = \max\{1, x\}$, we have
\begin{align*}
EY = E[\varphi \circ X] &= \int_{-\infty}^{\infty} \varphi(x) f(x) \, dx \\
&= \frac{1}{2} \int_0^2 \max\{1, x\} \, dx \\
&= \frac{1}{2} + \frac{1}{2} \int_1^2 x \, dx \\
&= \frac{5}{4} ,
\end{align*}
where $f$ is the pdf of $U(0, 2)$.
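A quick simulation is consistent with this value (a minimal sketch; the sample size is arbitrary):

import numpy as np

# Example 0.1 check: X ~ U(0, 2), Y = max{1, X}; EY should be 5/4 = 1.25.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 2.0, size=1_000_000)
print(np.mean(np.maximum(1.0, x)))   # ~ 1.25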
So consider
\[
\varphi_a(x) = (x - a) I_{\{x \le a\}} .
\]
Note $\varphi_a$ is continuous and hence, using Theorem 0.6, we get
\begin{align*}
E[(X - a) I_{\{X \le a\}}] &= \int_{-\infty}^{\infty} (x - a) I_{\{x \le a\}}(x) f(x) \, dx \\
&= \int_{-\infty}^{a} (x - a) f(x) \, dx \\
&= \int_{-\infty}^{a} x f(x) \, dx - a P\{X \le a\} .
\end{align*}
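For instance, with X ~ U(0, 2) and a = 1 (an assumed example), both sides equal -1/4; a short numerical check:

import numpy as np

# Assumed example: X ~ U(0, 2), a = 1.  Then E[(X - a) I{X <= a}] = -1/4, matching
# integral_{-inf}^{a} x f(x) dx - a P{X <= a} = 1/4 - 1/2 = -1/4.
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 2.0, size=1_000_000)
print(np.mean((x - 1.0) * (x <= 1.0)))   # ~ -0.25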
Now
Theorem 0.7 Let $X$ and $Y$ be continuous random variables with joint pdf $f$ and $\varphi : \mathbb{R}^2 \to \mathbb{R}$ be a continuous function. Then
\[
E[\varphi \circ (X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \varphi(x, y) f(x, y) \, dx \, dy .
\]
Here again, the proof is beyond the scope of the course, but special cases like $X + Y$, $XY$, $X^2$, etc., can be handled directly. I will do the case $\varphi(x, y) = xy$ as an example. (If you have not yet done this, attend to the above immediately.) Below, $g$ denotes the density of $Z = XY$, namely
\[
g(z) = \int_{-\infty}^{\infty} \frac{1}{|x|} \, f\Big(x, \frac{z}{x}\Big) \, dx .
\]
Hence
\begin{align*}
E[XY] &= \int_{-\infty}^{\infty} z \, g(z) \, dz \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{z}{|x|} \, f\Big(x, \frac{z}{x}\Big) \, dx \, dz \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{z}{|x|} \, f\Big(x, \frac{z}{x}\Big) \, dz \, dx \qquad \text{(change order of integration)} \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{xy}{|x|} \, f(x, y) \, |x| \, dy \, dx \qquad \text{(put } z = xy) \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x y \, f(x, y) \, dy \, dx .
\end{align*}
[In the above, justify the change of variable calculation. Hint: split the outer integral (i.e. the integral over the $x$ variable) into $(-\infty, 0)$ and $(0, \infty)$, apply the change of variable formula separately, and then combine.]
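The two routes in the calculation can also be compared numerically. A minimal sketch (X, Y independent U(0, 1) is an assumed example, so f = 1 on the unit square, g(z) = -log z on (0, 1), and E[XY] = 1/4):

import numpy as np

# Assumed example: X, Y independent U(0, 1), f(x, y) = 1 on the unit square.
# Then g(z) = integral_z^1 (1/x) dx = -log(z) for 0 < z < 1, and both routes give 1/4.
dz = 1e-5
z = np.arange(dz, 1.0, dz)
via_g = np.sum(z * (-np.log(z))) * dz      # integral of z g(z) dz

dx = 1e-3
x = np.arange(dx / 2, 1.0, dx)
X, Y = np.meshgrid(x, x)
via_joint = np.sum(X * Y) * dx * dx        # double integral of x y f(x, y) dy dx

print(via_g, via_joint)                    # both ~ 0.25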
\[
E[XY] = EX \, EY .
\]
Proof: Using the above, one can see that when $X$ and $Y$ are independent with a joint pdf $f$, then
\[
E[XY] = EX \, EY
\]
(exercise). When $X, Y$ are discrete with joint pmf $f$, then again it is easy to prove (exercise).
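A simulation is consistent with the product rule for independent random variables (a minimal sketch; X ~ Exp(1) and Y ~ U(0, 1) are assumed examples, so EX * EY = 0.5):

import numpy as np

# Assumed example: X ~ Exp(1) and Y ~ U(0, 1) independent, so EX * EY = 1 * 0.5 = 0.5.
rng = np.random.default_rng(3)
x = rng.exponential(1.0, size=1_000_000)
y = rng.uniform(0.0, 1.0, size=1_000_000)
print(np.mean(x * y), np.mean(x) * np.mean(y))   # both ~ 0.5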
Variance and other Higher order moments: In this subsection, we address the question of reconstructing the distribution of a random variable. To this end, we introduce objects called the moments of a random variable and later see whether we can determine the distribution function using them.
Example 0.3 (1) Let $X$ be Bernoulli $(p)$. Then we have seen that $EX = p$, and hence for a Bernoulli random variable, knowing the first moment itself will uniquely identify the distribution. Also note that the other moments are $EX^n = p$, $n \ge 1$. Observe the pattern of the moments, $\{p, p, \cdots\}$.
(2) Let $X$ be Binomial $(n, p)$. Then $EX = np$. This doesn't uniquely give the distribution. So let us compute the variance. To do this, we use the following. Let $X_1, X_2, \cdots, X_n$ be $n$ independent Bernoulli$(p)$ random variables. Then we know that $X_1 + \cdots + X_n$ is Binomial $(n, p)$. Hence take $X = X_1 + \cdots + X_n$. Now
\begin{align*}
E[X - EX]^2 &= \sum_{k=1}^{n} E[X_k - p]^2 \\
&= \sum_{k=1}^{n} p(1 - p) \\
&= np(1 - p) .
\end{align*}
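This variance formula is easy to check by simulation (a minimal sketch; n = 10 and p = 0.3 are assumed example values, for which np(1 - p) = 2.1):

import numpy as np

# Assumed example: n = 10, p = 0.3, so np(1 - p) = 2.1.  Simulate X as a sum of
# independent Bernoulli(p) variables and compare the sample variance with np(1 - p).
rng = np.random.default_rng(4)
n, p = 10, 0.3
x = rng.binomial(1, p, size=(200_000, n)).sum(axis=1)   # sums of n Bernoulli(p)'s
print(x.var(), n * p * (1 - p))                         # both ~ 2.1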
The moment generating function of $X$ is defined by
\[
M_X(t) = E[e^{tX}], \quad t \in I .
\]
For $X$ Bernoulli $(p)$,
\[
M_X(t) = (1 - p) + p e^t , \quad t \in \mathbb{R} .
\]
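The moments $\{p, p, \cdots\}$ noted in Example 0.3 can be recovered from $M_X$ by differentiating at $t = 0$. A rough numerical sketch (p = 0.3 is an assumed example value; finite differences approximate the first two derivatives):

import numpy as np

# Assumed example: p = 0.3.  For Bernoulli(p), M_X(t) = (1 - p) + p e^t, and the
# derivatives of M_X at t = 0 give the moments E[X^k] = p for every k >= 1.
p = 0.3
M = lambda t: (1.0 - p) + p * np.exp(t)

h = 1e-4
first = (M(h) - M(-h)) / (2.0 * h)              # ~ M'(0)  = E[X]   = p
second = (M(h) - 2.0 * M(0.0) + M(-h)) / h**2   # ~ M''(0) = E[X^2] = p
print(first, second)                            # both ~ 0.3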