
Introduction to Probability Theory

K. Suresh Kumar
Department of Mathematics
Indian Institute of Technology Bombay

October 8, 2017

LECTURES 20-21

Theorem 0.1 Let X be a continuous non-negative random variable with pdf f. Then
$$ EX = \int_0^{\infty} x f(x)\, dx, $$
provided the integral on the right-hand side exists.

Proof. By using the simple functions given in the proof of Theorem 7.4, we get
$$\begin{aligned}
EX &= \lim_{n\to\infty} \sum_{k=0}^{n2^n-1} \frac{k}{2^n}\, P\Big(\Big\{\frac{k}{2^n} \le X(\omega) < \frac{k+1}{2^n}\Big\}\Big) \\
&= \lim_{n\to\infty} \sum_{k=0}^{n2^n-1} \frac{k}{2^n}\Big[F\Big(\frac{k+1}{2^n}\Big) - F\Big(\frac{k}{2^n}\Big)\Big] \\
&= \lim_{n\to\infty} \sum_{k=0}^{n2^n-1} \frac{k}{2^n}\, f(t_k)\Big(\frac{k+1}{2^n} - \frac{k}{2^n}\Big) \\
&= \lim_{n\to\infty} \sum_{k=0}^{n2^n-1} t_k f(t_k)\Big(\frac{k+1}{2^n} - \frac{k}{2^n}\Big) - \lim_{n\to\infty} \sum_{k=0}^{n2^n-1} \Big(t_k - \frac{k}{2^n}\Big) f(t_k)\Big(\frac{k+1}{2^n} - \frac{k}{2^n}\Big), \qquad (0.1)
\end{aligned}$$
where $t_k \in \big(\frac{k}{2^n}, \frac{k+1}{2^n}\big)$ is the point given by the mean value theorem. For the second term in (0.1),
$$ 0 \le \lim_{n\to\infty} \sum_{k=0}^{n2^n-1} \Big(t_k - \frac{k}{2^n}\Big) f(t_k)\Big(\frac{k+1}{2^n} - \frac{k}{2^n}\Big) \le \lim_{n\to\infty} \frac{1}{2^n} \sum_{k=0}^{n2^n-1} f(t_k)\Big(\frac{k+1}{2^n} - \frac{k}{2^n}\Big) = \lim_{n\to\infty} \frac{1}{2^n} \int_0^{n} f(x)\, dx = 0. $$
The first term in (0.1) is a limit of Riemann sums of $x f(x)$ over $[0, n]$ with mesh $2^{-n}$ and hence converges to $\int_0^{\infty} x f(x)\, dx$. Hence
$$ EX = \int_0^{\infty} x f(x)\, dx. $$
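
As a quick numerical sanity check of Theorem 0.1, here is a minimal Python sketch (not part of the original notes; the Exponential(1) distribution, the sample size, and the use of numpy/scipy are arbitrary choices) comparing a Monte Carlo estimate of $EX$ with the integral $\int_0^\infty x f(x)\,dx$ computed by quadrature.

```python
import numpy as np
from scipy import integrate

# Sanity check of Theorem 0.1 for X ~ Exponential(1), whose pdf is
# f(x) = exp(-x) on [0, infinity); the true mean is 1.
f = lambda x: np.exp(-x)

# Right-hand side: integral of x * f(x) over [0, infinity).
rhs, _ = integrate.quad(lambda x: x * f(x), 0, np.inf)

# Left-hand side: Monte Carlo estimate of EX from simulated samples.
rng = np.random.default_rng(42)
samples = rng.exponential(scale=1.0, size=1_000_000)
lhs = samples.mean()

print(f"integral of x f(x) dx      = {rhs:.4f}")   # ~1.0
print(f"Monte Carlo estimate of EX = {lhs:.4f}")   # ~1.0
```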

Definition 7.4. Let X be a random variable on $(\Omega, \mathcal{F}, P)$. The mean or expectation of X is said to exist if either $EX^+$ or $EX^-$ is finite. In this case $EX$ is defined as
$$ EX = EX^+ - EX^-, $$
where
$$ X^+ = \max\{X, 0\}, \qquad X^- = \max\{-X, 0\}. $$
Note that $X^+$ is the positive part and $X^-$ is the negative part of X.

Theorem 0.2 Let X be a continuous random variable with finite mean and pdf f. Then
$$ EX = \int_{-\infty}^{\infty} x f(x)\, dx. $$

Proof. Set
$$ Y_n(\omega) = \begin{cases} \dfrac{k}{2^n} & \text{if } \dfrac{k}{2^n} \le X^+(\omega) < \dfrac{k+1}{2^n},\ k = 0, \dots, n2^n - 1, \\[4pt] 0 & \text{if } X^+(\omega) \ge n. \end{cases} $$
Equivalently,
$$ Y_n(\omega) = \begin{cases} \dfrac{k}{2^n} & \text{if } \dfrac{k}{2^n} \le X(\omega) < \dfrac{k+1}{2^n},\ k = 0, \dots, n2^n - 1, \\[4pt] 0 & \text{if } X(\omega) \ge n \text{ or } X(\omega) \le 0. \end{cases} $$
Then $Y_n$ is a sequence of simple random variables such that
$$ EX^+ = \lim_{n\to\infty} EY_n. $$
Similarly, set
$$ Z_n(\omega) = \begin{cases} \dfrac{k}{2^n} & \text{if } \dfrac{k}{2^n} \le X^-(\omega) < \dfrac{k+1}{2^n},\ k = 0, \dots, n2^n - 1, \\[4pt] 0 & \text{if } X^-(\omega) \ge n. \end{cases} $$
Equivalently,
$$ Z_n(\omega) = \begin{cases} \dfrac{k}{2^n} & \text{if } \dfrac{k}{2^n} \le -X(\omega) < \dfrac{k+1}{2^n},\ k = 0, \dots, n2^n - 1, \\[4pt] 0 & \text{if } X(\omega) \le -n \text{ or } X(\omega) \ge 0. \end{cases} $$
Then
$$ EX^- = \lim_{n\to\infty} EZ_n. $$
Now
$$ \lim_{n\to\infty} EY_n = \lim_{n\to\infty} \sum_{k=0}^{n2^n-1} \frac{k}{2^n}\, P\Big(\Big\{\frac{k}{2^n} \le X(\omega) < \frac{k+1}{2^n}\Big\}\Big) \qquad (0.2) $$
and
$$\begin{aligned}
-\lim_{n\to\infty} EZ_n &= -\lim_{n\to\infty} \sum_{k=0}^{n2^n-1} \frac{k}{2^n}\, P\Big(\Big\{\frac{k}{2^n} \le -X(\omega) < \frac{k+1}{2^n}\Big\}\Big) \\
&= -\lim_{n\to\infty} \sum_{k=0}^{n2^n-1} \frac{k}{2^n}\, P\Big(\Big\{\frac{-k-1}{2^n} < X(\omega) \le \frac{-k}{2^n}\Big\}\Big) \\
&= \lim_{n\to\infty} \sum_{k=-n2^n+1}^{0} \frac{k}{2^n}\, P\Big(\Big\{\frac{k-1}{2^n} < X(\omega) \le \frac{k}{2^n}\Big\}\Big) \qquad (0.3) \\
&= \lim_{n\to\infty} \sum_{k=-n2^n+1}^{0} \frac{k-1}{2^n}\, P\Big(\Big\{\frac{k-1}{2^n} < X(\omega) \le \frac{k}{2^n}\Big\}\Big).
\end{aligned}$$
The last equality follows by the arguments from the proof of Theorem 6.0.26. Combining (0.2) and (0.3), we get
$$ EX = \lim_{n\to\infty} \sum_{k=-n2^n}^{n2^n-1} \frac{k}{2^n}\, P\Big(\Big\{\frac{k}{2^n} \le X(\omega) < \frac{k+1}{2^n}\Big\}\Big). $$
Now, as in the proof of Theorem 6.0.26, we complete the proof.


We state the following useful properties of expectation. The proofs follow by an approximation argument using the corresponding properties of simple random variables.

Theorem 0.3 Let X, Y be random variables with finite mean. Then
(i) If $X \ge 0$, then $EX \ge 0$.
(ii) For $a \in \mathbb{R}$,
$$ E(aX + Y) = aEX + EY. $$
(iii) Let $Z \ge 0$ be a random variable such that $Z \le X$. Then Z has finite mean and $EZ \le EX$.
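
The properties in Theorem 0.3 are easy to check numerically; the following is a minimal sketch (my own illustration, not from the notes) of the linearity property (ii) by simulation, with the distributions Uniform(0, 2) and Exponential(1) and the constant a = 3 chosen arbitrarily.

```python
import numpy as np

# Illustration of Theorem 0.3 (ii): E(aX + Y) = a*EX + EY.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0, size=1_000_000)      # EX = 1
Y = rng.exponential(scale=1.0, size=1_000_000) # EY = 1
a = 3.0

lhs = (a * X + Y).mean()         # Monte Carlo estimate of E(aX + Y)
rhs = a * X.mean() + Y.mean()    # a*EX + EY

print(lhs, rhs)                  # both close to 3*1 + 1 = 4
```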
In the context of Riemann integration, one can recall the following convergence theorem.

"If $g_n$, $n \ge 1$, is a sequence of continuous functions defined on $[a, b]$ such that $g_n \to g$ uniformly on $[a, b]$, then
$$ \lim_{n\to\infty} \int_a^b g_n(x)\, dx = \int_a^b g(x)\, dx." $$

That is, to take the limit inside the integral one needs uniform convergence of the functions. In many situations it is highly unlikely to get uniform convergence.

In fact, uniform convergence is not required to take the limit inside an integral. This is illustrated in the following couple of theorems. Their proofs are beyond the scope of this course.

Theorem 0.4 (Monotone convergence theorem) Let $X_n$ be an increasing sequence of nonnegative random variables such that $\lim_{n\to\infty} X_n = X$. Then
$$ \lim_{n\to\infty} EX_n = EX. $$
[Here $\lim_{n\to\infty} X_n = X$ means $\lim_{n\to\infty} X_n(\omega) = X(\omega)$ for every $\omega \in \Omega$.]

Theorem 0.5 (Dominated convergence theorem) Let $X_n$, X, Y be random variables such that

(i) Y has finite mean,

(ii) $|X_n| \le Y$ for all n,

(iii) $\lim_{n\to\infty} X_n = X$.

Then
$$ \lim_{n\to\infty} EX_n = EX. $$
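
To see why the domination hypothesis (ii) cannot simply be dropped, here is a small numerical illustration (my own example in Python, not from the notes): with $U \sim$ Uniform(0, 1), the variables $X_n = n\,I_{\{U < 1/n\}}$ converge to 0 pointwise, yet $EX_n = 1$ for every n, and no random variable with finite mean dominates all of them.

```python
import numpy as np

# X_n = n * 1{U < 1/n} converges to 0 pointwise (for U > 0), but E[X_n] = 1
# for every n, so lim E[X_n] = 1 != 0 = E[lim X_n].  There is no dominating Y
# with finite mean, which is why Theorem 0.5 does not apply here.
rng = np.random.default_rng(0)
U = rng.uniform(0.0, 1.0, size=1_000_000)

for n in (10, 100, 1000, 10000):
    X_n = n * (U < 1.0 / n)          # simple random variable n * 1{U < 1/n}
    print(n, X_n.mean())             # Monte Carlo estimate of E[X_n], stays near 1
```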

Now we state the following theorem, which provides a useful tool to compute the expectation of random variables which are mixed in nature.

Theorem 0.6 Let X be a continuous random variable with pdf f and let $\varphi : \mathbb{R} \to \mathbb{R}$ be a continuous function such that the integral $\int_{-\infty}^{\infty} \varphi(x) f(x)\, dx$ is finite. Then
$$ E[\varphi \circ X] = \int_{-\infty}^{\infty} \varphi(x) f(x)\, dx. $$

Proof: First, I will give a proof for a special case, i.e. when $\varphi : \mathbb{R} \to \mathbb{R}$ is strictly increasing and differentiable. Set $Y = \varphi(X)$. Then Y has a density g given by
$$ g(y) = f(\varphi^{-1}(y))\, \frac{1}{\varphi'(\varphi^{-1}(y))}, \quad y \in \varphi(\mathbb{R}), \qquad g(y) = 0 \text{ otherwise.} $$
Hence
$$\begin{aligned}
E[\varphi(X)] &= \int_{-\infty}^{\infty} y\, g(y)\, dy \\
&= \int_{\varphi(\mathbb{R})} y\, f(\varphi^{-1}(y))\, \frac{1}{\varphi'(\varphi^{-1}(y))}\, dy \\
&= \int_{-\infty}^{\infty} \varphi(x) f(x)\, \frac{1}{\varphi'(x)}\, \varphi'(x)\, dx \qquad \text{(use } y = \varphi(x); \text{ the Jacobian is } \tfrac{dy}{dx} = \varphi'(x)\text{)} \\
&= \int_{-\infty}^{\infty} \varphi(x) f(x)\, dx.
\end{aligned}$$

Next consider $\varphi(x) = x^{2n+1}$, $n \ge 0$, and assume that $E[X^{2n+1}]$ exists. Then note that $\varphi$ is strictly increasing and differentiable. Though one can use the proof given above to conclude the result, we give a direct proof. Here
$$ \varphi^{-1}(y) = y^{\frac{1}{2n+1}}, \qquad \varphi'(x) = (2n+1)x^{2n}. $$
Hence the pdf g of $X^{2n+1}$ is given by
$$ g(y) = \frac{1}{2n+1}\, f\big(y^{\frac{1}{2n+1}}\big)\, y^{-\frac{2n}{2n+1}}, \quad y \ne 0. $$
Hence
$$\begin{aligned}
E[X^{2n+1}] &= \int_{-\infty}^{\infty} y\, g(y)\, dy \\
&= \frac{1}{2n+1} \int_{-\infty}^{\infty} y\, f\big(y^{\frac{1}{2n+1}}\big)\, y^{-\frac{2n}{2n+1}}\, dy \\
&= \int_{-\infty}^{\infty} x^{2n+1} f(x)\, dx \qquad \text{(use } y = x^{2n+1}; \text{ the Jacobian is } \tfrac{dy}{dx} = (2n+1)x^{2n}\text{)}.
\end{aligned}$$

Finally consider $\varphi(x) = x^{2n}$. Note that $\varphi$ is not one-to-one.

Set $Y = X^{2n}$ and let G, g denote respectively the distribution function and the pdf of Y. Then for $y > 0$,
$$\begin{aligned}
G(y) &= P\{Y \le y\} \\
&= P\{-y^{\frac{1}{2n}} \le X \le y^{\frac{1}{2n}}\} \\
&= F\big(y^{\frac{1}{2n}}\big) - F\big(-y^{\frac{1}{2n}}\big).
\end{aligned}$$
Hence
$$ g(y) = \frac{1}{2n}\, y^{\frac{1}{2n}-1} \Big[ f\big(y^{\frac{1}{2n}}\big) + f\big(-y^{\frac{1}{2n}}\big) \Big], \quad y > 0, \qquad g(y) = 0 \text{ for } y \le 0. $$
Therefore
$$\begin{aligned}
E[X^{2n}] &= \int_{-\infty}^{\infty} y\, g(y)\, dy \\
&= \frac{1}{2n} \int_0^{\infty} y^{\frac{1}{2n}} \Big[ f\big(y^{\frac{1}{2n}}\big) + f\big(-y^{\frac{1}{2n}}\big) \Big]\, dy.
\end{aligned}$$
Consider
$$ \frac{1}{2n} \int_0^{\infty} y^{\frac{1}{2n}} f\big(y^{\frac{1}{2n}}\big)\, dy. $$
We use the following change of variable argument. Set $y = x^{2n} =: \psi(x)$, $x > 0$. Then note that $\psi : (0, \infty) \to (0, \infty)$ is a bijective map and the Jacobian is $\psi'(x) = 2n x^{2n-1}$, $x > 0$. Hence
$$ \frac{1}{2n} \int_0^{\infty} y^{\frac{1}{2n}} f\big(y^{\frac{1}{2n}}\big)\, dy = \frac{1}{2n} \int_{\psi^{-1}((0,\infty))} x f(x)\, |\psi'(x)|\, dx = \int_0^{\infty} x^{2n} f(x)\, dx. $$
Similarly consider
$$ \frac{1}{2n} \int_0^{\infty} y^{\frac{1}{2n}} f\big(-y^{\frac{1}{2n}}\big)\, dy. $$
Set $y = x^{2n} =: \psi(x)$, $x < 0$, i.e. $\psi : (-\infty, 0) \to (0, \infty)$ is a bijective map with Jacobian $\psi'(x) = 2n x^{2n-1}$, $x < 0$; we may write it as $\psi'(x) = 2n x^{2n}/x$. Also note that $y^{\frac{1}{2n}} = |x|$ for $x < 0$. Hence
$$ \frac{1}{2n} \int_0^{\infty} y^{\frac{1}{2n}} f\big(-y^{\frac{1}{2n}}\big)\, dy = \frac{1}{2n} \int_{\psi^{-1}((0,\infty))} |x| f(x)\, |\psi'(x)|\, dx = \int_{-\infty}^{0} x^{2n} f(x)\, dx. $$
Now combining the integrals, we get the formula.


Now when $\varphi$ is a polynomial, we can prove the theorem by writing $\varphi$ as a linear combination of powers $x^n$ of appropriate order and then using the linearity property of expectation. The proof of the theorem beyond polynomials requires more sophistication, so we will not consider it in this course.

The above theorem is sometimes referred to as the "law of the unconscious statistician", since users often treat it as a definition itself.

Example 0.1 Let $X \sim U(0, 2)$ and $Y = \max\{1, X\}$. Then for $\varphi(x) = \max\{1, x\}$, we have
$$\begin{aligned}
EY &= E[\varphi \circ X] \\
&= \int_{-\infty}^{\infty} \varphi(x) f(x)\, dx \\
&= \frac{1}{2} \int_0^2 \max\{1, x\}\, dx \\
&= \frac{1}{2} + \frac{1}{2} \int_1^2 x\, dx \\
&= \frac{5}{4},
\end{aligned}$$
where f is the pdf of $U(0, 2)$.
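
A quick numerical check of Example 0.1 (an added sketch, not part of the notes; the sample size is arbitrary): estimate $E[\max\{1, X\}]$ for $X \sim U(0, 2)$ by simulation and compare with 5/4.

```python
import numpy as np

# Example 0.1: X ~ U(0, 2), Y = max{1, X}; the computed value is EY = 5/4.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 2.0, size=1_000_000)
Y = np.maximum(1.0, X)

print(Y.mean())   # close to 1.25
```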

Example 0.2 Let X be a random variable with pdf f. Find $E[X I_{\{X \le a\}}]$, where $a \in \mathbb{R}$.

Note this does not come under the realm of Theorem 0.6, because here $\varphi(x) = x I_{\{x \le a\}}$ has a discontinuity at $x = a$ for $a \ne 0$. So the discussion below gives a method to tackle $\varphi$ with discontinuities.

Consider
$$ \varphi_a(x) = (x - a) I_{\{x \le a\}}. $$
Note $\varphi_a$ is continuous, and hence using Theorem 0.6, we get
$$\begin{aligned}
E[(X - a) I_{\{X \le a\}}] &= \int_{-\infty}^{\infty} (x - a) I_{\{x \le a\}}(x) f(x)\, dx \\
&= \int_{-\infty}^{a} (x - a) f(x)\, dx \\
&= \int_{-\infty}^{a} x f(x)\, dx - a P\{X \le a\}.
\end{aligned}$$
Also
$$\begin{aligned}
E[(X - a) I_{\{X \le a\}}] &= E[X I_{\{X \le a\}}] - a E[I_{\{X \le a\}}] \\
&= E[X I_{\{X \le a\}}] - a P\{X \le a\}.
\end{aligned}$$
Now, equating the above two expressions, we get
$$ E[X I_{\{X \le a\}}] = \int_{-\infty}^{a} x f(x)\, dx. $$
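
The identity of Example 0.2 can be checked numerically; the sketch below is my own illustration (not from the notes), with the arbitrary choices $X \sim N(0, 1)$ and $a = 0.5$, comparing a Monte Carlo estimate of $E[X I_{\{X \le a\}}]$ with the integral $\int_{-\infty}^{a} x f(x)\, dx$.

```python
import numpy as np
from scipy import integrate, stats

# Check E[X 1{X <= a}] = integral_{-inf}^{a} x f(x) dx for X ~ N(0, 1), a = 0.5.
a = 0.5
rng = np.random.default_rng(2)
X = rng.standard_normal(1_000_000)

mc = (X * (X <= a)).mean()                                    # Monte Carlo estimate
quad_val, _ = integrate.quad(lambda x: x * stats.norm.pdf(x), -np.inf, a)

print(mc, quad_val)   # both close to -exp(-a**2/2)/sqrt(2*pi) ~ -0.352
```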

Exercise. Let $\varphi$ be a continuous function, $a \in \mathbb{R}$, and X a random variable with pdf f such that $E[\varphi(X)]$ is finite. Show that
$$ E[\varphi(X) I_{\{X \le a\}}] = \int_{-\infty}^{a} \varphi(x) f(x)\, dx. $$

Along similar lines to Theorem 0.6, we have the following theorem.

Theorem 0.7 Let X and Y be continuous random variables with joint pdf f and let $\varphi : \mathbb{R}^2 \to \mathbb{R}$ be a continuous function. Then
$$ E[\varphi \circ (X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \varphi(x, y) f(x, y)\, dx\, dy. $$

Here again, the proof is beyond the scope of the course, but special cases like $X + Y$, $XY$, $X^2$, etc. can be handled directly. I will do the case $\varphi(x, y) = xy$ as an example.

First note that the pdf g of $Z = XY$ is given by
$$ g(z) = \int_{-\infty}^{\infty} \frac{1}{|x|}\, f\Big(x, \frac{z}{x}\Big)\, dx, \quad z \in \mathbb{R}. $$
(If you have not yet derived this, attend to it immediately.)

Hence
$$\begin{aligned}
E[XY] &= \int_{-\infty}^{\infty} z\, g(z)\, dz \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{z}{|x|}\, f\Big(x, \frac{z}{x}\Big)\, dx\, dz \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{z}{|x|}\, f\Big(x, \frac{z}{x}\Big)\, dz\, dx \qquad \text{(change the order of integration)} \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{xy}{|x|}\, f(x, y)\, |x|\, dy\, dx \qquad \text{(put } z = xy\text{)} \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x y\, f(x, y)\, dy\, dx.
\end{aligned}$$
[In the above, justify the change of variable calculation. Hint: split the outer integral (i.e. the integral over the x variable) into $(-\infty, 0)$ and $(0, \infty)$, apply the change of variable formula separately, and then combine.]

Theorem 0.8 Let X and Y be independent random variables such that EX and EY exist. Then E[XY] exists and is given by
$$ E[XY] = EX\, EY. $$
Proof: Using the above, one can see that when X and Y are independent and have a joint pdf f, then
$$ E[XY] = EX\, EY $$
(exercise). When X and Y are discrete with joint pmf f, it is again easy to prove (exercise).
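
As a numerical illustration of Theorem 0.8 (an added sketch, not from the notes; the distributions are arbitrary choices), the following simulation compares $E[XY]$ with $EX\, EY$ for an independent pair, and shows the identity failing for a dependent pair.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000

# Independent case: X ~ U(0, 1), Y ~ Exponential(2) drawn independently.
X = rng.uniform(0.0, 1.0, size=n)
Y = rng.exponential(scale=2.0, size=n)
print((X * Y).mean(), X.mean() * Y.mean())   # both close to 0.5 * 2 = 1

# Dependent case: Z = X, so E[XZ] = E[X^2] = 1/3 while EX * EZ = 1/4.
Z = X
print((X * Z).mean(), X.mean() * Z.mean())   # ~0.333 vs ~0.25
```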
Variance and other higher order moments: In this subsection, we address the question of reconstructing the distribution of a random variable. To this end, we introduce objects called the moments of a random variable and later see whether we can determine the distribution function using them.

Definition 7.5 (Higher order moments). Let X be a random variable. Then $EX^n$ is called the nth moment of X and $E(X - EX)^n$ is called the nth central moment of X. The second central moment is called the variance.

Example 0.3 (1) Let X be Bernoulli(p). Then we have seen that EX = p, and hence for a Bernoulli random variable, knowing the first moment itself uniquely identifies the distribution. Also note that the other moments are $EX^n = p$, $n \ge 1$. Observe the pattern of the moments: $\{p, p, \cdots\}$.

(2) Let X be Binomial(n, p). Then EX = np. This does not give the distribution uniquely, so let us compute the variance. To do this, we use the following. Let $X_1, X_2, \cdots, X_n$ be n independent Bernoulli(p) random variables. Then we know that $X_1 + \cdots + X_n$ is Binomial(n, p). Hence take $X = X_1 + \cdots + X_n$. Now, since the cross terms vanish by independence,
$$ E[X - EX]^2 = \sum_{k=1}^{n} E[X_k - p]^2 = \sum_{k=1}^{n} p(1 - p) = np(1 - p). $$
Now, given EX and $E[X - EX]^2$, we can solve
$$ EX = np, \qquad E[X - EX]^2 = np(1 - p) $$
to find the parameters n and p (see the sketch after this example). Also find a few more moments (exercise).

(3) Let $X \sim N(0, 1)$. Then EX = 0, $EX^2 = 1$, and also the variance Var(X) = 1. (Exercise)
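
For item (2) above, solving the two moment equations is elementary; the small sketch below (my own, not from the notes, with the true parameters n = 10, p = 0.3 chosen arbitrarily) recovers n and p from the mean np and the variance np(1 − p) estimated from simulated Binomial data.

```python
import numpy as np

# Recover the Binomial parameters from the first two moments:
# mean = n*p and variance = n*p*(1-p)  =>  p = 1 - var/mean, n = mean/p.
rng = np.random.default_rng(4)
data = rng.binomial(n=10, p=0.3, size=1_000_000)   # true parameters n=10, p=0.3

mean, var = data.mean(), data.var()
p_hat = 1.0 - var / mean
n_hat = mean / p_hat

print(n_hat, p_hat)   # close to 10 and 0.3
```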
The above examples only tell us that if we know a priori that a given random variable is of a certain type, like Binomial, Bernoulli, or Poisson, but we do not know the parameters, then we can use its moments to determine the parameters and hence its distribution.

There is an interesting problem: given a sequence of numbers $\{a_1, a_2, \cdots\}$, can one find a distribution/distribution function whose moments are given by $\{a_n\}$? And is it unique? This is called the moment problem.

For example, we have seen that the sequence $\{p, p, \cdots\}$ corresponds to the Bernoulli(p) distribution, but we are not sure right now whether it is the only distribution with moments given by the above sequence.

Distributions that are uniquely determined by their moments are called moment determinate distributions, and the others are called moment indeterminate.

In fact, Bernoulli, Binomial, Poisson, Normal, etc. are moment determinate, but the log-normal distribution is moment indeterminate.

Chapter 8: Moment Generating and Characteristic Functions

In this chapter, we introduce the notions of the moment generating function (in short, mgf) and the characteristic function of a random variable and study their properties. Both the moment generating function and the characteristic function can be used to identify distribution functions uniquely, unlike moments. In fact, one way to understand whether a distribution is moment determinate or not is by using either the moment generating function or the characteristic function. It is interesting to note that the mgf is closely related to the Laplace transform, and the characteristic function is its counterpart, the Fourier transform.

0.1 Moment generating function

In this subsection we study moment generating functions and their properties.

Definition 8.1 Given a random variable X on a probability space $(\Omega, \mathcal{F}, P)$, its moment generating function, denoted by $M_X$, is defined as
$$ M_X(t) = E[e^{tX}], \quad t \in I, $$
where I is an interval on which the expectation on the right-hand side exists. In fact, for a non-negative random variable X, I always contains $(-\infty, 0]$. If X is a non-negative random variable such that EX does not exist, then $M_X(t)$ does not exist for $t > 0$ (exercise). An analogous comment holds for negative random variables. Moment generating functions become useful if I contains an interval containing 0.

Example 0.4 Let $X \sim$ Bernoulli(p). Then
$$ M_X(t) = (1 - p) + p e^t, \quad t \in \mathbb{R}. $$
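
A quick check of Example 0.4 (an added sketch, not from the notes; p = 0.3 and t = 0.7 are arbitrary choices): compare a Monte Carlo estimate of $E[e^{tX}]$ with the formula $(1 - p) + p e^t$, and verify numerically that the derivative of $M_X$ at 0 recovers the first moment EX = p.

```python
import numpy as np

p, t = 0.3, 0.7
M = lambda t: (1 - p) + p * np.exp(t)        # mgf of Bernoulli(p) from Example 0.4

# Monte Carlo estimate of E[exp(tX)].
rng = np.random.default_rng(5)
X = rng.binomial(n=1, p=p, size=1_000_000)   # Bernoulli(p) samples
print(np.exp(t * X).mean(), M(t))            # both close to (1-p) + p*e^t

# The derivative of the mgf at 0 recovers the first moment EX = p.
h = 1e-6
print((M(h) - M(-h)) / (2 * h))              # central difference, close to p
```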
