Random Variables PDF
Fall 2015
Instructor: Ajit Rajwade
Topic Overview
Random variable: definition
Discrete and continuous random variables
Probability density function (pdf) and cumulative distribution function (cdf)
Random variable

In many random experiments, we are not always interested in the raw outcome itself but in some numerical quantity determined by it. A random variable X is a function that assigns a real number to each outcome of the experiment.
Example: let X be the sum of two fair dice throws. The pmf of X is:

Value of X (denoted as x)   P(X = x)
 2                          1/36
 3                          2/36
 4                          3/36
 5                          4/36
 6                          5/36
 7                          6/36
 8                          5/36
 9                          4/36
10                          3/36
11                          2/36
12                          1/36
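The table above can be reproduced by brute-force enumeration of the 36 equally likely outcomes; a minimal Python sketch:

```python
from fractions import Fraction
from itertools import product
from collections import Counter

# Enumerate all 36 equally likely outcomes of two fair dice and tally
# the sum; each outcome contributes probability 1/36.
counts = Counter(d1 + d2 for d1, d2 in product(range(1, 7), repeat=2))
pmf = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

for s, p in pmf.items():
    print(s, p)

# The pmf is symmetric about 7 and sums to 1.
assert sum(pmf.values()) == 1
```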
The set of values a random variable can take is called its alphabet. Individual values the random variable can acquire are outcomes of the underlying random experiments.
The probability that a continuous random variable X takes on a value in a set B is obtained by integrating its probability density function (pdf) f_X over B:

P\{X \in B\} = \int_B f_X(x)\,dx

Properties:

\int_{-\infty}^{\infty} f_X(x)\,dx = 1

P(X = a) = \int_a^a f_X(x)\,dx = 0

\int_{a-\epsilon/2}^{a+\epsilon/2} f_X(x)\,dx \approx \epsilon f_X(a) \quad \text{for small } \epsilon

f_X(a) = \lim_{\epsilon \to 0} \frac{P\{a - \epsilon/2 \le X \le a + \epsilon/2\}}{\epsilon}
Examples:

The Gaussian (normal) density with mean \mu and standard deviation \sigma:

f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^2/(2\sigma^2)}

The uniform density on [a, b]:

f_X(x) = \frac{1}{b-a}, \quad a \le x \le b; \qquad 0 \text{ otherwise}
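Both example densities must integrate to 1. A quick numerical check with a crude midpoint rule (the parameter values \mu = 0, \sigma = 1, a = 2, b = 5 are arbitrary choices for illustration):

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    # f_X(x) = 1/(sqrt(2*pi)*sigma) * exp(-(x-mu)^2 / (2*sigma^2))
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def uniform_pdf(x, a=2.0, b=5.0):
    # f_X(x) = 1/(b-a) on [a, b], 0 otherwise
    return 1.0 / (b - a) if a <= x <= b else 0.0

def riemann(f, lo, hi, n=200_000):
    # Crude midpoint rule; good enough to check normalization.
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

g_total = riemann(gaussian_pdf, -10, 10)   # tail beyond +-10 sigma is negligible
u_total = riemann(uniform_pdf, 0, 10)
print(g_total, u_total)                    # both close to 1.0
```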
The expected value (mean) of a continuous random variable X is

E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx
A Game of Roulette
E(g(X)) = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx

E(a\,g(X) + b) = \int_{-\infty}^{\infty} (a\,g(x) + b) f_X(x)\,dx
= a \int_{-\infty}^{\infty} g(x) f_X(x)\,dx + b \int_{-\infty}^{\infty} f_X(x)\,dx
= a\,E(g(X)) + b \quad \text{(why?)}
This property is called the linearity of the expected value. In general, a function f(x) is said to be linear in x if f(ax+b) = af(x)+b, where a and b are constants. In this case, the expected value is not a function but an operator (it takes a function as input). An operator E is said to be linear if E(af(x) + b) = a E(f(x)) + b.
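Linearity can be verified exactly on a small discrete distribution (the pmf, g, a, and b below are illustrative choices, not from the slides):

```python
from fractions import Fraction

# A small discrete rv: X takes values 1..4 with these probabilities.
pmf = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}

def expect(g):
    # E[g(X)] = sum_x g(x) P(X = x)
    return sum(g(x) * p for x, p in pmf.items())

a, b = 3, 7
g = lambda x: x * x
lhs = expect(lambda x: a * g(x) + b)   # E[a g(X) + b]
rhs = a * expect(g) + b                # a E[g(X)] + b
assert lhs == rhs
print(lhs)
```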
Let \mu = E(X) and let c be any constant. Then

E((X-c)^2) = E((X - \mu + \mu - c)^2)
= E((X-\mu)^2 + (\mu-c)^2 + 2(X-\mu)(\mu-c))
= E((X-\mu)^2) + E((\mu-c)^2) + 2 E((X-\mu)(\mu-c))
= E((X-\mu)^2) + (\mu-c)^2 + 0
\ge E((X-\mu)^2)

The expected value is the value c = \mu that yields the least mean squared prediction error!
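This is easy to confirm empirically: over a grid of candidate predictors c, the sample mean achieves the smallest mean squared error (the distribution and grid below are arbitrary choices):

```python
import random, statistics

random.seed(0)
xs = [random.gauss(5.0, 2.0) for _ in range(100_000)]
mu = statistics.fmean(xs)

def mse(c):
    # Empirical E[(X - c)^2]
    return statistics.fmean((x - c) ** 2 for x in xs)

# Scan candidate predictors around the sample mean; mu itself should win,
# since empirically mse(c) = variance + (c - mu)^2.
candidates = [mu + d / 10 for d in range(-20, 21)]
best = min(candidates, key=mse)
print(best, mu)
```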
The median

What minimizes the following quantity?

J(c) = \int_{-\infty}^{\infty} |x - c|\, f_X(x)\,dx
= \int_{-\infty}^{c} (c - x) f_X(x)\,dx + \int_{c}^{\infty} (x - c) f_X(x)\,dx
The median

Expanding, with Q(c) = \int_{-\infty}^{c} x f_X(x)\,dx (so Q(-\infty) = 0) and q(c) = Q'(c) = c f_X(c):

J(c) = 2 c F_X(c) - c - 2 Q(c) + Q(\infty)

Setting J'(c) = 0:

2 c f_X(c) + 2 F_X(c) - 1 - 2 q(c) = 0
2 c f_X(c) + 2 F_X(c) - 1 - 2 c f_X(c) = 0
2 F_X(c) - 1 = 0
F_X(c) = 1/2
This is the median by definition, and it minimizes J(c); we can double-check that it is a minimum since J''(c) = 2 f_X(c) \ge 0. Notice the peculiar definition of the median for the continuous case here! This definition is not conceptually different from the discrete case, though. Also, note that the median will not be unique if F_X(c) = 1/2 holds over an entire interval (i.e., if the density vanishes there).
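The analogous empirical fact: the sample median minimizes the mean absolute error. A sketch on a skewed sample (distribution choice is arbitrary), where the mean and median differ clearly:

```python
import random, statistics

random.seed(1)
# A skewed (exponential) sample, so the mean and median differ.
xs = [random.expovariate(1.0) for _ in range(50_001)]
med = statistics.median(xs)

def mae(c):
    # Empirical E[|X - c|]
    return statistics.fmean(abs(x - c) for x in xs)

# The sample median beats nearby candidates, and beats the sample mean.
assert mae(med) <= mae(med + 0.05)
assert mae(med) <= mae(med - 0.05)
assert mae(med) < mae(statistics.fmean(xs))
```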
Variance

The variance of a random variable X tells you how much the values of X deviate, on average, from the mean. Its (positive) square root is the standard deviation.

For some distributions, the variance (and hence standard deviation) may not be defined, because the integral may not have a finite value.

Low-variance probability mass functions or probability densities tend to be concentrated around one point. High-variance densities are spread out.
Alternative expression:

Var(X) = E[(X-\mu)^2] = E[X^2 + \mu^2 - 2\mu X]
= E[X^2] + \mu^2 - 2\mu E[X]
= E[X^2] + \mu^2 - 2\mu^2 \quad \text{(why?)}
= E[X^2] - \mu^2
= E[X^2] - (E[X])^2
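The identity Var(X) = E[X^2] - (E[X])^2 can be checked exactly on a small pmf (the pmf below is an arbitrary illustration):

```python
from fractions import Fraction

# Exact check of Var(X) = E[X^2] - (E[X])^2 on a small pmf.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 4: Fraction(1, 4)}

EX  = sum(x * p for x, p in pmf.items())                  # E[X]
EX2 = sum(x * x * p for x, p in pmf.items())              # E[X^2]
var_def = sum((x - EX) ** 2 * p for x, p in pmf.items())  # E[(X - mu)^2]

assert var_def == EX2 - EX ** 2
print(var_def)
```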
Variance: properties

Property: for constants a and b, Var(aX + b) = a^2 Var(X).
Probabilistic inequalities

Sometimes we know the mean or variance of a random variable, but not its full distribution. Probabilistic inequalities let us bound the probabilities of events using only these summary quantities.

Example: let's say the average annual salary offered to a graduating student is 100K. What can we say about the probability that a student is offered a salary of at least 110K?
Markov's inequality

Let X be a random variable that takes only non-negative values. Then, for any a > 0,

P\{X \ge a\} \le \frac{E[X]}{a}
Proof:

E[X] = \int_0^{\infty} x f_X(x)\,dx
= \int_0^{a} x f_X(x)\,dx + \int_a^{\infty} x f_X(x)\,dx
\ge \int_a^{\infty} x f_X(x)\,dx
\ge \int_a^{\infty} a f_X(x)\,dx
= a \int_a^{\infty} f_X(x)\,dx
= a P\{X \ge a\}
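A quick simulation showing the Markov bound holding (with plenty of slack) for a non-negative random variable; the exponential distribution is an arbitrary choice:

```python
import random, statistics

random.seed(2)
# Non-negative rv: exponential with mean 1.
xs = [random.expovariate(1.0) for _ in range(200_000)]
mean = statistics.fmean(xs)

for a in (0.5, 1.0, 2.0, 5.0):
    tail = sum(x >= a for x in xs) / len(xs)   # empirical P{X >= a}
    bound = mean / a                           # Markov bound E[X]/a
    print(a, tail, bound)
    assert tail <= bound
```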
Chebyshev's inequality

For a random variable X with mean \mu and variance \sigma^2, and any k > 0,

P\{|X - \mu| \ge k\} \le \frac{\sigma^2}{k^2}

Proof: follows from Markov's inequality, since (X - \mu)^2 is a non-negative random variable:

P\{(X-\mu)^2 \ge k^2\} \le E[(X-\mu)^2]/k^2 = \sigma^2/k^2
\therefore P\{|X - \mu| \ge k\} \le \sigma^2/k^2

If I replace k by k\sigma, I get the following:

P\{|X - \mu| \ge k\sigma\} \le \frac{1}{k^2}
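The bound 1/k^2 is distribution-free, so it should hold for any sample; a sketch using a Gaussian sample (an arbitrary choice, for which the true tails are far smaller than the bound):

```python
import random, statistics

random.seed(3)
xs = [random.gauss(0.0, 1.0) for _ in range(200_000)]
mu, sigma = statistics.fmean(xs), statistics.pstdev(xs)

for k in (1.5, 2.0, 3.0):
    # Empirical P{|X - mu| >= k*sigma} vs. the Chebyshev bound 1/k^2.
    tail = sum(abs(x - mu) >= k * sigma for x in xs) / len(xs)
    print(k, tail, 1.0 / k ** 2)
    assert tail <= 1.0 / k ** 2
```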
Back to the salary example, with mean \mu = 100K. Markov's inequality gives

P\{X \ge 110K\} \le \frac{100K}{110K} = 0.9090 \approx 90\%

If we additionally know the variance, say \sigma^2 = 50K, Chebyshev's inequality gives

P\{|X - 100K| \ge 10K\} \le \frac{50K}{10K \times 10K} = 0.0005 = 0.05\%

P\{|X - 100K| < 10K\} \ge 1 - 0.05\% = 99.95\%
The (weak) law of large numbers: https://en.wikipedia.org/wiki/Law_of_large_numbers

For i.i.d. random variables X_1, X_2, \ldots, X_n with mean \mu, and any \epsilon > 0,

P\left\{\left|\frac{X_1 + X_2 + \ldots + X_n}{n} - \mu\right| > \epsilon\right\} \to 0 \text{ as } n \to \infty
The quantity \frac{X_1 + X_2 + \ldots + X_n}{n} is called the empirical (or sample) mean. The weak law follows from Chebyshev's inequality. Assuming the X_i are i.i.d. with variance \sigma^2:

E\left(\frac{X_1 + X_2 + \ldots + X_n}{n}\right) = \mu, \qquad Var\left(\frac{X_1 + X_2 + \ldots + X_n}{n}\right) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}

P\left\{\left|\frac{X_1 + X_2 + \ldots + X_n}{n} - \mu\right| \ge \epsilon\right\} \le \frac{\sigma^2}{n\epsilon^2}

\lim_{n \to \infty} P\left\{\left|\frac{X_1 + X_2 + \ldots + X_n}{n} - \mu\right| \ge \epsilon\right\} = 0
The strong law of large numbers states:

P\left(\lim_{n \to \infty} \frac{X_1 + X_2 + \ldots + X_n}{n} = \mu\right) = 1

This is stronger than the weak law because it states that the sample mean converges to \mu with probability 1, not merely in probability.
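A simulation of the law of large numbers with fair-die throws (true mean 3.5); the sample sizes are arbitrary:

```python
import random, statistics

random.seed(4)

def sample_mean(n):
    # Empirical mean of n fair-die throws; true mean is 3.5.
    return statistics.fmean(random.randint(1, 6) for _ in range(n))

# Deviation of the sample mean from the true mean at increasing n.
errs = {n: abs(sample_mean(n) - 3.5) for n in (100, 10_000, 1_000_000)}
print(errs)
```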
Joint distributions/pdfs/pmfs
Joint CDFs
Given continuous random variables X and Y, their joint cumulative distribution function (CDF) is defined as

F_{XY}(x, y) = P(X \le x, Y \le y)

The distribution of either random variable alone (called the marginal distribution) is obtained from the joint CDF:

F_X(x) = P(X \le x, Y \le \infty) = F_{XY}(x, \infty)
F_Y(y) = P(X \le \infty, Y \le y) = F_{XY}(\infty, y)
Joint PMFs

Given two discrete random variables X and Y, their joint probability mass function is

p(x_i, y_j) = P\{X = x_i, Y = y_j\}

and the marginal pmf of X is recovered by summing over the values of Y:

P\{X = x_i\} = \sum_j p(x_i, y_j)

Why?
Example: suppose 15% of families have no children, 20% have only one child, 35% have two children and 30% have three children. Let us suppose that male and female children are equally likely and independent. Let B and G denote the number of boys and girls in a randomly chosen family.

What is the probability that a randomly chosen family has no children?
P(B = 0, G = 0) = 0.15 = P(no children)
Has 1 girl child?
P(B=0,G=1)=P(1 child) P(G=1|1 child) = 0.2 x 0.5 = 0.1
Has 3 girls?
P(B = 0, G = 3) = P(3 children) P(G = 3 | 3 children) = 0.3 x (0.5)^3 = 0.0375
Has 2 boys and 1 girl?
P(B = 2, G = 1) = P(3 children) P(B = 2, G = 1 | 3 children) = 0.3 x (1/8) x 3 = 0.1125 (all 8 combinations of 3 children are equally likely; out of these, 3 are of the form 2 boys + 1 girl)
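The whole joint pmf of (B, G) can be built by enumerating sex sequences; a sketch of the example above:

```python
from fractions import Fraction
from itertools import product

# Joint pmf of (B, G) = (#boys, #girls), with 15%/20%/35%/30% of families
# having 0/1/2/3 children and each child a boy or girl with prob. 1/2.
n_children = {0: Fraction(15, 100), 1: Fraction(20, 100),
              2: Fraction(35, 100), 3: Fraction(30, 100)}

pmf = {}
for n, pn in n_children.items():
    # Given n children, each of the 2^n sex sequences is equally likely.
    for seq in product("BG", repeat=n):
        key = (seq.count("B"), seq.count("G"))
        pmf[key] = pmf.get(key, Fraction(0)) + pn / 2 ** n

assert sum(pmf.values()) == 1
assert pmf[(0, 0)] == Fraction(15, 100)                   # no children
assert pmf[(0, 1)] == Fraction(1, 10)                     # one girl
assert pmf[(0, 3)] == Fraction(30, 100) * Fraction(1, 8)  # three girls
assert pmf[(2, 1)] == Fraction(30, 100) * Fraction(3, 8)  # 2 boys, 1 girl
```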
Joint PDFs

For two jointly continuous random variables X and Y with joint pdf f_{XY}, the probability that (X, Y) lies in a region C is

P\{(X, Y) \in C\} = \iint_{(x,y) \in C} f_{XY}(x, y)\,dx\,dy

The joint CDF is obtained as follows:

F_{XY}(a, b) = \int_{-\infty}^{a} \int_{-\infty}^{b} f_{XY}(x, y)\,dy\,dx

and conversely,

f_{XY}(a, b) = \left.\frac{\partial^2 F_{XY}(x, y)}{\partial x\,\partial y}\right|_{x=a,\,y=b}
The joint probability that (X, Y) belongs to any arbitrary-shaped region in the XY-plane is obtained by integrating the joint pdf of (X, Y) over that region (e.g., region C).
The marginal densities are obtained by integrating out the other variable:

f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dy

f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx

F_X(a) = F_{XY}(a, \infty) = \int_{-\infty}^{a} \int_{-\infty}^{\infty} f_{XY}(x, y)\,dy\,dx
Pairwise independence does not imply mutual independence! Even if, for all pairs (x_i, x_j) with 1 \le i \le n, 1 \le j \le n, i \ne j,

f_{X_i, X_j}(x_i, x_j) = f_{X_i}(x_i) f_{X_j}(x_j),

the variables X_1, \ldots, X_n need not be mutually independent.
Concept of covariance

The covariance of two random variables X and Y is defined as follows:

Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]

Further expansion: Cov(X, Y) = E[XY] - E[X]E[Y].

Properties:

Cov\left(\sum_i X_i, Y\right) = \sum_i Cov(X_i, Y)

Cov\left(\sum_i X_i, \sum_j Y_j\right) = \sum_i \sum_j Cov(X_i, Y_j)

Var\left(\sum_i X_i\right) = Cov\left(\sum_i X_i, \sum_j X_j\right)
= \sum_i Cov(X_i, X_i) + \sum_i \sum_{j \ne i} Cov(X_i, X_j)
= \sum_i Var(X_i) + \sum_i \sum_{j \ne i} Cov(X_i, X_j)
If X and Y are independent, then

E[XY] = \sum_i \sum_j x_i y_j P\{X = x_i\} P\{Y = y_j\}
= \left(\sum_i x_i P\{X = x_i\}\right) \left(\sum_j y_j P\{Y = y_j\}\right)
= E[X] E[Y]

and hence Cov(X, Y) = 0.
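The variance-of-a-sum identity holds exactly for empirical moments too; a sketch on a pair of correlated samples (the construction of Y from X is an arbitrary choice):

```python
import random, statistics

random.seed(5)
n = 100_000
X = [random.gauss(0, 1) for _ in range(n)]
Y = [x + random.gauss(0, 1) for x in X]   # Y is correlated with X

def cov(a, b):
    # Empirical Cov(a, b) = mean of (a_i - mean_a)(b_i - mean_b)
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return statistics.fmean((ai - ma) * (bi - mb) for ai, bi in zip(a, b))

S = [x + y for x, y in zip(X, Y)]
lhs = statistics.pvariance(S)
rhs = statistics.pvariance(X) + statistics.pvariance(Y) + 2 * cov(X, Y)
# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
assert abs(lhs - rhs) < 1e-8
```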
Conditional pdf/cdf/pmf

Given random variables X and Y with joint pdf f_{XY}(x, y), the conditional pdf of X given Y = y is

f_{X|Y}(x | y) = \frac{f_{XY}(x, y)}{f_Y(y)}

and the conditional CDF is

F_{X|Y}(x | y) = \int_{-\infty}^{x} \frac{f_{X,Y}(z, y)}{f_Y(y)}\,dz = \int_{-\infty}^{x} f_{X|Y}(z | y)\,dz

(see http://math.arizona.edu/~jwatkins/m-conddist.pdf)

The conditional mean and variance of X given Y = y are

E(X | Y = y) = \int_{-\infty}^{\infty} x f_{X|Y}(x | y)\,dx

Var(X | Y = y) = \int_{-\infty}^{\infty} (x - E(X | Y = y))^2 f_{X|Y}(x | y)\,dx
Example

f(x, y) = 2.4\,x\,(2 - x - y), \quad 0 \le x \le 1, \; 0 \le y \le 1; \qquad 0 \text{ otherwise}

Find the conditional density of X given Y = y.
Find the conditional mean of X given Y = y.
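The example can be checked numerically. The closed forms in the comments are my own derivation (obtained by integrating over x), not from the slides:

```python
# Numerical check of the worked example f(x, y) = 2.4 x (2 - x - y) on the
# unit square. Derived closed forms (assumptions, verified below):
#   f_Y(y)       = 1.6 - 1.2 y
#   f_{X|Y}(x|y) = 2.4 x (2 - x - y) / (1.6 - 1.2 y)
#   E(X | Y = y) = (1.0 - 0.8 y) / (1.6 - 1.2 y)

def f(x, y):
    return 2.4 * x * (2 - x - y) if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

def integrate(g, n=100_000):
    # Midpoint rule on [0, 1].
    h = 1.0 / n
    return sum(g((i + 0.5) * h) for i in range(n)) * h

y = 0.3
fy = integrate(lambda x: f(x, y))                  # marginal f_Y(y)
cond_mean = integrate(lambda x: x * f(x, y)) / fy  # E(X | Y = y)

assert abs(fy - (1.6 - 1.2 * y)) < 1e-6
assert abs(cond_mean - (1.0 - 0.8 * y) / (1.6 - 1.2 * y)) < 1e-6
```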