Random Variables
Examples
(i) The sum of two dice.
(ii) The length of time I have to wait at the bus
stop for a #2 bus.
(iii) The number of heads in 20 flips of a coin.
Definition. A random variable, X, is a
function from the sample space S to the real
numbers, i.e., X is a rule which assigns a number
X(s) for each outcome s ∈ S.
Example. For S = {(1, 1), (1, 2), . . . , (6, 6)} the
random variable X corresponding to the sum is
X(1, 1) = 2, X(1, 2) = 3, and in general
X(i, j) = i + j.
Note. A random variable is neither random nor a
variable. Formally, it is a function defined on S.
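To make the function viewpoint concrete, here is a small Python sketch (an illustration of mine, not part of the original notes): it represents the sum-of-two-dice random variable as a literal function on S and tabulates P(X = x) by counting outcomes.

    from fractions import Fraction
    from collections import Counter

    # Sample space for two dice: all ordered pairs (i, j)
    S = [(i, j) for i in range(1, 7) for j in range(1, 7)]

    # The random variable X is literally a function on S
    def X(s):
        i, j = s
        return i + j

    # P(X = x) = (# of outcomes s with X(s) = x) / |S|
    counts = Counter(X(s) for s in S)
    for x, n in sorted(counts.items()):
        print(x, Fraction(n, len(S)))   # e.g. 2 -> 1/36, 7 -> 1/6, 12 -> 1/36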
Defining events via random variables
• Notation: we write X = x for the event
{s ∈ S : X(s) = x}.
• This is different from the usual use of equality
for functions. Formally, X is a function X(s).
What does it usually mean to write f(s) = x?
Remarks
• For any random quantity X of interest, we can
take S to be the set of values that X can take.
Then, X is formally the identity function,
X(s) = s. Sometimes this is helpful, sometimes
not.
Example. For the sum of two dice, we could
take S = {2, 3, . . . , 12}.
• It is important to distinguish between random
variables and the values they take. A
realization is a particular value taken by a
random variable.
• Conventionally, we use UPPER CASE for
random variables, and lower case (or numbers)
for realizations. So, {X = x} is the event that
the random variable X takes the specific value x.
Here, x is a fixed value, chosen in advance, which does
not depend on the outcome s ∈ S.
Discrete Random Variables
Definition: X is discrete if its possible values
form a finite or countably infinite set.
Definition: If X is a discrete random variable,
then the function
p(x) = P(X = x)
is called the probability mass function (p.m.f.) of X.
Example. The probability mass function of a
random variable X is given by p(i) = c · λ^i / i!
for i = 0, 1, 2, . . ., where λ is some positive value.
Find
(i) P(X = 0)
Example. A baker makes 10 cakes on a given day.
Let X be the number sold. The baker estimates
that X has p.m.f.
p(k) = 1/20 + k/100,  k = 0, 1, . . . , 10
Is this a plausible probability model?
Hint. Recall that \sum_{i=1}^{n} i = n(n + 1)/2. How do
you prove this?
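As a quick check (my sketch, not from the notes), a valid p.m.f. must be nonnegative and sum to 1 over k = 0, . . . , 10; using the hint, the sum here is 11/20 + 55/100 = 1.1.

    from fractions import Fraction

    # Candidate p.m.f.: p(k) = 1/20 + k/100 for k = 0, 1, ..., 10
    p = [Fraction(1, 20) + Fraction(k, 100) for k in range(11)]

    print(sum(p))       # 11/10: the probabilities sum to 1.1, not 1
    print(sum(p) == 1)  # False, so this is not a valid probability model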
Discrete distributions
• A discrete distribution is a probability mass
function, i.e. a set of values x_1, x_2, . . . and
p(x_1), p(x_2), . . . with 0 < p(x_i) ≤ 1 and
\sum_i p(x_i) = 1.
The Cumulative Distribution Function
Definition: The c.d.f. of X is
F(x) = P(X ≤ x), for −∞ < x < ∞.
• The c.d.f. contains the same information (for a
discrete distribution) as the p.m.f., since
F(x) = \sum_{x_i ≤ x} p(x_i)
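The displayed relation translates directly into code. A minimal sketch (mine), using a fair die's p.m.f. for illustration:

    from fractions import Fraction

    # p.m.f. of a fair die: p(x) = 1/6 for x = 1, ..., 6
    pmf = {x: Fraction(1, 6) for x in range(1, 7)}

    def F(t, pmf):
        # c.d.f.: sum of p(x_i) over all possible values x_i <= t
        return sum(p for x, p in pmf.items() if x <= t)

    print(F(0, pmf), F(3, pmf), F(3.5, pmf), F(6, pmf))  # 0, 1/2, 1/2, 1

Note that F is constant between the possible values and jumps at each of them.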
Example Continued. Flip a fair coin until a
head occurs. Let X be the length of the sequence.
Find the c.d.f. of X, and plot it.
Solution
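A hedged sketch of the solution: with a fair coin, P(X = k) = (1/2)^k for k = 1, 2, . . ., so the c.d.f. should be F(k) = 1 − (1/2)^k, a staircase jumping at each positive integer. A numerical check in place of the plot:

    # X = number of flips of a fair coin until the first head
    for k in range(1, 6):
        F_k = sum(0.5 ** j for j in range(1, k + 1))  # partial sum of the p.m.f.
        print(k, F_k, 1 - 0.5 ** k)                   # the two columns agree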
Properties of the c.d.f.
Let X be a discrete RV with possible values
x_1, x_2, . . . and c.d.f. F(x).
• 0 ≤ F(x) ≤ 1. Why?
Functions of a random variable
• Let X be a discrete RV with possible values
x_1, x_2, . . . and p.m.f. p_X(x).
• Let Y = g(X) for some function g mapping real
numbers to real numbers. Then Y is the random
variable such that Y(s) = g(X(s)) for each
s ∈ S. Equivalently, Y is the random variable
such that if X takes the value x, Y takes the
value g(x).
Expectation
Consider the following game. A fair die is rolled,
with the payoffs being...
(n/6) × 5 + (n/2) × 10 + (n/3) × 15 = 10.83 n
Expectation of Discrete Random Variables
Definition. Let X be a discrete random variable
with possible values x_1, x_2, . . . and p.m.f. p(x).
The expected value of X is
E(X) = \sum_i x_i p(x_i)
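A small sketch (mine), assuming the payoff game implied by the displayed average: win 5, 10, or 15 with probabilities 1/6, 1/2, 1/3 respectively.

    from fractions import Fraction

    pmf = {5: Fraction(1, 6), 10: Fraction(1, 2), 15: Fraction(1, 3)}

    # E(X) = sum_i x_i p(x_i)
    EX = sum(x * p for x, p in pmf.items())
    print(EX, float(EX))  # 65/6, about 10.83: the long-run payoff per play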
Expectation of a Function of a RV
• If X is a discrete random variable, and g is a
function taking real numbers to real numbers,
then g(X) is a discrete random variable also.
• If X has probability p(x_i) of taking value x_i,
then g(X) does not necessarily take value g(x_i)
with probability p(g(x_i)). Why? Nevertheless,
Proposition. E[g(X)] = \sum_i g(x_i) p(x_i)
Proof.
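A numerical sanity check of the proposition (my sketch): take X a fair die and g(x) = (x − 3)^2, so that g collapses several values of X to one value of Y, and compute E[g(X)] both ways.

    from fractions import Fraction
    from collections import defaultdict

    pmf_X = {x: Fraction(1, 6) for x in range(1, 7)}  # fair die

    def g(x):
        return (x - 3) ** 2

    # Way 1: E[g(X)] = sum_i g(x_i) p(x_i)
    way1 = sum(g(x) * p for x, p in pmf_X.items())

    # Way 2: build the p.m.f. of Y = g(X), then average over it.
    # Several x's map to the same y (e.g. P(Y = 1) = p(2) + p(4)),
    # which is why the probabilities must be aggregated.
    pmf_Y = defaultdict(Fraction)
    for x, p in pmf_X.items():
        pmf_Y[g(x)] += p
    way2 = sum(y * p for y, p in pmf_Y.items())

    print(way1, way2, way1 == way2)  # 19/6, 19/6, True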
Example 1. Let X be the value of a fair die.
(i) Find E(X).
Solution.
Example. There are two questions in a quiz
show. You get to choose the order to answer
them. If you try question 1 first, then you will be
allowed to go on to question 2 only if your
answer to question 1 is correct, and vice versa.
The rewards for these two questions are V_1 and V_2.
If the probabilities that you know the answers to
the two questions are p_1 and p_2, which question
should be chosen first?
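A hedged sketch of the comparison, under the natural reading that you collect a question's reward only if you know its answer, and that knowing the two answers are independent events:

    def expected_reward(p_a, V_a, p_b, V_b):
        # Attempt a first; you may try b only if a was answered correctly
        return p_a * V_a + p_a * p_b * V_b

    # Hypothetical numbers for illustration
    p1, V1, p2, V2 = 0.8, 100, 0.4, 400

    r12 = expected_reward(p1, V1, p2, V2)  # question 1 first
    r21 = expected_reward(p2, V2, p1, V1)  # question 2 first
    print(r12, r21)  # 208.0 vs 192.0: here question 1 should go first

    # The general criterion reduces to: choose question 1 first iff
    # p1*V1*(1 - p2) > p2*V2*(1 - p1)
    print(p1 * V1 * (1 - p2) > p2 * V2 * (1 - p1))  # True, consistent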
Two intuitive properties of expectation
• The formula for expectation is the same as the
formula for the center of mass, when objects of
mass pi are put at position xi . In other words,
the expected value is the balancing point for the
graph of the probability mass function.
• The distribution of X is symmetric about
some point µ if p(µ + x) = p(µ − x) for every x.
In that case the balancing-point picture gives
E(X) = µ, provided the expectation exists.
Variance
• Expectation gives a measure of the center of a
distribution. Variance is a measure of spread.
Definition. Var(X) = E[(X − µ)^2], where µ = E(X).
Proposition. Var(X) = E[X^2] − (E[X])^2.
Proof.
Proposition. For any RV X and constants a, b,
Var(aX + b) = a^2 Var(X)
Proof.
Standard Deviation
• We might prefer to measure the spread of X in
the same units as X. The standard deviation,
SD(X) = √Var(X), does exactly this.
Example: standardization. Let X be a
random variable with expected value µ and
standard deviation σ. Find the expected value
and variance of Y = (X − µ)/σ.
Solution.
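As a numerical check (my sketch, with X the value of a fair die): standardizing should give E(Y) = 0 and Var(Y) = 1.

    xs = range(1, 7)          # fair die
    p = 1 / 6
    mu = sum(x * p for x in xs)
    sigma = sum((x - mu) ** 2 * p for x in xs) ** 0.5

    ys = [(x - mu) / sigma for x in xs]    # realizations of Y = (X - mu)/sigma
    EY = sum(y * p for y in ys)
    VarY = sum((y - EY) ** 2 * p for y in ys)
    print(round(EY, 10), round(VarY, 10))  # ~0 and ~1, up to rounding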
Bernoulli Random Variables
• The result of an experiment with two possible
outcomes (e.g. flipping a coin) can be classified
as either a success (with probability p) or a
failure (with probability 1 − p). Let X = 1 if
the experiment is a success and X = 0 if it is a
failure. Then the p.m.f. of X is p(1) = p,
p(0) = 1 − p.
• If the p.m.f. of a random variable can be
written as above, it is said to be Bernoulli with
parameter p.
• We write X ∼ Bernoulli(p).
Binomial Random Variables
Definition. Let X be the number of successes in
n independent experiments, each of which is a
success (with probability p) or a failure (with
probability 1 − p). X is said to be a binomial
random variable with parameters (n, p). We write
X ∼ Binomial(n, p).
The p.m.f. of the binomial distribution
• We write 1 for a success, 0 for a failure, so e.g.
for n = 3, the sample space is
S = {000, 001, 010, 100, 011, 101, 110, 111}.
• The probability of any particular sequence with
k successes (so n − k failures) is p^k (1 − p)^{n−k}.
• There are \binom{n}{k} such sequences, so
P(X = k) = \binom{n}{k} p^k (1 − p)^{n−k}, for k = 0, 1, . . . , n.
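Putting the counting argument together (a sketch of mine, using Python's math.comb): there are \binom{n}{k} sequences with k successes, each of probability p^k (1 − p)^{n−k}.

    from math import comb

    def binom_pmf(k, n, p):
        # C(n, k) sequences, each with probability p^k (1-p)^(n-k)
        return comb(n, k) * p**k * (1 - p)**(n - k)

    n, p = 3, 0.5
    print([binom_pmf(k, n, p) for k in range(n + 1)])     # [0.125, 0.375, 0.375, 0.125]
    print(sum(binom_pmf(k, n, p) for k in range(n + 1)))  # 1.0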
Example. A die is rolled 12 times. Find an
expression for the chance that 6 appears 3 or
more times.
Solution.
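A sketch of the computation (mine): with X ∼ Binomial(12, 1/6) counting sixes, pass to the complement, P(X ≥ 3) = 1 − P(X ≤ 2).

    from math import comb

    n, p = 12, 1 / 6
    prob = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3))
    print(prob)  # about 0.32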
The Binomial Theorem. Suppose that
X ∼ Binomial(n, p). Since \sum_{k=0}^{n} P(X = k) = 1,
we get the identity
\sum_{k=0}^{n} \binom{n}{k} p^k (1 − p)^{n−k} = 1
Expectation of the binomial distribution
Let X ∼ Binomial(n, p). What do you think the
expected value of X ought to be? Why?
Variance of the binomial distribution.
Let X ∼ Binomial(n, p). Show that
Var(X) = np(1 − p)
Solution. We know (E[X])^2 = n^2 p^2. We have to
find E[X^2].
Discussion problem. A system of n satellites
works if at least k satellites are working. On a
cloudy day, each satellite works independently
with probability p_1 and on a clear day with
probability p_2. If the chance of being cloudy is α,
what is the chance that the system will be
working?
Solution
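A hedged sketch of the computation: condition on the weather and apply the law of total probability. The numbers below are hypothetical, chosen only for illustration.

    from math import comb

    def p_at_least(n, k, p):
        # P(at least k of n satellites work), each working independently with prob p
        return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

    n, k, p1, p2, alpha = 10, 7, 0.6, 0.9, 0.3   # hypothetical values

    # P(works) = alpha * P(>= k | cloudy) + (1 - alpha) * P(>= k | clear)
    print(alpha * p_at_least(n, k, p1) + (1 - alpha) * p_at_least(n, k, p2))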
Binomial probabilities for large n, small p.
Let X ∼ Binomial(N, p/N ). We look for a limit
as N becomes large.
(1). Write out the binomial probability. Take
limits, recalling that the limit of a product is the
product of the limits.
(2). Note that log[ lim_{N→∞} (1 − p/N)^N ] =
lim_{N→∞} log[ (1 − p/N)^N ]. Why?
(3). Hence, show that lim_{N→∞} (1 − p/N)^N = e^{−p}.
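A numerical check of step (3) (my sketch): (1 − p/N)^N settles down to e^{−p} as N grows.

    from math import exp

    p = 2.0
    for N in [10, 100, 1000, 10000]:
        print(N, (1 - p / N) ** N)   # creeps up toward e^(-2)
    print(exp(-p))                   # 0.1353...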
The Poisson Distribution
• Binomial distributions with large n and small p
occur often in the natural world.
Example. The probability of a product being
defective is 0.001. Compare the binomial
distribution with the Poisson approximation for
finding the probability that a sample of 1000
items contains exactly 2 defective items.
Solution
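A sketch of the comparison (mine): the exact Binomial(1000, 0.001) probability against the Poisson approximation with λ = np = 1.

    from math import comb, exp, factorial

    n, p, k = 1000, 0.001, 2
    lam = n * p   # lambda = np = 1

    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    poisson = lam**k * exp(-lam) / factorial(k)
    print(binom, poisson)   # about 0.18403 vs 0.18394: very close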
Discussion problem. A cosmic ray detector
counts, on average, ten events per day. Find the
chance that no more than three are recorded on a
particular day.
Solution. It may be surprising that there is
enough information in this question to provide a
reasonably unambiguous answer!
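The point is that counts of rare, independent events essentially force a Poisson model; a sketch (mine) assuming X ∼ Poisson(λ = 10) events per day:

    from math import exp, factorial

    lam = 10
    prob = sum(lam**k * exp(-lam) / factorial(k) for k in range(4))  # P(X <= 3)
    print(prob)   # about 0.0103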
Expectation of the Poisson distribution
• Let X ∼ Poisson(λ), so P(X = k) = λ^k e^{−λ} / k!.
Since X is approximately Binomial(N, λ/N), it
would not be surprising to find that
E[X] = N × (λ/N) = λ.
Variance of the Poisson distribution
• Let X ∼ Poisson(λ), so P(X = k) = λ^k e^{−λ} / k!.
• The Binomial(N, λ/N) approximation suggests
Var(X) = lim_{N→∞} N × (λ/N) × (1 − λ/N) = λ.
• We can find E[X^2] by direct computation to
check this:
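Since the slide leaves the direct computation open, here is a numerical stand-in (mine): truncate the series for E[X] and E[X^2] and confirm Var(X) = λ.

    from math import exp, factorial

    lam = 3.0
    # Terms are negligible long before k = 100, so truncation is safe
    EX  = sum(k * lam**k * exp(-lam) / factorial(k) for k in range(100))
    EX2 = sum(k**2 * lam**k * exp(-lam) / factorial(k) for k in range(100))
    print(EX, EX2 - EX**2)   # both about 3.0: E[X] = lambda and Var(X) = lambda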
The Geometric Distribution
Definition. Independent trials (e.g. flips of a
coin), each succeeding with probability p, are
performed until a success occurs. Let X be the
number of trials required. We write X ∼ Geometric(p).
• P(X = k) = p (1 − p)^{k−1}, for k = 1, 2, . . .
• E(X) = 1/p and Var(X) = (1 − p)/p^2
Why?
Exercise. Let X ∼ Geometric(p). Derive the
expected value of X.
Solution.
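A numerical check (mine) of E(X) = 1/p, truncating the series \sum_k k p (1 − p)^{k−1} far out in the tail:

    p = 0.25
    EX = sum(k * p * (1 - p) ** (k - 1) for k in range(1, 2000))
    print(EX, 1 / p)   # both essentially 4.0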
Example 1. Suppose a fuse lasts for a number of
weeks X and X ∼ Geometric(1/52), so the
expected lifetime is E(X) = 52 weeks (≈ 1 year).
Should I replace it if it is still working after two
years?
Solution
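A hedged sketch of the key fact (the geometric distribution is memoryless, so the fuse's age should not matter): compare P(X > s + t | X > s) with P(X > t).

    # For X ~ Geometric(p), P(X > k) = (1 - p)^k
    p = 1 / 52
    s, t = 104, 52    # survived two years; look one year ahead

    cond = (1 - p) ** (s + t) / (1 - p) ** s   # P(X > s+t | X > s)
    print(cond, (1 - p) ** t)                  # identical: no reason to replace it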
The Negative Binomial Distribution
Definition. For a sequence of independent trials
with chance p of success, let X be the number of
trials until r successes have occurred. Then X has
the negative binomial distribution,
X ∼ NegBin(p, r), with p.m.f.
P(X = k) = \binom{k−1}{r−1} p^r (1 − p)^{k−r}
for k = r, r + 1, r + 2, . . .
Example. One person in six is prepared to
answer a survey. Let X be the number of people
asked in order to get 20 responses. What are the
mean and SD of X?
Solution.
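A sketch of the computation (mine), using the standard formulas E(X) = r/p and Var(X) = r(1 − p)/p^2 for this form of the negative binomial (they are not derived on the slide):

    from fractions import Fraction

    r, p = 20, Fraction(1, 6)

    mean = r / p                       # 120 people on average
    sd = (r * (1 - p) / p**2) ** 0.5   # sqrt of 600
    print(mean, sd)                    # 120 and about 24.5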
The Hypergeometric Distribution
Definition. n balls are drawn randomly without
replacement from an urn containing N balls of
which m are white and N − m black. Let X be
the number of white balls drawn. Then X has the
hypergeometric distribution,
X ∼ Hypergeometric(m, n, N ).
• P(X = k) = \binom{m}{k} \binom{N−m}{n−k} / \binom{N}{n},
for k = 0, 1, . . . , m.
• E(X) = mn/N = np and
Var(X) = ((N − n)/(N − 1)) np(1 − p), where p = m/N.
• Useful for analyzing sampling procedures.
• N here is not a random variable. We try to use
capital letters only for random variables, but this
convention is sometimes violated.
Example: Capture-recapture experiments.
An unknown number of animals, say N , inhabit a
certain region. To obtain some information about
the population size, ecologists catch a number of
them, say m, mark them, and release them.
Later, n more are captured. Let X be the number
of marked animals in the second capture. What is
the most likely value of N ?
Solution.
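A hedged sketch of one standard approach: treat the observed count x as fixed, evaluate the hypergeometric likelihood P(X = x) as a function of N, and pick the N that maximizes it (which comes out near mn/x). The field numbers below are hypothetical.

    from math import comb

    def likelihood(N, m, n, x):
        # P(X = x): x marked among the n recaptured, out of N total with m marked
        if N - m < n - x:
            return 0.0
        return comb(m, x) * comb(N - m, n - x) / comb(N, n)

    m, n, x = 50, 41, 4   # hypothetical: 50 marked, 41 recaptured, 4 marked among them

    N_hat = max(range(m + n - x, 5000), key=lambda N: likelihood(N, m, n, x))
    print(N_hat, m * n / x)   # 512 and 512.5: the most likely N is about m*n/x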