Gallery of Continuous Random Variables
Class 5, 18.05, Spring 2014
Jeremy Orloff and Jonathan Bloom
1 Learning Goals
1. Be able to give examples of what uniform, exponential and normal distributions are used
to model.
2. Be able to give the ranges and pdfs of uniform, exponential and normal distributions.
2 Introduction
Here we introduce a few fundamental continuous distributions. These will play important
roles in the statistics part of the class. For each distribution, we give the range, the pdf,
the cdf, and a short description of situations that it models. These distributions all depend
on parameters, which we specify.
As you look through each distribution do not try to memorize all the details; you can always
look those up. Rather, focus on the shape of each distribution and what it models.
Although it comes towards the end, we call your attention to the normal distribution. It is
easily the most important distribution defined here.
3 Uniform distribution
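1. Parameters: a, b.
2. Range: [a, b].
3. Notation: uniform(a, b).
4. Density: $f(x) = \frac{1}{b-a}$ for $a \le x \le b$.
5. Distribution: $F(x) = \frac{x-a}{b-a}$ for $a \le x \le b$.
6. Models: All outcomes in the range have equal probability (more precisely, all outcomes
have the same probability density).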
Graphs: pdf and cdf for uniform(a,b) distribution.
[Figure: the pdf f(x) has constant height 1/(b−a) between a and b; the cdf F(x) rises from 0 at x = a to 1 at x = b.]
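The shape of the uniform pdf and cdf can be checked numerically. The following is a minimal Python sketch (the notes themselves use R); the function names uniform_pdf and uniform_cdf and the values a = 2, b = 6 are illustrative assumptions, not part of the course material.

```python
def uniform_pdf(x, a, b):
    """Density of uniform(a, b): constant 1/(b - a) on [a, b], 0 elsewhere."""
    if b <= a:
        raise ValueError("need a < b")
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a, b):
    """F(x) = P(X <= x): 0 left of a, linear on [a, b], 1 right of b."""
    if b <= a:
        raise ValueError("need a < b")
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


# Illustrative values only (a = 2, b = 6 are arbitrary choices).
a, b = 2.0, 6.0
print(uniform_pdf(3.0, a, b))  # height of the density, 1/(b - a)
print(uniform_cdf(3.0, a, b))  # fraction of [a, b] lying to the left of 3
print(uniform_cdf(5.0, a, b) - uniform_cdf(3.0, a, b))  # P(3 <= X <= 5)
```

Because the density is flat, the probability of any subinterval of [a, b] is just its length divided by b − a.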
4 Exponential distribution
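1. Parameter: λ.
2. Range: [0, ∞).
3. Notation: exponential(λ).
4. Density: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$.
5. Distribution: $F(x) = 1 - e^{-\lambda x}$ for $x \ge 0$.
6. Right tail distribution: $P(X > x) = 1 - F(x) = e^{-\lambda x}$.
7. Models: The waiting time for a continuous process to change state.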
Examples. 1. If I step out to 77 Mass Ave after class and wait for the next taxi, my
waiting time in minutes is exponentially distributed. We will see that in this case λ is the
average number of taxis that pass per minute (on weekday afternoons), so the average
waiting time is 1/λ minutes.
2. The exponential distribution models the waiting time until an unstable isotope undergoes
nuclear decay. In this case, the value of λ is related to the half-life of the isotope.
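To make the link between λ and the half-life in Example 2 concrete, here is a short worked equation. It assumes the usual definition of the half-life $t_{1/2}$ (a symbol introduced here) as the time by which a given atom has decayed with probability 1/2, and it uses the exponential tail probability $P(X > t) = e^{-\lambda t}$.

```latex
P(X > t_{1/2}) = e^{-\lambda t_{1/2}} = \tfrac{1}{2}
\quad\Longrightarrow\quad
\lambda = \frac{\ln 2}{t_{1/2}} .
```

So a short half-life corresponds to a large λ, that is, to short typical waiting times.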
Memorylessness: There are other distributions that also model waiting times, but the
exponential distribution has the additional property that it is memoryless. Here’s what
this means in the context of Example 1. Suppose that the probability that a taxi arrives
within the first five minutes is p. If I wait five minutes and in fact no taxi arrives, then the
probability that a taxi arrives within the next five minutes is still p.
By contrast, suppose I were to instead go to Kendall Square subway station and wait for
the next inbound train. Since the trains are coordinated to follow a schedule (e.g., roughly
12 minutes between trains), if I wait five minutes without seeing a train then there is a far
greater probability that a train will arrive in the next five minutes. In particular, waiting
time for the subway is not memoryless, and a better model would be the uniform distribution
on the range [0,12].
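The taxi and subway comparison can be made numerical. The sketch below is in Python (the notes themselves use R) and rests on two labeled assumptions: a hypothetical taxi rate of λ = 0.2 per minute, and the uniform model on [0, 12] for the train. The helper names are illustrative.

```python
import math


def exp_cdf(x, lam):
    """P(X <= x) for an exponential(lam) waiting time."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


def unif_cdf(x, a, b):
    """P(X <= x) for a uniform(a, b) waiting time."""
    return min(1.0, max(0.0, (x - a) / (b - a)))


def prob_next_window(cdf, waited, window):
    """P(arrival within `window` more minutes | no arrival in first `waited`)."""
    survived = 1.0 - cdf(waited)
    return (cdf(waited + window) - cdf(waited)) / survived


lam = 0.2  # hypothetical taxi rate per minute (an assumption, not from the notes)

taxi_first = exp_cdf(5, lam)
taxi_next = prob_next_window(lambda x: exp_cdf(x, lam), 5, 5)

train_first = unif_cdf(5, 0, 12)
train_next = prob_next_window(lambda x: unif_cdf(x, 0, 12), 5, 5)

print("taxi: ", taxi_first, taxi_next)    # the two values agree (memoryless)
print("train:", train_first, train_next)  # the second value is larger
```

Under these assumptions the train probabilities are 5/12 for the first five minutes and 5/7 for the next five after an unsuccessful wait, while the taxi probability does not change.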
The memorylessness of the exponential distribution is analogous to the memorylessness
of the (discrete) geometric distribution, where having flipped 5 tails in a row gives no
information about the next 5 flips. Indeed, the exponential distribution is precisely the
continuous counterpart of the geometric distribution, which models the waiting time for a
discrete process to change state. More formally, memoryless means that the probability of
waiting t more minutes is unaffected by having already waited s minutes without incident.
In symbols, P (X > s + t | X > s) = P (X > t).
Proof of memorylessness: Since (X > s + t) ∩ (X > s) = (X > s + t) we have
$$P(X > s+t \mid X > s) = \frac{P(X > s+t)}{P(X > s)} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t).$$
QED
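The identity just proved can also be checked by simulation. This is a rough Python sketch (the notes themselves use R) that draws exponential waiting times with random.expovariate; the function name estimate_memoryless and the values λ = 0.5, s = 2, t = 3 are illustrative assumptions.

```python
import math
import random


def estimate_memoryless(lam, s, t, n=200_000, seed=0):
    """Estimate P(X > s + t | X > s) and P(X > t) from n simulated waits."""
    rng = random.Random(seed)
    samples = [rng.expovariate(lam) for _ in range(n)]
    beyond_s = [x for x in samples if x > s]  # waits that outlast s
    cond = sum(x > s + t for x in beyond_s) / len(beyond_s)
    uncond = sum(x > t for x in samples) / n
    return cond, uncond


lam, s, t = 0.5, 2.0, 3.0  # illustrative values, not from the notes
cond, uncond = estimate_memoryless(lam, s, t)
print(cond, uncond, math.exp(-lam * t))  # all three should be close
```

The two estimates differ only by sampling noise, and both should be near the exact value $e^{-\lambda t}$.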
Graphs:
5 Normal distribution
In 1809, Carl Friedrich Gauss published a monograph introducing several notions that have
become fundamental to statistics: the normal distribution, maximum likelihood estimation,
and the method of least squares (we will cover all three in this course). For this reason,
the normal distribution is also called the Gaussian distribution, and it is the most important
continuous distribution.
1. Parameters: µ, σ.
2. Range: (−∞, ∞).
3. Notation: N(µ, σ²).
4. Density: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}$.
5. Distribution: F(x) has no closed-form formula, so use tables or software such as pnorm in R to
compute F(x).
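Although F(x) has no elementary formula, software typically evaluates it through the error function erf. The following Python sketch (the notes themselves use R) shows one way to do this with the standard library; the function names normal_pdf and normal_cdf are illustrative, and sigma denotes the standard deviation, as in R's pnorm.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x; sigma is the standard deviation."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """F(x) = P(X <= x) for X ~ N(mu, sigma^2), written via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


print(normal_pdf(0.0, 0.0, 1.0))  # peak height of the standard normal density
print(normal_cdf(0.0, 0.0, 1.0))  # half the probability lies below the mean
print(normal_cdf(1.0, 0.0, 2.0))  # same quantity as pnorm(1, 0, 2) in R
```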
The standard normal distribution N(0, 1) has mean 0 and variance 1. We reserve Z for
a standard normal random variable, $\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$ for the standard normal density,
and Φ(z) for the standard normal distribution.
Note: we will define mean and variance for continuous random variables next time. They
have the same interpretations as in the discrete case. As you might guess, the normal
distribution N(µ, σ²) has mean µ, variance σ², and standard deviation σ.
Here are some graphs of normal distributions. Note they are shaped like a bell curve. Note
also that as σ increases they become more spread out.
Graphs: (the bell curve):
To make approximations it is useful to remember the following rule of thumb for three
approximate probabilities:
P(−1 ≤ Z ≤ 1) ≈ .68, P(−2 ≤ Z ≤ 2) ≈ .95, P(−3 ≤ Z ≤ 3) ≈ .99
[Figure: normal density with the central regions marked: within 1 · σ ≈ 68%, within 2 · σ ≈ 95%, within 3 · σ ≈ 99%.]
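The rule of thumb can be compared against computed values. This small Python sketch (the notes themselves use R) uses statistics.NormalDist from the standard library, available in Python 3.8 and later; the helper name prob_within is an illustrative assumption.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal


def prob_within(k):
    """P(-k <= Z <= k) for a standard normal Z."""
    return Z.cdf(k) - Z.cdf(-k)


for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

The computed values are close to .68 and .95 for k = 1 and k = 2; for k = 3 the value is nearer .997, so the .99 in the rule of thumb is a slightly conservative rounding.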
Symmetry calculations
We can use the symmetry of the standard normal distribution about x = 0 to make some
calculations.
Example 1. The rule of thumb says P (−1 ≤ Z ≤ 1) ≈ .68. Use this to estimate Φ(1).
answer: Φ(1) = P(Z ≤ 1). In the figure, the two tails (in red) have combined area 1 − .68 =
.32. By symmetry the left tail has area .16 (half of .32), so P(Z ≤ 1) ≈ .68 + .16 = .84.
[Figure: standard normal density split at z = −1 and z = 1, with areas P(Z ≤ −1) ≈ .16, P(−1 ≤ Z ≤ 1) ≈ .68 (.34 on each side of 0), and P(Z ≥ 1) ≈ .16.]
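The symmetry argument in Example 1 works for any z ≥ 0, not only z = 1. Written as a worked equation (a general restatement of the example, not an additional result from the notes):

```latex
\Phi(-z) = P(Z \le -z) = P(Z \ge z) = 1 - \Phi(z),
\qquad
P(-z \le Z \le z) = \Phi(z) - \Phi(-z) = 2\,\Phi(z) - 1 .
```

With z = 1 and P(−1 ≤ Z ≤ 1) ≈ .68 this gives Φ(1) ≈ (1 + .68)/2 = .84, in agreement with the example.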
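The following R session uses pnorm to compute values of F(x):
# P(Z <= 0) for the standard normal
pnorm(0,0,1)
[1] 0.5
# P(X <= 1) for X with mean 0 and standard deviation 2
pnorm(1,0,2)
[1] 0.6914625
# P(-1 <= Z <= 1): within one standard deviation of the mean
pnorm(1,0,1) - pnorm(-1,0,1)
[1] 0.6826895
# the same probability for mean 0 and standard deviation 5
pnorm(5,0,5) - pnorm(-5,0,5)
[1] 0.6826895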
Note: The R function pnorm(x, µ, σ) uses σ whereas our notation for the normal distribution
N(µ, σ²) uses σ².
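The reason the last two R computations give the same number is standardization, a standard fact about the normal distribution stated here as a brief aside:

```latex
X \sim N(\mu, \sigma^2)
\;\Longrightarrow\;
Z = \frac{X - \mu}{\sigma} \sim N(0, 1),
\qquad
F(x) = P(X \le x) = \Phi\!\left(\frac{x - \mu}{\sigma}\right).
```

For example, pnorm(5,0,5) − pnorm(−5,0,5) equals Φ(1) − Φ(−1), and pnorm(1,0,2) equals Φ(0.5).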
Here’s a table of values with fewer decimal places of accuracy:
z:     -2      -1      0       .3      .5      1       2       3
Φ(z):  0.0228  0.1587  0.5000  0.6179  0.6915  0.8413  0.9772  0.9987
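Values of Φ(z) at the z listed above can be computed to four decimal places with a few lines of code. This is a Python sketch using statistics.NormalDist (Python 3.8 and later), offered as an alternative to the R function pnorm used in these notes; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

z_values = [-2, -1, 0, 0.3, 0.5, 1, 2, 3]
table = {z: round(std_normal.cdf(z), 4) for z in z_values}

for z, p in table.items():
    print(f"z = {z:>4}: Phi(z) = {p:.4f}")
```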
In 18.05, we only have time to work with a few of the many wonderful distributions that are
used in probability and statistics. We hope that after this course you will feel comfortable
learning about new distributions and their properties when you need them. Wikipedia is
often a great starting point.
The Pareto distribution is one common, beautiful distribution that we will not have time
to cover in depth.
1. Parameters: m > 0 and α > 0.
2. Range: [m, ∞).
3. Notation: Pareto(m, α).
4. Density: $f(x) = \frac{\alpha\, m^{\alpha}}{x^{\alpha+1}}$.
5. Distribution: (easy integral) $F(x) = 1 - \frac{m^{\alpha}}{x^{\alpha}}$, for $x \ge m$.
7. Models: The Pareto distribution models a power law, where the probability that
an event occurs varies as a power of some attribute of the event. Many phenomena
follow a power law, such as the size of meteors, income levels across a population, and
population levels across cities. See Wikipedia for loads of examples:
http://en.wikipedia.org/wiki/Pareto_distribution#Applications
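The "easy integral" behind the Pareto distribution function can be written out as follows. This sketch assumes that the range of the distribution is [m, ∞) with m > 0 and α > 0, and uses u as a dummy integration variable (a symbol introduced here).

```latex
F(x) = \int_m^x \frac{\alpha\, m^{\alpha}}{u^{\alpha+1}}\, du
     = \Big[ -\, m^{\alpha} u^{-\alpha} \Big]_{u=m}^{u=x}
     = 1 - \frac{m^{\alpha}}{x^{\alpha}}, \qquad x \ge m .
```

The tail probability P(X > x) = m^α / x^α decays like a power of x, which is the power-law behavior described in item 7.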
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.