Chapter 2 - Probability
EXAMPLE
Suppose that a die is rolled twice; assume that the variable of interest is whether the
number that turns up is odd or even. There are four possible outcomes of this random
experiment: {even, even}, {even, odd}, {odd, even}, {odd, odd}.
The set of all possible outcomes of a random experiment is known as the sample space.
The sample space for this experiment consists of four sample points: S = {EE, EO, OE,
OO}. A subset of the sample space is known as an event.
EXAMPLE
Based on the die-rolling experiment, suppose that the event F is defined as follows: “at
most one even number turns up”. The event F is a set containing the following sample
points: F = {EO, OE, OO}.
For a random experiment in which there are n equally likely outcomes, each outcome has
a probability of occurring equal to 1/n. In the die-rolling example, each of the four
sample points is equally likely, so each has probability 1/4. The probability of an event is
the number of sample points it contains divided by the total number of sample points in
the sample space. In this example, the probability of event F is therefore P(F) = #F/#S = 3/4.
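The #F/#S calculation can be checked by brute-force enumeration. The minimal Python sketch below (standard library only; the names sample_space and F are illustrative) lists the four parity outcomes and counts those belonging to F.

from itertools import product
from fractions import Fraction

# Enumerate the four equally likely outcomes: each roll is recorded as 'E' (even) or 'O' (odd).
sample_space = list(product("EO", repeat=2))   # [('E','E'), ('E','O'), ('O','E'), ('O','O')]

# Event F: "at most one even number turns up".
F = [outcome for outcome in sample_space if outcome.count("E") <= 1]

# With equally likely outcomes, P(F) = #F / #S.
p_F = Fraction(len(F), len(sample_space))
print(F)      # [('E', 'O'), ('O', 'E'), ('O', 'O')]
print(p_F)    # 3/4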
AXIOMS OF PROBABILITY
An axiom is a logical statement that is assumed to be true without formal proof; all
further results are derived using axioms as a starting point. In probability theory, the
three fundamental axioms are:
1) For any event A, P(A) ≥ 0.
2) P(S) = 1, where S is the sample space.
3) For mutually exclusive events A1, A2, ..., An:
P(A1 ∪ A2 ∪ ... ∪ An) = ∑_{i=1}^{n} P(Ai)
RULES OF PROBABILITY
The rules of probability are derived from these axioms. Three of the most important rules
are the Addition Rule, the Multiplication Rule, and the Complement Rule.
ADDITION RULE
For any two events A and B, the addition rule states that:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
EXAMPLE
Referring to the die-rolling experiment, suppose that the following events are defined:
A: the first roll is even = {EE, EO}
B: exactly one roll is odd = {EO, OE}
What is the probability that the first roll is even or exactly one roll is odd?
Since the sample space consists of four equally likely sample points,
P(A) = 1/2
P(B) = 1/2
P(A ∩ B) = 1/4
Therefore, P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 1/2 + 1/2 − 1/4 = 3/4.
If events A and B cannot both occur at the same time, they are said to be mutually
exclusive or disjoint; in this case, P(A ∩ B) = 0. Therefore, the addition rule for
mutually exclusive events is: P(A ∪ B) = P(A) + P(B).
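As a quick check of the addition rule on this sample space, the following Python sketch enumerates the outcomes making up A, B, and A ∪ B; it assumes the event definitions given above (A: first roll even, B: exactly one roll odd).

from itertools import product
from fractions import Fraction

sample_space = set(product("EO", repeat=2))

# A: the first roll is even; B: exactly one roll is odd.
A = {s for s in sample_space if s[0] == "E"}
B = {s for s in sample_space if s.count("O") == 1}

def prob(event):
    # Equally likely outcomes, so probability = #event / #S.
    return Fraction(len(event), len(sample_space))

# Addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)
print(lhs, rhs)   # 3/4 3/4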
CONDITIONAL PROBABILITY
The conditional probability of an event is the probability that it occurs given that
another event occurs. For two events A and B, the probability that B occurs given that A
occurs is written as P(B|A) and is computed as:
P(B|A) = P(A ∩ B)/P(A)
Equivalently, the probability that A occurs given that B occurs is computed as:
P(A|B) = P(A ∩ B)/P(B)
EXAMPLE
For the die-rolling experiment, with A: the first roll is even and B: exactly one roll is odd,
the probability that exactly one roll is odd given that the first roll is even is:
P(B|A) = P(A ∩ B)/P(A) = (1/4)/(1/2) = 1/2
MULTIPLICATION RULE
The multiplication rule states that for any two events A and B:
P(B ∩ A) = P(B|A)P(A)
EXAMPLE
Suppose that a card is chosen from a standard deck without being replaced; a second card
is then chosen. What is the probability that both cards are clubs? Define:
A: the first card is a club
B: the second card is a club
Since there are thirteen clubs in a standard deck of cards and fifty-two cards in the deck,
P(A) = 13/52 = 1/4
The probability of B given A is computed as follows: if the first card is a club (event A),
then 51 cards remain in the deck when the second card is chosen, and 12 of the original 13
clubs remain among them. Therefore, P(B|A) = 12/51.
By the multiplication rule, P(B ∩ A) = P(B|A)P(A) = (12/51)(1/4) = 1/17 ≈ 0.0588.
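Both the exact product and a simulation can be used to sanity-check this example. The sketch below (Python standard library only; the deck is modeled simply as 13 clubs and 39 other cards) should print 1/17 followed by a simulated frequency close to 0.0588.

import random
from fractions import Fraction

# Exact calculation using the multiplication rule: P(both clubs) = P(A) * P(B | A).
p_A = Fraction(13, 52)          # first card is a club
p_B_given_A = Fraction(12, 51)  # second card is a club, given the first was
print(p_A * p_B_given_A)        # 1/17

# Monte Carlo check: draw two cards without replacement many times.
deck = ["club"] * 13 + ["other"] * 39
trials = 100_000
hits = sum(random.sample(deck, 2) == ["club", "club"] for _ in range(trials))
print(hits / trials)            # close to 1/17 ≈ 0.0588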
INDEPENDENT EVENTS
Two events A and B are said to be independent if the occurrence of A does not affect the
probability of B occurring, and vice versa.
EXAMPLE
Referring to the die-rolling experiment, suppose that event C depends only on the first
roll and event D depends only on the second roll (for instance, C: the first roll is even and
D: the second roll is even). Intuitively, events C and D are independent since the two rolls
of the die have no influence on each other.
Events A and B are independent if and only if both of the following conditions hold:
P(B | A) = P(B)
P(A | B) = P(A)
EXAMPLE
Referring to events A (the first roll is even) and B (exactly one roll is odd) from the
addition rule example:
P(A) = 1/2
P(B) = 1/2
P(B ∩ A) = 1/4
P(A|B) = P(B ∩ A) /P(B) = (1/4)/(1/2) = 1/2
P(B|A) = P(B ∩ A) /P(A) = (1/4)/(1/2) = 1/2
Since P(A|B) = P(A) and P(B|A) = P(B), A and B are independent events.
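The two independence conditions can also be verified by enumeration; the sketch below assumes the same events A and B defined earlier and computes the conditional probabilities from their definition.

from itertools import product
from fractions import Fraction

sample_space = set(product("EO", repeat=2))
A = {s for s in sample_space if s[0] == "E"}        # first roll even
B = {s for s in sample_space if s.count("O") == 1}  # exactly one roll odd

def prob(event):
    return Fraction(len(event), len(sample_space))

# Conditional probabilities from the definition P(X | Y) = P(X ∩ Y) / P(Y).
p_A_given_B = prob(A & B) / prob(B)
p_B_given_A = prob(A & B) / prob(A)

print(p_A_given_B == prob(A))   # True  -> P(A | B) = P(A)
print(p_B_given_A == prob(B))   # True  -> P(B | A) = P(B)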
For two independent events A and B, the multiplication rule simplifies to:
P(A ∩ B) = P(A)P(B)
COMPLEMENT RULE
The complement of an event A, written A^C, consists of all outcomes in the sample space
that are not in A. Since A and A^C are mutually exclusive and together make up the entire
sample space:
P(A ∪ A^C) = 1
P(A ∩ A^C) = 0
Therefore, the complement rule is:
P(A) = 1 − P(A^C)
BAYES’ THEOREM
Bayes’ Theorem can be used to determine the conditional probability of an event. The
general formula for computing conditional probabilities is:
P(Ai | B) = P(B | Ai)P(Ai) / ∑_{j=1}^{n} P(B | Aj)P(Aj)
The sample space, the set of all possible outcomes, is partitioned into n mutually exclusive
events: A1, A2, ..., An; the denominator of the formula represents the total probability of
event B.
EXAMPLE
In this example, A^C = "the first roll is odd" = {OE, OO}. Using Bayes’ Theorem, the
probability that the first roll is even given that exactly one roll is odd is determined as
follows:
P(A | B) = (1/2)(1/2) / [(1/2)(1/2) + (1/2)(1/2)] = 1/2
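Bayes' Theorem for this example can be reproduced with exact fractions, as in the following minimal sketch (the partition is the first roll being even, A, or odd, A^C).

from fractions import Fraction

# Prior probabilities for the first roll.
p_A  = Fraction(1, 2)   # first roll even
p_Ac = Fraction(1, 2)   # first roll odd

# Likelihood of B ("exactly one roll is odd") in each case: the second roll decides it.
p_B_given_A  = Fraction(1, 2)
p_B_given_Ac = Fraction(1, 2)

# Bayes' Theorem: P(A | B) = P(B | A)P(A) / [P(B | A)P(A) + P(B | A^C)P(A^C)]
posterior = (p_B_given_A * p_A) / (p_B_given_A * p_A + p_B_given_Ac * p_Ac)
print(posterior)   # 1/2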
RANDOM VARIABLES
A random variable is a variable that assigns a numerical value to each sample point in the
sample space of a random experiment.
EXAMPLE
For the die-rolling experiment, define X = the number of even numbers that turn up. The
value of X for each sample point is:
SAMPLE POINT X
EE 2
EO 1
OE 1
OO 0
P(X = 0) = 1/4
P(X = 1) = 2/4
P(X = 2) = 1/4
Since X can only assume a finite number of different values, it is said to be a discrete
random variable.
The cumulative distribution function (cdf) of a random variable X, designated F(x), gives
the probability that X is less than or equal to a given value:
F(x) = P(X ≤ x)
where:
X = a random variable
x = a realization (value) of X
For a continuous random variable with probability density function f, the cdf is computed
as:
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u)du
EXAMPLE
Referring to the die-rolling experiment, the following table illustrates the cdf of X:
x {X ≤ x} F(x) = P(X ≤ x)
<0 {} 0
0 {OO} 1/4
1 {OO, EO, OE} 3/4
2 {OO, EO, OE, EE} 1
>2 {OO, EO, OE, EE} 1
For a discrete random variable, a table or a function showing the probability of each
possible value is known as a probability mass function (pmf), designated p(x):
p(x) = P(X = x)
where:
x = a possible value of X
For a discrete random variable, probabilities can be derived from the cdf as follows:
p(xi) = F(xi) − F(xi−1)
where xi−1 denotes the next-smallest possible value of X (and F is taken to be 0 below the
smallest possible value).
EXAMPLE
For the die-rolling experiment, the probability mass function of X is shown in the
following table:
x F(x) p(x)
<0 0 0
0 1/4 (1/4 – 0) = 1/4
1 3/4 (3/4 – 1/4) = 1/2
2 1 (1 – 3/4) = 1/4
>2 1 (1 – 1) = 0
P(X = 0) = 1/4
P(X = 1) = 2/4
P(X = 2) = 1/4
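Recovering the pmf from the cdf amounts to taking successive differences; the short sketch below does this for the die-rolling X.

from fractions import Fraction

# Cdf of X (number of even rolls) at its possible values.
cdf = {0: Fraction(1, 4), 1: Fraction(3, 4), 2: Fraction(1, 1)}

# p(x) = F(x) - F(previous value); F is 0 below the smallest value.
pmf = {}
previous = Fraction(0)
for x in sorted(cdf):
    pmf[x] = cdf[x] - previous
    previous = cdf[x]

print(pmf)   # {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}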
For a continuous random variable, a function from which the probability of any range of
possible values can be computed is known as a probability density function (pdf),
designated f(x). The pdf must satisfy:
∫_{−∞}^{∞} f(x)dx = 1
The pdf is the derivative of the cdf:
f(x) = dF(x)/dx
and the cdf is obtained by integrating the pdf:
F(x) = ∫_{−∞}^{x} f(u)du
For a continuous random variable, probabilities can be derived from the cdf as follows:
P(a ≤ X ≤ b) = F(b) − F(a)
PROBABILITY FUNCTION
Both probability mass functions and probability density functions are sometimes known
more simply as probability functions.
For two jointly distributed discrete random variables, X and Y, the joint probability
mass function is defined as:
p(x, y) = P(X = x, Y = y)
EXAMPLE
Suppose that a census is taken for a small town; the distribution of the number of children
among the families in this town is given as follows:
Define:
X = the number of boys in a family
Y = the number of girls in a family
This table shows the joint probabilities for every possible value of X and Y. For
example, the joint probability that a family has 2 boys and 1 girl (X = 2, Y = 1) is 0.0375.
MARGINAL PROBABILITY
For two jointly distributed random variables, X and Y, the marginal probability of X is
the probability that X assumes a given value, summed over all possible values of Y.
Equivalently, the marginal probability of Y is the probability that Y assumes a given
value, summed over all possible values of X.
pX(X = x) = ∑y p(x, y)
pY(Y = y) = ∑x p(x, y)
In the census example, the marginal probabilities of X are given by the row sums of the
joint pmf; the marginal probabilities of Y are given by the column sums.
EXAMPLE
The (marginal) probability that a family has one boy can be determined as follows:
P(one boy) = P(one boy and no girls) + P(one boy and one girl) + P(one boy and two
girls) + P(one boy and three girls)
EXAMPLE
The (marginal) probability that a family has two girls can be determined as follows:
P(two girls) = P(no boys and two girls) + P(one boy and two girls) + P(two boys and two
girls) + P(three boys and two girls)
UNCONDITIONAL PROBABILITY
The marginal probability of a random variable is also known as its unconditional
probability, since it does not depend on the value assumed by the other random variable.
CONDITIONAL PROBABILITY
For two jointly distributed discrete random variables, X and Y, the conditional probability
mass function of X given Y is:
pX|Y(x | y) = P(X = x | Y = y) = p(x, y) / pY(Y = y)
EXAMPLE
The probability that a family has one boy given that it has two girls is computed as
follows:
P(X = x | Y = y) = p(x, y) / pY(Y = y)
P(X = 1 | Y = 2) = p(1, 2) / pY(Y = 2)
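Because the census table itself is not reproduced here, the following sketch uses a small hypothetical joint pmf (illustrative values only, not the census data) to show how marginal and conditional probabilities are computed from a joint pmf.

from fractions import Fraction

# Hypothetical joint pmf for two discrete random variables X and Y.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 8),
    (2, 0): Fraction(1, 8), (2, 1): Fraction(1, 4),
}

# Marginal pmfs: sum the joint pmf over the other variable.
p_X, p_Y = {}, {}
for (x, y), p in joint.items():
    p_X[x] = p_X.get(x, Fraction(0)) + p
    p_Y[y] = p_Y.get(y, Fraction(0)) + p

# Conditional pmf of X given Y = y: p(x, y) / p_Y(y).
def p_X_given_Y(x, y):
    return joint.get((x, y), Fraction(0)) / p_Y[y]

print(p_X)               # marginal distribution of X
print(p_X_given_Y(1, 1)) # Fraction(1, 4): (1/8) / (1/2)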
For two jointly distributed continuous random variables, X and Y, the joint probability
density function f(x, y) is defined so that:
P(a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_{a}^{b} ∫_{c}^{d} f(x, y) dy dx
The marginal probability density functions of X and Y are:
fX(x) = ∫_{−∞}^{∞} f(x, y)dy
fY(y) = ∫_{−∞}^{∞} f(x, y)dx
The conditional probability density function of X given Y is:
fX|Y(x | y) = f(x, y) / fY(y)
For two jointly distributed random variables, X and Y, the joint cumulative distribution
function is defined as:
F(x, y) = P(X ≤ x, Y ≤ y)
For continuous random variables, the joint cdf is computed from the joint pdf as:
F(a, b) = P(X ≤ a, Y ≤ b) = ∫_{−∞}^{a} ∫_{−∞}^{b} f(x, y) dy dx
EXAMPLE
Using the census example, the probability that a family has two boys or less and one girl
or less can be computed as follows:
P(two boys or less and one girl or less) = P(no boys, no girls) + P( no boys, one girl) +
P(one boy, no girls) + P(one boy, one girl) + P(two boys, no girls) + P(two boys, one girl)
If X and Y are discrete random variables and p(X = xi, Y = yj) = p(X = xi)p(Y = yj) for all
xi and yj, then X and Y are independent. For continuous random variables, X and Y are
independent if:
f(x, y) = fX(x)fY(y)
If X and Y are independent, then:
1) E(XY) = E(X)E(Y)
2) Cov(X, Y) = 0
3) Var(X + Y) = Var(X) + Var(Y)
A random variable can be characterized by its moments. These are summary measures
of the behavior of a random variable. The most important of these are:
• Expected Value
• Variance
• Skewness
• Kurtosis
EXPECTED VALUE
The first moment of a random variable X is known as its expected value; this is the
average or mean value of X. For a discrete random variable X, the expected value is
computed as follows:
E(X) = ∑_{i=1}^{n} xi P(X = xi)
where:
xi = a possible value of X
i = an index
n = the number of possible values of X
Σ = “sigma”; this is the summation operator
EXAMPLE
For the die-rolling experiment, where X is the number of even numbers that turn up after
rolling a die twice, the expected value is computed as follows:
E(X) = (0)(1/4) + (1)(2/4) + (2)(1/4) = 1
This shows that, on average, one even number turns up each time a die is rolled twice.
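The expected value can be computed directly from the pmf, as in this short check.

from fractions import Fraction

# pmf of X, the number of even numbers in two rolls.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# E(X) = sum over all values x of x * P(X = x)
expected_value = sum(x * p for x, p in pmf.items())
print(expected_value)   # 1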
For a continuous random variable, the expected value is computed as follows:
E(X) = ∫_{−∞}^{∞} x f(x)dx
PROPERTIES OF EXPECTED VALUE
For constants a and b and random variables X and Y:
1) E(a) = a
2) E(aX + b) = aE(X) + b
3) E(X + Y) = E(X) + E(Y)
CONDITIONAL EXPECTATION
For two random variables X and Y, the conditional expectation of X given that Y
assumes a specific value y is written as:
E[X|Y = y]
For discrete random variables, the conditional expectation is computed as:
E[X | Y = y] = ∑x x P(X = x | Y = y)
For continuous random variables, the conditional expectation is computed as:
E[X | Y = y] = ∫_{−∞}^{∞} x fX|Y(x | y)dx = ∫_{−∞}^{∞} x [f(x, y) / fY(y)] dx
EXAMPLE
Using the census example, the expected number of boys in a family given that there are
two girls in the family is computed as:
E[X|Y = 2] = (0)P(X = 0 | Y = 2) + (1)P(X = 1 | Y = 2) + (2)P(X = 2 | Y = 2) + (3)P(X = 3 | Y = 2)
= 0 + 0.2727 + 0 + 0 = 0.2727
The expected value of the product of two discrete random variables is:
E(XY) = ∑x ∑y xy p(x, y)
The expected value of the product of two continuous random variables is:
E(XY) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f(x, y) dy dx
VARIANCE
The second central moment of a random variable X is known as its variance. This
indicates the degree of dispersion or spread of X around its expected value. The variance
of random variable X is computed as follows:
σX² = E[(X − E(X))²] = E[X²] − (E[X])²
For a discrete random variable, the variance can also be expressed as:
σX² = ∑_{i=1}^{n} [xi − E(X)]² P(X = xi)
EXAMPLE
For the die-rolling experiment, where E(X) = 1:
σX² = (0 − 1)²(1/4) + (1 − 1)²(2/4) + (2 − 1)²(1/4) = 1/2
For a continuous random variable, the variance can be expressed as:
σX² = ∫_{−∞}^{∞} [x − E(X)]² f(x)dx
PROPERTIES OF VARIANCE
1) Var(a) = 0
2) Var(aX + b) = a2Var(X)
3) Var(aX + bY) = a2Var(X) + b2Var(Y) + 2abCov(X,Y)
CONDITIONAL VARIANCE
For two random variables X and Y, the conditional variance of X given that Y assumes
a specific value y is written as:
Var[X | Y = y] = E[(X − E[X | Y = y])² | Y = y] = E[X² | Y = y] − (E[X | Y = y])²
where:
E[X² | Y = y] = ∑x x² pX|Y(x | y)
EXAMPLE
Using the census example, the conditional variance of the number of boys given that there
are two girls in the family is computed as:
E[X² | Y = 2] = 0 + 0.2727 + 0 + 0 = 0.2727
Var[X | Y = 2] = E[X² | Y = 2] − (E[X | Y = 2])² = 0.2727 − (0.2727)² = 0.1983
STANDARD DEVIATION
One of the drawbacks to using variance is that it is measured in squared units. Since
these are difficult to interpret, the standard deviation is often used instead. For any
random variable X, the standard deviation of X equals the square root of the variance of
X.
EXAMPLE
For the die-rolling experiment:
σX = √(σX²) = √(1/2) = 0.7071
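Both forms of the variance, and the standard deviation, can be checked from the pmf of X; the sketch below should print 1/2, 1/2, and 0.7071.

import math
from fractions import Fraction

pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
mean = sum(x * p for x, p in pmf.items())                    # E(X) = 1

# Variance: E[(X - E(X))^2], computed directly from the pmf.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # 1/2

# Equivalent shortcut: E[X^2] - (E[X])^2.
variance_alt = sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2

std_dev = math.sqrt(variance)
print(variance, variance_alt, round(std_dev, 4))   # 1/2 1/2 0.7071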
COVARIANCE
The covariance of two random variables X and Y measures the degree to which they vary
together; it is defined as:
Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y)
If X and Y are discrete random variables, the covariance can also be expressed as:
Cov(X, Y) = ∑_{i=1}^{n} ∑_{j=1}^{n} (xi − E(X))(yj − E(Y))P(X = xi, Y = yj)
where:
i, j are indexes
If X and Y are continuous random variables, the covariance can also be expressed as:
Cov(X, Y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − E(X))(y − E(Y)) f(x, y) dy dx
where:
f(x, y) = the joint probability density function of X and Y
EXAMPLE
Using the following joint probability mass function for two random variables X and Y,
the covariance is computed as follows:
x \ y                   0       1       Row Sum = P(X = x)
0                       0.40    0.30    0.70
1                       0.10    0.20    0.30
Column Sum = P(Y = y)   0.50    0.50    1.00
Using the row sums from the joint probability mass function, the probability mass
function for X is:
x P(x)
0 0.70
1 0.30
E(X) = ∑_{i=1}^{n} xi P(X = xi) = (0)(0.70) + (1)(0.30) = 0.30
Using the column sums from the joint probability mass function, the probability mass
function for Y is:
y P(y)
0 0.50
1 0.50
E(Y) = ∑_{i=1}^{n} yi P(Y = yi) = (0)(0.50) + (1)(0.50) = 0.50
Cov(X, Y) = ∑_{i=1}^{n} ∑_{j=1}^{n} (xi − E(X))(yj − E(Y))P(X = xi, Y = yj)
= (0 − 0.3)(0 − 0.5)(0.40) + (0 − 0.3)(1 − 0.5)(0.30) + (1 − 0.3)(0 − 0.5)(0.10) + (1 − 0.3)(1 − 0.5)(0.20)
= 0.06 − 0.045 − 0.035 + 0.07 = 0.05
PROPERTIES OF COVARIANCE
1) Cov(X, Y) = Cov(Y, X)
2) Cov(X + a,Y + b) = Cov(X, Y)
CORRELATION
The correlation coefficient, designated ρ ("rho"), measures the strength of the linear
relationship between two random variables X and Y:
ρ = Cov(X, Y) / (σX σY)
where:
σX, σY = the standard deviations of X and Y
The correlation coefficient always assumes a value between negative one and positive
one and is unit-free: -1 ≤ ρ ≤ 1
EXAMPLE
Using the data from the covariance example, the correlation is computed as follows.
σX² = ∑_{i=1}^{n} [xi − E(X)]² P(X = xi) = (0 − 0.3)²(0.70) + (1 − 0.3)²(0.30) = 0.21
σX = √0.21 = 0.4583
σY² = ∑_{i=1}^{n} [yi − E(Y)]² P(Y = yi) = (0 − 0.5)²(0.50) + (1 − 0.5)²(0.50) = 0.25
σY = √0.25 = 0.5
ρ = Cov(X, Y) / (σX σY) = 0.05 / ((0.4583)(0.5)) = 0.2182
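The covariance and correlation for this example can be reproduced directly from the joint table; the following sketch recomputes the marginal means, standard deviations, covariance, and ρ, and should print 0.05 and 0.2182.

import math

# Joint pmf from the covariance example: keys are (x, y) pairs.
joint = {(0, 0): 0.40, (0, 1): 0.30, (1, 0): 0.10, (1, 1): 0.20}

# Marginal pmfs: row sums for X, column sums for Y.
p_X = {0: 0.0, 1: 0.0}
p_Y = {0: 0.0, 1: 0.0}
for (x, y), p in joint.items():
    p_X[x] += p
    p_Y[y] += p

mean_X = sum(x * p for x, p in p_X.items())   # 0.30
mean_Y = sum(y * p for y, p in p_Y.items())   # 0.50

# Covariance and correlation.
cov = sum((x - mean_X) * (y - mean_Y) * p for (x, y), p in joint.items())
sd_X = math.sqrt(sum((x - mean_X) ** 2 * p for x, p in p_X.items()))   # 0.4583
sd_Y = math.sqrt(sum((y - mean_Y) ** 2 * p for y, p in p_Y.items()))   # 0.5
rho = cov / (sd_X * sd_Y)

print(round(cov, 4), round(rho, 4))   # 0.05 0.2182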
SKEWNESS
The third central moment of a random variable X is known as its skewness. This
indicates the degree of asymmetry in the values of X. Skewness is computed as follows:
α3 = E[(X − E(X))³] / σ³
where:
σ = the standard deviation of X
EXAMPLE
For the die-rolling experiment, the numerator of the skewness formula is computed as
follows:
E[(X − E(X))³] = (0 − 1)³(1/4) + (1 − 1)³(2/4) + (2 − 1)³(1/4) = −0.25 + 0 + 0.25 = 0
Therefore, α3 = 0/0.3536 = 0, indicating that the distribution of X is symmetric.
KURTOSIS
The fourth central moment of a random variable X is known as its kurtosis. This refers
to the likelihood that X will assume an extremely small or large value. Kurtosis is
computed as follows:
α4 = E[(X − E(X))⁴] / σ⁴
where:
σ = the standard deviation of X
EXAMPLE
For the die-rolling experiment, the numerator of the kurtosis formula is computed as
follows:
E[(X − E(X))⁴] = (0 − 1)⁴(1/4) + (1 − 1)⁴(2/4) + (2 − 1)⁴(1/4) = 0.25 + 0 + 0.25 = 0.5
Therefore, α4 = 0.5/0.25 = 2
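Skewness and kurtosis for the die-rolling X can be computed from the pmf in the same way; the sketch below should print 0.0 and 2.0.

import math
from fractions import Fraction

pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
mean = sum(x * p for x, p in pmf.items())                             # 1
sigma = math.sqrt(sum((x - mean) ** 2 * p for x, p in pmf.items()))  # 0.7071

def central_moment(k):
    # E[(X - E(X))^k] computed from the pmf.
    return float(sum((x - mean) ** k * p for x, p in pmf.items()))

skewness = central_moment(3) / sigma ** 3
kurtosis = central_moment(4) / sigma ** 4
print(skewness, kurtosis)   # 0.0 2.0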
COEFFICIENT OF VARIATION
The coefficient of variation expresses the standard deviation of a random variable relative
to its expected value:
CV = σX / μX
EXAMPLE
For the die-rolling experiment:
CV = σX / μX = 0.7071/1 = 0.7071
CHEBYSHEV’S INEQUALITY
Chebyshev’s inequality gives an upper limit for the probability that X will be k or more
standard deviations away from its expected value. Chebyshev’s inequality is written as:
P{|X − μX| ≥ kσX} ≤ 1/k²
EXAMPLE
Suppose that X has expected value μX = 2 and standard deviation σX = 5. For k = 3
standard deviations:
P{|X − μX| ≥ kσX} ≤ 1/k²
P{|X − 2| ≥ 15} ≤ 1/9
In other words, the probability that X ≥ 17 or X ≤ −13 is at most 1/9.
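Chebyshev's inequality makes no distributional assumption; the sketch below checks the k = 3 bound empirically using a normal distribution with the stated mean and standard deviation (an arbitrary illustrative choice).

import random

# Chebyshev bound for k = 3: P(|X - mu_X| >= 3*sigma_X) <= 1/9, whatever the distribution of X.
mu, sigma, k = 2, 5, 3

# Empirical check with one particular distribution that has this mean and standard deviation.
random.seed(0)
samples = [random.gauss(mu, sigma) for _ in range(100_000)]
tail_freq = sum(abs(x - mu) >= k * sigma for x in samples) / len(samples)

print(tail_freq, "<=", 1 / k ** 2)   # observed tail frequency (about 0.003) vs the bound 1/9 ≈ 0.111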