Course: Chapter 1
LAKSACI Noura
Course for 3rd Year Computer Science Degree
Lecture: 1/week, Tutorial: 1/week, Credits: 4, Coefficient: 2
University of Adrar
nor.laksaci@univ-adrar.edu.dz
November 2, 2024
Contents

1 A Review of Some Concepts in Probability and Random Variables
  1.1 Introduction
  1.2 Fundamental Space and Events
    1.2.1 Definitions
    1.2.2 Operations on Events
    1.2.3 Complete System of Events
    1.2.4 Measurable Space
  1.3 Probability
    1.3.1 Definitions
    1.3.2 Probabilistic Space
    1.3.3 Independence of Events
  1.4 Conditional Probability
    1.4.1 Compound Probability
    1.4.2 Total Probability
    1.4.3 Bayes' Theorem
  1.5 Random Variables
    1.5.1 Discrete Random Variables
    1.5.2 Continuous Random Variables
Conventions
F denotes either R or C.
N denotes the set {1, 2, 3, ...} of natural numbers (excluding 0).
Chapter 1
A Review of Some Concepts in Probability and Random Variables
Objectives
By the end of this chapter, the student should be able to:
• Understand the basic language and concepts of probability;
• Differentiate between types of events;
• Learn how to work with discrete random variables;
• Learn how to work with continuous random variables;
• Understand the meaning of expectation and variance and be able to calculate them.
1.1 Introduction
The concepts of probability and random variables are essential for understanding, modeling, and
analyzing uncertainty in the real world. They provide powerful tools for informed decision-making,
data analysis, modeling random phenomena, and solving problems in uncertain environments, leading
to improved modeling of complex systems and advancements in various fields of study.
1.2 Fundamental Space and Events

1.2.1 Definitions

Definition 1.2.1. An experiment is called random if it is impossible to predict its result with certainty, i.e., if repeated under the same conditions, it may yield different outcomes. However, we can describe, before the experiment is conducted, the set of all possible outcomes.
• A possible result of the experiment is called an elementary outcome (or elementary event).
• The set of all elementary outcomes for a given experiment constitutes the sample space, also
known as the universe of possibilities.
Examples.
1. The toss of a coin is a random experiment with a sample space of {heads, tails}.
2. The roll of a die is a random experiment with a sample space of {1, 2, 3, 4, 5, 6}.
3. In a blood test, the possible outcomes, such as blood type and Rh factor, form the sample
space.
Definition 1.2.2. An event is a set of elementary outcomes, and it is part of the sample space.
After conducting the experiment, we can determine whether the event occurred or not.
Remarks.
• If A is an event, then A is said to occur if the outcome of the experiment is one of the elementary
outcomes in A. If not, the event did not occur, and the complementary event of A occurs
instead.
• The impossible event, denoted by ∅, cannot occur regardless of the outcome.
• The certain event, denoted by Ω, is always realized, regardless of the experiment’s outcome.
Example. In the case of rolling a die, the event "rolling an even number" and the event "rolling an
odd number" are mutually exclusive, and their union is the entire sample space.
1.3 Probability
1.3.1 Definitions
The transition from a set-theoretic description of random phenomena to a fully developed mathematical model involves the introduction of probability measures.
Definition 1.3.1. A probability is defined as a function P that assigns a number between 0 and 1
to each event in a σ-algebra A, satisfying the following properties:
• For every event A ∈ A, P (A) ≥ 0 (non-negativity).
• P (Ω) = 1 (the probability of the entire sample space is 1).
• For any countable sequence of mutually exclusive events A1, A2, . . . , the probability of their union is the sum of their probabilities (σ-additivity):

P(⋃_{i=1}^{∞} Ai) = Σ_{i=1}^{∞} P(Ai)
Proposition 1.3.3 (Properties of Probability). From the axioms, we derive several important properties of probability, such as:
• Additivity: If A and B are two arbitrary events, then:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
• Complement Rule: The probability of the complement of an event A is:
P (Ac ) = 1 − P (A)
• Impossible Event:
P (∅) = 0.
• Monotonicity: If A ⊆ B, then:
P (A) ≤ P (B).
• Increasing continuity: If (An)n∈N is an increasing sequence of events with respect to inclusion, that is, if An ⊆ An+1 for all integers n, then:

lim_{n→+∞} P(An) = P(⋃_{n=0}^{+∞} An).
• Sub-additivity: If (An) is a sequence of events such that Σ_{n=0}^{+∞} P(An) converges, then:

P(⋃_{n=0}^{+∞} An) ≤ Σ_{n=0}^{+∞} P(An)
Proof.
• Additivity: Let A and B be two arbitrary events. The union A ∪ B can be decomposed into three mutually exclusive events: A′ = A \ B, the intersection A ∩ B, and B′ = B \ A. We have:

P(A ∪ B) = P(A′) + P(A ∩ B) + P(B′)

but

P(A′) = P(A) − P(A ∩ B) and P(B′) = P(B) − P(A ∩ B),

hence

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
• Complement Rule: We have by the additivity property of probabilities:
P (Ω) = P (A ∪ Ac ) = P (A) + P (Ac )
Hence,
P(Ω) = 1 = P(A) + P(Ac).
Thus,
P (Ac ) = 1 − P (A).
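These properties can be checked concretely on a small finite model. The following Python sketch (an illustration added here, not part of the course text) verifies additivity and the complement rule on the fair-die sample space, using exact fractions:

```python
from fractions import Fraction

# Fair six-sided die: uniform probability, kept as exact fractions.
omega = frozenset(range(1, 7))

def P(event):
    """Probability of an event under the uniform law on omega."""
    return Fraction(len(event), len(omega))

A, B = {1, 2, 3}, {3, 4}

# Additivity (inclusion-exclusion) and the complement rule:
assert P(A | B) == P(A) + P(B) - P(A & B)
assert P(omega - A) == 1 - P(A)
assert P(set()) == 0          # the impossible event
assert P(omega) == 1          # the certain event
```

The events A and B here are arbitrary choices; any pair of subsets of the sample space satisfies the same identities.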
Remark.
It is important not to confuse independent events with mutually exclusive events. If A and B are both independent and mutually exclusive, then:

P(A ∩ B) = P(A)P(B) = 0, so P(A) = 0 or P(B) = 0.
Example. In the example of rolling a fair six-sided die, the two events: A = "the result is even" and
B = "the result is a multiple of three" are independent.
Indeed, let
A = {2, 4, 6}, B = {3, 6}, A ∩ B = {6}
We have:

P(A) = 3/6, P(B) = 2/6, P(A ∩ B) = 1/6.

We then verify that:

P(A)P(B) = 3/6 × 2/6 = 6/36 = 1/6 = P(A ∩ B).
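The same verification can be scripted; this short Python sketch (illustrative, with exact fractions) reproduces the computation above:

```python
from fractions import Fraction

omega = frozenset(range(1, 7))   # fair six-sided die
A = {2, 4, 6}                    # "the result is even"
B = {3, 6}                       # "the result is a multiple of three"

def P(event):
    """Probability of an event under the uniform law on omega."""
    return Fraction(len(event), len(omega))

# Independence: P(A ∩ B) equals P(A)P(B).
assert P(A & B) == P(A) * P(B) == Fraction(1, 6)
```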
Remarks.
• P(A) is called the prior probability (initial probability) and P(A|B) the posterior probability, as the realization of B affects the likelihood of A occurring.
Proof. By definition,

P(A | B) = P(A ∩ B) / P(B)  ⟹  P(A ∩ B) = P(A | B)P(B),

then, by symmetry,

P(A ∩ B) = P(B | A)P(A) = P(A | B)P(B).
Remark. When two events are independent, the occurrence of one event does not provide any infor-
mation about the occurrence of the other. In this case, the conditional probability is equal to the
prior probability.
Proposition 1.4.3. If A and B are independent events, then the conditional probability P (A|B)
equals P (A),
P (A ∩ B) = P (A | B)P (B)
Statistical independence between A and B is equivalent to
P (A ∩ B) = P (A)P (B)
which leads to the relation
P (A | B) = P (A)
1.4.2 Total Probability
Theorem 1.4.4. If {A1 , A2 , . . . , An } is a complete system of mutually exclusive and exhaustive
events, then for any event B:
P(B) = Σ_{i=1}^{n} P(B|Ai)P(Ai)
This is known as the law of total probability.
Proof. Since the events Ai are mutually exclusive and their union is Ω, the event B decomposes into the disjoint union of the events Ai ∩ B. For each i, the compound probability formula gives

P(Ai ∩ B) = P(B | Ai) P(Ai),

which leads to:

P(B) = Σ_{i=1}^{n} P(B | Ai) P(Ai).
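As a numerical illustration of the theorem, consider a hypothetical complete system of three events A1, A2, A3 (the probabilities below are invented for the example); this Python sketch evaluates P(B) exactly:

```python
from fractions import Fraction

# Hypothetical complete system {A1, A2, A3} and conditional
# probabilities P(B | Ai); all numbers are made up for illustration.
P_A = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
P_B_given_A = [Fraction(1, 100), Fraction(2, 100), Fraction(5, 100)]

assert sum(P_A) == 1   # the system is complete (exhaustive, disjoint)

# Law of total probability: P(B) = sum_i P(B | Ai) P(Ai)
P_B = sum(pb * pa for pb, pa in zip(P_B_given_A, P_A))
print(P_B)   # prints 21/1000
```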
1.4.3 Bayes’ Theorem
Theorem 1.4.5. Let {A1 , A2 , . . . , An } be a complete system of events. For any event B such that
P (B) > 0, Bayes’ Theorem states that:
P(Ai|B) = P(B|Ai)P(Ai) / P(B)
Proof. According to the formula for conditional probability,

P(Ai | B) = P(Ai ∩ B) / P(B).

Thus, replacing P(Ai ∩ B) by P(B | Ai)P(Ai) and P(B) by its total-probability expansion,

P(Ai | B) = P(B | Ai)P(Ai) / Σ_{i=1}^{n} P(B | Ai)P(Ai).
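Bayes' theorem is easy to exercise numerically. In the hypothetical sketch below (all numbers invented for illustration), A1 = "has the disease", A2 = "does not", and B is a positive test result; the posterior P(A1 | B) follows from the formula above:

```python
from fractions import Fraction

# Hypothetical two-event complete system: prevalence 1%,
# test sensitivity 95%, false-positive rate 5% (invented numbers).
P_A = [Fraction(1, 100), Fraction(99, 100)]          # P(A1), P(A2)
P_B_given_A = [Fraction(95, 100), Fraction(5, 100)]  # P(B | Ai)

# Denominator: P(B) by the law of total probability.
P_B = sum(pb * pa for pb, pa in zip(P_B_given_A, P_A))

# Bayes' theorem: P(A1 | B) = P(B | A1) P(A1) / P(B).
posterior = P_B_given_A[0] * P_A[0] / P_B
print(posterior)   # prints 19/118, roughly 0.16
```

Note how small the posterior is despite the accurate test: the low prior P(A1) dominates, which is exactly what the theorem quantifies.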
Probability Law
A random variable is characterized by the set of values it can take and by the mathematical expression
of the probability of these values. This expression is called the probability law (or probability
distribution) of the random variable.
Definition 1.5.3. The probability distribution of a discrete random variable is entirely determined
by the probabilities pi of the events X = xi , where xi takes values over the image universe X(Ω).
The probability distribution is given by the pairs (xi, pi), and we denote P({X = xi}) as equivalent to P(X = xi) or pi. That means:

pi = P(X = xi) = P({ω ∈ Ω : X(ω) = xi}) = P(X⁻¹(xi)).
Example. In the case of a family with two children, if we assume that the probability of having a
boy is equal to that of having a girl (1/2), then the probability distribution or probability law for
the number of girls in a two-child family is:
Possible event sets of Ω    Values of X    P(X = xi) or pi
B and B                     0              1/4
B and G                     1              1/2
G and G                     2              1/4
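The distribution in the table can also be obtained by brute-force enumeration of the four equally likely families; the following Python sketch (illustrative) rebuilds it:

```python
from fractions import Fraction
from itertools import product

# The 4 equally likely two-child families (B = boy, G = girl).
omega = list(product("BG", repeat=2))

# X = number of girls; accumulate P(X = k) as exact fractions.
law = {}
for family in omega:
    k = family.count("G")
    law[k] = law.get(k, 0) + Fraction(1, len(omega))

assert law == {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
```

The value X = 1 receives probability 1/2 because two elementary outcomes, (B, G) and (G, B), map to it.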
Distribution Function
Definition 1.5.4. We call the distribution function of a random variable X, the function FX defined
as:
FX : R → R
t ↦ FX(t) = P(X ≤ t) = P({ω ∈ Ω : X(ω) ≤ t}) = P(X⁻¹((−∞, t]))
Concretely, the distribution function corresponds to the cumulative probability distribution. The plateau reached by the distribution function corresponds to the probability value 1 because Σᵢ pi = 1.
Proposition 1.5.5. Let FX be the distribution function of a discrete random variable X, then:
• ∀t ∈ R, 0 ≤ FX (t) ≤ 1,
• FX is non-decreasing on R,
• If a ≤ b, then P(a < X ≤ b) = FX(b) − FX(a).
Example (Tossing 3 Coins). We consider the random experiment "tossing 3 coins", with outcomes ω. We introduce a random variable X defined by X(ω) = "the number of heads in the outcome ω". The probability distribution of X is:
Number of heads    P(X = xi)    FX
0                  1/8          1/8
1                  3/8          4/8
2                  3/8          7/8
3                  1/8          1
In the case of a discrete random variable, a bar chart is used to visualize the probability distribution,
and a step function is used for the cumulative distribution function.
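The table and the step-shaped cumulative function can be reproduced by enumerating the 8 outcomes; a short illustrative Python sketch:

```python
from fractions import Fraction
from itertools import product, accumulate

# The 8 equally likely outcomes of tossing 3 fair coins (H = heads).
omega = list(product("HT", repeat=3))

# pmf of X = number of heads: 1/8, 3/8, 3/8, 1/8.
pmf = {k: Fraction(sum(1 for w in omega if w.count("H") == k), len(omega))
       for k in range(4)}

# Cumulative values F_X at the support points, as in the table above.
cdf = list(accumulate(pmf[k] for k in range(4)))
assert cdf == [Fraction(1, 8), Fraction(4, 8), Fraction(7, 8), Fraction(1)]
```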
Figure 1.1: Bar chart and distribution diagram corresponding to the tossing-3-coins example.
Remark. If X(Ω) is infinite, the expectation is not guaranteed to exist. The mathematical expectation is also denoted by µ(X) or µX if no confusion is to be feared.
Example. If we take the example of a family with two children, the expectation of the random
variable "number of daughters" is:
E(X) = 0 × 1/4 + 1 × 1/2 + 2 × 1/4 = 1.

Thus, E(X) = 1.
If we observe a sufficient number of families with 2 children, we expect on average one daughter per
family.
Definition 1.5.7. If X is a random variable with an expectation E(X), we call the variance of X
the real number:
V(X) = E[(X − E(X))²]
If X is a discrete random variable with a probability distribution (xi , pi ) defined on a finite number
n of elementary events, then the variance is given by:
V(X) = Σ_{i=1}^{n} (xi − E(X))² pi = Σ_{i=1}^{n} xi² pi − E(X)²
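Both expressions for the variance can be checked on the two-children example (where E(X) = 1); an illustrative Python sketch with exact fractions:

```python
from fractions import Fraction

# Law of X = number of girls in a two-child family.
law = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

E = sum(x * p for x, p in law.items())                   # expectation
V_def = sum((x - E) ** 2 * p for x, p in law.items())    # definition form
V_alt = sum(x * x * p for x, p in law.items()) - E ** 2  # E(X^2) - E(X)^2

assert E == 1
assert V_def == V_alt == Fraction(1, 2)
```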
Example. The following quantities are continuous random variables:
• the mass of newborn babies,
• the glucose level in blood,
• the amount of RAM used by a program, which can be modeled as a continuous random variable since it may take any value within a range (e.g., from 0 to the maximum available memory).
If we represent the first example as histograms with narrower and narrower classes, then as ∆x → 0 we obtain a curve of area 1 which, like the histograms, represents how the values of the random variable X are distributed. This curve is the graph of a function called a probability density, or simply a density.
A density f describes the distribution of a random variable X in the following way:

for all a, b ∈ R,  P(a ≤ X ≤ b) = ∫_a^b f(x) dx.

It follows that a density must satisfy:

for all x ∈ R, f(x) ≥ 0,  and  ∫_R f(x) dx = 1.
Definition 1.5.9. A probability density function is any piecewise continuous function

f : R → R, x ↦ f(x)

such that:
• For all x ∈ R, f(x) ≥ 0,
• ∫_R f(x) dx = 1.
Remark. For a continuous random variable X, the density f does not represent the probability of
the event X = x, because P (X = x) = 0.
Definition 1.5.10. As with discrete random variables, we define the cumulative distribution function
of X as:
FX : R → R, t 7→ FX (t) = P (X < t).
Thus, the relationship between the cumulative distribution function FX and the probability density
function f (x) is as follows:
Proposition 1.5.11. The cumulative distribution function FX is an antiderivative (primitive) of the probability density function f:

∀t ∈ R, FX(t) = P(X < t) = ∫_{−∞}^{t} f(x) dx.
Proposition 1.5.12. Let X be an absolutely continuous random variable with density f and cumu-
lative distribution function FX , then:
• P(a ≤ X ≤ b) = FX(b) − FX(a) = ∫_a^b f(x) dx, with a < b.
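The relation between FX and f can be checked numerically. The sketch below uses the exponential density f(x) = e^(−x) for x ≥ 0 (a standard example, chosen here purely for illustration) and approximates the integral with a midpoint Riemann sum:

```python
import math

# Exponential density with rate 1: f(x) = e^{-x} on [0, +inf),
# whose distribution function is F(t) = 1 - e^{-t}.
def f(x):
    return math.exp(-x) if x >= 0 else 0.0

def F(t):
    return 1 - math.exp(-t) if t >= 0 else 0.0

def integral(a, b, n=100_000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

a, b = 0.5, 2.0
# P(a <= X <= b) = F(b) - F(a) = integral of f over [a, b].
assert abs(integral(a, b) - (F(b) - F(a))) < 1e-9
```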
Proposition 1.5.15 (Properties of Expectation). If X and Y are two random variables defined on the
same probability space Ω and both have an expectation, then:
• E(X + Y ) = E(X) + E(Y )
• E(aX) = aE(X) for any a ∈ R
• If X ≥ 0, then E(X) ≥ 0
• If X is a constant random variable, i.e., ∀ω ∈ Ω, X(ω) = k, then E(X) = k
Proposition 1.5.16 (Properties of Variance). If X is a random variable that admits a variance, then:
• ∀a ∈ R, V(aX) = a²V(X)
• ∀a, b ∈ R, V(aX + b) = a²V(X)
• V(X) = E(X²) − E(X)²
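These properties can again be verified on the two-children law; in particular, the illustrative sketch below checks that a shift b leaves the variance unchanged while a factor a scales it by a² (exact fractions throughout):

```python
from fractions import Fraction

# Law of X = number of girls in a two-child family.
law = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def E(g):
    """Expectation of g(X) under the law above."""
    return sum(g(x) * p for x, p in law.items())

def V(g):
    """Variance of g(X), via the identity V = E(X^2) - E(X)^2."""
    return E(lambda x: g(x) ** 2) - E(g) ** 2

a, b = 3, 5
assert V(lambda x: a * x + b) == a ** 2 * V(lambda x: x)
assert V(lambda x: x + b) == V(lambda x: x)   # a shift does not change V
```

The constants a = 3 and b = 5 are arbitrary; any integers or fractions give the same identities.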