Lecture: Quantifying Uncertainty
Motivation
Consider an agent planning to drive to the airport in time for a flight (e.g., action A25 = leave 25 minutes before the flight). Problems:
1. partial observability (road state, other drivers’ plans, etc.)
2. noisy sensors (radio traffic reports)
3. uncertainty in action outcomes (flat tire, etc.)
4. immense complexity of modelling and predicting traffic
Knowledge representation
Language             Main elements                      Assignments
Propositional logic  facts                              T, F, unknown
First-order logic    facts, objects, relations          T, F, unknown
Temporal logic       facts, objects, relations, times   T, F, unknown
Temporal CSPs        time points                        time intervals
Fuzzy logic          set membership                     degree of truth
Probability theory   facts                              degree of belief
The first three do not represent uncertainty, while the last three do.
Example of uncertain reasoning
• Consider the following simple rule:
• Toothache ⇒ Cavity
• It is wrong: not all toothaches are due to cavities
• Toothache ⇒ Cavity ∨ GumProblem ∨ Abscess ∨ …
• Turning it around: Cavity ⇒ Toothache
• This rule is not true either (not all cavities cause pain)
• The only way to fix the rule is to make it logically exhaustive:
  to augment the left-hand side with all the qualifications required for a cavity to
  cause a toothache.
Handling Uncertain KB
• Logic fails in the medical domain for three reasons:
• Laziness: It is too much work to list the complete set of antecedents or
  consequents needed to ensure an exceptionless rule, and too hard to use such rules.
• Theoretical ignorance: There is no complete theory for the domain.
• Practical ignorance: Even if we knew all the rules, we might be uncertain about
  a particular patient because not all the necessary tests have been conducted.
• This is typical of the medical domain, and equally true of other domains: law, gardening, etc.
• The agent's knowledge can at best provide only a degree of belief
  in the relevant sentences.
• The tool for dealing with degrees of belief is probability theory.
Uncertainty and rational decisions
• For getting to the airport on time, the agent can prepare many different plans
• The agent must have preferences between the different possible
outcomes of the various plans
• We use utility theory to represent and reason with preferences.
Utility theory says that every state has a degree of usefulness, or utility.
• The agent prefers states of higher utility
• Preferences, as expressed by utilities, are combined with probabilities
in the general theory of rational decisions called decision theory.
• Decision theory = probability theory + utility theory
Decision-theoretic agent that selects rational actions
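A minimal Python sketch of one step of such an agent loop, following the structure of AIMA's decision-theoretic agent; the helpers update_belief, outcome_distribution, and utility are hypothetical placeholders, not a real API.

# Sketch of one step of a decision-theoretic agent (after AIMA Fig. 13.1).
# update_belief, outcome_distribution, and utility are hypothetical
# placeholders for a real belief-update procedure and utility model.
def dt_agent_step(belief_state, last_action, percept, actions,
                  update_belief, outcome_distribution, utility):
    # Update the belief state from the last action and the new percept.
    belief_state = update_belief(belief_state, last_action, percept)

    # Expected utility of an action: utilities of its possible outcomes,
    # weighted by their probabilities under the current belief state.
    def expected_utility(action):
        return sum(p * utility(outcome)
                   for outcome, p in outcome_distribution(belief_state, action))

    # Select the action with the highest expected utility.
    best_action = max(actions, key=expected_utility)
    return belief_state, best_action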
Probability
• Probabilistic assertions summarize effects of
• laziness: failure to enumerate exceptions, qualifications, etc.
• ignorance: lack of relevant facts, initial conditions, etc.
• Probabilities relate propositions to one’s own state of knowledge. They
might be learned from past experience of similar situations.
• e.g., P(A25) = 0.05, where A25 is the proposition that leaving 25 minutes before the flight gets us to the airport on time
• Probabilities of propositions change with new evidence: e.g.,
P (A25| no reported accidents) = 0.06
• e.g., P (A25| no reported accidents, 5am) = 0.15
Probability basics
Begin with a set Ω called the sample space
A sample space is a set of possible outcomes
Each ω ∈ Ω is a sample point (possible world, atomic event)
e.g., the 6 possible rolls of a die: {1, 2, 3, 4, 5, 6}
Probability space or probability model:
Take a sample space Ω, and
assign a number P(ω) (the probability of ω)
to every atomic event ω ∈ Ω
Probability basics (cont’d)
A probability space must satisfy the following properties:
0 ≤ P(ω) ≤ 1 for every ω ∈ Ω
Σ_{ω∈Ω} P(ω) = 1
e.g., for rolling the die,
P (1) = P (2) = P (3) = P (4) = P (5) = P (6) = 1/6.
An event A is any subset of Ω
The probability of an event is defined as the sum over its sample points:
P(A) = Σ_{ω∈A} P(ω)
e.g., P(die roll < 4) =
P(1) + P(2) + P(3) = 1/6 + 1/6 + 1/6 = 1/2
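A quick check of this computation in Python (the representation as a dictionary is my own choice, not from the slides):

# P(A) = sum of P(ω) for the sample points ω in the event A.
P = {w: 1/6 for w in range(1, 7)}   # uniform distribution over die rolls
A = {w for w in P if w < 4}         # the event "die roll < 4"
print(sum(P[w] for w in A))         # 0.5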
Random variables
A random variable is a function from sample points to some range such as
integers or Booleans.
We’ll use capitalized words for random variables.
e.g., rolling the die:
Odd(ω) = true if ω is odd, false otherwise
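A random variable is literally a function on sample points, so it can be sketched as one; the induced distribution sums the probabilities of the points mapped to each value (names here are illustrative):

# A random variable as a function on sample points.
P = {w: 1/6 for w in range(1, 7)}
def odd(w):                         # the random variable Odd
    return w % 2 == 1
# P(Odd = true): sum the probabilities of the sample points where Odd holds.
print(sum(p for w, p in P.items() if odd(w)))   # 0.5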
Inference by enumeration
Start with the full joint distribution for the Toothache, Cavity, Catch world:

             toothache              ~toothache
             catch     ~catch       catch     ~catch
cavity       .108      .012         .072      .008
~cavity      .016      .064         .144      .576
For any proposition, sum the probabilities of the atomic events where it holds, e.g.:
P(cavity ∨ toothache)
= 0.108 + 0.012 + 0.072 + 0.008 + 0.016 + 0.064
= 0.28
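As a sanity check, here is the joint table as a Python dictionary (the representation and the name joint are my own):

# Full joint distribution, keyed by (cavity, toothache, catch) truth values.
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}
# P(cavity ∨ toothache): add up the worlds in which either holds.
print(sum(p for (c, t, k), p in joint.items() if c or t))   # 0.28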
Computing a conditional probability
             toothache              ~toothache
             catch     ~catch       catch     ~catch
cavity       .108      .012         .072      .008
~cavity      .016      .064         .144      .576
P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache)
                       = (0.016 + 0.064) / (0.108 + 0.012 + 0.016 + 0.064)
                       = 0.08 / 0.2 = 0.4
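The same computation in Python, reusing the joint dictionary from the earlier sketch:

# Sum the joint over all worlds satisfying a predicate on (cavity, toothache, catch).
def marginal(pred):
    return sum(p for world, p in joint.items() if pred(*world))

p_not_cavity_and_toothache = marginal(lambda c, t, k: not c and t)
p_toothache = marginal(lambda c, t, k: t)
print(p_not_cavity_and_toothache / p_toothache)   # 0.08 / 0.2 = 0.4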
Normalization
             toothache              ~toothache
             catch     ~catch       catch     ~catch
cavity       .108      .012         .072      .008
~cavity      .016      .064         .144      .576
Recall that events are lower case, random variables are Capitalized
General idea: the factor 1/P(toothache), the inverse of the denominator, can be viewed as a normalization constant α.
We sum the joint distribution over the values of the hidden variable (here Catch) and normalize at the end.
P(Cavity|toothache) = αP(Cavity, toothache)
= α[P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)]
= α[< P(cavity, toothache, catch), P(¬cavity, toothache, catch) > +
< P(cavity, toothache, ¬catch), P(¬cavity, toothache, ¬catch) >]
= α[< 0.108, 0.016 > + < 0.012, 0.064 >]
= α[< 0.108 + 0.012, 0.016 + 0.064 >] = α < 0.12, 0.08 >
= < 0.6, 0.4 > because the entries must add up to 1
Here α = 1/(0.12 + 0.08) = 1/0.2 = 5.
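The normalization step in Python, reusing joint and marginal from the sketches above:

# Unnormalized vector <P(cavity, toothache), P(¬cavity, toothache)>, then
# scale so the entries sum to 1; P(toothache) is never computed explicitly.
unnormalized = [marginal(lambda c, t, k: c and t),
                marginal(lambda c, t, k: not c and t)]   # [0.12, 0.08]
alpha = 1 / sum(unnormalized)                            # alpha = 5
print([alpha * x for x in unnormalized])                 # [0.6, 0.4]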
Inference by enumeration, summary
Let X be the set of all variables. Typically, we are interested in the
posterior (conditional) joint distribution of the query variables Y given
specific values e for the evidence variables E. Let the hidden variables be
H = X − Y − E. Then the query is answered by summing out the hidden variables:
P(Y | E = e) = α P(Y, E = e) = α Σ_h P(Y, E = e, H = h)
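A generic Python sketch of this procedure; the function name and the representation of worlds as value tuples are my own choices, and joint is the dictionary from the earlier sketches:

# Inference by enumeration: sum out hidden variables, then normalize.
def enumerate_query(query_var, evidence, joint,
                    vars=("Cavity", "Toothache", "Catch")):
    qi = vars.index(query_var)
    dist = {}
    for world, p in joint.items():
        # Keep only worlds that agree with the evidence; accumulating by the
        # query value implicitly sums over the hidden variables.
        if all(world[vars.index(v)] == val for v, val in evidence.items()):
            dist[world[qi]] = dist.get(world[qi], 0.0) + p
    alpha = 1 / sum(dist.values())
    return {val: alpha * p for val, p in dist.items()}

print(enumerate_query("Cavity", {"Toothache": True}, joint))
# {True: 0.6, False: 0.4}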
Bayes' rule
From the product rule P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a), we obtain Bayes' rule:
P(Cause | Effect) = P(Effect | Cause) P(Cause) / P(Effect)
Bayes’ rule example
P(Cause | Effect) = P(Effect | Cause) P(Cause) / P(Effect)
E.g., let M be meningitis, S be stiff neck:
P(m | s) = P(s | m) P(m) / P(s)
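A worked instance in Python; the numbers below are illustrative assumptions, since the slides do not give values:

# Assumed illustrative values: P(s|m) = 0.7, P(m) = 1/50000, P(s) = 0.01.
p_s_given_m, p_m, p_s = 0.7, 1 / 50000, 0.01
print(p_s_given_m * p_m / p_s)   # P(m|s) = 0.0014

Even though the conditional probability P(s|m) is high, the posterior P(m|s) stays tiny because the prior P(m) is so small.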
The wumpus world
The agent is navigating the wumpus world in search of gold. There is one
wumpus; being in the same cell as the wumpus kills the agent, and the cells
adjacent to the wumpus have a stench.
[Figure: 4×4 grid; squares [1,1] and [2,1] visited and marked OK, with a breeze (B) observed]
Second term (the prior over pit configurations): pits are placed independently,
each square containing a pit with probability 0.2, so a complete configuration
with n pit-free squares has prior 0.2^(16−n) × 0.8^n. For example:
P(p1,1, . . . , p4,4) = 0.2^16 × 0.8^0, as n = 0
P(¬p1,1, . . . , p4,4) = 0.2^15 × 0.8^1, as n = 1
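The configuration prior as a two-line Python helper (the function name is my own):

# Prior of a complete pit configuration over 16 independent squares.
def config_prior(n_pit_free, n_squares=16, p=0.2):
    return p ** (n_squares - n_pit_free) * (1 - p) ** n_pit_free

print(config_prior(0))   # 0.2**16: every square has a pit
print(config_prior(1))   # 0.2**15 * 0.8: exactly one pit-free square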
Observations and query
We know the following facts (evidence):
b = ¬b1,1 ∧ b1,2 ∧ b2,1
known = ¬p1,1 ∧ ¬p1,2 ∧ ¬p2,1
The query is P(P1,3 | known, b).
We need to sum over the hidden variables, so define Unknown = the set of
variables Pi,j other than P1,3 and the Known squares.
[Figure: 4×4 grid; squares [1,1], [1,2], [2,1] visited and marked OK, with breeze (B) observed in [1,2] and [2,1]]
For inference by enumeration, we have
P(P1,3 | known, b) = α Σ_{unknown} P(P1,3, unknown, known, b)
[Figure: the five pit configurations of the frontier squares [2,2] and [3,1] consistent with the observations; weights 0.2 × 0.2 = 0.04, 0.2 × 0.8 = 0.16, 0.8 × 0.2 = 0.16 for P1,3 = true, and 0.2 × 0.2 = 0.04, 0.2 × 0.8 = 0.16 for P1,3 = false]
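To close the loop, here is a brute-force Python sketch of the enumeration above. It enumerates all pit assignments to the non-visited squares rather than exploiting the frontier structure; the helper names are my own, and the result works out to approximately < 0.31, 0.69 >.

from itertools import product

# Inference by enumeration for P(P1,3 | known, b) in the 4x4 wumpus world.
SQUARES = [(x, y) for x in range(1, 5) for y in range(1, 5)]
PIT_PRIOR = 0.2
VISITED = {(1, 1), (1, 2), (2, 1)}                      # known pit-free
BREEZE = {(1, 1): False, (1, 2): True, (2, 1): True}    # evidence b
QUERY = (1, 3)

def adjacent(sq):
    x, y = sq
    return [(x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 1 <= x + dx <= 4 and 1 <= y + dy <= 4]

def consistent(pits):
    # A visited square is breezy iff some adjacent square has a pit.
    return all(any(n in pits for n in adjacent(sq)) == b
               for sq, b in BREEZE.items())

unknown = [sq for sq in SQUARES if sq not in VISITED and sq != QUERY]
dist = {True: 0.0, False: 0.0}
for q_pit in (True, False):
    for bits in product((True, False), repeat=len(unknown)):
        pits = {sq for sq, has_pit in zip(unknown, bits) if has_pit}
        if q_pit:
            pits.add(QUERY)
        if consistent(pits):
            # Prior over the 13 non-visited squares; the visited squares'
            # constant factor 0.8**3 cancels in the normalization.
            n = len(pits)
            dist[q_pit] += PIT_PRIOR ** n * (1 - PIT_PRIOR) ** (13 - n)

alpha = 1 / sum(dist.values())
print({k: round(alpha * v, 2) for k, v in dist.items()})
# {True: 0.31, False: 0.69}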