Chapter 13: Uncertainty
Ali Naqvi
The world is not a well-defined place.
There is uncertainty in the facts we know:
◦ What’s the temperature? Imprecise measures
◦ Is Trump a good president? Imprecise definitions
◦ Where is the pit? Imprecise knowledge
There is uncertainty in our inferences
◦ If I have a blistery, itchy rash and was gardening all
weekend I probably have poison ivy
People make successful decisions all the time anyhow.
Uncertain data
◦ missing data, unreliable, ambiguous, imprecise representation,
inconsistent, subjective, derived from defaults, noisy…
Uncertain knowledge
◦ Multiple causes lead to multiple effects
◦ Incomplete knowledge of causality in the domain
◦ Probabilistic/stochastic effects
Uncertain knowledge representation
◦ restricted model of the real system
◦ limited expressiveness of the representation mechanism
Uncertain inference process
◦ Derived result is formally correct, but wrong in the real world
◦ New conclusions are not well-founded (e.g., inductive reasoning)
◦ Incomplete, default reasoning methods
Uncertainty techniques used in AI
systems include:
Probability
Bayes Theory
Certainty Factors
Fuzzy Logic
Traditional logic is monotonic
◦ The set of legal conclusions grows monotonically with the set
of facts appearing in our initial database
When humans reason, we use defeasible logic
◦ Almost every conclusion we draw is subject to reversal
◦ If we find contradicting information later, we’ll want to retract
earlier inferences
Nonmonotonic logic, or defeasible reasoning, allows a
statement to be retracted
Solution: Truth Maintenance
◦ Keep explicit information about which facts/inferences
support other inferences
◦ If the foundation disappears, so must the conclusion
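To make truth maintenance concrete, here is a minimal sketch (Python, not from the slides) of a justification-based truth maintenance system: every inferred belief records the beliefs that support it, and retracting a fact automatically withdraws everything that depended on it.

```python
# Minimal justification-based truth maintenance sketch (illustrative only).
class TMS:
    def __init__(self):
        self.supports = {}     # belief -> set of beliefs it depends on
        self.believed = set()

    def assert_fact(self, fact):
        self.believed.add(fact)
        self.supports[fact] = set()

    def infer(self, conclusion, premises):
        # Record the conclusion only if every premise is currently believed.
        if all(p in self.believed for p in premises):
            self.believed.add(conclusion)
            self.supports[conclusion] = set(premises)

    def retract(self, fact):
        # Removing a fact removes every conclusion whose support included it.
        self.believed.discard(fact)
        for belief, premises in list(self.supports.items()):
            if fact in premises and belief in self.believed:
                self.retract(belief)

tms = TMS()
tms.assert_fact("bird(tweety)")
tms.infer("flies(tweety)", ["bird(tweety)"])
tms.retract("bird(tweety)")                   # flies(tweety) is withdrawn too
print("flies(tweety)" in tms.believed)        # False
```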
Agents must still act even when the world is not certain
If the agent is not sure which of two squares has a pit, but must
enter one of them to reach the gold, it will take a chance
An agent that acts only when it is certain will, most of the time,
not act at all.
Example: An agent wants to drive someone to the airport to
catch a flight, and is considering plan A90 that involves leaving
home 90 minutes before the flight departs and driving at a
reasonable speed. Even though the Pullman airport is only 5
miles away, the agent will not be able to reach a definite
conclusion - it will be more like
“Plan A90 will get us to the airport in time, as long as my car
doesn't break down or run out of gas, and I don't get into an
accident, and there are no accidents on the Moscow-Pullman
highway, and the plane doesn't leave early, and there are no
thunderstorms in the area, …”
We may use this plan if it improves our situation, given known
information
Performance measure includes getting to the airport in time,
not waiting at the airport, and/or not getting a speeding ticket.
Consider the following plans for getting to the airport:
◦ P(A25 gets me there on time | ...) = 0.04
◦ P(A90 gets me there on time | ...) = 0.70
◦ P(A120 gets me there on time | ...) = 0.95
◦ P(A1440 gets me there on time | ...) = 0.9999
Which action should I choose?
Depends on my preferences for missing the flight vs. time
spent waiting, etc.
◦ Utility theory is used to represent and infer preferences
◦ Decision theory is a combination of probability theory
and utility theory
Decision theory = utility theory + probability theory
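As a rough illustration (the utilities and waiting times below are assumed, not from the slides), decision theory picks the plan with the highest expected utility rather than the highest probability of success, which is why A120 can beat A1440 despite its lower probability:

```python
# Hypothetical utilities: catching the flight is worth 100, missing it -500,
# and each minute spent waiting at the airport costs 0.2 utility units.
plans = {  # plan: (P(on time), expected minutes waiting at the airport)
    "A25":   (0.04,     0),
    "A90":   (0.70,    30),
    "A120":  (0.95,    60),
    "A1440": (0.9999, 1380),
}

def expected_utility(p_on_time, wait_minutes):
    return p_on_time * 100 + (1 - p_on_time) * (-500) - 0.2 * wait_minutes

for plan, args in plans.items():
    print(plan, round(expected_utility(*args), 1))
print("choose:", max(plans, key=lambda plan: expected_utility(*plans[plan])))  # A120
```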
Pure logic fails for three main reasons:
Laziness
◦ Too much work to list complete set of antecedents
or consequents needed to ensure an exceptionless
rule, too hard to use the enormous rules that
result
Theoretical ignorance
◦ Science has no complete theory for the domain
Practical ignorance
◦ Even if we know all the rules, we may be uncertain
about a particular patient because all the
necessary tests have not or cannot be run
Probabilities are numeric values between 0 and 1
(inclusive) that represent ideal certainties (not beliefs)
of statements, given assumptions about the
circumstances in which the statements apply.
These values can be verified by testing, unlike certainty
values. They apply in highly controlled situations.
Negation
P(~a) = 1 – P(a)
Conditional probability
◦ Once evidence is obtained, the agent can
use conditional probabilities, P(a|b)
◦ P(a|b) = probability of a being true given
that we know b is true
◦ The equation P(a|b) = P(a ^ b) / P(b) holds whenever P(b) > 0
Conjunction
◦ Product rule
◦ P(a^b) = P(a)*P(b|a)
◦ P(a^b) = P(b)*P(a|b)
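A small sketch (Python; the joint values are assumed, not from the slides) showing how the negation, conditional-probability, and product rules play out on a tiny joint distribution over two Boolean variables:

```python
# Assumed joint distribution over (a, b); the four entries sum to 1.
joint = {(True, True): 0.30, (True, False): 0.20,
         (False, True): 0.10, (False, False): 0.40}

P_a  = sum(p for (a, b), p in joint.items() if a)     # P(a)   = 0.5
P_b  = sum(p for (a, b), p in joint.items() if b)     # P(b)   = 0.4
P_ab = joint[(True, True)]                            # P(a^b) = 0.3

print(1 - P_a)              # negation:    P(~a) = 1 - P(a) = 0.5
print(P_ab / P_b)           # conditional: P(a|b) = P(a^b)/P(b) = 0.75
print(P_a * (P_ab / P_a))   # product rule: P(a)*P(b|a) recovers P(a^b) = 0.3
```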
For example, if we roll two dice, each showing one of six possible
numbers, the number of total unique rolls is 6*6 = 36. We
distinguish the dice in some way (a first and second or left and right
die). Here is a listing of the joint possibilities for the dice:
(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
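The joint table above can also be enumerated directly; the sketch below (Python, illustrative) counts outcomes to get probabilities such as P(sum = 7) and the conditional P(first die = 6 | sum = 7):

```python
from fractions import Fraction

# The 36 equally likely (first die, second die) outcomes.
rolls = [(i, j) for i in range(1, 7) for j in range(1, 7)]

p_sum7 = Fraction(sum(1 for r in rolls if sum(r) == 7), len(rolls))
p_first6_and_sum7 = Fraction(sum(1 for r in rolls if r[0] == 6 and sum(r) == 7),
                             len(rolls))

print(p_sum7)                       # P(sum = 7) = 1/6
print(p_first6_and_sum7 / p_sum7)   # P(first die = 6 | sum = 7) = 1/6
```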
Wet ~Wet
Rain 0.6 0.4
~Rain 0.4 0.6
A lunar lander crashes somewhere in your town (one of the cells
at random in the grid). The crash point is uniformly random (the
probability is uniformly distributed, meaning each location has
an equal probability of being the crash point).
D is the event that it crashes downtown.
R is the event that it crashes in the river.
What is P(R)? P(R) = 18/54 = 1/3
[Figure: three boxes of beads H1, H2, and H3; one box is chosen at random, with P(W|H1) = 3/4, P(W|H2) = 1/2, P(W|H3) = 0]
Observation: I draw a white bead.
P(H1|W) = P(H1)P(W|H1) / P(W)
= (1/3 * 3/4) / 5/12 = 3/12 * 12/5 = 36/60 = 3/5
P(H2|W) = P(H2)P(W|H2) / P(W)
= (1/3 * 1/2) / 5/12 = 1/6 * 12/5 = 12/30 = 2/5
P(H3|W) = P(H3)P(W|H3) / P(W)
= (1/3 * 0) / 5/12 = 0 * 12/5 = 0
If I replace the bead, then redraw another bead at
random from the same box, how well can I predict its
color before drawing it?
P(H1)=3/5, P(H2) = 2/5, P(H3) = 0
P(W) = P(W|H1)P(H1) + P(W|H2)P(H2) +
P(W|H3)P(H3)
= 3/4*3/5 + 1/2*2/5 + 0*0 = 9/20 + 4/20 = 13/20
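The same arithmetic in code (Python; the box contents are those implied by the likelihoods above):

```python
priors  = {"H1": 1/3, "H2": 1/3, "H3": 1/3}   # box chosen uniformly at random
p_white = {"H1": 3/4, "H2": 1/2, "H3": 0.0}   # P(W | box)

# Posterior over boxes after drawing a white bead (Bayes' rule).
p_w = sum(priors[h] * p_white[h] for h in priors)               # 5/12
posterior = {h: priors[h] * p_white[h] / p_w for h in priors}
print(posterior)            # ≈ {'H1': 0.6, 'H2': 0.4, 'H3': 0.0}

# Predictive probability that the next draw (after replacement) is white.
print(sum(posterior[h] * p_white[h] for h in priors))           # 0.65 = 13/20
```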
We wish to know the probability that John has malaria, given that he has
a slightly unusual symptom: a high fever.
We have 4 kinds of information
a) probability that a person has malaria regardless of symptoms
(0.0001)
b) probability that a person has the symptom of fever given that he
has malaria (0.75)
c) probability that a person has symptom of fever, given that he
does NOT have malaria (0.14)
d) John has a high fever
H = John has malaria
E = John has a high fever
P(H|E) = P(E|H) P(H) / P(E)
Given: P(H) = 0.0001, P(E|H) = 0.75, P(E|~H) = 0.14
Then P(E) = 0.75 * 0.0001 + 0.14 * 0.9999 = 0.14006
and P(H|E) = (0.75 * 0.0001) / 0.14006 = 0.0005354
On the other hand, if John did not have a fever, his probability of having malaria would be
P(H|~E) = P(~E|H) * P(H) / P(~E) = (0.25 * 0.0001) / (1 − 0.14006) ≈ 0.000029
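The same Bayes-rule computation in code (Python, using the numbers given above):

```python
p_h    = 0.0001   # P(H): prior probability of malaria
p_e_h  = 0.75     # P(E|H): fever given malaria
p_e_nh = 0.14     # P(E|~H): fever given no malaria

p_e = p_e_h * p_h + p_e_nh * (1 - p_h)   # total probability of fever
print(p_e)                               # 0.140061
print(p_e_h * p_h / p_e)                 # P(H|E)  ≈ 0.000535
print((1 - p_e_h) * p_h / (1 - p_e))     # P(H|~E) ≈ 0.000029
```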
Joint probability distribution for a set of r.v.s gives the probability of every
atomic event on those r.v.s (i.e., every sample point)
P(Weather, Cavity) is a 4 × 2 matrix of values, one entry for each combination of a Weather value and Cavity/¬Cavity.
Obvious problems:
1) Worst-case time complexity O(d^n), where d is the largest arity and n is the number of variables
2) Space complexity O(d^n) to store the joint distribution
3) How to find the numbers for O(d^n) entries???
P(Toothache, Cavity, Catch) has 2^3 − 1 = 7 independent entries
If I have a cavity, the probability that the probe catches in it doesn’t depend on
whether I have a toothache:
(1) P (catch|toothache, cavity) = P (catch|cavity)
[Figure: Cavity is the common cause of both Toothache and Catch]
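A sketch of inference by enumeration over a full joint distribution (Python; the joint values below are illustrative, not from the slides). It marginalizes and conditions exactly as described, and also checks the conditional-independence claim numerically, which is what lets the 7 free joint entries collapse to 2 + 2 + 1 numbers:

```python
# Full joint over (toothache, catch, cavity); illustrative values summing to 1.
joint = {
    (True,  True,  True):  0.108, (True,  False, True):  0.012,
    (True,  True,  False): 0.016, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, False, True):  0.008,
    (False, True,  False): 0.144, (False, False, False): 0.576,
}

def prob(pred):
    """Sum the joint entries of all atomic events satisfying pred."""
    return sum(p for event, p in joint.items() if pred(*event))

# Inference by enumeration: P(cavity | toothache).
print(prob(lambda t, c, cav: t and cav) / prob(lambda t, c, cav: t))                 # 0.6

# Conditional independence check: P(catch | toothache, cavity) == P(catch | cavity).
print(prob(lambda t, c, cav: t and c and cav) / prob(lambda t, c, cav: t and cav))   # 0.9
print(prob(lambda t, c, cav: c and cav) / prob(lambda t, c, cav: cav))               # 0.9
```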
In a 4×4 world where each square independently contains a pit with probability 0.2, a configuration with n pits has prior probability 0.2^n × 0.8^(16−n) for n pits.
We know the following facts:
b = ¬b1,1 ∧ b1,2 ∧ b2,1
known = ¬p1,1 ∧ ¬p1,2 ∧ ¬p2,1
Query is P(P1,3|known, b)
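A sketch of answering this query by enumeration (Python, not from the slides). The unknown squares outside the frontier {(1,3), (2,2), (3,1)} are not adjacent to any visited square, so they sum out of the query; it is enough to enumerate pit assignments to the frontier and keep those consistent with the breeze observations:

```python
from itertools import product

P_PIT = 0.2
frontier = [(1, 3), (2, 2), (3, 1)]   # squares adjacent to the visited ones

def consistent(pits):
    # b1,2 is true: a pit must lie in (1,3) or (2,2).
    # b2,1 is true: a pit must lie in (2,2) or (3,1).
    # ~b1,1 is already implied by known = no pits in (1,2), (2,1).
    return (pits[(1, 3)] or pits[(2, 2)]) and (pits[(2, 2)] or pits[(3, 1)])

total = query = 0.0
for assignment in product([True, False], repeat=len(frontier)):
    pits = dict(zip(frontier, assignment))
    if not consistent(pits):
        continue
    weight = 1.0
    for square in frontier:
        weight *= P_PIT if pits[square] else 1 - P_PIT
    total += weight
    if pits[(1, 3)]:
        query += weight

print(query / total)   # P(P1,3 | known, b) ≈ 0.31
```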