
COMP6275 – Artificial Intelligence

Week 5
Quantifying Uncertainty
LEARNING OUTCOMES
At the end of this session, students will be able to:
o LO3: Demonstrate how to achieve a goal through a sequence of actions, called planning
o LO4: Apply various techniques to an agent acting under uncertainty, and show how to process
natural language and other perceptual signals so that the agent can interact
intelligently with the real world
LEARNING OBJECTIVE
1. Acting Under Uncertainty
2. Basic Probability Notation
3. Inference Using Full Joint Distributions
4. Independence
5. Probability and Bayes’ Theorem
6. Summary
ACTING UNDER UNCERTAINTY
o An agent may need to handle uncertainty, whether due to partial
observability, nondeterminism, or a combination of the two. An
agent may never know for certain what state it is in or where it
will end up after a sequence of actions.
o The agent's knowledge can at best provide only a degree of
belief in the relevant sentences.
o The main tool for dealing with degrees of belief is probability theory.
o Probability provides a way of summarizing the uncertainty that
comes from laziness and ignorance, thereby solving the
qualification problem.
ACTING UNDER UNCERTAINTY

Uncertainty and rational decisions


o To make such choices, an agent must first have preferences between the
different possible outcomes of the various plans.
o Preferences, as expressed by utilities, are combined with probabilities in the
general theory of rational decisions called decision theory:
Decision theory = probability theory + utility theory
o The fundamental idea of decision theory is that an agent is rational if and only
if it chooses the action that yields the highest expected utility, averaged over all
the possible outcomes of the action. This is called the principle of maximum
expected utility (MEU).
o The primary difference is that the decision-theoretic agent's belief state
represents not just the possibilities for world states but also their probabilities.
Given the belief state, the agent can make probabilistic predictions of action
outcomes and hence select the action with the highest expected utility.
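To make the MEU principle concrete, here is a minimal Python sketch of expected-utility action selection. The two candidate actions and all utilities below are invented for illustration (the lecture only gives the 0.04 success probability for A25); this is not a method prescribed by the slides.

```python
# A minimal sketch of maximum expected utility (MEU) action selection.
# The outcome probabilities and utilities are hypothetical illustrations.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

# Hypothetical belief state: leave 25 vs. 90 minutes before the flight.
actions = {
    "A25": [(0.04, 100), (0.96, -50)],  # arrive on time vs. miss the flight
    "A90": [(0.70, 80),  (0.30, -50)],  # longer wait, higher success chance
}

best = max(actions, key=lambda a: expected_utility(actions[a]))
for a, outs in actions.items():
    print(a, expected_utility(outs))    # A25: -44.0, A90: 41.0
print("MEU action:", best)              # A90
```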
ACTING UNDER UNCERTAINTY
Methods for Handling Uncertainty
o Default or nonmonotonic logic:
• Assume my car does not have a flat tire
• Assume A25 (leave for the airport 25 minutes before the flight) works unless contradicted by evidence
• Issues: What assumptions are reasonable? How to handle contradiction?
o Rules with fudge factors:
• A25 |→0.3 get there on time
• Sprinkler |→0.99 WetGrass
• WetGrass |→0.7 Rain
• Issues: problems with combination, e.g., does Sprinkler cause Rain?
o Probability:
• Models the agent's degree of belief given the available evidence,
e.g., A25 will get me there on time with probability 0.04
ACTING UNDER UNCERTAINTY
Summarizing Uncertainty
Probabilistic assertions summarize effects of
• laziness: failure to enumerate exceptions, qualifications, etc.
• ignorance: lack of relevant facts, initial conditions, etc.

Subjective probability:
o Probabilities relate propositions to the agent's own state of knowledge,
e.g., P(A25 | no reported accidents) = 0.06
o These are not assertions about the world.
o Probabilities of propositions change with new evidence,
e.g., P(A25 | no reported accidents, 5 a.m.) = 0.15
BASIC PROBABILITY NOTATION
o Probability model: 0 ≤ P(ω) ≤ 1 for every possible world ω, and Σω P(ω) = 1
o For any proposition φ, P(φ) = Σω∈φ P(ω)
o Conditional probability for propositions a and b is:
P(a | b) = P(a ∧ b) / P(b), whenever P(b) > 0
o For example (the "|" is pronounced "given"):
P(cavity | toothache) = P(cavity ∧ toothache) / P(toothache)
o In a different form, called the product rule:
P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
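As a quick illustration of the two formulas above, the following sketch computes a conditional probability and checks the product rule on an assumed two-variable joint distribution; the numbers are the cavity/toothache marginals of the table used later in this deck.

```python
# Conditional probability and the product rule on an assumed joint
# distribution over (cavity, toothache).

joint = {
    (True, True): 0.12,    # cavity and toothache
    (True, False): 0.08,
    (False, True): 0.08,
    (False, False): 0.72,
}

p_toothache = sum(p for (cav, tooth), p in joint.items() if tooth)
p_cavity_and_toothache = joint[(True, True)]

# P(cavity | toothache) = P(cavity ∧ toothache) / P(toothache)
p_cavity_given_toothache = p_cavity_and_toothache / p_toothache
print(p_cavity_given_toothache)            # 0.6

# Product rule: P(cavity ∧ toothache) = P(cavity | toothache) P(toothache)
assert abs(p_cavity_given_toothache * p_toothache - p_cavity_and_toothache) < 1e-12
```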
BASIC PROBABILITY NOTATION
Permutations and Combinations
o If you have a collection of n distinguishable objects, then
the number of ways you can pick r of them (r ≤ n) is given
by the permutation and combination relationships:
nPr = n! / (n − r)!        nCr = n! / (r! (n − r)!)
o For example, if you have six persons for tennis, then the
number of pairings for singles tennis is
6C2 = 6! / (2! × 4!) = 15
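Python's standard library evaluates both formulas directly (math.perm and math.comb, available since Python 3.8), so the tennis example can be checked as follows:

```python
import math

# Permutations: ordered selections of r from n  -> n! / (n - r)!
# Combinations: unordered selections of r from n -> n! / (r! (n - r)!)
n, r = 6, 2
print(math.perm(n, r))   # 30 ordered pairs
print(math.comb(n, r))   # 15 unordered pairings for singles tennis
```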
BASIC PROBABILITY NOTATION
Basic Probability
o If we identify the set of all possible outcomes as the
"sample space" and denote it by S, and label the desired
event as E, then the probability of event E can be written:
P(E) = (number of outcomes in E) / (number of outcomes in S)
o For the probability of a throw of a pair of dice, bet on the
number 7, since it is the most probable. There are six ways
to throw a 7, out of 36 possible outcomes for a throw. The
probability is then P(7) = 6/36 = 1/6
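The same ratio can be verified by brute-force enumeration of the sample space, for example:

```python
from itertools import product

# Enumerate the 36 equally likely outcomes of a pair of dice and count
# the ways to throw a 7: P(E) = |E| / |S|.
sample_space = list(product(range(1, 7), repeat=2))
event = [roll for roll in sample_space if sum(roll) == 7]
print(len(event), len(sample_space))     # 6 36
print(len(event) / len(sample_space))    # 0.1666... = 1/6
```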
BASIC PROBABILITY NOTATION
Playing Cards
o Draw five cards from a standard deck of 52 playing cards,
and calculate the probability that all five cards are hearts.
This desired event brings in the idea of a combination.
The number of ways to pick five hearts, without regard to
which hearts or which order, is given by the combination
13C5 = 13! / (5! × 8!) = 1287
o The total number of possible outcomes is given by the
much larger combination
52C5 = 52! / (5! × 47!) = 2,598,960
BASIC PROBABILITY NOTATION
Playing Cards
o The same basic probability expression is used,
but it takes the form
P(five hearts) = 13C5 / 52C5 = 1287 / 2,598,960 ≈ 0.000495
o So drawing a five-card hand of a single selected
suit is a rare event, with a probability of about
one in 2000.
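A short check of this calculation, again using math.comb:

```python
import math

# P(five hearts) = C(13, 5) / C(52, 5)
ways_hearts = math.comb(13, 5)    # 1287
ways_any = math.comb(52, 5)       # 2598960
print(ways_hearts, ways_any)
print(ways_hearts / ways_any)     # ~0.000495, about 1 in 2000
```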
BASIC PROBABILITY NOTATION
Logic Gate
o If the events are related by a logical AND, the resultant probability
is the product of the individual probabilities. If you want the
probability of throwing a 7 with a pair of dice AND throwing
another 7 on the second throw, then the probability is the
product 1/6 × 1/6 = 1/36
o Probability has no causative effect, so prior events have no
influence on the probability of future events. For example:
• Probability of throwing a "2" with a single die: 1/6
• Probability of throwing "2" twice in a row, "2" AND "2": 1/6 × 1/6 = 1/36
• Probability of throwing a "2" on the next throw: 1/6
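A sketch that verifies the product rule for the two-sevens case by enumerating every outcome of two successive throws of a pair of dice:

```python
from itertools import product

# For independent events, P(A AND B) = P(A) * P(B). Verify 1/36 for two
# sevens by enumerating all 36 * 36 outcomes of two throws.
throws = list(product(range(1, 7), repeat=2))          # one throw of a pair
two_throws = list(product(throws, repeat=2))           # two throws in a row
sevens = [t for t in two_throws if sum(t[0]) == 7 and sum(t[1]) == 7]
print(len(sevens) / len(two_throws))   # 0.02777... = (1/6) * (1/6) = 1/36
```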
BASIC PROBABILITY NOTATION
Probability Axioms and Their Reasonableness
o Relationship between the probability of a proposition and the
probability of its negation:
P(¬a) = 1 − P(a)
o Probability of a disjunction (the inclusion-exclusion principle):
P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
BASIC PROBABILITY NOTATION
Example:
o 100 children were asked "What is the meaning of the traffic
light?", and the answers were:
75 children know the meaning of the red lamp
35 children know the meaning of the yellow lamp
50 children know the meaning of both
o Therefore:
P(red ∨ yellow) = P(red) + P(yellow) − P(red ∧ yellow)
P(red ∨ yellow) = 0.75 + 0.35 − 0.5 = 0.6
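The same inclusion-exclusion arithmetic as a quick check:

```python
# P(red ∨ yellow) = P(red) + P(yellow) - P(red ∧ yellow)
p_red, p_yellow, p_both = 0.75, 0.35, 0.50
print(p_red + p_yellow - p_both)   # 0.6
```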
INFERENCE USING FULL
JOINT DISTRIBUTIONS
o A complete specification of the state of the world
about which the agent is uncertain
o E.g., if the world consists of only two Boolean variables
Cavity and Toothache, then there are 4 distinct atomic events:

Cavity = false ∧ Toothache = false
Cavity = false ∧ Toothache = true
Cavity = true ∧ Toothache = false
Cavity = true ∧ Toothache = true

o Atomic events are mutually exclusive and exhaustive
INFERENCE USING FULL
JOINT DISTRIBUTIONS
Full Joint Distribution
o Start with the joint probability distribution:

              toothache            ¬toothache
           catch    ¬catch       catch    ¬catch
cavity     0.108    0.012        0.072    0.008
¬cavity    0.016    0.064        0.144    0.576

o For any proposition φ, sum the atomic events where it is true:
P(φ) = Σω:ω⊨φ P(ω)
INFERENCE USING FULL
JOINT DISTRIBUTIONS
Inference by Enumeration
o Start with the joint probability distribution (table above).
o There are six possible worlds in which cavity ∨ toothache holds:

P(cavity ∨ toothache)
= 0.108 + 0.012 + 0.072 + 0.008 + 0.016 + 0.064 = 0.28
INFERENCE USING FULL
JOINT DISTRIBUTIONS
Inference by Enumeration
o Start with the joint probability distribution (table above).
o Marginalization:
P(cavity) = 0.108 + 0.012 + 0.072 + 0.008 = 0.2


INFERENCE USING FULL
JOINT DISTRIBUTIONS
Inference by Enumeration
o Start with the joint probability distribution (table above).
o Can also compute conditional probabilities:

P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache)
= (0.016 + 0.064) / (0.108 + 0.012 + 0.016 + 0.064)
= 0.08 / 0.2 = 0.4
INFERENCE USING FULL
JOINT DISTRIBUTIONS
Normalization

o The denominator can be viewed as a normalization constant α:

P(Cavity | toothache) = α P(Cavity, toothache)
= α [P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)]
= α [⟨0.108, 0.016⟩ + ⟨0.012, 0.064⟩]
= α ⟨0.12, 0.08⟩ = ⟨0.6, 0.4⟩
General Idea: compute distribution on query variable by fixing
evidence variables and summing over hidden variables
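The enumeration steps on the last few slides can be reproduced with a few lines of Python over the full joint table (the same eight numbers from Russell & Norvig, Chapter 13); a sketch:

```python
# Inference from the full joint distribution over
# (toothache, catch, cavity), using the standard AIMA Ch. 13 table.
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.016,
    (True,  False, True):  0.012, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, True,  False): 0.144,
    (False, False, True):  0.008, (False, False, False): 0.576,
}

def prob(pred):
    """P(φ): sum the probabilities of the worlds where φ holds."""
    return sum(p for world, p in joint.items() if pred(*world))

print(prob(lambda t, c, cav: cav))           # P(cavity) = 0.2
print(prob(lambda t, c, cav: cav or t))      # P(cavity ∨ toothache) = 0.28

# Normalization: P(Cavity | toothache) = α P(Cavity, toothache)
unnorm = [prob(lambda t, c, cav, v=v: t and cav == v) for v in (True, False)]
alpha = 1 / sum(unnorm)
print([alpha * u for u in unnorm])           # [0.6, 0.4]
```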
INFERENCE USING FULL
JOINT DISTRIBUTIONS
Inference by Enumeration
o Typically, we are interested in the posterior joint distribution of the query
variables Y given specific values e for the evidence variables E.

o Let the hidden variables be H = X − Y − E.

o Then the required summation of joint entries is done by summing out the
hidden variables:
P(Y | E = e) = α P(Y, E = e) = α Σh P(Y, E = e, H = h)

o The terms in the summation are joint entries, because Y, E, and H together
exhaust the set of random variables.

o Obvious problems:
1. Worst-case time complexity O(d^n), where n is the number of variables
and d is the size of the largest variable domain
2. Space complexity O(d^n) to store the joint distribution
3. How to find the numbers for O(d^n) entries?
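A generic sketch of this enumeration scheme for Boolean variables, reusing the same joint table; `enumerate_query` is a hypothetical helper name, not an API from the references:

```python
# Inference by enumeration: P(Y | E = e) = α Σh P(Y, E = e, H = h).
vars_ = ("toothache", "catch", "cavity")
joint = {  # same table as above
    (True, True, True): 0.108, (True, True, False): 0.016,
    (True, False, True): 0.012, (True, False, False): 0.064,
    (False, True, True): 0.072, (False, True, False): 0.144,
    (False, False, True): 0.008, (False, False, False): 0.576,
}

def enumerate_query(query_var, evidence):
    """Return the normalized distribution P(query_var | evidence)."""
    dist = {}
    for qval in (True, False):
        total = 0.0
        for world, p in joint.items():
            a = dict(zip(vars_, world))
            # Keep worlds consistent with the query value and the evidence;
            # every remaining (hidden) variable is summed out implicitly.
            if a[query_var] == qval and all(a[k] == v for k, v in evidence.items()):
                total += p
        dist[qval] = total
    alpha = 1.0 / sum(dist.values())          # normalization constant
    return {k: alpha * v for k, v in dist.items()}

print(enumerate_query("cavity", {"toothache": True}))  # {True: 0.6, False: 0.4}
```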
INDEPENDENCE
o P(Toothache, Catch, Cavity, Weather)
has 2 × 2 × 2 × 4 = 32 entries
o For example, how are
P(toothache, catch, cavity, cloudy) and P(toothache, catch, cavity) related?

o Use the product rule:

P(toothache, catch, cavity, cloudy)
= P(cloudy | toothache, catch, cavity) P(toothache, catch, cavity)

o The weather does not influence the dental variables, therefore:

P(cloudy | toothache, catch, cavity) = P(cloudy)
INDEPENDENCE
o A and B are independent iff
P(A | B) = P(A), or P(B | A) = P(B), or P(A, B) = P(A) P(B)

P(Toothache, Catch, Cavity, Weather)
= P(Toothache, Catch, Cavity) P(Weather)

o 32 entries reduced to 12; for n independent biased coins, O(2^n) → O(n)

o Absolute independence is powerful but rare.

o Dentistry is a large field with hundreds of variables, none of which are
independent. What to do?
INDEPENDENCE
Conditional Independence
o P(Toothache, Cavity, Catch) has 2^3 − 1 = 7 independent entries
o If I have a cavity, the probability that the probe catches in it doesn't
depend on whether I have a toothache:
(1) P(catch | toothache, cavity) = P(catch | cavity)
o The same independence holds if I haven't got a cavity:
(2) P(catch | toothache, ¬cavity) = P(catch | ¬cavity)
o Catch is conditionally independent of Toothache given Cavity:
P(Catch | Toothache, Cavity) = P(Catch | Cavity)
o Equivalent statements:
P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
INDEPENDENCE
Conditional Independence
o Write out full joint distribution using chain rule:
P(Toothache, Catch, Cavity)
= P(Toothache | Catch, Cavity) P(Catch, Cavity)
= P(Toothache | Catch, Cavity) P(Catch | Cavity) P(Cavity)
= P(Toothache | Cavity) P(Catch | Cavity) P(Cavity)
I.e., 2 + 2 + 1 = 5 independent numbers

o In most cases, using conditional independence reduces the size of the
representation of the joint distribution from exponential in n to linear in n.
o Conditional independence is our most basic and robust form of
knowledge about uncertain environments.
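The factorization derived above can be checked numerically against the joint table used earlier; a sketch:

```python
# Check: P(toothache, catch, cavity)
#      = P(toothache | cavity) P(catch | cavity) P(cavity)
joint = {
    (True, True, True): 0.108, (True, True, False): 0.016,
    (True, False, True): 0.012, (True, False, False): 0.064,
    (False, True, True): 0.072, (False, True, False): 0.144,
    (False, False, True): 0.008, (False, False, False): 0.576,
}
p_cavity = sum(p for (t, c, cav), p in joint.items() if cav)                        # 0.2
p_t_given_cav = sum(p for (t, c, cav), p in joint.items() if t and cav) / p_cavity  # 0.6
p_c_given_cav = sum(p for (t, c, cav), p in joint.items() if c and cav) / p_cavity  # 0.9
print(p_t_given_cav * p_c_given_cav * p_cavity)   # 0.108 = joint[(True, True, True)]
```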
PROBABILITY AND BAYES’ THEOREM
Definition:
P(Hi | E) = probability that hypothesis Hi is true given evidence E
P(E | Hi) = probability that evidence E appears given that hypothesis Hi is true
P(Hi) = a priori probability: the probability that hypothesis Hi holds
without looking at any evidence
k = number of possible hypotheses

P(Hi | E) = P(E | Hi) P(Hi) / Σn=1..k P(E | Hn) P(Hn)
PROBABILITY AND BAYES’ THEOREM
Example 1:
Vany has symptoms such as spots on her face. The doctor diagnoses that
Vany has chicken pox, with the following probabilities:
 Probability of spots on the face if Vany has chicken pox,
P(spots | chicken pox) = 0.8
 Probability that Vany has chicken pox before observing any symptoms,
P(chicken pox) = 0.4
 Probability of spots on the face if Vany has an allergy,
P(spots | allergy) = 0.3
 Probability that Vany has an allergy before observing any symptoms,
P(allergy) = 0.7
 Probability of spots on the face if Vany has pimples,
P(spots | pimples) = 0.9
 Probability that Vany has pimples before observing any symptoms,
P(pimples) = 0.5
Calculate the posterior probability of each diagnosis given the spots!
PROBABILITY AND BAYES’ THEOREM
Solution:
P(chickenpox | spots)
= P(spots | chickenpox) × P(chickenpox) /
[P(spots | chickenpox) × P(chickenpox) + P(spots | allergy) × P(allergy)
+ P(spots | pimples) × P(pimples)]
= (0.8 × 0.4) / [(0.8 × 0.4) + (0.3 × 0.7) + (0.9 × 0.5)]
= 0.32 / 0.98 ≈ 0.327

In the same way, we can obtain:

P(allergy | spots) = (0.3 × 0.7) / 0.98 ≈ 0.214
P(pimples | spots) = (0.9 × 0.5) / 0.98 ≈ 0.459
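The whole computation is a few lines of Python; the dictionary below simply restates the likelihoods and priors from the example:

```python
# Posterior for each hypothesis H_i given evidence E:
# P(H_i | E) = P(E | H_i) P(H_i) / Σn P(E | H_n) P(H_n)
hypotheses = {
    "chickenpox": (0.8, 0.4),   # (P(spots | H), P(H))
    "allergy":    (0.3, 0.7),
    "pimples":    (0.9, 0.5),
}
evidence_prob = sum(lik * prior for lik, prior in hypotheses.values())  # 0.98
for h, (lik, prior) in hypotheses.items():
    print(h, round(lik * prior / evidence_prob, 3))
# chickenpox 0.327, allergy 0.214, pimples 0.459
```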
PROBABILITY AND BAYES’ THEOREM
Example 2:
o Problem : Marie is getting married tomorrow, at an outdoor ceremony in
the desert. In recent years, it has rained only 5 days each year.
Unfortunately, the weatherman has predicted rain for tomorrow. When it
actually rains, the weatherman correctly forecasts rain 90% of the time.
When it doesn't rain, he incorrectly forecasts rain 10% of the time. What is
the probability that it will rain on the day of Marie's wedding?
o Solution: The sample space is defined by two mutually exclusive events:
it rains or it does not rain. Additionally, a third event occurs when the
weatherman predicts rain. Notation for these events appears below.
 Event A1. It rains on Marie's wedding.
 Event A2. It does not rain on Marie's wedding.
 Event B. The weatherman predicts rain.
PROBABILITY AND BAYES’ THEOREM
In terms of probabilities, we know the following:
o P( A1 ) = 5/365 = 0.0136985 [It rains 5 days out of the year.]
o P( A2 ) = 360/365 = 0.9863014 [It does not rain 360 days out of
the year.]
o P( B | A1 ) = 0.9 [When it rains, the weatherman predicts rain
90% of the time.]
o P( B | A2 ) = 0.1 [When it does not rain, the weatherman predicts
rain 10% of the time.]

o We want to know P( A1 | B ), the probability that it will rain on the
day of Marie's wedding, given a forecast for rain by the
weatherman. The answer can be determined from Bayes' rule,
shown on the next slide.
PROBABILITY AND BAYES’ THEOREM
The Answer

P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]
= (0.0136985 × 0.9) / [(0.0136985 × 0.9) + (0.9863014 × 0.1)]
≈ 0.0123 / 0.1110 ≈ 0.111

• Note the somewhat unintuitive result. Even when the
weatherman predicts rain, it rains only about 11%
of the time. Despite the weatherman's gloomy
prediction, there is a good chance that Marie will not
get rained on at her wedding.
PROBABILITY AND BAYES’ THEOREM
Applying Bayes' Rule: The Simple Case
o Product rule:
P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
o Equating the two right-hand sides and dividing by P(a) gives Bayes' rule:
P(b | a) = P(a | b) P(b) / P(a)
o Or in distribution form:
P(Y | X) = P(X | Y) P(Y) / P(X) = α P(X | Y) P(Y)
o We often perceive as evidence the effect of some unknown cause and want
to determine that cause. Bayes' rule becomes:
P(cause | effect) = P(effect | cause) P(cause) / P(effect)
PROBABILITY AND BAYES’ THEOREM
Example:
o A doctor knows that meningitis causes a stiff neck, say, 70% of the time.
The doctor also knows some unconditional facts: the prior probability
that a patient has meningitis is 1/50000, and the prior probability that
any patient has a stiff neck is 1%.
o Letting:
s = the proposition that the patient has a stiff neck
m = the proposition that the patient has meningitis
P(m | s) = P(s | m) P(m) / P(s) = (0.7 × 1/50000) / 0.01 = 0.0014
PROBABILITY AND BAYES’ THEOREM
Bayes' Rule : Combining Evidence
P(Cavity | toothache ∧ catch)
= α P(toothache ∧ catch | Cavity) P(Cavity)
= α P(toothache | Cavity) P(catch | Cavity) P(Cavity)
o This is an example of a naive Bayes model:

P(Cause, Effect1, … , Effectn) = P(Cause) Πi P(Effecti | Cause)

o The total number of parameters is linear in n
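A minimal naive Bayes sketch of this combining-evidence step; the conditional probabilities below are the ones recoverable from the joint table used earlier (e.g., P(toothache | cavity) = 0.6, P(catch | cavity) = 0.9), not values stated on this slide:

```python
# Naive Bayes: P(Cause | e_1..e_n) ∝ P(Cause) Π_i P(e_i | Cause).
p_cavity = {True: 0.2, False: 0.8}
p_toothache_given = {True: 0.6, False: 0.1}   # P(toothache | Cavity)
p_catch_given = {True: 0.9, False: 0.2}       # P(catch | Cavity)

unnorm = {cav: p_cavity[cav] * p_toothache_given[cav] * p_catch_given[cav]
          for cav in (True, False)}
alpha = 1 / sum(unnorm.values())
print({cav: round(alpha * u, 3) for cav, u in unnorm.items()})
# {True: 0.871, False: 0.129} = P(Cavity | toothache ∧ catch)
```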
SUMMARY
o Uncertainty arises because of both laziness and ignorance. It is inescapable in
complex, nondeterministic, or partially observable environments. Decision theory
combines the agent's beliefs and desires, defining the best action as the one that
maximizes expected utility.
o Basic probability statements include prior probabilities and conditional probabilities
over simple and complex propositions.
o The axioms of probability constrain the possible assignments of probabilities to
propositions. An agent that violates the axioms must behave irrationally in some
cases.
o The full joint probability distribution specifies the probability of each complete
assignment of values to random variables. It is usually too large to create or use in its
explicit form, but when it is available it can be used to answer queries simply by
adding up entries for the possible worlds corresponding to the query propositions.
o Bayes’ rule allows unknown probabilities to be computed from known conditional
probabilities, usually in the causal direction. Applying Bayes’ rule with many pieces
of evidence runs into the same scaling problems as does the full joint distribution.
REFERENCES
o Stuart Russell, Peter Norvig. 2010. Artificial Intelligence: A Modern
Approach. Pearson Education. New Jersey. ISBN: 9780132071482.
Chapter 13.
o Elaine Rich, Kevin Knight, Shivashankar B. Nair. 2010.
Artificial Intelligence. McGraw-Hill Education. New York. Chapter 7.
o Reasoning under Uncertainty:
http://aitopics.net/Uncertainty
o Handling Uncertainty in Artificial Intelligence, and the
Bayesian Controversy:
http://eprints.ucl.ac.uk/16378/1/16378.pdf
Thank You...
