Chapter 13: Uncertainty
Prepared by
Atalla Salh
Ihab Ameer
Melad Muzhar
ACTING UNDER UNCERTAINTY
When an agent knows enough facts about its
environment, the logical approach enables it to derive
plans that are guaranteed to work. This is a good thing.
Unfortunately, agents almost never have access to the
whole truth about their environment. Agents must,
therefore, act under uncertainty.
Uncertainty arises because of both
laziness and ignorance. It is inescapable in
complex, dynamic, or inaccessible worlds.
Propositions
We have seen two formal languages, propositional logic and first-order logic, for
stating propositions. Probability theory typically uses a language that is slightly more
expressive than propositional logic.
The basic element of the language is the random variable, which can be thought of as
referring to a "part" of the world whose "status" is initially unknown.
Each random variable has a domain of values that it can take on.
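As a small illustration (not from the slides; the variable names and domains below are assumed), random variables and their domains can be written down as a simple mapping in Python:

# A minimal sketch: random variables and their domains (names/values assumed).
domains = {
    "Cavity": [True, False],                          # Boolean random variable
    "Toothache": [True, False],                       # Boolean random variable
    "Weather": ["sunny", "rain", "cloudy", "snow"],   # discrete random variable
}
for var, dom in domains.items():
    print(f"{var} can take on the values {dom}")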
Probability Theory: The Basics (AI Style)
• Like logical assertions, probabilistic assertions are about possible worlds.
• Logical assertions say which possible worlds (interpretations) are ruled out (those in which the KB assertions are false).
• Probabilistic assertions talk about how probable the various worlds are.
The Basics (cont’d)
• The set of possible worlds is called the sample space (denoted by Ω). The elements of Ω (sample points) will be denoted by ω.
• The possible worlds of Ω (e.g., the outcomes of throwing a die) are mutually exclusive and exhaustive.
• In standard probability theory textbooks, instead of possible worlds we talk about outcomes, and instead of sets of possible worlds we talk about events (e.g., the event that two dice sum to 11).
• We will represent events by propositions in a logical language, which we will define formally later.
Atomic events
The notion of an atomic event is useful in understanding the
foundations of probability theory.
An atomic event is a complete specification of the
state of the world about which the agent is uncertain.
It can be thought of as an assignment of particular
values to all the variables of which the world is
composed.
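A small Python sketch (the variable names are assumed, not taken from the slides) of what "a complete specification of the state of the world" looks like: every atomic event assigns a value to every variable, and the events are mutually exclusive and exhaustive.

# A minimal sketch: enumerate the atomic events over Boolean variables.
from itertools import product

variables = ["Cavity", "Toothache"]                   # assumed Boolean variables
atomic_events = [dict(zip(variables, values))
                 for values in product([True, False], repeat=len(variables))]
for event in atomic_events:
    print(event)
# With n Boolean variables there are 2**n atomic events; exactly one is true.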
Prior probability
The unconditional or prior probability associated with a proposition a is
the degree of belief accorded to it in the absence of any other
information; it is written as P(a). For example, if
the prior probability that I have a cavity is 0.1, then we would write
P(Cavity = true) = 0.1 or P(cavity) = 0.1.
Conditional probability
Once the agent has obtained some evidence concerning the previously
unknown random variables making up the domain, prior probabilities are
no longer applicable. Instead, we use conditional or posterior
probabilities. The notation used is P(a|b), where a and b are any
propositions. This is read as "the probability of a, given that all we
know is b." For example,
P(cavity | toothache) = 0.8
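A hedged Python sketch of how the conditional probability relates to the joint and the prior; the two input numbers are illustrative placeholders chosen so the result matches the 0.8 above, not values given on the slide.

# A minimal sketch: P(a | b) = P(a AND b) / P(b), with assumed numbers.
p_cavity_and_toothache = 0.12    # assumed joint probability
p_toothache = 0.15               # assumed prior probability of the evidence
p_cavity_given_toothache = p_cavity_and_toothache / p_toothache
print(p_cavity_given_toothache)  # 0.8, matching the example above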
We have defined a syntax for propositions and for prior and conditional probability
statements about those propositions. Now we must provide some sort of semantics for
probability statements. We begin with the basic axioms that serve to define the
probability scale and its endpoints:
1. All probabilities are between 0 and 1. For any proposition a,
   0 ≤ P(a) ≤ 1
2. Necessarily true (i.e., valid) propositions have probability 1, and necessarily false
   (i.e., unsatisfiable) propositions have probability 0:
   P(true) = 1    P(false) = 0
Next, we need an axiom that connects the probabilities of logically related
propositions. The simplest way to do this is to define the probability of a disjunction as
follows:
3. The probability of a disjunction is given by
   P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
Using the axioms of probability
We can derive a variety of useful facts from the basic axioms. For
example, the familiar rule for negation follows by substituting ¬a for b in
axiom 3, giving us:
P(a ∨ ¬a) = P(a) + P(¬a) − P(a ∧ ¬a)   (by axiom 3 with b = ¬a)
P(true) = P(a) + P(¬a) − P(false)      (by logical equivalence)
1 = P(a) + P(¬a)                       (by axiom 2)
P(¬a) = 1 − P(a)                       (by algebra)
The third line of this derivation is itself a useful fact and can be extended
from the Boolean case to the general discrete case. Let the discrete
variable D have the domain (d1, ..., dn); then the probabilities of its values
must sum to 1: Σi P(D = di) = 1.
Why the axioms of probability are reasonable
The axioms of probability can be seen as restricting the set of probabilistic beliefs that an
agent can hold. This is somewhat analogous to the logical case, where a logical agent cannot
simultaneously believe A, B, and ¬(A ∧ B), for example. There is, however, an additional
complication. In the logical case, the semantic definition of conjunction means that at least
one of the three beliefs just mentioned must be false in the world, so it is unreasonable for an
agent to believe all three. With probabilities, on the other hand, statements refer not to the
world directly, but to the agent's own state of knowledge. Why, then, can an agent not hold
the following set of beliefs, which clearly violates axiom 3?
P(a) = 0.4        P(a ∧ b) = 0.0
P(b) = 0.3        P(a ∨ b) = 0.8
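A short Python sketch of the violation: axiom 3 fixes P(a ∨ b) once the other three numbers are chosen, and the numbers above disagree with it.

# A minimal sketch: check a belief set against axiom 3,
# P(a OR b) = P(a) + P(b) - P(a AND b).
def violates_axiom_3(p_a, p_b, p_a_and_b, p_a_or_b, tol=1e-9):
    return abs(p_a_or_b - (p_a + p_b - p_a_and_b)) > tol

print(violates_axiom_3(0.4, 0.3, 0.0, 0.8))  # True: axiom 3 demands 0.7, not 0.8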
The Joint Probability Distribution
If we have more than one random variable and we are considering
problems that involve two or more of these variables at the same time,
then the joint probability distribution specifies degrees of belief in the
values that these variables take jointly.
The joint probability distribution P(X), where X is a vector of random
variables, is usually specified graphically by an n-dimensional table
(where n is the dimension of X).
Example: two Boolean variables, Toothache and Cavity.
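The table itself did not survive extraction, so the sketch below fills it with illustrative placeholder numbers (an assumption, not the slide's values) to show how a two-variable joint distribution can be stored and marginalized.

# Illustrative placeholder numbers (assumed): joint distribution over
# the Boolean variables Toothache and Cavity, stored as a 2x2 table.
joint = {
    (True,  True):  0.12,    # P(toothache AND cavity)
    (True,  False): 0.08,    # P(toothache AND NOT cavity)
    (False, True):  0.08,    # P(NOT toothache AND cavity)
    (False, False): 0.72,    # P(NOT toothache AND NOT cavity)
}
assert abs(sum(joint.values()) - 1.0) < 1e-9   # mutually exclusive and exhaustive

# Marginal probability of Cavity, obtained by summing out Toothache.
p_cavity = sum(p for (toothache, cavity), p in joint.items() if cavity)
print(p_cavity)  # 0.2 with these placeholder numbers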
The Full Joint Probability Distribution
The full joint probability distribution is the joint probability
distribution for all random variables.
If we have this distribution, then we can compute the
probability of any propositional sentence using the formulas
about probabilities we presented earlier.
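A hedged Python sketch of that claim: with a full joint over Toothache, Cavity, and Catch (the eight probabilities below are illustrative placeholders, not the slides' values), the probability of any propositional sentence is obtained by summing the atomic events in which it holds.

# A minimal sketch: inference by enumeration over a full joint distribution.
full_joint = {
    # (toothache, cavity, catch): probability  (assumed placeholder values)
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.016, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, True,  False): 0.008,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(holds):
    """Sum the probabilities of all atomic events in which the proposition holds."""
    return sum(p for world, p in full_joint.items() if holds(world))

p_toothache = prob(lambda w: w[0])                  # P(toothache) = 0.2
p_cav_and_tooth = prob(lambda w: w[0] and w[1])     # P(cavity AND toothache) = 0.12
print(p_cav_and_tooth / p_toothache)                # P(cavity | toothache) = 0.6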
Independence
The notion of independence captures the situation
when the probability of a random variable taking a
certain value is not influenced by the fact
that we know the value of some other variable.
Definition. Two propositions a and b are called independent if
P(a | b) = P(a) (equivalently: P(b | a) = P(b) or P(a ∧ b) = P(a) P(b)).
Definition. Two random variables X and Y are called independent if
P(X | Y) = P(X) (equivalently: P(Y | X) = P(Y) or P(X, Y) = P(X) P(Y)).
Example
P(Weather | Toothache, Catch, Cavity) = P(Weather)
Note: Zeus might be an exception to this rule!
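A small Python check of the product form of the definition, using assumed illustrative numbers:

# A minimal sketch: a and b are independent iff P(a AND b) == P(a) * P(b).
def independent(p_a, p_b, p_a_and_b, tol=1e-9):
    return abs(p_a_and_b - p_a * p_b) < tol

print(independent(0.2, 0.5, 0.10))  # True:  0.10 == 0.2 * 0.5
print(independent(0.2, 0.5, 0.12))  # False: 0.12 != 0.2 * 0.5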
Applying Bayes' rule: The simple case
On the surface, Bayes' rule does not seem very useful. It requires three terms (a conditional
probability and two unconditional probabilities) just to compute one conditional probability.
Bayes' rule is useful in practice because there are many cases where we do have good
probability estimates for these three numbers and need to compute the fourth. In a task such
as medical diagnosis, we often have conditional probabilities on causal relationships and want
to derive a diagnosis. A doctor knows that the disease meningitis causes the patient to have
a stiff neck, say, 50% of the time. The doctor also knows some unconditional facts: the prior
probability that a patient has meningitis is 1/50,000, and the prior probability that any patient
has a stiff neck is 1/20. Letting s be the proposition that the patient has a stiff neck and m be
the proposition that the patient has meningitis, we can apply Bayes' rule as shown below.
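The slide stops before the calculation itself; reconstructing it in LaTeX from the numbers just given (the final value is derived here rather than copied from the slide):

P(m \mid s) = \frac{P(s \mid m)\,P(m)}{P(s)} = \frac{0.5 \times 1/50000}{1/20} = 0.0002

That is, only about 1 patient in 5000 with a stiff neck is expected to have meningitis.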
Applying Bayes’ Rule: Combining Evidence
What happens when we have two or more pieces of evidence?
Example: What can a dentist conclude if her steel probe catches
in the aching tooth of a patient?
How can we compute P(Cavity | toothache ∧ catch)?
Combining Evidence (cont’d)
• Use the full joint distribution table (does not scale).
• Use Bayes' rule:
P(Cavity | toothache ∧ catch) = P(toothache ∧ catch | Cavity) P(Cavity) / P(toothache ∧ catch)
This approach does not scale either if we have a large number of evidence variables.
Question: Can we use independence? (See the sketch below.)
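A hedged Python sketch of where that question leads; this is an assumption about the missing continuation: if toothache and catch are conditionally independent given Cavity, the numerator of Bayes' rule factors into per-evidence terms. All numbers are illustrative placeholders.

# A sketch under the conditional-independence assumption:
# P(Cavity | toothache, catch) ∝ P(toothache | Cavity) P(catch | Cavity) P(Cavity).
p_cavity = 0.2                                    # assumed prior
p_toothache_given = {True: 0.6, False: 0.1}       # assumed P(toothache | Cavity=v)
p_catch_given     = {True: 0.9, False: 0.2}       # assumed P(catch | Cavity=v)

unnorm = {
    True:  p_toothache_given[True]  * p_catch_given[True]  * p_cavity,
    False: p_toothache_given[False] * p_catch_given[False] * (1 - p_cavity),
}
total = sum(unnorm.values())
posterior = {v: p / total for v, p in unnorm.items()}
print(round(posterior[True], 3))  # ~0.871 with these placeholder numbers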
Combining Evidence (cont’d)