Ai R16 - Unit-6
UNIT-VI: Uncertainty Measure & Fuzzy Sets and Fuzzy Logic
6.1. Uncertainty
● Uncertainty may occur in a knowledge-based system (KBS) because of problems with the data.
● The probability that a particular event will occur = the number of ways the
event can occur divided by the total number of equally likely possible outcomes.
Example: The probability of throwing two successive heads with a fair coin is
probability = 1/4 = 0.25
since exactly one of the four equally likely outcomes (HH, HT, TH, TT) is favourable.
● Since events are sets, all set operations can be performed on events. The basic properties include:
– P(S) = 1, where S is the sure event
– P(A') = 1 - P(A)
– P(A or B) = P(A) + P(B) - P(A and B)
● The probability of getting heads on one or on both of the coins, i.e. the
union of the events A and B, is expressed as
P(A or B) = P(A) + P(B) - P(A and B) = 0.5 + 0.5 - 0.25 = 0.75
6.2.3. Conditional Probability
P(H | E) = P(H and E) / P(E)
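For instance, when rolling a fair die, let H = "the outcome is 2" and E = "the outcome is even". Then P(H and E) = 1/6 and P(E) = 1/2, so
P(H | E) = (1/6) / (1/2) = 1/3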
This approach relies on the concept that one should incorporate the prior
probability of an event into the interpretation of a situation. Bayes' theorem states:
P(H|E) = P(E|H) * P(H) / P(E)
Proof of Bayes' Theorem
From the definition of conditional probability,
P(H and E) = P(H|E) * P(E) and P(E and H) = P(E|H) * P(H)
Since P(H and E) = P(E and H), equating the right-hand sides and dividing by P(E), we obtain
P(H|E) = P(E|H) * P(H) / P(E)
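Continuing the die illustration: since P(E|H) = 1 (if the outcome is 2, it is certainly even), Bayes' theorem gives P(H|E) = 1 * (1/6) / (1/2) = 1/3, agreeing with the direct computation above.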
● We normally assume that facts are always completely true, but facts might
also be only probably true.
Example:
battery_dead(0.04).
– a fact asserting that the battery is dead with probability 0.04.
● Rules may carry probabilities as well:
– "if 30% of the time when the car does not start, it is true that the
battery is dead"
Here 30% is the rule probability. If the right-hand side of the rule is certain,
then we can even write the above rule as:
battery_dead(0.3) :- ignition_not_start.
If the right-hand side is itself uncertain, its probability can be passed on to the conclusion:
battery_dead(P) :- voltmeter_measurement_abnormal(P).
– We should gather all relevant rules and facts about the battery being
dead.
● In this case, the probabilities of the subgoals on the right side of the rule are
multiplied, assuming all the events are independent of each other, using
the formula
P(A and B and C and ...) = P(A) * P(B) * P(C) * ...
● The rules with the same conclusion can be uncertain for different reasons.
● If there is more than one rule with the same predicate name, each having a
different probability, then the cumulative likelihood of that
predicate can be computed by or-combination.
● To get the overall probability of the predicate, the following formula gives the
'or' probability when the events are mutually independent:
Prob(A or B or C or ...) = 1 - (1 - P(A)) * (1 - P(B)) * (1 - P(C)) * ...
Examples
1. "half of the time when a computer does not work, then the battery is dead"
battery_dead(P):-computer_dead(P1), P is P1*0.5.
2. "95% of the time when a computer has electricalproblem and battery is old,
then the battery is dead"
battery_dead(P) :- electrical_prob(P1),
battery_old(P2), P is P1 * P2 * 0.95.
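The two rules above can then be or-combined. A minimal SWI-Prolog sketch (the auxiliary names battery_dead_1/battery_dead_2 and the base facts are assumed for illustration, not taken from the notes):

% each rule computes its own likelihood for the same conclusion
battery_dead_1(P) :- computer_dead(P1), P is P1 * 0.5.
battery_dead_2(P) :- electrical_prob(P1), battery_old(P2),
                     P is P1 * P2 * 0.95.

% cumulative likelihood by or-combination (independence assumed)
battery_dead(P) :-
    battery_dead_1(P1),
    battery_dead_2(P2),
    P is 1 - (1 - P1) * (1 - P2).

% hypothetical base facts for a test query
computer_dead(0.6).
electrical_prob(0.9).
battery_old(0.8).

% ?- battery_dead(P).  here P1 = 0.3 and P2 = 0.684, so P = 0.7788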
● For the sake of simplicity we write P(X1, ..., Xn) instead of P(X1 and
... and Xn).
P(X1, ..., Xn) = P(Xn | X1, ..., Xn-1) * P(X1, ..., Xn-1)
or, expanding recursively,
P(X1, ..., Xn) = P(Xn | X1, ..., Xn-1) * P(Xn-1 | X1, ..., Xn-2) * ... * P(X2 | X1) * P(X1)
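For example, for three variables the chain rule reads
P(X1, X2, X3) = P(X3 | X1, X2) * P(X2 | X1) * P(X1)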
● Nodes with no children are termed hypothesis nodes, and nodes with
no parents are called independent (evidence) nodes.
● Here there are four nodes, with {A, B} representing evidences and
{C, D} representing hypotheses.
Figure: a four-node Bayesian network; arcs run from the evidence nodes A, B to the hypothesis nodes C, D.
● All four variables have two possible values T (for true) and F (for false).
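Under this structure the joint distribution factors compactly. Assuming each hypothesis node has both evidence nodes as parents (an illustrative assumption about the arcs), the chain rule together with the network's independence assertions gives
P(A, B, C, D) = P(A) * P(B) * P(C | A, B) * P(D | A, B)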
● Using this model one can answer queries with the conditional
probability formula, e.g.
P(C = T | A = T) = P(C = T and A = T) / P(A = T)
where each joint probability is obtained by summing the factored joint distribution over the variables not mentioned in the query.
● It can easily handle situations where some data entries are missing, as
this model encodes dependencies among all variables.
● The probabilities are described as a single numeric point value. This can
be a distortion of the precision that is actually available for supporting
evidence.
● A major shortcoming is the dependence of the results on the quality and
extent of the prior beliefs used in Bayesian inference processing.
● The reliability of a Bayesian network depends on the reliability of the prior
knowledge.
Selecting the proper distribution model to describe the data has a notable effect
on the quality of the resulting network, so the choice of statistical
distribution for modeling the data is very important.
● A measure of belief can initially be defined as
MB[H, E] = (P(H|E) - P(H)) / (1 - P(H))
In order to avoid getting a negative value of belief, we can modify the above
definition to obtain a positive value of the measure as follows:
MB[H, E] = 1, if P(H) = 1
= (max{P(H|E), P(H)} - P(H)) / (1 - P(H)), otherwise
Alternatively, a measure of disbelief is defined as
MD[H, E] = 1, if P(H) = 0
= (P(H) - min{P(H|E), P(H)}) / P(H), otherwise
● The certainty factor combines these two measures: CF[H, E] = MB[H, E] - MD[H, E].
● When two pieces of evidence E1 and E2 both bear on the same hypothesis H, let us first compute MB(H, E1 and E2) and MD(H, E1 and E2):
MB(H, E1 and E2) = MB(H, E1) + MB(H, E2) * (1 - MB(H, E1))
● Similarly MD is defined:
MD(H, E1 and E2) = MD(H, E1) + MD(H, E2) * (1 - MD(H, E1))
Therefore,
CF(H, E1 and E2) = MB(H, E1 and E2) - MD(H, E1 and E2) = 0.58
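As a worked illustration with assumed values: if MB(H, E1) = 0.3, MB(H, E2) = 0.4 and MD(H, E1) = MD(H, E2) = 0, then
MB(H, E1 and E2) = 0.3 + 0.4 * (1 - 0.3) = 0.58
CF(H, E1 and E2) = 0.58 - 0 = 0.58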
● Case 3: In chained rules, the rules are chained together so that the
outcome of one rule is the input of another rule. For
example, the outcome of an experiment may be treated as evidence
for some hypothesis, i.e., E1 → E2 → H. The certainty factor then propagates along the chain:
CF(H, E1) = CF(H, E2) * max{0, CF(E2, E1)}
● These degrees of belief may or may not have the mathematical properties
of probabilities.
● How much they differ from probabilities will depend on how closely the
evidence and the hypothesis it supports are related.
Example
– Suppose Mary tells John that his car has been stolen. John's belief
in the truth of this statement will depend on the reliability of
Mary; say John judges her to be reliable with degree 0.85. But it does not
mean that the statement is false if Mary is not reliable.
– The statement "the car is stolen" is therefore assigned belief 0.85, and the
statement "the car is not stolen" is assigned belief 0. This zero does not
mean that John is sure that his car is not stolen; in the case of
probability, a 0 would mean that John is sure that his car is not
stolen. The values 0.85 and 0 together constitute a belief function.
● The power set P(U) is the set of all possible subsets of U, including
the empty set, represented by ∅.
● Assume that m1 and m2 are two belief (mass) functions representing
multiple sources of evidence for two different hypotheses. Dempster's rule combines them into a new function m3:
m3(∅) = 0
m3(C) = Σ{A ∩ B = C} m1(A) * m2(B) / (1 - Σ{A ∩ B = ∅} m1(A) * m2(B)), for C ≠ ∅
● This belief function gives a new value for every non-empty set C that arises as an intersection A ∩ B.
● The normalization factor has the effect of completely ignoring conflict and
attributing any mass associated with conflict to the null set.
– Any belief mass not committed to a specific subset is assigned to the whole set U; this means we are sure that the answer is somewhere in the whole set
U. For example:
m1(U) = 0.2
m2(U) = 0.4
● After combination, the previous belief functions are modified to m3, whose belief
values differ from the earlier beliefs:
m3({flu}) = 0.48
m3({flu, cold}) = 0.12
m3(U) = 0.08
● Suppose m3 is further combined with another source of evidence m4, where m4(U) = 0.3.
● From the combination table we get multiple belief values for the empty set ∅, and its
total belief value is 0.56.
● While computing the new belief we may get the same subset generated from
different intersections. The m value for such a set is computed by
summing all such products m1(A) * m2(B).
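The combination rule can be sketched in Prolog. This is an illustrative implementation only: the mass functions mass1/2 and mass2/2 and their focal elements (written as sorted lists) are hypothetical, chosen so that some conflict arises and the normalisation step is exercised.

% Dempster's rule of combination (SWI-Prolog sketch)
:- use_module(library(ordsets)).

mass1([flu], 0.8).            mass1([cold, flu], 0.2).
mass2([flu], 0.6).            mass2([cold, measles], 0.4).

% one cell of the intersection table: C = A ∩ B with mass m1(A)*m2(B)
cell(C, P) :-
    mass1(A, PA), mass2(B, PB),
    ord_intersection(A, B, C),
    P is PA * PB.

% K: total mass falling on the empty set, i.e. the conflict
conflict(K) :- findall(P, cell([], P), Ps), sum_list(Ps, K).

% combined mass: sum the cells for each non-empty C, then normalise
combined(C, M) :-
    setof(C0, P^cell(C0, P), Cs),
    member(C, Cs), C \== [],
    findall(P, cell(C, P), Ps), sum_list(Ps, S),
    conflict(K),
    M is S / (1 - K).

% ?- combined(C, M).  here the conflict is K = 0.32, giving
% C = [cold], M = 0.08/0.68 ≈ 0.118 and C = [flu], M = 0.60/0.68 ≈ 0.882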
● In fuzzy logic, a statement can be both true and false to some degree, and it
can also be neither completely true nor completely false. Fuzzy logic is a non-monotonic logic.
● In fuzzy logic, the classical law of the excluded middle does not hold.
● Russell's paradox
– A barber shaves exactly those men who do not shave themselves. Does the barber shave himself? Assume that he did shave himself. But we see from the story that
he shaved only those men who did not shave themselves;
therefore, he did not shave himself. Either assumption thus leads to its opposite.
Example – Paradox
● "All Cretans are liars", said the Cretan.
● If the Cretan is a liar, then his claim cannot be believed, and so he is not a
liar.
● The main idea behind Fuzzy systems is that truth values (in fuzzy logic)
or membership values are indicated by a value in the range [0,1] with 0
for absolute falsity and 1 for absolute truth.
● Fuzzy set theory differs from conventional set theory in that it allows each
element of a given set to belong to that set to some degree.
● In classical set theory, by contrast, each element either fully belongs to the
set or is completely excluded from it.
● In other words, classical set theory represents a special case of the more
general fuzzy set theory.
Example
Probability approach:
● We may assign the statement "Helen is old" the truth value of 0.95. The
interpretation is that there is a 95% chance that Helen is old.
Fuzzy approach:
● Here the value 0.95 is interpreted as Helen's degree of membership in the set of old people, i.e. Helen is more or less old.
● Although these two statements seem similar, they actually carry
different meanings.
● First view: there is a 5% chance that Helen may not be old at all.
● Second view: there is no chance of Helen being young; she is more
or less old.
● A fuzzy set F on a universe X is written as F = { (x, µF(x)) | x ∈ X }, where µF(x) ∈ [0, 1] is the degree of membership of x in F.
Example: over the universe X = {1, 2, ..., 7} (elements with membership 0 are omitted), let
A = { (3, 0.7), (5, 1), (6, 0.8) } and B = { (3, 0.9), (4, 1), (6, 0.6) }
The complement, defined by µA'(x) = 1 - µA(x), is
A' = { (1, 1), (2, 1), (3, 0.3), (4, 1), (6, 0.2), (7, 1) }
Union and intersection of A and B are illustrated below.
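Using the standard pointwise definitions µA∪B(x) = max{µA(x), µB(x)} and µA∩B(x) = min{µA(x), µB(x)}, the sets above give:
A ∪ B = { (3, 0.9), (4, 1), (5, 1), (6, 0.8) }
A ∩ B = { (3, 0.7), (6, 0.6) }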
Additional membership functions
● S-shaped function
● Z-shaped function
● Pi function
● Vicinity function
S-shaped function
S(x; a, b, c) = 0, for x ≤ a
= 2[(x - a) / (c - a)]², for a < x ≤ b
= 1 - 2[(x - c) / (c - a)]², for b < x ≤ c
= 1, for x > c
Here b = (a + c)/2 is the crossover point, at which S = 0.5.
Figure: S-shaped membership function, rising from 0 at a to 1 at c.
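A direct transcription into Prolog (an illustrative sketch; the predicate name s_mf/5 is assumed):

% s_mf(+X, +A, +B, +C, -Y): Y is the S-function value S(X; A, B, C)
s_mf(X, A, _, _, 0.0) :- X =< A, !.
s_mf(X, A, B, C, Y) :- X =< B, !, Y is 2 * ((X - A) / (C - A)) ** 2.
s_mf(X, A, _, C, Y) :- X =< C, !, Y is 1 - 2 * ((X - C) / (C - A)) ** 2.
s_mf(_, _, _, _, 1.0).

% ?- s_mf(50, 20, 50, 80, Y).   Y = 0.5, the crossover point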
Z-Shaped Function
Z(x; a, b, c) = 1, for x ≤ a
= 1 - 2[(x - a) / (c - a)]², for a < x ≤ b
= 2[(x - c) / (c - a)]², for b < x ≤ c
= 0, for x > c
Note that Z(x; a, b, c) = 1 - S(x; a, b, c).
Figure: Z-shaped membership function, falling from 1 at a to 0 at c.
Triangular function
F(x; a, b, c) = 0, if x < a
= (x - a) / (b - a), if a ≤ x ≤ b
= (c - x) / (c - b), if b ≤ x ≤ c
= 0, if c < x
Figure: triangular membership function, with the membership value rising from 0 at a to 1 at b and falling back to 0 at c.
Gaussian function
G(x; a, b) = e^(-(x - b)² / (2a²))
Pi function
F(x; a, b, c, d) = 0, if x < a
= (x - a) / (b - a), if a ≤ x ≤ b
= 1, if b < x < c
= (d - x) / (d - c), if c ≤ x ≤ d
= 0, if d < x
● The parameters a and d locate the 'feet' of the curve, while b and c locate
its 'shoulders'. In the graph given in Fig. 10.14, a = 2, b = 4, c = 5, and d
= 9.
Figure: Pi-shaped membership function, with feet at a and d and shoulders at b and c.
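As a quick check with the parameter values from the figure (a = 2, b = 4, c = 5, d = 9):
F(3; 2, 4, 5, 9) = (3 - 2) / (4 - 2) = 0.5
F(4.5; 2, 4, 5, 9) = 1
F(7; 2, 4, 5, 9) = (9 - 7) / (9 - 5) = 0.5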
Vicinity function
● The vicinity function is centred at a point a and falls off to zero on both sides; the total width of the function between its two zero points is equal to 2b.
Figure: Vicinity function, centred at a and reaching zero at a - b and a + b.
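In Zadeh's standard formulation, the vicinity (π) function can be built from the S-function:
π(x; b, a) = S(x; a - b, a - b/2, a), for x ≤ a
= 1 - S(x; a, a + b/2, a + b), for x > a
which is zero outside [a - b, a + b] and reaches 1 at x = a.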
● Most of the actual fuzzy control operations are drawn from a small set of
different curves such as those above.