Bayes Rule
AI – Bayes' Rule and its Problems
By
Dr. S. JOTHI SHRI, M.E., Ph.D.
Associate Professor,
Panimalar Engineering College
Bayes' theorem in Artificial intelligence
Bayes' theorem:
• Bayes' theorem is also known as Bayes' rule, Bayes' law,
or Bayesian reasoning; it determines the probability of an
event under uncertain knowledge.
• In probability theory, it relates the conditional probability and
marginal probabilities of two random events.
• Bayes' theorem was named after the British
mathematician Thomas Bayes. The Bayesian inference is an
application of Bayes' theorem, which is fundamental to
Bayesian statistics.
• It is a way to calculate the value of P(B|A) with the knowledge
of P(A|B).
• From the product rule we can write Bayes' theorem as:
• P(A|B) = P(B|A) P(A) / P(B)
• Bayes' theorem allows updating the
probability prediction of an event by
observing new information of the real world.
• Example: If cancer corresponds to one's age
then by using Bayes' theorem, we can
determine the probability of cancer more
accurately with the help of age.
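As a sketch of such an update, the snippet below uses made-up numbers (a 1% prior of cancer, and hypothetical likelihoods of being over 60 with and without cancer) purely for illustration:

```python
# Hypothetical numbers for illustration only (not real medical data):
p_cancer = 0.01                 # prior P(cancer)
p_age_given_cancer = 0.70       # likelihood P(age > 60 | cancer)
p_age_given_no_cancer = 0.20    # likelihood P(age > 60 | no cancer)

# Marginal probability of the evidence, P(age > 60), via total probability.
p_age = (p_age_given_cancer * p_cancer
         + p_age_given_no_cancer * (1 - p_cancer))

# Bayes' theorem: posterior P(cancer | age > 60).
posterior = p_age_given_cancer * p_cancer / p_age
print(round(posterior, 4))
```

Observing the evidence raises the probability of cancer above the 1% prior, but by much less than the raw likelihood suggests, because the prior is small.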
• Bayes' theorem can be derived using product
rule and conditional probability of event A
with known event B:
Product Rule
• P(A ⋀ B) = P(A|B) P(B)
• Similarly, the probability of event B with
known event A:
• P(A ⋀ B) = P(B|A) P(A)
• Equating the right-hand sides of both
equations, we get:
• P(A|B) = P(B|A) P(A) / P(B)   ... (a)
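The product-rule derivation can be checked numerically. The joint distribution below uses arbitrary illustrative values; both product-rule forms recover the same joint probability, and equating them yields Bayes' rule:

```python
# An arbitrary illustrative joint distribution over two booleans A and B.
p_joint = {  # P(A ∧ B) for each combination
    (True, True): 0.12, (True, False): 0.18,
    (False, True): 0.28, (False, False): 0.42,
}
p_a = sum(v for (a, _), v in p_joint.items() if a)   # marginal P(A)
p_b = sum(v for (_, b), v in p_joint.items() if b)   # marginal P(B)
p_a_given_b = p_joint[(True, True)] / p_b            # P(A|B)
p_b_given_a = p_joint[(True, True)] / p_a            # P(B|A)

# Both product-rule forms give the same joint probability:
assert abs(p_a_given_b * p_b - p_joint[(True, True)]) < 1e-12
assert abs(p_b_given_a * p_a - p_joint[(True, True)]) < 1e-12
# Equating them yields Bayes' rule: P(A|B) = P(B|A) P(A) / P(B)
assert abs(p_a_given_b - p_b_given_a * p_a / p_b) < 1e-12
```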
Probability
• P(A|B) is known as the posterior, which we need to calculate; it
is read as the probability of hypothesis A given that
evidence B has occurred.
• P(B|A) is called the likelihood: assuming the
hypothesis A is true, it is the probability of observing the
evidence B.
• P(A) is called the prior probability: the probability of the hypothesis
before considering the evidence.
• P(B) is called the marginal probability: the probability of the
evidence on its own.
• In equation (a), in general, over an exhaustive set of hypotheses Ai we can
write P(B) = Σi P(Ai) P(B|Ai),
• hence Bayes' rule can be written as:
• P(Ai|B) = P(B|Ai) P(Ai) / Σk P(Ak) P(B|Ak)
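A minimal sketch of this general form, with arbitrary illustrative priors and likelihoods over three hypotheses:

```python
# General form of Bayes' rule over a partition of hypotheses A_1..A_n.
# Priors and likelihoods below are arbitrary illustrative values.
priors = [0.5, 0.3, 0.2]        # P(A_i); must sum to 1
likelihoods = [0.9, 0.5, 0.1]   # P(B | A_i)

# Marginal P(B) = sum_i P(A_i) * P(B | A_i)   (law of total probability)
p_b = sum(p * l for p, l in zip(priors, likelihoods))

# Posterior for each hypothesis: P(A_i | B) = P(B|A_i) P(A_i) / P(B)
posteriors = [p * l / p_b for p, l in zip(priors, likelihoods)]
print([round(x, 3) for x in posteriors])
```

Because P(B) normalizes the numerators, the posteriors always sum to 1.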
• The Bayesian network for the above problem is given below. The
network structure shows that Burglary and Earthquake are the parent
nodes of Alarm and directly affect the probability of the alarm going
off, whereas David's and Sophia's calls depend only on the alarm.
• The network encodes the assumptions that David and Sophia do not directly
perceive the burglary, do not notice a minor earthquake, and
do not confer with each other before calling.
• The conditional distributions for each node are given as conditional
probabilities table or CPT.
• Each row in the CPT must sum to 1 because the entries in a row
represent an exhaustive set of cases for the variable.
• In a CPT, a boolean variable with k boolean parents requires
2^k probability rows. Hence, if there are two parents, the CPT contains 4
probability values.
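The 2^k count comes from enumerating every assignment of the parents, as this trivial sketch shows:

```python
from itertools import product

# A boolean node with k boolean parents needs one CPT row
# (one P(node=True | parents) entry) per parent assignment: 2**k rows.
k = 2
rows = list(product([True, False], repeat=k))
print(len(rows))  # 2**2 = 4 combinations of the two parents
```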
List of all events occurring in this network:
• Burglary (B)
• Earthquake(E)
• Alarm(A)
• David Calls(D)
• Sophia calls(S)
• We can write the events of the problem statement as the joint
probability P[D, S, A, B, E], and expand it using the chain rule
together with the network's conditional independences:
• P[D, S, A, B, E] = P[D | S, A, B, E] · P[S, A, B, E]
• = P[D | S, A, B, E] · P[S | A, B, E] · P[A, B, E]
• = P[D | A] · P[S | A, B, E] · P[A, B, E]   (D depends only on A)
• = P[D | A] · P[S | A] · P[A | B, E] · P[B, E]   (S depends only on A)
• = P[D | A] · P[S | A] · P[A | B, E] · P[B | E] · P[E]
• = P[D | A] · P[S | A] · P[A | B, E] · P[B] · P[E]   (B and E are independent)
Diagrammatic Representation
Problem -Steps
• P(B= True) = 0.002, which is the probability of
burglary.
• P(B= False)= 0.998, which is the probability of no
burglary.
• P(E= True) = 0.001, which is the probability of a
minor earthquake.
• P(E= False) = 0.999, which is the probability that an
earthquake did not occur.
• We can provide the conditional probabilities as per
the below tables:
Conditional probability table for Alarm A:
The conditional probability of Alarm A depends on
Burglary and Earthquake.
The conditional probability that David will call depends on the Alarm.
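Putting the pieces together, inference by enumeration can answer a query such as P(Burglary | David calls, Sophia calls). The priors come from the problem above; the alarm and call CPT entries here are hypothetical illustrative values, not the ones from the slides' tables:

```python
from itertools import product

# Priors from the problem; CPT entries below are hypothetical values.
P_B, P_E = 0.002, 0.001
P_A = {(True, True): 0.94, (True, False): 0.95,
       (False, True): 0.31, (False, False): 0.001}  # P(A=True | B, E)
P_D = {True: 0.91, False: 0.05}                     # P(D=True | A)
P_S = {True: 0.75, False: 0.02}                     # P(S=True | A)

def pr(p_true, val):
    """Return P(X=val) given P(X=True)."""
    return p_true if val else 1.0 - p_true

def score(b):
    """Unnormalized P(B=b, D=True, S=True), summing out A and E."""
    total = 0.0
    for a, e in product([True, False], repeat=2):
        total += (pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a)
                  * P_D[a] * P_S[a])
    return total

num, den = score(True), score(True) + score(False)
print(round(num / den, 4))  # posterior probability of burglary
```

Both calls are strong evidence for the alarm, so the posterior probability of burglary ends up far above the 0.002 prior.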