Bayes Rule


PANIMALAR INSTITUTE

OF
TECHNOLOGY
AI –Bayes Rule and its Problems
By
Dr. S. JOTHI SHRI, M.E,Ph.D.
Associate professor,
Panimalar Engineering College
Bayes' theorem in Artificial intelligence

Bayes' theorem:
• Bayes' theorem is also known as Bayes' rule, Bayes' law,
or Bayesian reasoning, which determines the probability of an
event with uncertain knowledge.
• In probability theory, it relates the conditional probability and
marginal probabilities of two random events.
• Bayes' theorem is named after the British
mathematician Thomas Bayes. Bayesian inference is an
application of Bayes' theorem and is fundamental to
Bayesian statistics.
• It is a way to calculate the value of P(B|A) from knowledge
of P(A|B), and it follows from the product rule.
• Bayes' theorem allows updating the
probability prediction of an event by
observing new information of the real world.
• Example: If the risk of cancer depends on a
person's age, then by using Bayes' theorem we can
determine the probability of cancer more
accurately once the age is known.
• Bayes' theorem can be derived using product
rule and conditional probability of event A
with known event B:
Product Rule
• P(A ⋀ B) = P(A|B) P(B)
• Similarly, the probability of event B with
known event A:
• P(A ⋀ B) = P(B|A) P(A)
• Equating the right-hand sides of both
equations, we get:
• P(A|B) = P(B|A) P(A) / P(B) ..... (a)
Probability
• P(A|B) is known as the posterior, which we need to calculate.
It is read as the probability of hypothesis A given that
evidence B has occurred.
• P(B|A) is called the likelihood: assuming the hypothesis is
true, it is the probability of the evidence.
• P(A) is called the prior probability: the probability of the
hypothesis before considering the evidence.
• P(B) is called the marginal probability: the probability of the
evidence on its own.
• In equation (a), in general, we can write
P(B) = Σi P(Ai) * P(B|Ai),
• hence Bayes' rule can be written as:
• P(Ai|B) = P(Ai) * P(B|Ai) / Σk P(Ak) * P(B|Ak)
• where A1, A2, A3, ..., An is a set of mutually exclusive and
exhaustive events.
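The general form of Bayes' rule over a set of mutually exclusive and exhaustive events can be sketched in a few lines of Python. This is only an illustration: the function name `posterior` and the three prior and likelihood values are hypothetical and do not come from the slides. Exact fractions are used so that the posteriors can be checked to sum to 1.

```python
from fractions import Fraction

def posterior(priors, likelihoods, i):
    """Return P(A_i | B) for mutually exclusive, exhaustive events A_1..A_n.

    priors[k]      = P(A_k)      (the priors must sum to 1)
    likelihoods[k] = P(B | A_k)
    """
    # Marginal probability of the evidence: P(B) = sum_k P(A_k) * P(B | A_k)
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[i] * likelihoods[i] / evidence

# Hypothetical numbers, chosen only for illustration
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 10), Fraction(1, 5), Fraction(1, 2)]

posteriors = [posterior(priors, likelihoods, i) for i in range(len(priors))]
print(posteriors)       # posterior probability of each A_i given B
print(sum(posteriors))  # 1
```

With only two events, A and not-A, the same function reduces to equation (a).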
• Applying Bayes' rule:
• Bayes' rule allows us to compute the single term
P(B|A) in terms of P(A|B), P(B), and P(A).
• This is very useful in cases where we have good estimates of
these three terms and want to determine the fourth one.
• Suppose we perceive the effect of some unknown cause and
want to compute the probability of that cause; then Bayes' rule
becomes:
• P(cause|effect) = P(effect|cause) P(cause) / P(effect)
Example-Problem 1
• If a single card is drawn from a standard deck
of playing cards, the probability that the card
is a king is 4/52, since there are 4 kings in a
standard deck of 52 cards. In other words, if
King is the event "this card is a king," the
prior probability is P(King) = 4/52.
Cards
What is the probability of drawing a king from a deck of cards?

• Solution: Here the event E is drawing a king
from a deck of cards.
• There are 52 cards in a deck of cards.
• Hence, total number of outcomes = 52
• The number of favorable outcomes = 4 (as there
are 4 kings in a deck)
• Hence, the probability of this event occurring is
• P(E) = 4/52 = 1/13
Detailed information
• We use the basic formula of probability to solve the problem.
• Probability = Number of favorable outcomes/Total number of
possible outcomes.
• Total number of cards in a standard deck = 52
• Number of spade cards = 13
• Number of heart cards = 13
• Number of diamond cards = 13
• Number of club cards = 13
• Total number of kings = 4
• Total number of queens = 4
• Total number of jacks = 4
• Number of face cards = 12
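The counts above can be reproduced by enumerating the deck and applying the favorable-over-total formula. The following is a minimal sketch; the names `SUITS`, `RANKS`, `deck`, and `probability` are illustrative choices, not part of the slides.

```python
from fractions import Fraction
from itertools import product

SUITS = ["spades", "hearts", "diamonds", "clubs"]
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]

# Every (rank, suit) pair is one card: 13 ranks * 4 suits = 52 cards
deck = list(product(RANKS, SUITS))

def probability(event):
    """Number of favorable outcomes / total number of possible outcomes."""
    favorable = sum(1 for card in deck if event(card))
    return Fraction(favorable, len(deck))

p_king = probability(lambda card: card[0] == "K")
p_face = probability(lambda card: card[0] in {"J", "Q", "K"})
p_spade = probability(lambda card: card[1] == "spades")

print(len(deck))  # 52
print(p_king)     # 1/13
print(p_face)     # 3/13
print(p_spade)    # 1/4
```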
Applying Bayes theorem
• If evidence is provided (for instance, someone looks at
the card) that the single card is a face card, then
• the posterior probability P(King|Face) can be calculated
using Bayes theorem formula:
• P(King|Face) = P(Face|King) * P(King) / P(Face)
• Since every King is also a face card,
• P(Face|King) = 1.
• Since there are 3 face cards in each suit (Jack, Queen,
King),
• the probability of a face card is P(Face) = 12/52.
Solution
• Probability of getting a face card = Number of
face cards/Total number of outcomes
• P(Face) = 12/52 = 3/13
• Using Bayes' formula gives
• P(King|Face) = 1 * (4/52) / (12/52) = 4/12 = 1/3
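As a quick check of this example, the posterior P(King|Face) can be computed with exact fractions and compared against direct counting (4 kings among 12 face cards). The variable names are illustrative only.

```python
from fractions import Fraction

p_king = Fraction(4, 52)         # prior: 4 kings in 52 cards
p_face = Fraction(12, 52)        # marginal: 12 face cards in 52 cards
p_face_given_king = Fraction(1)  # likelihood: every king is a face card

# Bayes' theorem: P(King|Face) = P(Face|King) * P(King) / P(Face)
p_king_given_face = p_face_given_king * p_king / p_face

# Cross-check by counting: 4 of the 12 face cards are kings
assert p_king_given_face == Fraction(4, 12)

print(p_king_given_face)  # 1/3
```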
Example 2
• A doctor is aware that the disease meningitis causes a patient to have a
stiff neck 80% of the time. He is also aware of some
more facts, which are given as follows:
• The known probability that a patient has meningitis is
1/30,000.
• The known probability that a patient has a stiff neck is 2%.
• Let a be the proposition that the patient has a stiff neck and b be the
proposition that the patient has meningitis,
so we can calculate the following:
• P(a|b) = 0.8
• P(b) = 1/30000
• P(a) = 0.02
• P(b|a) = P(a|b) * P(b) / P(a) = 0.8 * (1/30000) / 0.02 = 1/750 ≈ 0.00133
• Hence, we can expect about 1 in 750 patients with a stiff neck
to have meningitis.
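The meningitis example is a direct application of Bayes' rule in the diagnostic direction, from effect (stiff neck) to cause (meningitis). A short sketch with the three probabilities given above, using hypothetical variable names:

```python
from fractions import Fraction

p_stiff_given_men = Fraction(8, 10)  # P(a|b): stiff neck given meningitis
p_men = Fraction(1, 30000)           # P(b): prior probability of meningitis
p_stiff = Fraction(2, 100)           # P(a): probability of a stiff neck

# Bayes' rule: P(b|a) = P(a|b) * P(b) / P(a)
p_men_given_stiff = p_stiff_given_men * p_men / p_stiff

print(p_men_given_stiff)  # 1/750
```

Even though meningitis produces a stiff neck 80% of the time, the posterior stays small because the prior 1/30000 is tiny compared with the 2% base rate of stiff necks.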
Bayesian Belief Network in Artificial Intelligence

• A Bayesian belief network is a key computer technology for
dealing with probabilistic events and for solving problems
that involve uncertainty. We can define a Bayesian network as:
• "A Bayesian network is a probabilistic graphical model
which represents a set of variables and their conditional
dependencies using a directed acyclic graph."
• It is also called a Bayes network, belief network, decision
network, or Bayesian model.
• Bayesian networks are probabilistic, because these networks
are built from a probability distribution, and also use
probability theory for prediction and anomaly detection.
• Real world applications are probabilistic in nature, and to represent the
relationship between multiple events, we need a Bayesian network. It can also
be used in various tasks including prediction, anomaly detection, diagnostics,
automated insight, reasoning, time series prediction, and decision making
under uncertainty.
• A Bayesian network can be used for building models from data and expert
opinions, and it consists of two parts:
• Directed Acyclic Graph
• Table of conditional probabilities.
• The generalized form of a Bayesian network that represents and solves decision
problems under uncertain knowledge is known as an influence diagram.
• A Bayesian network graph is made up of nodes and Arcs (directed links),
where:
Architecture
Network graph
• Each node corresponds to a random variable, and a variable can
be continuous or discrete.
• Arcs, or directed arrows, represent the causal relationships or conditional
dependencies between random variables. Each directed link connects
a pair of nodes in the graph.
• A link means that one node directly influences the other node;
if there is no directed link between two nodes, neither node directly
influences the other.
– In the above diagram, A, B, C, and D are random variables represented by
the nodes of the network graph.
– If we are considering node B, which is connected with node A by a directed
arrow, then node A is called the parent of Node B.
– Node C is independent of node A.
The Bayesian network
The Bayesian network has mainly two components:
• Causal Component
• Actual numbers
• Each node in the Bayesian network has a conditional probability
distribution P(Xi | Parents(Xi)), which quantifies the effect of the parents
on that node.
• A Bayesian network is based on the joint probability distribution and conditional
probability. So let's first understand the joint probability distribution:
• Joint probability distribution:
• If we have variables x1, x2, x3, ..., xn, then the probabilities of the different
combinations of values of x1, x2, x3, ..., xn are known as the joint probability distribution.
Joint probability distribution.
• P[x1, x2, x3, ..., xn] can be expanded with the
product rule in the following way:
• = P[x1 | x2, x3, ..., xn] P[x2, x3, ..., xn]
• = P[x1 | x2, x3, ..., xn] P[x2 | x3, ..., xn] ... P[xn-1 | xn] P[xn].
• In general for each variable Xi, we can write the
equation as:
• P(Xi|Xi-1,........., X1) = P(Xi |Parents(Xi ))
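The chain-rule expansion of a joint distribution can be verified numerically on a small example. The sketch below builds a joint distribution over three binary variables from hypothetical weights (not taken from the slides) and checks that P[x1, x2, x3] = P[x1 | x2, x3] P[x2 | x3] P[x3] for every assignment. The helper names `marginal` and `conditional` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution over (x1, x2, x3); weights are arbitrary
weights = [1, 2, 3, 4, 5, 6, 7, 8]
total = sum(weights)
joint = {xs: Fraction(w, total)
         for xs, w in zip(product([False, True], repeat=3), weights)}

def marginal(fixed):
    """P of a partial assignment {index: value}, summing out the other variables."""
    return sum(p for xs, p in joint.items()
               if all(xs[i] == v for i, v in fixed.items()))

def conditional(target, given):
    """P(target | given), where both are partial assignments {index: value}."""
    return marginal({**target, **given}) / marginal(given)

# Chain rule: P[x1, x2, x3] = P[x1 | x2, x3] * P[x2 | x3] * P[x3]
for x1, x2, x3 in joint:
    chain = (conditional({0: x1}, {1: x2, 2: x3})
             * conditional({1: x2}, {2: x3})
             * marginal({2: x3}))
    assert chain == joint[(x1, x2, x3)]

print("chain rule holds for all", len(joint), "assignments")
```

A Bayesian network goes one step further: each conditional factor is simplified to depend only on the node's parents, which is what keeps the tables small.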
Example
• Harry installed a new burglar alarm at his home to
detect burglaries. The alarm responds reliably to
a burglary but also responds to minor earthquakes.
Harry has two neighbors, David and Sophia, who have
taken responsibility for informing Harry at work when they
hear the alarm. David always calls Harry when he hears
the alarm, but sometimes he confuses the phone
ringing with the alarm and calls then too. Sophia, on the other
hand, likes to listen to loud music, so she sometimes
fails to hear the alarm. Here we would like to
compute the probability of the burglar alarm sounding.
Problem
• Calculate the probability that the alarm has
sounded, but neither a burglary nor an
earthquake has occurred, and David and
Sophia have both called Harry.
Solution:

• The Bayesian network for the above problem is given below. The
network structure shows that Burglary and Earthquake are the parent
nodes of Alarm and directly affect the probability of the alarm going
off, whereas David's and Sophia's calls depend only on the alarm.
• The network encodes the assumptions that the neighbors do not directly
perceive the burglary, do not notice the minor earthquake, and
do not confer before calling.
• The conditional distribution for each node is given as a conditional
probability table, or CPT.
• Each row in a CPT must sum to 1 because the entries in a row
represent an exhaustive set of cases for the variable.
• In a CPT, a Boolean variable with k Boolean parents contains
2^k probabilities. Hence, if there are two parents, the CPT will contain 4
probability values.
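The structure described above, and the two CPT properties (2^k rows for k Boolean parents, each row summing to 1), can be sketched as plain Python data. The node names and the helpers `cpt_rows` and `rows_sum_to_one` are illustrative assumptions, and the small example table at the end uses hypothetical numbers.

```python
from itertools import product

# Parents of each Boolean node in the burglar-alarm network
parents = {
    "Burglary": [],
    "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "DavidCalls": ["Alarm"],
    "SophiaCalls": ["Alarm"],
}

def cpt_rows(node):
    """All combinations of parent values, i.e. the rows of the node's CPT."""
    return list(product([True, False], repeat=len(parents[node])))

# A node with k Boolean parents has 2**k rows in its CPT
for node, node_parents in parents.items():
    k = len(node_parents)
    assert len(cpt_rows(node)) == 2 ** k
    print(node, "parents:", k, "rows:", len(cpt_rows(node)))

def rows_sum_to_one(cpt, tol=1e-9):
    """Check that every row (P(X=True), P(X=False)) of a CPT sums to 1."""
    return all(abs(sum(row) - 1.0) < tol for row in cpt.values())

# Hypothetical one-parent CPT, used only to show the check
example_cpt = {(True,): (0.7, 0.3), (False,): (0.1, 0.9)}
print(rows_sum_to_one(example_cpt))  # True
```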
List of all events occurring in this network:

• Burglary (B)
• Earthquake(E)
• Alarm(A)
• David Calls(D)
• Sophia calls(S)
• We can write the events of the problem statement as the
probability P[D, S, A, B, E], and rewrite it
using the joint probability distribution:
• P[D, S, A, B, E] = P[D | S, A, B, E]. P[S, A, B, E]
• = P[D | S, A, B, E]. P[S | A, B, E]. P[A, B, E]
• = P[D | A]. P[S | A, B, E]. P[A, B, E]
• = P[D | A]. P[S | A]. P[A | B, E]. P[B, E]
• = P[D | A]. P[S | A]. P[A | B, E]. P[B | E]. P[E]
• Since Burglary and Earthquake have no parents in the network,
P[B | E] = P[B], so the last two factors become P[B]. P[E].
Diagrammatic Representation
Problem -Steps
• P(B= True) = 0.002, which is the probability of
burglary.
• P(B= False)= 0.998, which is the probability of no
burglary.
• P(E= True)= 0.001, which is the probability of a
minor earthquake.
• P(E= False)= 0.999, which is the probability that an
earthquake has not occurred.
• We can provide the conditional probabilities as per
the below tables:
Conditional probability table for Alarm A:
The conditional probability of Alarm A depends on
Burglary and Earthquake.

B      E      P(A= True)  P(A= False)
True   True   0.94        0.06
True   False  0.95        0.05
False  True   0.31        0.69
False  False  0.001       0.999


Conditional probability table for David Calls:

The conditional probability that David calls depends on its parent node, Alarm.

A      P(D= True)  P(D= False)
True   0.91        0.09
False  0.05        0.95


Conditional probability table for Sophia Calls:
The conditional probability that Sophia calls depends
on its parent node, Alarm.

A      P(S= True)  P(S= False)
True   0.75        0.25
False  0.02        0.98
From the formula of joint distribution

• We can write the problem statement in the form
of a probability distribution:
• P(S, D, A, ¬B, ¬E) = P(S|A) * P(D|A) * P(A|¬B ⋀ ¬E) * P(¬B) * P(¬E)
• = 0.75 * 0.91 * 0.001 * 0.998 * 0.999
• = 0.00068045.
• Hence, a Bayesian network can answer any query
about the domain by using Joint distribution.
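The calculation for this network can be sketched in Python by multiplying the factors P(D|A), P(S|A), P(A|B, E), P(B), and P(E). The code below is an illustration under stated assumptions: only the P(... = True) entries of the tables are stored, each P(... = False) is taken as the complement, and the names `pr` and `joint` are hypothetical. The final query, P(B = True | D = True, S = True), is an extra example of answering a question by summing the joint distribution; it is not worked out in the slides.

```python
from itertools import product

P_B = 0.002  # P(Burglary = True)
P_E = 0.001  # P(Earthquake = True)
# P(Alarm = True | Burglary, Earthquake), keyed by (B, E)
P_A = {(True, True): 0.94, (True, False): 0.95,
       (False, True): 0.31, (False, False): 0.001}
P_D = {True: 0.91, False: 0.05}  # P(David calls = True | Alarm)
P_S = {True: 0.75, False: 0.02}  # P(Sophia calls = True | Alarm)

def pr(p_true, value):
    """Probability of a Boolean value, given the probability that it is True."""
    return p_true if value else 1.0 - p_true

def joint(d, s, a, b, e):
    """P[D, S, A, B, E] = P[D|A] * P[S|A] * P[A|B,E] * P[B] * P[E]."""
    return (pr(P_D[a], d) * pr(P_S[a], s) * pr(P_A[(b, e)], a)
            * pr(P_B, b) * pr(P_E, e))

# The query from the problem statement: P(S, D, A, not B, not E)
p_query = joint(d=True, s=True, a=True, b=False, e=False)
print(round(p_query, 8))  # 0.00068045

# Sanity check: the joint distribution sums to 1 over all 32 assignments
total = sum(joint(*values) for values in product([True, False], repeat=5))

# Extra query by enumeration: P(B = True | D = True, S = True)
numerator = sum(joint(True, True, a, True, e)
                for a, e in product([True, False], repeat=2))
denominator = sum(joint(True, True, a, b, e)
                  for a, b, e in product([True, False], repeat=3))
p_burglary_given_calls = numerator / denominator
print(p_burglary_given_calls)
```

Summing over all assignments grows exponentially with the number of variables, so this enumeration approach is only practical for small networks like this one.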
