Unit 5
PROBABILISTIC REASONING
• Hence, 57% of the students who like English also like
Mathematics.
Bayesian Network
• A Bayesian network is a data structure that represents
uncertain knowledge about a domain.
• It is a directed acyclic graph whose nodes
correspond to random variables.
• Each node is annotated with a conditional probability
distribution for that node given its parents.
• Bayesian networks, often abbreviated as “Bayes
nets”, were called belief networks in the 1980s and
1990s.
Bayesian Belief Network in artificial intelligence
• A Bayesian belief network is a key technique for
dealing with probabilistic events and for solving problems
that involve uncertainty. We can define a Bayesian
network as:
• "A Bayesian network is a probabilistic graphical model
which represents a set of variables and their conditional
dependencies using a directed acyclic graph."
• It is also called a Bayes network, belief network,
decision network, or Bayesian model.
• Bayesian networks are probabilistic because they are
built from a probability distribution and use
probability theory for prediction and anomaly
detection.
Bayesian Belief Network
• Real-world applications are probabilistic in nature, and to
represent the relationships between multiple events, we need
a Bayesian network.
• It can also be used in various tasks including prediction,
anomaly detection, diagnostics, automated insight,
reasoning, time series prediction, and decision making
under uncertainty.
• A Bayesian network can be used for building models from data
and expert opinions, and it consists of two parts (a minimal
sketch of these two parts follows below):
• Directed acyclic graph
• Table of conditional probabilities.
• The generalized form of a Bayesian network that represents
and solves decision problems under uncertain knowledge is
known as an influence diagram.
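• A minimal sketch of these two parts, assuming Python and the variable names of the burglar-alarm example used later in this unit; the probabilities shown are placeholders, not values taken from these slides.

# Part 1: the directed acyclic graph, stored as a map from each node to its parents.
parents = {
    "Burglary": [],
    "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"],
    "MaryCalls": ["Alarm"],
}

# Part 2: the conditional probability tables, one per node.
# Each table maps a tuple of parent values to P(node = True | parents);
# only two nodes are shown here, with placeholder probabilities.
cpts = {
    "Burglary": {(): 0.001},
    "JohnCalls": {(True,): 0.90, (False,): 0.05},
}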
The Bayesian network has mainly two components:
• Causal Component
• Actual numbers
• Each node in the Bayesian network has a conditional probability
distribution P(Xi | Parents(Xi)), which quantifies the effect of
the parents on that node.
• A Bayesian network is based on the joint probability distribution
and conditional probability, so let's first understand the joint
probability distribution:
Joint probability distribution:
• If we have variables x1, x2, x3, ..., xn, then the probabilities of
the different combinations of x1, x2, x3, ..., xn are known as the joint
probability distribution.
• The joint probability P[x1, x2, x3, ..., xn] can be factored
in the following way:
• = P[x1 | x2, x3, ..., xn] P[x2, x3, ..., xn]
• = P[x1 | x2, x3, ..., xn] P[x2 | x3, ..., xn] ... P[xn-1 | xn] P[xn]
• In general, for each variable Xi we can write
the equation as:
• P(Xi | Xi-1, ..., X1) = P(Xi | Parents(Xi))
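• As a small worked illustration (not taken from the slides), consider three variables x1, x2, x3:
• By the chain rule: P[x1, x2, x3] = P[x1 | x2, x3] P[x2 | x3] P[x3]
• If the network states that x3 is the only parent of both x1 and x2, then P[x1 | x2, x3] = P[x1 | x3], so the joint simplifies to P[x1 | x3] P[x2 | x3] P[x3], and only the small per-node tables are needed instead of the full joint table.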
Example:
• Harry installed a new burglar alarm at his home to
detect burglary. The alarm responds reliably to a
burglary, but it also responds to minor
earthquakes. Harry has two neighbors, John and Mary,
who have taken responsibility for informing Harry at work
when they hear the alarm. John always calls Harry when
he hears the alarm, but sometimes he confuses
the telephone ringing with the alarm and calls then too. On
the other hand, Mary likes to listen to loud music, so
sometimes she misses the alarm. Here we
would like to compute probabilities of events in this
burglar-alarm domain.
Problem:
• Calculate the probability that the alarm has sounded, but neither a
burglary nor an earthquake has occurred, and John and Mary both called
Harry.
Solution:
• The network structure shows that Burglary and Earthquake are the
parent nodes of Alarm and directly affect the probability of
the alarm going off, whereas John's and Mary's calls depend only on the alarm.
• The network represents the assumptions that John and Mary do not directly
perceive the burglary, do not notice minor earthquakes,
and do not confer before calling.
• The conditional distribution for each node is given as a conditional
probability table, or CPT.
• Each row in the CPT must sum to 1 because the entries in the
row represent an exhaustive set of cases for the variable.
• In a CPT, a boolean variable with k boolean parents requires
2^k probabilities, one per conditioning case. Hence, if there are two parents, the CPT will contain
4 probability values, as illustrated below for the Alarm node.
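• As an illustration (following from the network structure described below, not a table copied from the slides), the Alarm node has the two parents Burglary and Earthquake, so its CPT holds one probability per conditioning case:
• P(Alarm | Burglary, Earthquake)
• P(Alarm | Burglary, ¬Earthquake)
• P(Alarm | ¬Burglary, Earthquake)
• P(Alarm | ¬Burglary, ¬Earthquake)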
1. Directed acyclic graph
• Nodes = random variables Burglary, Earthquake, Alarm,
Mary calls and John calls
• Links = direct (causal) dependencies between variables,
• The chance of Alarm is influenced by Earthquake
• The chance of John calling is affected by the Alarm
2. Local conditional distributions
• Relate variables and their parents.
• The local probability information attached to each
node takes the form of a conditional probability table
(CPT).
• Each row in the CPT contains the conditional
probability of each node value for each conditioning case.
• A conditioning case is just a possible combination
of values for the parent nodes.
• Each row must sum to 1, because the entries
represent an exhaustive set of cases for the variable.
• Let's take the observed probabilities for the
Burglary and Earthquake components:
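• The observed probability table itself is not included in this text. As a minimal sketch of the query posed in the Problem above, the Python snippet below assumes the commonly cited textbook values for this network; treat the numbers as illustrative assumptions rather than values taken from these slides.

# Illustrative (commonly cited textbook) CPT entries for the alarm network.
p_b = 0.001        # P(Burglary)
p_e = 0.002        # P(Earthquake)
p_a_nb_ne = 0.001  # P(Alarm | no Burglary, no Earthquake)
p_j_a = 0.90       # P(JohnCalls | Alarm)
p_m_a = 0.70       # P(MaryCalls | Alarm)

# P(JohnCalls, MaryCalls, Alarm, no Burglary, no Earthquake)
#   = P(J | A) * P(M | A) * P(A | ~B, ~E) * P(~B) * P(~E)
answer = p_j_a * p_m_a * p_a_nb_ne * (1 - p_b) * (1 - p_e)
print(answer)  # about 0.00063 with these illustrative numbers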
What is an Expert System?
• It is important to remember that an expert system is not used to replace
human experts; instead, it is used to assist humans in making complex
decisions.
• These systems do not have human thinking capabilities; they work on the
basis of a knowledge base for a particular domain.
Below are some popular examples of expert systems:
1. IF-THEN rules
• Human experts usually tend to think along :