Module 5 2


Bayesian Belief Network

Introduction
Suppose you are trying to determine if a patient has Covid19. You observe the following symptoms:
• The patient has a cough
• The patient has a fever
• The patient has difficulty breathing
You would like to determine how likely it is that the patient is infected with Covid19, given that the patient has a cough, a fever, and difficulty breathing.
These symptoms alone do not make us 100% certain that the patient has Covid19. We are dealing with uncertainty!

Now suppose you order an x-ray and observe that the patient has a wide mediastinum.
Your belief that the patient is infected with Covid19 is now much higher.

• In the previous slides, what you observed affected your belief that the patient is infected with Covid19
• This is called reasoning with uncertainty
• Wouldn’t it be nice if we had some methodology for reasoning with uncertainty? Well, in fact, we do…
Probabilistic Graphical models
• Probabilistic graphical modelling is a branch of
machine learning that studies how to use
probability distributions to describe the world
and to make useful predictions about it.
Probabilistic modelling

• The probabilistic aspect of modelling is very important, because:
– Typically, we cannot perfectly predict the future. We often don’t have enough knowledge about the world, and often the world itself is stochastic.
– We need to assess the confidence of our predictions. Often, predicting a single value is not enough; we need the system to output its beliefs about what’s going on in the world.
• A Bayesian belief network is a key technique for reasoning about probabilistic events and for solving problems that involve uncertainty.
• A Bayesian belief network is a graphical representation of the probabilistic relationships among a set of random variables.
• A Bayesian network is a probabilistic graphical
model which represents a set of variables and their
conditional dependencies using a directed acyclic
graph.
• It is also called a Bayes network, belief network,
decision network, or Bayesian model.
• Bayesian networks are probabilistic, because these
networks are built from a probability distribution,
and also use probability theory for prediction and
anomaly detection.
• Real-world applications are probabilistic in nature, and a Bayesian network lets us represent the relationships among multiple events.
• It can also be used in various tasks
including prediction, anomaly detection,
diagnostics, automated insight, reasoning,
time series prediction, and decision making
under uncertainty.
• A Bayesian network can be built from data and from expert opinion, and it consists of two parts:
– Directed acyclic graph
– Table of conditional probabilities
Bayesian belief network
• Each node corresponds to a random variable, which can be continuous or discrete.
• Arcs (directed arrows) represent causal relationships or conditional dependencies between random variables. Each directed link connects a pair of nodes in the graph and indicates that one node directly influences the other. If there is no directed link between two nodes, neither one directly influences the other.
– In the above diagram, A, B, C, and D are random variables represented by the nodes of the network graph.
– If we are considering node B, which is connected with node A by a directed arrow, then node A is called the parent of node B.
– Node C is independent of node A.
• The Bayesian network graph does not contain any directed cycles. Hence, it is known as a directed acyclic graph or DAG.
• Each node in the Bayesian network has a conditional probability distribution P(Xi | Parents(Xi)), which quantifies the effect of the parents on that node.
• A Bayesian network is based on the joint probability distribution and conditional probability.
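As a minimal sketch of the two parts described above (a directed acyclic graph plus one conditional distribution per node), the structure can be stored as a mapping from each node to its parents and checked for cycles. The node names X1 to X4 and their edges are hypothetical placeholders, not the network from the slide diagram, and the check relies on the standard-library graphlib module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical structure: each node maps to the list of its parents.
parents = {
    "X1": [],
    "X2": ["X1"],
    "X3": ["X1"],
    "X4": ["X2", "X3"],
}


def is_dag(parent_map):
    """Return True if the parent map contains no directed cycle."""
    try:
        # static_order() raises CycleError when the graph has a cycle.
        list(TopologicalSorter(parent_map).static_order())
        return True
    except CycleError:
        return False


def cpd_label(node, parent_map):
    """Name the conditional distribution P(Xi | Parents(Xi)) of a node."""
    node_parents = parent_map[node]
    if not node_parents:
        return f"P({node})"
    return f"P({node} | {', '.join(node_parents)})"


print(is_dag(parents))
for name in parents:
    print(cpd_label(name, parents))
```

A node without parents gets an unconditional distribution, which is why the label for X1 has no conditioning bar.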
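Joint probability distribution

• If we have variables x1, x2, x3, ..., xn, then the probabilities of the different combinations of x1, x2, x3, ..., xn are known as the joint probability distribution.
• By the chain rule, the joint probability can be expanded as:
P[x1, x2, x3, ..., xn] = P[x1 | x2, x3, ..., xn] P[x2, x3, ..., xn]
= P[x1 | x2, x3, ..., xn] P[x2 | x3, ..., xn] ... P[xn-1 | xn] P[xn]
In general, if the variables are ordered so that each node's parents appear later in the list, each factor for a variable xi simplifies to:
P[xi | xi+1, ..., xn] = P[xi | Parents(xi)]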
Consider a network in which B is the parent of both A and C. The conditional dependencies are as follows:
• A is conditionally dependent upon B, e.g. P(A|B)
• C is conditionally dependent upon B, e.g. P(C|B)
The conditional independencies are as follows:
• A is conditionally independent of C given B: P(A|B, C) = P(A|B)
• C is conditionally independent of A given B: P(C|B, A) = P(C|B)
• B has no parents, so it is unaffected by A and C, and its distribution is stated simply as P(B).
• The joint probability of A and C given B is:
P(A, C | B) = P(A|B) * P(C|B)
• The joint probability P(A, B, C) is calculated as:
P(A, B, C) = P(A|B) * P(C|B) * P(B)
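The factorisation above can be checked numerically. The sketch below builds the joint table from P(B), P(A|B) and P(C|B), then recovers P(A | B, C) from that table to confirm it equals P(A | B). All probability values are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical numbers, not taken from the slides.
p_b = {True: 0.3, False: 0.7}          # P(B)
p_a_given_b = {True: 0.9, False: 0.2}  # P(A=True | B)
p_c_given_b = {True: 0.6, False: 0.1}  # P(C=True | B)


def bernoulli(p_true, value):
    """Probability of a Boolean value when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(a, b, c):
    """P(A, B, C) = P(A|B) * P(C|B) * P(B)."""
    return bernoulli(p_a_given_b[b], a) * bernoulli(p_c_given_b[b], c) * p_b[b]


# The eight joint entries must sum to 1.
total = sum(joint(a, b, c) for a, b, c in product([True, False], repeat=3))


def p_a_given_bc(b, c):
    """P(A=True | B=b, C=c), recovered from the joint table."""
    return joint(True, b, c) / (joint(True, b, c) + joint(False, b, c))


for b, c in product([True, False], repeat=2):
    # Both printed numbers agree: knowing C adds nothing once B is known.
    print(b, c, round(p_a_given_bc(b, c), 6), p_a_given_b[b])
```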
Bayesian Networks
Example
• Harry installed a new burglar alarm at his home to detect burglary. The alarm responds reliably to a burglary but also responds to minor earthquakes. Harry has two neighbors, David and Sophia, who have taken responsibility for informing Harry at work when they hear the alarm. David always calls Harry when he hears the alarm, but sometimes he confuses the phone ringing with the alarm and calls then too. Sophia, on the other hand, likes to listen to loud music, so she sometimes fails to hear the alarm. Here we would like to compute probabilities for events involving the burglar alarm.
• The network structure shows that Burglary and Earthquake are the parent nodes of Alarm and directly affect the probability of the alarm going off, whereas David's and Sophia's calls depend only on the alarm.
• The network represents the assumptions that David and Sophia do not directly perceive the burglary, do not notice the minor earthquake, and do not confer before calling.
• The conditional distribution for each node is given as a conditional probability table, or CPT.
• Each row in the CPT must sum to 1 because the entries in a row represent an exhaustive set of cases for the variable.
• In a CPT, a Boolean variable with k Boolean parents contains 2^k independently specifiable probabilities, one for each combination of parent values. Hence, if there are two parents, the CPT will contain 4 probability values.
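The 2^k count can be seen by enumerating the parent combinations directly. The sketch below is illustrative only: the helper name cpt_rows is hypothetical, and it lists the rows of a CPT without assigning any probabilities to them.

```python
from itertools import product


def cpt_rows(parent_names):
    """List every combination of truth values for Boolean parents."""
    return [
        dict(zip(parent_names, values))
        for values in product([True, False], repeat=len(parent_names))
    ]


# Alarm has two Boolean parents, so its CPT has 2**2 = 4 rows.
rows = cpt_rows(["Burglary", "Earthquake"])
for row in rows:
    print(row)
print(len(rows))
```

Each row needs only one stored number, P(X = True | parents), because the complementary entry follows from the rule that a row sums to 1.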
Example – Joint distribution
List of all events occurring in this network:
• Burglary (B)
• Earthquake(E)
• Alarm(A)
• David Calls(D)
• Sophia calls(S)
• We can write the events of the problem statement as the probability P[D, S, A, B, E] and rewrite it using the joint probability distribution:
P[D, S, A, B, E] = P[D | A] * P[S | A] * P[A | B, E] * P[B] * P[E]
Let's take the prior probabilities for the Burglary and Earthquake nodes:
Burglary ‘B’:
• P(B=T) = 0.002 (‘B’ is true, i.e. a burglary has occurred)
• P(B=F) = 0.998 (‘B’ is false, i.e. a burglary has not occurred)
Earthquake ‘E’:
• P(E=T) = 0.001 (‘E’ is true, i.e. an earthquake has occurred)
• P(E=F) = 0.999 (‘E’ is false, i.e. an earthquake has not occurred)
Conditional probability table for Alarm A:
The conditional probability of Alarm A depends on its parent nodes, Burglary and Earthquake.
Conditional probability table for David Calls:
The conditional probability that David calls depends on the state of its parent node, Alarm.
Conditional probability table for Sophia Calls:
The conditional probability that Sophia calls depends on the state of its parent node, Alarm.
Question:
• Calculate the probability that the alarm has sounded, but neither a burglary nor an earthquake has occurred, and David and Sophia have both called Harry.
From the formula for the joint distribution, we can write the problem statement as:
P(S, D, A, ¬B, ¬E) = P(S|A) * P(D|A) * P(A|¬B, ¬E) * P(¬B) * P(¬E)
= 0.75 * 0.91 * 0.001 * 0.998 * 0.999
= 0.00068045
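The arithmetic can be reproduced in a few lines. This sketch uses only the five numbers that appear in the calculation above; the variable names are illustrative.

```python
# Factors taken from the worked calculation in the text.
p_s_given_a = 0.75          # P(S | A)
p_d_given_a = 0.91          # P(D | A)
p_a_given_not_b_not_e = 0.001  # P(A | ¬B, ¬E)
p_not_b = 0.998             # P(¬B)
p_not_e = 0.999             # P(¬E)

# P(S, D, A, ¬B, ¬E) is the product of the five factors.
p_joint = p_s_given_a * p_d_given_a * p_a_given_not_b_not_e * p_not_b * p_not_e

print(f"{p_joint:.8f}")  # prints 0.00068045
```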
What is the probability that David calls?
P(D) = P(D|A) * P(A) + P(D|¬A) * P(¬A)
= P(D|A) * {P(A|B,E)*P(B,E) + P(A|¬B,E)*P(¬B,E) + P(A|B,¬E)*P(B,¬E) + P(A|¬B,¬E)*P(¬B,¬E)}
+ P(D|¬A) * {P(¬A|B,E)*P(B,E) + P(¬A|¬B,E)*P(¬B,E) + P(¬A|B,¬E)*P(B,¬E) + P(¬A|¬B,¬E)*P(¬B,¬E)}
Burglary and Earthquake have no parents and are independent of each other, so P(B,E) = P(B)*P(E), which gives:
= P(D|A) * {P(A|B,E)*P(B)*P(E) + P(A|¬B,E)*P(¬B)*P(E) + P(A|B,¬E)*P(B)*P(¬E) + P(A|¬B,¬E)*P(¬B)*P(¬E)}
+ P(D|¬A) * {P(¬A|B,E)*P(B)*P(E) + P(¬A|¬B,E)*P(¬B)*P(E) + P(¬A|B,¬E)*P(B)*P(¬E) + P(¬A|¬B,¬E)*P(¬B)*P(¬E)}
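The marginalisation above can be sketched as a sum over the four combinations of B and E. The priors, P(D|A) = 0.91 and P(A|¬B,¬E) = 0.001 are the values used elsewhere in the text. The remaining CPT entries do not appear in the text, so the numbers marked ASSUMED below are placeholders chosen for illustration and should be replaced with the values from the actual tables.

```python
from itertools import product

p_b = 0.002  # P(B=T), from the text
p_e = 0.001  # P(E=T), from the text

# P(A=T | B, E), keyed by (B, E).
p_a_given_be = {
    (True, True): 0.94,    # ASSUMED placeholder
    (True, False): 0.95,   # ASSUMED placeholder
    (False, True): 0.31,   # ASSUMED placeholder
    (False, False): 0.001, # from the text
}

# P(D=T | A), keyed by A.
p_d_given_a = {
    True: 0.91,   # from the text
    False: 0.05,  # ASSUMED placeholder
}


def prior(p_true, value):
    """Probability of a Boolean value when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


# P(A) = sum over B, E of P(A | B, E) * P(B) * P(E).
p_a = sum(
    p_a_given_be[(b, e)] * prior(p_b, b) * prior(p_e, e)
    for b, e in product([True, False], repeat=2)
)

# P(D) = P(D | A) * P(A) + P(D | ¬A) * P(¬A).
p_d = p_d_given_a[True] * p_a + p_d_given_a[False] * (1.0 - p_a)

print(p_a, p_d)
```

Because P(¬A | B, E) = 1 - P(A | B, E) in every row, the second brace in the derivation sums to 1 - P(A), which is why the code computes P(A) once and reuses it.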
Inference
Example-2
