CS3491 Unit 2 AIML
PROBABILISTIC REASONING
Acting under uncertainty – Bayesian inference – Naïve Bayes models. Probabilistic reasoning – Bayesian
networks – exact inference in BN – approximate inference in BN – causal networks.
PROBABILISTIC REASONING:
Conditional probability:
Conditional probability is the probability of an event occurring given that another event has already occurred.
Suppose we want to calculate the probability of event A when event B has already occurred, "the probability
of A under the condition B". It can be written as:
P(A|B) = P(A ⋀ B) / P(B)
where P(A ⋀ B) is the joint probability of A and B, and P(B) is the probability of B.
Example:
In a class, 70% of the students like English and 40% of the students like both English and Mathematics.
What percentage of the students who like English also like Mathematics?
Solution:
Let A be the event that a student likes Mathematics, and B be the event that a student likes English.
Then P(B) = 0.7 and P(A ⋀ B) = 0.4, so
P(A|B) = P(A ⋀ B) / P(B) = 0.4 / 0.7 = 0.57
Hence, 57% of the students who like English also like Mathematics.
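To make the calculation concrete, here is a minimal Python sketch (the function name cond_prob is illustrative, not from the source; the values come from the example above):

def cond_prob(p_joint, p_given):
    # P(A|B) = P(A and B) / P(B)
    if p_given == 0:
        raise ValueError("conditioning event must have non-zero probability")
    return p_joint / p_given

# Classroom example: P(English) = 0.7, P(English and Mathematics) = 0.4
print(round(cond_prob(p_joint=0.4, p_given=0.7), 2))  # 0.57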
BAYES' THEOREM:
Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning, which
determines the probability of an event with uncertain knowledge.
In probability theory, it relates the conditional probability and marginal probabilities of two random events.
Bayes' theorem was named after the British mathematician Thomas Bayes. Bayesian inference is an
application of Bayes' theorem. It is a way to calculate the value of P(B|A) with the knowledge of P(A|B).
Example: If the occurrence of cancer is related to one's age, then by using Bayes' theorem we can determine
the probability of cancer more accurately with the knowledge of age.
Bayes' theorem can be derived using the product rule and the conditional probability of event A given event B:
As from product rule we can write:
P(A ⋀ B)= P(A|B) P(B) or
Similarly, the probability of event B with known event A:
P(A ⋀ B)= P(B|A) P(A)
Equating the right-hand sides of both equations, we get:
P(A|B) = P(B|A) P(A) / P(B) ...(a)
The above equation (a) is called Bayes' rule or Bayes' theorem. It shows the simple relationship
between joint and conditional probabilities. Here,
P(A|B) is known as the posterior, which we need to calculate. It is read as the probability of hypothesis A
given that evidence B has occurred.
P(B|A) is called the likelihood: the probability of the evidence given that the hypothesis is true.
P(A) is called the prior probability: the probability of the hypothesis before considering the evidence.
P(B) is called the marginal probability: the probability of the evidence alone.
In general, if A1, A2, ..., An form a partition of the sample space, we can write P(B) = Σi P(Ai) P(B|Ai);
hence Bayes' rule can be written as:
P(Ai|B) = P(B|Ai) P(Ai) / Σk P(Ak) P(B|Ak)
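As a quick illustration of the generalized rule (a minimal sketch; the function name bayes_rule and the numbers are illustrative, not from the source):

def bayes_rule(priors, likelihoods):
    # priors[i] = P(Ai), likelihoods[i] = P(B|Ai)
    # Marginal probability of the evidence: P(B) = sum over i of P(Ai) P(B|Ai)
    p_b = sum(p * l for p, l in zip(priors, likelihoods))
    # Posterior P(Ai|B) for each hypothesis Ai
    return [p * l / p_b for p, l in zip(priors, likelihoods)]

# Two hypotheses with priors 0.6 and 0.4 and likelihoods 0.2 and 0.5
print(bayes_rule([0.6, 0.4], [0.2, 0.5]))  # [0.375, 0.625]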
Applications:
o Real-world applications are probabilistic in nature, and Bayesian networks are used to represent the
relationships between multiple events. They can also be used in various tasks, including:
Prediction
Anomaly detection
Diagnostics
Automated insight
Reasoning
Time series prediction
Decision making under uncertainty.
NAIVE BAYESIAN MODEL
The Naive Bayesian classifier is based on Bayes' theorem with independence assumptions between
predictors. A Naive Bayesian model is easy to build and is particularly useful for very large datasets.
Why is it called Naïve Bayes?
The Naïve Bayes algorithm is comprised of two words, Naïve and Bayes:
o Naïve: It is called Naïve because it assumes that the occurrence of a certain feature is independent of the
occurrence of other features. For example, if a fruit is identified on the basis of color, shape, and taste,
then a red, spherical, and sweet fruit is recognized as an apple. Hence each feature individually contributes
to identifying an apple, without depending on the others.
o Bayes: It is called Bayes because it depends on the principle of Bayes' Theorem.
Bayes' Theorem:
o Bayes' theorem is also known as Bayes' Rule or Bayes' law, which is used to determine the probability
of a hypothesis with prior knowledge. It depends on the conditional probability.
o The formula for Bayes' theorem is given as:
P(A|B) = P(B|A) P(A) / P(B)
Where,
P(A|B) is Posterior probability: Probability of hypothesis A on the observed event B.
P(B|A) is Likelihood probability: Probability of the evidence given that the hypothesis is true.
P(A) is Prior Probability: Probability of the hypothesis before observing the evidence.
P(B) is Marginal Probability: Probability of the evidence.
EXAMPLE: Consider the following dataset of weather conditions (Outlook) and the corresponding target
variable Play, which tells whether the player played on that day. Problem: if the weather is sunny, should
the player play?
Day Outlook Play
0 Rainy Yes
1 Sunny Yes
2 Overcast Yes
3 Overcast Yes
4 Sunny No
5 Rainy Yes
6 Sunny Yes
7 Overcast Yes
8 Rainy No
9 Sunny No
10 Sunny Yes
11 Rainy No
12 Overcast Yes
13 Overcast Yes
Solution:
Frequency table for the Outlook attribute:
Weather Yes No
Overcast 5 0
Rainy 2 2
Sunny 3 2
Total 10 4

Likelihood table for the Outlook attribute:
Weather No Yes
Overcast 0 5 5/14 = 0.36
Rainy 2 2 4/14 = 0.29
Sunny 2 3 5/14 = 0.35
All 4/14 = 0.29 10/14 = 0.71

Applying Bayes' theorem:
P(Yes|Sunny) = P(Sunny|Yes) P(Yes) / P(Sunny) = (3/10 × 10/14) / (5/14) = 0.60
P(No|Sunny) = P(Sunny|No) P(No) / P(Sunny) = (2/4 × 4/14) / (5/14) = 0.40
Since P(Yes|Sunny) > P(No|Sunny), on a sunny day the player can play the game.
The posterior probability can be calculated by first constructing a frequency table for each attribute against
the target, then transforming the frequency tables into likelihood tables, and finally using the Naive Bayesian
equation to calculate the posterior probability for each class. The class with the highest posterior
probability is the outcome of the prediction.
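The following minimal Python sketch (not from the source; it uses only the 14-row dataset given above) carries out this procedure for the single Outlook attribute:

from collections import Counter

# (Outlook, Play) pairs from the dataset above
data = [("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"), ("Overcast", "Yes"),
        ("Sunny", "No"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
        ("Rainy", "No"), ("Sunny", "No"), ("Sunny", "Yes"), ("Rainy", "No"),
        ("Overcast", "Yes"), ("Overcast", "Yes")]

def posteriors(outlook):
    n = len(data)
    class_counts = Counter(play for _, play in data)         # frequency table of the target
    p_outlook = sum(1 for o, _ in data if o == outlook) / n  # marginal P(outlook)
    result = {}
    for c, n_c in class_counts.items():
        likelihood = sum(1 for o, p in data if o == outlook and p == c) / n_c  # P(outlook|class)
        prior = n_c / n                                      # P(class)
        result[c] = likelihood * prior / p_outlook           # Bayes' theorem
    return result

print(posteriors("Sunny"))  # approximately {'Yes': 0.6, 'No': 0.4}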
In the full Play Golf example, likelihood tables are constructed for all four predictors in the same way
(likelihood tables omitted).
RESULT: Play Golf = No
EXAMPLE-3
Two tables, a frequency table and a likelihood table, are used to calculate the prior and posterior
probabilities. The frequency table contains the occurrence of labels for all features. There are two likelihood
tables: Likelihood Table 1 shows the prior probabilities of the labels, and Likelihood Table 2 shows the
posterior probabilities.
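A minimal sketch of how such frequency and likelihood tables can be built in Python with pandas (an assumption: pandas is available, and the data and column names are illustrative, not from the source):

import pandas as pd

# Illustrative data: one feature (Outlook) and the label (Play)
df = pd.DataFrame({
    "Outlook": ["Rainy", "Sunny", "Overcast", "Sunny", "Rainy", "Overcast"],
    "Play":    ["Yes", "Yes", "Yes", "No", "No", "Yes"],
})

# Frequency table: occurrence of each label for each feature value
freq = pd.crosstab(df["Outlook"], df["Play"])
print(freq)

# Likelihood table: P(Outlook | Play), i.e. frequencies normalised per class
likelihood = freq / freq.sum(axis=0)
print(likelihood)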
BAYESIAN INFERENCE
Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update
the probability for a hypothesis as more evidence or information becomes available.
Bayesian inference derives the posterior probability as a consequence of two antecedents: a prior
probability and a "likelihood function" derived from a statistical model for the observed data. Bayesian
inference computes the posterior probability according to Bayes' theorem:
P(H|E) = P(E|H) P(H) / P(E)
where
H is the hypothesis whose probability may be affected by the evidence,
E is the evidence, corresponding to new data that were not used in computing the prior probability,
P(H) is the prior probability, the probability of the hypothesis H before the evidence E is observed,
P(E|H) is the likelihood, the probability of observing the evidence E given the hypothesis H,
P(E) is the marginal likelihood, the probability of the evidence itself, and
P(H|E) is the posterior probability, the probability of the hypothesis H after the evidence E is observed.
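Because the posterior after one observation becomes the prior for the next, Bayesian inference can update beliefs sequentially as evidence arrives. A minimal Python sketch (the coin-bias scenario and the function name update are illustrative, not from the source):

def update(prior, likelihood_h, likelihood_not_h):
    # One step of Bayesian updating for a binary hypothesis H
    # P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    evidence = likelihood_h * prior + likelihood_not_h * (1 - prior)
    return likelihood_h * prior / evidence

# Hypothesis H: "the coin is biased towards heads" with P(heads|H) = 0.8,
# versus a fair coin with P(heads|not H) = 0.5. Start from prior 0.5.
p_h = 0.5
for flip in ["H", "H", "T", "H"]:
    if flip == "H":
        p_h = update(p_h, 0.8, 0.5)
    else:
        p_h = update(p_h, 0.2, 0.5)
    print(f"After observing {flip}: P(H) = {p_h:.3f}")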
CAUSAL NETWORKS
A causal network is an acyclic digraph (DAG) based on cause-and-effect relationships rather than
correlational relationships. Causal networks are diagrams that indicate causal connections using arrows,
where an arrow from A to B indicates that A is a cause of B.
Types of Causes:
(a) Direct: A causal edge from A to B indicates that A directly causes a change in B.
(b) Indirect: Causal edges may represent indirect effects that occur via unmeasured intermediate nodes.
If node A causally influences node B via a measured node C, the causal network should contain edges from
A to C and from C to B. However, if node C is not measured (and is not part of the network), the causal
network should instead contain an edge from A to B.
(c) Context-dependent: Causal edges depend on biological context; a causal edge from A to B may appear
in one context but not in another.
(d) Correlation and causation: Nodes A and B are correlated owing to regulation by the same node C, but
in this example no sequence of mechanistic events links A to B; thus inhibition of A does not change the
abundance of B, and there is no causal edge from A to B.
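To make the direct versus indirect distinction concrete, here is a minimal Python sketch using the networkx library (an assumption: the node names and the choice of networkx are illustrative, not from the source):

import networkx as nx

# Full mechanism: A -> C -> B, where C is a measured intermediate node
full = nx.DiGraph()
full.add_edges_from([("A", "C"), ("C", "B")])

# If C is not measured, the indirect effect collapses into a single edge A -> B
reduced = nx.DiGraph()
reduced.add_edge("A", "B")

# Both are valid causal networks: acyclic digraphs
assert nx.is_directed_acyclic_graph(full)
assert nx.is_directed_acyclic_graph(reduced)

# In the full network there is a directed path from A to B via C,
# so A is still an (indirect) cause of B
print(nx.has_path(full, "A", "B"))  # True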