Module 4
Bayesian Learning
Introduction
Probability-based learning is an important practical learning approach that combines
prior knowledge, in the form of prior probabilities, with newly observed data to make
decisions or predictions
It relies heavily on probability theory, which helps in modeling randomness,
uncertainty, and noise when predicting future events
1. It helps describe how random events occur and how uncertainty can be
mathematically captured
2. Especially useful when dealing with massive data, as it models and predicts
hidden quantities
Probabilistic vs Deterministic Models
Probabilistic model: instead of giving a single solution, it gives a probability
distribution over possible outcomes
Deterministic model: given the same initial conditions, it will always produce the
same output
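A minimal sketch of the contrast, using an invented pass/fail threshold model purely for illustration:

```python
def deterministic_model(score):
    # Same input always yields the same single answer.
    return "Pass" if score >= 0.5 else "Fail"

def probabilistic_model(score):
    # Same input yields a probability distribution over outcomes.
    p_pass = min(max(score, 0.0), 1.0)
    return {"Pass": p_pass, "Fail": 1.0 - p_pass}

print(deterministic_model(0.7))   # always 'Pass'
print(probabilistic_model(0.7))   # {'Pass': 0.7, 'Fail': 0.3}
```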
Introduction - Bayesian Learning
While probabilistic learning generally models uncertainty using observed data,
Bayesian learning goes a step further by using subjective probabilities
A subjective probability is a personal belief or interpretation about the likelihood of an event
These beliefs can change over time as more data is observed
Two major algorithms under Bayesian learning
1. Naïve Bayes Learning
2. Bayesian Belief Networks (BBN)
Fundamentals of Bayes Theorem
The prior probability reflects what we already know about an event before any new
observations are made
It’s our starting point. It represents the initial degree of belief in a hypothesis
before seeing the new evidence.
Likelihood measures how probable the observed evidence is, assuming that the
hypothesis is true.
It’s denoted as P(Evidence|Hypothesis) — the probability of the observed
evidence given that the hypothesis holds.
It helps us evaluate how well the hypothesis explains the new data
Posterior probability is the updated probability of the hypothesis after observing
the new evidence
It’s denoted as P(Hypothesis|Evidence) — the probability that the hypothesis is
true given the observed evidence.
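Putting the three quantities together, Bayes' theorem can be stated as:

P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)}

where H is the hypothesis and E the observed evidence; in words, Posterior = (Likelihood × Prior) / Evidence.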
Classification Using Bayes Model
Naïve Bayes Classification models are based fundamentally on Bayes Theorem
Bayes' rule is a mathematical formula used to compute the posterior probability of
a hypothesis, given prior information and new evidence
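For classification, a standard way to write the resulting decision rule, with the "naïve" assumption that features are conditionally independent given the class, is:

c_{MAP} = \arg\max_{c \in C} P(c \mid x_1, \dots, x_n) = \arg\max_{c \in C} P(c) \prod_{i=1}^{n} P(x_i \mid c)

The evidence term P(x_1, \dots, x_n) can be dropped because it is the same for every class.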
Naïve Bayes Algorithm
Given the test data (CGPA ≥ 9, Interactiveness = Yes, Practical Knowledge = Average,
Communication Skills = Good), apply Bayes' theorem to determine whether the
student will receive a job offer or not.
Step 1: Compute the prior probability for each hypothesis (Job Offer = Yes and Job Offer = No)
Step 2: Build the frequency matrix and compute the likelihood probability for each feature:
2a: CGPA
2b: Interactiveness
2c: Practical Knowledge
2d: Communication Skills
Step 3: Calculate the posterior probability of each hypothesis
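Schematically, the two hypotheses for the test instance are scored as:

P(Yes | x) ∝ P(Yes) · P(CGPA ≥ 9 | Yes) · P(Interactiveness = Yes | Yes) · P(Practical Knowledge = Average | Yes) · P(Communication Skills = Good | Yes)

and analogously for P(No | x); the hypothesis with the larger score is predicted.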
Since one of the likelihood probabilities is zero, the entire product becomes zero
Laplace Correction (Smoothing) is suggested
Add 1 to each count (even if it was zero)
Adjust the denominator accordingly (to account for the added counts)
This ensures no attribute probability is exactly zero, allowing the model to make
better predictions even for unseen combinations.
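A minimal sketch of the full pipeline: frequency counting, Laplace-smoothed likelihoods, and posterior scoring. The four training rows below are invented for illustration, since the slide's actual dataset and tables are not reproduced here:

```python
from collections import Counter, defaultdict
from math import prod

def train_naive_bayes(rows, labels, alpha=1):
    """Fit a categorical Naive Bayes model with Laplace (add-alpha) smoothing."""
    n = len(labels)
    class_counts = Counter(labels)                       # frequency of each class
    priors = {c: class_counts[c] / n for c in class_counts}

    # feature_counts[f][c][v] = how often feature f took value v in class c
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    feature_values = defaultdict(set)
    for row, y in zip(rows, labels):
        for f, v in row.items():
            feature_counts[f][y][v] += 1
            feature_values[f].add(v)

    def likelihood(f, v, c):
        # Laplace correction: add alpha to every count and enlarge the
        # denominator by alpha * (number of distinct values of feature f),
        # so no likelihood is ever exactly zero.
        return ((feature_counts[f][c][v] + alpha)
                / (class_counts[c] + alpha * len(feature_values[f])))

    def predict(x):
        # Posterior score per class: prior times the product of likelihoods.
        scores = {c: priors[c] * prod(likelihood(f, v, c) for f, v in x.items())
                  for c in class_counts}
        return max(scores, key=scores.get), scores

    return predict

# Invented training rows, loosely in the spirit of the slide example.
rows = [
    {"CGPA": ">=9", "Interactiveness": "Yes", "Practical": "Good",    "Comm": "Good"},
    {"CGPA": ">=8", "Interactiveness": "No",  "Practical": "Average", "Comm": "Moderate"},
    {"CGPA": "<8",  "Interactiveness": "No",  "Practical": "Average", "Comm": "Poor"},
    {"CGPA": ">=9", "Interactiveness": "Yes", "Practical": "Average", "Comm": "Good"},
]
labels = ["Yes", "No", "No", "Yes"]

predict = train_naive_bayes(rows, labels)
label, scores = predict({"CGPA": ">=9", "Interactiveness": "Yes",
                         "Practical": "Average", "Comm": "Good"})
print(label, scores)
```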
Bayes Optimal Classifier
It is a probabilistic model that uses Bayes' theorem to find the best possible
classification for a new data point
Instead of choosing just the single most probable hypothesis, it combines all
hypotheses based on their posterior probabilities
It aggregates the predictions of all hypotheses instead of committing early to one
hypothesis
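Formally, following the standard formulation, the Bayes optimal classification weights each hypothesis's vote by its posterior probability:

c^{*} = \arg\max_{c_j \in C} \sum_{h_i \in H} P(c_j \mid h_i) \, P(h_i \mid D)

where D is the training data, H the hypothesis space, and C the set of possible classifications.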