ML Unit 5


Markov Chain Monte Carlo (MCMC)

https://www.slideshare.net/shivangisaxena566/hidden-markov-model-ppt

Markov Chain Monte Carlo (MCMC) is a powerful class of algorithms widely used in machine learning, Bayesian statistics, and computational physics for sampling from complex probability distributions when direct sampling is difficult. Here's a breakdown of what MCMC is and how it's used in machine learning:
What is MCMC?
MCMC methods are used to generate samples
from a target probability distribution (usually
a posterior in Bayesian inference) by
constructing a Markov chain that has the
desired distribution as its equilibrium
distribution.

Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability of a hypothesis as more evidence becomes available.
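In symbols, with θ for the parameters (or hypothesis) and D for the observed data, Bayes' theorem gives the posterior as

P(θ ∣ D) = P(D ∣ θ) · P(θ) / P(D)

i.e., posterior ∝ likelihood × prior. The normalizing constant P(D) is often intractable, which is exactly why sampling methods such as MCMC, which only need the unnormalized density, are useful.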
• Key Concepts
• Markov Chain: A stochastic process where the
next state depends only on the current state, not
the history.
• Monte Carlo: Refers to methods that rely on
random sampling to obtain numerical results.
• Stationary Distribution: The distribution to which the Markov chain converges after many iterations (see the note after this list).
• Ergodicity: Ensures that the time average of the
samples equals the ensemble average.
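A note tying these concepts together: most MCMC algorithms are constructed to satisfy detailed balance with respect to the target distribution p, i.e.

p(x) · T(x → x′) = p(x′) · T(x′ → x)

for the chain's transition kernel T. Detailed balance makes p a stationary distribution of the chain, and ergodicity then guarantees that the chain converges to it and that sample averages approximate expectations under p.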
• How MCMC Works
• Start with an initial guess.
• Generate a new sample using a proposal
distribution.
• Accept or reject the new sample based on a
rule (like the Metropolis-Hastings criterion).
• Repeat this process to get a sequence of
samples (the Markov chain).
• After a burn-in period, the samples
approximate the target distribution.
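A minimal sketch of these steps in Python, assuming a one-dimensional standard normal target and a symmetric Gaussian random-walk proposal (both the target and the step size are illustrative choices, not from the slides):

# Random-walk Metropolis-Hastings targeting a 1-D standard normal.
import numpy as np

def metropolis_hastings(log_target, n_samples=5000, x0=0.0, step=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = x0
    samples = np.empty(n_samples)
    for i in range(n_samples):
        x_prop = x + step * rng.normal()              # 1. propose a new sample
        # 2. Metropolis-Hastings criterion: accept with prob min(1, p(x_prop)/p(x))
        if np.log(rng.uniform()) < log_target(x_prop) - log_target(x):
            x = x_prop                                # 3. accept (otherwise keep x)
        samples[i] = x                                # 4. record the chain
    return samples

log_target = lambda x: -0.5 * x**2                    # log N(0, 1) up to a constant
samples = metropolis_hastings(log_target)
burn_in = 1000                                        # 5. discard burn-in samples
print(samples[burn_in:].mean(), samples[burn_in:].std())   # roughly 0 and 1

Note that only the ratio of target densities appears in the acceptance rule, so an unnormalized (log) density is enough; this is what makes MCMC practical for Bayesian posteriors.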
Hidden Markov Models (HMMs)
• Hidden Markov Models (HMMs)
are probabilistic models that describe a
sequence of observations as being generated
by a hidden sequence of states, where the
states are not directly observable, but their
transitions and the symbols they emit are.
• Common MCMC Algorithms
• Metropolis-Hastings Algorithm
– Proposes new points and accepts them with a
probability that maintains the desired distribution.
• Gibbs Sampling
– A special case of Metropolis-Hastings when you can sample directly from the conditional distributions (see the sketch after this list).
• Hamiltonian Monte Carlo (HMC)
– Uses gradient information to propose samples,
often used in tools like Stan and PyMC3 for
efficient Bayesian inference.
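As a concrete illustration of Gibbs sampling mentioned above, the sketch below samples a bivariate normal with correlation rho, where both full conditionals are known in closed form; the distribution and the value rho = 0.8 are assumptions for the example, not from the slides:

# Gibbs sampling for a bivariate normal with correlation rho.
import numpy as np

def gibbs_bivariate_normal(rho=0.8, n_samples=5000, seed=0):
    rng = np.random.default_rng(seed)
    x, y = 0.0, 0.0
    cond_sd = np.sqrt(1.0 - rho**2)                 # std. dev. of each conditional
    samples = np.empty((n_samples, 2))
    for t in range(n_samples):
        # Alternately draw each variable from its exact conditional given the other.
        x = rng.normal(rho * y, cond_sd)            # x | y ~ N(rho*y, 1 - rho^2)
        y = rng.normal(rho * x, cond_sd)            # y | x ~ N(rho*x, 1 - rho^2)
        samples[t] = (x, y)
    return samples

samples = gibbs_bivariate_normal()
print(np.corrcoef(samples[1000:].T))                # off-diagonal entry near 0.8

Because every draw comes from an exact conditional, no accept/reject step is needed, which is the sense in which Gibbs is a special case of Metropolis-Hastings with acceptance probability 1.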
• Applications in Machine Learning
• Bayesian Inference: Estimating posterior
distributions over model parameters.
• Model Averaging: Averaging predictions across
multiple models.
• Hyperparameter Tuning: Especially in Bayesian
optimization.
• Latent Variable Models: Inference in models like
Hidden Markov Models (HMMs), LDA (Latent
Dirichlet Allocation).
• Uncertainty Quantification: In predictions,
especially in deep learning with Bayesian neural
networks.
• Example: Bayesian Linear Regression with MCMC
• In Bayesian linear regression, you might not know the
exact distribution of the coefficients. Using MCMC, you
can sample from the posterior distribution of the
weights and compute expectations, variances, and
credible intervals.
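A hedged sketch of what this could look like with PyMC3 (one of the libraries listed next); the data, the priors, and the keyword names are illustrative and may differ slightly across PyMC versions:

# Bayesian linear regression sampled with MCMC via PyMC3 (illustrative sketch).
import numpy as np
import pymc3 as pm

rng = np.random.default_rng(0)
X = rng.normal(size=100)
y = 1.0 + 2.0 * X + 0.5 * rng.normal(size=100)      # toy data, assumed for the example

with pm.Model():
    intercept = pm.Normal("intercept", mu=0.0, sigma=10.0)   # prior on the intercept
    slope = pm.Normal("slope", mu=0.0, sigma=10.0)           # prior on the slope
    sigma = pm.HalfNormal("sigma", sigma=1.0)                # prior on the noise scale
    pm.Normal("y_obs", mu=intercept + slope * X, sigma=sigma, observed=y)
    trace = pm.sample(2000, tune=1000)              # draws posterior samples (NUTS/HMC)

print(pm.summary(trace))                            # means, std. devs, credible intervals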

• Libraries That Use MCMC


• PyMC3 / PyMC4
• Stan / CmdStanPy
• TensorFlow Probability
• emcee (Python-based affine-invariant ensemble
sampler)
Graphical Models
• Markov Chain Monte Carlo (MCMC) is a
sampling method used in graphical models,
particularly for Bayesian inference when exact
computation is intractable. MCMC constructs a Markov chain whose samples gradually approach the desired posterior distribution. It's particularly useful for complex distributions represented by undirected graphical models (Markov random fields).
• MCMC is a class of algorithms used to sample
from complex probability distributions,
especially when direct sampling is infeasible.
• Key Idea: Construct a Markov Chain whose
stationary distribution is the target
distribution (e.g., posterior distribution in
Bayesian inference).
• Common Algorithms:
– Metropolis-Hastings
– Gibbs Sampling
Graphical Models:

• Graphical models represent probabilistic relationships among a set of variables using a graph structure.
• Types:
– Bayesian Networks (Directed Acyclic Graphs)
– Markov Random Fields (Undirected Graphs)
• They provide a compact way to represent joint
probability distributions.
Using MCMC in Graphical Models:

• Graphical models often involve high-dimensional, complex distributions that are intractable to compute directly.
• MCMC methods are used to perform inference in
these models, particularly in Bayesian frameworks.
• Example: Gibbs Sampling in Graphical Models
• Works well when we can sample from the conditional
distributions of each variable.
• Suitable for models like:
– Hidden Markov Models (HMM)
– Latent Dirichlet Allocation (LDA)
– Bayesian Networks with conjugate priors
Workflow Summary:

• Model Definition: Use a graphical model to define dependencies.
• Posterior Estimation: Use Bayes’ theorem
(often intractable).
• Sampling with MCMC: Use Metropolis-
Hastings or Gibbs to generate samples.
• Inference: Estimate marginals, MAP estimates,
or expected values.
Applications:

• Natural Language Processing (Topic Models)
• Computer Vision (Image Segmentation)
• Bioinformatics (Gene Expression Models)
• Robotics (Localization & Mapping)
Bayesian Networks
• Bayesian Networks are key to understanding many probabilistic models, especially in AI, ML, and data science. Here's a clear and structured overview you can use for learning, teaching, or presenting.

• A Bayesian Network (also known as a Belief Network) is a Directed Acyclic Graph (DAG) that represents a set of random variables and their conditional dependencies.
• Components:
• Nodes → Represent random variables
(discrete or continuous).
• Edges → Represent direct dependencies
(conditional relationships).
• Conditional Probability Distributions (CPDs):
– For each node Xi, there's a CPD:
– P(Xi ∣ Parents(Xi))
• Inference in Bayesian Networks:
• Used to compute posterior probabilities,
predict unknowns, or update beliefs.
• Common methods:
• Exact inference:
– Variable Elimination
– Junction Tree Algorithm
• Approximate inference:
– MCMC (e.g., Gibbs Sampling)
– Likelihood Weighting
• Example: Medical Diagnosis BN
• Variables:
• Flu (F)
• Fever (Fe)
• Cough (C)
• Graph:
• F → Fe
• F→C
• Factorization:
• P(F,Fe,C)=P(F)⋅P(Fe∣F)⋅P(C∣F)
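A small sketch of exact inference by enumeration for this network, directly implementing the factorization above; the CPT numbers are illustrative assumptions, not values given in the slides:

# Exact inference by enumeration for the network F -> Fe, F -> C.
P_F = {True: 0.1, False: 0.9}                 # P(F)            (assumed numbers)
P_Fe_given_F = {True: 0.8, False: 0.1}        # P(Fe=true | F)  (assumed numbers)
P_C_given_F = {True: 0.7, False: 0.2}         # P(C=true | F)   (assumed numbers)

def joint(f, fe, c):
    # P(F, Fe, C) = P(F) * P(Fe | F) * P(C | F)
    p_fe = P_Fe_given_F[f] if fe else 1 - P_Fe_given_F[f]
    p_c = P_C_given_F[f] if c else 1 - P_C_given_F[f]
    return P_F[f] * p_fe * p_c

# Posterior P(F=true | Fe=true): sum out C, then normalize.
num = sum(joint(True, True, c) for c in (True, False))
den = sum(joint(f, True, c) for f in (True, False) for c in (True, False))
print(num / den)                              # about 0.47 with these numbers

For larger networks this brute-force summation grows exponentially, which is why variable elimination, junction trees, or approximate methods such as Gibbs sampling are used instead.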
• Applications:
• Medical diagnosis
• Risk analysis
• Speech recognition
• Sensor fusion in robotics
• Bioinformatics (e.g., gene expression
networks)
Tracking Methods
• Hidden Markov Models (HMMs) are a
fundamental tool for modeling sequential or
time-series data, and they play a major role in
tracking problems—especially when the
system state is not directly observable (i.e.,
"hidden"). Here's a breakdown focusing on
HMMs and their application in tracking
methods:
What is a Hidden Markov Model?
• An HMM is a statistical model where:
• The system being modeled is assumed to follow a Markov process
with hidden states.
• Each hidden state produces an observable output.

Components of an HMM:
• States (S): S={s1,s2,...,sN}
• Observations (O): O={o1,o2,...,oT}
• Transition Probabilities (A):
aij=P(sj ∣ si)
• Emission Probabilities (B):
bj(ot)=P(ot ∣ sj)
• Initial Probabilities (π):
πi=P(si at t=1)
• Tracking with HMMs:
• Tracking is the task of estimating the most probable sequence of hidden
states over time, given a sequence of observations.
• Common Tracking Methods:
• 1. Filtering (Online Inference)
– Goal: Estimate the current state at time t, given all observations up to t:
– P(st∣o1,o2,...,ot)
– Algorithm: Forward Algorithm
• 2. Smoothing (Offline Inference)
– Goal: Estimate past states using all available observations:
– P(st ∣ o1,...,oT), for t < T
– Algorithm: Forward-Backward Algorithm
• 3. Most Likely State Sequence
– Goal: Determine the single best sequence of hidden states:
– arg max_S P(S ∣ O)
– Algorithm: Viterbi Algorithm (sketched after this list)
• 4. Learning the Model (if unknown)
– Goal: Learn the parameters A,B,π from data.
– Algorithm: Baum-Welch (a special case of Expectation-Maximization)
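A minimal sketch of the Viterbi algorithm from method 3 above, implemented in log space with the A, B, π notation defined earlier; the 2-state, 2-symbol HMM at the bottom is an illustrative assumption:

# Viterbi algorithm: most likely hidden state sequence given observations.
import numpy as np

def viterbi(pi, A, B, obs):
    N, T = A.shape[0], len(obs)
    log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    delta = np.empty((T, N))            # best log-prob of a path ending in state j at time t
    psi = np.zeros((T, N), dtype=int)   # back-pointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A        # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    states = np.empty(T, dtype=int)     # backtrack the best path
    states[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        states[t] = psi[t + 1, states[t + 1]]
    return states

pi = np.array([0.6, 0.4])               # initial probabilities (assumed)
A = np.array([[0.7, 0.3], [0.4, 0.6]])  # transition probabilities (assumed)
B = np.array([[0.9, 0.1], [0.2, 0.8]])  # emission probabilities (assumed)
print(viterbi(pi, A, B, obs=[0, 1, 1, 0]))   # -> [0 1 1 0] with these numbers

The same delta recursion with sums in place of maxima gives the forward algorithm used for filtering (method 1).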
• Use Cases in Tracking:
• Speech recognition: Track phoneme states
over time
• Activity recognition: Track hidden
physical/behavioral states
• Object tracking in vision: Track motion
patterns or positions
• Robot localization: Estimate robot’s position
from noisy sensor data
