Chap2 Part2 GMM
• Able to use the Expectation-Maximisation algorithm to compute the optimal parameters of a GMM
Motivation
• Hard clustering vs Soft clustering
• Hard clustering: each point is assigned to one and only one cluster.
• Soft clustering: each point belongs to several clusters, each with a certain degree of membership.
Gaussian Distribution
• A Gaussian distribution, or normal distribution, is a continuous probability distribution that is symmetric about its mean. Its density has a bell-shaped curve: values near the mean have the highest probability of occurring, and the probability falls off symmetrically on both sides of the mean.
Gaussian distribution
• Mathematical definition:
The probability density function of a continuous random variable x that follows a normal/Gaussian distribution is given by:

f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)

with:
µ = mean
σ = standard deviation
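As a quick illustration (not from the slides), this density can be evaluated directly with NumPy; the function name gaussian_pdf is just for illustration:

import numpy as np

def gaussian_pdf(x, mu, sigma):
    # Gaussian density N(x | mu, sigma^2)
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

print(gaussian_pdf(0.0, mu=0.0, sigma=1.0))   # about 0.3989, the peak of the standard normal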
Gaussian distribution
• For d dimensions, the Gaussian distribution of a vector x = (x_1, \dots, x_d)^{T} is defined by:

\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)

where µ is the d-dimensional mean vector and Σ is the d×d covariance matrix.
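As a small illustration (assuming SciPy is available; the numbers below are arbitrary), the d-dimensional density can be evaluated with scipy.stats.multivariate_normal:

import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 0.0])                      # mean vector (d = 2)
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])                 # covariance matrix (symmetric, positive definite)
x = np.array([0.5, -1.0])
print(multivariate_normal(mean=mu, cov=Sigma).pdf(x))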
Gaussian Mixture Models (GMM)
• Gaussian mixture models (GMMs) are probabilistic models widely used in machine learning.
• They are used to group data into categories (clusters) according to the probability distribution that generated each point.
Gaussian Mixture Model (GMM)
• Definition:
• A Gaussian Mixture is a function composed of several Gaussians, each identified by k ∈ {1, …, K}, where K is the number of clusters in our dataset.
Each Gaussian k in the mixture is described by the following parameters:
• A mean μ_k that defines its centre.
• A covariance Σ_k that defines its width and shape.
• A mixing probability π_k that defines how big or small the contribution of that Gaussian will be.
Gaussian Mixture Model (GMM)
• The probability density of a mixture model of K Gaussians is:

p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \qquad \text{with } \sum_{k=1}^{K} \pi_k = 1
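As an illustration (the weights and component parameters below are made up), the density of a 1-D, two-component mixture can be evaluated like this:

import numpy as np
from scipy.stats import norm

weights = np.array([0.4, 0.6])      # mixing probabilities pi_k, summing to 1
means   = np.array([-2.0, 3.0])
stds    = np.array([1.0, 1.5])

def mixture_pdf(x):
    # p(x) = sum_k pi_k * N(x | mu_k, sigma_k^2)
    return float(np.sum(weights * norm.pdf(x, loc=means, scale=stds)))

print(mixture_pdf(0.0))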
Gaussian Mixture Model (GMM)
• Example of a Gaussian Mixture model with two components (2 Gaussian distributions)
Gaussian Mixture Model (GMM)
• Example: (a) & (b) presents each one a 2-dimensions Gaussian Mixture model
14
Parameters Estimation
• Problem:
Given a set of data X = {x_1, x_2, …, x_N}, assumed to follow a GMM distribution, estimate the parameters θ of the GMM that best fit the data,
with θ = {µ_k, Σ_k, π_k} (mean, covariance and mixing weight) for each component k of the mixture.
Maximum Likelihood Estimation (MLE)
• Maximum Likelihood Estimation is a frequentist approach used to estimate the optimal parameters of a mixture model.
• Given a data sample, the maximum likelihood estimator selects the parameter values of the mixture model under which the observed data are most probable.
Expectation Maximisation (EM) Algorithm
• MLE is a frequentist principle that suggests that, given a dataset, the “best” parameters to use are the ones that maximise the probability of the data.
• EM is widely used for such optimisation problems, especially when the objective function is hard to maximise directly, as is the case for the GMM log-likelihood.
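Concretely, the quantity that MLE maximises for a GMM is the log-likelihood of the data (the same expression is derived in the appendix):

\ln p(X \mid \theta) = \sum_{n=1}^{N} \ln\!\left( \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right)

The sum inside the logarithm prevents a closed-form solution, which is why the EM algorithm is used.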
EM more in depth
• Expectation (E) step: Using the current estimate of the parameters, create the function for the expectation of the log-likelihood.
• Maximisation (M) step: Compute new parameter values that maximise this expected log-likelihood.
EM more in depth
• Problem:
z_ik is a hidden (latent) variable associated with x_i: z_ik = 1 if x_i belongs to the k-th component of the mixture, and 0 otherwise.
• EM steps (a minimal Python sketch of these four steps is given below):
1. Initialise the parameters θ of the model
2. E step: Find the posterior probabilities of the latent variables Z given the current parameters θ
3. M step: Re-estimate the parameter values given the current posterior probabilities, i.e. use the computed values of Z to re-estimate θ
4. Iterate steps 2 and 3 until convergence
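As a rough illustration (not part of the original slides), a minimal NumPy/SciPy sketch of these four steps might look as follows; the function name em_gmm and all numerical details are hypothetical:

import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=100, seed=0):
    # Minimal EM sketch for a GMM; X is an (N, d) array with d >= 2, K the number of components
    rng = np.random.default_rng(seed)
    N, d = X.shape
    # Step 1: initialise theta = {pi_k, mu_k, Sigma_k}
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(N, size=K, replace=False)]          # K random points as initial means
    Sigma = np.stack([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(K)])
    for _ in range(n_iter):
        # Step 2 (E step): responsibilities gamma[i, k] = p(z_ik = 1 | x_i)
        dens = np.column_stack([pi[k] * multivariate_normal(mu[k], Sigma[k]).pdf(X)
                                for k in range(K)])
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # Step 3 (M step): re-estimate pi, mu, Sigma from the responsibilities
        Nk = gamma.sum(axis=0)
        pi = Nk / N
        mu = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            Sigma[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(d)
        # Step 4: iterate (a full implementation would stop once the log-likelihood converges)
    return pi, mu, Sigma

For example, em_gmm(np.vstack([np.random.randn(200, 2), np.random.randn(200, 2) + 4]), K=2) should recover two components centred roughly at (0, 0) and (4, 4).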
EM more in depth
• Example:
The hidden variable indicates, for each point, which Gaussian (component) generated it.
EM more in depth
• E step:
For each point, estimate the probability that each Gaussian generated it.
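In symbols (consistent with Eq1 in the appendix), the E step computes, for every point x_i and every component k, the responsibility

\gamma(z_{ik}) = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}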
EM more in depth
• M step:
For each Gaussian, re-estimate its parameters (mean, covariance, mixing weight) using the probabilities computed in the E step, with each point weighted by how strongly it belongs to that Gaussian.
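In symbols, with N_k = \sum_{i=1}^{N} \gamma(z_{ik}) the effective number of points assigned to component k, the M step applies the standard GMM updates (derived in the appendix):

\pi_k = \frac{N_k}{N}, \qquad
\mu_k = \frac{1}{N_k} \sum_{i=1}^{N} \gamma(z_{ik})\, x_i, \qquad
\Sigma_k = \frac{1}{N_k} \sum_{i=1}^{N} \gamma(z_{ik})\,(x_i - \mu_k)(x_i - \mu_k)^{T}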
K-means vs GMM
K-means
• Objective function: minimise the within-cluster sum of squared distances
– Performs hard assignment during the E-step
• Assumes spherical clusters, each with equal probability

GMM
• Objective function: maximise the (log-)likelihood of the data
– Performs soft assignment during the E-step
• Can be used for non-spherical clusters
• Can generate clusters with different probabilities
(the hard vs soft assignment difference is illustrated in the sketch below)
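As a sketch of that difference (assuming scikit-learn is installed; the synthetic data below is arbitrary), K-means returns one hard label per point while a GMM returns a probability per point and cluster:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
               rng.normal(4.0, 1.0, size=(100, 2))])

hard_labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)           # hard assignment
soft_probs  = GaussianMixture(n_components=2).fit(X).predict_proba(X)  # soft assignment
print(hard_labels[:3])
print(soft_probs[:3].round(3))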
Gaussian Mixture Models (GMMs)
• Strengths
– Give probabilistic (soft) cluster assignments
– Have a probabilistic interpretation
– Can handle clusters with varying sizes, variances, etc.
• Weaknesses
– Initialisation matters
– An appropriate form (and number) of component distributions must be chosen
– Prone to overfitting
Appendix
Expectation-Maximisation
Expectation-Maximisation
• Let's suppose we want to know the probability that a data point x_n comes from Gaussian k. We can express this as:

p(z_k = 1 \mid x_n)

• In words: "given a data point x_n, what is the probability that it came from Gaussian k?"
• Here, z is a latent (hidden, unknown) variable that takes only two possible values: it is one when x came from Gaussian k, and zero otherwise.
• Knowing the probability of occurrence of z will be useful in helping us determine the Gaussian mixture parameters.
• Likewise, we can state the following:

p(z_k = 1) = \pi_k

• This means that the overall probability of observing a point that comes from Gaussian k is actually equivalent to the mixing coefficient of that Gaussian.
• Now let z be the set of all latent variables, hence:

z = \{z_1, \dots, z_K\}
Expectation-Maximisation
• Each z_k occurs independently of the others, and z_k can only take the value one when k is equal to the cluster the point comes from. Therefore:

p(z) = \prod_{k=1}^{K} \pi_k^{\,z_k}

• Now, what about the probability of observing our data given that it came from Gaussian k? It turns out to be the Gaussian function itself! Following the same logic we used to define p(z), we can state:

p(x_n \mid z) = \prod_{k=1}^{K} \mathcal{N}(x_n \mid \mu_k, \Sigma_k)^{\,z_k}

• The aim is to determine the probability of z given our observation x. It turns out that the equations we have just derived, along with Bayes' rule, will help us determine this probability. From the product rule of probabilities, we know that:

p(x_n, z) = p(x_n \mid z)\, p(z)
Expectation-Maximisation
• To get p(x_n) rather than p(x_n, z), we just need to sum the joint probability over z:

p(x_n) = \sum_{z} p(x_n, z) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)

• This is the equation that defines a Gaussian Mixture. To determine the optimal values of the parameters we need to maximise the likelihood of the model. We can find the likelihood as the joint probability of all observations x_n, defined by:

p(X \mid \theta) = \prod_{n=1}^{N} p(x_n) = \prod_{n=1}^{N} \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)
Expectation-Maximisation
• Now, remember that our aim is to find the probability of z given x.
• From Bayes' rule, we know that:

p(z_k = 1 \mid x_n) = \frac{p(x_n \mid z_k = 1)\, p(z_k = 1)}{p(x_n)}

• Substituting the expressions derived above, we have:

\gamma(z_{nk}) \;\equiv\; p(z_k = 1 \mid x_n) = \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)} \qquad \text{(Eq1)}
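As a small numerical check of Eq1 (the numbers are made up): for a 1-D mixture of two equally weighted Gaussians centred at 0 and 4, a point at x = 1 is assigned almost entirely to the first component.

import numpy as np
from scipy.stats import norm

pi    = np.array([0.5, 0.5])
mu    = np.array([0.0, 4.0])
sigma = np.array([1.0, 1.0])
x = 1.0

numerator = pi * norm.pdf(x, loc=mu, scale=sigma)   # pi_k * N(x | mu_k, sigma_k^2)
gamma = numerator / numerator.sum()                 # Eq1: p(z_k = 1 | x)
print(gamma)                                        # roughly [0.98, 0.02]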
Expectation-Maximisation
• Let us now define the steps that the general EM algorithm will follow.
• Step 1: Initialise θ accordingly. For instance, we can use the results obtained by a previous K-Means run as a good starting point for our algorithm.
• Step 2 (Expectation step): Evaluate

Q(\theta^{*}, \theta) = \mathbb{E}_{p(Z \mid X, \theta)}\!\left[\ln p(X, Z \mid \theta^{*})\right] \qquad \text{(Eq2)}

• The expectation step amounts to calculating the value of γ, so if we replace Eq1 in Eq2 we get:

Q(\theta^{*}, \theta) = \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma(z_{nk}) \, \ln\!\left[\pi_k^{*}\, \mathcal{N}(x_n \mid \mu_k^{*}, \Sigma_k^{*})\right] \qquad \text{(Eq3)}
Expectation-Maximisation
• The log of the expression inside Eq3 is given by:

\ln\!\left[\pi_k^{*}\, \mathcal{N}(x_n \mid \mu_k^{*}, \Sigma_k^{*})\right] = \ln \pi_k^{*} + \ln \mathcal{N}(x_n \mid \mu_k^{*}, \Sigma_k^{*}) \qquad \text{(Eq4)}

• Now, we replace Eq4 in Eq3:

Q(\theta^{*}, \theta) = \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma(z_{nk}) \left[\ln \pi_k^{*} + \ln \mathcal{N}(x_n \mid \mu_k^{*}, \Sigma_k^{*})\right] \qquad \text{(Eq5)}
Expectation-Maximisation
• We can now determine the parameters by maximum likelihood. Let us take the derivative of Q with respect to π_k, adding a Lagrange multiplier λ for the constraint that the mixing coefficients sum to one, and set it equal to zero:

\frac{\partial}{\partial \pi_k}\!\left[Q - \lambda\!\left(\sum_{j=1}^{K}\pi_j - 1\right)\right] = \sum_{n=1}^{N} \frac{\gamma(z_{nk})}{\pi_k} - \lambda = 0

• By rearranging the terms and applying a summation over k to both sides of the equation, we obtain:

\sum_{n=1}^{N} \sum_{k=1}^{K} \gamma(z_{nk}) = \lambda \sum_{k=1}^{K} \pi_k

• We know that the summation of all mixing coefficients π equals one. In addition, summing the probabilities γ over k also gives 1 for each point. Thus we get λ = N. Using this result, we can solve for π_k:

\pi_k = \frac{1}{N} \sum_{n=1}^{N} \gamma(z_{nk})
Expectation-Maximisation
• Similarly, if we differentiate Q with respect to μ_k and Σ_k, equate the derivatives to zero and solve for the parameters, we obtain:

\mu_k = \frac{\sum_{n=1}^{N} \gamma(z_{nk})\, x_n}{\sum_{n=1}^{N} \gamma(z_{nk})}, \qquad
\Sigma_k = \frac{\sum_{n=1}^{N} \gamma(z_{nk})\,(x_n - \mu_k)(x_n - \mu_k)^{T}}{\sum_{n=1}^{N} \gamma(z_{nk})}

• We then use these revised values to determine γ in the next EM iteration, and so on, until the likelihood value converges.
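As a practical note (assuming scikit-learn is available; the synthetic data below is arbitrary), sklearn.mixture.GaussianMixture is fitted with this EM procedure (plus some numerical safeguards), and its fitted attributes correspond to the quantities above:

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, size=(200, 1)),
               rng.normal( 3.0, 0.5, size=(150, 1))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
print(gmm.weights_)               # estimated mixing coefficients pi_k
print(gmm.means_.ravel())         # estimated means mu_k
print(gmm.covariances_.ravel())   # estimated variances Sigma_k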