Expectation-Maximization (EM) Algorithm With Example
Real-life Data Science problems are far removed from what we see in Kaggle
competitions or online hackathons. Before working professionally, I thought Data
Science meant being handed some data, cleaning it up a bit (the usual steps),
engineering some features, then picking a few models from Sklearn or Keras and
training them.
But things aren’t that easy. That initial step of getting the data into shape is where it
is decided whether your model will give good results or not. Most of the time, there
exist features that are observable in some cases but missing in others (which we
casually record as NaN). If we can determine these missing features, our predictions
will be far better than if we substitute NaNs, the mean, or some other filler.
https://medium.com/data-science-in-your-pocket/expectation-maximization-em-algorithm-explained-288626ce220e
3/20/25, 4:26 PM Expectation-Maximization (EM) Algorithm with example | by Mehul Gupta | Data Science in your pocket | Medium
Suppose we have two coins, where the bias of the 1st coin is ‘Θ_A’ and that of the
2nd is ‘Θ_B’, with both lying between 0 and 1. By bias ‘Θ_A’ & ‘Θ_B’, I mean that the
probability of Heads with the 1st coin isn’t 0.5 (as for an unbiased coin) but ‘Θ_A’,
and similarly for the 2nd coin this probability is ‘Θ_B’.
Can we estimate ‘Θ_A’ & ‘Θ_B’ if we are given some trials (flip results) of
these coins?
Simple!!
Consider the Blue rows as the 2nd coin’s trials and the Red rows as the 1st coin’s trials.
We can simply count the number of Heads each coin produced and divide by the
total number of flips made with that coin. This gives us the values of ‘Θ_A’ & ‘Θ_B’
directly, as shown below
Θ_A = 24/30 = 0.8
Θ_B = 9/20 = 0.45
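When the coin labels are known, the estimate is just this head-count average. A minimal sketch in Python; the per-row head counts (5 rows of 10 flips each) and the red/blue row-to-coin labels are read off the figure, so treat them as assumptions:

```python
# Head counts for the 5 experiments (10 flips each), as read off the figure.
heads = [5, 9, 8, 4, 7]
# Row labels from the figure's colouring: 'A' = red rows (1st coin), 'B' = blue rows (2nd coin).
labels = ['B', 'A', 'A', 'B', 'A']
FLIPS_PER_EXPERIMENT = 10

def mle_bias(coin):
    """Maximum-likelihood bias: total heads / total flips for that coin."""
    total_heads = sum(h for h, c in zip(heads, labels) if c == coin)
    total_flips = FLIPS_PER_EXPERIMENT * labels.count(coin)
    return total_heads / total_flips

theta_A = mle_bias('A')  # 24/30 = 0.8
theta_B = mle_bias('B')  # 9/20 = 0.45
print(theta_A, theta_B)
```

This is the easy, fully-labelled case; the rest of the article deals with what happens when the labels are hidden.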
Now, however, suppose we can’t differentiate between the samples: we don’t know
which row was produced by which coin.
EM Algorithm Steps:
1. Assume some random initial values for the unknown parameters ‘Θ_A’ & ‘Θ_B’:
By the way, do you remember the binomial distribution from somewhere in your
school life?
The binomial distribution models the probability of a system with only 2 possible
outcomes (binary), where we perform ‘n’ independent trials and wish to know the
probability of a certain combination of successes & failures, using the formula

P(X = x) = [n! / ((n − x)! · x!)] · p^x · (1 − p)^(n − x)

where ‘p’ is the probability of success on a single trial, ‘x’ is the number of
successes, and ‘n’ is the number of trials.
Now, if you have a good memory, you might remember why we multiply by the
combination constant n! / ((n − x)! · x!).
This term is included when we aren’t aware of the sequence in which the events
took place. For example:
Suppose I say I had 10 tosses, of which 5 were heads and the rest tails. Here we
multiply by that constant because we don’t know the order in which this happened
(HHHHHTTTTT, HTHTHTHTHT, or some other sequence; there are many
sequences in which it could have occurred). But if I am given the exact sequence of
events, we can drop this constant.
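As a quick sanity check on the difference that constant makes, here is the calculation for 10 fair tosses with 5 heads, using only Python’s standard library:

```python
from math import comb

n, x, p = 10, 5, 0.5  # 10 fair tosses, 5 heads

# Probability of one specific sequence, e.g. HHHHHTTTTT:
p_sequence = p**x * (1 - p)**(n - x)

# Probability of "5 heads in any order": multiply by the number of orderings.
p_any_order = comb(n, x) * p_sequence

print(p_sequence)   # 0.5**10 ≈ 0.000977
print(p_any_order)  # 252 * 0.5**10 ≈ 0.246
```

The constant only rescales the probability by the same factor for every hypothesis, which is why it can be dropped when comparing coins on a known sequence.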
Remember this!!
Coming back to the EM algorithm: what we have done so far is assume two initial
values for ‘Θ_A’ & ‘Θ_B’, say Θ_A = 0.6 & Θ_B = 0.5.
2. Expectation Step:
We assume that each experiment/trial (experiment: each row with a sequence of
Heads & Tails in the grey box in the image) was performed using only one specific coin
(either the 1st or the 2nd, but not both). The grey box contains 5 experiments.
Look at the first experiment, with 5 Heads & 5 Tails (1st row, grey block).
Using the binomial distribution, we estimate the probability that the 1st experiment
was carried out with the 1st coin, which has bias ‘Θ_A’ (where Θ_A = 0.6 in the 1st
step).
As we already know the sequence of events, I will drop the constant part of the
equation. Hence the probability of these results, if the 1st experiment belonged to
the 1st coin, is

Θ_A^5 × (1 − Θ_A)^5 = 0.6^5 × 0.4^5 ≈ 0.0008

Similarly, if the 1st experiment belonged to the 2nd coin with bias ‘Θ_B’ (where
Θ_B = 0.5 for the 1st step), the probability of these results is

Θ_B^5 × (1 − Θ_B)^5 = 0.5^5 × 0.5^5 ≈ 0.0010

On normalizing these two values, we get approximately 0.45 & 0.55 respectively,
and we credit the experiment’s counts to each coin in that proportion:
0.45 × 5 Heads, 0.45 × 5 Tails ≈ 2.2 Heads, 2.2 Tails for the 1st coin (bias ‘Θ_A’)
Similarly,
0.55 × 5 Heads, 0.55 × 5 Tails ≈ 2.8 Heads, 2.8 Tails for the 2nd coin
We can calculate the values for the other four experiments in the same way to fill up
the table on the right.
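The E-step arithmetic for the first experiment can be checked in a few lines. This is a sketch with the binomial constant dropped (it cancels during normalization), using the initial guesses Θ_A = 0.6 and Θ_B = 0.5:

```python
theta_A, theta_B = 0.6, 0.5   # initial guesses for the biases
h, t = 5, 5                   # 1st experiment: 5 Heads, 5 Tails

# Likelihood of the observed flips under each coin (constant dropped).
p_A = theta_A**h * (1 - theta_A)**t
p_B = theta_B**h * (1 - theta_B)**t

# Normalize to get each coin's responsibility for this experiment.
w_A = p_A / (p_A + p_B)       # ≈ 0.45
w_B = p_B / (p_A + p_B)       # ≈ 0.55

# Expected Head/Tail counts credited to each coin.
print(w_A * h, w_A * t)       # ≈ 2.2 Heads, 2.2 Tails for the 1st coin
print(w_B * h, w_B * t)       # ≈ 2.8 Heads, 2.8 Tails for the 2nd coin
```

Running the same few lines with the other four rows’ Head/Tail counts fills in the rest of the table.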
Once that is done, we calculate the total number of Heads & Tails credited to each
coin.
3. Maximization Step:
Now, what we want to do is converge to the correct values of ‘Θ_A’ & ‘Θ_B’.
Since the bias represents the probability of a Head, the revised bias for each coin is
its expected number of Heads divided by its expected total number of flips:

Θ_A = 21.3 / (21.3 + 8.6) ≈ 0.71

Similarly,

Θ_B = 11.7 / (11.7 + 8.4) ≈ 0.58

Now we switch back to the Expectation step using the revised biases, and repeat the
two steps until the values converge.
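Putting the two steps in a loop gives the whole algorithm. A minimal sketch over the five experiments, assuming the head counts 5, 9, 8, 4, 7 (out of 10 flips each) read off the figure; one pass reproduces the revised biases Θ_A ≈ 0.71 and Θ_B ≈ 0.58:

```python
heads = [5, 9, 8, 4, 7]   # head count per experiment, read off the figure
N = 10                    # flips per experiment

def em_step(theta_A, theta_B):
    """One E-step + M-step over all experiments; returns the revised biases."""
    hA = tA = hB = tB = 0.0
    for h in heads:
        t = N - h
        # E-step: responsibility of each coin (binomial constant cancels).
        p_A = theta_A**h * (1 - theta_A)**t
        p_B = theta_B**h * (1 - theta_B)**t
        w_A = p_A / (p_A + p_B)
        # Credit expected Head/Tail counts to each coin.
        hA += w_A * h
        tA += w_A * t
        hB += (1 - w_A) * h
        tB += (1 - w_A) * t
    # M-step: revised bias = expected Heads / expected flips.
    return hA / (hA + tA), hB / (hB + tB)

theta_A, theta_B = em_step(0.6, 0.5)
print(round(theta_A, 2), round(theta_B, 2))  # 0.71 0.58
```

Calling `em_step` repeatedly with its own output, until the two values stop changing, completes the algorithm.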