
This corresponds to extending the sum in Figure 3-10 to include more and more terms.

Using as many components as there are pixels would mean that we would not discard
any information after the rotation, and we would reconstruct the image perfectly.
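
To make the reconstruction idea concrete, here is a minimal sketch (not the book's code; it assumes the flattened face images are available as X_train, as earlier in this chapter):

from sklearn.decomposition import PCA

# Keep only the first 100 components; this discards some information
pca = PCA(n_components=100).fit(X_train)
X_train_pca = pca.transform(X_train)
# Rotate back into pixel space; the result approximates X_train
X_reconstructed = pca.inverse_transform(X_train_pca)
# With n_components equal to the number of pixels, the reconstruction
# would be exact (up to floating-point error)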
We can also try to use PCA to visualize all the faces in the dataset in a scatter plot
using the first two principal components (Figure 3-12), with classes given by who is
shown in the image, similarly to what we did for the cancer dataset:
In[33]:
# X_train_pca is the PCA-transformed face data from the previous section
mglearn.discrete_scatter(X_train_pca[:, 0], X_train_pca[:, 1], y_train)
plt.xlabel("First principal component")
plt.ylabel("Second principal component")

Figure 3-12. Scatter plot of the faces dataset using the first two principal components (see
Figure 3-5 for the corresponding image for the cancer dataset)

As you can see, when we use only the first two principal components, the whole data
is just a big blob, with no separation of classes visible. This is not very surprising,
given that even with 10 components, as shown earlier in Figure 3-11, PCA only
captures very rough characteristics of the faces.



Non-Negative Matrix Factorization (NMF)
Non-negative matrix factorization is another unsupervised learning algorithm that
aims to extract useful features. It works similarly to PCA and can also be used for
dimensionality reduction. As in PCA, we are trying to write each data point as a
weighted sum of some components, as illustrated in Figure 3-10. But whereas in PCA
we wanted components that were orthogonal and that explained as much variance of
the data as possible, in NMF, we want the components and the coefficients to be non-
negative; that is, we want both the components and the coefficients to be greater than
or equal to zero. Consequently, this method can only be applied to data where each
feature is non-negative, as a non-negative sum of non-negative components cannot
become negative.
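
As a concrete illustration, here is a minimal sketch (assuming scikit-learn's NMF; the data here is invented) showing that both the learned components and the per-point coefficients come out non-negative:

import numpy as np
from sklearn.decomposition import NMF

# Invented non-negative data: 100 points with 5 non-negative features
X = np.abs(np.random.RandomState(0).normal(size=(100, 5)))
nmf = NMF(n_components=2, init='nndsvd', random_state=0)
W = nmf.fit_transform(X)   # coefficients, shape (100, 2), all >= 0
H = nmf.components_        # components, shape (2, 5), all >= 0
# Each row of X is approximated by the non-negative weighted sum W @ H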
The process of decomposing data into a non-negative weighted sum is particularly
helpful for data that is created as the addition (or overlay) of several independent
sources, such as an audio track of multiple people speaking, or music with many
instruments. In these situations, NMF can identify the original components that
make up the combined data. Overall, NMF leads to more interpretable components
than PCA, as negative components and coefficients can lead to hard-to-interpret
cancellation effects. The eigenfaces in Figure 3-9, for example, contain both positive and
negative parts, and as we mentioned in the description of PCA, the sign is actually
arbitrary. Before we apply NMF to the face dataset, let’s briefly revisit the synthetic
data.
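
As an aside, the overlay idea can be sketched in a few lines; the source signals and mixing weights below are invented purely for illustration:

import numpy as np
from sklearn.decomposition import NMF

rng = np.random.RandomState(0)
t = np.linspace(0, 1, 500)
# Two invented non-negative source signals
sources = np.vstack([np.abs(np.sin(40 * t)), (10 * t) % 1])
# 100 observed mixtures, each a non-negative weighted sum of the sources
mixing = np.abs(rng.normal(size=(100, 2)))
X = mixing @ sources
nmf = NMF(n_components=2, init='nndsvd', random_state=0)
W = nmf.fit_transform(X)
# Rows of nmf.components_ should roughly resemble the original sources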

Applying NMF to synthetic data


In contrast to PCA, we need to ensure that our data is positive for NMF to be able
to operate on it. This means that where the data lies relative to the origin (0, 0)
actually matters for NMF. Therefore, you can think of the non-negative components
that are extracted as directions from (0, 0) toward the data.
The following example (Figure 3-13) shows the results of NMF on the two-
dimensional toy data:
In[34]:
mglearn.plots.plot_nmf_illustration()
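
If mglearn is not available, a rough equivalent (a sketch of what the illustration is assumed to show, with invented data) is to fit NMF to positive two-dimensional points and draw the components as directions from the origin:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import NMF

rng = np.random.RandomState(0)
# Invented correlated 2D data; uniform entries keep every value non-negative
X = rng.uniform(size=(100, 2)) @ np.array([[2., 1.], [1., 2.]])
nmf = NMF(n_components=2, init='nndsvd', random_state=0).fit(X)
plt.scatter(X[:, 0], X[:, 1], s=10)
for comp in nmf.components_:
    # Each component is a direction from the origin toward the data
    plt.arrow(0, 0, comp[0], comp[1], head_width=0.1, color='red')
plt.xlabel("feature 1")
plt.ylabel("feature 2")
plt.show()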

