Companion To Marketing Data Miner

The Marketing Data Miner tool allows users to perform k-means clustering on market data to uncover customer segments. It comes with default data but users can input their own data about customers' ratings of product attributes and demographics. The tool runs k-means clustering with varying numbers of clusters and outputs graphs and tables showing the percentage of variance explained by each solution, the characteristics of the cluster centers, and cluster membership of each observation. This allows users to identify the optimal number of clusters for segmenting the customer base.
Marketing (Professor Youjung Jun)

Companion to “Marketing Data Miner”


The Marketing Data Miner is an Excel-based tool that allows the user to cluster data (using K-means) to
uncover segments in the market.

Entering Data

The tool comes with a default dataset. To use your own data, go to the “Input Data” worksheet, press
“clean data”, and enter your data. Your data should contain scores (e.g., partworths, importance
ratings) on several dimensions (e.g., attribute levels, evaluation dimensions) for several units
(e.g., consumers).

Suppose you want to analyze data on 10 consumers who rated the importance of 5 dimensions (Durability,
Service, Design, Prestige, and Affordability) on a 1 to 7 scale, with 1 being not at all important and
7 being extremely important. You also have demographic data on these customers: age in years, years of
education completed, and gender. The input would look like this:

K-means

K-means is a method for clustering observations. Simply press the “K-MEANS” button in the “Input
Data” tab. This will run K-means clustering with 1 cluster, 2 clusters, etc., all the way to 10 clusters. The
output is contained in the tab “K-Means” and looks like this:

“Percentage of explained variance” captures the proportion of the total variance in the data that is
retained when each point is approximated by its cluster center. Ideally, each point lies close to its
cluster center, so that the variance under this approximation is close to the true variance in the data
(a large proportion of explained variance). The graph shows the impact of adding more clusters on the
percentage of explained variance. The more clusters we allow, the finer the resolution and the better
we can explain the variance in the data. In the above case, there are 10 customers being clustered. If
we allow 10 clusters, each customer is its own cluster and the clusters match the data perfectly, so
the percentage of explained variance is 100%. If we allow only 2 clusters, only 51% of the variance is
explained. The graph therefore helps identify a good number of clusters: one that is not too large yet
still explains much of the variance in the data. In the above case, 3 seems like the right number of
clusters: there is a big jump in the proportion of variance explained between 2 and 3 clusters, after
which the curve flattens as more clusters are added.
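
The calculation behind such a graph can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual Excel implementation: the ratings are made-up values for 6 hypothetical respondents, the function names are invented for this example, and the share of explained variance is taken to be 1 minus the within-cluster sum of squares divided by the total sum of squares.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def kmeans(points, k, iters=100, seed=0):
    """Plain K-means: assign each point to its nearest center, then recompute centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # k distinct starting points
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster keeps its previous center.
            new_centers.append(mean_point(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

def explained_variance(points, centers, labels):
    """Share of total variance retained when each point is replaced by its cluster center."""
    grand_mean = mean_point(points)
    total = sum(sq_dist(p, grand_mean) for p in points)
    within = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return 1.0 - within / total

# Hypothetical 1-to-7 importance ratings (made up for illustration), in the order
# Durability, Service, Design, Prestige, Affordability.
ratings = [
    [2, 7, 3, 2, 3],
    [3, 6, 2, 3, 2],
    [2, 3, 7, 6, 2],
    [3, 2, 6, 7, 3],
    [7, 2, 2, 3, 6],
    [6, 3, 3, 2, 7],
]

explained = {}
for k in range(1, len(ratings) + 1):
    centers, labels = kmeans(ratings, k)
    explained[k] = explained_variance(ratings, centers, labels)

for k, share in explained.items():
    print(f"{k} clusters: {share:.0%} of variance explained")
```

With a single cluster every point is approximated by the overall mean, so nothing is explained; with as many clusters as distinct observations, every point is its own center and everything is explained. A good number of clusters is where the printed percentages stop rising sharply.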

“Cluster Centers” gives the center of each cluster, i.e., the average observation in each cluster. In
our case, with 3 clusters, the first cluster captures customers who care about Service; the second
cluster cares primarily about Design and Prestige; the third cluster cares about Durability and
Affordability. The clusters are fairly similar in size.
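
As a rough illustration of what a cluster center is, the sketch below averages the ratings of the members of each cluster and counts the members. The ratings and cluster assignments are made-up values chosen only to mimic the pattern described above, and `cluster_centers` is a hypothetical helper, not something the tool provides.

```python
ATTRIBUTES = ["Durability", "Service", "Design", "Prestige", "Affordability"]

def cluster_centers(rows, labels):
    """Return ({cluster: average rating per attribute}, {cluster: member count})."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    centers, sizes = {}, {}
    for label in sorted(grouped):
        members = grouped[label]
        sizes[label] = len(members)
        centers[label] = [sum(col) / len(members) for col in zip(*members)]
    return centers, sizes

# Hypothetical ratings and cluster assignments (made up for illustration).
ratings = [
    [2, 7, 3, 2, 3],
    [3, 6, 2, 3, 2],
    [2, 3, 7, 6, 2],
    [3, 2, 6, 7, 3],
    [7, 2, 2, 3, 6],
    [6, 3, 3, 2, 7],
]
membership = [1, 1, 2, 2, 3, 3]

centers, sizes = cluster_centers(ratings, membership)
for label, center in centers.items():
    profile = dict(zip(ATTRIBUTES, center))
    print(f"Cluster #{label} (n={sizes[label]}): {profile}")
```

Reading each center row for its highest averages is what supports an interpretation such as "this cluster cares about Service".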

“Cluster Membership” tells us the cluster to which each observation belongs. With 3 clusters, we look in
the “3 clusters” column and see that respondent 1 belongs to Cluster #3, respondent 4 to Cluster #1,
respondent 7 to Cluster #2, etc.

“Demographics”: When running the cluster analysis you can, if you wish, compare the clusters on other
variables such as demographics. To add these comparisons, click “yes” under “Include Demographics” in
the “Input Data” worksheet. The output then presents the comparison in the right-most table of the
“K-Means” worksheet. For example, in the 3-cluster solution, cluster 3 is on average a bit older
(average age = 49.33) than cluster 2 (average age = 46.25), which in turn is older than cluster 1
(average age = 40.33).
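
A comparison of this kind amounts to averaging each demographic variable within each cluster. The sketch below assumes gender is coded as a 0/1 indicator, so that its average is a proportion; the records, field names, and the `demographic_means` helper are all made up for illustration and do not reproduce the figures above.

```python
def demographic_means(records, labels, fields):
    """Average each demographic field within each cluster."""
    grouped = {}
    for record, label in zip(records, labels):
        grouped.setdefault(label, []).append(record)
    return {
        label: {f: sum(r[f] for r in members) / len(members) for f in fields}
        for label, members in sorted(grouped.items())
    }

# Hypothetical demographics (made up for illustration); "female" is a 0/1 indicator.
records = [
    {"age": 30, "education": 16, "female": 1},
    {"age": 40, "education": 12, "female": 0},
    {"age": 45, "education": 14, "female": 1},
    {"age": 47, "education": 18, "female": 1},
    {"age": 50, "education": 12, "female": 0},
    {"age": 52, "education": 12, "female": 0},
]
membership = [1, 1, 2, 2, 3, 3]

summary = demographic_means(records, membership, ["age", "education", "female"])
for label, means in summary.items():
    print(f"Cluster #{label}: {means}")
```

The demographics are only summarized after the fact; in this sketch they play no role in forming the clusters.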
