Multidimensional Scaling

Multidimensional Scaling (MDS) is a statistical method that visualizes relationships between objects by reducing high-dimensional data into lower dimensions while preserving distances. It is widely used across various fields such as psychology, marketing, geography, and biology to uncover patterns and insights. Key figures in its development include Torgerson, Shepard, and Kruskal, and it encompasses various types including classical, metric, non-metric, and generalized MDS.
MULTIDIMENSIONAL SCALING
WHAT IS MULTIDIMENSIONAL SCALING?
It is a statistical method that represents the relationships
between objects in a lower-dimensional space, using
similarity or dissimilarity data as input.

It visualizes the similarities or differences between a group
of objects or entities by converting high-dimensional data
into a more understandable two- or three-dimensional space.
This reduction seeks to preserve the fundamental
relationships present in the data, making analysis and
interpretation simpler.
BASIC CONCEPTS AND PRINCIPLES
• MDS transforms intricate, high-dimensional data into a
simpler, lower-dimensional form, facilitating easier
visualization and interpretation. The main aim is to develop a
spatial representation in which the distances between points
accurately correspond to their original similarities or
distinctions.

• This method focuses on preserving the initial proximities


among datasets, placing similar items closer together while
positioning dissimilar items further apart in the reduced
dimensional space.
BASIC CONCEPTS AND PRINCIPLES
• MDS (Multidimensional Scaling) helps researchers and analysts
uncover patterns and relationships in data visually, providing
significant insights into data structure. These insights are
essential for developing strategies in various fields, including
cognitive research, geographic information analysis, market trend
evaluations, and brand positioning.
PURPOSE OF MULTIDIMENSIONAL SCALING
Provide a visual depiction of the similarities or differences among data
points in a reduced-dimensional space (typically 2D or 3D). This aids in:

• Grasping patterns or structures within the data


• Recognizing clusters or anomalies
• Decreasing dimensions while maintaining distance relationships
• Streamlining intricate relationships for improved interpretation and
decision-making.
FOUNDERS OF MULTIDIMENSIONAL SCALING
Several researchers developed and advanced multidimensional scaling (MDS), but
the key figures are generally considered to be Torgerson, Shepard, and Kruskal.

• Warren S. Torgerson: He laid the mathematical foundation for classical MDS in the
1950s.
• Roger N. Shepard: Known for major contributions to non-metric MDS in the 1960s,
especially in psychology.
• Joseph B. Kruskal: Developed Kruskal’s non-metric MDS algorithm, which became
widely used.
TYPES OF MULTIDIMENSIONAL SCALING
Classical Multidimensional Scaling
Classical Multidimensional Scaling (CMDS) is a method that showcases the structure of
distance-like data in the form of a geometrical picture. CMDS is also referred to as Principal
Coordinates Analysis (PCoA), Torgerson Scaling, or Torgerson–Gower scaling.

It takes an input matrix displaying dissimilarities between pairs of items and generates an
output in the form of a coordinate matrix whose configuration minimizes a loss function known
as strain.

Stress is a measure of goodness of fit in multidimensional scaling. It is based on the
differences between the distances in the fitted configuration and the observed
(dis)similarities, that is, between the observed (dis)similarity matrix and the matrix
estimated from one or more stimulus dimensions.
Lower stress indicates a better fit.
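As a point of reference, one widely used form of this measure is Kruskal's stress-1. The notation here is an assumption introduced for illustration and does not come from the slides: $d_{ij}$ is the distance between points i and j in the fitted configuration, and $\hat{d}_{ij}$ is the corresponding (possibly transformed) observed dissimilarity.

```latex
\text{Stress-1} = \sqrt{\frac{\sum_{i<j} \left( d_{ij} - \hat{d}_{ij} \right)^{2}}{\sum_{i<j} d_{ij}^{2}}}
```

A value of 0 means the configuration reproduces the target dissimilarities exactly.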
TYPES OF MULTIDIMENSIONAL SCALING
Classical Multidimensional Scaling
This is how a Classical MDS algorithm tends to work:
It relies on the fact that the coordinate matrix $X$ can be derived by eigenvalue decomposition
from $B = XX'$. The matrix $B$ can be computed from the proximity matrix $D$ by double centering.
The steps involved are:
1. Set up the squared proximity matrix $D^{(2)} = [d_{ij}^2]$.
2. Apply double centering: $B = -\frac{1}{2} C D^{(2)} C$, using the centering matrix $C = I - \frac{1}{n} J_n$. Here, $n$
is the number of objects, $I$ is the $n \times n$ identity matrix, and $J_n$ is an $n \times n$ matrix of all ones.
3. Determine the $m$ largest eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ and the corresponding eigenvectors
$e_1, e_2, \ldots, e_m$ of $B$. Here $m$ is the number of dimensions desired for the output.
4. Compute $X = E_m \Lambda_m^{1/2}$. Here, $E_m$ is the matrix of the $m$ eigenvectors and $\Lambda_m$ is the diagonal
matrix of the $m$ eigenvalues of $B$.
This does not apply to direct dissimilarity ratings because Classical Multidimensional Scaling assumes
Euclidean distances.
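The double-centering and eigendecomposition steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the slides' own implementation: the function name `classical_mds` and its parameters are hypothetical, and negative eigenvalues (which arise when the input is not exactly Euclidean) are clipped to zero as a simplifying assumption.

```python
import numpy as np

def classical_mds(D, m=2):
    """Classical MDS sketch: D is an (n, n) symmetric distance matrix,
    m is the number of output dimensions (hypothetical helper)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    D2 = D ** 2                                  # squared proximities
    C = np.eye(n) - np.ones((n, n)) / n          # centering matrix C = I - J/n
    B = -0.5 * C @ D2 @ C                        # double centering
    eigvals, eigvecs = np.linalg.eigh(B)         # eigh returns ascending order
    idx = np.argsort(eigvals)[::-1][:m]          # indices of the m largest
    lam = np.clip(eigvals[idx], 0.0, None)       # assumption: clip negatives
    return eigvecs[:, idx] * np.sqrt(lam)        # X = E_m * Lambda_m^(1/2)
```

When the input distances are exactly Euclidean and `m` matches the true dimensionality, the pairwise distances of the returned coordinates reproduce the input; the coordinates themselves are only determined up to rotation, reflection, and translation.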
TYPES OF MULTIDIMENSIONAL SCALING
Metric Multidimensional Scaling
Metric MDS is a superset of classical multidimensional scaling. It generalizes the
optimization procedure to a variety of loss functions and to input matrices of
known distances, optionally with weights.

Non-metric Multidimensional Scaling


Unlike metric multidimensional scaling, non-metric multidimensional scaling
finds both a non-parametric monotonic relationship between the dissimilarities in
the item-item matrix and the Euclidean distances between items, and
the location of every item in the low-dimensional space.
TYPES OF MULTIDIMENSIONAL SCALING
Non-metric Multidimensional Scaling
The basic steps involved in a non-metric MDS algorithm are:
1. Identify a random configuration of points.
2. Calculate the distances d between the points.
3. Find the optimal monotonic transformation of the proximities, to obtain
optimally scaled data f(x).
4. Minimize the stress between the optimally scaled data and the distances by
finding a new configuration of points.
5. Compare the stress to some criterion. If the stress is small enough, exit
the algorithm; if not, return to step 2.
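The loop described above can be sketched with NumPy alone. Several choices here are assumptions that the slides do not specify: the monotonic transformation is computed with the pool-adjacent-violators algorithm, the new configuration comes from a Guttman-transform (SMACOF-style) update, stress is measured as Kruskal's stress-1, and the names `pava`, `_evaluate`, and `nonmetric_mds` are hypothetical.

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y."""
    blocks = []  # each block stores [sum of values, count]
    for v in y:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return np.concatenate([np.full(c, s / c) for s, c in blocks])

def _evaluate(X, iu, order):
    """Distances, monotone-fitted disparities, and stress-1 for X."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)[iu]
    dhat = np.empty_like(d)
    dhat[order] = pava(d[order])  # monotone in the rank order of proximities
    stress = float(np.sqrt(np.sum((d - dhat) ** 2) / np.sum(d ** 2)))
    return d, dhat, stress

def nonmetric_mds(delta, m=2, max_iter=300, tol=1e-6, seed=0):
    """Return (X, stress) for a symmetric dissimilarity matrix delta."""
    delta = np.asarray(delta, dtype=float)
    n = delta.shape[0]
    iu = np.triu_indices(n, k=1)
    order = np.argsort(delta[iu], kind="stable")
    X = np.random.default_rng(seed).normal(size=(n, m))  # step 1
    d, dhat, stress = _evaluate(X, iu, order)            # steps 2 and 3
    for _ in range(max_iter):
        # Step 4: one Guttman-transform update toward the disparities.
        target = dhat * np.sqrt(d.size / np.sum(dhat ** 2))  # fix the scale
        ratio = np.zeros((n, n))
        ratio[iu] = np.divide(target, d, out=np.zeros_like(d), where=d > 0)
        ratio = ratio + ratio.T
        B = -ratio
        B[np.diag_indices(n)] = ratio.sum(axis=1)
        X_new = B @ X / n
        d_new, dhat_new, stress_new = _evaluate(X_new, iu, order)
        # Step 5: stop once the stress no longer improves by at least tol.
        if stress - stress_new < tol:
            break
        X, d, dhat, stress = X_new, d_new, dhat_new, stress_new
    return X, stress
```

Because the starting configuration is random, the result can be a local minimum; in practice it is common to run several seeds and keep the configuration with the lowest stress.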
TYPES OF MULTIDIMENSIONAL SCALING
Generalized Multidimensional Scaling

This is an extension of metric multidimensional scaling. In GMDS, an


arbitrary smooth non-Euclidean space becomes the target space.
Generalized multidimensional scaling allows you to find the minimum-
distortion embedding of one surface into another when the
dissimilarities are distances on a surface and the target space is
another surface.
EXAMPLE OF MULTIDIMENSIONAL SCALING USES
Psychology and Cognitive Science
MDS is the conventional method used in psychology to examine
human perception, thought processes, and decision-making. It
assists psychologists in understanding how similarities and
differences between stimuli—such as words, images, or sounds—are
perceived.
Market Research and Marketing
In market research, MDS is utilized for brand positioning, product
placement, and market segmentation. Marketers use MDS to
visualize and analyze how consumers perceive brands, products, or
services, enabling them to make strategic decisions for marketing
campaigns.
EXAMPLE OF MULTIDIMENSIONAL SCALING USES
Geography and Cartography
MDS is applied in geography and cartography to explore and
understand the spatial relationships among locations, areas, or
geographical features. This application allows cartographers to
create maps that accurately reflect the true nature of
geographical entities and their relative distances.
Biology and Bioinformatics
In the field of biology, MDS is primarily used for phylogenetic
studies, predicting protein structures, and comparative genomics.
Bioinformaticians use MDS to illustrate and analyze similarities or
differences in genetic sequences, protein formations, or the
evolutionary relationships among various species.
EXAMPLE OF MULTIDIMENSIONAL SCALING USES

Social Sciences and Sociology


MDS is employed in sociology and other social sciences to analyze
social networks, intergroup relations, and cultural variations.
Sociologists apply MDS to survey data, questionnaire results, or
relational data to gain insights into social structures and dynamics.
WHEN TO USE?
The use of Multidimensional Scaling is most appropriate
when the goal of your analysis is to find the structure in
a set of distance measures between a single set of
objects or cases. This is accomplished by assigning
observations to specific locations in a conceptual low-
dimensional space so that the distances between points
in the space match the given (dis)similarities as closely
as possible. The result is a least-squares representation
of the objects in that low-dimensional space, which, in
many cases, will help you further understand your data.
ASSUMPTIONS
1 | variable specification
MDS requires AT LEAST 3 variables to be specified. MDS is
spatial in nature and needs to represent a 2D space, which is
not possible if fewer than 3 variables are given. Especially
in the analysis of distances, pairwise dissimilarities between only
2 variables would not be sufficient.
ASSUMPTIONS
2 | number of dimensions
The number of dimensions cannot exceed the number of objects minus one.
Objects, to recall, are the units being compared by their differences. With
n objects, at most n-1 dimensions are meaningful. Excess dimensions risk
overfitting.
ASSUMPTIONS
3 | symmetry
The proximity matrix should be symmetric: the similarity of
stimulus A to B must be the same as the similarity of stimulus
B to A.
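A quick way to check this assumption before fitting is sketched here. The helper name `symmetrize_proximities` is hypothetical, and averaging the two off-diagonal entries when they disagree is only one common convention, assumed here for illustration.

```python
import numpy as np

def symmetrize_proximities(P, tol=1e-8):
    """Return P unchanged if it is symmetric, otherwise the average of
    P and its transpose (assumed convention, hypothetical helper)."""
    P = np.asarray(P, dtype=float)
    if P.ndim != 2 or P.shape[0] != P.shape[1]:
        raise ValueError("proximity matrix must be square")
    if np.allclose(P, P.T, atol=tol):
        return P
    return (P + P.T) / 2.0
```

Whether averaging is appropriate depends on why the ratings differ; a large asymmetry may itself be worth examining rather than smoothing away.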
ASSUMPTIONS
4 | goodness of fit (stress)
The standard scale for the stress levels could be represented by
this table:
HOW TO RUN IN SPSS
initial steps
STEP 1: FROM THE MENUS CHOOSE: ANALYZE > SCALE > MULTIDIMENSIONAL SCALING
(PROXSCAL)... THIS OPENS THE DATA FORMAT DIALOG BOX.

STEP 2: SPECIFY THE FORMAT OF YOUR DATA:


DATA FORMAT. SPECIFY WHETHER YOUR DATA CONSIST OF PROXIMITY MEASURES OR YOU WANT
TO CREATE PROXIMITIES FROM THE DATA.

NUMBER OF SOURCES. IF YOUR DATA ARE PROXIMITIES, SPECIFY WHETHER YOU HAVE A SINGLE
SOURCE OR MULTIPLE SOURCES OF PROXIMITY MEASURES.
HOW TO RUN IN SPSS
initial steps
ONE SOURCE. IF THERE IS ONE SOURCE OF PROXIMITIES, SPECIFY WHETHER YOUR DATASET IS
FORMATTED WITH THE PROXIMITIES IN A MATRIX ACROSS THE COLUMNS OR IN A SINGLE
COLUMN WITH TWO SEPARATE VARIABLES TO IDENTIFY THE ROW AND COLUMN OF EACH
PROXIMITY.

STEP 3: CLICK DEFINE.


HOW TO RUN IN SPSS
proximities across columns
STEPS:
1. SELECT 3+ PROXIMITY VARIABLES (MATCH COLUMN ORDER).
2. (OPTIONAL) SELECT WEIGHTS FOR EACH.
3. (IF MULTIPLE SOURCES) SELECT A SOURCE VARIABLE (CASES = PROXIMITIES × SOURCES).
HOW TO RUN IN SPSS
proximities in columns
STEPS:
1. SELECT 2+ PROXIMITY VARIABLES.
2. SELECT ROWS AND COLUMNS VARIABLES TO DEFINE MATRIX POSITIONS.
3. (OPTIONAL) SELECT WEIGHTS.
HOW TO RUN IN SPSS
proximities in single column
STEPS:
1. SELECT THE PROXIMITY VARIABLE.
2. DEFINE ROW AND COLUMN IDENTIFIERS.
3. (IF MULTIPLE SOURCES) DEFINE A SOURCE VARIABLE.
4. (OPTIONAL) DEFINE WEIGHTS.
HOW TO RUN IN SPSS
create proximities from data
STEPS:
1. SELECT 3+ VARIABLES (FOR VARIABLE DISTANCES), OR 1 VARIABLE (FOR CASE DISTANCES).
2. (IF MULTIPLE SOURCES) SELECT A SOURCE VARIABLE.
3. CHOOSE A PROXIMITY MEASURE (E.G., EUCLIDEAN, CHI-SQUARE, ETC.).
HOW TO RUN IN SPSS
model configurations
SCALING MODEL:
IDENTITY = SAME CONFIG. FOR ALL.
WEIGHTED EUCLIDEAN = EACH SOURCE WEIGHTS DIMENSIONS.
GENERALIZED EUCLIDEAN = WEIGHTS + ROTATION PER SOURCE.
REDUCED RANK = GENERALIZED EUCLIDEAN WITH FEWER DIMENSIONS.
HOW TO RUN IN SPSS
model configurations
MATRIX SHAPE: USE UPPER, LOWER, OR FULL MATRIX (DIAGONAL INCLUDED).
PROXIMITIES TYPE: SIMILARITY OR DISSIMILARITY.
TRANSFORMATIONS: RATIO, INTERVAL, ORDINAL, SPLINE.
APPLY WITHIN SOURCE OR ACROSS SOURCES.
HOW TO RUN IN SPSS
restrictions
APPLY RESTRICTIONS TO THE COMMON SPACE:
NONE
SOME COORDINATES FIXED (PROVIDE VARIABLES)
LINEAR COMBINATION OF OTHER VARIABLES (WITH TRANSFORMATIONS)
HOW TO RUN IN SPSS
options
INITIAL CONFIGURATIONS
SIMPLEX: EQUAL SPACING
TORGERSON: CLASSICAL SCALING
RANDOM START(S): SINGLE OR MULTIPLE
CUSTOM: USE YOUR OWN COORDINATE VARIABLES
HOW TO RUN IN SPSS
options
ITERATION CRITERIA
STRESS CONVERGENCE
MINIMUM STRESS VALUE
MAXIMUM ITERATIONS
RELAXED UPDATES (FASTER, BUT FEWER MODELS SUPPORTED)
HOW TO RUN IN SPSS
available plot types
STRESS VS DIMENSIONS
COMMON SPACE (SCATTERPLOT MATRIX)
INDIVIDUAL SPACES (FOR MODELS WITH INDIVIDUAL DIFFS)
INDIVIDUAL WEIGHTS
ORIGINAL VS. TRANSFORMED PROXIMITIES
TRANSFORMED PROXIMITIES VS. DISTANCES
TRANSFORMED INDEPENDENT VARIABLES
INTERPRETING RESULTS
1 | stress value
As mentioned above, the stress value indicates the goodness of fit
of the MDS solution: the lower the stress, the better the fit.
INTERPRETING RESULTS
2 | dimensions
Using a 2D solution: implies that you have reduced your data to two
axes (i.e., shown by a 2D scatterplot).
Using a 3D solution: might provide a more accurate representation.
The tradeoff is that it is harder to visualize in 3D space.
INTERPRETING RESULTS
3 | item placement
The MDS procedure will give you coordinates for each item in the
reduced-dimensional space (usually X and Y coordinates for 2D MDS).
The relative positions of the items (represented as points) indicate
their similarity or dissimilarity. Items that are closer together in
the configuration are more similar, while those that are farther
apart are more dissimilar.
Check for patterns or clusters. Items that naturally group together
can provide insights into common themes or categories.
INTERPRETING RESULTS
4 | R-Squared for individual items
High RSQ values suggest that the item fits well in the reduced space.
Low RSQ values may indicate that the item is poorly represented in
the reduced space and may require further procedures or the
inclusion of more dimensions.
INTERPRETING RESULTS
5 | component loadings
High loadings for an item on a particular dimension suggest that the
item is strongly related to that dimension.
Low loadings suggest that the item contributes less to that
dimension.
INTERPRETING RESULTS
6 | scree plot
Look for the elbow point in the plot, where the slope of the
eigenvalues starts to flatten. This is generally the optimal number
of dimensions to keep.
A sharp drop in eigenvalues after a certain point suggests that
adding more dimensions will not meaningfully improve the solution.
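For a classical MDS solution, the values behind such a scree plot can be computed directly. The sketch below assumes a Euclidean-like distance matrix as input; the function names `mds_eigenvalues` and `explained_share` are hypothetical, and treating the cumulative share of positive eigenvalues as a guide to the number of dimensions is a heuristic, not a rule stated in the slides.

```python
import numpy as np

def mds_eigenvalues(D):
    """Eigenvalues of the double-centered matrix B, largest first."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    C = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * C @ (D ** 2) @ C              # double centering
    return np.sort(np.linalg.eigvalsh(B))[::-1]

def explained_share(eigvals):
    """Cumulative share of the positive eigenvalues (assumed heuristic)."""
    pos = np.clip(eigvals, 0.0, None)
    return np.cumsum(pos) / pos.sum()
```

Plotting the output of `mds_eigenvalues` against the dimension index gives the scree plot, and the point where `explained_share` levels off near 1 corresponds to the elbow.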
INTERPRETING RESULTS
7 | similarity measures (visual)
The closer two points are in the plot, the more similar the objects
are.
The farther apart two points are, the more different the objects are
perceived to be.
CASE STUDY
The application of
multidimensional scaling of
data to determining changes in
retailer customers’ preferences

Marcin Pełka, Antonio Irpino


INTRODUCTION
The COVID-19 pandemic has had a significant impact on different
aspects of human life and the economy. It affected 213 countries
and territories all over the world. Over 23 million people were
infected and forced to stay at home (Chanda & Kaul, 2022, p. 111).

Chandra and Kaul (2022) analysed the Indian beauty industry.


They investigated the possibility of a paradigm shift in this
industry due to the COVID-19 pandemic and the small chances of
its recovery at the granular level.
INTRODUCTION
A doctoral thesis by Gardner (2021) and a master’s thesis by Unger
(2022) analysed the impact of the COVID-19 pandemic on the health
and beauty market. Their results show that people limited the use
of all the types of cosmetics (face, eye, lip, and skin) compared to
the pre-COVID-19 times.

However, all these authors use classical data, where objects are
usually described by single-valued variables. This makes it
possible to represent them as a vector of quantitative or
qualitative measurements, where each column represents a
variable.
OBJECTIVES
The aim of this study is to determine which of the three
approaches within the method of multidimensional
scaling (i.e. multidimensional scaling of classical,
symbolic interval-valued or symbolic histogram data) is
most adequate for capturing the shifts in retailer
customers’ preferences that took place during the
pandemic.
METHODS
This study uses data on orders for 2019, 2020 and 2021 collected from
health and beauty retailers that offer cosmetics products. The sample
consists of small companies (usually family-managed) that are not a part
of large retail chains (e.g. Rossmann, Hebe, etc.).

Multidimensional scaling for classical data

Multidimensional scaling for symbolic interval-valued data

Multidimensional scaling for symbolic histogram data


IMPORTANCE
It is important to determine which of the three MDS approaches—
classical, interval-valued, or histogram-valued—is most effective in
capturing the shifts in retailers' customer preferences during the
pandemic. Doing so not only enhances previous research that relied
solely on classical data but also provides valuable insights for
marketing strategies and product planning. Moreover, identifying the
most suitable model leads to more accurate and detailed results,
particularly in highlighting the differences across the years 2019,
2020, and 2021.
RESULTS
In the popular classical multidimensional scaling, average values
were used. Although this approach fits the data, it was not able to
capture all the changes that happened in the beauty industry in
Poland in 2020 and 2021 as compared to 2019.

The interval-valued multidimensional scaling for symbolic data


yields better results in terms of the stress value and Pearson’s
correlation coefficient for delta and configuration distances.

The last approach, non-concentric rectangle multidimensional


scaling for symbolic histogram data is proved to be the best way
to capture, compare and analyze changes that happen on markets
over time.
TABLES AND GRAPHS
CONCLUSION
Traditional classical multidimensional scaling, while it fits the data,
does not show the differences between the years. Interval-valued
multidimensional scaling performs better than the classical approach, but
multidimensional scaling for symbolic histogram data proved to be the best.

Products that had been most popular before 2019 were also very popular in
2019 (as far as customers’ purchases are considered), with only slight changes
occurring. However, post 2019, face beauty products that were located ‘below
and near the mask line’ (e.g. lipsticks, blushers) were bought less often than
before, whereas eye beauty products (e.g. mascaras, eye shadows) became
more popular than before. It seems that customers still tried to enhance
their appearance, but by using other products.
Cross-oceanic distribution and
origin of microplastics in the
subsurface water of the South
China Sea and Eastern Indian
Ocean

Li et al.


INTRODUCTION
Microplastics are synthetic marine pollutants that have attracted attention
due to their potential for environmental harm (Derraik, 2002; Sarkar et al., 2021).
Marine microplastics have spread throughout atmospheric, aquatic, and sedimentary
environments, affecting marine ecosystems (Li et al., 2021b; Peng et al., 2020;
Wang et al., 2019b; Zhang et al., 2019).

Microplastics have been investigated worldwide, but because of limitations in
sampling equipment and the difficulty of obtaining information, microplastics
in subsurface layers remain to be explored.
OBJECTIVES

The aim of the conducted study is to determine the


distribution and the origin of microplastics in the
subsurface water of the South China Sea and Eastern
Indian Ocean.
METHODS
Subsurface seawater was sampled for microplastics (MP) during a research
cruise, with the vessel's speed ranging from 0 to 16 knots.
One-way ANOVA was used to analyze multiple comparisons across
regions.
Analysis of Similarities (ANOSIM) was performed to verify links and
differences of MP distributions among four environments.
Non-metric Multidimensional Scaling (NMDS) was used to display
the spatial pattern of MP communities.
RESULTS
(MULTIDIMENSIONAL SCALING)
The Pearl River Estuary was separated from the other
communities, whereas most particles in the samples from the South
China Sea, Strait Zone and Eastern Indian Ocean were grouped
together (Stress = 0.1, p = 0.001, R = 0.703).

This finding was attributed to factors such as the natural selection
of oceanographic processes and the weakening of land-based sources.
CONCLUSION
Microplastic content in subsurface water was determined using a pump-
underway ship intake system from the Pearl River Estuary to the Eastern
Indian Ocean.
MP abundance ranged between 0 and 4.97 items m⁻³, with an overall mean
value of 0.40 ± 0.62 items m⁻³.
No significant correlation was found between microplastic abundance
and the physical and chemical properties of water.
Microplastic communities varied significantly in different marine regions.
TABLES AND GRAPHS
REFERENCES
GeeksforGeeks. (2024, May 19). What is multidimensional scaling?
https://www.geeksforgeeks.org/what-is-multidimensional-scaling/

Multidimensional scaling. Engati. (n.d.).
https://www.engati.com/glossary/multidimensional-scaling

Groenen, P. J. F., & Borg, I. (2013). The past, present, and future of multidimensional
scaling. Econometric Institute.

The application of multidimensional scaling of data to determining changes in
retailer customers’ preferences. (2024). Questa Soft.
https://www.ceeol.com/search/article-detail?id=1242863
REFERENCES
IBM SPSS Statistics. (n.d.).
https://www.ibm.com/docs/no/spss-statistics/saas?topic=application-multidimensional-scaling

Li, C., Zhu, L., Wang, X., Liu, K., & Li, D. (2021). Cross-oceanic distribution and origin
of microplastics in the subsurface water of the South China Sea and Eastern Indian
Ocean. The Science of the Total Environment, 805, 150243.
https://doi.org/10.1016/j.scitotenv.2021.150243
THANK
YOU
