Deep Clustering Using 3D Attention

This study presents a novel deep clustering model tailored for hyperspectral image (HSI) analysis, addressing challenges related to high dimensionality and complex spatial-spectral characteristics. The model integrates principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) for dimensionality reduction, followed by a 3D attention convolutional autoencoder (3D-ACAE) to enhance feature extraction. Experimental results demonstrate that this approach yields superior clustering performance compared to traditional and existing deep clustering methods.


www.nature.com/scientificreports

Deep clustering using 3D attention convolutional autoencoder for hyperspectral image analysis
Ziyou Zheng 1, Shuzhen Zhang 1,2*, Hailong Song 1 & Qi Yan 1

Deep clustering has been widely applied in various fields, including natural image and language
processing. However, when it is applied to hyperspectral image (HSI) processing, it encounters
challenges due to the high dimensionality of HSI and its complex spatial-spectral characteristics. This
study introduces a deep clustering model specifically tailored for HSI analysis. To address the
high dimensionality issue, redundant dimensions of the HSI are first eliminated by combining principal
component analysis (PCA) with t-distributed stochastic neighbor embedding (t-SNE). The reduced
dataset is then input into a three-dimensional attention convolutional autoencoder (3D-ACAE) to
extract essential spatial-spectral features. The 3D-ACAE uses a spatial-spectral attention mechanism
to enhance the captured features. Finally, these enhanced features pass through an embedding layer to
create a compact data representation, which is divided into distinct
clusters by a clustering layer. Experimental results on three publicly available datasets validate the
superiority of the proposed model for HSI analysis.

A hyperspectral image (HSI) consists of numerous narrow and contiguous spectral bands that can capture subtle
spectral variations between ground objects, which is beneficial for several remote sensing applications1–7. Most of
these applications require classifying every pixel in the scene8,9. However, labeling HSI requires considerable
resources and effort, which hinders tasks such as object detection and classification.
Therefore, developing an HSI method with less reliance on labeled samples is crucial.
Dimensionality reduction of HSI is necessary because of its rich spectral features. Mainstream dimensionality
reduction methods fall into two kinds: feature extraction and band selection. Feature extraction maps the data
into a low-dimensional space by a projection transformation and uses the features in that low-dimensional space
in the subsequent processing. Classical methods include principal component analysis10 and linear discriminant
analysis11. Band selection chooses a subset of features directly from the original feature set to replace the full
set; well-known methods include band selection based on maximum information entropy12 and band selection
based on a distance metric13.
The use of classifiers to accomplish HSI classification using only the spectral information of the data has
been widely explored in early related studies, and many pattern recognition techniques, including Bayesian
discrimination14, support vector machines15,16, decision trees17, and sparse representations18, have been proven
to be effective HSI classifiers in a wide range of studies. However, since these methods only consider spectral
features, their classification results commonly contain a lot of noise.
To overcome this problem, many spatial-spectral feature extraction methods for HSI classification have been
proposed. For example, Zhang et al.19 proposed an active learning adaptive multi-view technique to achieve HSI
classification by incorporating spectral information and spatial information from segmentation maps into each
view. Hong et al.20 proposed an iterative multi-task regression method to learn a low-dimensional subspace by
considering both labeled and unlabeled samples.
Deep learning methods have received a lot of attention in recent years. Among them, convolutional neural
networks (CNNs) are widely used and typically comprise millions of trainable parameters.
When optimized effectively, CNNs improve the quality of extracted features and enhance classification
performance. These deep learning approaches are adopted in various applications, such as image classification and
object detection. Sharma et al.21 and Hamida et al.22 introduced the pioneering 2D-CNN and 3D-CNN approaches
tailored for HSI classification. Dong et al.23 performed a weighted fusion of features with CNN and graph
attention networks. These methods use deep learning to extract features that are more comprehensive than manual
features and consistently outperform traditional methods in classification performance. However, due to the
limited labeled samples of HSI, it is difficult for a supervised deep neural network (DNN) to realize its full
potential in HSI processing.

1College of Communication and Electronic Engineering, Jishou University, People’s South Road, Jishou 416000,
Hunan, China. 2Key Laboratory of Visual Perception and Artificial Intelligence, Hunan University, Lushan Road,
Changsha 410000, Hunan, China. *email: shuzhen_zhang@jsu.edu.cn

Scientific Reports | (2024) 14:4209 | https://doi.org/10.1038/s41598-024-54547-2
The challenge of scarce labeled HSI data and the associated performance issues with deep neural networks
(DNNs) have been mitigated through the application of unsupervised DNN techniques. Chen and colleagues24
utilized a deep autoencoder (DAE) for classifying HSI, incorporating principal component analysis (PCA) for
dimension reduction along with spatial-spectral information. Likewise, Mei and co-authors25 introduced a 3D
convolutional autoencoder (3D-CAE) designed to extract spatial-spectral features without supervision. Their
method integrates 3D operations, including convolution, pooling, and batch normalization,
thereby preserving the intrinsic 3D structure of HSI. However, owing to the inherent characteristics of
unsupervised DNN models, they generally lack support for end-to-end training.
Unsupervised clustering algorithms are frequently employed to categorize data into clusters according to
sample similarities. Clustering is the process of partitioning a dataset into different classes or clusters according
to a specific criterion, so that the similarity of data objects within the same cluster is as large as possible, while
the differences of data objects not in the same cluster are as large as possible. That is, after clustering, the data
of the same class are gathered together as much as possible, and the data of different classes are separated as
much as possible. The K-means algorithm26 organizes samples by iteratively computing the distance between
each sample and the cluster center. Nonetheless, K-means is susceptible to uneven clustering outcomes owing
to its initial random selection of sample points. Consequently, an enhanced version known as X-means27 has
been developed to tackle this issue. Moreover, various clustering algorithms, including spectral clustering28 and
subspace clustering29,30, have undergone extensive research.
Studies have shown that traditional clustering algorithms combined with DNN, called deep clustering31,
improve the effectiveness and accuracy of clustering algorithms by jointly optimizing DNN parameters and
clustering results. The deep clustering method involves employing a DNN as a feature extractor and incorporating
a layer designed to induce a clustering effect within the model structure. This specific layer is utilized to obtain
clustering results, subsequently guiding the parameter adjustments of the DNN based on these clustering results.
Deep clustering enables DNNs to learn more representative and discriminative features for clustering. In addition,
feature representation and clustering accuracy32 are significantly enhanced through iterative updates of the DNN
and clustering results. Deep clustering is categorized based on the DNN structure into autoencoder-based and
separated network-based deep clustering33. Autoencoder-based deep clustering methods usually insert a clustering
layer between the encoder and the decoder and directly use the clustering results to guide the network parameters,
whereas deep clustering based on separated networks separates the DNN from the clustering and trains the DNN
with the pseudo-labels obtained from the clustering. DeepCluster34 is a representative
separated network-based deep clustering method that alternates between K-means and network loss functions
for model optimization. In contrast, a deep clustering network (DCN)35, an instance of an autoencoder-based
deep clustering network, embeds clustering methods into the autoencoder for collaborative training purposes.
The deep clustering technique has been applied to HSI by some researchers. Nalepa et al. were the first to apply
deep clustering to HSI segmentation, using a 3D convolutional autoencoder (3D-CAE)36.
Tulczyjew et al. extended these experiments by proposing an asymmetric recurrent
neural network (RNN)-based AE37 for deep clustering, which replaces the convolutional operations with
recurrent neural networks. Experimental results indicated that both models were able to achieve good
performance.
However, existing deep clustering models still leave considerable room for improvement in clustering accuracy. The
attention mechanism, as a deep learning structure, is able to automatically learn and calculate the magnitude of
the contribution of the input data to the output data. Many studies have added attention mechanisms
to deep learning models to further enhance modeling performance. Mei et al.38 integrated RNN
and CNN with attention mechanisms to capture both spectral and spatial correlations. Ribalta et al.39 utilized a
CNN with an attention mechanism for band selection, which improved the feature extraction accuracy
of the CNN without affecting the training speed and classification ability. However, in the study of HSI,
the current mainstream attention mechanisms have a two-branch structure, which cannot jointly process
spatial-spectral features.
In light of the comprehensive analysis presented above and with the aim of reducing the model’s dependence
on sample quantity and achieving a more rational extraction of spatial-spectral information from HSI, this paper
introduces a novel deep clustering framework incorporating an attention mechanism. First, HSI undergoes
dimensionality reduction (DR) using the PCA and t-SNE method to eliminate redundant spectral bands. Sub-
sequently, the reduced-dimensional HSI is fed into the 3D attention convolutional autoencoder (3D-ACAE) for
feature extraction. The 3D-ACAE comprises an encoder and a decoder, incorporating 3D convolutional layers
and 3D spatial dropout to prevent overfitting. An attention module is introduced in the encoder after the sec-
ond convolutional layer to refine the extracted features. The features extracted by the encoder are flattened into
one-dimensional vectors and compressed using a fully connected layer called the embedding layer. Finally, the
output of the embedding layer is fed into the model’s clustering layer to obtain the final clustering results. The
clustering layer is optimized using the backpropagation algorithm, initialized with the K-means algorithm, and
utilizes the Kullback-Leibler (KL) divergence to calculate the clustering loss. Clustering labels are rearranged
using the Hungarian algorithm to obtain accurate evaluation metrics.
The main contributions of this paper are as follows.


1. We propose the integration of a 3D-ACAE into a deep clustering framework. The 3D-ACAE incorporates
a spatial-spectral attention module, playing a vital role in the encoding process. This integration results in
improved precision and accuracy in feature extraction.
2. We implement an approach that combines PCA and t-SNE for preprocessing HSI data. By combining linear
and non-linear methods for preprocessing, the DNN can learn more accurate features, improving its overall
performance.
3. The experimental results indicate that the application of deep clustering methods in HSI analysis has led to
improvements, yielding clustering results superior to both traditional clustering methods and newly proposed
deep clustering methods.

The rest of this article is organized as follows. The “Related works” section provides an overview of related work,
and the “Proposed methods” section presents a detailed description of our proposed method. The “Experimental results
and analysis” section reports experimental results that validate the proposed method, and the “Conclusion” section
summarizes the findings and discusses future perspectives.

Related works
Deep clustering
Deep clustering is a novel approach that combines DNNs with conventional clustering techniques, and it has
been extensively studied in computer vision and other domains. The primary characteristic of deep clustering is
its ability to optimize DNN parameters by employing loss functions from clustering. This study focuses on using
the autoencoder-based deep clustering structure.
The deep clustering network (DCN), an AE-based deep clustering method35, utilizes a deep autoencoder network to
reduce dimensionality and acquire K-means-compatible features. It optimizes the clustering and dimensionality
reduction tasks concurrently within a unified framework, as expressed in Eq. (1):

$$
\min_{I,\{s_i\}} \sum_{i=1}^{MN} \left( \ell\big(g(f(A_i)), A_i\big) + \frac{\gamma}{2} \left\| f(A_i) - I s_i \right\|_2^2 \right)
\quad \text{s.t. } s_{j,i} \in \{0, 1\},\ \mathbf{1}^{T} s_i = 1,\ \forall i, j, \tag{1}
$$

where $f(\cdot)$ and $g(\cdot)$ denote non-linear mapping functions for the encoder and decoder, respectively. $\ell(\cdot)$
represents the reconstruction loss function, defined as $\ell(A_i, B_i) = \|A_i - B_i\|_2^2$, where $A_i$ is the original
sample and $B_i$ is the reconstructed sample. $I$ denotes the centroid matrix, with its k-th column representing the
k-th cluster centroid. $s_i$ is the assignment vector with only one nonzero element. The objective function’s first
and second terms are the network model loss and the clustering assignment loss, respectively, with $\gamma$ being the
trade-off parameter.
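To make the two terms of Eq. (1) concrete, the following minimal NumPy sketch evaluates the objective for fixed encoder and decoder outputs. The function name `dcn_objective` and the array layout (samples as rows, centroids as columns of the centroid matrix, hard assignments stored as integer indices instead of one-hot vectors) are illustrative assumptions, not part of the DCN reference implementation.

```python
import numpy as np

def dcn_objective(A, A_rec, Z, centroids, assign, gamma):
    """Evaluate the objective of Eq. (1) for fixed network outputs.

    A         : (N, D) original samples A_i, flattened
    A_rec     : (N, D) reconstructions g(f(A_i))
    Z         : (N, d) latent codes f(A_i)
    centroids : (d, K) centroid matrix I, one centroid per column
    assign    : (N,) index of the single nonzero entry of each s_i
    gamma     : trade-off parameter
    """
    # Reconstruction term: squared L2 distance between input and reconstruction.
    recon = np.sum((A - A_rec) ** 2, axis=1)
    # I s_i simply selects the centroid assigned to sample i.
    assigned = centroids[:, assign].T
    # Clustering term: squared distance of each code to its assigned centroid.
    cluster = np.sum((Z - assigned) ** 2, axis=1)
    return float(np.sum(recon + 0.5 * gamma * cluster))
```

Storing the assignment as an index is equivalent to the one-hot constraint in Eq. (1), since a binary vector whose entries sum to one selects exactly one column of the centroid matrix.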

3D convolutional autoencoder
Autoencoder (AE) is an unsupervised neural network model that learns implicit features from input data and
reconstructs the original input data using these learned features. AE primarily consists of an encoder and a
decoder. The encoder compresses the input into a compressed feature representation, and the decoder recon-
structs these into output data closely resembling the initial input data.
The convolutional autoencoder (CAE) is a variant of AE that employs convolution layers instead of fully con-
nected ones in its framework. This feature makes CAE more suitable for processing 2D and 3D data with spatial
patterns, as it does not force the input data into a 1D vector.
The 3D-CAE is implemented with an encoder and a decoder. The encoder uses standard convolution operations
instead of fully-connected layers. Given an input $I \in \mathbb{R}^{L \times H \times W}$ and a kernel
$K \in \mathbb{R}^{l \times h \times w}$ ($l \le L$, $h \le H$ and $w \le W$) for 3D convolution, the output can be
defined as follows:

$$
O_{x,y,z} = b + \sum_{p=0}^{l-1} \sum_{q=0}^{h-1} \sum_{r=0}^{w-1} K_{p,q,r}\, I_{x'+p,\, y'+q,\, z'+r},
\quad \text{where } x' = x \cdot s_x,\ y' = y \cdot s_y,\ z' = z \cdot s_z \tag{2}
$$

where $O_{x,y,z}$ denotes the $(x, y, z)$-th element of the output $O \in \mathbb{R}^{L' \times H' \times W'}$, $x \in [1, L']$,
$y \in [1, H']$, and $z \in [1, W']$. $b$ denotes the bias of the 3D convolution layer, and $(s_x, s_y, s_z)$ are the step
sizes of the convolution kernel $K$ in the three dimensions. $L'$, $H'$ and $W'$ represent the sizes of the output $O$
and are defined as:

$$
L' = \left\lfloor \frac{L - l}{s_x} \right\rfloor + 1, \quad
H' = \left\lfloor \frac{H - h}{s_y} \right\rfloor + 1, \quad
W' = \left\lfloor \frac{W - w}{s_z} \right\rfloor + 1 \tag{3}
$$

where $\lfloor \cdot \rfloor$ denotes rounding toward zero.
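As an illustration of Eqs. (2) and (3), the sketch below computes the output shape and a naive single-channel 3D convolution without padding. It is a didactic loop, not how a deep learning framework implements the operation; the function names are hypothetical, and indices are zero-based here, so the output index runs from 0 to L′ − 1 instead of 1 to L′.

```python
import numpy as np

def conv3d_output_shape(in_shape, kernel_shape, strides):
    """Eq. (3): floor((size - kernel) / stride) + 1 along each axis."""
    # For non-negative operands, integer floor division equals rounding toward zero.
    return tuple((n - k) // s + 1 for n, k, s in zip(in_shape, kernel_shape, strides))

def conv3d_valid(I, K, b=0.0, strides=(1, 1, 1)):
    """Eq. (2): single-channel 3D convolution (cross-correlation form), no padding."""
    l, h, w = K.shape
    out_shape = conv3d_output_shape(I.shape, K.shape, strides)
    O = np.empty(out_shape, dtype=float)
    for x in range(out_shape[0]):
        for y in range(out_shape[1]):
            for z in range(out_shape[2]):
                # Top-left-front corner of the receptive field: x' = x * s_x, etc.
                xs, ys, zs = x * strides[0], y * strides[1], z * strides[2]
                window = I[xs:xs + l, ys:ys + h, zs:zs + w]
                O[x, y, z] = b + np.sum(K * window)
    return O
```

For example, a 7 × 7 × 9 input with a 3 × 3 × 5 kernel and stride 2 along every axis gives a 3 × 3 × 3 output.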

The decoder component employs a 3D transposed convolution to reconstruct the image, which reverses the shape
mapping of a convolution rather than computing its exact inverse. Transposed convolution maps the input from a low-dimensional space to a high-dimensional
space. This is achieved by zero-padding the input to create an intermediate input larger than the desired output
size. The intermediate input is then convolutionally filtered to obtain the final output.

PCA and t‑SNE

The PCA technique is widely used for reducing data dimensionality in various research domains. PCA linearly
transforms the original linearly correlated high-dimensional data into linearly uncorrelated low-dimensional data.
This process identifies the projection directions by maximizing the variance of the projected data.
Consequently, the newly projected data retains the most relevant information from the original data while
discarding redundant components. The general formula for PCA can be summarized as follows:

$$
C = \frac{1}{n} X_c^{T} X_c \tag{4}
$$

$$
Y = X_c W \tag{5}
$$

where $C$ is the covariance matrix, $X_c$ is the centered data matrix, $W$ is the projection matrix, each column of
which is one of the selected eigenvectors of $C$, $Y$ is the reduced-dimensional data matrix, and $n$ is the number of samples.
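Eqs. (4) and (5) translate almost line by line into NumPy. The sketch below is a minimal illustration that assumes samples are stored as rows (for an HSI, one row per pixel and one column per band) and that the eigenvectors with the largest eigenvalues are the ones selected; the function name `pca_reduce` is hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X (n_samples, n_features) onto its leading principal components."""
    Xc = X - X.mean(axis=0)                 # centered data matrix X_c
    C = Xc.T @ Xc / Xc.shape[0]             # Eq. (4): covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)    # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    W = eigvecs[:, order]                   # projection matrix, one eigenvector per column
    Y = Xc @ W                              # Eq. (5): reduced-dimensional data
    return Y, W
```

Because the columns of W are orthonormal eigenvectors of C, the columns of Y are uncorrelated and ordered by decreasing variance.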
t-SNE is a powerful dimensionality reduction technique for visualizing high-dimensional datasets
by projecting them into a lower-dimensional space, typically of two or three dimensions. t-SNE preserves the
pairwise similarity relationships between individual samples in the reduced feature space by constructing
probability distributions over pairs of high-dimensional samples. In these distributions, similar (nearby) samples
are assigned high probabilities, while dissimilar (distant) samples are assigned low probabilities. It achieves
this by computing the conditional probabilities $p_{j|i}$ that measure the similarity between data point $i$ and
data point $j$ using a Gaussian kernel:

$$
p_{j|i} = \exp\left( -\frac{\| x_i - x_j \|^2}{2 \sigma_i^2} \right) \tag{6}
$$

where $x_i$ and $x_j$ are the data points, and $\sigma_i$ is a bandwidth parameter chosen for each data point to adjust
the scale of the Gaussian distribution. Next, t-SNE defines a similar distribution for the points in the
low-dimensional embedding, computing the conditional probabilities $q_{j|i}$ for the lower-dimensional space:

$$
q_{j|i} = \frac{\left( 1 + \| y_i - y_j \|^2 \right)^{-1}}{Z_i} \tag{7}
$$

where $y_i$ and $y_j$ are the mapped points in the lower-dimensional space, and $Z_i$ is a normalization constant.
Then, the similarity between data points in the low-dimensional space is defined as a joint probability:

$$
q_{ij} = \frac{q_{j|i} + q_{i|j}}{2n} \tag{8}
$$

where $n$ is the number of samples. Finally, the KL divergence between the probabilities in the high-
and low-dimensional spaces is calculated to measure the difference between them. The KL divergence is then
minimized by gradient descent to find the best low-dimensional representation.
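The quantities in Eqs. (6) to (8) and the KL objective can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the bandwidths are passed in directly (standard t-SNE finds each one by a perplexity search), the Gaussian similarities of Eq. (6) are additionally normalized per row so that they form a probability distribution, as standard t-SNE does, and the high-dimensional probabilities are symmetrized in the same way as Eq. (8). All function names are hypothetical.

```python
import numpy as np

def _squared_distances(X):
    """Pairwise squared Euclidean distances between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

def high_dim_conditional(X, sigma):
    """Eq. (6) with an added per-row normalization; sigma has one bandwidth per point."""
    P = np.exp(-_squared_distances(X) / (2.0 * sigma[:, None] ** 2))
    np.fill_diagonal(P, 0.0)                 # a point is not its own neighbor
    return P / P.sum(axis=1, keepdims=True)

def low_dim_conditional(Y):
    """Eq. (7): Student-t kernel, with Z_i taken as the row sum."""
    Q = 1.0 / (1.0 + _squared_distances(Y))
    np.fill_diagonal(Q, 0.0)
    return Q / Q.sum(axis=1, keepdims=True)

def symmetrize(cond):
    """Eq. (8): joint probability from the two conditional probabilities."""
    return (cond + cond.T) / (2.0 * cond.shape[0])

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) summed over all pairs with nonzero P."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))
```

A full t-SNE implementation would then update the low-dimensional points by gradient descent on this KL value, which the sketch leaves out.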

Proposed methods
The structure of the proposed method is shown in Fig. 1. It consists of two steps: (1) HSI dimensionality
reduction by combining PCA and t-SNE and (2) the integration of 3D-ACAE in deep clustering. The following
subsections describe the two main steps in detail.

Dimensionality reduction by PCA and t‑SNE

Given an original HSI $Z \in \mathbb{R}^{L \times H \times W}$, the reduced HSI $Z' \in \mathbb{R}^{L \times H \times W'}$ is obtained by using the
combined PCA and t-SNE method, where $L$ and $H$ represent the sizes of the two spatial dimensions, $W$ denotes the
original size of the spectral dimension, and $W'$ is the size of the spectral dimension after reduction. PCA first
reduces the original HSI linearly, and t-SNE then reduces it further through manifold learning, so that the
reduced HSI retains the most critical information of the original. In this study, we set $W'$ to 60.

[Figure 1 diagram. Dimensionality reduction: original HSI → PCA → t-SNE → HSI after DR. Deep clustering: encoder
(3D conv → 3D spatial dropout → 3D conv → attention → FC) → embedding layer h → clustering layer → clustering
result; decoder (FC → 3D deconv → 3D spatial dropout → 3D deconv) → reconstructed HSI.]

Figure 1. Structure of the 3D-ACAE, where h is the embedding layer, FC is the fully connected layer, and DR
means dimensionality reduction.


Integration of 3D‑ACAE in deep clustering

After DR, the image is fed into the 3D-ACAE model for feature extraction and deep clustering. The 3D-ACAE
model comprises two main components: the encoder and the decoder. First, the HSI is divided into fixed-size
cubes and subjected to DR. These cubes are then input to the encoder for spatial-spectral feature extraction
through two 3D convolutional layers. Between these layers, specific regions of the HSI data are randomly deac-
tivated using 3D spatial dropout to mitigate overfitting. Subsequent refinement is accomplished through the
spatial-spectral attention module. The encoder’s output is flattened into a one-dimensional vector and connected
to an embedding layer, which is another fully connected layer. The dimension of this layer is set to the number
of data categories to encourage the encoder to extract features more conducive to clustering. The output of the
embedding layer is re-scaled to its pre-compression dimension in the decoder path and reconstructed into a 3D
spatial-spectral feature. The 3D transposed convolution is then applied to reconstruct the data and the 3D spatial
dropout is also applied to mitigate overfitting.
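The division of the reduced HSI into fixed-size cubes can be sketched as follows. The sketch assumes one cube per pixel, an odd patch size, spatial axes first and the spectral axis last, and reflect padding at the image border; the text does not specify how border pixels are handled, so the padding mode and the function name `extract_patches` are assumptions made for illustration.

```python
import numpy as np

def extract_patches(cube, patch_size):
    """Cut one (patch_size, patch_size, bands) cube around every pixel.

    cube : (L, H, W') array with two spatial axes followed by the spectral axis.
    Returns an array of shape (L * H, patch_size, patch_size, W').
    """
    L, H, bands = cube.shape
    r = patch_size // 2
    # Assumed border handling: mirror the image so that edge pixels get full patches.
    padded = np.pad(cube, ((r, r), (r, r), (0, 0)), mode="reflect")
    patches = np.empty((L * H, patch_size, patch_size, bands), dtype=cube.dtype)
    idx = 0
    for i in range(L):
        for j in range(H):
            patches[idx] = padded[i:i + patch_size, j:j + patch_size, :]
            idx += 1
    return patches
```

With this layout, the center of patch number i · H + j is the spectrum of pixel (i, j), which is the pixel the resulting cluster label would be assigned to.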
Two issues in this section need particular discussion: the first is the loss function of deep
clustering, and the second is the spatial-spectral attention module.

Loss function of deep clustering

The mean square error (MSE) is computed between the reconstructed data and the input data, serving as the
loss function for the network. The MSE is defined as follows:

$$
L_r = \frac{1}{n} \sum_{i=1}^{n} \left\| g(f(x_i)) - x_i \right\|_2^2 \tag{9}
$$

where $n$ is the number of samples in the dataset and $x_i \in \mathbb{R}^3$ is the i-th sample in the dataset.
The output features from the embedding layer are also fed into the clustering layer, where the data is
partitioned. We use the K-means++ algorithm to initialize the parameters in the clustering layer. Each data point
$z_i$ is mapped to the soft label $q_i$ using Student’s t-distribution, which is defined as:

$$
q_{ij} = \frac{\left( 1 + \| z_i - \mu_j \|^2 \right)^{-1}}{\sum_{j'} \left( 1 + \| z_i - \mu_{j'} \|^2 \right)^{-1}} \tag{10}
$$

where $\mu_j$ is the center of cluster $j$, and $q_{ij}$ is the j-th entry of $q_i$ and represents the probability that $z_i$ belongs to cluster $j$.
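The soft assignment of Eq. (10) reduces to a few array operations. The sketch below assumes the embedded points and the cluster centers are given as row-wise arrays; the function name `soft_assign` is hypothetical.

```python
import numpy as np

def soft_assign(Z, mu):
    """Eq. (10): Student's t soft assignment.

    Z  : (n, d) embedded points z_i
    mu : (K, d) cluster centers
    Returns q of shape (n, K); each row sums to one.
    """
    # Squared distance between every point and every center.
    d2 = np.sum((Z[:, None, :] - mu[None, :, :]) ** 2, axis=2)
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)
```

A point that coincides with a center still receives a probability below one for that cluster, because the heavy-tailed kernel gives every other center a nonzero weight.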
The clustering layer utilizes the KL divergence as the loss function (11), which optimizes the clustering
process:

$$
L_c = KL(P \,\|\, Q) = \sum_{i} \sum_{j} p_{ij} \log \frac{p_{ij}}{q_{ij}} \tag{11}
$$

where $P$ is the target distribution, defined as:

$$
p_{ij} = \frac{q_{ij}^2 / \sum_{i} q_{ij}}{\sum_{j'} \left( q_{ij'}^2 / \sum_{i} q_{ij'} \right)} \tag{12}
$$

Therefore, the optimization objective of the proposed method is expressed as the combination of reconstruction
and clustering losses:

$$
L = L_r + \gamma L_c \tag{13}
$$

where $\gamma$ controls the degree of distortion of the embedded space.
This study employs stochastic gradient descent and backpropagation to update the parameters
of the convolutional autoencoder (CAE) and the cluster centers. The target distribution P serves as the
ground-truth soft labels, although it is itself derived from the predicted soft labels Q. P is therefore updated
only after every T iterations of optimizing the losses in Eqs. (9) and (11).
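Eqs. (9) and (11) to (13) can be checked numerically with a short NumPy sketch. It only evaluates the losses for given arrays and does not perform the gradient updates; the function names are hypothetical, the small `eps` is added for numerical safety, and the default of 0.1 for γ is the value selected later in the hyperparameter discussion.

```python
import numpy as np

def target_distribution(q):
    """Eq. (12): sharpen q and normalize by the soft cluster frequency."""
    weight = q ** 2 / q.sum(axis=0)          # q_ij^2 / sum_i q_ij
    return weight / weight.sum(axis=1, keepdims=True)

def clustering_loss(p, q, eps=1e-12):
    """Eq. (11): KL(P || Q) summed over samples and clusters."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def reconstruction_loss(x, x_rec):
    """Eq. (9): mean over samples of the squared L2 reconstruction error."""
    n = x.shape[0]
    diff = (x_rec - x).reshape(n, -1)
    return float(np.mean(np.sum(diff ** 2, axis=1)))

def total_loss(x, x_rec, p, q, gamma=0.1):
    """Eq. (13): L = L_r + gamma * L_c."""
    return reconstruction_loss(x, x_rec) + gamma * clustering_loss(p, q)
```

In a training loop, `target_distribution` would be recomputed only every T iterations and held fixed in between, so that the network is pulled toward a stable target instead of one that changes at every step.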

Spatial‑spectral attention module

The attention mechanism enhances feature extraction, creating a detailed attention map. By combining this
map with the feature map, it strengthens specific feature details, boosting discriminative qualities and overall
model performance.
We enhance the accuracy and validity of the extracted features by integrating an attention module into our
3D-ACAE model (Fig. 2). Initially, the number of feature channels is compressed by a convolutional layer with a
kernel size of 1 × 1 × 1, followed by a Gaussian error linear unit (GELU)
non-linear activation function. Subsequently, the processed features are fed into the 3D large kernel attention
(3D-LKA) module, where an attention mechanism is applied. The resulting feature map is then passed through
another convolutional layer with a kernel size of 1 × 1 × 1, restoring the original number of channels, and a
residual connection is added to refine the features.
[Figure 2 diagram: 1×1×1 Conv → GELU function → 3D-LKA module (DW-Conv → DW-D-Conv → PW-Conv) → 1×1×1 Conv.]

Figure 2. The structure of the spatial–spectral attention module.

The 3D-LKA module comprises three components: local convolution (depth-wise convolution, DW-Conv),
long-range convolution (depth-wise dilation convolution, DW-D-Conv), and channel convolution (point-wise
convolution, PW-Conv). In our study, we implement long-range convolution using a dilation convolution to
obtain long-range relationships within the data without introducing additional parameters. Channel adaptation
is also essential at this stage because different channels in DNNs generally represent distinct objects. Therefore,
we utilize PW-Conv for channel adaptation within the 3D-LKA module. The 3D-LKA module is expressed as:

$$
\text{Attention} = \text{PW-Conv}(\text{DW-D-Conv}(\text{DW-Conv}(F))) \tag{14}
$$

$$
\text{Output} = \text{Attention} \otimes F \tag{15}
$$

where $F$, Attention, and $\otimes$ denote the input feature, attention map, and element-wise product, respectively.
In the experiments, the parameters of the LKA are configured as follows: the kernel size of the DW-Conv is
5, that of the DW-D-Conv is 1 with a dilation rate of 3, and that of the PW-Conv is 1. In addition, the
convolutional layer in the model is set up to have 32 channels with a kernel size of (5, 3, 3).
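The composition in Eqs. (14) and (15) can be illustrated with a small NumPy sketch. It covers only the 3D-LKA part (not the surrounding 1 × 1 × 1 convolutions, the GELU, or the residual connection), uses zero "same" padding with stride 1 and cubic odd-sized kernels, and takes the kernel weights as arguments; the function names, tensor layout (channels first), and any kernel sizes used when calling it are assumptions for illustration, not the authors' TensorFlow implementation.

```python
import numpy as np

def depthwise_conv3d(x, kernels, dilation=1):
    """Depth-wise 3D convolution: every channel is filtered by its own kernel.

    x       : (C, D, H, W) feature map
    kernels : (C, k, k, k) one cubic kernel per channel, k odd
    """
    C, D, H, W = x.shape
    k = kernels.shape[1]
    pad = dilation * (k - 1) // 2            # "same" padding for odd k
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad), (pad, pad)))
    out = np.zeros(x.shape, dtype=float)
    for a in range(k):
        for b in range(k):
            for c in range(k):
                # A dilated tap reads the input at offsets spaced `dilation` apart.
                da, db, dc = a * dilation, b * dilation, c * dilation
                tap = kernels[:, a, b, c][:, None, None, None]
                out += tap * xp[:, da:da + D, db:db + H, dc:dc + W]
    return out

def pointwise_conv3d(x, weights):
    """Point-wise (1x1x1) convolution that mixes channels; weights: (C_out, C_in)."""
    return np.einsum("oc,cdhw->odhw", weights, x)

def lka3d(F, dw_kernels, dwd_kernels, pw_weights, dilation=3):
    """Eqs. (14)-(15): attention map from three stacked convolutions, then gating."""
    local = depthwise_conv3d(F, dw_kernels)                        # DW-Conv
    long_range = depthwise_conv3d(local, dwd_kernels, dilation)    # DW-D-Conv
    attention = pointwise_conv3d(long_range, pw_weights)           # PW-Conv
    return attention * F                                           # element-wise product
```

The dilation enlarges the receptive field of the second convolution without adding weights, which is the property the text relies on for capturing long-range relationships.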

Experimental results and analysis

Here we evaluate the performance of our proposed method on three publicly available HSI datasets. The
experiments are conducted using Python 3.6 and TensorFlow 2.6 with the Keras 2.6 API on an AMD Ryzen 7 5800H
CPU and an NVIDIA GeForce RTX 3070 laptop GPU. We use the following three evaluation metrics to assess the
performance of the model: overall accuracy (OA), normalized mutual information (NMI), and adjusted Rand
index (ARI)36,40,41.
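Because cluster indices are arbitrary, OA can only be computed after the predicted clusters have been matched to the ground-truth classes. The sketch below illustrates this matching with an exhaustive search over permutations, which is a plain stand-in for the Hungarian algorithm: it finds the same optimal one-to-one mapping but is only practical for a handful of clusters. It assumes the number of clusters equals the number of classes, and the function names are hypothetical.

```python
from itertools import permutations

import numpy as np

def contingency(y_true, y_pred, k):
    """M[c, t] counts the samples placed in cluster c whose true class is t."""
    M = np.zeros((k, k), dtype=int)
    for t, c in zip(y_true, y_pred):
        M[c, t] += 1
    return M

def overall_accuracy(y_true, y_pred, k):
    """OA under the best one-to-one mapping of cluster indices to class labels."""
    M = contingency(y_true, y_pred, k)
    # Brute-force search over all mappings (stand-in for the Hungarian algorithm).
    best = max(
        sum(M[c, perm[c]] for c in range(k))
        for perm in permutations(range(k))
    )
    return best / len(y_true)
```

For instance, a clustering that is perfect up to a renaming of the clusters scores an OA of 1.0, whereas a naive comparison of the raw indices would score it far lower.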
To evaluate the effectiveness of the proposed method, we compare the 3D-ACAE deep clustering approach
with two traditional clustering methods and three recently developed methods. The traditional clustering methods
are fuzzy c-means (FCM)42 and spectral clustering (SC)43; the recently developed methods are graph convolutional
optimal transport (GCOT)40, graph convolutional subspace clustering (GCSC)41 and recurrent neural network
based deep clustering (DRNNC)37. GCOT leverages optimal transport (OT) to learn discrete transport coupling
and create natural affinity matrices for spectral clustering. It utilizes OT probability to detect the graph’s edge,
representing the original feature’s non-linear structure. GCSC transforms the self-expressive features of data
into the non-Euclidean domain, producing a more robust graph embedding dictionary. DRNNC uses an RNN-based
autoencoder for deep clustering. To ensure an unbiased comparison, we use data of the same size
for all experiments, with a uniform patch size of 5.

Description of datasets
The Indian Pines (IP) dataset was collected in 1992 using the AVIRIS sensor. It consists of 145 × 145 pixels
with a spatial resolution of 20 m. The dataset contains 224 bands in the wavelength range of 0.4–2.5 × 10⁻⁶
m, and its ground truth is categorized into 16 classes. The number of bands was reduced to 200 by excluding
those covering the absorption region.
The Pavia University (PU) dataset was acquired in 2001 using the ROSIS sensor during a flight over
Pavia University in northern Italy. The dataset comprises 103 bands in the range of 0.43–0.86 × 10⁻⁶
m, with a spatial resolution of 1.3 m per pixel, and it contains 610 × 340 pixels. There are nine categories of land
cover in the scene for classification.
The Salinas Valley (SA) dataset, like the IP dataset, was acquired using the AVIRIS sensor, over the Salinas
Valley in California. It contains 512 × 217 pixels with a higher spatial resolution of 3.7 m per pixel. Some
absorption bands were eliminated from the dataset, resulting in 204 bands in the final dataset. The dataset contains
16 classes for land cover classification.
Following previous work40,41,44–46, we selected sub-scenes of the three datasets to ensure fairness in the
comparison experiments. The sub-scenes of the IP, PU, and SA datasets used in this study are [30–115, 24–94],
[591–676, 158–240] and [150–350, 100–200], respectively.


Hyperparameters discussion
We examined two hyperparameters of the proposed method, namely the spatial distortion
control coefficient γ in Eq. (13) and the dimensionality reduction target W′. For γ, we conducted experiments with
four representative values: 0.05, 0.1, 0.15, and 0.2. The experimental results for different γ values are
presented in Fig. 3. When γ is set to 0.05, all the metrics across the three datasets are
relatively poor. This occurs because, when γ is too small, the clustering loss has little impact on the
overall loss of the network, which hampers the extraction of features suitable for clustering.
As γ is increased to 0.1, there is a noticeable improvement in all three metrics. However, with a further
increase in γ, the metrics decline. This is because, when γ takes larger values, the
clustering loss dominates the parameter updates and distorts the embedded space, reducing the
accuracy of the features extracted by the model. Therefore, for subsequent experiments, we set γ to 0.1.
For W′, we considered values of 40, 50, 60, 70, and 80. On all three datasets, the metrics
changed significantly as W′ increased from 40 to 80. The experimental results are depicted in Fig. 4.
All metrics on all three datasets reach their maximum at W′ = 60 and are lower for both
W′ < 60 and W′ > 60. On both the PU dataset and
the SA dataset, all three metrics first increase and then decrease. This indicates that if
W′ is too low, too few spectral features are available for feature extraction, while if W′ is too
high, redundant bands in the data degrade the extracted features. Therefore, we choose W′ = 60.

Ablation experiment
We conducted ablation experiments on three datasets to validate the effectiveness of the proposed method. The
experiments were divided into five groups: 3D-CAE without any modification, 3D-CAE after data preprocessing
using PCA, 3D-ACAE with only the addition of the attention mechanism, 3D-ACAE with preprocessing using
PCA, and 3D-ACAE with the addition of the attention mechanism and DR using PCA and t-SNE (the proposed
method). The results of these ablation experiments are presented in Table 1. Regarding dimensionality reduction,
the proposed method, compared to 3D-ACAE employing only PCA, improved OA by
1.57, 1.74, and 0.80 percentage points on the IP, PU, and SA datasets, respectively. This suggests that our
PCA+t-SNE approach incurs less information loss when reducing the dimensionality of HSI than PCA alone,
making it more conducive to subsequent processing.
(a) OA results with different γ. (b) NMI results with different γ. (c) ARI results with different γ.

Figure 3. OA, NMI, and ARI results with different γ values.

(a) OA results with different W′. (b) NMI results with different W′. (c) ARI results with different W′.

Figure 4. OA, NMI, and ARI results with different W′ values.

Scientific Reports | (2024) 14:4209 | https://doi.org/10.1038/s41598-024-54547-2 7

| Methods | IP OA | IP NMI | IP ARI | PU OA | PU NMI | PU ARI | SA OA | SA NMI | SA ARI |
|---|---|---|---|---|---|---|---|---|---|
| 3D-CAE | 65.86 | 41.89 | 32.32 | 58.76 | 63.08 | 43.22 | 53.80 | 58.35 | 35.02 |
| 3D-CAE with PCA | 69.23 | 48.56 | 36.96 | 58.01 | 66.98 | 55.95 | 62.92 | 60.74 | 43.10 |
| 3D-ACAE | 76.02 | 57.57 | 48.94 | 59.95 | 72.50 | 54.86 | 74.85 | 80.30 | 70.81 |
| 3D-ACAE with PCA | 76.95 | 57.10 | 50.50 | 63.75 | 75.15 | 46.98 | 75.00 | 82.65 | 66.51 |
| Proposed method | 78.52 | 57.63 | 51.95 | 65.49 | 77.95 | 61.54 | 75.80 | 87.89 | 78.10 |

Table 1. Ablation experimental results of the HSI dataset clustering. Best result values are indicated in bold.

Concerning the incorporation of the spatial-spectral attention module into the model, the 3D-ACAE,
compared to the unmodified 3D-CAE, improved OA by 10.16, 1.19, and 21.05 percentage points on the IP, PU, and
SA datasets, respectively. NMI improved by 15.68, 9.42, and 21.95 percentage points, and ARI by 16.62, 11.64,
and 35.79 percentage points, respectively. These findings confirm the effectiveness of the spatial-spectral
attention module in refining the features extracted by the 3D-CAE, resulting in a substantial enhancement of
deep clustering performance for HSI.
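The gains quoted for the ablation study are plain differences between rows of Table 1, so they can be checked with a few lines of code. The sketch below copies four rows of Table 1; the dictionary layout and the `gain` helper are illustrative names, not part of the paper.

```python
# (OA, NMI, ARI) per dataset, copied from Table 1.
TABLE1 = {
    "3D-CAE": {"IP": (65.86, 41.89, 32.32), "PU": (58.76, 63.08, 43.22), "SA": (53.80, 58.35, 35.02)},
    "3D-ACAE": {"IP": (76.02, 57.57, 48.94), "PU": (59.95, 72.50, 54.86), "SA": (74.85, 80.30, 70.81)},
    "3D-ACAE with PCA": {"IP": (76.95, 57.10, 50.50), "PU": (63.75, 75.15, 46.98), "SA": (75.00, 82.65, 66.51)},
    "Proposed method": {"IP": (78.52, 57.63, 51.95), "PU": (65.49, 77.95, 61.54), "SA": (75.80, 87.89, 78.10)},
}


def gain(method: str, baseline: str, dataset: str) -> tuple:
    """Difference (method minus baseline) in percentage points for OA, NMI, ARI."""
    return tuple(round(m - b, 2) for m, b in zip(TABLE1[method][dataset], TABLE1[baseline][dataset]))


if __name__ == "__main__":
    for dataset in ("IP", "PU", "SA"):
        print(dataset, "attention:", gain("3D-ACAE", "3D-CAE", dataset),
              "t-SNE:", gain("Proposed method", "3D-ACAE with PCA", dataset))
```

Note that these are differences in percentage points, not relative improvements.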
The clustering maps generated by the proposed method further illustrate its superiority. Across the three
datasets, the maps produced by the proposed method are notably smoother than those of the unimproved
methods. For instance, in Fig. 5, the Soybean-min region is markedly smoother for the proposed method and
closely matches the ground truth, while the other clustering maps are rough within their regions. Similarly,
in Fig. 6, the Bare soil region for the proposed method contains only two distinct colors with clear
boundaries, whereas the other clustering maps contain three or more colors and exhibit greater spatial
disorder. In Fig. 7, the map of the proposed method is the most similar to the ground truth, while the other
methods suffer from some degree of distortion. These observations collectively support the effectiveness of
the proposed method in the task of HSI clustering.

Comparison experimental results

The clustering results of the comparison methods on the three datasets (Tables 2, 3 and 4; best outcomes
highlighted in bold) indicate that the proposed method achieves the highest overall performance. This finding
supports the viability of integrating deep learning with conventional clustering algorithms in the HSI deep
clustering domain.
Figure 5. Ablation experimental results of the IP dataset. (a) Ground truth, (b) 3D-CAE, (c) 3D-CAE with PCA, (d) 3D-ACAE, (e) 3D-ACAE with PCA, and (f) the proposed method.

Figure 6. Ablation experimental results of the PU dataset. (a) Ground truth, (b) 3D-CAE, (c) 3D-CAE with PCA, (d) 3D-ACAE, (e) 3D-ACAE with PCA, and (f) the proposed method.

Figure 7. Ablation experimental results of the SA dataset. (a) Ground truth, (b) 3D-CAE, (c) 3D-CAE with PCA, (d) 3D-ACAE, (e) 3D-ACAE with PCA, and (f) the proposed method.

| Class / metric | FCM | SC | GCSC | GCOT | DRNNC | Proposed |
|---|---|---|---|---|---|---|
| Soybean-min | 54.48 | 76.67 | 60.79 | 36.7 | 56.74 | 79.94 |
| Grass/trees | 98.35 | 80.6 | 99.04 | 98.64 | 97.73 | 99.45 |
| Corn-notill | 60.64 | 56.32 | 59.84 | 60.4 | 0 | 61.34 |
| Soybean-notill | 13.45 | 12.09 | 19.87 | 75.69 | 39.48 | 58.76 |
| OA | 54.82 | 68.12 | 59.74 | 60.31 | 49.92 | 78.52 |
| NMI | 39.28 | 53.15 | 47.58 | 45.43 | 35.28 | 57.63 |
| ARI | 32.76 | 37.56 | 34.50 | 32.48 | 24.45 | 51.95 |

Table 2. Clustering results of IP dataset. Best result values are indicated in bold.

| Class / metric | FCM | SC | GCSC | GCOT | DRNNC | Proposed |
|---|---|---|---|---|---|---|
| Painted metal sheets | 35.46 | 35.32 | 38.46 | 99.69 | 87.66 | 61.48 |
| Asphalt | 0 | 0 | 0 | 0 | 0 | 0 |
| Bitumen | 78 | 97.93 | 97.85 | 96.33 | 95.42 | 98.67 |
| Self-blocking bricks | 24.22 | 0 | 0 | 11.32 | 37.48 | 14.46 |
| Meadows | 96.77 | 100 | 43.43 | 53.48 | 98.98 | 100 |
| Bare soil | 30.65 | 26.78 | 47.78 | 24.77 | 24.68 | 54.68 |
| Trees | 0 | 99.32 | 0 | 89.96 | 24.36 | 67.66 |
| Shadows | 97.4 | 96.99 | 97.09 | 96.73 | 96.48 | 97.64 |
| OA | 54.37 | 53.31 | 62.05 | 62.02 | 54.96 | 65.49 |
| NMI | 66.86 | 73.10 | 70.24 | 75.70 | 61.55 | 77.95 |
| ARI | 46.18 | 48.01 | 49.68 | 55.14 | 41.24 | 61.54 |

Table 3. Clustering results of PU dataset. Best result values are indicated in bold.

| Class / metric | FCM | SC | GCSC | GCOT | DRNNC | Proposed |
|---|---|---|---|---|---|---|
| Brocoli_green_weeds_1 | 0 | 0 | 97.43 | 96.54 | 97.87 | 98.75 |
| Lettuce_romaine_4wk | 74.36 | 96.74 | 99.96 | 0 | 78.64 | 97.7 |
| Lettuce_romaine_5wk | 99.4 | 98.78 | 99.53 | 100 | 97.77 | 100 |
| Lettuce_romaine_6wk | 94.35 | 100 | 78.59 | 0 | 85.43 | 21.45 |
| Lettuce_romaine_7wk | 98.76 | 95.69 | 43.43 | 0 | 20.58 | 87.9 |
| Corn_senesced_green_weeds | 68.48 | 75.8 | 25.46 | 100 | 21.46 | 80.19 |
| OA | 75.08 | 67.03 | 72.87 | 60.10 | 64.36 | 75.80 |
| NMI | 80.59 | 79.07 | 75.21 | 63.79 | 70.10 | 87.89 |
| ARI | 74.76 | 57.53 | 65.60 | 39.26 | 60.34 | 78.10 |

Table 4. Clustering results of SA dataset. Best result values are indicated in bold.

Tables 2, 3 and 4 show that the proposed method achieves the best results on all datasets for all three
evaluation metrics. For the IP dataset, the OA is 78.52%, an improvement of 28.6 percentage points over the
worst-performing method, DRNNC, and of 10.4 percentage points over the best-performing comparison method, SC.
The proposed method also outperforms these two methods in NMI, by 22.35 and 4.48 percentage points,
respectively. On the PU dataset, the proposed method attains an OA of 65.49%, an NMI of 77.95%, and an ARI of
61.54%. These results are improvements of 3.44 (over GCSC), 2.25 (over GCOT), and 6.4 (over GCOT) percentage
points over the comparison method that performs best on each metric. On the SA dataset, the proposed method
attains an OA of 75.80%, an NMI of 87.89%, and an ARI of 78.10%, improvements of 0.72, 7.3, and 3.34
percentage points, respectively, over the best-performing comparison method, FCM.
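For readers who want to reproduce NMI and ARI values of this kind, the sketch below computes both from two label lists using only the standard library. It is a minimal illustration rather than the authors' evaluation code: the arithmetic-mean normalization of NMI is an assumption (other normalizations exist), and the function names are hypothetical.

```python
from collections import Counter
from math import comb, log
from typing import Hashable, Sequence


def adjusted_rand_index(labels_true: Sequence[Hashable], labels_pred: Sequence[Hashable]) -> float:
    """ARI from the contingency table of two partitions of the same samples."""
    n = len(labels_true)
    if n < 2:
        return 1.0  # degenerate case: nothing to compare
    cells = Counter(zip(labels_true, labels_pred))
    rows, cols = Counter(labels_true), Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)
    maximum = 0.5 * (sum_rows + sum_cols)
    if maximum == expected:
        return 1.0  # both partitions are trivial
    return (sum_cells - expected) / (maximum - expected)


def normalized_mutual_information(labels_true: Sequence[Hashable], labels_pred: Sequence[Hashable]) -> float:
    """NMI normalized by the arithmetic mean of the two entropies (assumed convention)."""
    n = len(labels_true)  # assumed non-empty
    cells = Counter(zip(labels_true, labels_pred))
    rows, cols = Counter(labels_true), Counter(labels_pred)
    mi = sum((c / n) * log(n * c / (rows[u] * cols[v])) for (u, v), c in cells.items())
    h_true = -sum((c / n) * log(c / n) for c in rows.values())
    h_pred = -sum((c / n) * log(c / n) for c in cols.values())
    mean_entropy = 0.5 * (h_true + h_pred)
    return 1.0 if mean_entropy == 0.0 else mi / mean_entropy


if __name__ == "__main__":
    truth = [0, 0, 1, 1, 2, 2]
    pred = [1, 1, 2, 2, 0, 0]  # same partition, different cluster ids
    print(adjusted_rand_index(truth, pred), normalized_mutual_information(truth, pred))
```

Both scores are invariant to how the cluster ids are numbered, which is why they can be computed without first matching clusters to classes. Multiplying by 100 gives values on the scale used in the tables.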
Tables 2, 3 and 4 also report the per-category accuracy (CA) for the three datasets. According to CA, the
proposed method gives better results for most of the categories. A small number of categories are
misclassified because of the strong similarity of spatial-spectral features between categories, such as
Soybean-notill and Soybean-min in IP, and Asphalt and Bitumen in PU. Overall, the CA results also support the
effectiveness of the proposed method.
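Unlike NMI and ARI, OA and CA require cluster ids to be matched to class labels first. The sketch below does this by brute force over all one-to-one assignments, which is a simple stand-in for the Hungarian algorithm normally used for this step; it assumes there are no more clusters than classes, and the function names are illustrative rather than taken from the paper.

```python
from collections import Counter
from itertools import permutations
from typing import Dict, Hashable, Sequence, Tuple


def best_mapping(labels_true: Sequence[Hashable], labels_pred: Sequence[Hashable]) -> Dict:
    """One-to-one cluster -> class mapping that maximizes the number of matched samples."""
    classes = sorted(set(labels_true))
    clusters = sorted(set(labels_pred))
    if len(clusters) > len(classes):
        raise ValueError("this sketch assumes no more clusters than classes")
    overlap = Counter(zip(labels_pred, labels_true))
    # Exhaustive search; fine for a handful of classes, factorial cost otherwise.
    best = max(
        permutations(classes, len(clusters)),
        key=lambda perm: sum(overlap[(k, c)] for k, c in zip(clusters, perm)),
    )
    return dict(zip(clusters, best))


def overall_and_class_accuracy(labels_true: Sequence[Hashable],
                               labels_pred: Sequence[Hashable]) -> Tuple[float, Dict]:
    """Return OA and the per-class accuracy (CA) after matching clusters to classes."""
    mapping = best_mapping(labels_true, labels_pred)
    mapped = [mapping[k] for k in labels_pred]
    totals = Counter(labels_true)
    hits = Counter(t for t, m in zip(labels_true, mapped) if t == m)
    oa = sum(hits.values()) / len(labels_true)
    ca = {c: hits[c] / totals[c] for c in sorted(totals)}
    return oa, ca


if __name__ == "__main__":
    truth = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
    pred = [2, 2, 2, 1, 1, 1, 1, 1, 0, 0]
    print(overall_and_class_accuracy(truth, pred))
```

The exhaustive search keeps the example dependency-free; for datasets with many classes an assignment solver would be the practical choice.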
The improved clustering accuracy achieved by the proposed method is also reflected in the clustering maps it
generates. In the IP dataset, as depicted in Fig. 8, the Soybean-min region is notably more complete than
with the other methods, and the Corn-notill region is also closer to the ground truth. In the PU dataset, the
Meadows region in Fig. 9 is the smoothest among all compared methods. For the SA dataset, the clustering map
generated by the proposed method, shown in Fig. 10, closely matches the ground truth, further supporting the
gain in clustering accuracy offered by the proposed method.
Figure 8. Clustering maps of the IP dataset. (a) Ground truth, (b) FCM, (c) SC, (d) GCSC, (e) GCOT, (f) DRNNC, and (g) the proposed method.

Figure 9. Clustering maps of the PU dataset. (a) Ground truth, (b) FCM, (c) SC, (d) GCSC, (e) GCOT, (f) DRNNC, and (g) the proposed method.

Figure 10. Clustering maps of the SA dataset. (a) Ground truth, (b) FCM, (c) SC, (d) GCSC, (e) GCOT, (f) DRNNC, and (g) the proposed method.

| Dataset | FCM | SC | GCSC | GCOT | DRNNC | Proposed method |
|---|---|---|---|---|---|---|
| IP | 0.358 | 2.762 | 16.382 | 3.708 | 106.888 | 258.104 |
| PU | 0.487 | 2.819 | 33.344 | 14.285 | 103.314 | 130.791 |
| SA | 0.423 | 1.873 | 25.709 | 6.759 | 103.511 | 256.604 |

Table 5. Comparison of running times of different clustering methods.

We also compared the running times of the different methods, as shown in Table 5. Among all the methods, the
two traditional approaches, FCM and SC, were the fastest. GCSC and GCOT, which rely on graph convolution,
required more time than these two algorithms. In contrast, DRNNC and the proposed method both involve deep
learning and require model training with parameter updates through backpropagation. As a result, they had the
longest execution times. Nevertheless, given the substantial improvement in clustering performance, the
increase in execution time is acceptable.
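The cost of the proposed method relative to the other approaches can be read off Table 5 with a short calculation. The sketch below copies the running times from Table 5 (in whatever unit the table uses, which the text does not state); the `slowdown` helper is a hypothetical name introduced for illustration.

```python
# Running times copied from Table 5 (unit as in the table; not stated in the text).
RUNTIMES = {
    "IP": {"FCM": 0.358, "SC": 2.762, "GCSC": 16.382, "GCOT": 3.708, "DRNNC": 106.888, "Proposed method": 258.104},
    "PU": {"FCM": 0.487, "SC": 2.819, "GCSC": 33.344, "GCOT": 14.285, "DRNNC": 103.314, "Proposed method": 130.791},
    "SA": {"FCM": 0.423, "SC": 1.873, "GCSC": 25.709, "GCOT": 6.759, "DRNNC": 103.511, "Proposed method": 256.604},
}


def slowdown(dataset: str, method: str = "Proposed method", reference: str = "DRNNC") -> float:
    """How many times longer `method` runs than `reference` on the given dataset."""
    return RUNTIMES[dataset][method] / RUNTIMES[dataset][reference]


if __name__ == "__main__":
    for dataset in RUNTIMES:
        print(f"{dataset}: {slowdown(dataset):.2f}x DRNNC, {slowdown(dataset, reference='SC'):.1f}x SC")
```

Ratios like these make the accuracy versus run-time trade-off explicit when choosing between the deep and the traditional methods.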

Conclusion
This study presents a novel deep clustering approach for HSI analysis. The proposed method first applies
PCA and t-SNE to reduce the dimensionality of the original HSI. Subsequently, a novel 3D-ACAE model is
constructed to extract spatial-spectral features from the dimension-reduced HSI data. To refine the extracted
features and improve the representation of the HSI, a spatial-spectral attention module is incorporated. The
model places a clustering layer after the embedding layer to obtain the clustering results. The parameters of
the clustering layer are initialized using K-means and optimized through the backpropagation algorithm.
Experimental results demonstrate the feasibility of deep clustering for HSI analysis. In future work, the deep
clustering approach will be combined with superpixel segmentation to explore the spatial and spectral
information in HSI more comprehensively.
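A clustering layer that is initialized with K-means and trained by backpropagation can be sketched as follows. The sketch assumes the widely used DEC-style formulation (Student's t soft assignment, a sharpened target distribution, and a KL-divergence loss); this section does not spell out the exact formulation used, so the formulas, function names, and toy data here are assumptions.

```python
from math import log
from typing import List, Sequence

Vector = Sequence[float]


def soft_assign(embeddings: Sequence[Vector], centers: Sequence[Vector], alpha: float = 1.0) -> List[List[float]]:
    """q_ij: Student's t similarity between embedding i and cluster center j, row-normalized."""
    q = []
    for z in embeddings:
        weights = []
        for mu in centers:
            dist2 = sum((a - b) ** 2 for a, b in zip(z, mu))
            weights.append((1.0 + dist2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(weights)
        q.append([w / total for w in weights])
    return q


def target_distribution(q: List[List[float]]) -> List[List[float]]:
    """p_ij: squares q and divides by the soft cluster frequency, then row-normalizes."""
    freq = [sum(row[j] for row in q) for j in range(len(q[0]))]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(len(row))]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p: List[List[float]], q: List[List[float]]) -> float:
    """Clustering loss KL(P || Q), summed over samples and clusters."""
    return sum(pij * log(pij / qij)
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row) if pij > 0.0)


if __name__ == "__main__":
    # Toy embeddings and centers; in practice the centers would come from K-means.
    embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
    centers = [[0.0, 0.0], [5.0, 5.0]]
    q = soft_assign(embeddings, centers)
    p = target_distribution(q)
    print("clustering loss:", kl_divergence(p, q))
```

In a full model the cluster centers are trainable parameters, and the gradient of this loss flows back through the embedding layer into the encoder.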

Data availability
The datasets analysed during the current study are available in the Search Computational Intelligence Group
repository, https://www.ehu.eus/ccwintco/index.php?title=Hyperspectral_Remote_Sensing_Scenes.

Received: 4 December 2023; Accepted: 14 February 2024


Author contributions
Conceptualization, Z.Z.; methodology, Z.Z. and S.Z.; software, Z.Z.; validation, Z.Z.; formal analysis, H.L and
Q.Y.; investigation, Q.Y.; writing-original draft preparation, L.W.; writing-review and editing, Z.Z. and S.Z.;
visualization, Z.Z. and Q.Y.; funding acquisition, Z.Z. and S.Z. All authors have read and agreed to the
published version of the manuscript.

Funding
This work was funded by the Research Foundation of Education Department of Hunan Province of China under
Grant No. 22A0371 and by the Graduate Research Project of Jishou University under Grant No. jdy22024.

Competing interests
The authors declare no competing interests.

Additional information
Correspondence and requests for materials should be addressed to S.Z.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.


Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons licence, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Scientific Reports | (2024) 14:4209 | https://doi.org/10.1038/s41598-024-54547-2 13

Vol.:(0123456789)

MambaHSI SpatialSpectral Mamba For Hyperspectral Image Classification
No ratings yet
MambaHSI SpatialSpectral Mamba For Hyperspectral Image Classification
16 pages
ML MCQ
100% (4)
ML MCQ
31 pages
Learning high-level spectral-spatial features for hyperspectral image classification with insufficient labeled samples
No ratings yet
Learning high-level spectral-spatial features for hyperspectral image classification with insufficient labeled samples
9 pages
Hyperspectral Image Classification With Spectral-Spatial Feature Integration and Ensemble Learning
No ratings yet
Hyperspectral Image Classification With Spectral-Spatial Feature Integration and Ensemble Learning
12 pages
Remote Sensing: An Enhanced Spectral Fusion 3D CNN Model For Hyperspectral Image Classification
No ratings yet
Remote Sensing: An Enhanced Spectral Fusion 3D CNN Model For Hyperspectral Image Classification
24 pages
Remote Sensing: Spectral-Spatial Classification of Hyperspectral Imagery With 3D Convolutional Neural Network
No ratings yet
Remote Sensing: Spectral-Spatial Classification of Hyperspectral Imagery With 3D Convolutional Neural Network
21 pages
Survey Paper
No ratings yet
Survey Paper
35 pages
A convolution - Transformer Fusion Network for Hyperspectral Image Classification
No ratings yet
A convolution - Transformer Fusion Network for Hyperspectral Image Classification
21 pages
SpectralSpatial Morphological Attention Transformer For Hyperspectral Image Classification
No ratings yet
SpectralSpatial Morphological Attention Transformer For Hyperspectral Image Classification
15 pages
Zhang 2018
No ratings yet
Zhang 2018
12 pages
GAO 2020 Combining t-distributed stochastic (AAM)
No ratings yet
GAO 2020 Combining t-distributed stochastic (AAM)
6 pages
DL For HSI - Review
No ratings yet
DL For HSI - Review
39 pages
Deep Feature Extraction and Classification of Hyperspectral Images Based On Convolutional Neural Networks
No ratings yet
Deep Feature Extraction and Classification of Hyperspectral Images Based On Convolutional Neural Networks
38 pages
A Survey of Deep Learning For Hyperspectral Image Classification
No ratings yet
A Survey of Deep Learning For Hyperspectral Image Classification
26 pages
2019 Deep Learning Ensemble for Hyperspectral Image Classification
No ratings yet
2019 Deep Learning Ensemble for Hyperspectral Image Classification
16 pages
Electronics 12 00488 v2
No ratings yet
Electronics 12 00488 v2
34 pages
Combining t-Distributed Stochastic Neighbor Embedding With Convolutional Neural Networks for Hyperspectral Image Classification
No ratings yet
Combining t-Distributed Stochastic Neighbor Embedding With Convolutional Neural Networks for Hyperspectral Image Classification
5 pages
201_final
No ratings yet
201_final
5 pages
PSASL Pixel-Level and Superpixel-Level Aware Subspace Learning For Hyperspectral Image Classification
No ratings yet
PSASL Pixel-Level and Superpixel-Level Aware Subspace Learning For Hyperspectral Image Classification
16 pages
Sample EIP-II Report
No ratings yet
Sample EIP-II Report
7 pages
FUZZY BASED HYPERSPECTRAL IMAGE SEGMENTATION USING SUBPIXEL DETECTION
No ratings yet
FUZZY BASED HYPERSPECTRAL IMAGE SEGMENTATION USING SUBPIXEL DETECTION
10 pages
Deep Feature Learning and Classification of Remote Sensing Images
No ratings yet
Deep Feature Learning and Classification of Remote Sensing Images
19 pages
GlobalLocal Multigranularity Transformer for Hyperspectral Image Classification
No ratings yet
GlobalLocal Multigranularity Transformer for Hyperspectral Image Classification
20 pages
Spectral-Spatial Hyperspectral Image Classification With Edge-Preserving Filtering
No ratings yet
Spectral-Spatial Hyperspectral Image Classification With Edge-Preserving Filtering
12 pages
R&D HiFACE
No ratings yet
R&D HiFACE
5 pages
s41598-025-97052-w
No ratings yet
s41598-025-97052-w
18 pages
Hasan 2019 IOP Conf. Ser. Earth Environ. Sci. 357 012035
No ratings yet
Hasan 2019 IOP Conf. Ser. Earth Environ. Sci. 357 012035
11 pages
2017 Multiple Kernel Learning for Hyperspectral Image Classification A Review
No ratings yet
2017 Multiple Kernel Learning for Hyperspectral Image Classification A Review
19 pages
Paper 82-Hyperspectral Image Classification
No ratings yet
Paper 82-Hyperspectral Image Classification
7 pages
Radiometric Indices-Based Spectro-Spatial Approach For Hyperspectral Image Classification
100% (1)
Radiometric Indices-Based Spectro-Spatial Approach For Hyperspectral Image Classification
15 pages
A Fast 3D CNN For Hyperspectral Image Classification: Muhammad Ahmad
No ratings yet
A Fast 3D CNN For Hyperspectral Image Classification: Muhammad Ahmad
5 pages
When Hyperspectral Image Classification Meets Diffusion Models: An Unsupervised Feature Learning Framework
No ratings yet
When Hyperspectral Image Classification Meets Diffusion Models: An Unsupervised Feature Learning Framework
13 pages
Base Paper
No ratings yet
Base Paper
16 pages
HybridCNN Based Hyperspectral Image Classification Using Multiscalespatiospectral Features
No ratings yet
HybridCNN Based Hyperspectral Image Classification Using Multiscalespatiospectral Features
10 pages
Sensors: Comparison of CNN Algorithms On Hyperspectral Image Classification in Agricultural Lands
No ratings yet
Sensors: Comparison of CNN Algorithms On Hyperspectral Image Classification in Agricultural Lands
17 pages
Hyperspectral Image Classification Based On Deep Attention Graph Convolutional Network
No ratings yet
Hyperspectral Image Classification Based On Deep Attention Graph Convolutional Network
16 pages
Aplicaciones de Aprendizaje Profundo para Imágenes Hiperespectrales-Una Revisión Sistemática
No ratings yet
Aplicaciones de Aprendizaje Profundo para Imágenes Hiperespectrales-Una Revisión Sistemática
18 pages
A Multiscale Dual-Branch Feature Fusion and Attention Network For Hyperspectral Images Classification
No ratings yet
A Multiscale Dual-Branch Feature Fusion and Attention Network For Hyperspectral Images Classification
13 pages
4.Final Version
No ratings yet
4.Final Version
18 pages
Ijecet 08 01 003 PDF
No ratings yet
Ijecet 08 01 003 PDF
14 pages
2018 Recent Advances on Spectral–Spatial Hyperspectral Image Classification An Overview and New Guidelines
No ratings yet
2018 Recent Advances on Spectral–Spatial Hyperspectral Image Classification An Overview and New Guidelines
19 pages
Kumar 2021 J. Phys. - Conf. Ser. 1950 012087
No ratings yet
Kumar 2021 J. Phys. - Conf. Ser. 1950 012087
13 pages
A Lightweight Transformer Network For Hyperspectral Image Classification
No ratings yet
A Lightweight Transformer Network For Hyperspectral Image Classification
17 pages
s11042-022-13959-w
No ratings yet
s11042-022-13959-w
54 pages
Hyperspectral Image Fundamentals2018
100% (1)
Hyperspectral Image Fundamentals2018
24 pages
WaveFormer SpectralSpatial Wavelet Transformer For Hyperspectral Image Classification
No ratings yet
WaveFormer SpectralSpatial Wavelet Transformer For Hyperspectral Image Classification
5 pages
Full Document - Hyperspectral PDF
No ratings yet
Full Document - Hyperspectral PDF
96 pages
Dual-Branch_Domain_Adaptation_Few-Shot_Learning_for_Hyperspectral_Image_Classification
No ratings yet
Dual-Branch_Domain_Adaptation_Few-Shot_Learning_for_Hyperspectral_Image_Classification
16 pages
Koumoutsou 2020
No ratings yet
Koumoutsou 2020
8 pages
Chen 2016
No ratings yet
Chen 2016
20 pages
Hierarchical Attention Transformer For Hyperspectral Image Classification
No ratings yet
Hierarchical Attention Transformer For Hyperspectral Image Classification
5 pages
Hyperspectral Remote Sensing Data Analysis and Future Challenges
No ratings yet
Hyperspectral Remote Sensing Data Analysis and Future Challenges
31 pages
IET Image Processing - 2019 - Hamouda - Hyperspectral Imaging Classification Based On Convolutional Neural Networks by
No ratings yet
IET Image Processing - 2019 - Hamouda - Hyperspectral Imaging Classification Based On Convolutional Neural Networks by
7 pages
Journal 2
No ratings yet
Journal 2
11 pages
14bce019 14bce023 Hyperspectral Image Classification
No ratings yet
14bce019 14bce023 Hyperspectral Image Classification
34 pages
Deep Learning Meets Hyperspectral Image Analysis: A Multidisciplinary Review
No ratings yet
Deep Learning Meets Hyperspectral Image Analysis: A Multidisciplinary Review
32 pages
Dimensionality Reduction Techniques For Hyperspectral Images
No ratings yet
Dimensionality Reduction Techniques For Hyperspectral Images
8 pages
AUTOMATIC TARGET DETECTION IN HYPERSPECTRAL IMAGES USING NEURAL NETWORK
No ratings yet
AUTOMATIC TARGET DETECTION IN HYPERSPECTRAL IMAGES USING NEURAL NETWORK
8 pages
Automatic Target Detection in
No ratings yet
Automatic Target Detection in
8 pages
Hybrid Mamba Transformer Paper
No ratings yet
Hybrid Mamba Transformer Paper
32 pages
Machine Learning - Advanced Concepts
From Everand
Machine Learning - Advanced Concepts
Derrick Mwiti
No ratings yet
Sat - 6.Pdf - Prediction of Modernized Loan Approval System Based On Machine Learning Approach
No ratings yet
Sat - 6.Pdf - Prediction of Modernized Loan Approval System Based On Machine Learning Approach
11 pages
"Android Malware Detection": Mini Project 2-B Report
No ratings yet
"Android Malware Detection": Mini Project 2-B Report
24 pages
ENS243 - Image Classification - LEC 7
No ratings yet



Scientific Reports | (2024) 14:4209 | https://doi.org/10.1038/s41598-024-54547-2 1

where ⌊·⌋ means the round-to-zero process.

PCA and t‑SNE

Dimensionality reduction by PCA and t‑SNE

Integration of 3D‑ACAE in deep clustering

Loss function of deep clustering

where P is the target distribution, defined as:
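
The role of the target distribution can be illustrated with a short sketch. It assumes the formulation widely used in DEC-style deep clustering, in which each soft assignment q_ij is squared and normalized by the soft cluster frequency f_j = Σ_i q_ij; the exact definition used in this paper may differ. The function name `target_distribution` is hypothetical, and every cluster is assumed to receive some nonzero soft assignment.

```python
def target_distribution(q):
    """Sharpen soft assignments q (n_samples x n_clusters, rows sum to 1).

    Assumed DEC-style form (not confirmed by the text above):
        p_ij = (q_ij^2 / f_j) / sum_j' (q_ij'^2 / f_j'),  f_j = sum_i q_ij
    """
    n_clusters = len(q[0])
    # Soft cluster frequencies: total assignment mass per cluster.
    f = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        # Squaring emphasizes confident assignments; dividing by f_j
        # keeps large clusters from dominating.
        weights = [row[j] ** 2 / f[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


# Two samples, two clusters (illustrative values only).
q = [[0.6, 0.4], [0.3, 0.7]]
p = target_distribution(q)
```

Under this assumption each row of `p` still sums to one, but the dominant assignment in each row is larger than in `q`, which is what pushes the embedding toward more confident cluster assignments during training.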

Spatial‑spectral attention module

Figure 2. The structure of spatial–spectral attention module.

Output = Attention ⊗ F (15)
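
As a minimal sketch of Eq. (15), assuming ⊗ denotes element-wise multiplication between an attention map and a feature cube F of the same shape. Nested lists stand in for tensors here, the values are illustrative, and the helper name `apply_attention` is hypothetical.

```python
def apply_attention(attention, features):
    """Element-wise product of two equally shaped 3D nested lists."""
    return [
        [
            [a * f for a, f in zip(a_row, f_row)]
            for a_row, f_row in zip(a_plane, f_plane)
        ]
        for a_plane, f_plane in zip(attention, features)
    ]


# A 2 x 2 x 2 feature cube and an attention map with weights in [0, 1].
features = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
attention = [[[1.0, 0.5], [0.0, 1.0]], [[0.5, 0.5], [1.0, 0.0]]]
output = apply_attention(attention, features)
```

Positions with an attention weight near one pass through almost unchanged, while positions with a weight near zero are suppressed.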

Experimental results and analysis

Figure 3. OA, NMI, and ARI results with different γ values.

Figure 4. OA, NMI, and ARI results with different W′ values.

Comparison experimental results

[Figure panels: FCM, SC, GCSC, GCOT, DRNNC, Proposed]

Dataset FCM SC GCSC GCOT DRNNC Proposed method

Table 5. Comparison of running times of different clustering methods.

Received: 4 December 2023; Accepted: 14 February 2024

© The Author(s) 2024
