Hierarchical Guidance Filtering

Hierarchical Guidance Filtering-Based Ensemble Classification for Hyperspectral Images
Article in IEEE Transactions on Geoscience and Remote Sensing · April 2017

DOI: 10.1109/TGRS.2017.2689805

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 55, NO. 7, JULY 2017 4177
Bin Pan, Zhenwei Shi, Member, IEEE, and Xia Xu
Abstract— Joint spectral and spatial information should be fully exploited in order to achieve accurate classification results for hyperspectral images. In this paper, we propose an ensemble framework that combines spectral and spatial information at different scales. The motivation of the proposed method derives from a basic idea: by integrating many individual learners, ensemble learning can achieve better generalization ability than a single learner. In the proposed work, the individual learners are obtained from joint spectral-spatial features generated at different scales. Specifically, we develop two techniques to construct the ensemble model, namely, hierarchical guidance filtering (HGF) and matrix of spectral angle distance (mSAD). HGF and mSAD are combined via a weighted ensemble strategy. HGF is a hierarchical edge-preserving filtering operation, which can produce diverse sample sets. Meanwhile, in each hierarchy, different spatial contextual information is extracted. As the hierarchy increases, the pixel spectra tend to become smooth, while the spatial features are enhanced. Based on the outputs of HGF, a series of classifiers can be obtained. Subsequently, we define a low-rank matrix, mSAD, to measure the diversity among training samples in each hierarchy. Finally, an ensemble strategy is proposed using the obtained individual classifiers and mSAD. We term the proposed method HiFi-We. Experiments are conducted on two popular data sets, Indian Pines and Pavia University, as well as a challenging hyperspectral data set used in the 2014 Data Fusion Contest (GRSS_DFC_2014). An effectiveness analysis of the ensemble strategy is also presented.
Index Terms— Ensemble learning, hierarchical guidance filtering (HGF), hyperspectral image (HSI) classification.

Manuscript received March 1, 2017; revised March 19, 2017; accepted March 28, 2017. Date of publication April 24, 2017; date of current version June 22, 2017. This work was supported in part by the National Natural Science Foundation of China under Grant 61671037, in part by the Beijing Natural Science Foundation under Grant 4152031, and in part by the funding project of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, under Grant BUAA-VR-16ZZ-03. (Corresponding author: Zhenwei Shi.)
The authors are with the Image Processing Center, School of Astronautics, Beihang University, Beijing 100191, China, the Beijing Key Laboratory of Digital Media, Beihang University, Beijing 100191, China, and also with the State Key Laboratory of Virtual Reality Technology and Systems, School of Astronautics, Beihang University, Beijing 100191, China (e-mail: panbin@buaa.edu.cn; shizhenwei@buaa.edu.cn; xuxia@buaa.edu.cn).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TGRS.2017.2689805

I. INTRODUCTION
Hyperspectral sensors can provide images with hundreds of continuous spectral bands as well as high spatial resolution. During the past two decades, hyperspectral image (HSI) processing techniques have been widely used in many fields, such as spectral unmixing [1], mineral identification [2], and environmental monitoring [3]. To better utilize HSI data, many HSI processing techniques have been developed. One of the most important is per-pixel classification, i.e., assigning a unique class label to each pixel. However, HSI classification is still challenging for many reasons, such as the Hughes phenomenon [4].
A popular strategy to improve the classification accuracy is designing multifeature systems. Gu et al. [5] proposed a multiple-kernel learning method by extracting the variation from different feature spaces. Gu et al. [6] improved the multikernel models using low-rank nonnegative matrix factorization. In [7] and [8], spatial information was utilized to enhance the performance of multiple-kernel models. Vector stacking is also a typical approach to the multifeature problem; it refers to concatenating the multiple features and feeding them into a single classifier. Chen et al. [9] combined the magnitude and shape feature spaces via stacked generalization. Huang and Zhang [10] compared the performance of vector stacking with other multifeature methods. However, the vector stacking approach does not necessarily lead to better results, because studies have shown that the classification accuracy may vary as a function of the number of selected features [11].
Recently, ensemble learning-based methods were developed for HSI classification. By integrating many individual learners, ensemble learning can achieve better generalization performance [12]. Dietterich [13] considered that combining individual learners could outperform a single learner mainly for three reasons: first, a single learner may fall into local minima; second, the ensemble strategy could slightly expand the hypothesis space; finally, because several hypotheses may achieve the same performance on the training set, combining many individual learners could reduce the risk of choosing a false hypothesis. For the task of HSI classification, researchers have proposed many ensemble learning-based methods. Random forest methods are typical ensemble approaches, and the use of random forest was investigated in [14]–[17], etc. Xia et al. [18] utilized rotation forest [19] for HSI classification and achieved better results than random forest. Support vector machine (SVM) is also studied in some ensemble-based HSI classification methods [20]. Pal [21] proposed two ensemble approaches based on SVM using boosting and bagging. Huang and Zhang [22] combined the spectral, structural, and semantic features to construct an SVM ensemble approach. Santos et al. [23] combined six different classification models based on SVM and multilayer perceptron neural network. Xia et al. [24] proposed a rotation-based SVM ensemble strategy with limited training samples. Diverse ensemble-based HSI classification methods were reported in [25]–[28], etc.
Many studies have also demonstrated that the use of spatial information could significantly improve the classification accuracy [29]–[34]. A common strategy to express the spatial information is using a neighborhood system [31], for example, Markov random field [35], [36] and attribute profile [37]–[39].
Wavelet-based and Gabor-based methods are also reported in
many works. Based on the properties of HSI data, 3-D wavelet
is used to extract the texture feature [40]–[42]. Li and Du [43]
proposed a Gabor-filtering-based method using the nearest
regularized subspace (NRS). More recently, edge-preserving
filtering (EPF) has become an active research topic in natural
scene image processing [44]–[49]. The basic idea of EPF
is to remove small details and noise from the image while
preserving large-scale edges automatically. In [50], EPF is
used to address the task of HSI classification for the first time.
Xia et al. [28] combined independent component analysis
and EPF via an ensemble strategy. However, since EPF is
still a kind of smoothing filtering method, it is difficult to
determine what level of filtering is the most appropriate.
Stronger smoothing could result in better spatial represen-
tation, but at the same time lead to more loss of spectral
information.
In this paper, we present a novel ensemble learning-based
HSI classification method, which is composed of joint spectral-
spatial features of different scales. First, in order to exploit
the joint spectral-spatial information, we propose a hierar-
chical feature extraction strategy, hierarchical guidance filter-
ing (HGF). HGF is an extension of guided filtering (GF) [44]
and rolling guidance filtering (RGF) [46], which is able to
generate a series of joint spectral-spatial features. Spatial
contextual information of different scales is obtained by the
filtering in different hierarchies. Second, instead of using com-
plicated optimization techniques, we define a metric matrix,
matrix of spectral angle distance (mSAD), to evaluate the
feature quality in each hierarchy. Based on the obtained hierar-
chical features and the evaluation results, a popular ensemble
strategy, the weighted voting, is employed to determine the
final classification results. We term the proposed method as
HiFi-We.
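The hierarchical filtering idea described above can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation: it works on a 1-D toy signal instead of 2-D image bands, and it assumes the standard guided-filter coefficients (ridge regression per window, followed by averaging the coefficients of all windows that cover a sample). The function names `box_mean`, `guided_filter_1d`, and `hierarchical_guidance_filter` are hypothetical, and the default values of `r` and `eps` are only illustrative.

```python
def box_mean(values, r):
    """Mean of values over the window [i - r, i + r], clipped at the borders."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def guided_filter_1d(signal, guide, r=1, eps=0.01):
    """One guided-filter pass on a 1-D signal (toy stand-in for one band)."""
    mean_g = box_mean(guide, r)
    mean_i = box_mean(signal, r)
    mean_gi = box_mean([g * s for g, s in zip(guide, signal)], r)
    mean_gg = box_mean([g * g for g in guide], r)
    # Per-window ridge-regression coefficients: a = cov(G, I) / (var(G) + eps).
    a = [(gi - mg * mi) / (gg - mg * mg + eps)
         for gi, mg, mi, gg in zip(mean_gi, mean_g, mean_i, mean_gg)]
    b = [mi - ak * mg for mi, ak, mg in zip(mean_i, a, mean_g)]
    # Assumption: average the coefficients of all windows covering each sample.
    mean_a, mean_b = box_mean(a, r), box_mean(b, r)
    return [ma * g + mb for ma, g, mb in zip(mean_a, guide, mean_b)]


def hierarchical_guidance_filter(signal, guide, levels, r=1, eps=0.01):
    """Return the output of every hierarchy; level t filters the output of t-1."""
    outputs, current = [], list(signal)
    for _ in range(levels):
        current = guided_filter_1d(current, guide, r, eps)
        outputs.append(current)
    return outputs


# Toy data: a clean step edge as guidance and a noisy step as input.
guide = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
noisy = [0.1, -0.1, 0.1, -0.1, 1.1, 0.9, 1.1, 0.9]
levels = hierarchical_guidance_filter(noisy, guide, levels=5)
for t, out in enumerate(levels, start=1):
    print(t, [round(v, 3) for v in out])
```

In this toy setting each hierarchy yields one filtered copy of the signal with the same length as the input, so one learner could be trained per level. The ripple inside the flat regions shrinks while the step in the guidance signal is kept.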
The initial motivations of this paper include two aspects. First, we want to combine the joint spectral-spatial information at different scales. The classification model should be determined from a more representative feature space. Second, spectral-spatial features extracted from different scales should have different contributions. More reliable and qualified features should get higher confidence. The solution follows naturally: we propose the HGF to obtain a series of spectral-spatial features from different scales; then, we design an ensemble model to simultaneously utilize these features; a new weighting method, mSAD, is also developed.
The major contributions of HiFi-We can be summarized as follows.
1) A new ensemble-based HSI classification method is proposed, where joint spectral-spatial information of different scales is combined.
2) We develop the HGF to extract more diverse spectral-spatial features.
3) The mSAD is designed and used to generate the weight coefficients in the ensemble model.
The remainder of this paper is organized as follows. In Section II, we give a detailed description of the proposed HSI classification method. Experimental results and analysis are presented in Section III. We conclude this paper in Section IV.
Fig. 1. Flowchart of HiFi-We.
II. PROPOSED METHOD
Ensemble learning can achieve better performance than the best single learner by combining many individual learners [12]. Based on this idea, we propose an HSI classification method using ensemble learning. The proposed ensemble method contains three components: HGF, mSAD, and weighted voting-based classification. HGF is developed to generate diverse joint spectral-spatial features. Based on HGF, an individual learner can be obtained in each hierarchy. Then, the mSAD is designed to evaluate the contribution of each individual learner. Finally, weighted voting is conducted to get the final classification results. The flowchart of the proposed method is shown in Fig. 1.
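The last component, weighted voting over the per-hierarchy learners, can be sketched as follows. This is a minimal illustration under stated assumptions: each learner is represented only by the class probabilities it outputs for one test sample, the weights and probabilities are made-up numbers rather than mSAD values or results from the paper, and the function name `weighted_soft_vote` is hypothetical.

```python
def weighted_soft_vote(class_probs, weights):
    """Combine per-learner class probabilities with per-learner weights.

    class_probs[t][c] is the probability that learner t assigns to class c.
    Returns the winning class index and the weighted score of every class.
    """
    n_classes = len(class_probs[0])
    scores = [0.0] * n_classes
    for w, probs in zip(weights, class_probs):
        for c, p in enumerate(probs):
            scores[c] += w * p
    label = max(range(n_classes), key=lambda c: scores[c])
    return label, scores


# Three hypothetical learners (one per hierarchy), two classes.
probs = [[0.60, 0.40], [0.45, 0.55], [0.30, 0.70]]
print(weighted_soft_vote(probs, [0.5, 0.3, 0.2]))  # moderate trust in learner 0
print(weighted_soft_vote(probs, [0.8, 0.1, 0.1]))  # strong trust in learner 0
```

With the first weight vector the combined scores favor class 1, while the second vector shifts enough weight to the first learner that class 0 wins. This is why the choice of per-hierarchy weights matters for the final label.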
Fig. 2. Examples of HGF and RGF for the Indian Pines data set. HGF and RGF are both conducted on band 2 of the data set. (a) Guidance image, using the first PC. (b) Original image in band 2. (c) Results of RGF, rolling five times. (d) Results of HGF, five hierarchies. (e) Results of HGF, 50 hierarchies.
A. HGF
As a kind of EPF, HGF is able to remove noise and small details while preserving the overall structure of the image. Therefore, it can be used to extract the spatial contextual information for HSI classification.
In the first hierarchy, we conduct a GF [44], [45] for each band of the HSI. Let $Q^p$ denote the $p$th band of the filtering output; then, the output of GF can be expressed by
$$Q_i^p = a_k^p G_i + b_k^p, \quad \forall i \in \omega_k \tag{1}$$
where $\omega_k$ is a window around pixel $k$ with size $(2r+1) \times (2r+1)$, $r$ is the window radius, $i$ is a pixel in $\omega_k$, $G$ is the guidance image, and $a_k^p$ and $b_k^p$ are the coefficients to be estimated. Equation (1) indicates that the output of the filtering is a linear transform of the guidance image. Taking the gradient of (1), we find that
$$\nabla Q_i^p = a_k^p \nabla G_i. \tag{2}$$
According to (2), the filtering output $Q^p$ has an edge only if $G$ also has an edge, which is the reason for edge preservation. Then, we need to determine the linear coefficients $a_k^p$ and $b_k^p$ based on the input HSI data $I$ and the guidance image $G$. The following cost function is minimized in the window $\omega_k$:
$$E(a_k^p, b_k^p) = \sum_{i \in \omega_k} \left( \left( a_k^p G_i + b_k^p - I_i^p \right)^2 + \epsilon \, (a_k^p)^2 \right) \tag{3}$$
where $\epsilon$ is a parameter controlling the degree of smoothing. A larger $\epsilon$ corresponds to stronger penalization of $a_k^p$, which leads to a smoother output. Equation (3) guarantees the similarity between the input and output of the filtering; meanwhile, noise and small details are removed. Equation (3) is a linear ridge regression [51], and thus, it can be solved by
$$a_k^p = \frac{\frac{1}{|\omega|} \sum_{i \in \omega_k} I_i^p G_i - \mu_k \bar{I}_k^p}{\sigma_k^2 + \epsilon}, \qquad b_k^p = \bar{I}_k^p - a_k^p \mu_k \tag{4}$$
where $\mu_k$ and $\sigma_k$ are the mean and standard deviation of $G$ in $\omega_k$, $\bar{I}_k^p$ is the mean value of $I^p$ in $\omega_k$, and $|\omega|$ is the number of pixels in $\omega_k$.
After obtaining $(a_k^p, b_k^p)$, the output $Q_i^p$ can be determined by (1). Then, we use $Q_i^p$ as the input of the next hierarchy. In other words, the outputs of the current hierarchy are considered as the inputs of the next hierarchy. From Fig. 2(d) and (e), we can find that the outputs in the 5th and the 50th hierarchies differ to a certain extent. These differences arise because HGF introduces some variations in both the spectral and spatial characteristics of HSI data. In this paper, we consider these variations as the joint spectral-spatial information at different scales. To ensure that the spectral information does not suffer severe loss after many hierarchies, we set the parameters $r$ and $\epsilon$ to values as small as possible, for example, $r = 1$ and $\epsilon = 0.01$.
Generally, the better the spatial smoothness, the greater the loss of spectral characteristics. It is quite difficult to determine what degree of smoothing is the best. In the proposed work, we try to address this problem via a hierarchy strategy, i.e., HGF. HGF is developed to enhance the diversity of samples, where we can get an individual learner based on the output data in every hierarchy. That is to say, for each individual learner, both the number and the dimensionality of the training samples stay the same. To some extent, HGF can be regarded as a linear transform of GF. Compared with some traditional subset selection methods, such as bootstrapping and band selection, using HGF can not only avoid the information loss in each individual learner, but also provide a richer feature expression.
The idea of HGF is similar to that of RGF [46]. However, at least two characteristics distinguish HGF from RGF. First, in HGF, we run a GF in each hierarchy, while RGF usually adopts joint bilateral filtering [52] in each iteration. Since GF is a linear transform, whereas joint bilateral filtering is based on a nonlinear model, HGF is more efficient than RGF. Note that RGF can also use GF in each rolling. However, as the number of rollings increases, the results of RGF become more blurry, as shown in [46]. In other words, GF is not suitable for RGF. More importantly, the guidance images used in HGF and RGF are different. Because RGF is originally designed for natural scene images where only one or three bands are observed, the spectral diversity and correlation are not considered. In RGF, the guidance image is the original input. In this case, as the number of rollings increases, the result image becomes more similar to the original image. However, if the original image is seriously polluted by noise, RGF can hardly lead to satisfying results, as shown in Fig. 2(b) and (c). In HGF, we use the first principal component (PC) of the HSI as the guidance image
considering that it could provide an ideal representation of the image. Therefore, in HGF, a higher hierarchy generates images more similar to the guidance image, namely, the first PC (Fig. 2). In Tables II and IV, experimental results also demonstrate the superiority of HGF when compared with RGF. Note that HGF is a global operation, which means that it should be conducted on both the training and testing sets.
Fig. 3. Spectral characteristics (a) before and (b) after HGF (50 hierarchies). We take class Soybean-notill in the Indian Pines data set as an example.
B. mSAD
Based on HGF, we can obtain many groups of features. The number of feature groups is determined by the number of hierarchies, i.e., the outputs of each hierarchy correspond to a certain group of features. However, the contributions of different groups may not be equal. Generally, features with high quality should receive greater weights. Here, we define the term mSAD to represent the "quality" of samples. The mSAD is based on the assumption that the samples of the same class should present similar spectral characteristics. For example, samples in Fig. 3(b) are closer to each other than those in Fig. 3(a), i.e., samples in Fig. 3(b) have higher quality. In this case, the power of feature expression is enhanced; meanwhile, the number of training samples required is reduced.
Let $X_c = [x_1, x_2, \cdots, x_i, \cdots, x_n]$ denote a group of the training samples in class $c$, where $x_i \in \mathbb{R}^{L \times 1}$ is a pixel spectrum with $L$ bands, and $n$ is the number of training samples in class $c$. The SAD between two spectral vectors $x_i$ and $x_j$ can be expressed by
$$\mathrm{SAD}(x_i, x_j) = \arccos\left( \frac{x_i^T x_j}{\|x_i\|_2 \cdot \|x_j\|_2} \right). \tag{5}$$
SAD can be used to measure the degree of difference between two pixels, where a lower value corresponds to a smaller difference. Based on SAD, we first obtain a square matrix
$$\hat{S}_c = \begin{bmatrix} \mathrm{SAD}(x_1, x_1) & \cdots & \mathrm{SAD}(x_1, x_n) \\ \vdots & & \vdots \\ \mathrm{SAD}(x_n, x_1) & \cdots & \mathrm{SAD}(x_n, x_n) \end{bmatrix} \in \mathbb{R}^{n \times n}. \tag{6}$$
Then, we define the mSAD for $X_c$ by removing the diagonal elements from $\hat{S}_c$
$$S_c = [s_{ij}] = \begin{bmatrix} \mathrm{SAD}(x_1, x_2) & \cdots & \mathrm{SAD}(x_1, x_n) \\ \vdots & & \vdots \\ \mathrm{SAD}(x_n, x_1) & \cdots & \mathrm{SAD}(x_n, x_{n-1}) \end{bmatrix} \in \mathbb{R}^{n \times (n-1)}. \tag{7}$$
$S_c$ is the mSAD for class $c$. Ideally, $S_c$ should be the zero matrix $O_{n \times (n-1)}$, i.e., all the samples in the training set are the same. According to the hypothesis that the testing set shares the same distribution as the training set, samples in the testing set are also the same as those in the training set, or at least very similar. In this case, only limited samples are necessary for training a powerful model. In real HSI data, this ideal situation is impossible. However, since samples in the same class usually present close spectral characteristics, $S_c$ should be low rank. Fewer outliers correspond to a lower rank of $S_c$. Therefore, we use the rank of $S_c$ to measure the quality of training samples. Usually, the rank of $S_c$ is relaxed by the nuclear norm [53], that is
$$R_c = \mathrm{rank}(S_c) \doteq \|S_c\|_* = \sum_i \sigma_i(S_c) \tag{8}$$
where $\sigma_i(S_c)$ denotes the singular values of $S_c$, and $R_c$ is the nuclear norm of $S_c$. A higher $R_c$ indicates that samples in class $c$ are more discrepant. In this case, we can consider that these samples have low "quality," and vice versa.
Because there are many classes in HSI data, we use the mean value of all $R_c$ to calculate the weight. Based on (5)–(8), the weight of the $t$th hierarchy can be obtained by
$$\omega_t = \left( \frac{1}{C} \sum_{c=1}^{C} R_c \right)^{-1} \tag{9}$$
where $C$ is the number of classes in an HSI. Note that the reciprocal is adopted in (9), because the weight values are negatively related to the within-class diversity.
C. Weighted Voting for Classification
Research has shown that although the class posterior probabilities estimated by individual classifiers are often not very accurate, soft voting usually presents better performance than
hard voting [12]. Therefore, in this paper, we adopt a soft voting strategy to determine the labels of test samples. In each hierarchy, we use a logistic regression (softmax) classifier to obtain the class posterior probabilities for a test sample. Let $h_t$ denote the classifier in the $t$th hierarchy and $h_t^c(x) \in [0, 1]$ denote the probability of classifying sample $x$ to class $c$. Then, the final classification result for $x$ is determined by
$$H(x) = \arg\max_c \sum_{t=1}^{T} \omega_t h_t^c(x) \tag{10}$$
where $T$ is the number of hierarchies, and $H(x)$ is the predicted label. In different hierarchies, the features "$x$" are not the same; $x$ is determined by the outputs of HGF. Note that logistic regression is one available classifier, but not the only one. Algorithm 1 depicts the overall process of the proposed method.
Algorithm 1 Proposed HiFi-We Method
Input: I, T, r, ε, h(·)
Initialize: training set, testing set
1. HGF
   Obtain G based on PCA for I
   For t = 1:T
      Generate Q_t by Eq. (1)-(4)
   End for
2. mSAD
   For t = 1:T
      Determine ω_t by Eq. (5)-(9)
   End for
3. Weighted voting
   Classify by h_t(·)
   Combine ω_t and the results of h_t(·) by Eq. (10)
Output: Ensemble-based classification results for I
Readers may doubt the effectiveness of the above-mentioned ensemble strategy. According to [12], the generalization error of an ensemble model is determined by
$$E = \bar{E} - \bar{A} \tag{11}$$
where $\bar{E}$ is the average error of individual learners, $\bar{A}$ denotes the average ensemble "ambiguity," and $E$ is the generalization error. For a given test sample $x$, $A(x)$ is obtained by
$$A(x) = \sum_{t=1}^{T} \omega_t \left( h_t(x) - H(x) \right)^2. \tag{12}$$
Equation (11) is called the error-ambiguity decomposition, which demonstrates that an ensemble model is effective as long as $\bar{E}$ is reduced and $\bar{A}$ is increased. Unfortunately, (11) cannot be optimized directly, because $\bar{A}$ is obtained only after the ensemble model is determined [12]. Moreover, it is hard to extend (11) and (12) from regression to the classification task. In this paper, the statistical significance analysis and the experimental discussion are conducted to verify the effectiveness of the proposed ensemble approach. Details are shown in Section III-D.
III. EXPERIMENTS AND DISCUSSION
A. Experimental Setup
In this section, we conduct three types of experiments. We have published the MATLAB demo on our homepage (http://levir.buaa.edu.cn/Code.htm).
First, we compare the proposed HiFi-We with some state-of-the-art HSI classification algorithms, including GCK [7], NSSNet [34], NRS [43], EPF-G [50], and IIDF [54]. GCK is a multiple-kernel learning-based method. NRS is developed using Gabor filtering. EPF-G and IIDF are based on EPF (we use GF in this experiment) and intrinsic image decomposition, respectively. NSSNet is a recently proposed deep-learning-based HSI classification approach. Overall accuracy (OA), average accuracy (AA), and Kappa coefficient (κ) are used to evaluate the performance of all the methods. In order to verify the effectiveness of "HGF" and "mSAD," we replace HGF by RGF (RGF-W), and we also use simple majority voting in place of the mSAD weights (HGF-V). HGF-V and RGF-W can be regarded as variants of HiFi-We. The performance of RGF-W and HGF-V is also reported.
Second, the influence of important parameters is discussed. The parameters used in HiFi-We are listed in Table I.
TABLE I
PARAMETERS USED IN HiFi-We
Finally, we give a statistical evaluation of the effectiveness of the proposed ensemble strategy. Furthermore, we also verify that the improvement achieved by the proposed method is significant.
Three data sets are used in our experiments, namely, Indian Pines, Pavia University (both available at http://www.ehu.eus/ccwintco/index.php?title=Hyperspectral_Remote_Sensing_Scenes), and GRSS_DFC_2014 [55].
1) Indian Pines is a widely used data set for HSI classification. It was acquired by the airborne visible/infrared imaging spectrometer in Northwestern Indiana, with wavelengths ranging from 0.4 to 2.5 μm. The spatial resolution is 20 m, and the size is 145 × 145 pixels. After removing the water absorption bands, 200 spectral bands remain. In total, 10 249 pixels are labeled, and they are classified into 16 classes. Fig. 4(a) and (b) shows the false color composite image and the corresponding ground truth for this data.
2) Pavia University data set was collected over the city of Pavia, Italy, by the reflective optics system imaging spectrometer (ROSIS-3) sensor. This data set contains 42 776 labeled pixels, which belong to nine different classes. It has 1.3-m spatial resolution and a size of 610 × 340 pixels. After removing the noise bands, 103 channels are preserved. A false color composite image and the corresponding ground truth image are shown in Fig. 5(a) and (b).
3) GRSS_DFC_2014 is the long-wave infrared (LWIR, thermal infrared) hyperspectral data set used in the 2014 IEEE GRSS Data Fusion Contest. It was acquired by an 84-channel airborne LWIR hyperspectral imager covering an urban area near Thetford Mines in Québec, Canada, with wavelengths between 7.8 and 11.5 μm and approximately 1-m spatial resolution. The size of this data set is 795 × 564 pixels; 22 532 labeled pixels and a ground truth with seven land cover classes are provided. Since this data set is collected from LWIR bands, its quality is much lower than that of Indian Pines. Therefore, this data set is more challenging. A false color composite image and the corresponding ground truth are shown in Fig. 6(a) and (b).
Fig. 4. Indian Pines data set. (a) False color composite image. R-G-B = bands 36-17-11. (b) Ground truth. Each color corresponds to a specific class. Results by (c) GCK, (d) NRS, (e) EPF-G, (f) IIDF, (g) NSSNet, (h) RGF-W, (i) HGF-V, and (j) HiFi-We.
Fig. 5. Pavia University data set. (a) False color composite image. R-G-B = bands 10-27-46. (b) Ground truth. Each color corresponds to a specific class. Results by (c) GCK, (d) NRS, (e) EPF-G, (f) IIDF, (g) NSSNet, (h) RGF-W, (i) HGF-V, and (j) HiFi-We.
Fig. 6. GRSS_DFC_2014 data set. (a) False color composite image. R-G-B = bands 30-45-66. (b) Ground truth. Each color corresponds to a specific class. Results by (c) GCK, (d) NRS, (e) EPF-G, (f) IIDF, (g) NSSNet, (h) RGF-W, (i) HGF-V, and (j) HiFi-We.
B. Classification Results
We validate the superiority of the proposed HiFi-We method on the three data sets. All the methods are run 50 times and the average results are reported. The standard deviations of HiFi-We are also reported. The number of hierarchies for the Indian Pines, Pavia University, and GRSS_DFC_2014 data sets is set to 80, 20, and 20, respectively.
1) Results on Indian Pines Data Set: The Indian Pines data set is widely used in many works. Here, we only use 20 samples in each class for training, and the rest for testing. Fig. 4(c)–(j) shows the overall classification maps of all the compared methods. We can see that strong spatial correlation is observed. In Table II, the quantitative results of different methods are reported. Compared with GCK, EPF, IIDF, and NSSNet, the proposed method achieves about 4% advantage in OA, AA, and κ. Among all the 16 classes, HiFi-We performs the best in eight classes. In particular, 13 of the classes present accuracies over 90%. RGF-W and HGF-V also perform well; however, slight gaps can still be observed. These results indicate that the improvements in HGF and mSAD are valid. It seems that HGF contributes more to the final results.
Fig. 7(a) shows the influence of training sample size on this data set. All the compared methods achieve similar accuracies, especially when the number of training samples is large. The advantage of HiFi-We is mainly reflected in the case of limited samples, such as 10–20 per class. Performing well with limited samples is meaningful, since a very simple method may also work well with abundant samples.
TABLE II
CLASSIFICATION ACCURACIES OF DIFFERENT METHODS ON INDIAN PINES DATA SET (%)
Fig. 7. Influence of training samples on (a) Indian Pines, (b) Pavia University, and (c) GRSS_DFC_2014 data sets.
2) Results on Pavia University Data Set: Fig. 5(c)–(j) and Table III show the classification results on the Pavia University data set. Compared with Indian Pines, all the methods perform better on the Pavia University data set. This may be because the latter has higher spatial resolution. The HiFi-We method still outperforms the others, and the superiority in AA is more obvious. When training samples are limited, the AA becomes a more important measure. Since several methods have achieved accuracies above 93%, advantages of 1.5%–2% are also significant in this case. Among all the nine classes, HiFi-We performs best in four, and exceeds 90% in eight. The influence of training sample size on this data set is shown in Fig. 7(b). Increasing the number of training samples leads to better performance. When the number of training samples reaches 50 per class, HiFi-We presents nearly 98% κ.
TABLE III
CLASSIFICATION ACCURACIES OF DIFFERENT METHODS ON PAVIA UNIVERSITY DATA SET (%)
3) Results on GRSS_DFC_2014 Data Set: This data set is collected from LWIR bands, and the imaging quality is relatively low. Thus, this data set is more challenging. For all the compared methods, 50 samples in each class are used for training. The visual classification maps of all the methods are shown in Fig. 6(c)–(j). In Table IV, we report the objective evaluation of the classification accuracies. About 5% advantage is observed in OA, AA, and κ. HiFi-We presents the best performance in four classes, and above 80% accuracy in all the classes. Compared with RGF-W and HGF-V, HiFi-We outperforms them by 2%–5%. Both the HGF and mSAD work, and HGF still contributes more to the accuracies. A similar conclusion can be reached from the experiments on the Indian Pines and Pavia University data sets.
The influence of training sample size is shown in Fig. 7(c). In this data set, the gaps are more apparent. When the number of training samples is below 50 per class, the proposed method presents 5%–10% advantage. This is mainly because the quality of this data is relatively lower than the other two, while the HGF operation has actually removed some noise and improved the image quality.
TABLE IV
CLASSIFICATION ACCURACIES OF DIFFERENT METHODS ON GRSS_DFC_2014 DATA SET (%)
C. Parameters Analysis
As we mentioned earlier, to avoid the spectral information loss, the parameters $r$ and $\epsilon$ should be as small as possible. In this case, the spatial information of different scales is mainly extracted by different hierarchies. Therefore, the number of hierarchies $T$ is the most important parameter in HiFi-We. In addition, in Section II-B, we claim that the weighting strategy by (5)–(9) is meaningful. Here, we design a unified experiment to evaluate the influence of $T$ and the weights $\omega$, as shown in Fig. 8. This figure is also an illustration of the influence of the spatial information of different scales. We can see that the classification accuracies keep increasing over the initial hierarchies and then stabilize. This result indicates that the spatial information of different scales is not the same, i.e., it really has some influence on the final classification results. Note that the OA and κ in Fig. 8 correspond to the results of each individual learner. Sharp increases in OA and κ are observed at the first several hierarchies. For the Indian Pines data set, the OA and κ remain stable after the 20th hierarchy. Similar phenomena appear for the Pavia University and GRSS_DFC_2014 data sets after the 15th and 10th hierarchies, respectively. Further increasing the number of hierarchies contributes little to the individual learners. Moreover, we find that the curves of $\omega$ present a similar tendency to OA and κ. In general, the hierarchical strategy does lead to diverse classification results; at the same time, the accuracies and the weights follow a consistent trend.
Fig. 8. Influence of T and ω. Results on (a) Indian Pines, (b) Pavia University, and (c) GRSS_DFC_2014 data sets. OA and κ correspond to the left coordinate axis, and ω corresponds to the right one.
D. Statistical Evaluation
In this part, we give a statistical evaluation of the effectiveness of the ensemble strategy. According to (11), the classification error can be reduced by the ensemble, as long as $\bar{A}$ is increased and $\bar{E}$ is reduced as the number of individual learners grows. For $\bar{A}$, we use McNemar's test [56] to evaluate the difference between individual learners. McNemar's test has been widely used in many ensemble-based works [22], [24], [28], and is defined by
$$Z = \frac{f_{12} - f_{21}}{\sqrt{f_{12} + f_{21}}} \tag{13}$$
where $f_{12}$ denotes the number of samples correctly classified by learner 1 but incorrectly by learner 2, and $f_{21}$ is defined analogously with the roles of the two learners swapped. The difference between learners 1 and 2 is statistically significant if $|Z|$ is above 1.96. The evaluation matrices for the data sets are shown in Tables V–VII and Fig. 9. Note that absolute values have been adopted, and only the first 20 learners are reported. In Fig. 9(a)–(c), we can clearly find that the values around the diagonal are relatively lower, and this is in line with our expectation. Most blocks in Fig. 9(a) are close to the deep color (red), which demonstrates that the diversity of the first 20 learners in the Indian Pines data set is strong. However, we notice that the diversity in Pavia University and
TABLE V
M CNEMAR ’ S T EST FOR I NDIAN P INES D ATA S ET
TABLE VI
M CNEMAR ’ S T EST FOR PAVIA U NIVERSITY D ATA S ET
Fig. 9. McNermar’s test for (a) Indian Pines data set, (b) Pavia University data set, and (c) GRSS_DFC_2014 data set.
GRSS_DFC_2014 presents significant decline after around tendency is shown that with the increase of hierarchy number,
tenth hierarchy. Thus, further increase hierarchies would not |Z | tends to decrease. The results indicate that the diversity
improve the performance of the ensemble model. Quantitative of individual learners is enhanced by HGF, at least in the
results are shown in Tables V–VII. In Table V, only two first 20 hierarchies. In Table VI, about 90% of all the values
pairs of learners score lower than 1.96. However, an obvious are above 1.96 (after removing the diagonal), which indicates
TABLE VII
M CNEMAR ’ S T EST FOR GRSS_DFC_2014 D ATA S ET
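As a minimal sketch of how the statistic in (13) and a pairwise |Z| matrix like the ones tabulated here could be computed, the following Python snippet counts the disagreements between two learners on a shared test set. The function names, the toy labels, and the convention of returning 0 when no sample is classified correctly by exactly one learner are assumptions made for illustration; they are not taken from the paper.

```python
import math


def mcnemar_z(y_true, pred_1, pred_2):
    """Z statistic of (13) for two learners evaluated on the same samples."""
    f12 = f21 = 0
    for truth, p1, p2 in zip(y_true, pred_1, pred_2):
        if p1 == truth and p2 != truth:
            f12 += 1  # learner 1 correct, learner 2 wrong
        elif p1 != truth and p2 == truth:
            f21 += 1  # learner 2 correct, learner 1 wrong
    if f12 + f21 == 0:
        # Assumed convention: no sample separates the two learners, so Z = 0.
        return 0.0
    return (f12 - f21) / math.sqrt(f12 + f21)


def abs_z_matrix(y_true, predictions):
    """Pairwise |Z| values for a list of learners' predictions."""
    return [[abs(mcnemar_z(y_true, a, b)) for b in predictions]
            for a in predictions]


# Toy labels, purely illustrative (not data from the paper).
y_true = [0, 1, 1, 2, 2, 0, 1, 2]
pred_1 = [0, 1, 1, 2, 2, 0, 1, 0]
pred_2 = [0, 2, 0, 2, 1, 1, 1, 2]

z = mcnemar_z(y_true, pred_1, pred_2)  # here f12 = 4 and f21 = 1
significant = abs(z) > 1.96            # threshold quoted in the text
matrix = abs_z_matrix(y_true, [pred_1, pred_2])
```

With so few toy samples the difference does not reach the 1.96 threshold; the point is only to show which counts enter the numerator and denominator.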
Furthermore, Fig. 8 shows that the classification accuracies keep growing as the hierarchy number increases. This result guarantees that Ē could be reduced after adding new individual learners. However, this does not mean that more individual learners will always lead to better accuracies. When the accuracy curves flatten, further increasing the number of individual learners may not reduce Ē; instead, it will harm the diversity of the ensemble system.
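The trade-off between Ē and Ā can be checked numerically. The sketch below assumes that (11), which lies outside this section, takes the standard error-ambiguity form E = Ē − Ā for a weighted-average ensemble under squared error; the paper's exact definition for classification may differ. The function name and the numbers are hypothetical.

```python
def ambiguity_decomposition(preds, weights, target):
    """Return (E, E_bar, A_bar) for one sample under squared error.

    E     : squared error of the weighted-average ensemble
    E_bar : weighted average of the individual squared errors
    A_bar : weighted average ambiguity (spread around the ensemble output)
    """
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalise weights to sum to 1
    f_ens = sum(wi * fi for wi, fi in zip(w, preds))
    ens_err = (f_ens - target) ** 2
    mean_err = sum(wi * (fi - target) ** 2 for wi, fi in zip(w, preds))
    mean_amb = sum(wi * (fi - f_ens) ** 2 for wi, fi in zip(w, preds))
    return ens_err, mean_err, mean_amb


# Illustrative values only (not results from the paper).
preds = [0.9, 0.6, 0.2]      # outputs of three individual learners
weights = [0.5, 0.3, 0.2]    # ensemble weights
E, E_bar, A_bar = ambiguity_decomposition(preds, weights, target=1.0)

# Under the assumed form, the identity E = E_bar - A_bar holds exactly.
identity_gap = abs(E - (E_bar - A_bar))
```

Under this assumed form, a new learner helps only if it lowers Ē or raises Ā, which matches the observation that learners added after the accuracy curves flatten bring little benefit.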
To verify that the improvement achieved by HiFi-We is significant, we use box plots to describe the detailed statistics, as shown in Fig. 10. We compare HiFi-We with HGF-V, RGF-W, and IIDF, because they present the best performance among all the compared methods. Moreover, paired t-test results also show that the improvements in OA, AA, and κ are statistically significant (at the 95% confidence level) in most cases.
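A paired t-test of this kind can be sketched with the Python standard library alone. The helper name, the five synthetic accuracy pairs, and the hard-coded two-sided 5% critical value for 4 degrees of freedom (2.776) are assumptions for illustration; the paper does not state how many repetitions were used, and the numbers below are not its results.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Paired t statistic and degrees of freedom for matched samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_d == 0:
        raise ValueError("all paired differences are identical")
    t = mean_d / (sd_d / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Synthetic accuracies from five hypothetical repeated runs.
oa_method_a = [0.91, 0.93, 0.92, 0.94, 0.90]
oa_method_b = [0.88, 0.90, 0.91, 0.90, 0.87]

t_stat, dof = paired_t_statistic(oa_method_a, oa_method_b)
T_CRIT_DOF_4 = 2.776  # two-sided 5% critical value of Student's t, 4 d.o.f.
significant = abs(t_stat) > T_CRIT_DOF_4
```

The test is paired because both methods are evaluated on the same random training/test splits, so only the per-split differences enter the statistic.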
Fig. 10. Statistical evaluation. (a) AA and (b) κ for Indian Pines. (c) AA and (d) κ for Pavia University. (e) AA and (f) κ for GRSS_DFC_2014 data sets.

that the diversity in the Pavia University data set is relatively high. For the GRSS_DFC_2014 data set, the descent rate of |Z| is much faster. The diversity of individual learners increases little after the 14th hierarchy. This result is also implied by Fig. 8(b). Overall, we can safely infer that the hierarchical strategy is generally effective, but it is not necessary to set a very high hierarchy number.

IV. CONCLUSION

The initial motivation of this paper is to develop a simple but effective HSI classification model, which could combine spectral and spatial information in different scales. The most immediate idea is to use ensemble learning. However, to ensure that the ensemble model really works, we must design diversity-enhancing as well as valid ensemble strategies. In this paper, we propose a novel ensemble-based method for HSI classification. The major contributions of this paper are twofold: HGF and mSAD. HGF is an EPF operation, which is able to generate diverse sample sets. Joint spectral-spatial information in different scales is extracted and utilized by HGF. Considering that the samples generated in each hierarchy may have different qualities and confidences, we propose a measurement strategy called mSAD. Finally, the HGF and the mSAD are unified via weighted voting.

To evaluate the performance of the proposed method, we conduct comparative experiments with some state-of-the-art methods on two popular data sets and a challenging data set. The results indicate that the proposed method works well, and the effectiveness is verified via statistical evaluation.

There are several future works associated with the proposed method: 1) it would be interesting to extend the proposed HGF
and mSAD to other applications, such as hyperspectral quality evaluation, and 2) in this paper, we only study the influence of samples on the ensemble system. More attention could be paid to the design of classifiers.

ACKNOWLEDGMENT

The authors would like to thank Telops Inc. (Québec, Canada) for acquiring and providing the data used in this study, the IEEE GRSS Image Analysis and Data Fusion Technical Committee and Dr. M. Shimoni (Signal and Image Centre, Royal Military Academy, Belgium) for organizing the 2014 Data Fusion Contest, the Centre de Recherche Public Gabriel Lippmann (CRPGL, Luxembourg) and Dr. M. Schlerf (CRPGL) for their contribution of the Hyper-Cam LWIR sensor, and Dr. M. De Martino (University of Genoa, Italy) for her contribution to data preparation.

REFERENCES

[1] X. Xu and Z. Shi, “Multi-objective based spectral unmixing for hyperspectral images,” ISPRS J. Photogram. Remote Sens., vol. 124, pp. 54–69, Feb. 2017.
[2] F. J. A. van Ruitenbeek, P. Debba, F. D. van der Meer, T. Cudahy, M. van der Meijde, and M. Hale, “Mapping white micas and their absorption wavelengths using hyperspectral band ratios,” Remote Sens. Environ., vol. 102, nos. 3–4, pp. 211–222, Jun. 2006.
[3] B. Pan, Z. Shi, Z. An, Z. Jiang, and Y. Ma, “A novel spectral-unmixing-based green algae area estimation method for GOCI data,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 10, no. 2, pp. 437–449, Feb. 2017.
[4] G. Hughes, “On the mean accuracy of statistical pattern recognizers,” IEEE Trans. Inf. Theory, vol. 14, no. 1, pp. 55–63, Jan. 1968.
[5] Y. Gu, C. Wang, D. You, Y. Zhang, S. Wang, and Y. Zhang, “Representative multiple kernel learning for classification in hyperspectral imagery,” IEEE Trans. Geosci. Remote Sens., vol. 50, no. 7, pp. 2852–2865, Jul. 2012.
[6] Y. Gu, Q. Wang, H. Wang, D. You, and Y. Zhang, “Multiple kernel learning via low-rank nonnegative matrix factorization for classification of hyperspectral imagery,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 6, pp. 2739–2751, Jun. 2015.
[7] J. Li, P. R. Marpu, A. Plaza, J. M. Bioucas-Dias, and J. A. Benediktsson, “Generalized composite kernel framework for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 51, no. 9, pp. 4816–4829, Sep. 2013.
[8] Y. Gu, T. Liu, X. Jia, J. A. Benediktsson, and J. Chanussot, “Nonlinear multiple kernel learning with multiple-structure-element extended morphological profiles for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 6, pp. 3235–3247, Jun. 2016.
[9] J. Chen, C. Wang, and R. Wang, “Using stacked generalization to combine SVMs in magnitude and shape feature spaces for classification of hyperspectral data,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 7, pp. 2193–2205, Jul. 2009.
[10] X. Huang and L. Zhang, “Comparison of vector stacking, multi-SVMs fuzzy output, and multi-SVMs voting methods for multiscale VHR urban mapping,” IEEE Geosci. Remote Sens. Lett., vol. 7, no. 2, pp. 261–265, Apr. 2010.
[11] M. Pal and G. M. Foody, “Feature selection for classification of hyperspectral data by SVM,” IEEE Trans. Geosci. Remote Sens., vol. 48, no. 5, pp. 2297–2307, May 2010.
[12] F. Schwenker, “Ensemble methods: Foundations and algorithms,” IEEE Comput. Intell. Mag., vol. 8, no. 1, pp. 77–79, Feb. 2013.
[13] T. G. Dietterich, “Ensemble methods in machine learning,” in Proc. Int. Workshop Multiple Classifier Syst., 2000, pp. 1–15.
[14] J. Ham, Y. Chen, M. M. Crawford, and J. Ghosh, “Investigation of the random forest framework for classification of hyperspectral data,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 3, pp. 492–501, Mar. 2005.
[15] P. O. Gislason, J. A. Benediktsson, and J. R. Sveinsson, “Random Forests for land cover classification,” Pattern Recognit. Lett., vol. 27, no. 4, pp. 294–300, Mar. 2006.
[16] J. C.-W. Chan and D. Paelinckx, “Evaluation of random forest and Adaboost tree-based ensemble classification and spectral band selection for ecotope mapping using airborne hyperspectral imagery,” Remote Sens. Environ., vol. 112, no. 6, pp. 2999–3011, Jun. 2008.
[17] V. F. Rodriguez-Galiano, B. Ghimire, J. Rogan, M. Chica-Olmo, and J. P. Rigol-Sanchez, “An assessment of the effectiveness of a random forest classifier for land-cover classification,” ISPRS J. Photogram. Remote Sens., vol. 67, no. 1, pp. 93–104, Jan. 2012.
[18] J. Xia, P. Du, X. He, and J. Chanussot, “Hyperspectral remote sensing image classification based on rotation forest,” IEEE Geosci. Remote Sens. Lett., vol. 11, no. 1, pp. 239–243, Jan. 2014.
[19] J. J. Rodríguez, L. I. Kuncheva, and C. J. Alonso, “Rotation forest: A new classifier ensemble method,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 10, pp. 1619–1630, Oct. 2006.
[20] G. Mountrakis, J. Im, and C. Ogole, “Support vector machines in remote sensing: A review,” ISPRS J. Photogram. Remote Sens., vol. 66, no. 3, pp. 247–259, May 2011.
[21] M. Pal, “Ensemble of support vector machines for land cover classification,” Int. J. Remote Sens., vol. 29, no. 10, pp. 3043–3049, 2008.
[22] X. Huang and L. Zhang, “An SVM ensemble approach combining spectral, structural, and semantic features for the classification of high-resolution remotely sensed imagery,” IEEE Trans. Geosci. Remote Sens., vol. 51, no. 1, pp. 257–272, Jan. 2013.
[23] A. B. Santos, A. de Albuquerque Araujo, and D. Menotti, “Combining multiple classification methods for hyperspectral data interpretation,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 6, no. 3, pp. 1450–1459, Jun. 2013.
[24] J. Xia, J. Chanussot, P. Du, and X. He, “Rotation-based support vector machine ensemble in classification of hyperspectral data with limited training samples,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 3, pp. 1519–1531, Mar. 2016.
[25] M. Chi, K. Qian, J. A. Benediktsson, and R. Feng, “Ensemble classification algorithm for hyperspectral remote sensing data,” IEEE Geosci. Remote Sens. Lett., vol. 6, no. 4, pp. 762–766, Oct. 2009.
[26] B. Waske, S. van der Linden, J. A. Benediktsson, A. Rabe, and P. Hostert, “Sensitivity of support vector machines to random feature selection in classification of hyperspectral data,” IEEE Trans. Geosci. Remote Sens., vol. 48, no. 7, pp. 2880–2889, Jul. 2010.
[27] A. Samat, P. Du, S. Liu, J. Li, and L. Cheng, “E²LMs: Ensemble extreme learning machines for hyperspectral image classification,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 7, no. 4, pp. 1060–1069, Apr. 2014.
[28] J. Xia, L. Bombrun, T. Adalı, Y. Berthoumieu, and C. Germain, “Spectral–spatial classification of hyperspectral images using ICA and edge-preserving filter via an ensemble strategy,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 8, pp. 4971–4982, Aug. 2016.
[29] G. Camps-Valls, D. Tuia, L. Bruzzone, and J. A. Benediktsson, “Advances in hyperspectral image classification: Earth monitoring with statistical learning methods,” IEEE Signal Process. Mag., vol. 31, no. 1, pp. 45–54, Jan. 2014.
[30] P. Ghamisi, M. D. Mura, and J. A. Benediktsson, “A survey on spectral–spatial classification techniques based on attribute profiles,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 5, pp. 2335–2353, May 2015.
[31] W. Li and Q. Du, “A survey on representation-based classification and detection in hyperspectral remote sensing imagery,” Pattern Recognit. Lett., vol. 82, pp. 115–123, Nov. 2015.
[32] Y. Zhong, Y. Wu, X. Xu, and L. Zhang, “An adaptive subpixel mapping method based on MAP model and class determination strategy for hyperspectral remote sensing imagery,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 3, pp. 1411–1426, Mar. 2015.
[33] J. Zhao, Y. Zhong, Y. Wu, L. Zhang, and H. Shu, “Sub-pixel mapping based on conditional random fields for hyperspectral remote sensing imagery,” IEEE J. Sel. Topics Signal Process., vol. 9, no. 6, pp. 1049–1060, Sep. 2015.
[34] B. Pan, Z. Shi, N. Zhang, and S. Xie, “Hyperspectral image classification based on nonlinear spectral–spatial network,” IEEE Geosci. Remote Sens. Lett., vol. 13, no. 12, pp. 1782–1786, Dec. 2016.
[35] Y. Tarabalka, M. Fauvel, J. Chanussot, and J. A. Benediktsson, “SVM- and MRF-based method for accurate classification of hyperspectral images,” IEEE Geosci. Remote Sens. Lett., vol. 7, no. 4, pp. 736–740, Oct. 2010.
[36] P. Ghamisi, J. A. Benediktsson, and M. O. Ulfarsson, “Spectral–spatial classification of hyperspectral images based on hidden Markov random fields,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 5, pp. 2565–2574, May 2014.
[37] J. A. Benediktsson, J. A. Palmason, and J. R. Sveinsson, “Classification of hyperspectral data from urban areas based on extended morphological profiles,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 3, pp. 480–491, Mar. 2005.
[38] M. D. Mura, J. A. Benediktsson, B. Waske, and L. Bruzzone, “Extended profiles with morphological attribute filters for the analysis of hyperspectral data,” Int. J. Remote Sens., vol. 31, no. 22, pp. 5975–5991, Dec. 2010.
[39] P. R. Marpu, M. Pedergnana, M. D. Mura, J. A. Benediktsson, and L. Bruzzone, “Automatic generation of standard deviation attribute profiles for spectral–spatial classification of remote sensing data,” IEEE Geosci. Remote Sens. Lett., vol. 10, no. 2, pp. 293–297, Mar. 2013.
[40] L. Shen and S. Jia, “Three-dimensional Gabor wavelets for pixel-based hyperspectral imagery classification,” IEEE Trans. Geosci. Remote Sens., vol. 49, no. 12, pp. 5039–5046, Dec. 2011.
[41] Y. Qian, M. Ye, and J. Zhou, “Hyperspectral image classification based on structured sparse logistic regression and three-dimensional wavelet texture features,” IEEE Trans. Geosci. Remote Sens., vol. 51, no. 4, pp. 2276–2291, Apr. 2013.
[42] X. Guo, X. Huang, and L. Zhang, “Three-dimensional wavelet texture feature extraction and classification for multi/hyperspectral imagery,” IEEE Geosci. Remote Sens. Lett., vol. 11, no. 12, pp. 2183–2187, Dec. 2014.
[43] W. Li and Q. Du, “Gabor-filtering-based nearest regularized subspace for hyperspectral image classification,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 7, no. 4, pp. 1012–1022, Apr. 2014.
[44] K. He, J. Sun, and X. Tang, “Guided image filtering,” in Proc. Eur. Conf. Comput. Vis. (ECCV), 2010, pp. 1–14.
[45] K. He, J. Sun, and X. Tang, “Guided image filtering,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 6, pp. 1397–1409, Jun. 2013.
[46] Q. Zhang, X. Shen, L. Xu, and J. Jia, “Rolling guidance filter,” in Proc. Eur. Conf. Comput. Vis. (ECCV), 2014, pp. 815–830.
[47] B. Ham, M. Cho, and J. Ponce, “Robust image filtering using joint static and dynamic guidance,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 4823–4831.
[48] Y. Li, J.-B. Huang, N. Ahuja, and M.-H. Yang, “Deep joint image filtering,” in Proc. Eur. Conf. Comput. Vis. (ECCV), 2016, pp. 154–169.
[49] B. Pan, Z. Shi, and X. Xu, “R-VCANet: A new deep-learning-based hyperspectral image classification method,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 10, no. 5, pp. 1975–1986, May 2017.
[50] X. Kang, S. Li, and J. A. Benediktsson, “Spectral–spatial hyperspectral image classification with edge-preserving filtering,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 5, pp. 2666–2677, May 2014.
[51] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning. Berlin, Germany: Springer, 2001.
[52] C. Tomasi and R. Manduchi, “Bilateral filtering for gray and color images,” in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Jan. 1998, pp. 839–846.
[53] J. Wright, A. Ganesh, S. Rao, Y. Peng, and Y. Ma, “Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization,” in Proc. Adv. Neural Inf. Process. Syst. (NIPS), 2009, pp. 2080–2088.
[54] X. Kang, S. Li, L. Fang, and J. A. Benediktsson, “Intrinsic image decomposition for feature extraction of hyperspectral images,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 4, pp. 2241–2253, Apr. 2015.
[55] (2014). IEEE GRSS Data Fusion Contest. [Online]. Available: http://www.grss-ieee.org/community/technical-committees/data-fusion/
[56] G. M. Foody, “Thematic map comparison: Evaluating the statistical significance of differences in classification accuracy,” Photogram. Eng. Remote Sens., vol. 70, no. 5, pp. 627–633, 2004.

Bin Pan received the B.S. degree from the School of Astronautics, Beihang University, Beijing, China, in 2013, where he is currently pursuing the Ph.D. degree with the Image Processing Center.
His research interests include deep learning, hyperspectral unmixing, and hyperspectral image classification.

Zhenwei Shi (M’13) received the Ph.D. degree in mathematics from the Dalian University of Technology, Dalian, China, in 2005.
He was a Post-Doctoral Researcher with the Department of Automation, Tsinghua University, Beijing, China, from 2005 to 2007. He was a Visiting Scholar with the Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL, USA, from 2013 to 2014. He is currently a Professor and the Dean of the Image Processing Center, School of Astronautics, Beihang University, Beijing. He has authored or co-authored over 100 scientific papers in refereed journals and proceedings, including the IEEE Transactions on Pattern Analysis and Machine Intelligence, the IEEE Transactions on Neural Networks, the IEEE Transactions on Geoscience and Remote Sensing, the IEEE Geoscience and Remote Sensing Letters, and the IEEE Conference on Computer Vision and Pattern Recognition. His research interests include remote sensing image processing and analysis, computer vision, pattern recognition, and machine learning.

Xia Xu received the B.S. and M.S. degrees from the School of Electrical Engineering, Yanshan University, Qinhuangdao, China, in 2012 and 2015, respectively. She is currently pursuing the Ph.D. degree with the Image Processing Center, School of Astronautics, Beihang University, Beijing, China.
Her research interests include hyperspectral unmixing and multiobjective optimization.