Hierarchical Guidance Filtering
All content following this page was uploaded by Bin Pan on 16 April 2020.
Abstract: Joint spectral and spatial information should be fully exploited in order to achieve accurate classification results for hyperspectral images. In this paper, we propose an ensemble framework, which combines spectral and spatial information in different scales. The motivation of the proposed method derives from a basic idea: by integrating many individual learners, ensemble learning can achieve better generalization ability than a single learner. In the proposed work, the individual learners are obtained by joint spectral-spatial features generated from different scales. Specifically, we develop two techniques to construct the ensemble model, namely, hierarchical guidance filtering (HGF) and matrix of spectral angle distance (mSAD). HGF and mSAD are combined via a weighted ensemble strategy. HGF is a hierarchical edge-preserving filtering operation, which can produce diverse sample sets. Meanwhile, in each hierarchy, different spatial contextual information is extracted. As the hierarchy increases, the pixel spectra become smoother, while the spatial features are enhanced. Based on the outputs of HGF, a series of classifiers can be obtained. Subsequently, we define a low-rank matrix, mSAD, to measure the diversity among training samples in each hierarchy. Finally, an ensemble strategy is proposed using the obtained individual classifiers and mSAD. We term the proposed method HiFi-We. Experiments are conducted on two popular data sets, Indian Pines and Pavia University, as well as a challenging hyperspectral data set used in the 2014 Data Fusion Contest (GRSS_DFC_2014). An effectiveness analysis of the ensemble strategy is also presented.

Index Terms: Ensemble learning, hierarchical guidance filtering (HGF), hyperspectral image (HSI) classification.

Manuscript received March 1, 2017; revised March 19, 2017; accepted March 28, 2017. Date of publication April 24, 2017; date of current version June 22, 2017. This work was supported in part by the National Natural Science Foundation of China under Grant 61671037, in part by the Beijing Natural Science Foundation under Grant 4152031, and in part by the funding project of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, under Grant BUAA-VR-16ZZ-03. (Corresponding author: Zhenwei Shi.)
The authors are with the Image Processing Center, School of Astronautics, Beihang University, Beijing 100191, China, the Beijing Key Laboratory of Digital Media, Beihang University, Beijing 100191, China, and also with the State Key Laboratory of Virtual Reality Technology and Systems, School of Astronautics, Beihang University, Beijing 100191, China (e-mail: panbin@buaa.edu.cn; shizhenwei@buaa.edu.cn; xuxia@buaa.edu.cn).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TGRS.2017.2689805

I. INTRODUCTION

HYPERSPECTRAL sensors can provide images with hundreds of continuous spectral bands as well as high spatial resolution. During the past two decades, hyperspectral image (HSI) processing techniques have been widely used in many fields, such as spectral unmixing [1], mineral identification [2], and environmental monitoring [3]. To better utilize the HSI data, many HSI processing techniques have been developed. One of the most important techniques is per-pixel classification, i.e., assigning a unique class label to each pixel. However, HSI classification is still challenging for many reasons, such as the Hughes phenomenon [4].

A popular strategy to improve the classification accuracy is designing multifeature systems. Gu et al. [5] proposed a multiple-kernel learning method by extracting the variation from the different feature spaces. Gu et al. [6] improved the multikernel models using low-rank nonnegative matrix factorization. In [7] and [8], spatial information was utilized to enhance the performance of multiple-kernel models. Vector stacking is also a typical approach to the multifeature problem; it refers to concatenating the multiple features and putting them into a single classifier. Chen et al. [9] combined the magnitude and shape feature spaces via stacked generalization. Huang and Zhang [10] compared the performance of vector stacking with other multifeature methods. However, the vector stacking approach does not necessarily lead to better results, because studies have shown that the classification accuracy may vary as a function of the number of selected features [11].

Recently, ensemble learning-based methods were developed for HSI classification. By integrating many individual learners, ensemble learning can achieve better generalization performance [12]. Dietterich [13] considered that a combination of individual learners could outperform a single learner mainly because of the following three aspects: first, a single learner may fall into local minima; second, the ensemble strategy could slightly expand the hypothesis space; finally, because there may be several hypotheses that achieve the same performance on the training set, combining many individual learners could reduce the risk of choosing a false hypothesis. For the task of HSI classification, researchers have proposed many ensemble learning-based methods. Random forest methods are typical ensemble approaches, and the use of random forest was investigated in [14]–[17], etc. Xia et al. [18] utilized rotation forest [19] for HSI classification and achieved better results than random forest. Support vector machine (SVM) is also studied in some ensemble-based HSI classification methods [20]. Pal [21] proposed two ensemble approaches based on SVM using boosting and bagging. Huang and Zhang [22] combined the spectral, structural, and semantic features to construct an SVM ensemble approach. Santos et al. [23] performed a combination of six different classification models based on SVM and multilayer perceptron neural network. Xia et al. [24] proposed a rotation-based SVM ensemble strategy with limited training samples. Diverse ensemble-based HSI classification methods were reported in [25]–[28], etc.

Many studies have also demonstrated that the use of spatial information could significantly improve the classification accuracy [29]–[34]. A common strategy to express the spatial
0196-2892 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
4178 IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 55, NO. 7, JULY 2017
PAN et al.: HGF-BASED ENSEMBLE CLASSIFICATION FOR HSIs 4179
Fig. 2. Examples of HGF and RGF for Indian Pines data set. HGF and RGF are both conducted on the band 2 of the data set. (a) Guidance image, using
the first PC. (b) Original image in band 2. (c) Results of RGF, rolling five times. (d) Results of HGF, five hierarchies. (e) Results of HGF, 50 hierarchies.
A. HGF

As a kind of EPF, HGF is able to remove noise and small details while preserving the overall structure of the image. Therefore, it can be utilized to extract the spatial contextual information for HSI classification.

In the first hierarchy, we conduct a GF [44], [45] for each band of the HSI. Let $Q^p$ denote the $p$th band of the filtering output; then the output of GF can be expressed by

$$Q_i^p = a_k^p G_i + b_k^p, \quad \forall i \in \omega_k \tag{1}$$

where $\omega_k$ is a window around pixel $k$ with size $(2r+1) \times (2r+1)$, $r$ is the window radius, $i$ is a pixel in $\omega_k$, $G$ is a guidance image, and $a_k^p$ and $b_k^p$ are the coefficients to be estimated. Equation (1) indicates that the output of the filtering is a linear transform of the guidance image. Conducting the gradient operation for (1), we can find that

$$\nabla Q_i^p = a_k^p \nabla G_i. \tag{2}$$

According to (2), the filtering output $Q^p$ has an edge only if $G$ also has an edge, and this is the reason for edge preserving. Then, we need to determine the linear coefficients $a_k^p$ and $b_k^p$ based on the input HSI data $I$ and the guidance image $G$. The following cost function is minimized in the window $\omega_k$:

$$E(a_k^p, b_k^p) = \sum_{i \in \omega_k} \left[ \left( a_k^p G_i + b_k^p - I_i^p \right)^2 + \epsilon \left( a_k^p \right)^2 \right] \tag{3}$$

where $\epsilon$ is a parameter controlling the degree of smoothing. A larger $\epsilon$ corresponds to stronger penalization of $a_k^p$, which leads to smoother output. Equation (3) guarantees the similarity between the input and output of the filtering; meanwhile, noise and small details are removed. Equation (3) is a linear ridge regression [51], and thus, it can be solved by

$$a_k^p = \frac{\frac{1}{|\omega|} \sum_{i \in \omega_k} I_i^p G_i - \mu_k \bar{I}_k^p}{\sigma_k^2 + \epsilon}, \qquad b_k^p = \bar{I}_k^p - a_k^p \mu_k \tag{4}$$

where $\mu_k$ and $\sigma_k$ are the mean and standard deviation of $G$ in $\omega_k$, $\bar{I}_k^p$ is the mean value of $I^p$ in $\omega_k$, and $|\omega|$ is the number of pixels in $\omega_k$.

After obtaining $(a_k^p, b_k^p)$, the output $Q_i^p$ can be determined by (1). Then, we use $Q_i^p$ as the input of the next hierarchy. In other words, the outputs of the current hierarchy are considered as the inputs of the next hierarchy. From Fig. 2(d) and (e), we can find that the outputs of the 5th and the 50th hierarchies show certain differences. These differences arise because HGF introduces some variations in both the spectral and spatial characteristics of the HSI data. In this paper, we consider these variations as the joint spectral-spatial information in different scales. To ensure that the spectral information does not suffer severe loss after many hierarchies, we set the parameters $r$ and $\epsilon$ to small values, for example, $r = 1$ and $\epsilon = 0.01$.

Generally, the better the spatial smoothness, the greater the loss of spectral characteristics. It is quite difficult to determine what degree of smoothing is the best. In the proposed work, we try to address this problem via a hierarchy strategy, i.e., HGF. HGF is developed to enhance the diversity of samples, where we can get an individual learner based on the output data in every hierarchy. That is to say, for each individual learner, both the number and the dimensionality of the training samples stay the same. To some extent, HGF can be regarded as a linear transform of GF. Compared with some traditional subset selection methods, such as bootstrapping and band selection, using HGF can not only avoid the information loss in each individual learner, but also provide more abundant feature expression.

The idea of HGF is similar to that of RGF [46]. However, at least two characteristics distinguish HGF from RGF. First, in HGF, we run a GF in each hierarchy, while RGF usually adopts joint bilateral filtering [52] in each iteration. Since GF is a linear transform, whereas joint bilateral filtering is based on a nonlinear model, HGF is more efficient than RGF. Note that RGF can also use GF in each rolling. However, with the increase of rolling times, the results of RGF will get more blurry, as shown in [46]. In other words, GF is not suitable for RGF. More importantly, the guidance images used in HGF and RGF are different. Because RGF is originally designed for natural scene images where only one or three bands are observed, the spectral diversity and correlation are not considered. In RGF, the guidance image is the original input. In this case, with the increase of rolling times, the result image will be more similar to the original image. However, if the original image is seriously polluted by noise, RGF can hardly lead to satisfying results, as shown in Fig. 2(b) and (c). In HGF, we use the first principal component (PC) of the HSI as the guidance image
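The filtering in (1)-(4) and the hierarchy loop described in this subsection can be illustrated with a small dependency-free Python sketch. It is not the authors' MATLAB implementation. The function names (`box_mean`, `guided_filter`, `hgf`) are hypothetical, images are plain nested lists, and windows are clipped at the image border. Averaging the coefficients over overlapping windows before applying (1) is an assumption borrowed from the standard guided filter; this section does not state how overlapping windows are handled.

```python
def box_mean(img, r):
    """Mean of img over a (2r+1) x (2r+1) window, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(w, x + r + 1))
            total = sum(img[j][i] for j in ys for i in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


def mul(a, b):
    """Element-wise product of two equally sized images."""
    return [[p * q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def guided_filter(band, guide, r=1, eps=0.01):
    """One guided-filter pass on a single band, following (1)-(4)."""
    mean_g = box_mean(guide, r)                # mu_k
    mean_i = box_mean(band, r)                 # mean of I^p in omega_k
    mean_gg = box_mean(mul(guide, guide), r)
    mean_gi = box_mean(mul(guide, band), r)
    h, w = len(band), len(band[0])
    a = [[0.0] * w for _ in range(h)]
    b = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            var_g = mean_gg[y][x] - mean_g[y][x] ** 2           # sigma_k^2
            cov_gi = mean_gi[y][x] - mean_g[y][x] * mean_i[y][x]
            a[y][x] = cov_gi / (var_g + eps)                    # a_k^p in (4)
            b[y][x] = mean_i[y][x] - a[y][x] * mean_g[y][x]     # b_k^p in (4)
    # Assumption: average the coefficients of all windows covering a pixel,
    # as in the standard guided filter, then apply the linear model (1).
    mean_a, mean_b = box_mean(a, r), box_mean(b, r)
    return [[mean_a[y][x] * guide[y][x] + mean_b[y][x] for x in range(w)]
            for y in range(h)]


def hgf(band, guide, hierarchies, r=1, eps=0.01):
    """Feed each output back as the next input; return one image per hierarchy."""
    outputs, current = [], band
    for _ in range(hierarchies):
        current = guided_filter(current, guide, r, eps)
        outputs.append(current)
    return outputs
```

In the setting described here, `hgf` would be called once per spectral band with the first principal component as `guide`, and the outputs of hierarchy t across all bands would form the sample set for the t-th individual learner.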
where $T$ is the number of hierarchies, and $H(x)$ is the predicted label. In different hierarchies, the features $x$ are not the same; $x$ is determined by the outputs of HGF. Note that logistic regression is one available classifier, but not the only one. Algorithm 1 depicts the overall process of the proposed method.

Algorithm 1 Proposed HiFi-We Method
Input: I, T, r, ε, h(·)
Initialize: training set, testing set
1. HGF
   Obtain G based on PCA for I
   For t = 1:T
      Generate Q_t by Eq. (1)-(4)
   End for
2. mSAD
   For t = 1:T
      Determine ω_t by Eq. (5)-(9)
   End for
3. Weighted voting
   Classify by h_t(·)
   Combine ω_t and the results of h_t(·) by Eq. (10)
Output: Ensemble-based classification results for I

Readers may doubt the effectiveness of the above-mentioned ensemble strategy. According to [12], the generalization error of an ensemble model is determined by

$$E = \bar{E} - \bar{A} \tag{11}$$

where $\bar{E}$ is the average error of individual learners, $\bar{A}$ denotes the average ensemble "ambiguity," and $E$ is the generalization error. For a given test sample $x$, $A(x)$ is obtained by

$$A(x) = \sum_{t=1}^{T} \omega_t \left( h_t(x) - H(x) \right)^2. \tag{12}$$

Equation (11) is called the error-ambiguity decomposition, which demonstrates that an ensemble model is effective as long as $\bar{E}$ is reduced and $\bar{A}$ is increased. Unfortunately, (11) cannot be optimized directly, because $\bar{A}$ is obtained only after the ensemble model is determined [12]. Moreover, it is hard to extend (11) and (12) from regression to classification tasks. In this paper, a statistical significance analysis and an experimental discussion are conducted to verify the effectiveness of the proposed ensemble approach. Details are shown in Section III-D.

In this section, we conduct three types of experiments. We have published the MATLAB demo on our homepage.¹

First, we compare the proposed HiFi-We with some state-of-the-art HSI classification algorithms, including GCK [7], NSSNet [34], NRS [43], EPF-G [50], and IIDF [54]. GCK is a multiple-kernel learning-based method. NRS is built on Gabor filtering. EPF-G and IIDF are based on EPF (we use GF in this experiment) and intrinsic image decomposition, respectively. NSSNet is a recently proposed deep-learning-based HSI classification approach. Overall accuracy (OA), average accuracy (AA), and Kappa coefficient (κ) are used to evaluate the performance of all the methods. In order to verify the effectiveness of HGF and mSAD, we replace HGF by RGF (RGF-W), and we replace the weighted ensemble by simple majority voting (HGF-V). HGF-V and RGF-W can be regarded as variants of HiFi-We. The performance of RGF-W and HGF-V is also reported.

Second, the influence of important parameters is discussed. The parameters used in HiFi-We are listed in Table I.

Finally, we give a statistical evaluation of the effectiveness of the proposed ensemble strategy. Furthermore, we also verify that the improvement achieved by the proposed method is significant.

Three data sets are used in our experiments, namely, Indian Pines, Pavia University,² and GRSS_DFC_2014 [55].

1) Indian Pines is a widely used data set for HSI classification. It was acquired by the airborne visible/infrared imaging spectrometer in Northwestern Indiana, with wavelengths ranging from 0.4 to 2.5 μm. The spatial resolution is 20 m, and the size is 145 × 145 pixels. After removing the water absorption bands, 200 spectral bands remain. In total, 10 249 pixels are labeled, and they are classified into 16 classes. Fig. 4(a) and (b) shows the false color composite image and the corresponding ground truth for this data set.

2) Pavia University data set was collected over the city of Pavia, Italy, by the reflective optics system imaging spectrometer (ROSIS-3) sensor. This data set contains 42 776 labeled pixels, which belong to nine different classes. It has 1.3-m spatial resolution and a size of 610 × 340 pixels. After removing the noisy bands, a total of 103 channels are preserved. A false color composite image and the corresponding ground truth image are shown in Fig. 5(a) and (b).

¹ http://levir.buaa.edu.cn/Code.htm
² http://www.ehu.eus/ccwintco/index.php?title=Hyperspectral_Remote_Sensing_Scenes
Fig. 4. Indian Pines data set. (a) False color composite image. R-G-B = bands 36-17-11. (b) Ground truth. Each color corresponds to a specific class. Results
by (c) GCK, (d) NRS, (e) EPF-G, (f) IIDF, (g) NSSNet, (h) RGF-W, (i) HGF-V, and (j) HiFi-We.
Fig. 5. Pavia University data set. (a) False color composite image. R-G-B = bands 10-27-46. (b) Ground truth. Each color corresponds to a specific class.
Results by (c) GCK, (d) NRS, (e) EPF-G, (f) IIDF, (g) NSSNet, (h) RGF-W, (i) HGF-V, and (j) HiFi-We.
3) GRSS_DFC_2014 is the long-wave infrared (LWIR, thermal infrared) hyperspectral data set used in the 2014 IEEE GRSS Data Fusion Contest. It was acquired by an 84-channel airborne LWIR hyperspectral imager covering an urban area near Thetford Mines in Québec, Canada, with the wavelengths between 7.8 and 11.5 μm
Fig. 6. GRSS_DFC_2014 data set. (a) False color composite image. R-G-B = bands 30-45-66. (b) Ground truth. Each color corresponds to a specific class.
Results by (c) GCK, (d) NRS, (e) EPF-G, (f) IIDF, (g) NSSNet, (h) RGF-W, (i) HGF-V, and (j) HiFi-We.
and approximately 1-m spatial resolution. The size of this data set is 795 × 564 pixels; 22 532 labeled pixels and a ground truth with seven land cover classes are provided. Since this data set is collected from LWIR bands, its quality is much lower than that of Indian Pines. Therefore, this data set is more challenging. A false color composite image and the corresponding ground truth are shown in Fig. 6(a) and (b).

B. Classification Results

We validate the proposed HiFi-We method on the three data sets. All the methods are run 50 times and the average results are reported. The standard deviations of HiFi-We are also reported. The numbers of hierarchies for the Indian Pines, Pavia University, and GRSS_DFC_2014 data sets are set to 80, 20, and 20, respectively.

1) Results on Indian Pines Data Set: The Indian Pines data set is widely used in many works. Here, we only use 20 samples in each class for training, and the rest for testing. Fig. 4(c)–(j) shows the overall classification maps of all the compared methods. We can see that strong spatial correlation is observed. In Table II, the quantitative results of different methods are reported. Compared with GCK, EPF-G, IIDF, and NSSNet, the proposed method achieves an advantage of about 4% in OA, AA, and κ. Among all the 16 classes, HiFi-We performs the best in eight classes. In particular, 13 of the classes present accuracies over 90%. RGF-W and HGF-V also perform well; however, slight gaps can still be observed. These results indicate that the improvements from HGF and mSAD are valid. It seems that HGF contributes more to the final results.

Fig. 7(a) shows the influence of the training sample size on this data set. All the compared methods achieve similar accuracies, especially when the number of training samples is large. The advantage of HiFi-We is mainly reflected in the case of limited samples, such as 10–20 per class. Performing well with limited samples is meaningful, since a very simple method may also work well with abundant samples.

2) Results on Pavia University Data Set: Fig. 5(c)–(j) and Table III show the classification results on the Pavia University data set. Compared with Indian Pines, all the methods perform better on the Pavia University data set. This may be because the latter has higher spatial resolution. The HiFi-We method still outperforms the others, and the superiority on AA is more obvious. When training samples are limited, AA becomes a more important measure. Since several methods have achieved accuracies above 93%, an advantage of 1.5%–2% is also significant in this case. Among all the nine classes, HiFi-We performs best in four, and exceeds 90% in eight. The influence of the training sample size on this data set is shown in Fig. 7(b). Increasing the number of training samples leads to better performance. When the number of training samples reaches 50 per class, HiFi-We presents a κ of nearly 98%.

3) Results on GRSS_DFC_2014 Data Set: This data set is collected from LWIR bands, and the imaging quality is
TABLE II
CLASSIFICATION ACCURACIES OF DIFFERENT METHODS ON INDIAN PINES DATA SET (%)
Fig. 7. Influence of training samples on (a) Indian Pines, (b) Pavia University, and (c) GRSS_DFC_2014 data sets.
TABLE III
CLASSIFICATION ACCURACIES OF DIFFERENT METHODS ON PAVIA UNIVERSITY DATA SET (%)
relatively low. Thus, this data set is more challenging. For all the compared methods, 50 samples in each class are used for training. The visual classification maps of all the methods are shown in Fig. 6(c)–(j). In Table IV, we report the objective evaluation of the classification accuracies. An advantage of about 5% is observed in OA, AA, and κ. HiFi-We presents the best performance in four classes, and above 80% accuracy in all the classes. HiFi-We outperforms RGF-W and HGF-V by 2%–5%. Both HGF and mSAD are effective, and HGF still contributes more to the accuracies. Similar conclusions can be reached from the experiments on the Indian Pines and Pavia University data sets.

The influence of the training sample size is shown in Fig. 7(c). In this data set, the gaps are more apparent. When the number of training samples is below 50 per class, the proposed method presents an advantage of 5%–10%. This is mainly because the
TABLE IV
CLASSIFICATION ACCURACIES OF DIFFERENT METHODS ON GRSS_DFC_2014 DATA SET (%)
Fig. 8. Influence of T and ω. Results on (a) Indian Pines, (b) Pavia University, and (c) GRSS_DFC_2014 data sets. OA and κ correspond to the left
coordinate axis, and ω corresponds to the right one.
quality of this data is relatively lower than that of the other two, while the HGF operation has actually removed some noise and improved the image quality.

In general, the hierarchical strategy does lead to diverse classification results; at the same time, the accuracies and the weights show a consistent trend.
TABLE V
MCNEMAR’S TEST FOR INDIAN PINES DATA SET
TABLE VI
MCNEMAR’S TEST FOR PAVIA UNIVERSITY DATA SET
Fig. 9. McNemar’s test for (a) Indian Pines data set, (b) Pavia University data set, and (c) GRSS_DFC_2014 data set.
GRSS_DFC_2014 presents a significant decline after around the tenth hierarchy. Thus, further increasing the number of hierarchies would not improve the performance of the ensemble model. Quantitative results are shown in Tables V–VII. In Table V, only two pairs of learners score lower than 1.96. However, there is an obvious tendency for |Z| to decrease as the number of hierarchies increases. The results indicate that the diversity of individual learners is enhanced by HGF, at least in the first 20 hierarchies. In Table VI, about 90% of all the values are above 1.96 (after removing the diagonal), which indicates
TABLE VII
MCNEMAR’S TEST FOR GRSS_DFC_2014 DATA SET
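The |Z| values in Tables V–VII come from pairwise McNemar's tests between individual learners, with 1.96 as the threshold for significance at the 5% level. The formula is not restated in this part of the text; the sketch below assumes the commonly used form Z = (f_ab − f_ba) / sqrt(f_ab + f_ba), where f_ab counts the test samples that learner A classifies correctly and learner B misclassifies, and f_ba counts the opposite case. The function name and the toy labels are illustrative only.

```python
import math


def mcnemar_z(truth, pred_a, pred_b):
    """McNemar's Z statistic for two classifiers on the same test samples."""
    if not (len(truth) == len(pred_a) == len(pred_b)):
        raise ValueError("all label sequences must have the same length")
    # Samples where exactly one of the two learners is correct.
    f_ab = sum(1 for t, a, b in zip(truth, pred_a, pred_b) if a == t and b != t)
    f_ba = sum(1 for t, a, b in zip(truth, pred_a, pred_b) if a != t and b == t)
    if f_ab + f_ba == 0:
        return 0.0  # the learners never disagree in correctness
    return (f_ab - f_ba) / math.sqrt(f_ab + f_ba)


truth = [0] * 10
learner_a = [0] * 10              # always correct
learner_b = [0] * 2 + [1] * 8     # wrong on eight samples
z = mcnemar_z(truth, learner_a, learner_b)
print(abs(z) > 1.96)  # True: the two learners differ significantly
```

A pair of learners with |Z| below 1.96 makes statistically similar errors, so the share of pairs above that threshold gives a rough indication of how diverse the individual learners are.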
and mSAD to other application, such as hyperspectral quality [16] J. C.-W. Chan and D. Paelinckx, “Evaluation of random forest and
evaluation, and 2) in this paper, we only study the influence Adaboost tree-based ensemble classification and spectral band selection
for ecotope mapping using airborne hyperspectral imagery,” Remote
of samples to the ensemble system. More attention could be Sens. Environ., vol. 112, no. 6, pp. 2999–3011, Jun. 2008.
paid to the design of classifiers. [17] V. F. Rodriguez-Galiano, B. Ghimire, J. Rogan, M. Chica-Olmo, and
J. P. Rigol-Sanchez, “An assessment of the effectiveness of a random
forest classifier for land-cover classification,” ISPRS J. Photogram.
ACKNOWLEDGMENT Remote Sens., vol. 67, no. 1, pp. 93–104, Jan. 2012.
[18] J. Xia, P. Du, X. He, and J. Chanussot, “Hyperspectral remote sensing
The authors would like to thank Telops Inc. (Québec image classification based on rotation forest,” IEEE Geosci. Remote
Sens. Lett., vol. 11, no. 1, pp. 239–243, Jan. 2014.
Canada) for acquiring and providing the data used in this
[19] J. J. Rodríguez, L. I. Kuncheva, and C. J. Alonso, “Rotation forest:
study, the IEEE GRSS Image Analysis and Data Fusion A new classifier ensemble method,” IEEE Trans. Pattern Anal. Mach.
Technical Committee and Dr. M. Shimoni (Signal and Image Intell., vol. 28, no. 10, pp. 1619–1630, Oct. 2006.
Centre, Royal Military Academy, Belgium) for organizing the [20] G. Mountrakis, J. Im, and C. Ogole, “Support vector machines in remote
sensing: A review,” ISPRS J. Photogram. Remote Sens., vol. 66, no. 3,
2014 Data Fusion Contest, the Centre de Recherche Public pp. 247–259, May 2011.
Gabriel Lippmann (CRPGL, Luxembourg) and Dr. M. Schlerf [21] M. Pal, “Ensemble of support vector machines for land cover classifi-
(CRPGL) for their contribution of the Hyper-Cam LWIR cation,” Int. J. Remote Sens., vol. 29, no. 10, pp. 3043–3049, 2008.
[22] X. Huang and L. Zhang, “An SVM ensemble approach combining
sensor, and Dr. M. De Martino (University of Genoa, Italy) spectral, structural, and semantic features for the classification of high-
for her contribution to data preparation. resolution remotely sensed imagery,” IEEE Trans. Geosci. Remote Sens.,
vol. 51, no. 1, pp. 257–272, Jan. 2013.
[23] A. B. Santos, A. de Albuquerque Araujo, and D. Menotti, “Combining
R EFERENCES multiple classification methods for hyperspectral data interpretation,”
IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 6, no. 3,
[1] X. Xu and Z. Shi, “Multi-objective based spectral unmixing for pp. 1450–1459, Jun. 2013.
hyperspectral images,” ISPRS J. Photogram. Remote Sens., vol. 124, [24] J. Xia, J. Chanussot, P. Du, and X. He, “Rotation-based support vector
pp. 54–69, Feb. 2017. machine ensemble in classification of hyperspectral data with limited
[2] F. J. A. van Ruitenbeek, P. Debba, F. D. van der Meer, T. Cudahy, training samples,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 3,
M. van der Meijde, and M. Hale, “Mapping white micas and their pp. 1519–1531, Mar. 2016.
absorption wavelengths using hyperspectral band ratios,” Remote Sens. [25] M. Chi, K. Qian, J. A. Benediktsson, and R. Feng, “Ensemble classifi-
Environ., vol. 102, nos. 3–4, pp. 211–222, Jun. 2006. cation algorithm for hyperspectral remote sensing data,” IEEE Geosci.
[3] B. Pan, Z. Shi, Z. An, Z. Jiang, and Y. Ma, “A novel spectral-unmixing- Remote Sens. Lett., vol. 6, no. 4, pp. 762–766, Oct. 2009.
based green algae area estimation method for GOCI data,” IEEE J. Sel. [26] B. Waske, S. van der Linden, J. A. Benediktsson, A. Rabe, and
Topics Appl. Earth Observ. Remote Sens., vol. 10, no. 2, pp. 437–449, P. Hostert, “Sensitivity of support vector machines to random feature
Feb. 2017. selection in classification of hyperspectral data,” IEEE Trans. Geosci.
[4] G. Hughes, “On the mean accuracy of statistical pattern recognizers,” Remote Sens., vol. 48, no. 7, pp. 2880–2889, Jul. 2010.
IEEE Trans. Inf. Theory, vol. 14, no. 1, pp. 55–63, Jan. 1968. [27] A. Samat, P. Du, S. Liu, J. Li, and L. Cheng, “E2 LMs: Ensemble extreme
[5] Y. Gu, C. Wang, D. You, Y. Zhang, S. Wang, and Y. Zhang, “Rep- learning machines for hyperspectral image classification,” IEEE J. Sel.
resentative multiple kernel learning for classification in hyperspec- Topics Appl. Earth Observ. Remote Sens., vol. 7, no. 4, pp. 1060–1069,
tral imagery,” IEEE Trans. Geosci. Remote Sens., vol. 50, no. 7, Apr. 2014.
pp. 2852–2865, Jul. 2012. [28] J. Xia, L. Bombrun, T. Adalı, Y. Berthoumieu, and C. Germain,
[6] Y. Gu, Q. Wang, H. Wang, D. You, and Y. Zhang, “Multiple kernel “Spectral–spatial classification of hyperspectral images using ICA and
learning via low-rank nonnegative matrix factorization for classification edge-preserving filter via an ensemble strategy,” IEEE Trans. Geosci.
of hyperspectral imagery,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 54, no. 8, pp. 4971–4982, Aug. 2016.
Remote Sens., vol. 8, no. 6, pp. 2739–2751, Jun. 2015. [29] G. Camps-Valls, D. Tuia, L. Bruzzone, and J. A. Benediktsson,
“Advances in hyperspectral image classification: Earth monitoring with statistical learning methods,” IEEE Signal Process. Mag., vol. 31, no. 1, pp. 45–54, Jan. 2014.
[7] J. Li, P. R. Marpu, A. Plaza, J. M. Bioucas-Dias, and J. A. Benediktsson, “Generalized composite kernel framework for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 51, no. 9, pp. 4816–4829, Sep. 2013.
[8] Y. Gu, T. Liu, X. Jia, J. A. Benediktsson, and J. Chanussot, “Nonlinear multiple kernel learning with multiple-structure-element extended morphological profiles for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 6, pp. 3235–3247, Jun. 2016.
[9] J. Chen, C. Wang, and R. Wang, “Using stacked generalization to combine SVMs in magnitude and shape feature spaces for classification of hyperspectral data,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 7, pp. 2193–2205, Jul. 2009.
[10] X. Huang and L. Zhang, “Comparison of vector stacking, multi-SVMs fuzzy output, and multi-SVMs voting methods for multiscale VHR urban mapping,” IEEE Geosci. Remote Sens. Lett., vol. 7, no. 2, pp. 261–265, Apr. 2010.
[11] M. Pal and G. M. Foody, “Feature selection for classification of hyperspectral data by SVM,” IEEE Trans. Geosci. Remote Sens., vol. 48, no. 5, pp. 2297–2307, May 2010.
[12] F. Schwenker, “Ensemble methods: Foundations and algorithms,” IEEE Comput. Intell. Mag., vol. 8, no. 1, pp. 77–79, Feb. 2013.
[13] T. G. Dietterich, “Ensemble methods in machine learning,” in Proc. Int. Workshop Multiple Classifier Syst., 2000, pp. 1–15.
[14] J. Ham, Y. Chen, M. M. Crawford, and J. Ghosh, “Investigation of the random forest framework for classification of hyperspectral data,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 3, pp. 492–501, Mar. 2005.
[15] P. O. Gislason, J. A. Benediktsson, and J. R. Sveinsson, “Random Forests for land cover classification,” Pattern Recognit. Lett., vol. 27, no. 4, pp. 294–300, Mar. 2006.
[30] P. Ghamisi, M. D. Mura, and J. A. Benediktsson, “A survey on spectral–spatial classification techniques based on attribute profiles,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 5, pp. 2335–2353, May 2015.
[31] W. Li and Q. Du, “A survey on representation-based classification and detection in hyperspectral remote sensing imagery,” Pattern Recognit. Lett., vol. 82, pp. 115–123, Nov. 2015.
[32] Y. Zhong, Y. Wu, X. Xu, and L. Zhang, “An adaptive subpixel mapping method based on MAP model and class determination strategy for hyperspectral remote sensing imagery,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 3, pp. 1411–1426, Mar. 2015.
[33] J. Zhao, Y. Zhong, Y. Wu, L. Zhang, and H. Shu, “Sub-pixel mapping based on conditional random fields for hyperspectral remote sensing imagery,” IEEE J. Sel. Topics Signal Process., vol. 9, no. 6, pp. 1049–1060, Sep. 2015.
[34] B. Pan, Z. Shi, N. Zhang, and S. Xie, “Hyperspectral image classification based on nonlinear spectral–spatial network,” IEEE Geosci. Remote Sens. Lett., vol. 13, no. 12, pp. 1782–1786, Dec. 2016.
[35] Y. Tarabalka, M. Fauvel, J. Chanussot, and J. A. Benediktsson, “SVM- and MRF-based method for accurate classification of hyperspectral images,” IEEE Geosci. Remote Sens. Lett., vol. 7, no. 4, pp. 736–740, Oct. 2010.
[36] P. Ghamisi, J. A. Benediktsson, and M. O. Ulfarsson, “Spectral–spatial classification of hyperspectral images based on hidden Markov random fields,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 5, pp. 2565–2574, May 2014.
[37] J. A. Benediktsson, J. A. Palmason, and J. R. Sveinsson, “Classification of hyperspectral data from urban areas based on extended morphological profiles,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 3, pp. 480–491, Mar. 2005.
[38] M. D. Mura, J. A. Benediktsson, B. Waske, and L. Bruzzone, “Extended profiles with morphological attribute filters for the analysis of hyperspectral data,” Int. J. Remote Sens., vol. 31, no. 22, pp. 5975–5991, Dec. 2010.
[39] P. R. Marpu, M. Pedergnana, M. D. Mura, J. A. Benediktsson, and L. Bruzzone, “Automatic generation of standard deviation attribute profiles for spectral–spatial classification of remote sensing data,” IEEE Geosci. Remote Sens. Lett., vol. 10, no. 2, pp. 293–297, Mar. 2013.
[40] L. Shen and S. Jia, “Three-dimensional Gabor wavelets for pixel-based hyperspectral imagery classification,” IEEE Trans. Geosci. Remote Sens., vol. 49, no. 12, pp. 5039–5046, Dec. 2011.
[41] Y. Qian, M. Ye, and J. Zhou, “Hyperspectral image classification based on structured sparse logistic regression and three-dimensional wavelet texture features,” IEEE Trans. Geosci. Remote Sens., vol. 51, no. 4, pp. 2276–2291, Apr. 2013.
[42] X. Guo, X. Huang, and L. Zhang, “Three-dimensional wavelet texture feature extraction and classification for multi/hyperspectral imagery,” IEEE Geosci. Remote Sens. Lett., vol. 11, no. 12, pp. 2183–2187, Dec. 2014.
[43] W. Li and Q. Du, “Gabor-filtering-based nearest regularized subspace for hyperspectral image classification,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 7, no. 4, pp. 1012–1022, Apr. 2014.
[44] K. He, J. Sun, and X. Tang, “Guided image filtering,” in Proc. Eur. Conf. Comput. Vis. (ECCV), 2010, pp. 1–14.
[45] K. He, J. Sun, and X. Tang, “Guided image filtering,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 6, pp. 1397–1409, Jun. 2013.
[46] Q. Zhang, X. Shen, L. Xu, and J. Jia, “Rolling guidance filter,” in Proc. Eur. Conf. Comput. Vis. (ECCV), 2014, pp. 815–830.
[47] B. Ham, M. Cho, and J. Ponce, “Robust image filtering using joint static and dynamic guidance,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 4823–4831.
[48] Y. Li, J.-B. Huang, N. Ahuja, and M.-H. Yang, “Deep joint image filtering,” in Proc. Eur. Conf. Comput. Vis. (ECCV), 2016, pp. 154–169.
[49] B. Pan, Z. Shi, and X. Xu, “R-VCANet: A new deep-learning-based hyperspectral image classification method,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 10, no. 5, pp. 1975–1986, May 2017.
[50] X. Kang, S. Li, and J. A. Benediktsson, “Spectral–spatial hyperspectral image classification with edge-preserving filtering,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 5, pp. 2666–2677, May 2014.
[51] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning. Berlin, Germany: Springer, 2001.
[52] C. Tomasi and R. Manduchi, “Bilateral filtering for gray and color images,” in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Jan. 1998, pp. 839–846.
[53] J. Wright, A. Ganesh, S. Rao, Y. Peng, and Y. Ma, “Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization,” in Proc. Adv. Neural Inf. Process. Syst. (NIPS), 2009, pp. 2080–2088.
[54] X. Kang, S. Li, L. Fang, and J. A. Benediktsson, “Intrinsic image decomposition for feature extraction of hyperspectral images,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 4, pp. 2241–2253, Apr. 2015.
[55] (2014). IEEE GRSS Data Fusion Contest. [Online]. Available: http://www.grss-ieee.org/community/technical-committees/data-fusion/
[56] G. M. Foody, “Thematic map comparison: Evaluating the statistical significance of differences in classification accuracy,” Photogram. Eng. Remote Sens., vol. 70, no. 5, pp. 627–633, 2004.

Bin Pan received the B.S. degree from the School of Astronautics, Beihang University, Beijing, China, in 2013, where he is currently pursuing the Ph.D. degree with the Image Processing Center.
His research interests include deep learning, hyperspectral unmixing, and hyperspectral image classification.

Zhenwei Shi (M’13) received the Ph.D. degree in mathematics from the Dalian University of Technology, Dalian, China, in 2005.
He was a Post-Doctoral Researcher with the Department of Automation, Tsinghua University, Beijing, China, from 2005 to 2007. He was a Visiting Scholar with the Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL, USA, from 2013 to 2014. He is currently a Professor and the Dean of the Image Processing Center, School of Astronautics, Beihang University, Beijing. He has authored or co-authored over 100 scientific papers in refereed journals and proceedings, including the IEEE Transactions on Pattern Analysis and Machine Intelligence, the IEEE Transactions on Neural Networks, the IEEE Transactions on Geoscience and Remote Sensing, the IEEE Geoscience and Remote Sensing Letters, and the IEEE Conference on Computer Vision and Pattern Recognition. His research interests include remote sensing image processing and analysis, computer vision, pattern recognition, and machine learning.

Xia Xu received the B.S. and M.S. degrees from the School of Electrical Engineering, Yanshan University, Qinhuangdao, China, in 2012 and 2015, respectively. She is currently pursuing the Ph.D. degree with the Image Processing Center, School of Astronautics, Beihang University, Beijing, China.
Her research interests include hyperspectral unmixing and multiobjective optimization.