Abstract— In this paper, a progressive collaborative representation (PCR) framework is proposed that is able to incorporate any existing color image demosaicing method for further boosting its demosaicing performance. Our PCR consists of two phases: (i) offline training and (ii) online refinement. In phase (i), multiple training-and-refining stages will be performed. In each stage, a new dictionary will be established through the learning of a large number of feature-patch pairs, extracted from the demosaicked images of the current stage and their corresponding original full-color images. After training, a projection matrix will be generated and exploited to refine the current demosaicked image. The updated image with improved quality will be used as the input for the next training-and-refining stage, where the same processing is performed likewise. At the end of phase (i), all the projection matrices generated as above-mentioned will be exploited in phase (ii) to conduct online demosaicked-image refinement of the test image. Extensive simulations conducted on two commonly-used test datasets (i.e., IMAX and Kodak) for evaluating demosaicing algorithms have clearly demonstrated that our proposed PCR framework is able to consistently boost the performance of every image demosaicing method we experimented with, in terms of both objective and subjective performance evaluations.

Index Terms— Image demosaicing, color filter array (CFA), residual interpolation, progressive collaborative representation.

I. INTRODUCTION

... color components at each pixel position. In general, a full-color image is composed of three primary color components (i.e., red, green, and blue, denoted by R, G, and B, respectively) at each pixel location. However, considering the cost and physical size, almost all consumer-grade digital cameras exploit a single image sensor covered with a color filter array (CFA) on the sensor's surface, such that only one color-component value can be registered at each pixel location on the CFA. The most widely-used CFA pattern is the so-called Bayer's pattern [1]. Such recorded CFA data is commonly termed a mosaicked image. In order to produce a full-color image from the mosaicked image, the other two missing color-component values at each pixel location are required to be estimated, and this process is called demosaicing.

Due to the strong correlations existing among the three color channels in nature, many demosaicing algorithms estimate the missing color-component values based on the color difference (CD) fields (e.g., R-G or B-G) [3]–[9]. The effectiveness of this strategy is due to the fact that each generated CD field tends to be a fairly smooth data field, which is highly beneficial to the estimation of those missing color-component values. Since the Bayer's pattern [1] has twice the number of available G-channel samples as that of the R channel and
Authorized licensed use limited to: CITY UNIV OF HONG KONG. Downloaded on March 14,2020 at 09:27:29 UTC from IEEE Xplore. Restrictions apply.
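The Bayer subsampling described above (one recorded color component per pixel location) is also how the training data in Section III is simulated from ground-truth images. As a minimal illustration, assuming the common RGGB variant of the Bayer pattern (the specific variant is an assumption, not stated here), a full-color image can be reduced to a single-channel mosaic as follows; this is a sketch, not code from the paper:

```python
import numpy as np

def bayer_mosaic(rgb):
    """Simulate a Bayer CFA capture from a full-color (H, W, 3) image,
    assuming the RGGB layout:  R G / G B.  Only one color-component
    value survives at each pixel location, as on a real sensor."""
    h, w, _ = rgb.shape
    mosaic = np.zeros((h, w), dtype=rgb.dtype)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # R at even rows, even cols
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # G at even rows, odd cols
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # G at odd rows, even cols
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # B at odd rows, odd cols
    return mosaic
```

Demosaicing is the inverse problem: estimating the two discarded components at every pixel location.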
NI et al.: COLOR IMAGE DEMOSAICING USING PROGRESSIVE COLLABORATIVE REPRESENTATION 4953
called the residual interpolation (RI) methods [2], [18]–[21], in which the guided filter (GF) [23] is exploited to estimate the missing color components. That is, the estimation of a certain channel is generated under the guidance of another channel. The residuals (i.e., estimation errors) produced by the GF are the differences between the sensor-registered pixel values (i.e., the ground truth) and the pixel values estimated by the GF. Compared with the CD-based approach, the RI-based methods are advantageous in terms of both peak signal-to-noise ratio (PSNR) and subjective visual quality. Since a smoother image field is easier to interpolate, the success of the RI-based approach lies in the fact that its resulting residual data field is smoother than the CD data field. Furthermore, among all existing RI-based methods, the IRI [2], [20] delivers attractive demosaicing performance while maintaining reasonable computational complexity.

In recent years, many image processing algorithms have been developed by leveraging the potential of convolutional neural networks (CNNs) to alleviate the dependence on hand-crafted priors. Tan et al. [24] proposed a two-stage CNN-based demosaicking algorithm, while Cui et al. [25] proposed a three-stage CNN-based demosaicking method. Tan et al. [26] proposed a multiple-CNN structure for demosaicking. Besides, Gharbi et al. [27] and Kokkinos et al. [28] designed fully convolutional networks to perform joint demosaicking and denoising.

In this paper, a general image refinement framework, called the progressive collaborative representation (PCR), is proposed. The block diagram of the PCR is depicted in Fig. 1, where the IRI is used for demonstration, while any other demosaicing algorithm can be used in this framework for improving its demosaicked image quality. Therefore, the proposed PCR is a general framework for conducting image quality refinement, which consists of two phases: (i) offline training and (ii) online refinement. The learned projection matrices computed in phase (i) will be exploited to progressively refine the image quality of the test image in phase (ii). Our PCR is developed based on the intuition that the loss or distortion of image details in a demosaicked image, caused by the algorithm's limitations in handling adverse conditions such as spectral correlation among the three color channels, low-lighting-incurred noise, and sophisticated image contents, can be recovered through a sequence of training-and-refining stages that correct these errors. Each stage can be viewed as a corrector of the previous stage, exercising a strategy similar to the prediction-and-correction methodology [29]. Consequently, any large prediction errors that have not been effectively corrected in the current stage will have more chances to be further corrected in subsequent stages, since each stage re-computes the remaining prediction errors and re-learns from them for conducting further corrections via the generated projection matrix.

The remainder of this paper is organized as follows. In Section II, the proposed PCR demosaicing method is presented in detail. In Section III, extensive performance evaluations of the proposed PCR and other state-of-the-art methods are performed and compared. Section IV concludes the paper.

II. PROGRESSIVE COLLABORATIVE REPRESENTATION (PCR) IMAGE ENHANCEMENT FRAMEWORK

A. Overview

In this paper, a progressive collaborative representation (PCR) framework is proposed and exploited for enhancing demosaicked image quality, as depicted in Fig. 1. It consists of two phases: (i) offline training and (ii) online demosaicing. In the offline phase (i), multiple training-and-refining stages will be performed. The goal is to generate a projection matrix at the end of each training stage such that it can be exploited for refining the demosaicked images obtained from the previous stage. The refined demosaicked images will be used for the next training-and-refining stage.

In the online phase (ii), the given mosaicked image is subject to be demosaicked. For that, a chosen demosaicing algorithm (i.e., the IRI [2] as shown in Fig. 1) will be applied to produce an initial demosaicked image as the starting stage for conducting progressive refinements, based on the projection matrices supplied from the offline training stage. Our proposed PCR framework is detailed in the following subsections.

B. Phase 1: Compute the Projection Matrices via Offline Training

As shown in Fig. 1, the original full-color images (i.e., the ground truth) O_i (where i = 1, 2, ..., N_O), the simulated mosaicked images M_i, and their initially-generated demosaicked images D_i^(1) are prepared for the training stage, from which the originals O_i and the demosaicked images D_i^(1) are used to form collaborative representations as the inputs for conducting training to generate the first projection matrix P^(1). Note that the superscript (1) here denotes the first iteration stage. Such practice will be applied to all other defined variables that involve the iteration index likewise. That is, the symbol (n) on the superscript of a variable denotes the mentioned quantity obtained in the n-th iteration.

The generated P^(1) is then used to refine the demosaicked images D_i^(1) of the current (first) stage; the refined images are denoted as D_i^(2), which are the input of the subsequent stage. This training-and-refining process will be performed iteratively for N times in the training stage. At the end of phase (i), a set of projection matrices P^(n) (where n = 1, 2, ..., N) will be generated, to be used in phase (ii) for performing online demosaicing of the given mosaicked image. For the above-mentioned, some important notes are highlighted as follows.

First, the simulated mosaicked images M_i are generated from the ground-truth images O_i by subsampling each image according to the Bayer's pattern [1]. Second, due to its superior demosaicing performance, the iterative residual interpolation (IRI) [2] is adopted in our work for generating the initial demosaicked images D_i^(1). (The same IRI will be used in the online demosaicing stage as well.) The proposed PCR is generic in the sense that one might
4954 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 29, 2020
Fig. 1. The proposed progressive collaborative representation (PCR) framework for progressively refining the demosaicked color image through multiple stages of a training-and-refining (or prediction-and-correction) strategy. It consists of two phases: offline training (top part) and online refinement (bottom part). At each stage in the offline training, a projection matrix is generated to refine the demosaicked images of the current stage, which are then used as the input of the next refinement stage. This projection matrix is also used in the corresponding stage of the online refinement phase. The state-of-the-art iterative residual interpolation (IRI) [2] depicted in this figure is used for demonstration, while any other demosaicing algorithm can be used here for improving its demosaicked image quality.
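To make the two-phase flow of Fig. 1 concrete, the sketch below casts each offline stage as learning a projection matrix by ridge regression from demosaicked-image feature patches to error patches, and the online phase as repeatedly applying those stored matrices. The function names, the linear ridge-regression form, and the data layout (patches as columns) are all illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def train_stage(features, errors, lam=0.1):
    """One offline stage: fit a projection matrix P mapping demosaicked-image
    feature patches (columns of `features`) to the corresponding error
    patches:  P = E F^T (F F^T + lam*I)^-1  (ridge regression)."""
    F, E = features, errors
    gram = F @ F.T + lam * np.eye(F.shape[0])
    return E @ F.T @ np.linalg.inv(gram)

def offline_training(stage_features, stage_errors):
    """Phase (i): one projection matrix per training-and-refining stage."""
    return [train_stage(F, E) for F, E in zip(stage_features, stage_errors)]

def online_refinement(patches, features_fn, projections):
    """Phase (ii): progressively refine demosaicked patches, adding the
    predicted correction at each stage (cf. Eq. (8): z <- f + z)."""
    z = patches
    for P in projections:
        z = z + P @ features_fn(z)
    return z
```

A usage pattern would be to run `offline_training` once on the training set, store the matrices, and then call `online_refinement` on every test image's patches.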
exploit any other demosaicing algorithm to replace the IRI in both the offline training and online testing stages, to boost the performance of the considered demosaicing algorithm. Third, two pre-processing steps, (i) feature extraction and (ii) dimensionality reduction (not shown in Fig. 1 for simplicity of drawing), are applied to the feature patches generated from the demosaicked images D_i^(1) for learning the dictionary. This is further detailed below, using the first iteration for illustration.

The IRI-generated demosaicked images D_i^(1) are filtered using the 1D Roberts and Laplacian high-pass filters in the horizontal and vertical directions to extract features, followed by segmenting the filtered images into 3×3 image patches. These patches are collectively denoted as the feature patches of the demosaicked images D_i^(1). Further note that the above-described steps for generating feature patches are applied to the three color channels (i.e., R, G, and B) separately, followed by concatenating the resulting feature patches together. In consideration of computational complexity, the principal component analysis (PCA) algorithm is further applied to these feature patches to reduce their dimensionality while preserving 99.9% of their average energy. After the PCA process, these feature patches y_i^(1) (where i = 1, 2, ..., N_y) of the respective demosaicked images D_i^(1) are used to learn the dictionary Φ^(1) by following a processing pipeline similar to that of the K-SVD method [30]; i.e., the dictionary Φ^(1) and the coefficients c_i are generated simultaneously via the following:

$$\left[\Phi^{(1)}, \{c_i\}\right] = \arg\min_{\Phi^{(1)},\,\{c_i\}} \sum_i \left\| y_i^{(1)} - \Phi^{(1)} c_i \right\|_2^2, \quad \text{s.t.}\ \|c_i\|_0 \le L,\ \ i = 1, 2, \ldots, N_y, \tag{1}$$

where c_i are the coefficients corresponding to the feature patch y_i^(1), and L = 3 is the maximal sparsity set for establishing the dictionary Φ^(1).

In our work, the color image demosaicing refinement is considered as a collaborative representation problem [31]. That is, given an IRI-demosaicked image, the feature patches x_i^(1) are obtained from the image by processing it in a similar way to that of obtaining y_i^(1) described previously, and the problem boils down to a least-squares regression problem regularized by the l2-norm of the coefficient vector, as follows.

First, we need to establish a large number of feature-patch pairs {f_{D_i}^(1), f_{E_i}^(1)} (where i = 1, 2, ..., N_f), where f_{D_i}^(1) are extracted from a scaled pyramid representation of the demosaicked images D_i^(1), while f_{E_i}^(1) are generated from the error images E_i^(1), computed by subtracting the demosaicked image from the original full-color image. Further note that no dimensionality reduction will be applied to f_{E_i}^(1). Next, we need to search the neighborhood of each atom d_k^(1) of the dictionary, as follows.
For each atom, the nearest neighborhoods {N_{D_k}^(1), N_{E_k}^(1)} of the demosaicked images and the corresponding full-color images will be identified from the pairs {f_{D_i}^(1), f_{E_i}^(1)}. Here, the absolute value of the dot product between the atom d_k^(1) and each individual feature patch in f_{D_i}^(1) is computed to measure the degree of similarity; that is,

$$\delta\left(d_k^{(1)}, f_{D_i}^{(1)}\right) = \left|\left(d_k^{(1)}\right)^{T} \cdot f_{D_i}^{(1)}\right|. \tag{2}$$

Algorithm 1: Compute the Projection Matrices via Offline Training.
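The neighborhood search around each atom can be sketched directly from Eq. (2): score every demosaicked-image feature patch by the absolute dot product with the atom, then keep the top-K pairs. The array layout and function name are illustrative assumptions (the paper uses a neighborhood of 2,048 samples; K is a toy value here):

```python
import numpy as np

def atom_neighborhood(d_k, f_D, f_E, K):
    """Return the K feature-patch pairs most similar to atom d_k, ranked by
    |d_k^T f_Di| as in Eq. (2).  f_D, f_E: (num_patches, dim) paired arrays."""
    delta = np.abs(f_D @ d_k)          # Eq. (2) score for every patch
    top = np.argsort(-delta)[:K]       # indices of the K largest scores
    return f_D[top], f_E[top]
```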
Algorithm 2: Conduct Online Demosaicing With Progressive Refinements.

... demosaicked image patch z_i^(n); that is,

$$z_{D_i}^{(n+1)} = f_{D_i}^{(n+1)} + z_i^{(n)}. \tag{8}$$

By combining all the refined patches z_i^(n+1) and averaging the intensity values over the overlapping areas, the final refined demosaicked images D_i^(n+1) will be generated. The entire phase (ii) is summarized in Algorithm 2. Lastly, the above-described progressive refinement processing is conducted for each of the three color channels separately.

III. EXPERIMENTS

A. Experiment Settings

1) Training Dataset: For learning-based demosaicing methods, the training dataset exploited for training has a direct impact on the quality of the demosaicked images. In our experiments, the proposed PCR is trained using the same 100 training images as the ones used in [17], and these training images do not adopt any data augmentation (i.e., rotation and flipping) for increasing the size of the training dataset.

2) Testing Dataset: The IMAX dataset (18 images) and the Kodak dataset (24 images) are used in our experiments, since they have been widely adopted for assessing the performance of demosaicing methods (e.g., [2], [17], [21]). Note that the images in the IMAX dataset have weaker spectral correlations among the three color channels and are considered to be more challenging, while the images in the Kodak dataset have stronger correlations among the three color channels [33]. Each full-color test image is first sub-sampled according to the Bayer's pattern [1], followed by conducting the demosaicing process.

3) Evaluation Criteria: For objectively evaluating the performance of the demosaicing methods under comparison, two evaluation metrics are used; i.e., the color peak signal-to-noise ratio (CPSNR) and the structural similarity (SSIM) [34]. The former computes the intensity differences between the individual channels of the demosaicked images and the original images (i.e., the ground truth); that is,

$$\text{CPSNR} = 10 \times \log_{10} \frac{255^2}{\text{CMSE}}, \quad \text{where} \quad \text{CMSE} = \frac{\sum_{C \in \{R,G,B\}} \sum_{i}^{H} \sum_{j}^{W} \left\| I_o^C(i,j) - I_d^C(i,j) \right\|_2^2}{3 \times H \times W}, \tag{9}$$

symbol ||·||_2 denotes the l2 norm, and parameters H and W denote the height and the width of the image, respectively. Note that I_o^C(i,j) and I_d^C(i,j) represent the R, G, or B value at pixel (i,j) of the original image and the demosaicked image, respectively.

To further evaluate the image quality from the perceptual viewpoint of the human visual system, the SSIM is computed as another supporting performance evaluation measurement to reflect the similarity between the original image and the demosaicked image. The SSIM considers the degradations incurred on the demosaicked image's structure, instead of the pixel-intensity differences only. Three types of similarities are considered in the SSIM: the luminance similarity, the contrast similarity, and the structural similarity. For a joint measurement, the average SSIM is obtained by averaging the SSIM values individually obtained from the R, G, and B channels. Note that the higher the SSIM value, the better the perceptual quality of the demosaicked image.

4) Default Settings for the PCR's Parameters: Unless otherwise specified, the simulations conducted for the performance evaluation of our proposed PCR adopted the following default settings. The size of the image patches is set to 3 × 3 pixels, with an overlap of 2 pixels between adjacent patches. Four 1D high-pass filters (i.e., f_1 = [−1, 0, 1], f_2 = [−1, 0, 1]^T, f_3 = [1, 0, −2, 0, 1], and f_4 = [1, 0, −2, 0, 1]^T) are used to extract the high-frequency details of each image patch, followed by applying the PCA to reduce the dimensionality. The same dictionary training method as described in [32] and [30] is exploited for the PCR; i.e., 4,096 atoms for each dictionary, a neighborhood size of 2,048 training samples, and 5 million training samples drawn from the demosaicked-image and original-image patches. The algorithm's performance under various experimental settings will be investigated in the following sub-sections.

B. Performance Analysis

To evaluate the performance, the proposed PCR is compared with ten state-of-the-art demosaicing methods: the learned
TABLE I
AVERAGE PSNR AND CPSNR RESULTS (IN dB) OBTAINED FROM THE KODAK AND THE IMAX DATASETS. THE FIRST-RANKED, SECOND-RANKED, AND THIRD-RANKED PERFORMANCE IN EACH COLUMN IS HIGHLIGHTED IN BOLD WITH RED, BLUE, AND BLACK COLORS, RESPECTIVELY.
simultaneous sparse coding (LSSC) [14], the gradient-based threshold-free (GBTF) [5], the local directional interpolation and nonlocal adaptive thresholding (LDI-NAT) [6], the multiscale gradients-based (MSG) [8], the residual interpolation (RI) [18], the minimized-Laplacian residual interpolation (MLRI) [19], the directional difference regression (DDR) [17], the fused regression (FR) [17], the adaptive residual interpolation (ARI) [21], and the iterative residual interpolation (IRI) [2]. Note that the source codes of these methods were all downloaded from their corresponding authors.

1) Objective Performance Analysis: Firstly, Table I compares the proposed method with the state-of-the-art algorithms in terms of the average PSNR and CPSNR on the IMAX dataset, the Kodak dataset, and their combined dataset (denoted as "IMAX+Kodak"). Note that the DDR [17] and FR [17] have two different sets of parameters for the IMAX and Kodak datasets, separately. For the "IMAX+Kodak" dataset, we apply each above-mentioned parameter set to the images of this dataset, consequently leading to two sets of experimental results, one for the DDR [17] and the other for the FR.

From Table I, one can observe that our proposed PCR (with the use of IRI [2] as the initial demosaicing stage) has achieved the best average CPSNR among all the demosaicing methods under comparison on the IMAX dataset. Moreover, the PCR gains an additional 1.1351 dB improvement on the IMAX dataset and 1.0381 dB on the Kodak dataset, when compared to the IRI [2]. It is clear that the proposed PCR refinement framework effectively improves the performance even for a state-of-the-art method such as the IRI experimented with in this case. Compared with the second-best method on the IMAX dataset, the ARI [21], an additional 0.5985 dB gain is achieved. Compared with the three learning-based methods (i.e., LSSC [14], DDR [17], and FR [17]) experimented on the IMAX dataset, the additional performance gains yielded by our proposed PCR are 2.0250 dB, 1.0473 dB, and 0.7203 dB, respectively. Furthermore, Table I also presents the quantitative comparisons on each color channel individually for all methods in terms of the PSNR, from which one can see that the average PSNR obtained from each color channel using our proposed PCR also delivers the best performance on the IMAX dataset. In addition, our proposed PCR also achieves the highest average CPSNR on the combined IMAX+Kodak dataset.

Similar to Table I, Table II and Table III compare the proposed method with the same set of state-of-the-art algorithms in terms of the SSIM and MSSIM, respectively. One can still observe that our proposed PCR consistently yields the best performance on the IMAX dataset and the combined IMAX+Kodak dataset. For the Kodak dataset, our proposed method delivers performance fairly close to the best method, the LSSC [14].

Lastly, it is worthwhile to point out that several demosaicing methods achieve good performance on the Kodak dataset, but not on the IMAX dataset. For example, the sparse-representation-based LSSC [14] method delivers the best CPSNR performance on the Kodak dataset; however, its performance drops drastically on the IMAX dataset. The DDR [17] and FR [17] show the same trend. This might be due to the well-known fact that the color images from the Kodak dataset have unusually high spectral correlations among the three color channels, as highlighted in several previous works (e.g., [33]). Since these methods favour color images with a high degree of spectral correlation, they tend to produce inferior demosaicked results otherwise, such as on the images from the IMAX dataset. In contrast, our proposed PCR is much less sensitive to spectral correlations among the three color channels and thus achieves more robust performance regardless of which dataset is used.

2) Subjective Performance Analysis: Besides its superiority on the objective evaluations, our proposed PCR method also shows superiority on subjective quality assessment. All images from both datasets have been experimented on, and the demosaicked images show a consistent performance trend across both objective and subjective evaluations. Two representative test images from each dataset are selected for visual comparison, since they are more challenging for demosaicing. Specifically, Figs. 2-5 show the demosaicked
TABLE II
AVERAGE SSIM RESULTS OBTAINED FROM THE KODAK AND THE IMAX DATASETS. THE FIRST-RANKED, SECOND-RANKED, AND THIRD-RANKED PERFORMANCE IN EACH COLUMN IS HIGHLIGHTED IN BOLD WITH RED, BLUE, AND BLACK COLORS, RESPECTIVELY.

TABLE III
AVERAGE MSSIM RESULTS ON THE KODAK AND THE IMAX DATASETS. THE FIRST-RANKED, SECOND-RANKED, AND THIRD-RANKED PERFORMANCE IN EACH COLUMN IS HIGHLIGHTED IN BOLD WITH RED, BLUE, AND BLACK COLORS, RESPECTIVELY.
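The CPSNR metric of Eq. (9), used throughout Tables I-III, pools the squared error over all three channels before taking the logarithm. A direct implementation is sketched below (a minimal version: the degenerate case of identical images, where CMSE is zero, is not handled):

```python
import numpy as np

def cpsnr(original, demosaicked):
    """CPSNR of Eq. (9) for (H, W, 3) images with an 8-bit dynamic range."""
    diff = original.astype(np.float64) - demosaicked.astype(np.float64)
    cmse = np.mean(diff ** 2)          # pools over all 3 * H * W values
    return 10.0 * np.log10(255.0 ** 2 / cmse)
```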
results of four cropped sub-images (as indicated by the green-colored frame in each test image) for close-up visual comparisons.

In Fig. 2, one can see that the zoomed-in region contains a rattan basket (with light brown and dark black colors) and fruit in red. Due to the fact that the spectral correlations among the three color channels of the IMAX images are much weaker, the GBTF [5], FR [17], ARI [21] and IRI [2] tend to yield zipper effects. Furthermore, some distinct color artifacts can be observed around the edges of the black rattan in Fig. 2 (d)-(g). Although the sparse-based LSSC [14] method can yield a demosaicked image fairly similar to that of our proposed PCR method, one can easily observe that many details of the PCR-demosaicked image are still superior to and more natural than those of the LSSC [14], with less zipper effect (e.g., the brown rattan area). In fact, our PCR result is nearly identical to the ground truth in this zoomed-in image.

Compared with the interpolation-based methods (i.e., GBTF [5], ARI [21], and IRI [2]), the learning-based demosaicing methods (i.e., LSSC [14], FR [17], and our proposed PCR) have produced much improved demosaicked images. Fig. 3 demonstrates a close-up of the demosaicked image from IMAX 12. Through comparison, one can see that our proposed PCR delivers the best visual quality on sharp edges and color details. On the contrary, the other methods under comparison produce many visible zipper artifacts and false colors along the edges of the drawings, as shown in Fig. 3(c)-(g).

Fig. 4 displays a zoomed-in portion of a jalousie window, which is often used to evaluate the demosaicked results of highly-textured regions. One can see that our proposed PCR has achieved a distinctly superior demosaicked result, which is almost identical to the original image. In contrast, one can observe obvious color aliasing and pattern shift produced by the LSSC [14], GBTF [5], FR [17], ARI [21], and IRI [2]. Although the LSSC [14] and GBTF [5] have yielded much better visual results than those in Fig. 4 (e)-(g), they still produce noticeable false color artifacts, as illustrated in Fig. 4 (c)-(d).

To further evaluate the demosaicked results of textured areas using our proposed PCR, Fig. 5 demonstrates a zoomed-in area that is also highly textured, as it has dense edges
Fig. 2. Visual comparisons for a close-up region on the image “IMAX 9” from the IMAX dataset.
Fig. 3. Visual comparisons for a close-up region on the image “IMAX 12” from the IMAX dataset.
clustered in a small region. It is quite clear that all algorithms except our proposed PCR introduce false color artifacts and/or edge distortion. In particular, the LSSC [14], GBTF [5], FR [17] and IRI [2] methods produce the most distinct color artifacts in this case, while the ARI [21] method delivers much better demosaicked image quality. However, with a closer look, one can see that our proposed PCR has yielded the least amount of artifacts; e.g., refer to the top-right portion of the sub-image in Fig. 5, where the ARI [21] has produced visible color leakage in blue. This study on visual quality has further demonstrated the superiority of our proposed PCR.
Fig. 4. Visual comparisons for a close-up region on the image “Kodak 1” from the Kodak dataset.
Fig. 5. Visual comparisons for a close-up region on the image “Kodak 8” from the Kodak dataset.
TABLE IV
AVERAGE RUNNING TIME (IN SECONDS) PER IMAGE ON THE IMAX AND KODAK DATASETS

TABLE V
AVERAGE CPSNR RESULTS (IN dB) AND SSIM OF PCR ON THE KODAK AND THE IMAX DATASETS WITH DIFFERENT NUMBERS OF ITERATIONS

TABLE VI
AVERAGE CPSNR PERFORMANCE COMPARISON ON THE IMAX AND THE KODAK DATASETS UNDER DIFFERENT LEVELS OF GAUSSIAN NOISE POWER
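The default feature-extraction settings listed in Section III-A.4 (the four 1D high-pass filters f_1 through f_4, 3×3 patches with a 2-pixel overlap) can be sketched as follows. The filtering helpers are illustrative; the PCA step that the paper applies afterwards (preserving 99.9% of the average energy) is omitted here:

```python
import numpy as np

# The two horizontal filters; f2 and f4 are their vertical (transposed) forms.
F1 = np.array([-1.0, 0.0, 1.0])            # first-order difference
F3 = np.array([1.0, 0.0, -2.0, 0.0, 1.0])  # second-order (Laplacian-like)

def filter_1d(img, kernel, axis):
    """Convolve every column (axis=0) or row (axis=1) with a 1D kernel."""
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), axis, img)

def feature_patches(channel, size=3):
    """Stack the four filter responses and extract vectorized size x size
    patches at stride 1 (i.e., a 2-pixel overlap for 3 x 3 patches)."""
    responses = [filter_1d(channel, F1, 1), filter_1d(channel, F1, 0),
                 filter_1d(channel, F3, 1), filter_1d(channel, F3, 0)]
    h, w = channel.shape
    patches = []
    for i in range(h - size + 1):
        for j in range(w - size + 1):
            patches.append(np.concatenate(
                [r[i:i + size, j:j + size].ravel() for r in responses]))
    return np.array(patches)               # (num_patches, 4 * size * size)
```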
Fig. 7. A study of the resulting CPSNR performance of the proposed PCR algorithm with respect to the values set for: (a) the atom number, (b) the nearest-neighbor number, and (c) the number of training samples. These simulations are conducted on the IMAX (blue curve) and Kodak (red curve) datasets.
GBTF [5] and GBTF-PCR, one can see that a CPSNR gain of about 3.73 dB is yielded when the GBTF [5] is incorporated into our PCR framework (i.e., GBTF-PCR). On the other hand, with more advanced demosaicing methods, it is expected that the benefit gained from the proposed PCR will diminish. For example, the CDMNet [25], which is a state-of-the-art deep-learning approach, yields only an additional 0.01 dB of CPSNR gain in our PCR framework, as shown in Table VIII.

IV. CONCLUSION

In this paper, a generic color image post-refinement framework is proposed, called the progressive collaborative representation (PCR), which is exploited for progressively improving the demosaicked image through multiple stages. The proposed PCR has two phases, as follows. In the offline training phase, the refined demosaicked images at any given stage are the output of the previous stage. The high-frequency features (such as edges) extracted from the difference between these images and their corresponding original full-RGB images are the basis for re-training. The generated projection matrix will be exploited for further refining the demosaicked images of the current stage. These procedures will be repeated in the next stage. The above-mentioned methodology has the effect that any errors or artifacts that have not been sufficiently corrected in the previous stage will have a chance to be further corrected in the current stage. At the end of offline training, all the generated projection matrices will be used in the corresponding stages of the online refinement phase to conduct the post-refinement process.

Extensive experiments conducted on two widely used image datasets (i.e., the IMAX and the Kodak) for evaluating the performance of color image demosaicing have clearly shown that our proposed PCR framework is able to effectively ...

REFERENCES

[4] L. Zhang and X. Wu, "Color demosaicking via directional linear minimum mean square-error estimation," IEEE Trans. Image Process., vol. 14, no. 12, pp. 2167–2178, Dec. 2005.
[5] I. Pekkucuksen and Y. Altunbasak, "Gradient based threshold free color filter array interpolation," in Proc. IEEE Int. Conf. Image Process., Sep. 2010, pp. 137–140.
[6] X. Wu, "Color demosaicking by local directional interpolation and nonlocal adaptive thresholding," J. Electron. Imag., vol. 20, no. 2, Apr. 2011, Art. no. 023016.
[7] I. Pekkucuksen and Y. Altunbasak, "Edge strength filter based color filter array interpolation," IEEE Trans. Image Process., vol. 21, no. 1, pp. 393–397, Jan. 2012.
[8] I. Pekkucuksen and Y. Altunbasak, "Multiscale gradients-based color filter array interpolation," IEEE Trans. Image Process., vol. 22, no. 1, pp. 157–165, Jan. 2013.
[9] X. Li, "Demosaicing by successive approximation," IEEE Trans. Image Process., vol. 14, no. 3, pp. 370–379, Mar. 2005.
[10] B. K. Gunturk, Y. Altunbasak, and R. M. Mersereau, "Color plane interpolation using alternating projections," IEEE Trans. Image Process., vol. 11, no. 9, pp. 997–1013, Sep. 2002.
[11] L. Chen, K.-H. Yap, and Y. He, "Subband synthesis for color filter array demosaicking," IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 38, no. 2, pp. 485–492, Mar. 2008.
[12] B. Leung, G. Jeon, and E. Dubois, "Least-squares luma–chroma demultiplexing algorithm for Bayer demosaicking," IEEE Trans. Image Process., vol. 20, no. 7, pp. 1885–1894, Jul. 2011.
[13] L. Fang, O. C. Au, Y. Chen, A. K. Katsaggelos, H. Wang, and X. Wen, "Joint demosaicing and subpixel-based down-sampling for Bayer images: A fast frequency-domain analysis approach," IEEE Trans. Multimedia, vol. 14, no. 4, pp. 1359–1369, Aug. 2012.
[14] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Non-local sparse models for image restoration," in Proc. IEEE 12th Int. Conf. Comput. Vis., Sep. 2009, pp. 54–62.
[15] J. Li, C. Bai, Z. Lin, and J. Yu, "Optimized color filter arrays for sparse representation-based demosaicking," IEEE Trans. Image Process., vol. 26, no. 5, pp. 2381–2393, May 2017.
[16] J. Duran and A. Buades, "Self-similarity and spectral correlation adaptive algorithm for color demosaicking," IEEE Trans. Image Process., vol. 23, no. 9, pp. 4031–4040, Sep. 2014.
[17] J. Wu, R. Timofte, and L. Van Gool, "Demosaicing based on directional difference regression and efficient regression priors," IEEE Trans. Image Process., vol. 25, no. 8, pp. 3862–3874, Aug. 2016.
[18] D. Kiku, Y. Monno, M. Tanaka, and M. Okutomi, "Residual interpolation for color image demosaicking," in Proc. IEEE Int. Conf. Image Process., Sep. 2013, pp. 2304–2308.
address several key adverse factors through a unfied treatment [19] D. Kiku, Y. Monno, M. Tanaka, and M. Okutomi, “Minimized-
via the generated projection matrices. The above-mentioned Laplacian residual interpolation for color image demosaicking,” Proc.
SPIE, vol. 9023, Mar. 2014, Art. no. 90230L.
key factors spectral correlation existing among three color
[20] W. Ye and K.-K. Ma, “Image demosaicing by using iterative resid-
channels, granular noise incurred in low-lighting condi- ual interpolation,” in Proc. IEEE Int. Conf. Image Process. (ICIP),
tions (Section III-D), and demosaicing algorithm’s deficiency Oct. 2014, pp. 1862–1866.
(Section III-E). All these have indicated that the proposed [21] Y. Monno, D. Kiku, M. Tanaka, and M. Okutomi, “Adaptive residual
interpolation for color and multispectral image demosaicking,” Sensors,
PCR post-refinement framework is able to deliver consistent vol. 17, no. 12, p. 2787, Dec. 2017.
improvement and makes the developed algorithm’s perfor- [22] X. Li, B. Gunturk, and L. Zhang, “Image demosaicing: A systematic
mance more robust against adverse factors. survey,” Proc. SPIE, vol. 6822, Jan. 2008, Art. no. 68221J.
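The staged online refinement summarized above can be illustrated with a minimal sketch. This is a simplified, single-channel illustration under assumed details: the function names (`pcr_refine`, `refine_one_stage`), the patch size, the overlap step, and the patch-averaging scheme are illustrative choices rather than the paper's exact design, and the paper's learned high-frequency feature extraction is reduced here to raw patch vectors.

```python
import numpy as np

def _positions(size, patch, step):
    # Overlapping patch origins that fully cover one axis,
    # forcing the last patch to be flush with the border.
    pos = list(range(0, size - patch + 1, step))
    if pos[-1] != size - patch:
        pos.append(size - patch)
    return pos

def refine_one_stage(img, P, patch=7, step=3):
    """Apply one stage's projection matrix P (patch^2 x patch^2) to every
    overlapping patch and average the overlapping refined patches."""
    h, w = img.shape
    acc = np.zeros((h, w), dtype=np.float64)
    cnt = np.zeros((h, w), dtype=np.float64)
    for y in _positions(h, patch, step):
        for x in _positions(w, patch, step):
            f = img[y:y + patch, x:x + patch].reshape(-1)
            refined = f + P @ f  # add the predicted high-frequency correction
            acc[y:y + patch, x:x + patch] += refined.reshape(patch, patch)
            cnt[y:y + patch, x:x + patch] += 1.0
    return acc / cnt

def pcr_refine(demosaicked, projection_matrices, patch=7, step=3):
    """Online phase: run the stages in sequence, each stage using the
    projection matrix learned for it during offline training."""
    img = demosaicked.astype(np.float64)
    for P in projection_matrices:
        img = refine_one_stage(img, P, patch, step)
    return img
```

With zero matrices (no learned correction) the output equals the input, which makes the progressive structure easy to verify before plugging in actual trained matrices.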
[23] K. He, J. Sun, and X. Tang, “Guided image filtering,” IEEE Trans.
Pattern Anal. Mach. Intell., vol. 35, no. 6, pp. 1397–1409, Jun. 2013.
ACKNOWLEDGMENT [24] R. Tan, K. Zhang, W. Zuo, and L. Zhang, “Color image demosaick-
ing via deep residual learning,” in Proc. IEEE Int. Conf. Multimedia
The authors would like to thank Associate Editor Professor Expo (ICME), Jul. 2017, pp. 793–798.
Hitoshi Kiya and the three anonymous reviewers for their [25] K. Cui, Z. Jin, and E. Steinbach, “Color image demosaicking using a
insightful comments and suggestions on our initial manuscript 3-Stage convolutional neural network structure,” in Proc. 25th IEEE Int.
Conf. Image Process. (ICIP), Oct. 2018, pp. 2177–2181.
of this paper. [26] D. S. Tan, W.-Y. Chen, and K.-L. Hua, “DeepDemosaicking: Adaptive
image demosaicking via multiple deep fully convolutional networks,”
IEEE Trans. Image Process., vol. 27, no. 5, pp. 2408–2419, May 2018.
R EFERENCES
[27] M. Gharbi, G. Chaurasia, S. Paris, and F. Durand, “Deep joint demo-
[1] B. E. Bayer, “Color imaging array,” U.S. Patent 3 971 065, Jul. 20, 1976. saicking and denoising,” ACM Trans. Graph., vol. 35, no. 6, pp. 1–12,
[2] W. Ye and K.-K. Ma, “Color image demosaicing using iterative resid- Nov. 2016.
ual interpolation,” IEEE Trans. Image Process., vol. 24, no. 12, [28] F. Kokkinos and S. Lefkimmiatis, “Iterative joint image demosaicking
pp. 5879–5891, Dec. 2015. and denoising using a residual denoising network,” IEEE Trans. Image
[3] J. F. Hamilton, Jr., and J. E. Adams, Jr., “Adaptive color plan interpo- Process., vol. 28, no. 8, pp. 4177–4188, Aug. 2019.
lation in single sensor color electronic camera,” U.S. Patent 5 629 734, [29] B. Zhong, K.-K. Ma, and Z. Lu, “Predictor-corrector image interpola-
May 13, 1997. tion,” J. Vis. Commun. Image Represent., vol. 61, pp. 50–60, May 2019.
[30] R. Zeyde, M. Elad, and M. Protter, "On single image scale-up using sparse-representations," in Proc. Int. Conf. Curves Surf., 2010, pp. 711–730.
[31] L. Zhang, M. Yang, and X. Feng, "Sparse representation or collaborative representation: Which helps face recognition?" in Proc. Int. Conf. Comput. Vis., Nov. 2011, pp. 471–478.
[32] R. Timofte, V. De Smet, and L. Van Gool, "A+: Adjusted anchored neighborhood regression for fast super-resolution," in Proc. IEEE Asian Conf. Comput. Vis., Nov. 2014, pp. 111–126.
[33] F. Zhang, X. Wu, X. Yang, W. Zhang, and L. Zhang, "Robust color demosaicking with adaptation to varying spectral correlations," IEEE Trans. Image Process., vol. 18, no. 12, pp. 2706–2717, Dec. 2009.
[34] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, Apr. 2004.

Zhangkai Ni (Student Member, IEEE) received the M.E. degree in communication engineering from the School of Information Science and Engineering, Huaqiao University, Xiamen, China, in 2017. He is currently pursuing the Ph.D. degree with the Department of Computer Science, City University of Hong Kong, Hong Kong. He was a Research Engineer with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, from 2017 to 2018. His current research interests include perceptual image processing, deep learning, and computer vision.

Kai-Kuang Ma (Fellow, IEEE) received the Ph.D. degree in electrical and computer engineering from North Carolina State University, Raleigh, NC, USA. He was a Member of Technical Staff with the Institute of Microelectronics (IME), from 1992 to 1995, working on digital video coding and the MPEG standards. From 1984 to 1992, he was with IBM Corporation, Kingston, NY, USA, and then Research Triangle Park, NC, USA, engaged in various advanced DSP and VLSI product development. He is currently a Professor with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. He has published extensively in well-known international journals, conferences, and MPEG standardization meetings. He holds one U.S. patent on fast motion estimation algorithms. His research interests are in the areas of fundamental image/video processing and applied computer vision. He served as the Singapore MPEG Chairman and the Head of Delegation from 1997 to 2001. Among his MPEG contributions, two fast motion estimation algorithms (Diamond Search and MVFAST) produced by his research group have been adopted by the MPEG-4 standard as the reference core technology for fast motion estimation.

He is an Elected Member of three IEEE Technical Committees, including the Image and Multidimensional Signal Processing (IMDSP) Committee, the Multimedia Communications Committee, and the Digital Signal Processing Committee. He was the Chairman of the IEEE Signal Processing Singapore Chapter from 2000 to 2002. He was the General Chair of a series of international standardization meetings (MPEG and JPEG) and the JPEG2000 and MPEG-7 workshops held in Singapore in March 2001. He has served various roles in professional societies, including General Co-Chair of APSIPA-2017, ISPACS-2017, ACCV-2016 (Workshop), and VCIP-2013; Technical Program Co-Chair of ICASSP-2022, ICIP-2004, ISPACS-2007, IIH-MSP-2009, and PSIVT-2010; and Area Chair of ACCV-2009 and ACCV-2010. He has also made extensive editorship contributions to several international journals, including as a Senior Area Editor (since 2015) and an Associate Editor (2007–2010) of the IEEE TRANSACTIONS ON IMAGE PROCESSING, an Associate Editor (since 2015) of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, an Associate Editor (2014–2016) of the IEEE SIGNAL PROCESSING LETTERS, an Editor (1997–2012) of the IEEE TRANSACTIONS ON COMMUNICATIONS, an Associate Editor (2002–2009) of the IEEE TRANSACTIONS ON MULTIMEDIA, and an Editorial Board Member (2005–2014) of the Journal of Visual Communication and Image Representation. He was elected a Distinguished Lecturer of the IEEE Circuits and Systems Society for 2008 to 2009.

Huanqiang Zeng (Senior Member, IEEE) received the B.S. and M.S. degrees in electrical engineering from Huaqiao University, Xiamen, China, and the Ph.D. degree in electrical engineering from Nanyang Technological University, Singapore. He is currently a Full Professor with the School of Information Science and Engineering, Huaqiao University, Xiamen, China. He was a Postdoctoral Fellow with The Chinese University of Hong Kong, Hong Kong. He has published more than 100 articles in well-known journals and conferences, including IEEE TIP, TCSVT, and TITS, and has received three best poster/article awards (at the International Forum of Digital TV and Multimedia Communication 2018 and at the Chinese Conference on Signal Processing from 2017 to 2019). His research interests include image processing, video coding, machine learning, and computer vision. He has been actively serving as the General Co-Chair of the IEEE International Symposium on Intelligent Signal Processing and Communication Systems 2017 (ISPACS2017), a Co-Organizer of the ICME2020 Workshop on 3D Point Cloud Processing, Analysis, Compression, and Communication, the Technical Program Co-Chair of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2017 (APSIPA-ASC2017) and ISPACS2019, an Area Chair of the IEEE International Conference on Visual Communications and Image Processing (VCIP2015), a Technical Program Committee Member for multiple flagship international conferences, and a reviewer for numerous international journals and conferences. He has also been serving as an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE ACCESS, and IET Electronics Letters, and a Guest Editor of the Journal of Visual Communication and Image Representation, Multimedia Tools and Applications, and the Journal of Ambient Intelligence and Humanized Computing.

Baojiang Zhong (Senior Member, IEEE) received the B.S. degree in mathematics from Nanjing Normal University, China, in 1995, the M.S. degree in mathematics from the Nanjing University of Aeronautics and Astronautics (NUAA), China, in 1998, and the Ph.D. degree in mechanical and electrical engineering from NUAA, China, in 2006. From 1998 to 2009, he was on the faculty of the Department of Mathematics, NUAA, where he was an Associate Professor. In 2009, he joined the School of Computer Science and Technology, Soochow University, China, where he is currently a Full Professor. His research interests include computer vision, image processing, and numerical linear algebra.