Π19 PDF
Educational Institute of Athens, Ag. Spyridonos Street, Aigaleo, 122 10, Athens, Greece
Article info
Article history: Received 5 May 2010; received in revised form 17 February 2011; accepted 11 March 2011.
Keywords: cDNA microarray; Markov Random Field (MRF); Image segmentation; Wavelet

Abstract
In the present study, an adaptation of the Markov Random Field (MRF) segmentation model, by means of the stationary wavelet transform (SWT), applied to complementary DNA (cDNA) microarray images is proposed (WMRF). A 3-level decomposition scheme of the initial microarray image was performed, followed by a soft thresholding filtering technique. With the inverse process, a Denoised image was created. In addition, by using the Amplitudes of the filtered wavelet Horizontal and Vertical images at each level, three different Magnitudes were formed. These images were combined with the Denoised one to create the proposed WMRF segmentation model. For numerical evaluation of the segmentation accuracy, the segmentation matching factor (SMF), the Coefficient of Determination (r²), and the concordance correlation (p_c) were calculated on the simulated images. In addition, the WMRF performance was contrasted to the Fuzzy C Means (FCM), Gaussian Mixture Models (GMM), Fuzzy GMM (FGMM), and the conventional MRF techniques. Indirect accuracy performances were also tested on the experimental images by means of the Mean Absolute Error (MAE) and the Coefficient of Variation (CV). In the latter case, SPOT and SCANALYZE software results were also tested. In the former case, WMRF attained the best SMF, r², and p_c scores (92.66%, 0.923, and 0.88, respectively), whereas in the latter case it scored an MAE of 497 and a CV of 0.88. The results support the performance superiority of the WMRF algorithm in segmenting cDNA images.
© 2011 Elsevier Ireland Ltd. All rights reserved.
Corresponding author. Tel.: +30 2105385375; fax: +30 2105385303.
E-mail address: mathan@upatras.gr (E. Athanasiadis).
0169-2607/$ – see front matter © 2011 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.cmpb.2011.03.007
Computer Methods and Programs in Biomedicine 104 (2011) 307–315
(Red and Green) are produced. For each spot that corresponds to a specific gene, the fluorescence ratio of the red versus green channel is indicative of the expression level of the corresponding gene in the two samples (control vs. reference). Hence, the more precise the localization of a spot on the array, the more accurate the intensity measurement is and, consequently, the more precise the expression level measurement of the gene. Therefore, the exact location, as well as an accurate identification of the boundary of each spot, is crucial for precise gene expression measurements.

According to the literature, there are three major steps employed for the calculation of the expression level of each gene in the cDNA image [1,2]. The first step is the gridding step, where the precise localization of each spot with its background area is defined. The second step is the segmentation [3–8], where each pixel is classified as either foreground or background of the image-spot. The final step is the intensity extraction step, where the intensity of each spot is calculated.

In the present study, the gridding step is assumed to have been done by means of an automatic gridding process previously developed by our group [10]; thus, we focus on the segmentation process. More precisely, a new segmentation algorithm is proposed based on the MRF model in combination with the additional information given by the wavelet domain [12] of the microarray image. The Stationary Wavelet Transform is used to decompose the image up to 3 scales, followed by a soft thresholding filter [11]. SWT was selected due to the fact that decomposed images retain their initial dimensions and no loss of information occurs. Magnitude images from each one of the three scales, combined with the wavelet filtered image, are used to enhance the segmentation accuracy of the proposed wavelet-based MRF scheme.

2. Methods

2.1. Material

An indirect performance estimation of the segmentation algorithms was also carried out, with the use of five actual microarray images concerning Saccharomyces cerevisiae [16]. More precisely, 10 images (5 Red and 5 Green) from the same experiment, produced at five different time intervals [16], were employed. Common reference channel (Green) intensities were considered constant in time [17]. This property was used to examine the reproducibility of each segmentation algorithm [17], and the algorithm with the more stable results would be the most effective.

In addition, the use of simulated microarray images is essential for the task of directly calculating the segmentation accuracy of the algorithms, because the numerical evaluation of the segmentation accuracy is not a straightforward process: the actual boundaries of the spots cannot be accurately defined. Thus, a simulated microarray image with 1600 spots was produced, according to [12]. More specifically, an actual microarray image with 1600 spots was converted into a binary image by means of simple thresholding. This binary image was used as a template in order to create the simulated image with realistic characteristics. Each spot (foreground) was filled with intensity values randomly chosen from an exponential distribution with a predefined mean value ranging from 0 to 2^16 − 1. The background was filled with intensity values randomly chosen from a single exponential distribution with mean value equal to 4000. Afterward, the image was corrupted with additive Gaussian noise [9] at five different Signal to Noise Ratio (SNR) levels: 1, 3, 5, 7, and 9 dB. These five images were used for directly evaluating the segmentation performance of the algorithms at different noise levels.

2.2. Proposed Wavelet MRF (WMRF) Model

The WMRF models make use of both textural and contextual information of the image [12,14,18,19]. In order to use additional useful information residing on the microarray image, the stationary wavelet transform [18] was applied onto the image up to scale three, followed by a soft threshold filtering technique, as described in (1)

$$
W_{out} =
\begin{cases}
W_{in} + Th\,(G - 1) & \text{if } W_{in} > Th \\
W_{in} - Th\,(G - 1) & \text{if } W_{in} < -Th \\
G\,W_{in} & \text{otherwise}
\end{cases}
\tag{1}
$$

where $W_{out}$ and $W_{in}$ are considered to be the output and the input wavelet domain values of the image, and $Th$ and $G$ are the threshold and gain values, respectively. Filtering is applied to all wavelet domain detail images: Horizontal ($Ho$), Vertical ($Ve$) and Diagonal ($Di$). Following a trial and error procedure, the biorthogonal mother wavelet [19] was chosen. The whole process is illustrated in Fig. 1.

The denoised image ($D$) as well as the three magnitude images ($M_1$, $M_2$, and $M_3$) [15], given by relation (2) for each scale, were formed

$$
M = \sqrt{Ho^{2} + Ve^{2}}
\tag{2}
$$

where $Ho$ and $Ve$ are the Horizontal and the Vertical filtered wavelet detail images at the 1st, 2nd and 3rd scale, respectively.

Assuming that a feature vector $F$ has been extracted from a random image $X$ and that the segmentation result (label 0 for the background and 1 for the spot) is a binary vector $Y$, then according to Bayesian theory, the a-posteriori probability $P(Y|F)$ of $Y$ given $F$ can be derived from (3), known as Bayes' rule.

$$
P(Y|F) \propto p(F|Y)\,P(Y)
\tag{3}
$$

where $p(F|Y)$ is the conditional probability of $F$ given $Y$, and $P(Y)$ is the a priori probability of $Y$ that is used to describe the label distribution. The a-posteriori probability can also be expressed using the Gibbs distribution [12] as shown in (4).

$$
P(Y|F) = \frac{1}{Z}\, e^{-SE^{C}/T}
\tag{4}
$$
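As a rough illustration of relations (1), (2) and (4), the sketch below applies the gain-based soft threshold to single coefficients, forms a magnitude value, and turns two class energies into posterior probabilities. The function names, the numeric values, and the choice of normalizing over the two classes so that the posteriors sum to one are assumptions made for illustration; this is not the authors' implementation.

```python
import math


def soft_threshold(w_in, th, gain):
    """Relation (1): shift coefficients beyond the threshold, scale the rest by the gain."""
    if w_in > th:
        return w_in + th * (gain - 1.0)
    if w_in < -th:
        return w_in - th * (gain - 1.0)
    return gain * w_in


def magnitude(ho, ve):
    """Relation (2): magnitude of the filtered horizontal and vertical details."""
    return math.hypot(ho, ve)


def gibbs_posteriors(energies, temperature=1.0):
    """Relation (4): exp(-SE_c / T) / Z, with Z summed over the classes (assumed)."""
    weights = [math.exp(-e / temperature) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


# Illustrative values only; none of them come from the paper.
th, gain = 10.0, 0.0                    # gain = 0 gives classic soft thresholding
ho = soft_threshold(13.0, th, gain)     # 3.0
ve = soft_threshold(-14.0, th, gain)    # -4.0
small = soft_threshold(4.0, th, gain)   # 0.0, below the threshold
m = magnitude(ho, ve)                   # 5.0

# Hypothetical total energies for background (class 0) and spot (class 1).
p_bg, p_fg = gibbs_posteriors([2.0, 1.0], temperature=1.0)
label = 1 if p_fg > p_bg else 0         # the lower energy class wins
```

With a gain between 0 and 1, coefficients inside the threshold band are attenuated rather than zeroed, and the mapping stays continuous at ±Th.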
In (4), $Z$ is a normalization constant calculated by (5), $SE^{C}$ is the total energy for each class $c$, calculated by (6), and $T$ is a constant value.

$$
Z = \sum_{c=1}^{2} e^{-SE^{c}/T}
\tag{5}
$$

$$
SE^{c} = SE_{L} + a\,SE_{F}^{c}
\tag{6}
$$

where $SE_{L}$ and $SE_{F}$ are the energies of the labels (7) and features (8), respectively, and $a$ is a normalization value.

$$
SE_{L}(y) = \sum_{t \in Neighbor} w_{1}(y_{t}, y_{center})_{D} + \sum_{i=1}^{L} \sum_{t \in Neighbor} w_{2}(y_{t}, y_{center})_{M_{i}}
\tag{7}
$$

… were calculated.
The $SE_{L}$ and two $SE_{F}^{C}$ (one for each class $C$) energies were calculated by using (7) and (8), respectively.
Total energies (6) for each class $C$ were calculated.
A-posteriori probabilities (4) of each pixel belonging to either foreground or background were calculated. Each pixel was assigned to the class with the highest probability.
Label values of vector $Y$ were redefined according to the classification decision.
The whole process was repeated until no significant change of the total energies $SE^{C}$ occurred.

2.3. Intensity extraction and normalization

Regarding simulated images, for each segmented spot, the mean intensity value of the foreground ($I_{FG}$) and the background ($I_{BG}$) was calculated. The spot intensity value $I$ was calculated by subtracting $I_{BG}$ from $I_{FG}$ (background correction) [17]. Regarding the actual microarray images, background correction was applied to both Red and Green channels. In the latter case, in order to adjust individual intensities among the five replicates, normalization was an essential step. As discussed in the literature [1], the normalization process eliminates the biases within each microarray involved in an experiment, so that the microarray data can be used for further analysis.

In the present study, the local weighted linear regression (lowess) [17] method was adopted. According to this method, I–R plots (I = log10(RG) and R = log2(R/G)) were formed, where R and G inside the logarithms are the red and green channel mean spot intensity values. Afterwards, normalization was accomplished with the use of a best-fit weighted function on the I–R plot. More detail about the lowess normalization method can be found in [17].

2.4. Evaluation

In order to make a direct assessment of the segmentation process, the following metrics were calculated from the simulated images: the segmentation matching factor (SMF), the coefficient of determination (r²), and the concordance correlation (p_c).

For the coefficient of determination, $I_{segmented}$ and $I_{actual}$ are the mean intensity values of the calculated and simulated spots, respectively, $i$ refers to individual cell images ($i = 1 \ldots 1600$), and $\bar{I}_{actual}$ is the overall mean of the spot intensity values of the simulated image. The algorithm that scores the r² value closest to unity has the best performance.

The concordance correlation p_c [13,20] measures the agreement between simulated and calculated data and is used to evaluate the reproducibility of the proposed segmentation algorithms.

$$
p_{c}(A, B) = \frac{2\,S_{A}\,S_{B}\,r}{S_{A}^{2} + S_{B}^{2} + (\bar{A} - \bar{B})^{2}}
\tag{11}
$$

where $A$ and $B$ are two samples, $\bar{A}$ and $\bar{B}$ are the mean values, and $S_{A}$ and $S_{B}$ are the standard deviations of the samples. The higher the p_c value, the better the performance of the algorithm.

Indirect segmentation performance estimation was made with the use of the real microarray images, where the reproducibility of the segmentation techniques was quantified by means of the Mean Absolute Error (MAE) and the Coefficient of Variation (CV).
Table 1 – Comparative results for 6 different cells obtained from the G channel of the simulated microarray. The 1st column shows the simulated spot with the surrounding area, the 2nd column shows the actual boundaries of the spot, and the 3rd, 4th, 5th, and 6th columns present the segmentation results of the GMM, FGMM, MRF, and WMRF algorithms as well as the corresponding matching factors.
Segmentation results for 6 different cells. Columns: Original Spots | Actual Boundaries | GMM Result | FGMM Result | MRF Result | WMRF Result
Table 2 Comparative SMF results for the 5 simulated images with different SNR levels. The 2nd, 3rd, 4th, 5th, and 6th
columns indicate the SMF by using FCM, GMM, FGMM, MRF and WMRF segmentation techniques, respectively.
SMF results on simulated microarray images
Table 3 – Comparative r² results for the 5 simulated images with different SNR levels. The 2nd, 3rd, and 4th columns indicate r² by using FCM, MRF and the proposed WMRF segmentation techniques, respectively.
r² results on simulated microarray images
Table 4 – Comparative p_c results for the 5 simulated images with different SNR levels. The 2nd, 3rd, 4th, 5th, and 6th columns indicate p_c by using FCM, GMM, FGMM, MRF and WMRF segmentation techniques, respectively.
p_c results on simulated microarray images
Fig. 2 SMF calculated by using FCM, GMM, FGMM, MRF, and WMRF algorithms in respect to additive white Gaussian noise
with different SNR.
Fig. 3 r2 calculated by using FCM, GMM, FGMM, MRF, and WMRF algorithms in respect to additive white Gaussian noise
with different SNR.
Fig. 4 pc calculated by using FCM, GMM, FGMM, MRF, and WMRF algorithms in respect to additive white Gaussian noise
with different SNR.
Table 5 Results for the four segmentation techniques by means of MAE and CV, applied on ve real microarray images.
MAE and CV results on real microarray images
Fig. 5 Normalized box-plots that illustrate the MAE using GMM, FGMM, MRF, WMRF, SPOT and SCANALYZE applied on ve
real microarray images.
segmentation scheme was contrasted with the conventional MRF, as well as two publicly available packages, SPOT and SCANALYZE. The main reason why the proposed method gave better segmentation results was the combination of image texture with information given by the image's wavelet domain.

Acknowledgement

We would like to thank the Greek State Scholarships Foundation (IKY) for funding the above work.

References

[1] Y.H. Yang, M.J. Buckley, S. Dudoit, T.P. Speed, Comparison of methods for image analysis on cDNA microarray data, Journal of Computational and Graphical Statistics 11 (2002) 108–136.
[2] M. Schena, D. Shalon, R.W. Davis, P.O. Brown, Quantitative monitoring of gene expression patterns with a complementary DNA microarray, Science 270 (1995) 467–470.
[3] M.B. Eisen, ScanAlyze, 1999. Available at: http://rana.lbl.gov/EisenSoftware.htm.
[4] Axon Instruments, Inc., GenePix 4000A User's Guide, 1999.
[5] GeneSifter Data Center. Available at: http://www.genesifter.net/web/dataCenter.html.
[6] M.J. Buckley, The Spot User's Guide, CSIRO Mathematical and Information Science, 2000. Available at: http://www.cmis.csiro.au/IAP/Spot/spotmanual.htm.
[7] ImaGene, ImaGene 6.1 User Manual. http://www.biodiscovery.com/index/papps-webfiles-action.
[8] R. Adams, L. Bischof, Seeded region growing, IEEE Transactions on Pattern Analysis and Machine Intelligence 16 (1994) 641–647.
[9] K. Blekas, N.P. Galatsanos, I. Georgiou, An unsupervised artifact correction approach for the analysis of DNA microarray images, in: Proc. IEEE International Conf. on Image Processing (ICIP), vol. 2, 2003, pp. 165–168.
[10] E. Athanasiadis, D. Cavouras, P. Spyridonos, I. Kalatzis, G. Nikiforidis, An automatic microarray image gridding technique based on continuous wavelet transform, Lecture Notes in Computer Science 4673 (2007) 854–870.
[11] J.E. Fowler, The redundant discrete wavelet transform and additive noise, IEEE Signal Processing Letters 12 (9) (2005).
[12] O. Demirkaya, M.H. Asyali, M.M. Shoukri, Segmentation of cDNA microarray spots using Markov random field modeling, Bioinformatics 21 (13) (2005) 2994–3000.
[13] A. Lehmussola, et al., Evaluating the performance of microarray segmentation algorithms, Bioinformatics 22 (2006) 2910–2917.
[14] S. Geman, D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (1984) 721–741.
[15] P. Sakellaropoulos, L. Costaridou, G. Panayiotakis, A wavelet-based spatially adaptive method for mammographic contrast enhancement, Physics in Medicine and Biology 48 (2003) 787–803.
[16] J.L. DeRisi, V.R. Iyer, P.O. Brown, Exploring the metabolic and genetic control of gene expression on a genomic scale, Science 278 (1997) 680.
[17] Y.H. Yang, S. Dudoit, P. Luu, D.M. Lin, V. Peng, J. Ngai, T.P. Speed, Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation, Nucleic Acids Research 30 (4) (2002) e15.
[18] C.S. Burrus, R.A. Gopinath, H. Guo, Introduction to Wavelets and Wavelet Transforms, Prentice Hall, Englewood Cliffs, NJ, 1998.
[19] E.I. Athanasiadis, D.A. Cavouras, D.Th. Glotsos, P.V. Georgiadis, I.K. Kalatzis, G.C. Nikiforidis, Segmentation of complementary DNA microarray images by wavelet-based Markov random field model, in: IEEE Transactions on Information Technology in Biomedicine, vol. 13, issue 6, November 2009.
[20] E.I. Athanasiadis, D.A. Cavouras, P.P. Spyridonos, D.Th. Glotsos, I.K. Kalatzis, G.C. Nikiforidis, Complementary DNA microarray image processing based on the Fuzzy Gaussian mixture model, in: IEEE Transactions on Information Technology in Biomedicine, vol. 13, issue 4, July 2009.