Biogrid 2003 Agents Image Processing MLP

See discussions, stats, and author profiles for this publication at: https://www.researchgate.
net/publication/377570683
IMAGE PROCESSING AND PATTERN RECOGNITION USING THE MULTI-LAYER

PERCEPTRON: A PROPOSAL FOR BIOMEDICAL IMAGE PROCESSING
Conference Paper · May 2003
CITATIONS READS
0 24
1 author:
Sofia Korsavva
National and Kapodistrian University of Athens
31 PUBLICATIONS 53 CITATIONS
SEE PROFILE
All content following this page was uploaded by Sofia Korsavva on 21 January 2024.
The user has requested enhancement of the downloaded file.

IMAGE PROCESSING AND PATTERN RECOGNITION USING THE MULTI-LAYER
PERCEPTRON: A PROPOSAL FOR BIOMEDICAL IMAGE PROCESSING
Sophia Corsava email: sophiac6@yahoo.com PhD student, University of Westminster, London UK, Harrow School of Computer
Science
(This work was carried out at the University of Edinburgh, UK)
ABSTRACT interconnections in a complex pattern. Of them, the

Bayesian classification technique is a powerful tool [3].
Traditional image processing and pattern recognition Problem Reduction Representations seem to work in
methods are memory consuming and need elaborate and certain cases only and genetic algorithms and neural nets
expensive hardware. There is however a lot of interest in have succeeded repeatedly in pattern recognition tasks [3].
these methods not only because humans need a definite Classifiers rely on distance metrics and probability
improvement of pictorial information for their theory. The distance metrics can be found by calculating
interpretation of image density data, but also because the Hamming distance, the Eucledian distance, the city-
medical, biomedical and robotic applications need to block distance or the square distance [12]. It is essential to
understand images reliably. In this paper we present our train classifiers. This can be done in two ways; either by
work on the Neural Net based facial feature detector, supervised learning or unsupervised. In the first case the
using the multi-layer perceptron. Our project successfully classifier is presented with the output it has to produce. A
identified the location of small complex features i.e. eyes, problem here may be the big demand for training patterns
mouths and noses and recognised them with 100% which grows exponentially with the number of
success. We believe that our approach, which relies dimensions in the pattern space. In the case of
heavily on careful image pre-processing, artificial neural unsupervised learning, different clustering algorithms are
nets and sound verification techniques can help in the area used that seek to find clusters and/or natural groupings in
of biomedical image processing. the data. An example is the hierarchical clustering method
which is complicated [3]. Work in the pattern recognition
1. INTRODUCTION and image processing areas has been done by Fukushima,
(Cognitron, NeoCognitron) [8], Kaya and Kalabashi [3],
Pattern recognition is closely linked with perception and Harmon [20] and Stonham (WISARD [18]).
cognition [13],[17]. Connectionist methods have been
applied in the past in the processing of complex 2-D 3. BUILDING METHODOLOGY
patterns [7],[9]. Their success comes with a major
disadvantage and that is that neural nets easily forget 3.1 Overview
already known data and take time to learn new ones [12].
This kind of problem is attributed to the usually complex In our work we used a combination of Unix shell scripts
structure of the neural net that is used for the recognition to automate pre-processing and post-processing, C and
task and its large number of input units [12]. In our work, C++ programs for pre/post processing, the ISIS [3] image
we attempted and succeeded to reduce to the minimum tool to capture images, MATLAB [14] for algorithms and
these problems by decreasing significantly the number of the “rbp” program that had the Multi-Layer perceptron
inputs to the net. This was achieved by pre-processing the encoded [21]. Our work was divided into 2 phases.
image to be used as input to the net. Pre-processing, made During Phase I images are processed by being normalised
some distinct areas (i.e. small complex features in our and sub-images containing individual features are
case) “visible” to the net and decreased the extracted. Data files are prepared according to 3 pre-
dimensionality of data. processing techniques:1) Eigenvectors and alpha values,
In this paper we shall briefly discuss neural net based 2) Fourrier Transform Analysis with phase angles in
pattern recognition combined with 3 image pre-processing radians of the resulting complex matrix, with angles
techniques. Our main goal was to combine connectionist between π and - π or else ft-I and 3) Fourrier Analysis
methods with computer vision and digital image with reduced frequencies or else ft-II.
processing approaches. This paper is organized as When the three data files are prepared the net training
follows. In section 2 we discuss related work. In section 3 takes place. Various tests are conducted so as to find the
we present our building and processing methodology net configuration that will learn the training patterns fast
while section 4 presents some of the results of our project. and with a small error per unit. The outcome of Phase I is
three trained neural nets.
2. RELATED WORK During Phase II, we do post-processing and verification.
We select two faces unknown to the net. These faces are
There are statistical pattern recognition techniques that scanned and sub-images are extracted from the first pixel
cannot handle very well the structural information about until the last one. The scanning program extracts sub-
1
images in windows 1) 12x30 or 16x32 window starting would be later used for the entire extraction procedure.
pixel number one. The window sub-image dimensions We also examined the gray level histograms of the
depend on the pre-processing technique, and 2) Extract a produced images in order to make sure they were correct.
12x30 or a 16x32 window starting from pixel number two A crucial point was to decide where the centre of each
and so on until the last pixel is reached. When the image individual feature would be. For the left and the right eyes
scanning is completed, extracted widows are processed the centre was in the pupil, in order to facilitate the net
according to each of the three pre-processing methods. training. The window containing an eye does not include
The outcome of each of these methods is a data file any extra pixels from around the eye. The centre for the
containing input patterns to be fed to the corresponding nose was decided to be the left nostril, first because it was
training classifier. The outcome of Phase II is three files more prominent in most of the images and secondly
containing the probabilities assigned by each of the nets because the size of the nose did not allow for a window to
and the projection of these on the actual image. So for contain it. The centre for the mouth, was the geometrical
each image there are three sets of results that show centre of a 12x30 rectangle and was located in between
graphically the probabilities assigned by the nets for each the lips. The file containing grouped facial features
pixel. consisted of 223 samples. Each group has a threefold set
of each facial feature, i.e. the left eye in three orientations,
3.2 Pattern Classification upright, tilted to the left and tilted to the right. These
different orientations were considered critical to the
A way to deal with classification problems is to use an success of this project since the neural net had to be able
automatic scene analysis technique such as the gray level to cope with the various eye orientations and sizes. The
representation [1],[2],[3],[11]. A black and white image same applies for the rest of the features as well. A
can be represented by a real-valued function of two program that reads in the face image produces an output
variables i.e. f(x,y). The image intensity is the gray level file with the gray level value of each pixel in numerical
or brightness. An image function must be digitised both format. The extraction program given the coordinates of
spatially and in amplitude so it can be suitable for image the centre of a 12x30 window or a 16x32 (only in the case
processing. By digitising the spatial co-ordinates (x,y) we of the fourier transform filtered data), extracts the facial
can obtain samples of the digitised image. The amplitude features from the normalised HIPS image and gives the
digitisation is the gray-level quantisation [3],[15]. The centred sub-images which are used during the pre-
image is represented by an array of integers where each processing. The extracted features before they are
member of the array or pixel specifies its approximate compiled together to a single file, are kept in individual
gray level for a corresponding cell. In order to obtain files, that are named according to a 'sorting code'. The left
better results with this method we normalised the image eye files begin with 'le.', then follows the number of the
first. We did this, because there might be too many dark picture they were extracted from and in the end, is the
or very bright regions and subsequently the partitioning of 'norm' extension, which indicates their status (whether
the image into sets of distinct gray levels might lead the normalised or not). A similar method had been followed
net to wrong conclusions. The key decision with gray for the rest of the features, only that the first part of the
level representation is to select which gray level name varies according to the feature the file holds. This
transitions are significant. We decided that the gray levels classification code was proved very efficient and effective
for this project should 256 because we tried to maintain an since it saved a lot of time, throughout the project.
acceptable quality in terms of resolution. Automation of all tasks via Unix scripts was essential to
the success of the project. In the end the only time-
3.3 Input Data consuming operation was the training of the neural net
which was took 12 to 20 hours, depending on the net
We took pictures of faces under the same light conditions configuration.
(fluorescent tubes mounted on either side of the camera).
The subject was sitting on a stool placed at one metre
distance from the camera. For this purpose a CCTV
camera with a digitiser and SUN 3 workstation were used. Figure 1. High level view of the neural net function. Input data are
The background lighting was minimised. [3]. Each picture processed according to 3 methods, ft-I, ft-II and eigenvector alpha
values. They are fed to three neural nets, they train them and the
was initially 512x512 pixels, each pixel 8 bits, and gray output is the learnt patterns and error probabilities.
levels totalled 256. Later on for practical reasons and to
save memory, image resolution was reduced in half,
which means that in the end the entire experimentation 3.4 Image pre-processing
and the net result verification was done on 256x256
images. We used the ISIS processing tool and Unix The pre-processing stage was as follows:
scripts to generate and store captured images. A 1) Normalise the 256x256 facial images 2) Extract
combination of Unix shell scripts, C and C++ programs sequential subimages/windows starting from the first pixel
were used to find the centre of each image by recording until the last one. The coordinates of this pixel denoted
the x,y coordinates, that were kept in a text file which the top left coordinates of the extracted window whose
2
size was either 12x30 for the alpha values and the phase The matrix containing the phase angles in radians was
angles techniques or 16x32 for the ft reduced frequencies. then prepared for the net following the pre-processing
Each extracted window was represented as a vector either standards in terms of naming conventions and numbers.
360 long for the first two cases or 512 for the last. 3) In This had to be done this way because the multilayer
the alpha values case, find the eigenvectors of the 12x30 perceptron does not accept as input complex numbers;
extracted windows, normalise them, apply a modified therefore a meaningful representation had to be found for
version of principal components analysis, select the ones the net [3],[21]. The ft method was used again as a third
with the bigger eigenvalues and then prepare the file for pre-processing technique (ft-II). This time a reduction in
the net training. the data dimensionality was attempted through the
utilisation of a smaller number of frequencies for each
3.4.1 Image normalization facial feature. While the maximum number of frequencies
was 512, in this case only 57 of them were used. The
The normalisation algorithm (which was coded in a C algorithm was filtering the number of frequencies above a
program) takes as input the HIPS image and treats it as a certain limit in an effort to determine the optimum
normally distributed matrix of integers, with variance σ 2 number to be used for representing eventually the image
and mean M. In order to produce the normalised image, it to the neural net. After a lot of experimentation it was
subtracts from each pixel the mean, then it divides this concluded that 57 was the optimum number of
result by σ .This normalises the image. However, in frequencies to be used in order for the net to receive a
order to resolve the 0-255 distribution it adds 3 and finally meaningful representation. To conclude to this number of
it multiplies the entire product with 42.6. This had to done frequencies we used the following algorithm: 1) Read in
as such because it produced a better normalised image. So the pattern 2) Find the inverse fourier transform for it 3)
for an image of N pixels Pi, i =1...N the mean M is given Threshold the frequencies that are bigger than 500 4)
by: M= 1 N (3.1) Project the outcome.
∑ N i=1
Pi
As soon as the number of frequencies was decided, the
The variance σ 2
by: resulting 'mask', was kept for further use. That mask was a
σ 2
= 1 N (3.2) 360 vector long that had zeroes and ones. At the positions
N − 1
∑ i=1
( Pi − M ) 2
where the value was 0, the frequency of the input pattern
the normalised intensity is given by: was determined as being below the value used for
P i= P i − M (3.3) filtering, and therefore, was rejected. In the cases where
σ there was 1, the frequencies were accepted as valid and
And the Final Image is: Fi 42.6(Pi + 3) [3]. they were kept in to a special file that would eventually be
fed to the net. The resulting ft matrix was a complex one;
3.4.2 Fourrier Transform ft-I & ft-II as it was mentioned above the multilayer perceptron does
not accept complex values as inputs, this is why that
The two dimensional Fourier Transformation (ft) is a 2D matrix had to be presented in another way to the net. The
transformation that analyses spatial frequencies in an real part was separated from the imaginary part and each
image. A Fourier transform is defined as: one was kept in separate matrices. This, however meant
that the number of inputs to the net had to double, since
there were now two matrices that had 57x223 dimensions,
(where 223 the number of input patterns, and 57 the
number of ft output complex matrix). So in this case the
net had 114 inputs i.e. 57x2. Again, in this case
reconstruction of the initial images was attempted so as to
ensure the success of this last pre-processing technique.
By projecting the inverse ft of the filtered data, we made
sure that this technique had indeed been successful. The
thresholding value for the mask creation was 500 as
When the 2-D fourier transform was done, the data went
frequencies smaller than that were not important
through the final processing; the calculation of the phase
(determined experimentally). This value was carefully
angles in radians. This was accomplished using the ϕ(u) selected in order to convey all the meaningful information
the phase angles. The frequency factor is represented by for the net without distorting the facial features beyond
u. The 2-D transform was done on the input file. It was recognition.. Please note that due to space restrictions we
done individually on each 360 long pattern and the cannot fully explain the pre-processing methods, neither
resulting complex matrix was kept for further processing. can we include all graphs and plots. If additional
Verifying that the ft results were the correct ones, was information is required please contact the authors directly.
done by projecting the inverse ft of the image and
comparing the results with the actual input patterns. The
inverse ft is defined as: 3.4.3 Eigenvectors and alpha values
3
The eigenvector analysis with reduced data Artificial neural nets learn from example and do not
dimensionality was proven to be the second best pre- require vast amounts of memory. The innovation in our
processing method. This method actually reduces data approach is that the net does not have anywhere an
dimensionality so that a suitable linear transformation of explicit model of the feature we want it to recognise. It
the co-ordinate system is found, such that the data retains instead the memory of its reaction to the feature
variance is small and subsequently these dimensions can recognition [3],[6] and we reinforce the good behaviour
be ignored. The dimensional reduction is achieved by (the behaviour we want) while the undesirable one is
ignoring the eigenvectors that correspond to small reprimanded.
eigenvalues[20]. An important issue in this method was The MLP was chosen because it learns fast, does not
the speed of the entire procedure, not only throughout the forget easily, recognizes noisy patterns, can cope very
pre-processing stage, where the net had to be trained, but well in linearly inseparable problems, generalises well
also afterwards when the net had to classify the presented what it has learnt and as all neural nets, its distributed
patterns. The eigenvector analysis was accomplished nature makes it fault-tolerant. The MLP has three layers,
much faster using the Turk and Pentland [20] approach, in an input layer, a second layer called hidden and a third
comparison to the traditional principal components one called output layer [21].The units in the second and
analysis [3]. According to them, the number of best the third layer are perceptron units, crude models of
eigenvectors λ used (λ represents the eigenvalues whose neurons that function as thresholding units with 1 to n
number subsequently decides how many eigenvectors will inputs each either 1 or 0 denoted by x1….xn. Each output
be used) is determined by: is modified by the weight factor wi so as to produce the
desired output. When the actual output of the net is
similar or the same as the desired output then the learning
is complete. MLPs are used today in a number of
commercial applications such as NetTalk.
Figure 2. Example structure of the Multilayer Perceptron
Where Ψ is the matrix containing average features. The learning rule for the MLP we used, was the
The first 60 eigenvectors were considered the optimum generalised delta rule or else back propagation [3],[10]
number for use, since they were enough to specify a facial [16],[19]. According to this rule, the weights of the net are
feature. When the eigenvector matrix is ready, the adjusted until the net reaches a satisfactory level of
algorithm continues by creating another matrix, which is learning the input pattern. The algorithm continuously
actually the inner product of the eigenmatrix and the compares the actual output with the desired one and
matrix that has the result from the subtracted average adjusts the weights accordingly. This is done with the
features. The outcome of this operation is a set of scalar help of an error function (please refer to [3],[19] for the
values, which are used as input to the net. The inner error function). The calculated error is back propagated
product was used as such because it conveys unique from one layer to the previous one. It is through this error
information about each vector and subsequently, reduces propagation that the net finds out about the erroneously
the probability that the net will misclassify an input learned pattern and thus modifies the weights until
pattern that is generated with this technique. In order to learning is achieved. If the error is big, the weight has to
verify that the number of selected eigenvectors was the be altered significantly, while if the error is small, the
optimum a reconstruction of the processed images is weight adjustment has to be small as well. This method of
necessary. So by finding the inner products of the learning is called supervised learning, because the net has
eigenmatrix and the matrix that contains the scalar values, been instructed in advance to associate a given input to a
which we shall call alpha, and adding each time the given output [3],[16].
corresponding entry in the matrix with the average feature The structure of the three nets we used were as follows:
calculations, the resulting reconstruction when projected 1) For the ft-1 pre-processing method, the input consisted
shows that indeed the number of eigenvectors selected as for 360 long vector. All the frequencies were used (all
well as the rest of the processing was correct. 360 phase angles in radians). So the net had 360 inputs at
the first layer, 20 at the middle and 4 at the output. If
3.5 Multi-layer Perceptron (MLP) and net classification errors were smaller than 0.050, the net was
structure. commenting the learnt pattern as ok. This was the
optimum configuration as it assigned the biggest
probabilities overall and made the fewest errors in the
4
classification of unlearned patterns. This is attributed to were misclassifying left for right eyes and vice versa.
the fact that all frequencies were used as input to the net. These classification errors were not too many. This was
2) The second best configuration was the alpha net. It also attributed to the careful estimation of the components of
gave very good results and clusters with big probabilities. the net configuration, the pre-processing techniques as
In this case, erroneous classification was bigger than in well as the careful image capturing. These procedures
case (1), and the net wrongly classified left eyes for right were all automated using Unix shell scripts, in order to
eyes and vice versa. The input to this net was consisted of exclude the possibility of human errors in the training data
60 long vectors formed by sixty alpha values that and the verification procedure. All three net
corresponded to the 60 best eigenvectors. When less that configurations learnt input patters almost 100% correctly,
60 were used, the error between the reconstructed image with very small errors per unit. To end up with the best
and the actual image increased dramatically. The structure net parameters we experimented with various net
of this net was 60 inputs at the first layer, 15 at the middle configurations. All changes were done methodically, i.e.
layer and 4 at the output layer. one at a time and combinations of these values were
3) The third net configuration for ft-II, gave the worst attempted based on the net training results. We tested
results. It had as input 114 long vectors, produced by the many net configurations throughout the net training
combination of the threshold real and imaginary parts of repeatedly, with different combinations of seed,
the fast fourrier transform matrix of the actual normalised sharpness, number of units in the second level, fully or
input pattern. Although the error per unit was not much partly connected, along with different learning rules
bigger compared to the others, the net assigned lower (quick back propagation and the delta-bar-delta). In some
probabilities overall that the previously described cases where the net was showing steady undesirable
configurations. It also took 4 times longer to train than the behaviour, random changes were attempted. In some
other two. The structure of this net was 114 inputs at the cases this bore success, because of the decreased training
first layer, 87 at the middle and 4 at the output layer. time and patterns learning. For the net training we used
the “rbp” program.
4. RESULTS When the net was fully trained and the weights file
complete, a verification procedure took place during
Some of the problems we had during this process was that which, a MATLAB program scanned a face the net had
servers and workstations would crash mid-way and we not been trained with and extracted windows. That
would have to start again. Looking for spare machines to program saved all the extracted windows with dimensions
use would be a time-consuming and frustrating process, as 12x30 or 16x32 accordingly in a big file, as well as the
we had to do it manually. If we had a grid installed at the coordinates of the top left corner of each window in a
time we would be able to find resources automatically and separate file. That second file was later used for drawing
allocate machines for our processing work. In some of our up the best windows and projecting them on the
subsequent work in the image processing and pattern normalised facial picture. The file that was holding the
recognition areas, we developed our own hardware and extracted windows was then processed by each of the
software infrastructure that provided us with 99.9999% three previously mentioned The entire identification
availability. Our approach included a combination of procedure for, each face was taking a little less than 5
high-availability clusters, grids and intelligent agents. For days to complete. The trained net was then fed with these
more information refer to [4],[5] or contact us directly. patterns and it was expected to identify each pattern as
Experimentally, we discovered that what affected either left eye, right eye, nose, mouth, or nothing. The
considerably the speed of the training was the number of result of this identification procedure was held in a file as
units in the second level and the tolerance. If the number well, since it was necessary for feedback. The coordinates
of units in the second layer was too small or too big, the file that was created during the scanning of the normalised
net had big difficulties learning even 40 from the 223 image along with the probabilities given by the net, was
training samples. When the tolerance was increased the giving eventually information about each window and the
speed of learning was also increased, but the error per unit probabilities assigned by the net. In the case somebody
was quite big for some tolerance values. For this reason, wanted to identify what probabilities had been assigned
we decided to maintain the tolerance at 0.1 level for all by the net to each feature, s/he just had to look at this file.
tests. The purpose of our nets was not only to recognise a From the coordinates file and the above mentioned file,
facial feature, but also to distinguish it amongst other another MATLAB program derived the top left corner
facial features with the smallest error rate. This tactic was window coordinates along with the probabilities assigned
rewarded after a lot of experimentations and considerable for each window. With this information, the program was
training all nets gave very big probabilities overall and projecting the exact position of the identified window on
made very few mistakes in the classification. As it was the actual image. The probabilities assigned by the second
estimated only 2-4% of the classifications of patterns the net configuration, the one corresponding to the
net had not been trained on produced erroneous results. eigenvectors' technique, were quite big. For this reason a
This estimation was based on the threshold best windows filtering was done on the initially produced probabilities;
where the classification errors occurred for left and right probabilities over 0.9, were selected and were kept along
eyes. This was observed for all three nets, whereby they with the coordinates of the windows they corresponded to
5
in the file called best window. It was this file that was allow for a constructive comparison of the three net
eventually projected on the actual image, in order to show results. From these results, it is quite clear that the nets
the successes and/or failures of the net. For practical had a very big success percentage, and that the applied
reasons, instead of the top left corner, the centre of the pre-processing techniques were well selected. The ft-I
chosen window was projected; in this way the outside method, produced a net that learnt 100% all 223 patterns
viewer can have a better understanding about the success after 1,783 training iterations with probability error per
or failure of the net. When the best windows are unit at 0.00126. The eigenvalues alpha net method
identified, another MATLAB program derived the cluster produced a net that learnt 100% all patterns after 1900
of the best of them and designed a 10X10 cross training iterations with probability error per unit at
highlighting thus the best of the best and proving that 0.00121 while the ft-II method net learnt all 100%
although the net had misclassified left eyes for right eyes patterns after 14,111 training iterations with 0.00117
and visa versa, it can definitely identify the clustered probability error per unit.
features with a 95-99 % success percentage. Note that the
entire verification procedure was done on faces that the
net had not encountered before. A way to have a more
general idea about how the net behaved throughout the
whole procedure was to project all the probabilities on the
actual image. In the regions where the net had identified
as nothing the probability distribution was quite smooth.
Very small probabilities and zeroes were assigned to those
windows that had been identified as nothing, while the
windows situated around and on the feature had been
assigned quite big probabilities. In the case of the Figure 3. Best Probabilities, Best of the Best (with plotted crosses),
eigenvectors, the net seemed to be absolutely certain of its Raw Probabilities and Normalised Image by the ft-I method (top left
decisions since the majority of the assigned probabilities to right, clockwise).
was 0.9 and above. In the case of the phase angle data,
although the net had a smaller percentage of erroneous
classification, it seemed to be less sure. This means that
the probabilities it assigned were not as absolute as in the
eigenvectors' case. While in the previously mentioned
case the majority of the probabilities was over 0.9, here it
was ranging from 0.7 to 0.8. In the last case, that of the ft
reduced frequencies, the net still had successes, but the
erroneous classification rate was the biggest. The
probabilities in this case were ranging from 0.6 to 1.000.
When the best windows for each picture in each case were
Figure 4. Best Probabilities, Best of the Best, Raw Probabilities and
identified, another MATLAB program selected the best of Normalised Image by the ft-II method.
these clusters. When they were identified a 10x10 cross
was plotted at the centre of the elite window these best
clusters predefined. In this way, it is proven that although
the net has misclassified left for right eyes and visa versa,
it is nevertheless, capable of correctly identifying the
majority of the pre-processed input patterns More
specifically the MATLAB program that was plotting the
cross was doing the following : 1) Read in the normalised
image 2) Load the file containing the probabilities and the
top left corner coordinates of the best windows 3) Find the
minimum and the maximum value coordinate for x and y
4) For each facial feature find the biggest probabilities
Figure 5. Best Probabilities, Best of the Best, Raw Probabilities and
and locate the best cluster 5) Locate the centre of this Normalised Image by the eigenvalues alpha method (top left to
cluster and plot a 10x10 cross centred there and 6) Show right).
the face with the plotted crosses. For each case there are In the above figures 3,4 and 5, we can see two
four image sets. The first set shows the best of the best (previously unknown to the nets) samples, a man and a
probabilities with the plotted cross upon the normalised woman, processed and recognised with each of the three
face, the second set shows the normalised face with all the methods/nets. We can clearly see that all nets recognised
crosses plotted combined, the second the thresholded and identified successfully all small complex features.
probabilities, the third the raw probabilities un- Please contact the author directly if you need additional
thresholded, and finally the forth shows the actual results and clarifications. Due to space restriction in this
normalised face. This is shown for all three cases, so as to paper we cannot include more results.
6
5. CONCLUSIONS AND FUTHER WORK
REFERENCES
All three net configurations were successful in
recognising the features and identifying their relevant 1. Bennett and I. Craw, “Finding Image Features Using
positions on the facial image. Erroneous classifications Deformable Templates and Detailed Prior Statistical
Knowledge”; Proc. British Machine Vision Conference,
occurred only in the case of the left and the right eyes; all Glasgow, 1991.
nets were getting slightly confused as to which eye was 2. Craw I and Cameron P, “Face Recognition by Computer”.
which, a fact that was expected, since left and right eyes Proc. British Ma- chine Vision Conference, 1992.
are quite similar. In the nose and the mouth cases nets did 3. Corsava Sophia, “A Neural Net Based facial feature
detector”, MSc Thesis University of Edinburgh, 1993.
not misclassify anything at all. This can be explained 4. Corsava Sophia, Getov Vladimir, “Self-Healing Intelligent
since these two features are very different from each other Infrastructure for Computational Clusters”, ACM
and the chosen centre of extracted windows enhanced this Proceedings, SHAMAN Workshop, New York, June 2002.
difference. The pre-processing and post-processing 5. Corsava Sophia, Getov Vladimir, “Intelligent Fault-Tolerant
architecture for cluster computing”, to appear at IASTED,
techniques were essential to this success since, there was PDCN03, Innsbruck, Austria, Feb 2003.
an effort to maintain all the meaningful information to the 6. Duda and Hart, “Pattern Classification and Scene Analysis”
net, while keeping the number of the input units as small New York Chichester Wiley, 1973
as possible. This was attempted because a relatively 7. Fallside F., and Chan LW., “Connectionist Models and
Geometric Reasonin”, in Woodwark (ed), Geometric
simple net configuration can be handled better than a Reasoning”, Clarendon Press, Oxford, 1989, 65- 79.
complex one. The results are more straightforward, the net 8. Fukushima K.,, “NeoCognitron: A Self-Organizing Neural
configuration can be easily modified without disturbing Network Model for a Mechanism of Pattern Recognition
the entire recognition mechanism, and the net training Unaffected by a Shift in Position”, Biological Cybernetics 36,
193-202, 1980.
although still time consuming has a better success 9. Gluck Mark A. G., and. Rummelhurt D.E, “Neuroscience and
percentage. An important thing was that our pre- Connectionist Theory” Erlbaum, Hillsdale, NJ, 1989.
processing eliminated irrelevancies, like intensity 10. Govindaraju V., Srihari S., and Sher D.,, “A Computational
gradients and shadows. All three pre-processing Model For Face Location” Proc. 3rd International
Conference on Computer Vision, Japan, 1990
techniques had something in common; they all tried to 11. Harmondsworth, D.W.J. Corconan, “Pattern Recoginition”
maintain identical important information for each feature, Penguin, 1971
i.e. shape, greyscale, size. These important components 12. Hilger Adam, "Neural Computing, an introduction”, R.
were not isolated, because a detailed representation has Beale and T. Jackson (Eds), 1990.
13. Kelso J.A. Scott, “Concepts and Issues in Human Motor
usually more chances to succeed in the long run. Given Behaviour : Coming to Grips with the Jargon” eds Haskin
more time, further testing should be done so we could Laboratories and the University of Connecticat,, 1987
obtain more information about the net behaviour. It would 14. “MATLAB” eds The MathWorks, Inc,Matlab Manual 1992
also be interesting to find out more efficient search 15. Norwood, N.J “Image Understanding” Ablex Publishing
Corporation Whitman Richards 1989.
strategies that could reduce the post-processing time. 16. Ramsay, C, Sutherland K., Renshaw., and Denyer P., “A
Pattern classification could speed up by parallelism, Comparison of Vector Quantization Codebook Generation
concentrating on likely positions. Although post- Algorithms Applied to Automatic Face Recognition” Proc.
processing was slow the net learning speed was not a British Machine Vision Conference, 1992.
17. Rolls E., “The Processing of Face Information in the Primate
problem. Ideally this project could continue by Temporal Lobe”, in V. Bruce and M. Burton (eds),
recognising individual faces, not as features, but Processing Images of Faces, Ablex Publishing. 1992, pp41-
classifying the scanned face as 'known' i.e. as a person 68.
with a name the net can distinguish among other faces. 18. Stonham J., “Practical face recognition and verification in
WISARD”, in Ellis, Jeeves, Sumby and Young (Eds) Aspects
Artificial Neural Nets have been proven quite successful of face processing:, Dordrect: Martinus Nijhoff, 1986.
where traditional vision techniques have been definitely 19. Rumelhart J D. E., Hinton G. E. and Williams R. J.,
unsuccessful. They have too many advantages to be “Learning Representations by Back-Propagating Errors”
ignored even by those still dedicated to old ways. As a Nature 323, pp533-536, 1986.
20. Turk M. and Pentland A., “Eigenfaces for Recognition”.
concept, they are ideal, since what better than to imitate a Journal of Cognitive Neuroscience, Volume 3, Number 1,
tested way to recognise and identify objects; that of the 1991
human being. Our memory capacity seems unlimited, we 21. Vincent J, Myers D. and Hutchinson R., “Image Feature
are able to accomplish things under the most difficult Location in Multi-resolution Images Using a Hierarchy of
Multilayer Perceptron” in R. Linggard, D. Myers and
conditions and that is all due to the human nervous C.Nightingale (Eds) , Neural Networks for Vision, Speech
system. If a more accurate representation is achieved, and Natural Language, Chapman and Hall pp13-29, 1992.
neural nets will be an immense success. They represent an
intelligent way to discover things as opposed to other
methods. In view of this success, we propose that our
approach or parts of it, are used for biomedical image
processing. The net outputs will have be different
depending on each case, while the net inputs can be of a
similar size.
View publication stats

Biogrid 2003 Agents Image Processing MLP

Uploaded by

Copyright:

Available Formats

Biogrid 2003 Agents Image Processing MLP

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Biogrid 2003 Agents Image Processing MLP

Uploaded by

Copyright:

Available Formats

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

IMAGE PROCESSING AND PATTERN RECOGNITION USING THE MULTI-LAYER

Conference Paper · May 2003

The user has requested enhancement of the downloaded file.

ABSTRACT interconnections in a complex pattern. Of them, the

Figure 2. Example structure of the Multilayer Perceptron

View publication stats

You might also like

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.