Icicct 2018 8473291
Abstract—In this paper we present a method for offline handwritten character recognition using deep neural networks. It has become easier to train deep neural networks today because of the availability of huge amounts of data and the various algorithmic innovations taking place. The computational power needed to train a neural network has also become more accessible through GPUs and cloud-based services such as Google Cloud Platform and Amazon Web Services, which provide resources to train a neural network on the cloud. We have designed an image-segmentation-based handwritten character recognition system. In our system we have made use of OpenCV for performing image processing and of Tensorflow for training the neural network. We have developed this system using the Python programming language.

Keywords—Neural Networks, Deep Learning, Tensorflow, Python, OpenCV, Android, JAVA.

I. INTRODUCTION

In today's world, AI (Artificial Intelligence) is the new electricity. Advancements are taking place in the fields of artificial intelligence and deep learning every day, and deep learning is being used in many fields. Handwriting recognition is one of the active areas of research where deep neural networks are being utilized. Recognizing handwriting is an easy task for humans but a daunting task for computers. Handwriting recognition systems are of two types: online and offline. In an online handwriting recognition system the handwriting of the user is recognized as the user is writing, and information such as the order in which the user made the strokes is also available. In an offline handwriting recognition system, the handwriting of the user is available only as an image. Handwriting recognition is challenging for several reasons. The primary reason is that different people have different styles of writing. The secondary reason is that there are many kinds of characters: capital letters, small letters, digits and special symbols. Thus a large dataset is required to train a near-accurate neural network model. To develop a good system an accuracy of at least 98% is required; however, even the most modern and commercially available systems have not been able to achieve such a high accuracy.

Our system comprises two parts:

1) An Android application: This is the frontend of our system. The Android application helps the user to click a picture of the text to be recognized using their smartphone camera. This picture is passed on to a Python script running on a server, which further processes the image to extract the relevant information.

2) A server: This is the backend of our system. The server is a computer capable of executing a Python script. It is needed because an Android smartphone does not have the computational power required for running neural networks and performing image processing operations. Offloading the computationally intensive tasks to a server also enables users of older smartphones to make use of our system.

We used a Convolutional Neural Network (CNN) model in our system, trained on the publicly available NIST dataset, which contains samples of handwritten characters from thousands of writers. CNNs are state-of-the-art neural networks with wide application in the field of computer vision. The model was trained using Tensorflow, an open-source library for machine learning applications. OpenCV, an open-source image processing library, was used to perform various image processing operations such as segmentation, thresholding and morphological operations.

II. RELATED WORKS

Immense research is going on in the field of handwritten character recognition, and many systems have been developed for it. We have studied some of these systems:

A character recognition system has been designed using fuzzy logic[1]. The system can be realized on a VLSI structure. Their character recognition system is immune to distortion and variations in shift. They have made use of a Hamming neural network in their system.

An innovative method for recognition of handwritten Tamil characters using neural networks has been developed[2]. They have made use of the Kohonen Self-Organizing Map (SOM), which is an unsupervised neural network. The system they developed can be used for recognition of Tamil characters as well as other Indic languages.
Their system produces near-accurate results but sometimes produces errors if the handwritten characters are not properly segmented.

One of the authors has presented a unique method for authenticating a person based on their handwriting[3]. The author used a multi-layer feed-forward neural network and proposed that the height and width of a handwritten alphabet are unique for each and every person, presenting a method for recognition and identification of a person from their handwriting.

A novel method for handwritten character recognition has been designed which does not use feature extraction[4]. They implemented their system in Matlab. Their system uses a feed-forward neural network with backpropagation.

One of the authors has proposed a unique method for handwriting recognition[5]. Their system uses a Self-Organizing Map[6] for feature extraction and a recurrent neural network[7] for learning. They conducted their experiment on the recognition of Japanese characters.
III. IMPLEMENTATION

In this section we discuss how our system has been implemented. Let us first discuss the Android application that we developed. Using this application the user clicks a photo of the handwritten document to be digitalized using the camera of the Android phone. Screenshots of the developed application are shown in Fig. 1.

The Android application lets the user click a photo of handwritten text using their camera. The clicked picture is sent to our server for processing at the backend. A neural network runs on this server which can recognize the handwritten text in the image. The recognized text is sent back as a response to the Android application and is displayed to the user, as shown in Fig. 1(b).

Now let us discuss how the backend of our system works. The backend performs two important tasks. The first is hosting the pre-trained neural network model to serve predictions. The second is performing image processing operations on the image of handwritten text which is to be recognized. At the backend we have a neural network model trained using Tensorflow and a Python script equipped with the OpenCV library. We have used a Convolutional Neural Network model.

Fig. 2: A Convolutional Neural Network[15]

A Convolutional Neural Network (CNN) is the current state-of-the-art neural network, with wide applications in fields like image and video recognition, natural language processing and recommender systems. CNNs are biologically inspired neural networks and are very good at image recognition. The input to a CNN is a multi-channel image (often an image with red, green and blue channels). A CNN comprises a stack of convolutional and max-pooling layers followed by a fully connected layer. The convolutional layer is the most important layer of the network; it performs the convolution operation. The pooling layer comes after the convolutional layer. It is needed because for larger images the number of trainable parameters can be very large, which increases the time taken to train the network and is impractical. The pooling layer is used to reduce the size of the image.

We used the NIST database, which contains thousands of images of handwritten characters; some of them are shown in Fig. 3. These images were originally of size 128x128 pixels, and the images in the training set were cropped to a size of 28x28. Reducing the size of the images decreases the overall time taken to train the neural network model.
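The stack of convolutional, max-pooling and fully connected layers described above can be sketched with Tensorflow's Keras API. The paper does not give the exact architecture, so the filter counts, layer sizes and the 62-class output (digits plus upper- and lower-case letters) below are illustrative assumptions for 28x28 grayscale inputs:

```python
import tensorflow as tf

# Illustrative CNN for 28x28 grayscale character images. Layer sizes
# and the 62-class output are assumptions, not the paper's exact model.
NUM_CLASSES = 62  # digits + upper-case + lower-case letters (assumed)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    # Convolutional layer: performs the convolution operation
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
    # Max-pooling layer: reduces the spatial size of the image,
    # keeping the number of trainable parameters practical
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    # Fully connected layer followed by the per-class scores
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Training would then call `model.fit` on the cropped 28x28 NIST images.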
After training the neural network model, an accuracy of up to 94% was obtained.
(a) (b)

Fig. 1: Homescreen of the application and results displayed to the user

Now let us discuss the various image processing operations which are performed on the image to be recognized. The following steps are involved in processing the images:
(a)

Fig. 3: Some of the images used for training the neural network

1) Pre-processing:
This is the first step performed in image processing. In this step the noise in the image is removed using median filtering. Median filtering is one of the most widely used noise reduction techniques, because it preserves the edges in the image while still removing the noise.
2) Conversion to Gray-Scale:
After the pre-processing step, the image is converted into grayscale. Conversion into grayscale is necessary because different writers use pens of different colours with varying intensities. Working on grayscale images also reduces the overall complexity of the system.

3) Thresholding:
When an image is converted into grayscale, the handwritten text is darker than its background. With the help of thresholding we can separate the darker regions of the image from the lighter regions, and thus separate the handwritten text from its background.
4) Image Segmentation:
A user writes text in the form of lines. Thus the thresholded image is first segmented into individual lines, then each line is segmented into individual words, and finally each word is segmented into individual characters. To segment the text into lines, the image is scanned from top to bottom and the sum of pixels in each row is calculated, giving the horizontal histogram shown in Fig. 4.

(a)

Fig. 4: Horizontal Histogram of Image

The points marked in red are the points corresponding to the rows where the sum of pixels is zero. After identifying all such rows we can easily segment the handwritten text into lines at these points.

Once the image is segmented into lines, each line must be further segmented into individual words. Segmentation of a line into words can be performed using the vertical projection method, making use of the fact that the spacing between two words is larger than the spacing between two characters. To segment a single line into individual words, the image is scanned from left to right and the sum of pixels in each column is calculated. A vertical histogram is plotted in which the X-axis represents the X-coordinates of the image and the Y-axis represents the sum of pixels in each column, as shown in Fig. 5. The points marked in red in Fig. 5(a) are the points corresponding to the columns where the sum of pixels is zero. The region where the sum of pixels is zero is wider when it separates two words than when it separates two characters.

After segmenting a line into words, each word can be separated into individual characters using a similar technique.
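The projection-based segmentation described above can be sketched with NumPy on the thresholded image (text pixels white, background zero): rows whose pixel sum is zero mark the gaps between lines, and the same function applied to column sums splits a line into words. The function and test image below are illustrative:

```python
import numpy as np

def segment(binary, axis):
    """Split a thresholded image into chunks using projection sums.

    axis=1 sums each row (horizontal histogram -> lines);
    axis=0 sums each column (vertical histogram -> words/characters).
    """
    sums = binary.sum(axis=axis)
    mask = sums > 0
    # Find the [start, end) extents of each run of non-empty rows/columns
    edges = np.diff(mask.astype(int))
    starts = np.flatnonzero(edges == 1) + 1
    ends = np.flatnonzero(edges == -1) + 1
    if mask[0]:
        starts = np.r_[0, starts]
    if mask[-1]:
        ends = np.r_[ends, mask.size]
    if axis == 1:  # row sums -> horizontal slices (text lines)
        return [binary[s:e, :] for s, e in zip(starts, ends)]
    return [binary[:, s:e] for s, e in zip(starts, ends)]

# Two "lines" of text separated by an empty band of zero-sum rows
page = np.zeros((20, 30), dtype=np.uint8)
page[2:6, 1:28] = 255    # first line
page[12:17, 1:28] = 255  # second line
lines = segment(page, axis=1)
```

Applying `segment(line, axis=0)` to each extracted line would then split it at the zero-sum columns, with the wider zero regions marking word boundaries.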
(a)

Fig. 5: Vertical Histogram of Image

These individual characters are then given to the pre-trained neural network model and predictions are obtained. The final predicted text is sent back as a response to the user.

V. CONCLUSION AND FUTURE SCOPE

Many developments to this system are possible in the future. As of now the system cannot recognize cursive handwritten text, but support for recognition of cursive text can be added. Currently our system recognizes text only in the English language; support for more languages can be added in the future. Presently the system recognizes only letters and digits; support for recognition of special symbols can be added. Many applications of this system are possible, such as processing of cheques in banks, assistance in desktop publishing, recognition of text from business cards, and helping the blind by recognizing handwritten text on letters.

REFERENCES

[1] Wei Lu, Zhijian Li, Bingxue Shi, "Handwritten Digits Recognition with Neural Networks and Fuzzy Logic", in IEEE International Conference on Neural Networks, 1995. Proceedings.
[5] Shun Nishide, Hiroshi G. Okuno, Tetsuya Ogata, Jun Tani, "Handwriting Prediction Based Character Recognition using Recurrent Neural Network", in 2011 IEEE International Conference on Systems, Man, and Cybernetics.
[6] T. Kohonen, Self-Organization and Associative Memory, 2nd Ed., New York, Springer, 1988.
[7] Y. Yamashita and J. Tani, "Emergence of Functional Hierarchy in a Multiple Timescales Recurrent Neural Network Model: A Humanoid Robot Experiment", PLoS Computational Biology, Vol. 4, e1000220, 2008.
[8] Lei Wang, Lei Zhang, Yanqing Ma, "Gstreamer Accomplish Video Capture and Coding with PyGI in Python Language", in 2017 24th Asia-Pacific Software Engineering Conference (APSEC).
[9] Rahul R. Palekar, Sushant U. Parab, Dhrumil P. Parikh, Vijaya N. Kamble, "Real Time License Plate Detection Using OpenCV and Tesseract", in International Conference on Communication and Signal Processing.
[10] "OpenCV", Wikipedia. [Online]. Available: https://en.wikipedia.org/wiki/OpenCV/. [Accessed 05 March 2018].
[11] Fatih Ertam, Galip Aydın, "Data Classification with Deep Learning using Tensorflow", in 2017 International Conference on Computer Science and Engineering (UBMK).
[12] "An open-source machine learning framework for everyone", TensorFlow. [Online]. Available: https://www.tensorflow.org/. [Accessed 05 March 2018].