Ijet 10892
net/publication/325116934
All content following this page was uploaded by Venu Gopala Rao Matcha on 20 August 2018.
Research Paper
Abstract
Image classification is a classical problem in image processing, computer vision and machine learning. In this paper we study
image classification using deep learning. We use the AlexNet convolutional neural network architecture for this purpose. Four test
images are selected from the ImageNet database for classification. We cropped the images to various portions and
conducted experiments. The results show the effectiveness of deep learning based image classification using AlexNet.
Keywords: AlexNet; Convolutional Neural Networks; Deep Learning; Image Classification; ImageNet; Machine Learning.
Copyright © 2018 Authors. This is an open access article distributed under the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
International Journal of Engineering & Technology 615
Fig. 2: Architecture of CNN
3. AlexNet
LeNet and AlexNet are two well-known types of ConvNet. LeNet is a shallow convolutional neural network designed to classify handwritten digits. It comprises 2 convolutional layers, 2 subsampling layers, 2 hidden layers and 1 output layer [5]. AlexNet is a deep convolutional neural network used to classify an input image into one of a thousand classes.
AlexNet is used to solve many problems, such as indoor scene classification, which arises frequently in artificial intelligence. It is a powerful method for learning discriminative image features for pattern recognition in computer vision. This paper discusses the classification of selected images of a particular size. AlexNet can very effectively classify images of the kind it was trained on.
AlexNet comprises 5 convolutional layers, 3 subsampling layers and 3 fully connected layers. The main difference between LeNet and AlexNet is the type of feature extractor: AlexNet uses the ReLU nonlinearity in the feature extractor module, whereas LeNet uses the log-sigmoid. AlexNet also uses dropout, which was not used in earlier networks such as LeNet.
4. Implementation, Results and Discussions
We selected four images, Sea Anemone, Barometer, Stethoscope and Radio Interferometer, from the ImageNet database for experimentation (see Fig. 3) [6]. The block diagram of the architecture is shown in Fig. 4, and the corresponding implementation is described below [7].
In the first layer, 96 filters of size 11 × 11 are applied at stride 4, giving an output volume of size 55 × 55 × 96. AlexNet was trained on GTX 580 GPUs, each of which has only 3 GB of memory. The CONV1 output is therefore halved and split across two GPUs, i.e. a 55 × 55 × 48 volume is sent to each GPU. The kernels of the 2nd, 4th and 5th convolutional layers are connected only to those kernel maps in the previous layer that reside on the same GPU, as shown in the figure. The kernels of the 3rd convolutional layer are connected to all kernel maps in the 2nd layer. The neurons in the fully connected layers are connected to all neurons in the previous layer.
The 3rd, 4th and 5th convolutional layers are connected to one another without any intervening pooling or normalization layers. The 3rd convolutional layer has 384 kernels of size 3 × 3 × 256 connected to the (normalized, pooled) outputs of the 2nd convolutional layer. The 4th convolutional layer has 384 kernels of size 3 × 3 × 192 and the 5th convolutional layer has 256 kernels of size 3 × 3 × 192. The first two fully connected layers have 4096 neurons each.
We use local response normalization in the normalization layers; there are two normalization layers in the AlexNet architecture. Deep neural networks with the ReLU nonlinearity train much faster than equivalent networks with tanh units. ReLU allows faster and more effective training by mapping negative values to zero and keeping positive values unchanged. Denoting by $d^{i}_{(x,y)}$ the activity of a neuron computed by applying kernel i at position (x, y) and then applying the ReLU nonlinearity, the response-normalized activity $c^{i}_{(x,y)}$ is expressed as
\[ c^{i}_{(x,y)} = \frac{d^{i}_{(x,y)}}{\left( k + \alpha \sum_{j=\max(0,\, i-n/2)}^{\min(N-1,\, i+n/2)} \left( d^{j}_{(x,y)} \right)^{2} \right)^{\beta}} \qquad (1) \]
where the sum runs over n adjacent kernel maps at the same spatial position, N is the total number of kernels in the layer, and k, n, α and β are hyperparameters.
This kind of response normalization implements a form of lateral inhibition inspired by the type found in real neurons, creating competition for large activities among neuron outputs computed using different kernels.
The test images are cropped to various portions and submitted for classification. The results are shown in Fig. 5, Fig. 6, Fig. 7 and Fig. 8. From the results, it is observed that classification is successful in all cases of the cropped data.
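The local response normalization of Eq. (1) can be illustrated with a short NumPy sketch. This is only an illustration under stated assumptions, not the implementation used in the experiments: the function name `local_response_norm` is hypothetical, n/2 is read as integer division, and the default hyperparameters (k = 2, n = 5, α = 10⁻⁴, β = 0.75) are the values reported by Krizhevsky et al. [11] rather than values given in this paper.

```python
import numpy as np

def local_response_norm(d, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """Normalize post-ReLU activities across neighbouring kernel maps, as in Eq. (1).

    d: array of shape (N, H, W), where d[i, x, y] is the activity of kernel i
       at position (x, y). Default hyperparameters are assumed from [11].
    """
    N = d.shape[0]
    squared = d ** 2
    c = np.empty_like(d, dtype=float)
    for i in range(N):
        # Neighbourhood of kernel maps, clipped at the first and last kernel.
        lo = max(0, i - n // 2)
        hi = min(N - 1, i + n // 2)
        denom = (k + alpha * squared[lo:hi + 1].sum(axis=0)) ** beta
        c[i] = d[i] / denom
    return c

# Example: three kernel maps of ones on a 2 x 2 grid.
activities = np.ones((3, 2, 2))
normalized = local_response_norm(activities, k=1.0, n=2, alpha=1.0, beta=1.0)
```

With these toy settings the middle kernel map sums over three neighbours and the two outer maps over two, so a large activity in one map reduces the normalized response of its neighbours, which is the competition effect described above.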
5. Conclusion
Four test images (sea anemone, barometer, stethoscope and radio interferometer) are chosen from the ImageNet database for testing and
validation of image classification using deep learning. A convolutional neural network with the AlexNet architecture is used for
classification. From the experiments, it is observed that the images are classified correctly even when only a portion of the
test image is presented, which shows the effectiveness of the deep learning algorithm.
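The cropping experiment can be sketched as follows. The paper does not state how the cropped portions were chosen, so the central crops, the helper names `center_crop` and `classify_crops`, and the example fractions are all assumptions made for illustration; `classify` stands for any user-supplied classifier (for example, a pretrained AlexNet wrapper) and is not defined here.

```python
import numpy as np

def center_crop(image, fraction):
    """Return the central region spanning `fraction` of each side of `image`.

    image: array of shape (H, W, C); fraction: value in (0, 1].
    Central cropping is an assumption; the paper does not specify the regions.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    h, w = image.shape[:2]
    ch, cw = max(1, round(h * fraction)), max(1, round(w * fraction))
    top, left = (h - ch) // 2, (w - cw) // 2
    return image[top:top + ch, left:left + cw]

def classify_crops(image, classify, fractions=(1.0, 0.75, 0.5)):
    """Apply a user-supplied `classify` callable to each crop and collect the results."""
    return {f: classify(center_crop(image, f)) for f in fractions}

# Example with a stand-in classifier that simply reports the crop height.
dummy_image = np.zeros((100, 80, 3))
results = classify_crops(dummy_image, classify=lambda crop: crop.shape[0])
```

Comparing the labels returned for the different fractions against the label of the full image gives the kind of check reported in the results.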
Acknowledgement
The authors would like to express their deep gratitude to the Department of ECE and the management of K L E F for their
support and encouragement during this work.
References
[1] https://in.mathworks.com/matlabcentral/fileexchange/59133-neural-network-toolbox-tm--model-for-alexnet-network
[2] H. Lee, R. Grosse, R. Ranganath, and A.Y. Ng. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 609–616. ACM, 2009.
[3] Deep Learning with MATLAB – matlab expo2018
[4] Introducing Deep Learning with the MATLAB – Deep Learning E-Book provided by the mathworks.
[5] https://www.completegate.com/2017022864/blog/deep-machine-learning-images-lenet-alexnet-cnn/all-pages
[6] A. Berg, J. Deng, and L. Fei-Fei. Large scale visual recognition challenge 2010. www.imagenet.org/challenges. 2010.
[7] Fei-Fei Li, Justin Johnson and Serena Yeung, “Lecture 9: CNN Architectures” May 2017.
[8] L. Fei-Fei, R. Fergus, and P. Perona. Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories. Computer Vision and Image Understanding, 106(1):59–70, 2007.
[9] J. Sánchez and F. Perronnin. High-dimensional signature compression for large-scale image classification. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 1665–1672. IEEE, 2011.
[10] https://in.mathworks.com/help/vision/examples/image-category-classification-using-deep-learning.html
[11] Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks” May 2015.
[12] A. Krizhevsky. Learning multiple layers of features from tiny images. Master’s thesis, Department of Computer Science, University of Toronto, 2009.
[13] https://in.mathworks.com/help/nnet/deep-learning-imageclassifi-
cation.html
[14] KISHORE, P.V.V., KISHORE, S.R.C. and PRASAD, M.V.D., 2013. Conglomeration of hand shapes and texture information for recognizing gestures of Indian sign language using feed forward neural networks. International Journal of Engineering and Technology, 5(5), pp. 3742-3756.
[15] RAMKIRAN, D.S., MADHAV, B.T.P., PRASANTH, A.M., HARSHA, N.S., VARDHAN, V., AVINASH, K., CHAITANYA, M.N. and NAGASAI, U.S., 2015. Novel compact asymmetrical fractal aperture Notch band antenna. Leonardo Electronic Journal of Practices and Technologies, 14(27), pp. 1-12.
[16] KARTHIK, G.V.S., FATHIMA, S.Y., RAHMAN, M.Z.U., AHAMED, S.R. and LAY-EKUAKILLE, A., 2013. Efficient signal conditioning techniques for brain activity in remote health monitoring network. IEEE Sensors Journal, 13(9), pp. 3273-3283.
[17] KISHORE, P.V.V., PRASAD, M.V.D., PRASAD, C.R. and RAHUL, R., 2015. 4-Camera model for sign language recognition using elliptical Fourier descriptors and ANN, International Conference on Signal Processing and Communication Engineering Systems - Proceedings of SPACES 2015, in Association with IEEE 2015, pp. 34-38.