A Simple and Efficient Text-Based CAPTCHA Verification Scheme Using Virtual Keyboard

Kajol Patel¹ and Ankit Thakkar² (✉)

¹ Department of Computer Engineering, Institute of Technology,
Nirma University, Ahmedabad 382 481, Gujarat, India
15MCEI21@nirmauni.ac.in
² Department of Information Technology, Institute of Technology,
Nirma University, Ahmedabad 382 481, Gujarat, India
ankit.thakkar@nirmauni.ac.in

Abstract. Digital media has become an effective means of communication
that is available round the clock to everyone, including humans and
machines. This creates the need for machines to differentiate between
humans and machines as far as access to a website or its relevant
services is concerned. CAPTCHA (Completely Automated Public Turing
test to tell Computers and Humans Apart) is a test that helps machines
(or programs) to differentiate between humans and machines. A CAPTCHA
should be easy for users to solve and difficult for bots to attack. In this
paper, a simple and efficient text-based CAPTCHA verification scheme
is proposed that is easy for humans and hard for bots. The proposed
scheme uses a virtual keyboard, eliminates the input box, and performs
verification on the basis of the positions of the characters.

Keywords: CAPTCHA · Virtual keyboard · Position based verification

1 Introduction
CAPTCHA is used in websites to prevent automated interactions by bots. For
example, Gmail improves its service by blocking access to automated spammers,
eBay blocks automated programs that flood the website, and Facebook protects
its site by limiting the creation of fraudulent profiles [1]. In November 1999,
slashdot.com released a poll to select the best college of CS in the US. In
this poll, students of Carnegie Mellon University and MIT created automated
programs that repeatedly voted for their colleges. This incident created the
need to use CAPTCHAs in online polls to ensure that only humans are allowed
to participate [2]. CAPTCHAs are used in many web applications (or web
services) such as search engines, password systems, online polls, account
registrations, spam prevention, blogs, messaging, and phishing attack
detection [3].
CAPTCHAs can broadly be classified as text-based, image-based, audio-based,
and video-based CAPTCHAs. This paper focuses on text-based CAPTCHAs only.
Text-based CAPTCHAs are widely used because they are simple and
user-friendly. A few examples of text-based CAPTCHAs are Gimpy, EZ-Gimpy,
MSN-CAPTCHA, and Baffle-Text. In Gimpy CAPTCHA, ten random words are
selected from a dictionary and displayed to the user as distorted images.
Noise can be added to the images so that it is difficult for a machine to
identify the CAPTCHA. To access the web service, the user must correctly
enter the characters of the given images. In EZ-Gimpy CAPTCHA, only one word
is selected from a dictionary and displayed to the user after applying
distortion. Mori et al. [4] used a shape context matching method to identify
the words in images of EZ-Gimpy and Gimpy CAPTCHAs. These CAPTCHAs were
cracked by object recognition algorithms with success rates of 92% and 33%,
respectively.

© Springer International Publishing AG 2018
S.C. Satapathy and A. Joshi (eds.), Information and Communication
Technology for Intelligent Systems (ICTIS 2017) - Volume 2, Smart Innovation,
Systems and Technologies 84, DOI 10.1007/978-3-319-63645-0_13

The characters of CAPTCHA images are distorted so that it is difficult for
bots to crack the CAPTCHAs. This also makes it ambiguous for users to
identify the characters of the CAPTCHA. This may result in multiple attempts
by the user, which may lead to security issues [6]. In [12], various tests
were conducted to examine how different CAPTCHAs and their complexity affect
the user experience.
The Microsoft CAPTCHA was vulnerable to a low-cost segmentation attack. A
number of text-based CAPTCHAs were cracked with an overall success rate of
60%, and a success rate of 90% was achieved with segmentation [7]. Yan et
al. [5] carried out an attack on visual CAPTCHAs provided by
Captchaservice.org and were successful in cracking them. They exploited
design errors using simple naive pattern recognition algorithms. In [8], the
authors targeted the Bank of China website, which uses a text-based CAPTCHA.
They explored vulnerabilities such as the fixed character length of the text
and the use of only lower-case characters. Pengpeng Lu et al. [10] proposed a
new segmentation method for connected characters using a BP (Back
Propagation) neural network and drop-falling algorithms. This method can
solve CAPTCHAs having connected characters but fails if the characters are
seriously distorted or overlapped.
Starostenko et al. [9] discussed a novel approach to break text-based
CAPTCHAs with variable text and orientation. An SVM classifier is used to
recognize straightened characters. A segmentation success rate of 82% for
reCaptcha 2011 is achieved by this method. Financial institutions also use
CAPTCHAs to protect their services. Wang et al. [11] proposed an algorithm
to defeat rotated text CAPTCHAs by transformation and segmentation using an
adaptive system.
Most text-based CAPTCHAs are cracked by OCR attacks. To prevent OCR attacks,
a CAPTCHA can be made more complex with noise and distortion, but this
affects usability. Simplicity makes text-based CAPTCHA the preferred choice
of implementation; at the same time, there is a need to protect CAPTCHAs
from bots. Hence, this paper proposes a new method for CAPTCHA verification
that makes it easy for users to read and input the characters of the CAPTCHA
and, at the same time, makes it difficult for bots to input the CAPTCHA
characters. This approach uses a virtual keyboard to take input from the
user, eliminates the use of an input box, and compares the CAPTCHA
characters based on the positions of the characters instead of the contents
of the CAPTCHA. The rest of the paper is organized as follows: the proposed
approach is discussed in Sect. 2, the simulation setup and result discussion
are given in Sect. 3, and concluding remarks and future scope are given in
Sect. 4.

[Flowchart: Start -> Generate text-based CAPTCHA without distortion ->
Initialize virtual keyboard with key randomization -> Save key positions of
virtual keyboard and CAPTCHA characters -> Press a key of virtual keyboard ->
Assign sequence number to the key pressed by the user -> Save position and
sequence number of the key -> Are all characters of CAPTCHA input by the
user? (no: press another key; yes: continue) -> Press submit button -> Are
positions of the keys matched? (yes: Successful; no: not successful)]

Fig. 1. Basic working flow of virtual keyboard

2 Proposed Method
To reduce bot attacks, more complex CAPTCHAs are generated with distortions
and noise, which affects usability. Users get frustrated by having to refresh
the CAPTCHA many times because noise makes the characters difficult to read.
Hence, instead of making the CAPTCHA more complex, security can be increased
by developing a new CAPTCHA verification method that is difficult for bots
but easy for humans to pass. This paper proposes such an approach using a
virtual keyboard.
In the proposed approach, the text-based CAPTCHA is created without noise,
which makes it easy for the user to read and pass the test in a single
attempt in most cases. The user uses the virtual keyboard to input the
CAPTCHA word. However, this word is stored in the form of the positions of
the characters. The keys pressed by the user are also highlighted, and a
sequence number is assigned to each character pressed by the user. The user
can view the sequence number of a particular character by placing the mouse
pointer over the specific key. This method avoids the use of a textbox to
take input from the user, which makes it difficult for bots to input the
CAPTCHA characters.
In addition, the proposed approach adds complexity by randomizing the keys
of the virtual keyboard. It should be noted that the proposed approach
compares the CAPTCHA text using the positions of the keys pressed by the
user rather than the actual values of the keys. The flowchart of the
proposed method is shown in Fig. 1.
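
The combination of key randomization and position-based storage described above can be illustrated with a minimal Java sketch. It is not the authors' implementation: the class name RandomizedKeyboardSketch, the 36-key alphabet, and the use of a list index as the "position" of a key are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch; names and the key set are assumptions, not from the paper.
class RandomizedKeyboardSketch {
    // Assumed key set: lower-case letters and digits.
    static final String KEYS = "abcdefghijklmnopqrstuvwxyz0123456789";

    // Returns the keys in shuffled order; index i is treated as the
    // on-screen position of the key layout.get(i).
    static List<Character> shuffledLayout(Random rng) {
        List<Character> layout = new ArrayList<>();
        for (char c : KEYS.toCharArray()) {
            layout.add(c);
        }
        Collections.shuffle(layout, rng);
        return layout;
    }

    // Translates the CAPTCHA text into the sequence of key positions
    // that a correct user is expected to click on this layout.
    static List<Integer> expectedPositions(String captcha, List<Character> layout) {
        List<Integer> positions = new ArrayList<>();
        for (char c : captcha.toCharArray()) {
            int pos = layout.indexOf(c);
            if (pos < 0) {
                throw new IllegalArgumentException("no key for '" + c + "'");
            }
            positions.add(pos);
        }
        return positions;
    }

    public static void main(String[] args) {
        List<Character> layout = shuffledLayout(new Random());
        System.out.println("layout:   " + layout);
        System.out.println("expected: " + expectedPositions("q7xk", layout));
    }
}
```

Because the layout is reshuffled for every test, the same CAPTCHA text maps to a different position sequence each time, which is the property the scheme relies on.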

3 Simulation Setup and Result Discussions


The proposed approach is verified using the Java language. A CAPTCHA image of
random characters generated by the server is displayed to the user. It should
be noted that the proposed approach does not generate fixed-length CAPTCHAs.
In addition, noise is removed to increase the readability of the CAPTCHA.
Both of these help to overcome the weaknesses of text-based CAPTCHA
generation schemes discussed in Sect. 1.
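
As an illustration of variable-length CAPTCHA text, the following Java sketch draws a random length and then random characters. The paper does not state the length range or the character set, so the bounds passed to randomText, the ALPHABET constant, and the class name CaptchaTextSketch are assumptions.

```java
import java.security.SecureRandom;

// Hypothetical sketch; the alphabet and length bounds are assumptions.
class CaptchaTextSketch {
    static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

    // Builds a random string whose length is drawn uniformly
    // from [minLen, maxLen], so the length is not fixed.
    static String randomText(SecureRandom rng, int minLen, int maxLen) {
        if (minLen < 1 || maxLen < minLen) {
            throw new IllegalArgumentException("invalid length bounds");
        }
        int len = minLen + rng.nextInt(maxLen - minLen + 1);
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++) {
            sb.append(ALPHABET.charAt(rng.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed range of 4 to 8 characters, for demonstration only.
        System.out.println(randomText(new SecureRandom(), 4, 8));
    }
}
```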
When a page is loaded, the positions of the characters are saved. When the
user clicks the submit button, the position and sequence of the CAPTCHA text
are compared with the position and sequence of the virtual keyboard keys
pressed by the user. If the positions match, the user gets access to the
required services provided by the server; otherwise, the page is refreshed
and a CAPTCHA test begins with a new text CAPTCHA and keyboard.
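
The comparison performed on submit can be sketched as follows. This is an illustrative Java sketch under assumptions, not the authors' code: the class PositionVerifierSketch and its methods press and submit are hypothetical, and key positions are represented as plain integers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; class and method names are assumptions.
class PositionVerifierSketch {
    // Key positions of the CAPTCHA characters, saved when the page loads.
    private final List<Integer> expected;
    // Key positions clicked by the user, in click order.
    private final List<Integer> pressed = new ArrayList<>();

    PositionVerifierSketch(List<Integer> expected) {
        this.expected = new ArrayList<>(expected);
    }

    // Records a click and returns its 1-based sequence number.
    int press(int keyPosition) {
        pressed.add(keyPosition);
        return pressed.size();
    }

    // True only if the same positions were clicked in the same order;
    // no character values are compared.
    boolean submit() {
        return pressed.equals(expected);
    }
}
```

A caller would construct the verifier with the saved positions, call press for each click, and call submit when the user presses the submit button; on a false result a new CAPTCHA and keyboard would be generated.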
An example of a CAPTCHA test is shown in Fig. 2. The keys of the keyboard
are highlighted as the user clicks them. This helps the user to identify the
characters that have already been entered, as shown in Fig. 3. Each character
is assigned a unique sequence number as soon as the user clicks on it. This
sequence number helps the user to keep track of the order of the characters
entered. The sequence number of the character ‘q’ is shown to the user when
the mouse hovers over the character ‘q’, as shown in Fig. 4. The user is
required to click the submit button once all characters have been entered.

Fig. 2. Snapshot of CAPTCHA and keyboard

Fig. 3. Snapshot of keyboard with keys highlighted

Fig. 4. Snapshot of sequence displayed on mouseover

4 Conclusion and Future Scope

This paper presents a simple and efficient CAPTCHA verification scheme that
differentiates between humans and machines. The proposed approach generates
a simple text-based CAPTCHA that is easy for humans to read; hence, humans
can pass the test in a single attempt as far as possible. At the same time,
the use of a virtual keyboard along with randomized key positions makes it
difficult for machines to pass the CAPTCHA test. The proposed approach uses
a virtual keyboard to take input for CAPTCHA verification, eliminates the
input box, which makes it difficult for bots to decide where to input the
CAPTCHA text, and uses position-based verification in place of comparing the
contents of the CAPTCHA text.
In future, the proposed approach can be extended by randomizing the
positions of the CAPTCHA and the virtual keyboard so that both can take any
position on the screen. The use of handwritten characters to initialize the
virtual keyboard can also be considered as future scope. In addition,
response time analysis can make the proposed approach much stronger and can
also be considered as future scope.

References
1. Bursztein, E., Martin, M., Mitchell, J.: Text-based CAPTCHA strengths and
weaknesses. In: Proceedings of the 18th ACM Conference on Computer and
Communications Security, pp. 125–138 (2011)
2. Choudhary, S., Saroha, R., Dahiya, Y., Choudhary, S.: Understanding CAPTCHA:
text and audio based CAPTCHA with its applications. Int. J. Adv. Res. Comput.
Sci. Softw. Eng. 3(6) (2013)
3. Banday, M.T., Shah, N.A.: A study of CAPTCHAs for securing web services.
arXiv preprint arXiv:1112.5605 (2011)
4. Mori, G., Malik, J.: Recognizing objects in adversarial clutter: breaking a visual
CAPTCHA. In: 2003 IEEE Computer Society Conference on Computer Vision and
Pattern Recognition, Proceedings, pp. 1–134. IEEE (2003)
5. Yan, J., Ahmad, E., Salah, A.: Breaking visual CAPTCHAs with naive pattern
recognition algorithms. In: Twenty-Third Annual Computer Security Applications
Conference, ACSAC 2007, pp. 279–291. IEEE (2007)
6. Yan, J., Ahmad, E., Salah, A.: Usability of CAPTCHAs or usability issues in
CAPTCHA design. In: Proceedings of the 4th Symposium on Usable Privacy and
Security, pp. 44–52. ACM (2008)
7. Yan, J., Ahmad, E., Salah, A.: A low-cost attack on a Microsoft CAPTCHA.
In: Proceedings of the 15th ACM Conference on Computer and Communications
Security, pp. 543–554. ACM (2008)
8. Ling-Zi, X., Yi-Chun, Z.: A case study of text-based CAPTCHA attacks. In: 2012
International Conference on Cyber-Enabled Distributed Computing and
Knowledge Discovery (CyberC). IEEE (2012)
9. Starostenko, O., Cruz-Perez, C., Uceda-Ponga, F., Alarcon-Aquino, V.:
Breaking text-based CAPTCHAs with variable word and character orientation.
Pattern Recogn. 48, 1101–1112 (2015). Elsevier
10. Lu, P., Shan, L., Li, J., Liu, X.: A new segmentation method for connected
characters in CAPTCHA. In: 2015 International Conference on Control, Automation
and Information Sciences (ICCAIS). IEEE (2015)
11. Wang, Y., Lu, M.: A self-adaptive algorithm to defeat text-based CAPTCHA. In:
2016 IEEE International Conference on Industrial Technology (ICIT), pp. 720–725.
IEEE (2016)
12. Gafni, R., Nagar, I.: CAPTCHA-security affecting user experience. In: Issues in
Informing Science and Information Technology (2016)
