Batch6DocumentationFinal_1
MACHINE LEARNING
MAJOR PROJECT REPORT
Submitted by
MAY 2019
SRM UNIVERSITY
(Under Section 3 of UGC Act, 1956)
BONAFIDE CERTIFICATE
Certified that this project report titled “PHISHING SCAM DETECTION USING
MACHINE LEARNING” is the bonafide work of “BADRI NARAYAN MOHAN
[Reg No: RA151100302078], VIJAY KISHORE.V [Reg No: RA1511003020073],
KEERTHIVASAN.R [Reg No: RA1511003020088], NARESH SOLANKI [Reg
No: RA1511003020124]”, who carried out the project work under my supervision.
Certified further, that to the best of my knowledge the work reported herein does not
form part of any other project report or dissertation.
SIGNATURE SIGNATURE
Dept. of Computer Science and Engg. Dept. of Computer Science and Engg.
SRM Institute of Science and Technology SRM Institute of Science and Technology
Ramapuram Ramapuram
CHENNAI CHENNAI
EXAMINER I EXAMINER II
ABSTRACT
Abstract
List of Figures
Chapter 3 SPECIFICATIONS
3.1 Introduction
3.1.1 Purpose of Content
3.2.2.1 Python
3.2.3.2 Liclipse
3.1 Introduction
3.2 System Architecture
4.2.1 Description
3.3 System Requirement
3.4 Summary
REFERENCES
CHAPTER 1
INTRODUCTION
Phishing scams are widespread throughout the world. The internet has become a crucial
and indispensable infrastructure for human society; it has helped both individuals and
corporations over the years and has provided a platform for worldwide connectivity.
However, it also has its share of drawbacks, especially when it comes to security. One
such security threat is phishing. Phishing is a technique that combines technical
tactics and social engineering to lure unsuspecting people into revealing personal and
valuable data. Phishers have multiple methods at their disposal to steal sensitive
information. One form of phishing involves creating replicas of real websites, so that
users are led to fraudulent websites where they unknowingly give away confidential
details such as ATM card numbers, PINs and other important data. Phishers also create
spoofed e-mails that pretend to come from legitimate corporations. Recipients who
believe these e-mails are genuine act on their contents, which slyly ask for
information such as usernames, user IDs and passwords for accounts commonly held on
social media and e-commerce websites, among others. Such e-mails also lure people into
phony schemes. The main reason internet users fall for such phishing methods is that
phishers abuse everything from logos and slogans to trademarks and other corporate
identifiers, which makes the fake websites and e-mails dangerously similar to their
legitimate counterparts. In the United States alone, scams and thefts that happen over
the internet have caused 71 billion dollars in harm. Phishing thus continues to be one
of the fastest-growing identity theft scams on the internet.
1.1 OVERVIEW
Blacklisting is commonly used by web browsers to warn users about potentially
dangerous web pages that appear on their blacklists. However, such lists do not include
previously unseen URLs, since it is non-trivial to decide whether an unseen URL is
malicious. Phishing detection therefore faces the challenge of real-time detection,
which is not possible with blacklisting because no list of phishing websites can be
exhaustive. Generalization ability is another challenge for phishing detection, as
attackers constantly revamp their techniques and build robust infrastructure to support
ongoing phishing activity. One important building block of such infrastructure is the
botnet, which is used to generate automated phishing e-mails and to host phishing
sites. A recent study by the APWG (Anti-Phishing Working Group) supports the view that
attackers may be using increasingly sophisticated schemes and infrastructure to exploit
the ever-expanding number of popular brands. To summarize, there is an urgent need for
a reliable phishing detection system that can achieve near-perfect accuracy in an
internet environment where the number of attackers and the volume of phishing activity
continue to grow.
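The limitation of blacklisting described above can be illustrated with a minimal
sketch. The blacklist entries and the function name is_blacklisted are hypothetical and
are used only for illustration; a real browser blacklist is far larger and is not
necessarily an exact-match lookup.

```python
# Hypothetical blacklist entries; a real browser list would hold many more.
BLACKLIST = {
    "http://phish.example/login",
    "http://scam.example/verify",
}


def is_blacklisted(url: str) -> bool:
    """Return True only when the exact URL is already on the blacklist."""
    return url in BLACKLIST


# A URL that was reported earlier is caught.
print(is_blacklisted("http://phish.example/login"))   # True
# A newly created phishing URL is not on the list yet, so it slips through.
print(is_blacklisted("http://phish.example/login2"))  # False
```

The second lookup fails even though the URL differs by a single character, which is
why a model that generalizes to unseen URLs is needed.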
1.3 OBJECTIVE
The main objective is to develop a suitable machine learning algorithm that helps in
analyzing and blocking phishing tools and software. This makes users more aware of
their online environment and supports the fight against fraudulent activities on the
internet. Here we use a decision tree algorithm and train it with data sets to
distinguish whether or not a website is a phishing website. The algorithm constructs a
decision tree based on two measures, information gain and purity, which determine how
the samples are split. The constructed tree and its size are then examined to tell
scam websites from real ones.
processing. There are seven chapters that deal with the various design and
implementation details.
Chapter 1
This chapter gives an overview of the project, its objective and the problems
encountered during the project.
Chapter 2
This chapter covers the features of the existing system and the proposed system, and
discusses the issues in the existing system.
Chapter 3
This chapter covers the purpose of the project, a description of the software used and
the features of the platform.
Chapter 4
This chapter deals with the proposed system architecture, the process flow and the
design of the entire project.
Chapter 5
This chapter covers the modules involved in the project and the architecture of the
entire system. The working of each module is explained with a description.
Chapter 6
This chapter deals with the system implementation: the details of the platform used,
the implementation source code and screenshots of the output produced.
Chapter 7
This chapter presents the conclusion and the future enhancements of the project.
CHAPTER 2
LITERATURE SURVEY
2.1 INTRODUCTION
The idea of phishing has been around for over two decades and can be traced back to
the 1990s on AOL (America Online). A group of hackers banded together and called
themselves the "warez community"; they can be considered the first "phishers". In the
early stages of phishing, a generator was built to produce random credit card numbers,
which were later used to create counterfeit accounts on AOL. When a generated number
matched a genuine card, they created accounts and spammed other members of the AOL
community, where some people would take the bait. Around 1995, AOL managed to stop
these random credit card generators, but the warez community moved on to other
techniques and began posing as AOL representatives, asking users for private data
through AOL messenger.
Malicious attackers are becoming increasingly difficult to trace and have developed
many approaches to gain the trust of individuals. Email phishing is a numbers game. An
attacker sending out a large number of fraudulent messages can net significant
information and sums of money, even if only a small percentage of recipients fall for
the scam. As seen above, there are several techniques attackers use to increase their
success rates.
For one, they will go to great lengths in designing phishing messages to mimic real
messages from a spoofed organization. Using the same phrasing, typefaces, logos and
signatures makes the messages appear legitimate. Moreover, attackers will usually try
to push users into action by creating a sense of urgency. For instance, as previously
shown, an email could threaten account expiration and place the recipient on a timer.
Applying such pressure makes the user less diligent and more prone to error. Lastly,
links inside messages resemble their legitimate counterparts, but typically have a
misspelled domain name or extra subdomains. In the URLs above, the avitahr.in/careers
URL was changed to avitahr.inrenewal.com. Similarities between the two addresses give
the impression of a secure link, making the recipient less aware that an attack is
taking place.
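The link check described above can be sketched as follows. This is a minimal
illustration, not the implementation used in the project: the function name
belongs_to_domain is hypothetical, and the genuine domain is assumed to be avitahr.in
as in the example. The URL must include a scheme (such as http://) for the host to be
parsed.

```python
from urllib.parse import urlsplit

# Assumption: the genuine domain from the example in the text.
LEGITIMATE_DOMAIN = "avitahr.in"


def belongs_to_domain(url: str, domain: str = LEGITIMATE_DOMAIN) -> bool:
    """True if the URL's host is the domain itself or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return host == domain or host.endswith("." + domain)


print(belongs_to_domain("http://avitahr.in/careers"))     # True
print(belongs_to_domain("http://avitahr.inrenewal.com"))  # False
```

Comparing the end of the host name, rather than searching for the brand anywhere in
the URL, is what exposes the lookalike: avitahr.inrenewal.com starts with the trusted
name but actually belongs to inrenewal.com.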
2.2.3 Cybersquatting
Cybersquatting is registering, trafficking in, or using a domain name with bad-faith
intent to profit from the goodwill of a trademark belonging to someone else. The
cybersquatter may offer to sell the domain, at an inflated price, to the person or
company that owns a trademark contained within the name, or may use it for fraudulent
purposes such as phishing. For instance, suppose the name of your company is
"Avita HR solutions" and you register avitahrsolutions.com. Phishers can then register
avitahrsolutions.net, avitahrsolutions.org and avitahrsolutions.biz and use them for
fraudulent purposes.
2.2.4 Typosquatting
Typosquatting, also called URL hijacking, is a form of cybersquatting that relies on
mistakes such as typographical errors made by internet users when typing a website
address into a web browser, or on typographical errors that are hard to notice at a
quick glance. URLs created through typosquatting look like those of a trusted domain.
A user may accidentally enter an incorrect website address or click a link that looks
like a trusted domain, and in this way end up on an alternative website owned by a
phisher.
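One way to flag such near-miss domains is to measure how many single-character edits
separate a candidate domain from a trusted one. The sketch below uses the Levenshtein
edit distance for this purpose; this technique, the function names and the threshold
of two edits are our assumptions for illustration and are not taken from the project
implementation.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def looks_like_typosquat(domain: str, trusted: str, max_edits: int = 2) -> bool:
    """Flag domains that are close to, but not the same as, a trusted domain."""
    return 0 < edit_distance(domain.lower(), trusted.lower()) <= max_edits


print(looks_like_typosquat("paypa1.com", "paypal.com"))  # True
print(looks_like_typosquat("paypal.com", "paypal.com"))  # False
```

A small threshold keeps unrelated domains from being flagged, at the cost of missing
typosquats that differ by more than a couple of characters.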
2.2.6 An Intelligent Anti-phishing Strategy Model for Phishing Website Detection
NFV architecture further offers agile steering and joint optimization of network
functions and resources. This architecture benefits a wide range of applications (e.g.,
service chaining) and is becoming the dominant form of NFV. In this survey, we
present a thorough investigation of the development of NFV under the software-
defined NFV architecture, with an emphasis on service chaining as its application. We
first introduce the software-defined NFV architecture as the state of the art of NFV
and present relationships between NFV and SDN. Then, we provide a historic view of
the involvement from middle box to NFV. Finally, we introduce significant
challenges and relevant solutions of NFV, and discuss its future research directions by
different application domains.
Fog Computing is a paradigm that extends Cloud computing and services to the edge
of the network. Similar to Cloud, Fog provides data, compute, storage, and
application services to end-users. In this article, we elaborate the motivation and
advantages of Fog computing, and analyse its applications in a series of real
scenarios, such as Smart Grid, smart traffic lights in vehicular networks and software
defined networks. We discuss the state-of-the-art of Fog computing and similar work
under the same umbrella. Security and privacy issues are further disclosed according
to current Fog computing paradigm. As an example, we study a typical attack, man-
in-the-middle attack, for the discussion of security in Fog computing. We investigate
the stealthy features of this attack by examining its CPU and memory consumption on
Fog device.
2.4 Summary of Literature Review
Various phishing methods have been identified on the basis of type, attack strategy,
motive, region, vulnerability and so on. These attacks are highly diverse, and it takes
time to build datasets for them. This can be addressed through proper classification
and an understanding of how a particular platform actually operates. As a new type of
malicious software, phishing websites have appeared frequently in recent years and
cause great harm to online financial services and data security. In this paper, we
design and implement an intelligent model for detecting phishing websites. In this
model, we extract 10 different types of features, such as title, keyword and link text
information, to represent the website. Heterogeneous classifiers are then built based
on these different features. We propose a principled ensemble classification algorithm
to combine the predicted results from the different phishing detection classifiers. A
hierarchical clustering technique has been employed for automatic phishing
categorization. Case studies on large, real daily phishing websites collected from
Kingsoft Internet Security Lab demonstrate that our proposed model outperforms other
commonly used anti-phishing methods and tools in phishing website detection.
We utilize uniform resource locator (URL) features and web traffic features to detect
phishing websites based on a designed neuro-fuzzy framework. Based on the new approach
of fog computing, as promoted by Cisco, we design an anti-phishing model to
transparently monitor and protect fog users from phishing attacks.
CHAPTER 3
SPECIFICATIONS
3.1 INTRODUCTION
One algorithm that can be used for phishing detection is the decision tree algorithm,
which can be regarded as an enriched nested if-else structure. The algorithm checks the
features one by one to determine whether a given URL is legitimate. URLs are passed
through a tree made up of nodes and leaves. The nodes, drawn as ellipses, represent the
features, while the leaves, drawn as rectangles, represent the classes (phishing or
legitimate). Each sample is fed into the tree and follows the branch that matches its
feature values at every node until it reaches a leaf. The leaf it reaches tells us
whether the sample is phishing or legitimate.
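The nested if-else view of a decision tree can be sketched as below. This is a
hand-written toy tree for illustration only: the feature names (has_ip_address,
has_at_symbol, url_length, num_subdomains) and the thresholds are hypothetical and are
not the features or values learned in this project.

```python
def classify_url(features: dict) -> str:
    """Walk a tiny hand-built decision tree; each `if` is a node, each return a leaf."""
    if features["has_ip_address"]:          # node: host is a raw IP address
        return "phishing"                   # leaf
    if features["has_at_symbol"]:           # node: '@' appears in the URL
        return "phishing"
    if features["url_length"] > 75:         # node: unusually long URL
        if features["num_subdomains"] > 3:  # nested node: many subdomains
            return "phishing"
        return "legitimate"
    return "legitimate"


sample = {
    "has_ip_address": False,
    "has_at_symbol": False,
    "url_length": 90,
    "num_subdomains": 5,
}
print(classify_url(sample))  # phishing
```

In a trained tree the order of the checks and the thresholds are not written by hand
but chosen from the training samples.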
The key is therefore to choose the features intelligently. To achieve this, the
algorithm uses two equations that provide values for two parameters: the gain score and
the entropy. This approach is called the information gain method.
Fig 3.1: Gain Score Formula
The higher the gain score of a feature, the better that feature distinguishes phishing
samples from legitimate ones. The objective is therefore to rank the features in the
tree in decreasing order of gain score. The entropy value used in the gain equation has
a separate equation of its own. It is the statistical measure of the purity of the
samples used.
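The figure containing the formula is not reproduced in this text. For reference, the
standard textbook definitions of entropy and information gain, which the gain score is
assumed to follow, are given below. The symbols S (the sample set), A (a feature),
S_v (the samples for which A takes the value v) and p_c (the fraction of samples in
class c) are introduced here for illustration.

```latex
H(S) = -\sum_{c \in \{\text{phishing},\,\text{legitimate}\}} p_c \log_2 p_c
\qquad
\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v)
```

An entropy of 0 means the samples all belong to one class (fully pure), and for two
classes the maximum entropy is 1, reached when the classes are evenly mixed.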
Two entropy values are involved: the original entropy and the relative entropy. The
original entropy, measured on the full sample set, is constant, while the relative
entropy, measured after the samples are split on a feature, changes with the feature
chosen. The lower the relative entropy, the purer the samples. To obtain the hierarchy
of features, multiple samples are run through the tree, the gain score of each feature
is computed using the two equations, and the features are ranked in decreasing order of
gain score. To grow the tree, leaves are converted into nodes as further features are
added. Once the leaves reach high purity during this growth, the tree is big enough and
the training process can be finished.
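The ranking step described above can be sketched in a few lines of Python. This is a
minimal illustration using the standard entropy and information gain definitions, not
the project's actual code; the function names, the two binary features and the four
toy samples are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels; 0 means pure."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def gain_score(samples, labels, feature):
    """Original entropy minus the weighted entropy after splitting on `feature`."""
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(sample[feature], []).append(label)
    after_split = sum(
        len(group) / len(labels) * entropy(group) for group in groups.values()
    )
    return entropy(labels) - after_split


def rank_features(samples, labels):
    """Return feature names sorted by decreasing gain score."""
    features = samples[0].keys()
    return sorted(features, key=lambda f: gain_score(samples, labels, f), reverse=True)


# Hypothetical toy data: two binary features per URL sample.
samples = [
    {"has_ip_address": True, "uses_https": False},
    {"has_ip_address": True, "uses_https": True},
    {"has_ip_address": False, "uses_https": True},
    {"has_ip_address": False, "uses_https": False},
]
labels = ["phishing", "phishing", "legitimate", "legitimate"]

print(rank_features(samples, labels))  # ['has_ip_address', 'uses_https']
```

In this toy data, splitting on has_ip_address leaves two pure groups, so it receives
the full gain of 1 bit and is placed at the top of the tree, while uses_https leaves
both groups evenly mixed and receives a gain of 0.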
The effectiveness of this algorithm depends on the variety of the samples used and the
sources from which they are drawn. To generalize well, the samples must be collected
from multiple data sources; otherwise the model will work well only on the existing
dataset and not on real-world data.
1) Effortless
2) Easy to Code
3) Easy to Read
First, let us consider expressiveness. Suppose we have two languages A and B, and every
program that can be written in A can also be written in B using local transformations.
However, there are some programs that can be written in B, but not in A, using local
transformations. Then B is said to be more expressive than A. Python provides a myriad
of constructs that help us focus on the solution rather than on the syntax. This is one
of the outstanding features of Python and a reason to learn it.
Python is freely available and can be downloaded from:
https://www.python.org/downloads/
Moreover, it is open source, which means its source code is available to the public.
You can download it, change it, use it and distribute it. This is called FLOSS
(Free/Libre and Open Source Software). The Python community works together toward one
goal: an ever-improving Python.
5) High-Level
Python is a high-level language, so programmers need not worry about low-level details
such as memory management.
3.2.2.1 Python:
3.2.3.2 Liclipse:
Themed scrollbars
Improved text search capabilities (with Lucene index-based searching, support
for external folders, open editors and additional filtering on results page)
HTML preview for the RST, Markdown and HTML editors
Native installers
Improved theming support based on Eclipse 4 improvements
Release Highlights for LiClipse 5.1.3
Updated PyDev to 7.0.3.
Debugger performance improvements (on Python 3.6 onwards).
Mypy can be used for doing code analysis.
Black can be used as the code formatting engine.
It's now possible to use pipenv for managing virtual environments.
It's possible to manage virtual environments from the editor (Ctrl+2,
pip/conda/pipenv).
Updated EGit.
The uses of SQL include modifying database table and index structures; adding,
updating and deleting rows of data; and retrieving subsets of information from within
a database for transaction processing and analytics applications. Queries and other
SQL operations take the form of commands written as statements. Commonly used SQL
statements include SELECT, INSERT, UPDATE, DELETE, CREATE, ALTER and TRUNCATE.
customer name or address -- while each row contains a data value for the intersecting
column.
SQL commands are divided into several different types, among them data
manipulation language (DML) and data definition language (DDL) statements,
transaction controls and security measures. The DML vocabulary is used to retrieve
and manipulate data, while DDL statements are for defining and modifying database
structures. The transaction controls help manage transaction processing, ensuring that
transactions are either completed or rolled back if errors or problems occur. The
security statements are used to control database access as well as to create user roles
and permissions.
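The statement types described above can be tried out with a short sketch. To keep the
example self-contained, an in-memory SQLite database from the Python standard library
is used here in place of the MySQL back end; this substitution, the table name urls and
its columns are assumptions made for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL back end,
# and the table/column names below are hypothetical.
conn = sqlite3.connect(":memory:")

# DDL: define the table structure.
conn.execute("CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT NOT NULL, label TEXT)")

# DML inside a transaction: committed together when the block succeeds.
with conn:
    conn.execute("INSERT INTO urls (url, label) VALUES (?, ?)",
                 ("http://www.attack.com/paypal", "phishing"))
    conn.execute("UPDATE urls SET label = ? WHERE id = ?", ("phishing", 1))

# Transaction control: an error inside the block rolls the changes back.
try:
    with conn:
        conn.execute("INSERT INTO urls (url, label) VALUES (?, ?)",
                     ("http://example.com/", "legitimate"))
        conn.execute("INSERT INTO urls (url, label) VALUES (?, ?)", (None, "legitimate"))
except sqlite3.IntegrityError:
    pass  # NOT NULL violated, so the first insert in this block is undone too

# DML: retrieve the surviving rows.
rows = conn.execute("SELECT url, label FROM urls").fetchall()
print(rows)
```

Only the row from the first transaction remains, because the failed second transaction
was rolled back as a whole. The security statements (for example GRANT and REVOKE) are
not shown, since SQLite does not support them; they would apply to a server such as
MySQL.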
Machine Learning is highly effective and much better than traditional phishing
detection methods such as blacklisting and filtering, but it comes with its own set of
challenges.
CHAPTER 4
SYSTEM DESIGN
4.1 INTRODUCTION
Phishing is a technique that combines technical tactics and social engineering to lure
unsuspecting people into revealing personal and valuable data. Phishers have multiple
methods at their disposal to steal sensitive information. One form of phishing involves
creating replicas of real websites, so that users are led to fraudulent websites where
they unknowingly give away confidential details such as ATM card numbers, PINs and
other important data. Phishers also create spoofed e-mails that pretend to come from
legitimate corporations. Recipients who believe these e-mails are genuine act on their
contents, which slyly ask for information such as usernames, user IDs and passwords for
accounts commonly held on social media and e-commerce websites, among others. Such
e-mails also lure people into phony schemes.
Fig 4.1: System Architecture
4.2.1 Description
Traditional phishing detection techniques such as blacklisting and content filtering
have multiple drawbacks and are not effective enough against the phishing techniques
attackers use today. Filtering techniques such as Bayesian content filtering can be
easily bypassed through "Bayesian poisoning", which circumvents the filtering process.
Blacklisting produces many incorrect results because the data available to such lists
is limited and can never be exhaustive. It is therefore crucial to bring machine
learning into the picture to improve phishing detection, and the decision tree
algorithm is chosen as it is expected to give better results.
The key is then to choose the features intelligently. To achieve this, the algorithm
uses two equations that provide values for two parameters: the gain score and the
entropy. This approach is called the information gain method.
4.3 SYSTEM REQUIREMENTS
Hardware Requirements
Processor: Intel Core i3 or above
Speed: 1.50 GHz
Memory: 2 GB RAM
Hard Disk Drive: 200 GB
Software Requirements
Development Platform: Windows 10
Coding Language: Python, SQL
Tools: LiClipse
Back End: MySQL
4.4 SUMMARY
Traditional phishing detection techniques such as blacklisting and content filtering
have multiple drawbacks and are not effective enough against the phishing techniques
attackers use today. Filtering techniques such as Bayesian content filtering can be
easily bypassed through "Bayesian poisoning", which circumvents the filtering process.
Blacklisting produces many incorrect results because the data available to such lists
is limited and can never be exhaustive. It is therefore crucial to bring machine
learning into the picture to improve phishing detection, and the decision tree
algorithm is chosen as it is expected to give better results.
The key is then to choose the features intelligently. To achieve this, the algorithm
uses two equations that provide values for two parameters: the gain score and the
entropy. This approach is called the information gain method.
Fig 4.2 : Gain Score Formula
CHAPTER 5
MODULE DESCRIPTION
5.1 INTRODUCTION
The project is divided into different modules according to their function.
List of modules
The modules are listed below:
Primary Domain
Sub Domain
Path Domain
Page Rank
Alexa Reputation
Google Index
5.2 PRIMARY DOMAIN
Phishers cannot use the original primary domain, since it is already registered by the
original company. Hence, phishers register misspelled or similar-looking primary
domains for their phishing sites to trick users.
5.3 SUB DOMAIN
Phishers often prepend the domain of a legitimate website to their own domain as a
subdomain. For instance, phishers prepend the subdomain "paypal.com" to some other
domain (e.g., one ending in ".io" or ".business"), which may trick users into trusting
the phishing URL.
5.4 PATH DOMAIN
This is a sub-folder of the URL. Phishers can also use the path domain to trick users.
For instance, phishers may direct users to the URL www.attack.com/paypal, where the
interface of the phishing site is similar to the original one. A careless user will
think that this URL belongs to the "paypal.com" site. On mobile phones with small
screens in particular, it may be too difficult to recognize such phishing URLs.
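The three URL parts described above can be separated with the Python standard library,
as in the following sketch. The function names and dictionary keys are hypothetical,
and treating the last two host labels as the primary domain is a simplifying
assumption that does not hold for suffixes such as "co.uk".

```python
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Split a URL into the three parts examined by the modules above."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").lower().split(".")
    # Simplifying assumption: the primary domain is the last two labels.
    # Suffixes such as "co.uk" would need a public-suffix list instead.
    return {
        "primary_domain": ".".join(labels[-2:]),
        "sub_domain": ".".join(labels[:-2]),
        "path_domain": parts.path.strip("/"),
    }


def mentions_brand_outside_primary(url: str, brand: str) -> bool:
    """True if `brand` appears in the subdomain or path but not in the primary domain."""
    parts = split_url(url)
    brand = brand.lower()
    in_primary = brand in parts["primary_domain"]
    elsewhere = brand in parts["sub_domain"] or brand in parts["path_domain"].lower()
    return elsewhere and not in_primary


print(split_url("http://www.attack.com/paypal"))
print(mentions_brand_outside_primary("http://www.attack.com/paypal", "paypal"))  # True
```

A brand name that shows up only in the subdomain or the path, as in
www.attack.com/paypal, is exactly the pattern the Sub Domain and Path Domain modules
look for.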
5.5 PAGE RANK
The Google search engine uses a link analysis algorithm to compute PageRank values.
Most phishing web pages have a low PageRank, because these sites exist only for a short
time.
5.7 GOOGLE INDEX
The Google index lists the genuine websites that Google has visited. Google frequently
updates this index for its search engine. The Google Index values of phishing sites are
much smaller than those of legitimate sites.
5.8 SUMMARY
Thus these are the various modules present in our project implementation.
CHAPTER 6
SYSTEM IMPLEMENTATION
6.1 INTRODUCTION
In this chapter implementation of the system is described in detail.
6.5 SUMMARY
This chapter has presented the details of the simulation and implementation of the
project, along with sample code for phishing scam detection and a screenshot of every
module. The proposed system has thus been executed successfully.
CHAPTER 7