The Use of Honeypot in Machine Learning Based On Malware Detection: A Review

Conference Paper · December 2020

DOI: 10.1109/CITSM50537.2020.9268794
2 authors:
Iik Muhamad Malik Matin Budi Rahardjo

Bandung Institute of Technology Bandung Institute of Technology
The Use of Honeypot in Machine Learning Based
on Malware Detection: A Review
1st Iik Muhamad Malik Matin
School of Electrical Engineering and Informatics
Bandung Institute of Technology
Bandung, Indonesia
iikmuhamadmalikmatin@students.itb.ac.id

2nd Budi Rahardjo
School of Electrical Engineering and Informatics
Bandung Institute of Technology
Bandung, Indonesia
br@paume.itb.ac.id

Abstract—The sharp increase in the spread of malware has made signature-matching approaches and heuristic methods no longer suitable for malware analysis. Recently, many researchers have proposed machine learning as an alternative. Machine learning is considered a more effective and efficient approach to detecting malware than conventional approaches. At the same time, researchers have proposed the honeypot as a device capable of gathering malware information. A honeypot is designed as a malware trap and is placed on a dedicated system, where it records events in order to detect and gather information about an attacker's activities and identity. This paper aims to investigate the use of honeypots in machine learning to detect malware. The Systematic Literature Review (SLR) method was used to identify 684 papers in the IEEE Xplore database and the ACM Digital Library based on automatic searches with predefined strings. Eleven papers were then selected for investigation based on inclusion and exclusion criteria. From the literature, it can be concluded that the use of honeypots in machine learning-based malware detection increased from 2017 to 2019. Most researchers use available honeypot datasets. Meanwhile, based on the type of malware analyzed, honeypots in machine learning are mostly used to collect IoT-based malware.

Keywords—SLR, honeypot, machine learning, malware, survey

I. INTRODUCTION

Malware is one of the fastest-growing security threats of recent years. Malware is specifically designed to attack systems in order to steal sensitive and confidential information. Malware can also be designed to disrupt running systems and create digital chaos [1][2].

Kaspersky Lab reported that throughout 2016 it detected more than 1,966,324 malware warnings involving attempts to steal money through online access to bank accounts. In addition, 753,684 different computer users were infected with ransomware, of which 179,209 computers were encrypted by a ransomware program. Kaspersky's antivirus solution also detected 121,262,075 different malicious objects such as scripts, exploits, executable files, and others. This means that 34.2% of computer users experienced at least one web attack [3].

High levels of computer malware (viruses, worms, trojan horses, rootkits, botnets, backdoors, and other malicious software) and the day-to-day increase in malware spread have caused conventional signature-based approaches to fail to detect new polymorphic malware that did not previously appear dangerous. Manual heuristic checking and static malware analysis are also no longer effective and efficient given the high rate of malware spread [4]. Several other approaches have been proposed by many researchers to detect malware, including machine learning.

Machine learning can be quite effective and efficient for malware detection [4], and machine learning-based antimalware software is an effective method to use [5]. Moreover, machine learning is more effective than the signature analysis-based approach [6]. Machine learning identifies the presence of malware by classifying samples into classes that have been defined in a dataset. The class assigned to each sample in a dataset is called a label, which takes one of two values, namely benign and malware [7].

The challenge of machine learning for detecting malware is its dependence on data collection. When a machine learning model receives very little training data, it has difficulty detecting new malware [8].

At the same time, researchers have proposed the honeypot as a device capable of gathering information on malware activity [9][10][11]. A honeypot is a system that is deliberately left as bait to lure potential attackers so that they stay away from critical systems. A honeypot is designed to divert attention from critical systems, gather information on attacker activity, and encourage attackers to stay on the system [12].

A honeypot can help machine learning update its training data so that it can provide better accuracy. However, the application of honeypots does not follow a clear scheme, so implementation becomes difficult and confusing, especially when adapting a honeypot to one's needs in developing a machine learning model. For this reason, the relevant literature needs to be investigated further to identify trends and provide direction for the future.

This study conducted a literature review using the systematic literature review (SLR) method to identify honeypot trends. By using SLR, researchers can find solutions by reviewing relevant previous research [13]. This study aims to understand the trends, techniques, and types of malware handled by honeypots. It analyzes articles from the IEEE Xplore and ACM literature databases published from 2010 to 2020.

A survey of honeypot trends was conducted by Campbell et al. [14], which identifies honeypot research trends and research opportunities in the honeypot area. A survey was also conducted by Bringer et al. [15], which presents the latest advances in honeypot research. Machine learning is currently a hot topic, yet we found no survey covering both honeypots and machine learning. For that reason, we are interested in conducting a survey on the use of honeypots in machine learning.
The 8th International Conference on Cyber and IT Service Management (CITSM 2020)
On Virtual, October 23-24, 2020
This paper is divided into five sections as follows: in Section I, we explain the background and purpose of this paper; in Section II, we discuss the research methodology used; in Section III, we discuss the results of the literature study; in Section IV, we discuss the threats to validity; and finally, in Section V, we summarize the results of the discussion.

II. RESEARCH METHODOLOGY

In this section, we describe the method and design of the Systematic Literature Review that we carried out. A Systematic Literature Review is a way of evaluating and interpreting all available research that is relevant to a particular research question, topic area, or phenomenon of interest. A systematic review aims to present a fair evaluation of a research topic using a methodology that is trustworthy, rigorous, and auditable. There are at least three main reasons for conducting SLRs: first, to investigate and synthesize existing research on a topic; second, to identify gaps in previous research; and third, to provide a strong foundation for investigating new research topics [17].

The systematic literature review process consists of three stages: planning the review, conducting the review, and documenting the review [16]. The steps of the systematic literature review method are summarized in Fig. 1.

Fig. 1. Systematic Literature Review Steps [18] [19]

Based on this outline, we follow a three-step review process, namely planning the review, conducting the review, and documenting the review.

A. Planning Review

In the first stage, the planning steps consist of: (i) identifying the needs; (ii) specifying the research question; (iii) developing and validating the review protocol.

1) Identifying the needs: The need for a systematic literature review has been explained in the introduction. In addition, a systematic literature review is needed to identify, classify, and compare research on the needs and use of honeypots in machine learning for detecting malware. This paper aims to identify related research papers. To show that similar reviews have not been reported, we searched two digital databases, namely IEEE Xplore and the ACM Digital Library.

2) Specifying the research question: Determining research questions helps researchers define and evaluate literature review protocols [20]. The research questions were determined based on the scope of the systematic literature review. The scope is determined so that the study remains focused on the topic discussed. The scope is based on the Population, Intervention, Comparison, Outcomes, and Context (PICOC) criteria [21], as shown in Table I.

TABLE I. SLR SCOPE

Criteria      RQ1    RQ2         RQ3
Population    Trend  Techniques  Field
Intervention  Identification, internal/external validation, data extraction, and data synthesis
Comparison    A comparison study of honeypots in machine learning for malware detection
Outcomes      Usability of honeypots in machine learning techniques for malware detection
Context       Study in academic research

Based on the PICOC formulation, the research questions are determined as follows:

RQ1 What is the research trend on the use of honeypots in machine learning-based malware detection?
RQ2 What honeypot techniques do researchers use in machine learning-based malware detection?
RQ3 What types of malware are handled by honeypots in machine learning-based malware detection?

3) Developing and validating review protocol: To reduce researcher bias, we define a systematic literature review protocol. The protocol covers defining the scope, the search strategy, and the PICOC criteria, as well as defining the inclusion and exclusion criteria. The protocol was validated in consultation with experts, who criticized and evaluated it. We improved the protocol based on expert feedback.

B. Conducting Review

In the second stage, the steps for conducting the review consist of: (i) selecting studies; (ii) extracting required data; (iii) synthesizing data.

1) Selecting Studies: To search for papers relevant to the topics we cover, we define search strings by identifying keywords based on the topic area and related fields. We determined the keywords "Honeypot," "Malware," "Machine learning," and "Detection" as the chosen topic area. In addition, we determine variants of the topic-area keywords. We also use the operators AND and OR, the asterisk *, and quotation marks ". Fig. 2 displays the string construction used for searching.

To get a comprehensive set of papers to investigate, we conducted a search on digital library database sources that were relevant to the topic that we had determined. For this reason, we determine the digital library list as follows:

1) IEEE Xplore (ieeexplore.ieee.org)
2) ACM Digital Library (dl.acm.org)

IEEE Xplore is a library that provides scientific research results in the form of indexes, abstracts, and full texts for articles and papers in computer science and electrical engineering, while the ACM Digital Library is a library consisting of journals, magazines, and proceedings that are often used by researchers and experts in the field of computer and information technology.

Fig. 2. String Construction

2) Extracting Required Data: We execute the strings in the search fields of the IEEE Xplore digital library and the ACM Digital Library. We carry out the search procedure in the following stages:

1. The string is entered in the automatic search field of IEEE Xplore and the ACM Digital Library
2. A manual search is performed on journal papers, seminars, and proceedings
3. The publication period of the papers is from February 2010 to February 2020

We defined inclusion and exclusion criteria to ensure that only relevant papers were accepted into the SLR [22]. The inclusion and exclusion criteria are as follows:

1. Inclusion Criteria:
• Journal, conference, and proceedings papers
• The publication period of the paper is between 2010 and 2020
• Papers taken are only in the form of journals, seminars, and proceedings
• In case of duplication, only the most complete or the most recent is included
• The digital databases used are only IEEE Xplore and the ACM Digital Library

2. Exclusion Criteria:
• Books, theses, book titles
• Duplicate papers
• Papers not written in English
• Incomplete papers

Next, we filter the papers against the specified criteria. The filter stages are based on the title, abstract, official access, and full-text review.

3) Synthesizing Data: Based on the results of the keyword execution, we got 17 papers in the IEEE Xplore digital library and 667 papers in the ACM Digital Library, so the total number of papers obtained from the keyword search was 684. Then we apply the inclusion and exclusion criteria. We only choose papers that meet the criteria and can be used as the SLR's main references. Applying the inclusion and exclusion criteria left 14 papers in the IEEE Xplore digital library and 268 papers in the ACM Digital Library. Finally, we select papers according to the topic by scanning the abstracts and the full text of the papers that passed the inclusion and exclusion criteria. From the selection process, seven papers were selected from the IEEE Xplore digital library and four papers from the ACM Digital Library, so the number of papers investigated was 11. Table II displays the number of papers remaining after each step of the selection process.

TABLE II. NUMBER OF PAPERS BASED ON THE SELECTION PROCESS

Digital Library      Automatic Search  Select by Year  Journal & Conference  Abstract Scanning  Full-Text Scanning
IEEE Xplore          17                17              14                    11                 7
ACM Digital Library  667               326             268                   12                 4
Total                684               343             282                   23                 11

The selected articles are arranged and synthesized based on three research question domains, namely research trends, techniques, and types of malware. The results of the analysis are then presented using visualization diagrams in Section III.

C. Documenting Review

In the third stage, the steps of the Documenting Review consist of (i) Document Observation; (ii) Threat Analysis; (iii) Result Description.

1) Document Observation: At this stage, we make observations from the selected articles. To answer the research questions, we classify the selected articles based on the specified research question topic. The research questions cover three topics, namely trends, techniques, and field.

2) Threat Analysis: In general, the results of a systematic literature review are reliable [23]. However, every systematic literature review has potential limitations [24]. For this reason, we discuss threat analysis in Section IV.

3) Result Description: In the final stage, we report the SLR findings. The results are explained in Section III.

III. RESULT

A. Q1: Trend Research

Based on the search results, research on the use of honeypots in machine learning-based malware detection comprised one paper each in 2012, 2014, and 2018, and two papers in 2020. The number of published studies increased significantly in 2019, with six papers. Fig. 3 charts the publication trend by year of publication.
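As a cross-check, the counts reported in Table II and in the trend paragraph above can be tallied with a short script. The following is a minimal Python sketch and is not part of the original study; the names funnel, papers_per_year, column_totals, and peak_year are illustrative assumptions, and all numbers are copied from the text.

```python
# Selection funnel copied from Table II (papers remaining after each step).
STEPS = [
    "Automatic search",
    "Select by year",
    "Journal & conference",
    "Abstract scanning",
    "Full-text scanning",
]
funnel = {
    "IEEE Xplore": [17, 17, 14, 11, 7],
    "ACM Digital Library": [667, 326, 268, 12, 4],
}
reported_totals = [684, 343, 282, 23, 11]

# Selected papers per publication year, as stated in Section III.A.
papers_per_year = {2012: 1, 2014: 1, 2018: 1, 2019: 6, 2020: 2}


def column_totals(rows):
    """Sum the per-library counts for each selection step."""
    return [sum(step_counts) for step_counts in zip(*rows.values())]


def peak_year(counts):
    """Return the year with the most selected papers."""
    return max(counts, key=counts.get)


if __name__ == "__main__":
    totals = column_totals(funnel)
    for step, total, reported in zip(STEPS, totals, reported_totals):
        status = "ok" if total == reported else "MISMATCH"
        print(f"{step:<22} {total:>4} (reported {reported}) {status}")

    # Text rendering of the per-year distribution (years without papers show 0).
    for year in range(2012, 2021):
        count = papers_per_year.get(year, 0)
        print(f"{year} {'#' * count} {count}")

    print("Total selected:", sum(papers_per_year.values()))
    print("Peak year:", peak_year(papers_per_year))
```

Under these inputs, the per-step sums agree with the totals row of Table II, and the per-year counts add up to the 11 selected papers, with 2019 as the peak year.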
[Bar chart: number of publications per year, 2012 to 2020]

Fig. 3. Paper distribution by year

B. Q2: Techniques for Using Honeypot

The purpose of a honeypot is to collect attack information and store it in a log for investigative purposes. Information gathering on a honeypot is essential to supply the data that machine learning needs to train models.

TABLE III. PAPERS BASED ON TECHNIQUES

Techniques                    References
Virtual honeypot              [25], [26], [27], [28]
Open-source honeypot dataset  [29], [30], [31], [32], [33], [34], [35]

In practice, several researchers collect malware directly using a virtual honeypot. Besides this direct approach, other researchers use honeypot datasets from available sources. In this way, researchers can use the data directly without having to build a honeypot infrastructure. Table III presents the techniques for using honeypots together with the corresponding literature.

1) Virtual Honeypot: The honeypot virtualization technique is used by [25], [26], [27], and [28]. A virtual honeypot is a trap system built using virtualization technology so that the honeypot can run on a single machine. A virtual honeypot can respond to network traffic by simulating it within one system. A honeypot can also be described as a simulation engine that models system behavior [36]. Virtual honeypots are more commonly used because they are easily implemented on a system using VMware, User Mode Linux, or Microsoft Virtual PC [37]. The advantage of a virtualized honeypot is that it is easy to isolate and repair. In addition, a virtual honeypot can mimic several systems on one machine [38].

2) Open-source Honeypot Dataset: Abbasi et al. [29] use the Nebula dataset provided by Tillmann Werner, who manually labeled and analyzed it [39]. This dataset consists of 6631 unique flows, comprising dangerous flows and benign but suspicious flows. The dataset was collected from two honeytrap honeypot instances in the last quarter of 2007. The data consists of 28 main label categories and a total of 55 labels, including all subcategories. Dangerous flows range from buffer overflows to exploits, shellcodes, worms, and their variants. Benign but suspicious flows include HTTP GET, POST, and OPTIONS requests, FTP and TFTP requests, download requests, and Oracle and MySQL database connection requests.

Other studies, [30], [31], [32], [33], [34], and [35], use a dataset to detect malware on Internet of Things platforms. The IoTPOT dataset comes from an IoT-based honeypot project at Yokohama National University. IoTPOT collects botnet samples through emulated Telnet services of various IoT devices so that the attacks can be analyzed in depth. IoTPOT also automatically performs active scans of the IP addresses involved in attacks to obtain their banner profiles. IoTPOT is designed as a combination of a low-interaction front-end responder and a high-interaction back-end virtual environment known as IoTBOX, which acts as a malware sandbox. IoTBOX operates a variety of virtual environments commonly used by embedded systems for different CPU architectures. Fig. 4 gives an overview of the design of IoTPOT [40].

Fig. 4. Overview of IoTPOT design [40]

C. Q3: Type of Malware

Malware can be categorized by its form, which determines what it targets. From a machine learning perspective, different types of malware can affect the features used, and these features largely determine the model's performance. Based on the results of the study, there are three types of malware, namely Windows Portable Executable malware, Internet of Things malware, and malware streams. Table IV lists the types of malware analyzed in the period 2010 to 2020.

TABLE IV. PAPERS BASED ON TYPE OF MALWARE

Type of malware           References
Portable Executable (PE)  [25]
Malware streams           [29], [27]
Internet of Things (IoT)  [26], [28], [30], [34], [31]

1) Portable Executable: A Portable Executable (PE) file is a file format that can be executed on Windows operating systems, such as .exe, COFF, and others, in either 32-bit or 64-bit form. The features used in machine learning are drawn from the DOS header, the PE header, and the optional header. The DOS header provides compatibility information. The PE file header contains many important fields that indicate which machine should run the PE file, the number of sections in the file, the size of the optional header that starts immediately after the PE file header, the number of symbols, and the compile timestamp. Finally, the optional header contains many important fields such as the entry point address, OS version, image base, base of data, image version, subsystem version, etc. The optional header defines the logical structure of the PE file and is probably the most important header in a PE file [41].

2) Data stream: A data stream is a network-based data format. In a honeypot data stream, the honeypot records network traffic and stores information such as the IP address, the protocol used, the service used, the number of data packets, the packet destination, and others.

3) Internet of Things: IoT-based malware is malware that attacks the CPU architectures used in IoT devices, such as ARM, M68K, SPARC, MIPS, PPC, SH4, and many more. By family, the malware consists of ZORRO, GAYFGT, nttpd, KOS, and *.sh [40].

IV. THREATS TO VALIDITY

One of the most important elements in implementing this SLR is identifying and reducing threats to validity [42]. We have considered threats to validity to ensure the accuracy and acceptance of the results of this systematic review. We try to reduce and provide solutions to threats to validity based on four threat factors.

1) Construct Validity: To obtain relevant studies, we developed an appropriate search construction. We select appropriate studies to avoid bias. The repositories that we use are databases related to the domain of computer science. We experimented with the search string several times using different keywords and then formulated the full strings needed to reach related papers. In addition to the automatic search, we perform a manual search by following related citations to ensure that no studies are missed.

2) Conclusion Validity: To reach the appropriate literature, we repeated the procedures several times to ensure that other researchers could replicate the same results. We try to reduce the possibility of relevant studies being missed.

3) Internal Validity: We use Microsoft Excel as a tool to extract information from the selected papers. This is done to avoid inaccuracies during the data analysis process.

4) External Validity: To ensure that the findings can be generalized, we divide the findings into three classifications so that they do not go beyond the topics discussed.

V. CONCLUSION AND FUTURE WORK

In this paper, we conducted a literature study on the use of honeypots in machine learning. We analyzed 11 proceedings, conference, and journal papers published in the 2010-2020 period from the IEEE Xplore database and the ACM Digital Library. We sorted the papers by applying exclusion and inclusion criteria. The review shows that research on the use of honeypots to detect malware fluctuated from 2012 to 2020, with a significant increase in 2019. Two techniques are used. In the first, a virtual honeypot is installed to obtain information, and a model is then built from the information obtained. In the second, which Table III shows to be the more common, an already available honeypot-based dataset is used, so researchers do not need to install a honeypot and can build models directly from the available dataset. The use of honeypots in machine learning to detect malware has been studied mostly for Internet of Things (IoT) malware, while other studies are based on data streams and Windows malware.

A honeypot is a system that can collect malware information that is useful as machine learning training data, giving researchers many choices in developing machine learning models. Researchers sometimes need guidance in choosing a honeypot as the source of their dataset. For this reason, further development is needed to demonstrate the ability of honeypot datasets to improve the performance of models built with different techniques.

REFERENCES

[1] C. Vatamanu, D. Cosovan, D. Gavrilu, and H. Luchian, "A Comparative Study of Malware Detection Techniques Using Machine Learning Methods," Int. J. Comput. Electr. Autom. Control Inf. Eng., vol. 9, no. 5, pp. 1115–1122, 2015.
[2] T. Mithal, K. Shah, and D. K. Singh, "Case Studies on Intelligent Approaches for Static Malware Analysis," Emerg. Res. Comput. Information, Commun. Appl., pp. 555–567, 2016, DOI: 10.1007/978-981-10-0287-8.
[3] Kaspersky Lab, "Kaspersky Lab's Cyber Security report," 2016.
[4] I. Firdausi, C. Lim, A. Erwin, and A. S. Nugroho, "Analysis of machine learning techniques used in behavior-based malware detection," Proc. - 2010 2nd Int. Conf. Adv. Comput. Control Telecommun. Technol. ACT 2010, pp. 201–203, 2010, DOI: 10.1109/ACT.2010.33.
[5] Z. Markel and M. Bilzor, "Building a machine learning classifier for malware detection," WATeR 2014 - Proc. 2014 2nd Work. Anti-Malware Test. Res., 2015, DOI: 10.1109/WATeR.2014.7015757.
[6] M. G. Schultz, E. Eskin, E. Zadok, and S. J. Stolfo, "Data mining methods for detection of new malicious executables," Proc. IEEE Comput. Soc. Symp. Res. Secur. Priv., pp. 38–49, 2001, DOI: 10.1109/secpri.2001.924286.
[7] U. Pehlivan, N. Baltaci, C. Acarturk, and N. Baykal, "The analysis of feature selection methods and classification algorithms in permission-based Android malware detection," IEEE SSCI 2014 2014 IEEE Symp. Ser. Comput. Intell. - CICS 2014, 2014. IEEE Symp. Comput. Intell. Cyber Secur. Proc., pp. 1–8, 2014, DOI: 10.1109/CICYBS.2014.7013371.
[8] Y. Roh, G. Heo, and S. E. Whang, "A Survey on Data Collection for Machine Learning: A Big Data - AI Integration Perspective," IEEE Trans. Knowl. Data Eng., vol. 4347, no. c, pp. 1–1, 2019, DOI: 10.1109/tkde.2019.2946162.
[9] K. Saikawa and V. Klyuev, "Detection and Classification of Malicious Access using a Dionaea Honeypot," Proc. 2019 10th IEEE Int. Conf. Intell. Data Acquis. Adv. Comput. Syst. Technol. Appl. IDAACS 2019, vol. 2, pp. 844–848, 2019, DOI: 10.1109/IDAACS.2019.8924340.
[10] P. D. Ali and T. Gireesh Kumar, "Malware capturing and detection in dionaea honeypot," 2017 Innov. Power Adv. Comput. Technol. i-PACT 2017, vol. 2017-Janua, pp. 1–5, 2017, DOI: 10.1109/IPACT.2017.8245158.
[11] V. Sethia and A. Jeyasekar, "Malware capturing and analysis using dionaea honeypot," Proc. - Int. Carnahan Conf. Secur. Technol., vol. 2019-October, pp. 0–3, 2019, DOI: 10.1109/CCST.2019.8888409.
[12] W. Stallings, Computer Security Third Edition, 3rd ed. United States of America: Pearson Education, 2015.
[13] Miswar, Suhardi, and N. B. Kurniawan, "A Systematic Literature Review on Survey Data Collection System," 2018 Int. Conf. Inf. Technol. Syst. Innov. ICITSI 2018 - Proc., pp. 177–181, 2018, DOI: 10.1109/ICITSI.2018.8696036.
[14] R. M. Campbell, K. Padayachee, and T. Masombuka, "A survey of honeypot research: Trends and opportunities," 2015 10th Int. Conf. Internet Technol. Secur. Trans. ICITST 2015, pp. 208–212, 2016, DOI: 10.1109/ICITST.2015.7412090.
[15] M. L. Bringer, C. A. Chelmecki, and H. Fujinoki, "A Survey: Recent Advances and Future Trends in Honeypot Research," Int. J. Comput. Netw. Inf. Secur., vol. 4, no. 10, pp. 63–75, 2012, DOI: 10.5815/ijcnis.2012.10.07.
[16] B. Kitchenham and P. Brereton, "A systematic review of systematic review process research in software engineering," Inf. Softw. Technol., vol. 55, no. 12, pp. 2049–2075, 2013, DOI: 10.1016/j.infsof.2013.07.010.
[17] B. Kitchenham and S. Charters, "Guidelines for performing Systematic Literature Reviews in Software Engineering," 2007.
[18] C. Jatoth, G. R. Gangadharan, and R. Buyya, "Computational Intelligence Based QoS-Aware Web Service Composition: A Systematic Literature Review," IEEE Trans. Serv. Comput., vol. 10, no. 3, pp. 475–492, 2017, DOI: 10.1109/TSC.2015.2473840.
[19] B. Kitchenham, "Procedures for Performing Systematic Literature Reviews," 2004.
[20] P. Jamshidi, A. Ahmad, and C. Pahl, "Cloud Migration Research: A Systematic Review," vol. 1, no. 2, pp. 142–157, 2013.
[21] M. Petticrew and H. Roberts, Systematic Reviews in the Social Sciences, a Practical Guide, First Edit., vol. 64, no. 4. Victoria: Blackwell Publisher, 2006.
[22] M. K. Najafabadi and M. N. Mahrin, "A systematic literature review on the state of research and practice of collaborative filtering technique and implicit feedback," Artif. Intell. Rev., vol. 45, no. 2, pp. 167–201, 2016, DOI: 10.1007/s10462-015-9443-9.
[23] H. Zhang and M. Ali, "Systematic reviews in software engineering: An empirical investigation," Inf. Softw. Technol., vol. 55, no. 7, pp. 1341–1354, 2013, DOI: 10.1016/j.infsof.2012.09.008.
[24] P. Brereton, B. A. Kitchenham, D. Budgen, M. Turner, and M. Khalil, "Lessons from applying the systematic literature review process within the software engineering domain," J. Syst. Softw., vol. 80, no. 4, pp. 571–583, 2007, DOI: 10.1016/j.jss.2006.07.009.
[25] I. M. M. Matin and B. Rahardjo, "Malware Detection Using Honeypot and Machine Learning," in The 7th International Conference on Cyber and IT Service Management (CITSM 2019), 2019, pp. 1–4, DOI: 10.1109/citsm47753.2019.8965419.
[26] R. Vishwakarma and A. K. Jain, "A honeypot with machine learning-based detection framework for defending IoT based botnet DDoS attacks," in Proceedings of the International Conference on Trends in Electronics and Informatics, ICOEI 2019, 2019, no. Icoei, pp. 1019–1024, DOI: 10.1109/ICOEI.2019.8862720.
[27] F. Haltas, E. Uzun, N. Siseci, A. Posul, and B. Emre, "An automated bot detection system through honeypots for large-scale," Int. Conf. Cyber Conflict, CYCON, vol. 2014, pp. 255–270, 2014, DOI: 10.1109/CYCON.2014.6916407.
[28] O. P. Dwyer, A. K. Marnerides, V. Giotsas, and T. Mursch, "Profiling IoT-based botnet traffic using DNS," 2019 IEEE Glob. Commun. Conf. GLOBECOM 2019 - Proc., pp. 1–6, 2019, DOI: 10.1109/GLOBECOM38437.2019.9014300.
[29] F. H. Abbasi, R. J. Harris, G. Moretti, A. Haider, and N. Anwar, "Classification of malicious network streams using honeynets," GLOBECOM - IEEE Glob. Telecommun. Conf., pp. 891–897, 2012, DOI: 10.1109/GLOCOM.2012.6503226.
[30] H. T. Nguyen, Q. D. Ngo, and V. H. Le, "IoT Botnet Detection Approach Based on PSI graph and DGCNN classifier," 2018 IEEE Int. Conf. Inf. Commun. Signal Process. ICICSP 2018, no. Icsp, pp. 118–122, 2018, DOI: 10.1109/ICICSP.2018.8549713.
[31] C. Tien, S. Chen, and I. Industry, "Machine Learning Framework to Analyze IoT Malware Using ELF and Opcode Features," 2020, vol. 1, no. 1, pp. 1–19.
[32] T. N. Phu, L. H. Hoang, N. N. Toan, N. D. Tho, and N. N. Binh, "CFDVex: A novel feature extraction method for detecting cross-architecture IoT malware," in ACM International Conference Proceeding Series, 2019, pp. 248–254, DOI: 10.1145/3368926.3369702.
[33] H. T. Nguyen, D. H. Nguyen, Q. D. Ngo, V. H. Tran, and V. H. Le, "Towards a rooted subgraph classifier for IoT botnet detection," ACM Int. Conf. Proceeding Ser., pp. 247–251, 2019, DOI: 10.1145/3348445.3348474.
[34] S. Alhaidari and M. Zohdy, "Hybrid learning approach of combining cluster-based partitioning and hidden Markov model for IoT intrusion detection," ACM Int. Conf. Proceeding Ser., pp. 27–31, 2019, DOI: 10.1145/3325917.3325939.
[35] M. Shobana and S. Poonkuzhali, "A novel approach to detect IoT malware by system calls using Deep learning techniques," 2020 Int. Conf. Innov. Trends Inf. Technol. ICITIIT 2020, pp. 1–5, 2020, DOI: 10.1109/ICITIIT49094.2020.9071531.
[36] N. Provos, "A virtual honeypot framework," Proc. 13th USENIX Secur. Symp., 2004.
[37] A. H. S. D. of C. S. P. Defibaugh-Chavez*, R. Veeraghattam, M. Kannappa, S. Mukkamala*, "Network Based Detection of virtual environments and low interaction honeypots," in Proceedings of the 2007 IEEE Workshop on Information Assurance, IAW, 2007, pp. 92–98, DOI: 10.1109/IAW.2007.381919.
[38] The Honeynet Project, "Know your enemy: Honeynets. What a honeynet is, its value, how it works, and risk/issues involved.," 2002.
[39] T. Werner, C. Fuchs, E. Gerhards-Padilla, and P. Martini, "Nebula - Generating syntactical network intrusion signatures," 2009 4th Int. Conf. Malicious Unwanted Software, MALWARE, 2009, pp. 31–38, 2009, DOI: 10.1109/MALWARE.2009.5403022.
[40] Y. M. P. Pa, S. Suzuki, K. Yoshioka, T. Matsumoto, T. Kasama, and C. Rossow, "IoTPOT: A novel honeypot for revealing current IoT threats," J. Inf. Process., vol. 24, no. 3, pp. 522–533, 2016, DOI: 10.2197/ipsjjip.24.522.
[41] M. S. Yousaf, M. H. Durad, and M. Ismail, "Implementation of Portable Executable File Analysis Framework (PEFAF)," Proc. 2019 16th Int. Bhurban Conf. Appl. Sci. Technol. IBCAST 2019, pp. 671–675, 2019, DOI: 10.1109/IBCAST.2019.8667202.
[42] X. Zhou, Y. Jin, H. Zhang, S. Li, and X. Huang, "A map of threats to validity of systematic literature reviews in software engineering," Proc. - Asia-Pacific Softw. Eng. Conf. APSEC, vol. 0, pp. 153–160, 2016, DOI: 10.1109/APSEC.2016.031.