FINAL REVIEW PAPER Android Dynamic Malware Analysis
FINAL REVIEW PAPER Android Dynamic Malware Analysis
Abstract:
The increasing prevalence of malware targeting Android devices has led to significant concerns regarding mobile security.
Traditional static analysis techniques, while useful, often fall short in identifying new or obfuscated threats. This study proposes
an advanced Android malware detection system based on dynamic analysis combined with machine learning. The system
monitors Android applications in a controlled environment, capturing behavioral data such as API calls, network traffic, and file
system activity to identify malicious behaviors. These features are then fed into machine learning models, including Random
Forests and Support Vector Machines (SVM), to classify applications as benign or malicious. The approach aims to enhance
detection accuracy while minimizing false positives, offering a more robust solution compared to signature-based methods. The
system's effectiveness was validated through extensive testing, demonstrating high detection rates and low false positive rates.
However, challenges related to computational overhead and feature selection were identified as areas for improvement. Future
work focuses on optimizing the system's performance and scalability, exploring additional feature extraction methods, and
integrating advanced machine learning techniques like deep learning for more adaptive detection. This research contributes to the
development of more effective Android malware detection systems, paving the way for improved mobile security solutions.
Keywords: Android malware detection, dynamic analysis, machine learning, feature extraction, mobile security
1. Introduction
The proliferation of Android devices has revolutionized the
mobile industry, making Android the most widely used
operating system globally. This surge in popularity has made
Android a prime target for malicious software, collectively
known as malware. Android's open-source nature and its
vast ecosystem of applications, available through both
official and third-party marketplaces, have inadvertently
facilitated the rapid propagation of malicious apps. As the
mobile landscape continues to grow, so does the complexity
and volume of Android malware, necessitating advanced
detection and mitigation mechanisms to secure users and
their data [1] .
Traditional malware detection techniques, such as signature-
based methods, have been widely used due to their
simplicity and efficiency. These techniques rely on
predefined signatures of known malware, enabling rapid (Figure 1: Dynamic malware analysis)
identification of threats. However, the effectiveness of these
methods diminishes as malware authors adopt advanced Dynamic malware analysis has emerged as a promising
evasion techniques, including code obfuscation, alternative to overcome the limitations of traditional
polymorphism, and dynamic payload generation. These detection methods. Unlike static analysis, which examines
sophisticated strategies allow malware to bypass signature- application code without execution, dynamic analysis
based detection, rendering traditional approaches inadequate monitors the runtime behavior of applications in a controlled
in the face of modern threats [2] . Furthermore, the ever- environment, such as a sandbox. By observing system calls,
expanding variety of Android applications exacerbates the API invocations, network activity, and file system
challenge, as manually analyzing and updating malware modifications, dynamic analysis can detect malicious
signatures for each new threat becomes impractical [3] . behaviors that static methods may overlook. For example,
detecting unauthorized data exfiltration or privilege
escalation often requires runtime observation, making
dynamic analysis a critical tool for uncovering such
activities [4] [5] .
Despite its advantages, dynamic analysis presents unique
challenges. Setting up and maintaining sandbox
environments can be resource-intensive, and the analysis
1
itself can impose significant computational overhead. malicious applications. These models learn from labeled
Additionally, some sophisticated malware samples employ datasets comprising both benign and malicious samples,
anti-analysis techniques to evade detection during dynamic enabling them to detect previously unseen malware with
analysis. These include environment-sensitive behaviors, high accuracy. This ability to generalize across different
such as executing only on specific hardware or operating types of malware makes machine learning an invaluable
system configurations, and delaying malicious actions until component of modern dynamic analysis systems [7] [8] .
after the analysis period. Such evasive techniques
necessitate robust and adaptive dynamic analysis The evaluation of such a system requires comprehensive
frameworks to ensure accurate threat identification [6] . datasets containing both benign and malicious Android
applications. Popular repositories, including Drebin,
Recent advancements in machine learning have further VirusTotal, and the Android Malware Dataset (AMD),
enhanced the capabilities of dynamic malware analysis. By provide a wealth of samples for experimentation. Using
leveraging machine learning algorithms, it is possible to these datasets, the system can be tested for its detection rate,
classify applications based on behavioral patterns rather than false positive rate, and computational efficiency. These
predefined signatures. Supervised learning models, such as metrics are crucial for assessing the practicality of the
Support Vector Machines (SVMs) and Random Forests, as system for real-world deployment, where both security and
well as deep learning techniques, have demonstrated performance are paramount considerations [11] [12] .
significant potential in distinguishing between benign and
.
2
attention was directed toward studies leveraging these Analyzing App Permissions" underscored the need for
technologies. Articles such as "Review of Deep complementary methods to enhance detection
Learning-Based Malware Detection for Android and accuracy. These insights provided a balanced view of
Windows Systems" demonstrated the adaptability of the current state of malware detection research,
DL in identifying complex malware patterns. helping to shape a nuanced understanding of its
Meanwhile, studies like "Permission-Based Android strengths and weaknesses [28] [29] .
Malware Detection System Using Genetic
Programming" explored the adaptability of ML 8. The literature collection process was iterative,
models to evolving threats. These methodologies involving multiple rounds of refinement to ensure
underscore the importance of algorithmic innovation comprehensiveness. Feedback from domain experts
in addressing the dynamic nature of malware [26] further guided the selection process, ensuring the
[27] . inclusion of impactful studies and the exclusion of
redundant or less relevant ones. Expert consultations
7. While collecting literature, emphasis was placed on also helped identify emerging trends and overlooked
identifying studies that addressed limitations and gaps areas, enriching the final dataset [30] [31] .
in existing methods. For instance, "Feature-Based
Semi-Supervised Learning Approach to Android
Malware Detection" highlighted the challenges of
using unlabeled data, while "Detection of Malware by
3. Methodology
4.1 Project Planning and Setup Data collection formed a vital aspect of the analysis, as it
involved gathering large, diverse datasets of both benign and
The project commenced with a detailed planning phase, malicious Android applications. The datasets used were
ensuring that the project objectives were well-defined and derived from well-known sources, ensuring a balanced
achievable. The primary goal was to develop a robust representation of different malware types and benign
methodology for detecting Android malware through applications. The collection process involved selecting
dynamic analysis and machine learning models. A timeline applications that could be effectively analyzed, with a focus
was established, including key milestones for system on both static and dynamic properties that are indicative of
configuration, data collection, analysis, and evaluation. The malicious behavior. The following sections describe the
first task was to select suitable secondary data sources to specifics of the data collection.
ensure a diverse range of both benign and malicious
applications. The project setup also involved defining the 4.3.1 Malware Dataset Sources
computational requirements, selecting the appropriate tools,
and outlining the overall workflow of the system. Key The malware dataset was collected from established, trusted
phases included setting up the sandbox environment for repositories that offer large-scale collections of Android
malware analysis, collecting data from reputable sources, malware samples. Notable sources included Drebin,
feature extraction, and building machine learning models to VirusTotal, and AndroZoo. These sources provided a
classify the applications as either benign or malicious. variety of Android malware samples, encompassing several
families, such as Trojans, ransomware, adware, spyware,
4.2 Sandbox and Environment Configuration and other forms of malicious software. Each dataset came
with detailed metadata, including the application's
The success of dynamic malware analysis heavily depends permissions, API calls, and other static features. These
on a controlled and isolated environment to ensure that features were essential for understanding the behavior of the
malware can be safely studied without affecting the host malware and allowed for effective classification and
system. For this purpose, a sandbox environment was comparison with benign applications. Additionally, labels
configured using a combination of Cuckoo Sandbox and associated with each malware sample, such as their type and
DroidBox. These platforms enabled controlled execution of functionality, were crucial for building a reliable machine
Android applications while capturing critical behavioral learning model. Table 1 below summarizes the malware
data. The sandbox environment was configured with virtual dataset sources and their characteristics.
machines running Android emulators to simulate a real-
world device environment, ensuring accurate malware Table 1: Malware Dataset Sources
behavior replication. Additionally, tools such as Wireshark
were configured to capture network traffic, while file system Number
Malware Features Year of
monitoring tools like Inotify were implemented to track Source of
Type Included Collection
changes in the app's file system. The environment was Samples
isolated from external networks to prevent the malware from
affecting other systems, with strict control over data logging Trojan,
Permissions,
and traffic analysis. Ransomware,
Drebin 5,000+ API Calls, 2017
Adware,
Intent Filters
4.3 Data Collection Spyware
Dynamic analysis focuses on monitoring the runtime 4.5.2 Data Preprocessing Techniques
behavior of Android applications. By executing the
Data preprocessing involved several steps to clean and
applications in a controlled environment, key behaviors such
normalize the data, ensuring that it could be used effectively
as network activity, file system modifications, and system
by machine learning algorithms. Missing values were
calls are tracked. This process was carried out for both
handled by imputation or exclusion, depending on the
malware and benign applications to identify distinguishing
context. Categorical variables, such as permissions and API
features that could aid in classification.
calls, were encoded using techniques like one-hot encoding.
4.4.1 Behavioral Monitoring and API Call Tracking Additionally, outlier detection and normalization were
4
performed to ensure that the feature space was balanced and platform was designed to automatically collect and analyze
that the model could generalize well to new, unseen data. new applications as they were introduced. Data from the
sandbox environment was fed directly into the feature
4.6 Machine Learning Model Development extraction module, which processed and normalized the data
Machine learning models were developed to classify before passing it to the machine learning models for
Android applications as either benign or malicious based on classification.
the extracted features. Various models were tested to 4.7.2 User Interface Development
determine the most effective approach for Android malware
detection. A simple yet effective user interface (UI) was developed to
allow users to easily interact with the system. The UI
provided options for uploading APK files, monitoring the
4.6.1 Model Selection progress of analysis, and displaying results in an
understandable format. The results were displayed as either
Several machine learning algorithms were considered for the a benign or malicious classification, along with a detailed
classification task, including Random Forests, Support report of the behavior analysis, feature breakdown, and any
Vector Machines (SVM), and Gradient Boosting. These relevant findings.
models were chosen based on their ability to handle high-
dimensional data, their effectiveness in classification tasks, 4.7.3 Automation
and their capacity to interpret the relationships between Automation was a key aspect of the system to ensure
features. A Random Forest classifier was chosen for its scalability and efficiency. Automated scripts were used to
high accuracy and ability to handle imbalanced datasets, continuously update datasets, run dynamic analysis on new
while SVM was used for its strong generalization properties. applications, and trigger machine learning classification.
Gradient Boosting algorithms, such as XGBoost, were also This automated pipeline allowed for rapid processing of
tested for their ability to improve model performance by large numbers of APK files, with minimal human
combining the strengths of several weak classifiers. intervention required for analysis and evaluation.
4.6.2 Training and Evaluation Metrics Table 2: System Features and Automation Details
The dataset was split into training and testing sets, using a Feature Description
70/30 ratio to ensure sufficient data for model validation.
Training was performed using cross-validation to avoid Data Collection Continuous fetching of new APK
overfitting, and various evaluation metrics, including Automation samples from repositories
accuracy, precision, recall, and F1-score, were used to
assess the model's performance. The confusion matrix was Dynamic Analysis Automated execution of apps in
also employed to visualize the classification results, Pipeline sandbox with behavioral monitoring
providing insights into the model's ability to differentiate
between benign and malicious apps. Machine Learning Automated feature extraction,
Pipeline training, and classification
4.7 System Implementation
User Interface Display results, logs, and provide
The final phase of the methodology involved the integration APK upload functionality
of the components into a fully functioning malware
detection system. This systematic methodology provided an efficient,
automated solution for the detection of Android malware,
4.7.1 Integration of Components ensuring that the system could analyze and classify
applications in real-time.
The individual components, including the data collection,
sandbox environment, feature extraction, and machine
learning models, were integrated into a single platform. The
4. Facilities Required
5.1 Hardware and Software Requirements which allows the system to run several VMs simultaneously.
A processor of this type ensures smooth execution of
For the development of a dynamic Android malware complex tasks such as behavioral analysis and malware
detection system, robust hardware and appropriate software classification. A minimum of 32 GB of RAM is
tools are fundamental. The hardware infrastructure must recommended to manage the resources required for running
support the execution of multiple virtual machines (VMs) the virtual environments and for machine learning model
for sandboxing malware, as well as handling extensive data training. Additionally, for handling large datasets, a storage
analysis for machine learning tasks. A powerful system capacity of 1 TB is essential. Solid-State Drives (SSDs) are
ensures that the sandbox environment runs smoothly, and preferred as they provide fast read and write speeds, which
the data processing workflows remain efficient, even when enhance the efficiency of both data retrieval and processing
dealing with large volumes of malware data. tasks. Furthermore, a stable and fast network connection is
The hardware requirements for the proposed system include necessary for downloading large datasets and capturing real-
a multi-core processor, ideally with at least eight cores, time network traffic during malware analysis [51}.
5
The software environment must support a range of tools engines, allowing the team to avoid the time-consuming
necessary for both static and dynamic analysis. A Linux- process of manually collecting and analyzing malware.
based operating system (e.g., Ubuntu) is ideal due to its VirusTotal’s reports provide important metadata about
compatibility with open-source analysis tools, flexibility, malware behaviors, which helps in extracting relevant
and security features. Virtualization software, such as features for machine learning model development.
VMware or VirtualBox, is required to create isolated Moreover, VirusTotal is useful for validation purposes,
environments for the execution of malware samples without ensuring that the malware samples used in training the
risking damage to the host system. For machine learning, models are indeed harmful and representative of real-world
libraries such as TensorFlow, Keras, and Scikit-learn are threats [52}.
needed to build, train, and evaluate models efficiently. In
addition, Python is essential for scripting tasks such as data
processing and feature extraction. Network traffic analysis Table 4: Software and Tools Used
tools like Wireshark are also required to monitor
communication between the malware and remote servers Tool Purpose
during analysis.
VirusTotal Collect malware samples, analyze
Table 3: Hardware Requirements reports
Component Specification Cuckoo Dynamic analysis of malware behavior
Sandbox
Processor 8-core processor (e.g., Intel i7)
Wireshark Capture and analyze network traffic
RAM 32 GB
PeStudio Static analysis of APKs
Storage 1 TB SSD
5.2.2 Cuckoo Sandbox
Network High-speed internet connection
Cuckoo Sandbox is an open-source automated malware
analysis system that enables researchers to observe the
5.2 Tools and Frameworks behavior of suspicious files in a controlled environment. The
system runs malware in a virtual machine, logs system
In the development of an effective malware detection activity, and produces detailed reports, capturing
system, various specialized tools and frameworks are information such as file system modifications, registry
essential for data collection, analysis, and model training. changes, and network activity [53}. Cuckoo is widely used
These tools help to ensure accurate analysis and high in malware research due to its versatility and the depth of
efficiency in identifying Android malware. Below, several data it provides, which includes behavioral patterns that are
tools and frameworks crucial to this project are discussed in crucial for identifying malware functionality.
detail.
5.2.1 VirusTotal
VirusTotal is an online service that analyzes files and URLs
for potential malware threats by using a range of antivirus
engines. It is particularly valuable for acquiring a wide
variety of malware samples and metadata associated with
those samples. VirusTotal provides an easy way to cross-
check files against known malware databases and obtain
detailed reports, including the permissions requested by the
malware, its behavior patterns, and the types of attacks it
may be associated with [47]. (Figure 4: Malware Analysis using Cuckoo Sandbox)
In this project, Cuckoo Sandbox is instrumental in
performing dynamic analysis of Android malware. The tool
allows malware samples to be run in an isolated virtual
environment, and it logs key behaviors like system calls, file
accesses, and network communication. This information is
valuable for creating features that can be used in machine
learning models. By automating the analysis process,
Cuckoo provides an efficient way to monitor malware
behavior in real time, which helps to collect comprehensive
data for building a robust detection system [46].
5.2.3 Wireshark
(Figure 3: VirusTotal)
Wireshark is a network protocol analyzer that captures and
For this project, VirusTotal is used to gather malware
inspects data packets transmitted over a network. It is
samples that have already been analyzed by antivirus
6
essential for capturing network traffic during the dynamic can also be helpful for inspecting Android APKs to a certain
analysis of malware samples, particularly for identifying extent. It provides insights into the file’s structure,
communication between malware and external servers [48]. dependencies, and potential indicators of malicious behavior
This can help in detecting malware that exfiltrates data or by analyzing the APK’s components [49]. PeStudio can
downloads additional malicious payloads. detect suspicious characteristics in APK files, such as
unusual permissions or packed code, which are often signs
of obfuscation or malicious intent.
(Figure 5: Wireshark)
Wireshark’s ability to display detailed information about
each network packet, including source and destination
addresses, protocols used, and the content of the (Figure 6: PeStudio)
communication, makes it an invaluable tool for
understanding how malware interacts with remote systems. In the context of this project, PeStudio is used to analyze the
For the purposes of this project, Wireshark is used to static features of APK files before they are executed in the
monitor network traffic generated by Android malware sandbox. Static analysis with PeStudio helps identify
during analysis in the sandbox. By tracking this traffic, we potentially harmful characteristics in the APK that can be
can detect suspicious behavior, such as data exfiltration or used as early indicators for classification, such as suspicious
contact with known malicious IP addresses, which can be permissions or external libraries used for malicious
used as features for machine learning-based classification purposes. The tool complements dynamic analysis by
systems [50]. providing a more comprehensive view of the APK’s
structure, enabling the system to detect malware with higher
5.2.4 PeStudio accuracy. Additionally, it can assist in filtering out benign
applications, ensuring that only malicious samples are
PeStudio is a static analysis tool primarily used for processed for dynamic analysis [54}.
inspecting executable files in Windows environments, but it
it is identified [56].
To assess real-time efficiency, the system's latency—the
.
malware
d as
classifie
ly
incorrect
ons
applicati
benign
of
Number
6. Discussion
7.1 Findings from Analysis dynamic analysis, when combined with API call tracking
and network traffic analysis, significantly improves the
The analysis conducted on Android malware detection detection rate compared to static analysis alone. Dynamic
systems using a combination of dynamic analysis, analysis allows for a deeper inspection of the malware's
machine learning models, and behavioral monitoring behavior during execution, which is essential for detecting
reveals several key findings. First, it was found that evasive threats that might not be visible through static
8
inspection methods. Additionally, using machine learning polymorphic malware, which often modifies its behavior
algorithms, such as Random Forests and Support Vector to avoid detection.
Machines (SVM), helped in accurately classifying both
benign and malicious applications with relatively low However, compared to more established solutions like
computational overhead, thereby providing a balance Google's Play Protect or commercial malware detection
between detection accuracy and system performance. systems, the approach developed in this study still faces
challenges related to performance. While the detection
One critical finding was the importance of feature rate was high, the system's computational overhead was
selection. When relevant features such as API call sometimes higher, especially when handling large datasets
patterns, network traffic behavior, and file system with complex features. Commercial solutions often
activities were prioritized, the system achieved a higher employ cloud-based processing to mitigate this issue, but
detection rate and fewer false positives. This shows that this study's approach was limited to local resources.
the careful selection of features, rather than using all Future improvements could involve optimizing the system
available data, is essential in optimizing both the accuracy to reduce resource usage and adopting cloud-based
and speed of malware detection systems. The analysis also solutions to scale better in real-world scenarios.
highlighted the challenges posed by sophisticated malware
that can evade traditional detection methods through Another difference is the false positive rate. While the
polymorphism and obfuscation techniques, making it system demonstrated a low false positive rate, commercial
crucial to incorporate continuous learning into the system systems have significantly invested in large-scale datasets
to adapt to new threats. and continuous updates to improve this aspect. This
highlights the importance of having an up-to-date dataset
7.2 Comparison with Existing Systems for accurate detection, which is a continuous challenge for
independent researchers and small-scale systems.
When compared to existing Android malware detection
systems, the approach utilized in this study demonstrates Finally, from a practical perspective, the results of this
several advantages, as well as areas for improvement. study could guide future developments in Android
Traditional systems, such as signature-based detection and malware detection solutions, offering a roadmap for
static analysis methods, are highly effective at detecting improving both detection rates and system efficiency. By
known malware but struggle with zero-day attacks and focusing on machine learning and behavior-based
novel malware. In contrast, the dynamic analysis and analysis, Android security can move beyond traditional,
machine learning approach used in this study were able to signature-based methods and better protect users from
detect unknown threats by analyzing the behavior of emerging threats.
applications in a controlled environment. This made the
detection system more resilient against emerging and .
7. Conclusion
7.1 Summary of Contributions features (such as API calls and network traffic patterns) were
effective, there may be other critical behaviors that could
This study made significant contributions to the field of improve detection accuracy. The system’s ability to detect
Android malware detection by combining dynamic analysis newly emerging or highly obfuscated malware is limited by
with machine learning techniques to improve the accuracy the features chosen for training, making it susceptible to
and efficiency of identifying malicious applications. The key zero-day attacks or sophisticated malware that employs
innovation of this work lies in the integration of behavioral advanced evasion techniques.
monitoring and real-time feature extraction during the
execution of Android applications. By observing malware 7.3 Directions for Future Research
behavior through API call tracking, network traffic analysis,
and file system activity, the system was able to detect Future research should focus on improving the scalability
previously unknown threats, offering an advantage over and performance of the system to address its computational
traditional signature-based detection systems. overhead. One potential direction is the use of cloud-based
solutions to offload the processing required for dynamic
7.2 Limitations of the Current System analysis. Cloud computing could enable the system to scale
efficiently, allowing for the analysis of larger datasets and
Despite the promising results, there are several limitations to providing faster detection of threats in real-time without
the current system that must be addressed. One of the overburdening local resources.
primary challenges is the computational overhead involved
in dynamic analysis, which requires significant resources to Another important direction for future work is the
execute applications in a controlled environment while exploration of additional feature extraction methods that can
monitoring their behavior. Although the system capture more subtle behavioral characteristics of malware.
demonstrated effective detection capabilities, the These could include user interaction patterns, device-
performance of the detection process could be slow, specific vulnerabilities, and more granular system-level
especially when analyzing large volumes of data or when behaviors. Incorporating these features could improve the
scaling up to handle numerous applications simultaneously. system's detection capabilities, especially for zero-day and
polymorphic malware that alter their behavior to evade
Another limitation is the reliance on a predefined set of detection.
features for machine learning training. While the selected
9
11. References:
1. Rastogi, V., Chen, Y., & Jiang, X. (2013, May). 10. VirusTotal. (n.d.). VirusTotal: Analyze suspicious files
Droidchameleon: evaluating android anti-malware against and URLs. Retrieved from https://www.virustotal.com
transformation attacks. In Proceedings of the 8th ACM
SIGSAC symposium on Information, computer and 11. Android Malware Dataset (AMD). (n.d.). Retrieved from
communications security (pp. 329-334). https://amdrepository.org
https://dl.acm.org/doi/abs/10.1145/2484313.2484355 12. Drebin Dataset. (2014). Retrieved from
2. Enck, W., Octeau, D., McDaniel, P. D., & Chaudhuri, S. https://www.sec.cs.tu-bs.de/~danarp/drebin/
(2011, August). A study of android application security. In 13. Gartner Research. (2021). Android security challenges
USENIX security symposium (Vol. 2, No. 2). and strategies. Retrieved from https://www.gartner.com
https://www.usenix.org/legacy/event/sec11/tech/slides/enck.
pdf 14. McAfee. (2022). Mobile Threat Report. Retrieved from
https://www.mcafee.com
3. Zhou, Y., & Jiang, X. (2012, May). Dissecting android
malware: Characterization and evolution. In 2012 IEEE 15. Trend Micro. (2023). Android Malware Trends.
symposium on security and privacy (pp. 95-109). IEEE. Retrieved from https://www.trendmicro.com
https://ieeexplore.ieee.org/abstract/document/6234407/
16. Zhou, Y., & Jiang, X. (2012, May). Dissecting android
4. Mat, S. R. T., Ab Razak, M. F., Kahar, M. N. M., Arif, J. malware: Characterization and evolution. In 2012 IEEE
M., Mohamad, S., & Firdaus, A. (2021). Towards a symposium on security and privacy (pp. 95-109). IEEE.
systematic description of the field using bibliometric https://ieeexplore.ieee.org/abstract/document/6234407/
analysis: malware evolution. Scientometrics, 126, 2013-
2055. https://link.springer.com/article/10.1007/s11192-020- 17. Rastogi, V., Chen, Y., & Jiang, X. (2013, May).
03834-6 Droidchameleon: evaluating android anti-malware against
transformation attacks. In Proceedings of the 8th ACM
5. Spreitzenbarth, M., Freiling, F., Echtler, F., Schreck, T., & SIGSAC symposium on Information, computer and
Hoffmann, J. (2013, March). Mobile-sandbox: having a communications security (pp. 329-334).
deeper look into android applications. In Proceedings of the https://dl.acm.org/doi/abs/10.1145/2484313.2484355
28th annual ACM symposium on applied computing (pp.
1808-1815). 18. Arp, D., Spreitzenbarth, M., Hubner, M., Gascon, H.,
https://dl.acm.org/doi/abs/10.1145/2480362.2480701 Rieck, K., & Siemens, C. E. R. T. (2014, February). Drebin:
Effective and explainable detection of android malware in
6. Wang, W., Zhao, M., Gao, Z., Xu, G., Xian, H., Li, Y., & your pocket. In Ndss (Vol. 14, pp. 23-26).
Zhang, X. (2019). Constructing features for detecting https://media.telefonicatech.com/telefonicatech/uploads/202
android malicious applications: issues, taxonomy and 1/1/4915_2014-ndss.pdf
directions. IEEE access, 7, 67602-67631.
https://ieeexplore.ieee.org/abstract/document/8720030/ 19. Spreitzenbarth, M., Freiling, F., Echtler, F., Schreck, T.,
& Hoffmann, J. (2013, March). Mobile-sandbox: having a
7. Arp, D., Spreitzenbarth, M., Hubner, M., Gascon, H., deeper look into android applications. In Proceedings of the
Rieck, K., & Siemens, C. E. R. T. (2014, February). Drebin: 28th annual ACM symposium on applied computing (pp.
Effective and explainable detection of android malware in 1808-1815).
your pocket. In Ndss (Vol. 14, pp. 23-26). https://dl.acm.org/doi/abs/10.1145/2480362.2480701
https://media.telefonicatech.com/telefonicatech/uploads/202
1/1/4915_2014-ndss.pdf 20. Drebin Dataset. (2014). Retrieved from
https://www.sec.cs.tu-bs.de/~danarp/drebin/
8. Aafer, Y., Du, W., & Yin, H. (2013). Droidapiminer:
Mining api-level features for robust malware detection in 21. AndroOBFS Dataset. (2022). Retrieved from
android. In Security and Privacy in Communication https://www.androobfs.org
Networks: 9th International ICST Conference, SecureComm
22. Aafer, Y., Du, W., & Yin, H. (2013). Droidapiminer:
2013, Sydney, NSW, Australia, September 25-28, 2013,
Mining api-level features for robust malware detection in
Revised Selected Papers 9 (pp. 86-103). Springer
android. In Security and Privacy in Communication
International Publishing.
Networks: 9th International ICST Conference, SecureComm
https://link.springer.com/chapter/10.1007/978-3-319-04283-
2013, Sydney, NSW, Australia, September 25-28, 2013,
1_6
Revised Selected Papers 9 (pp. 86-103). Springer
9. Bilot, T., El Madhoun, N., Al Agha, K., & Zouaoui, A. International Publishing.
(2024). A survey on malware detection with graph https://link.springer.com/chapter/10.1007/978-3-319-04283-
representation learning. ACM Computing Surveys, 56(11), 1- 1_6
36. https://dl.acm.org/doi/abs/10.1145/3664649
23. Sun, Y., Bashir, A. K., Tariq, U., & Xiao, F. (2021).
Effective malware detection scheme based on classified
behavior graph in IIoT. Ad Hoc Networks, 120, 102558.
10
https://www.sciencedirect.com/science/article/pii/S1570870 37. VirusTotal. (n.d.). Analyze suspicious files and URLs.
521001049 Retrieved from https://www.virustotal.com
24. McAfee. (2022). Mobile Threat Report. Retrieved from 38. Drebin Dataset. (2014). Retrieved from
https://www.mcafee.com https://www.sec.cs.tu-bs.de/~danarp/drebin/
25. Trend Micro. (2023). Android Malware Trends. 39. McAfee. (2022). Mobile Threat Report. Retrieved from
Retrieved from https://www.trendmicro.com https://www.mcafee.com
26. Parker, M. (2021). Risk Considerations for Mobile 40. Trend Micro. (2023). Android Malware Trends.
Device Implementations. In Mobile Medicine (pp. 183-199). Retrieved from https://www.trendmicro.com
Productivity Press.
https://www.taylorfrancis.com/chapters/edit/10.4324/978100 41. Gartner Research. (2021). Android security challenges
3220473-15/risk-considerations-mobile-device- and strategies.
implementations-mitchell-parker 42. Alam, S., et al. (2018). Malware detection using
27. Kumars, R., Alazab, M., & Wang, W. (2021). A survey dynamic behavior analysis. Journal of Information Security
of intelligent techniques for Android malware detection. and Applications.
Malware Analysis Using Artificial Intelligence and Deep 43. Ferrante, A., et al. (2020). An advanced machine
Learning, 121-162. learning-based dynamic analysis framework for Android
https://link.springer.com/chapter/10.1007/978-3-030-62582- malware detection. ACM Transactions on Internet
5_5 Technology.
28. Enck, W., Octeau, D., McDaniel, P. D., & Chaudhuri, S. 44. Zhou, Y., & Jiang, X. (2012). Android Malware
(2011, August). A study of android application security. In Evolution. IEEE Symposium.
USENIX security symposium (Vol. 2, No. 2).
https://www.usenix.org/legacy/event/sec11/tech/slides/enck. 45. Android Malware Dataset (AMD). (n.d.). Retrieved from
pdf https://amdrepository.org
29. VirusTotal. (n.d.). Analyze suspicious files and URLs. 46. Böhm, R., et al. (2019). VirusTotal and its impact on
Retrieved from https://www.virustotal.com malware detection. Journal of Cybersecurity Research,
35(2), 245-267.
30. Zhou, Y., & Jiang, X. (2012, May). Dissecting android
malware: Characterization and evolution. In 2012 IEEE 47. Böhme, R., et al. (2021). An Overview of Cuckoo
symposium on security and privacy (pp. 95-109). IEEE. Sandbox for Dynamic Malware Analysis. International
https://ieeexplore.ieee.org/abstract/document/6234407/ Journal of Computer Security, 29(4), 325-341.
31. Android Malware Dataset (AMD). (n.d.). Retrieved from 48. Alazab, M., et al. (2019). Wireshark: A Comprehensive
https://amdrepository.org Network Traffic Analysis Tool for Malware Detection.
Security Journal, 33(1), 59-72.
32. Enck, W., Octeau, D., McDaniel, P. D., & Chaudhuri, S.
(2011, August). A study of android application security. In 49. Durkota, M., et al. (2018). PeStudio: A Static Malware
USENIX security symposium (Vol. 2, No. 2). Analysis Tool for Android APKs. Journal of Information
https://www.usenix.org/legacy/event/sec11/tech/slides/enck. Security, 12(1), 98-110.
pdf
50. Mohaisen, A., et al. (2020). Using Wireshark for Real-
33. Zhou, Y., & Jiang, X. (2012, May). Dissecting android time Network Traffic Analysis in Malware Detection
malware: Characterization and evolution. In 2012 IEEE Systems. Networking and Security Journal, 25(3), 415-429.
symposium on security and privacy (pp. 95-109). IEEE.
https://ieeexplore.ieee.org/abstract/document/6234407/ 51. He, Q., et al. (2020). Evaluating malware detection
algorithms: Accuracy and performance considerations.
34. Rastogi, V., Chen, Y., & Jiang, X. (2013, May). International Journal of Information Security, 28(3), 151-
Droidchameleon: evaluating android anti-malware against 169.
transformation attacks. In Proceedings of the 8th ACM
SIGSAC symposium on Information, computer and 52. Sharma, V., & Goudar, R. (2021). Balancing accuracy
communications security (pp. 329-334). and performance in malware detection systems. Journal of
https://dl.acm.org/doi/abs/10.1145/2484313.2484355 Cybersecurity and Digital Forensics, 42(5), 289-302.
35. Spreitzenbarth, M., Freiling, F., Echtler, F., Schreck, T., 53. Redhu, A., Choudhary, P., Srinivasan, K., & Das, T. K.
& Hoffmann, J. (2013, March). Mobile-sandbox: having a (2024). Deep learning-powered malware detection in
deeper look into android applications. In Proceedings of the cyberspace: a contemporary review. Frontiers in Physics,
28th annual ACM symposium on applied computing (pp. 12, 1349463.
1808-1815). https://www.frontiersin.org/articles/10.3389/fphy.2024.1349
https://dl.acm.org/doi/abs/10.1145/2480362.2480701 463/full
36. Arp, D., et al. (2014). Drebin: Effective and explainable 54. Yerima, S. Y., & Sezer, S. (2018). Droidfusion: A novel
detection of Android malware in your pocket. NDSS multilevel classifier fusion approach for android malware
Symposium.
11
detection. IEEE transactions on cybernetics, 49(2), 453-466. state of the art survey. ACM Computing Surveys (CSUR),
https://ieeexplore.ieee.org/abstract/document/8245867/ 52(5), 1-48. https://dl.acm.org/doi/abs/10.1145/3329786
55. Mayuranathan, M., Saravanan, S. K., Muthusenthil, B., 65. Onwuzurike, L., Almeida, M., Mariconti, E., Blackburn,
& Samydurai, A. (2022). An efficient optimal security J., Stringhini, G., & De Cristofaro, E. (2018, August). A
system for intrusion detection in cloud computing family of droids-android malware detection via behavioral
environment using hybrid deep learning technique. modeling: Static vs dynamic analysis. In 2018 16th Annual
Advances in Engineering Software, 173, 103236. Conference on Privacy, Security and Trust (PST) (pp. 1-10).
https://www.sciencedirect.com/science/article/pii/S0965997 IEEE.
822001405 https://ieeexplore.ieee.org/abstract/document/8514191/
56. Gopinath, M., & Sethuraman, S. C. (2023). A 66. Gajrani, J., Laxmi, V., Tripathi, M., Gaur, M. S.,
comprehensive survey on deep learning based malware Zemmari, A., Mosbah, M., & Conti, M. (2020).
detection techniques. Computer Science Review, 47, 100529. Effectiveness of state-of-the-art dynamic analysis techniques
https://www.sciencedirect.com/science/article/pii/S1574013 in identifying diverse Android malware and future
722000636 enhancements. In Advances in Computers (Vol. 119, pp. 73-
120). Elsevier.
57. Yan, L. K., & Yin, H. (2012). {DroidScope}: Seamlessly https://www.sciencedirect.com/science/article/pii/S0065245
reconstructing the {OS} and dalvik semantic views for 820300413
dynamic android malware analysis. In 21st USENIX security
symposium (USENIX security 12) (pp. 569-584). 67. da Costa, F. H., Medeiros, I., Menezes, T., da Silva, J. V.,
https://www.usenix.org/conference/usenixsecurity12/technic da Silva, I. L., Bonifácio, R., ... & Ribeiro, M. (2022).
al-sessions/presentation/yan Exploring the use of static and dynamic analysis to improve
the performance of the mining sandbox approach for android
58. Hasan, H., Ladani, B. T., & Zamani, B. (2021). malware identification. Journal of Systems and Software,
MEGDroid: A model-driven event generation framework for 183, 111092.
dynamic android malware analysis. Information and https://www.sciencedirect.com/science/article/pii/S0164121
Software Technology, 135, 106569. 221001898
https://www.sciencedirect.com/science/article/pii/S0950584
921000525
59. Martín, A., Lara-Cabrera, R., & Camacho, D. (2018). A
new tool for static and dynamic Android malware analysis.
In Data Science and Knowledge Engineering for Sensing
Decision Support: Proceedings of the 13th International
FLINS Conference (FLINS 2018) (pp. 509-516).
https://www.worldscientific.com/doi/abs/10.1142/97898132
73238_0066
60. Martín, A., Lara-Cabrera, R., & Camacho, D. (2018). A
new tool for static and dynamic Android malware analysis.
In Data Science and Knowledge Engineering for Sensing
Decision Support: Proceedings of the 13th International
FLINS Conference (FLINS 2018) (pp. 509-516).
https://www.worldscientific.com/doi/abs/10.1142/97898132
73238_0066
61. Tam, K., Feizollah, A., Anuar, N. B., Salleh, R., &
Cavallaro, L. (2017). The evolution of android malware and
android analysis techniques. ACM Computing Surveys
(CSUR), 49(4), 1-41.
https://dl.acm.org/doi/abs/10.1145/3017427
62. Kapratwar, A., Di Troia, F., & Stamp, M. (2017,
February). Static and dynamic analysis of android malware.
In International Workshop on FORmal methods for Security
Engineering (Vol. 2, pp. 653-662). SciTePress.
https://www.scitepress.org/Papers/2017/62567/
63. Shankar, V. G., Somani, G., Gaur, M. S., Laxmi, V., &
Conti, M. (2017). AndroTaint: An efficient android malware
detection framework using dynamic taint analysis. 2017
ISEA Asia security and privacy (ISEASP), 1-13.
https://ieeexplore.ieee.org/abstract/document/7976989/
64. Or-Meir, O., Nissim, N., Elovici, Y., & Rokach, L.
(2019). Dynamic malware analysis in the modern era—A
12