Assignment
APT2F2006CYB-FRC
1.0 Abstract
I was tasked with designing a digital display board in assembly language using registers,
services, flow control statements and procedures/macros, assembled and linked with TASM and
TLINK. I then used the IDA Pro tool to reverse engineer a program at the assembly level. I
conclude that learning assembly language opens up more opportunities to learn how
programming works.
2.0 Introduction
For the task that I was given, this paper discusses the fundamentals of low-level
programming, focusing on assembly language. Assembly language has the same instructions as
machine language, but instead of being just numbers, the instructions and variables have
names (Vangie Beal, 2010). While it is almost impossible for humans to read machine language
directly, assembly language is the closest readable form of it. Code written in assembly
language is translated into machine language by a program called an assembler. Other
programming languages may be translated through several intermediate levels, but all of them
are ultimately reduced to machine language.
The difference between assembly and other programming languages is that assembly
language lacks high-level abstractions such as typed variables and functions, and works
directly with the CPU. Assembly language is also specific to each type of processor: for
example, a 32-bit CPU and a 64-bit CPU use different sets of instructions to perform the same
task. A high-level language, by contrast, is very human friendly, which makes it easier to
understand and more comfortable to write (IT Release, 2018).
(Fig.1 Image of the levels of programming language)
(Fig.2 left image is C and right image is Assembly on writing Hello world)
3.2 Python Programming Language
For an even simpler programming language, there is Python. Python is very beginner
friendly, as it requires only one line to print an output, whereas assembly requires some
knowledge of how to write assembly code. Due to the popularity of Python, there are many
tutorials online and a very helpful community to discuss Python with. Although Python runs
slower than assembly, it takes less time to develop a Python program. An example of a
“Hello World” print in Python is shown in the image below.
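As a text version of the one-line print described above, a minimal sketch in Python 3 could
look like this. The variable name `message` is only for illustration; `print("Hello World")`
on its own is enough.

```python
# The text to display; the variable name is illustrative only.
message = "Hello World"

# A single call to the built-in print function writes the text to the screen.
print(message)
```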
3.3 JavaScript
Although some might think that JavaScript is an extension of HTML (since at least an
HTML page is needed to produce output in a browser), JavaScript is in fact a high-level
language. Although JavaScript and assembly are rarely compared directly, there are a few
things to note about JavaScript. For example, JavaScript is dynamically typed, very flexible,
and delivered as human-readable code, while assembly is faster, has essentially no type
checking, and is difficult to read. JavaScript is mainly used in web applications and
websites to create functions, animations, etc. Here is an example of JavaScript code.
(Fig.4 Image of JavaScript in HTML file on writing Hello world on an alert function)
4.0 Evaluation of low-level programming languages
The first advantage of a low-level language is that it can make use of specific hardware
or special machine-dependent instructions. Because low-level programming works close to the
hardware, it can control the hardware precisely and efficiently without jumping through the
hoops of translating into another language before reaching the hardware itself. Another
advantage is that the translated program requires less memory because of the way the program
is written (Computer Science GCSE GURU, 2020). Thirdly, because of the way low-level code is
written, the program can be executed faster: it does exactly what is written in the code and
nothing more. The next advantage is that users who code in assembly have total control over
the code, down to the specific instructions they write. For example, after the user gives
specific instructions to the assembler, the program will do only that. Lastly, low-level
programming can work directly on memory locations, which means that a program can manipulate
memory directly to execute more efficiently.
These are a few of the reasons why low-level programming is still required. If, for
example, a user wanted to invent a new device with the right architecture, it might be wise
to run the device with a low-level language. The reason is that it gives complete control
over the device, and the device can perform much more efficiently.
5.0 Contribution of assembly in cyber security and forensics
In the article “A Similarity Based Technique for Detecting Malicious Executable
files for Computer Forensics” by Jun-Hyung Park, Minsoo Kim, Bong-Nam Noh, and James
B D Joshi, the authors compare executable files against sets of assembly instructions from
well-known malicious programs in order to detect anything malicious. Their method compares
assembly instructions scattered across a disk with those deemed malicious and calculates
similarity values. In the article, the assembly instruction cmp is used as the comparison
condition because it relates to the transfer of the execution point. The assembly
instructions between two cmp statements are compared as one block because they are executed
in sequence. Table 1 is an example of a cmp block compared with a high-level programming
statement.
High-level statement      Assembly instructions
... ELSE CX := 20         JNZ L1
                          MOV CX, 10
                          JMP L2
                          L1:
                          MOV CX, 20
                          L2:
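The idea of comparing cmp-delimited blocks can be sketched in Python as follows. This is
only an illustration under stated assumptions: the article's actual similarity formula is
not reproduced here, so Jaccard similarity over instruction mnemonics is used as a stand-in,
and the function names and sample instructions are hypothetical.

```python
def split_cmp_blocks(instructions):
    """Group instruction lines into blocks, starting a new block at each cmp."""
    blocks, current = [], []
    for line in instructions:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.split()[0].lower() == "cmp" and current:
            blocks.append(current)
            current = []
        current.append(line)
    if current:
        blocks.append(current)
    return blocks


def mnemonics(block):
    """Return the set of first tokens (mnemonics or labels) in a block."""
    return {line.split()[0].lower() for line in block}


def jaccard_similarity(block_a, block_b):
    """Stand-in similarity (an assumption): shared mnemonics / all mnemonics."""
    a, b = mnemonics(block_a), mnemonics(block_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical sample blocks, not taken from the article.
    known_malicious = ["CMP AX, 1", "JNZ L1", "MOV CX, 10", "JMP L2"]
    suspect = ["CMP BX, 7", "JNZ L5", "MOV DX, 10", "RET"]
    print(jaccard_similarity(known_malicious, suspect))
```

A score close to 1 means the two blocks use nearly the same instructions. A real detector
would also need a threshold and a larger set of known blocks, which are outside this sketch.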
Another example of making full use of assembly in security is reverse engineering an
application, as discussed in the article “An Analysis and Implementation of Assembly
Language Programming by using TASM Incorporating with Security Concepts” by
Heynthan Kumar A/L R.Radhakrishnan, Faeem Aizat Bin Fadzlan Shah, and S.Vannoshan A/L
Sundareson. The article states that reverse engineering is the process of breaking down
a target to understand it better. At the assembly level, ASCII and Unicode strings can be
located in and extracted from binary files (Kumar, Fadzlan and Sundareson, 2016).
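Hunting for ASCII and Unicode strings in a binary can be sketched in Python as below. This
is a minimal illustration, not the method from the article: the function names, the minimum
length of four characters, and the sample bytes are all assumptions made for this example.

```python
import re


def ascii_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes found in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def utf16le_strings(data, min_len=4):
    """Return runs of printable characters stored as UTF-16LE (char byte, then 0x00)."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [match.group().decode("utf-16-le") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # Hypothetical bytes standing in for the contents of a binary file.
    sample = b"\x00\x01ACME\x00pw\x00password123\xff" + "Secret".encode("utf-16-le")
    print(ascii_strings(sample))
    print(utf16le_strings(sample))
```

For a real file, the bytes would be read with `open(path, "rb").read()` before being passed
to these functions.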
The next contribution is from the article “A Representation of Business Oriented Cyber
Threat Intelligence and the Objects Assembly” by Yuanchen Xu, Yingjie Yang, and Ying He.
The authors present a case study that uses assembly to extract the vulnerabilities of
business objects so that they can be connected to related cyber threat intelligence (CTI).
Cyber threat intelligence is an approach to business security that identifies threats and
provides countermeasures (Xu, Yang and He, 2020).
Lastly, there is a contribution from the article “Autoencoder-based Feature Learning for
Cyber Security Applications” by Mahmood Yousefi-Azar, Vijay Varadharajan, Len Hamey and
Uday Tupakula. The authors show how well an autoencoder can automatically learn a
reasonable similarity between inputs, using assembly to cross-reference anything malicious
(Yousefi-Azar et al., 2017).
6.0 IDA Reverse Engineering
To begin the reverse engineering, the file used is a binary with a .bin extension called
“ida_tutorial.bin”, downloaded from the internet. The content of the binary file is
unknown. The tool used is IDA, running on a Windows XP virtual machine in VMware. VMware is
used because the IDA tool used here runs on Windows XP and cannot run on Windows 10, which
is the current host operating system.
The first step is to run the application. The user is prompted either to start a new
disassembly or to open a file they were previously working on. After choosing a new
disassembly, since this is a new file, the user is shown a window where the binary file can
be dragged and dropped into the application. The figure below shows a screenshot of the
application.
(Figure 5.1)
After the binary file is dragged into the application, a window pops up showing the details
of the file and the options for loading it, as shown in the figure below.
(Figure 5.2)
After pressing OK, we can see the structure of the bin file as IDA displays it in
assembly. The disassembly shows the flow of the binary file, including its functions. At the
top, we can see the start of the program, as shown in the figure below.
(Figure 5.3)
Here we can see that, after the start, the program calls a printf function to print the text
shown at the top right of the second box in Figure 5.3, followed by the next step it goes to.
Clicking on the Functions tab shows which functions the file uses, as shown below.
(Figure 5.4)
After looking through the functions, we can double-click each function in the list to see
where it is located. From the functions, we can conclude that this is a database system for
the ACME company, as the screenshot below shows.
(Figure 5.5)
The ACME database asks for a password before granting access, and the password can be
found in the “Strings” tab of the IDA tool. After inspecting the file in IDA, we can run the
debugger to check the prediction that this program is an ACME database that requires a
password. Pressing the F9 key runs the debugger, and the figure below shows whether the
prediction is right.
(Figure 5.6)
After running the debugger, we can conclude that the assumptions made were correct and that
the analysis with the reverse engineering tool was a success. The tool has given insight into
the flow and structure of the binary file.
7.0 User Manual
7.1.1 Introduction
The task is to make a signboard display for DISCO Sdn Bhd that attracts customers with a
colourful name and shapes. The digital display board is programmed in assembly, and the
requirements are that it displays a minimum of five lines of text and three graphical
characters. The text that I have developed is “Disco Tech”, drawn as graphical text in
green. I developed the signboard in assembly to visualise what can be expected on the real
signboard, as this is only a prototype.
7.1.2 Assumptions
The real digital display board will run different code from the current prototype, and
the code will vary by manufacturer. The program as written can therefore be assumed
incomplete, as it is only a prototype to visualise what to expect when implementing the real
digital display board. Because of the style of the code, it will be easy to replace the text
with something different and to add more characters.
The software used to write the code is TASM version 1.4, and the code was edited in
Visual Studio Code for ease of editing and running. The TASM 1.4 package bundles Turbo
Assembler with an MS-DOS emulator, in which the .asm files created for the program can be
edited, assembled and executed.
Traditionally, to run the code with TASM alone, we must first create a .asm file with the
command “edit <filename>.asm”, where <filename> is any name the user wants. The user can
then write the code in the editor. To test whether the code runs properly, the user saves
the file and leaves the editor through File > Save and File > Exit, then types two commands
at the prompt: “tasm <filename>.asm” to assemble the code with Turbo Assembler, and
“tlink <filename>” to link the result into an executable. After that, the user can run the
program by typing the filename at the prompt and check the output. To change part of the
code, the user goes back to the beginning, edits the file, and assembles and links it again.
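The edit, assemble, link and run cycle described above can be summarised with a small Python
helper that only lists the prompt commands in order. It does not run TASM itself. The
function name `build_steps` is hypothetical, and “prototype.asm” is used as the example file.

```python
def build_steps(filename):
    """Return the DOS prompt commands, in order, for one edit-assemble-link-run cycle."""
    # Accept the name with or without the .asm extension.
    stem = filename[:-4] if filename.lower().endswith(".asm") else filename
    return [
        f"edit {stem}.asm",  # write or change the source file
        f"tasm {stem}.asm",  # assemble the source into an object file
        f"tlink {stem}",     # link the object file into an executable
        stem,                # run the program by typing its name
    ]


if __name__ == "__main__":
    for step in build_steps("prototype.asm"):
        print(step)
```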
Using Visual Studio Code and TASM together replaces the old TASM editor and makes
building the code much faster and simpler. After creating a new file with the .asm
extension, the user can edit the file directly and run it by right-clicking in the editor and
choosing the run option. Visual Studio Code then automatically assembles, links and runs the
file and shows the result in the command prompt below the editor. The downside is that this
requires some setup, rather than just downloading TASM version 1.4 and running the MS-DOS
emulator.
To enable Visual Studio Code to run .asm files, the user first needs to download TASM
from techapple.net. After installing it, the user needs to edit the environment variables
on their computer, assuming that Windows is the operating system. This is done by typing
“Environment Variable” into the search bar of the Start menu at the bottom left of the
desktop. The image below shows the result that should appear.
(Figure 6.1)
After opening System Properties, the user must click the Environment Variables button
at the bottom of the window, as shown in the figure below.
(Figure 6.2)
After clicking Environment Variables, another window appears with two lists, “user
variables” and “system variables”. If the computer does not belong to the user, the user
should edit the variables in the user list. Otherwise, the user is free to edit the system
variables. In either list, scroll down to find the variable “PATH”, click it once, and then
click Edit under that list, as shown in the figure below.
(Figure 6.3)
After clicking Edit, the user adds a new entry to the list that points to the folder where
TASM is installed and clicks “OK”, as shown in the figure below.
(Figure 6.4)
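One way to check that the new PATH entry works is to ask Python where it finds the tools. The
sketch below uses the standard library function `shutil.which`. The helper names and the
install folder “C:\TASM” in the comment are assumptions for illustration, and a newly opened
terminal is usually needed before a PATH change becomes visible.

```python
import os
import shutil


def path_entries(path_value, sep=os.pathsep):
    """Split a PATH-style string into its non-empty entries."""
    return [entry for entry in path_value.split(sep) if entry]


def find_tool(name, path_value=None):
    """Return the full path of the tool if a PATH entry contains it, else None."""
    # With path_value=None, shutil.which searches the current PATH variable.
    return shutil.which(name, path=path_value)


if __name__ == "__main__":
    # Example of a PATH value with a hypothetical TASM folder: "C:\TASM;C:\Windows"
    for tool in ("tasm", "tlink"):
        print(tool, "->", find_tool(tool) or "not found on PATH")
```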
After completing this step, the user can install Visual Studio Code and run it. In the
program, the user selects the Extensions icon on the left-hand side, searches for the
keyword TASM, and clicks the extension named MASM/TASM by clcxsrolau to install it, as shown
in the figure below.
(Figure 7.1)
After installing, the user might be prompted to reload the program and should do so.
As of February 2021, however, the current version of this extension does not work properly,
so the user must downgrade to version 0.7.0 by clicking the cog icon of the extension and
clicking “Install Another Version”, then reload as prompted, as shown in the figures below.
(Figure 7.2)
(Figure 7.3)
After that, the user needs to click the cog icon again and choose “Extension Settings” to
change a few settings for the best experience with the extension: set the
“Masmtasm.asm: Emulator” drop-down menu to “msdos player” and the
“Masmtasm.ASM: MASMor TASM” drop-down menu to “MASM”, as shown in the image below.
(Figure 7.4)
After that, the user can go to the left-hand side of the program, open a folder of their
choice, and start editing a file in it, as shown in the figure below.
(Figure 7.5)
When satisfied with the code, the user right-clicks anywhere in the editor and clicks
“Run ASM code”, and the output appears in the window below, as shown in the figure below.
(Figure 7.6)
That concludes the setup process for TASM and Visual Studio Code. The sources for the
applications are given in the reference list at the end of this document [5], [7], [9].
Once TASM and Visual Studio Code are set up by following these steps, the program
can be run properly. Opening the prototype.asm file and running the code produces the output
shown in the figure below.
(Figure 8)
As the output shows, the digital display will show the same logo as the output in the
window below the editor. With this visualisation, the code can be finalised and adapted for
the digital display it is meant to run on. To explain the code itself, the next section of
this document contains the raw text of the source code that produces the output in Figure 8;
Figure 9 is the output in TASM with a different version of the code.
8.0 Source Code
8.1 Visual Studio Code version
.model small
.stack 100h
.data
msg db " ____ ____ ___ ___ _____ ____ ____ ___ _ _ ",0ah
    db "( _ \(_ _)/ __) / __)( _ ) (_ _)( ___)/ __)( )_( )",0ah
    db " )(_) )_)(_ \__ \( (__ )(_)( )( )__)( (__ ) _ ( ",0ah
    db "(____/(____)(___/ \___)(_____) (__) (____)\___)(_) (_)",0ah
.code
main proc
;Initialization
mov ax,@data
mov es, ax ;ES (extra segment) is used because the BIOS reads the string through ES
8.2 TASM version
.model small
.stack 100h
.data
msg db " ____ ____ ___ ___ _____ ____ ____ ___ _ _ "
    db "( _ \(_ _)/ __) / __)( _ ) (_ _)( ___)/ __)( )_( ) "
    db " )(_) )_)(_ \__ \( (__ )(_)( )( )__)( (__ ) _ ( "
    db "(____/(____)(___/ \___)(_____) (__) (____)\___)(_) (_) "
    db " "
.code
main proc
;Initialization
mov ax,@data
mov es, ax ;ES (extra segment) is used because the BIOS reads the string through ES
9.0 Conclusion
In conclusion, reading the articles, comparing the programming languages, reverse
engineering a file, and developing a prototype for the digital display in assembly have
taught me how to analyse lower-level code and files, and to understand the details of what
happens behind the scenes of the code.
10.0 Self-reflection
Evaluating the contributions of assembly to cyber security and forensics showed me how
helpful it is to learn assembly in order to reverse engineer files and to find anything
malicious, or patterns that indicate whether a file is malicious. Exploring the IDA tool
made me realise that it can be used to look through the flow of the code and check whether
there is anything that can be harmful to the user.
A limitation of the code is that I do not have access to a real display to test it on.
Other than that, using Visual Studio Code has simplified my testing of the code, as it is
very easy to work with.
References
1. Anju, S.S., Harmya, P., Jagadeesh, N. and Darsana, R. (2010). Malware detection
using assembly code and control flow graph optimization. Proceedings of the 1st
Amrita ACM-W Celebration on Women in Computing in India - A2CWiC ’10.
2. Computer Science GCSE GURU. (2020). High and Low Level Languages - Computer
Science GCSE GURU. [online] Available at:
https://www.computerscience.gcse.guru/theory/high-low-level-languages#:~:text=Low%20level%20languages%20are%20used,Assembly%20Language
[Accessed 1 Feb. 2021].
3. IT Release (2018). Difference between assembly language and high level language.
[online] IT Release. Available at:
https://www.itrelease.com/2018/07/difference-between-assembly-language-and-high-level-language/
[Accessed 30 Jan. 2021].
6. Park, J., Kim, M., Noh, B. and Joshi, J.B.D. (2006). A Similarity based Technique for
Detecting Malicious Executable files for Computer Forensics. 2006 IEEE
International Conference on Information Reuse & Integration.
7. TechApple. (2013). Tasm for Windows 7 / 8.1 & Windows 10 [32-bit / 64bit version
Single Installer]. TechApple: Communicating Technology In an Easy Way. [online]
Available at:
https://techapple.net/2013/01/tasm-windows-7-windows-8-full-screen-64bit-version-single-installer/
[Accessed 7 Feb. 2021].
8. Vangie Beal (2010). What is Machine Language? | Machine Language Definition.
[online] Webopedia. Available at:
https://www.webopedia.com/definitions/machine-language/
[Accessed 30 Jan. 2021].
10. Xu, Y., Yang, Y. and He, Y. (2020). A Representation of Business Oriented Cyber
Threat Intelligence and the Objects Assembly. 2020 10th International Conference on
Information Science and Technology (ICIST).