How Is Systems Programming Different From Other Types of Programming ?
10/12/2020 1
The basic need of system software is to achieve the
following goals :-
Domain: It refers to the scope or sphere of any activity.
Application Domain: The scope of an application is its application domain.
E.g., the application domain of an inventory program is
Execution Domain (also called the solution domain): The execution domain is the
work of programmers, e.g., program code, documentation, test results, files,
computers, etc.
[Figure: Software Designer and CPU]
Consequences of Semantic Gap
The semantic gap is reduced by programming languages (PL).
The use of a PL introduces a new domain called the programming
language domain (or PL domain).
[Figure: Application Domain, PL Domain, Execution Domain]
The PL domain bridges the gap between the
application domain and the execution domain.
• Specification gap: It is the semantic gap between
the application domain and the PL domain. It can
also be defined as the semantic gap between the
two specifications of the same task. The
specification gap is bridged by the software
development team.
• Execution gap: It is the gap between the semantics
of programs written in different programming
languages. The execution gap is bridged by
the translator or interpreter.
Language Processor: Software that bridges
the specification or execution gap.
Introduction
• Why Language Processor?
Language Processors
• A Language Processor is software that
bridges a specification or execution gap.
• The program input to a language processor is referred
to as the Source Program and its output as the Target
Program.
• The languages in which they are written are called
the source language and the target language,
respectively.
A Spectrum of Language Processor
• A language translator bridges an execution gap to
the machine language of a computer system.
• A detranslator bridges the same execution gap as
the language translator but in reverse direction.
• A Preprocessor is a language processor which
bridges an execution gap but is not a language
translator.
• A language migrator bridges the specification gap
between two PLs.
[Figure: C++ program → C++ preprocessor → C program; errors are reported]
Interpreters
• An interpreter is a language processor which
bridges an execution gap without generating a
machine language program.
• Here execution gap vanishes totally.
[Figure: Application Domain, PL Domain, Execution Domain, with the Interpreter Domain]
ADVANTAGES:
1. Good at locating errors in programs
2. Debugging is easier since the interpreter stops when it encounters an error.
3. If an error is detected, there is no need to retranslate the whole program.
DISADVANTAGES:
1. Rather slow.
2. No object code is produced, so translation has to be done every time the
program is run.
3. For the program to run, the interpreter must be present.
Problem Oriented Language
[Figure: Application Domain, PL Domain, Execution Domain]
Procedure Oriented Language
[Figure: Application Domain, PL Domain, Execution Domain]
Language Processing Activities
• Divided into those that bridge the
specification gap and those that bridge the
execution gap.
1. Program generation activities.
2. Program execution activities.
Program Generation
Specification Gap
Example
• A screen handling Program
• Specification is given as below
[Figure: screen layout with the fields Address, Married (Yes), Age and Gender]
Program Execution
• Two models of Program Execution
– Program translation
– Program interpretation
Program Translation
• The program translation model bridges the
execution gap by translating a program
written in a PL (the Source Program) into machine
language (the Target Program).
[Figure: Source Program → Translator → m/c language program (Target Program); the translator reports Errors, and the target program takes Data as input]
Instruction Execution Cycle
1. Fetch the instruction
2. Decode the instruction and determine the
operation to be performed.
3. Execute the instruction.
Interpretation Cycle consists of:
1. Fetch the statement.
2. Analyze the statement and determine its
meaning.
3. Execute the meaning of the statement.
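The three steps of the interpretation cycle can be sketched as a small loop. This is only an illustration: the toy language (assignments of the form `name = a + b` and `print name`) and the function `interpret` are assumptions made for this sketch, not part of any real interpreter.

```python
def interpret(source_lines):
    """Run a tiny, hypothetical language one statement at a time."""
    variables = {}   # data area of the interpreted program
    output = []      # values produced by `print` statements
    pc = 0           # index of the next source statement
    while pc < len(source_lines):
        statement = source_lines[pc]            # 1. fetch the statement
        pc += 1
        words = statement.split()               # 2. analyze the statement
        if words[0] == "print" and len(words) == 2:
            output.append(variables[words[1]])  # 3. execute its meaning
        elif len(words) >= 3 and words[1] == "=":
            total = 0
            for term in words[2:]:              # operands separated by "+"
                if term == "+":
                    continue
                total += int(term) if term.isdigit() else variables[term]
            variables[words[0]] = total         # 3. execute its meaning
        else:
            # The interpreter stops at the first statement it cannot analyze.
            raise SyntaxError(f"line {pc}: cannot analyze {statement!r}")
    return output


program = ["x = 5", "y = x + 37", "print y"]
print(interpret(program))  # [42]
```

No target program is produced: the source statements stay in their source form and each one is analyzed again whenever it is executed.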
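[Figure: (a) Interpretation: the interpreter's PC points into the source program held in memory with its data; errors are reported. (b) Program execution: the CPU's PC points into the machine language program held in memory with its data.]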
Characteristics of interpretation
1. Source program is retained in the source
form itself i.e. no target program.
2. A statement is analyzed during its
interpretation.
Fundamentals of Language Processing
• Language Processing = Analysis of SP + Synthesis of TP
Synthesis of target program is construction of target
language statements
[Figure: Language Processor: Source Program → Analysis Phase → Synthesis Phase → Target Program; both phases report Errors]
Example:
percent_profit := (profit * 100) / cost_price;
Lexical analysis identifies:
:=, * and / as operators
100 as a constant
the remaining strings as identifiers.
Syntax analysis identifies the statement as an assignment
statement.
Semantic analysis determines the meaning of the statement as the
assignment of (profit × 100) / cost_price to percent_profit.
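The lexical analysis step of this example can be sketched with a regular expression. The token class names and the function `tokenize` are assumptions chosen for this illustration; a real lexical analyzer would cover the full lexical rules of its PL.

```python
import re

# One named alternative per token class (names are illustrative).
TOKEN_PATTERN = re.compile(r"""
    (?P<operator>:=|[*/+\-])        # assignment and arithmetic operators
  | (?P<constant>\d+)               # integer constants
  | (?P<identifier>[A-Za-z_]\w*)    # names
  | (?P<punctuation>[();])          # brackets and statement terminator
  | (?P<skip>\s+)                   # white space, ignored
""", re.VERBOSE)


def tokenize(statement):
    """Return a list of (token class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(statement):
        match = TOKEN_PATTERN.match(statement, pos)
        if match is None:
            raise ValueError(f"unexpected character {statement[pos]!r}")
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for kind, lexeme in tokenize("percent_profit := (profit * 100) / cost_price;"):
    print(f"{kind:12} {lexeme}")
```

Syntax and semantic analysis would then work on this token list rather than on the raw characters.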
Referred to as memory allocation and code generation,
respectively.
Language Processor Pass
• Pass I : performs analysis of SP and notes
relevant information.
• Pass II: performs synthesis of target program.
Intermediate Representation (IR)
Properties
1. Ease of use.
2. Processing efficiency.
3. Memory efficiency.
A Hierarchy of Languages
Assembly and Machine Language
• Machine language
– Native to a processor: executed directly by hardware
– Instructions consist of binary code: 1s and 0s
• Assembly language
– A programming language that uses symbolic names to represent
operations, registers and memory locations.
– Slightly higher-level language
– Readability of instructions is better than machine language
– One-to-one correspondence with machine language instructions
• Assemblers translate assembly to machine code
• Compilers translate high-level programs to machine code
– Either directly, or
– Indirectly via an assembler
Compiler and Assembler
Assembler
• Software tools are needed for editing, assembling,
linking, and debugging assembly language programs
• An assembler is a program that converts source-code
programs written in assembly language into object files
in machine language
• Popular assemblers have emerged over the years for
the Intel family of processors. These include …
– TASM (Turbo Assembler from Borland)
– NASM (Netwide Assembler for both Windows and Linux),
and
– GNU assembler, distributed by the Free Software
Foundation
Linker and Link Libraries
• You need a linker program to produce executable files
• It combines your program's object file created by the
assembler with other object files and link libraries, and
produces a single executable program
• LINK32.EXE is the linker program provided with the
MASM distribution for linking 32-bit programs
• We will also use a link library for input and output,
called Irvine32.lib, developed by Kip Irvine
– Works in Win32 console mode under MS-Windows
Assemble and Link Process
[Figure: each Source File is processed by the Assembler into an Object File; Link combines the object files with the Link Libraries]
Statement format
E.g., AREA, AREA+5, AREA(4), AREA+5(4)
Mnemonic Operation Codes
• Each statement has two operands, first operand is always a register
and second operand refers to a memory word using a symbolic
name and optional displacement.
Instruction Opcode   Assembly Mnemonic   Remarks
00                   STOP                Stop execution
01                   ADD                 Op1 ← Op1 + Op2
02                   SUB                 Op1 ← Op1 - Op2
03                   MULT                Op1 ← Op1 * Op2
04                   MOVER               CPU Reg ← Memory operand
05                   MOVEM               Memory ← CPU Reg
06                   COMP                Sets condition code
07                   BC                  Branch on condition
08                   DIV                 Op1 ← Op1 / Op2
09                   READ                Operand 2 ← input value
10                   PRINT               Output ← Operand 2
Assembly Language Statements
• Declaration Statements: These statements declare a
storage area or declare a constant in the program.
The syntax is as follows:
[Label] DS <constant>
[Label] DC '<value>'
◼ The DS (declare storage) statement reserves memory and associates names with them.
◼ Ex:
A DS 1 ; reserves a memory area of 1 word, associating the name A to it
G DS 200 ; reserves a block of 200 words and the name G is associated with the
first word of the block (G+6 etc. to access the other words)
◼ The DC (declare constant) statement constructs memory words containing constants.
◼ Ex:
ONE DC '1' ; associates the name ONE with a memory word containing the value 1
Use of Constants
• The DC statement does not really implement constants;
it just initializes memory words to given values.
• The values are not protected by the assembler and can be
changed by moving a new value into the memory word.
• In the above example, the value of ONE can be changed by
executing an instruction
MOVEM BREG,ONE
• An Assembly Program can use constants just like HLL, in two
ways – as immediate operands, and as literals.
• A literal is an operand with the syntax = '<value>'.
Assembler Directive
– These statements indicate how the assembly of the input
program is to be performed.
Analysis Phase – Implementing memory allocation
• LC (location counter):
– is always made to contain the address of the next memory word in
the target program.
– It is initialized to the constant specified in the START statement.
• When a LABEL (e.g., N, AGAIN, SUM) is encountered, the analysis phase
– enters the LABEL and the contents of the LC in a new entry of the
symbol table.
– It then finds the number of memory words required by the
assembly statement and updates the LC contents.
• To update the contents of the LC, the analysis phase needs to know the
lengths of the different instructions.
– This information is kept in the mnemonic table, which is extended
with a field called length.
• We refer to the processing involved in maintaining the LC as LC
processing.
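LC processing can be sketched as follows. The statement tuples, the table `MNEMONIC_LENGTH` and the small program are assumptions made for this illustration; every instruction is taken to be one word long, which is an assumption of the sketch rather than a property stated above.

```python
# Mnemonic table reduced to its "length" field (in memory words).
# A length of 1 for every instruction is an assumption of this sketch.
MNEMONIC_LENGTH = {"READ": 1, "ADD": 1, "MOVER": 1, "MOVEM": 1, "STOP": 1}


def build_symbol_table(statements):
    """statements: list of (label or None, mnemonic, operand text)."""
    symbol_table = {}
    lc = 0
    for label, mnemonic, operand in statements:
        if mnemonic == "START":
            lc = int(operand)              # LC starts at the START constant
            continue
        if mnemonic == "END":
            break
        if label is not None:
            symbol_table[label] = lc       # enter (LABEL, LC) in the symbol table
        if mnemonic == "DS":
            lc += int(operand)             # DS n reserves n words
        elif mnemonic == "DC":
            lc += 1                        # DC builds one word
        else:
            lc += MNEMONIC_LENGTH[mnemonic]
    return symbol_table


program = [
    (None, "START", "100"),
    (None, "READ", "N"),
    ("AGAIN", "ADD", "AREG, N"),
    (None, "STOP", ""),
    ("N", "DS", "1"),
    ("SUM", "DS", "5"),
    ("ONE", "DC", "'1'"),
    (None, "END", ""),
]
print(build_symbol_table(program))
# {'AGAIN': 101, 'N': 103, 'SUM': 104, 'ONE': 109}
```

Because SUM reserves five words, the symbol after it (ONE) lands at 109 rather than 105; this is why the lengths must be known before any address can be fixed.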
Operation Codes
◼ MOVE instructions move a value between a memory word and a
register
◼ MOVER – First operand is target and second operand is source
◼ MOVEM – first operand is source, second is target
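The direction of the two MOVE instructions can be illustrated with a toy model of registers and memory. The dictionaries and the functions `mover` and `movem` are assumptions of this sketch; symbolic names stand in for memory addresses.

```python
registers = {"AREG": 0, "BREG": 0}
memory = {"ONE": 1, "RESULT": 0}   # symbolic names stand in for addresses


def mover(reg, name):
    """MOVER reg, name: the register (first operand) is the target."""
    registers[reg] = memory[name]


def movem(reg, name):
    """MOVEM reg, name: the memory word (second operand) is the target."""
    memory[name] = registers[reg]


mover("AREG", "ONE")      # AREG   <- ONE
movem("AREG", "RESULT")   # RESULT <- AREG
movem("BREG", "ONE")      # ONE    <- BREG: the word set up by DC is overwritten

print(registers)  # {'AREG': 1, 'BREG': 0}
print(memory)     # {'ONE': 0, 'RESULT': 1}
```

The last call shows why a word initialized by DC is not a protected constant: MOVEM BREG, ONE simply stores a new value into it.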
START 101
READ X
READ Y
MOVER AREG, X
ADD AREG, Y
MOVEM AREG, RESULT
PRINT RESULT
STOP
X DS 1
Y DS 1
RESULT DS 1
END
Fig: Sample program to find X+Y
Assembly Lang to M/C lang Program
1. Find address of variables and labels.
2. Replace Symbolic addr by numeric addr.
3. Replace Symbolic opcodes by machine
opcode.
4. Reserve storage for data.
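The four steps can be sketched on the sample program that finds X+Y. This is a minimal illustration, not a full assembler: the functions `parse` and `assemble` are hypothetical, every instruction is assumed to be one word long, and the register codes AREG=1, BREG=2, CREG=3 (0 when no register is named) are assumptions of the sketch.

```python
OPCODES = {"STOP": "00", "ADD": "01", "SUB": "02", "MULT": "03", "MOVER": "04",
           "MOVEM": "05", "COMP": "06", "BC": "07", "DIV": "08", "READ": "09",
           "PRINT": "10"}
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3}   # assumed register codes

SOURCE = """\
        START 101
        READ X
        READ Y
        MOVER AREG, X
        ADD AREG, Y
        MOVEM AREG, RESULT
        PRINT RESULT
        STOP
X       DS 1
Y       DS 1
RESULT  DS 1
        END
"""


def parse(line):
    """Split a source line into (label, mnemonic, operands)."""
    fields = line.replace(",", " ").split()
    label = None
    if fields[0] not in OPCODES and fields[0] not in ("START", "END", "DS"):
        label = fields.pop(0)
    return label, fields[0], fields[1:]


def assemble(source):
    statements = [parse(line) for line in source.splitlines() if line.strip()]
    # Steps 1 and 4: find the address of every label, reserving storage for DS.
    symbols, lc = {}, 0
    for label, mnemonic, operands in statements:
        if mnemonic == "START":
            lc = int(operands[0])
            continue
        if label:
            symbols[label] = lc
        if mnemonic == "DS":
            lc += int(operands[0])
        elif mnemonic != "END":
            lc += 1
    # Steps 2 and 3: replace symbolic addresses and opcodes by numeric ones.
    code, lc = [], 0
    for label, mnemonic, operands in statements:
        if mnemonic == "START":
            lc = int(operands[0])
        elif mnemonic == "DS":
            lc += int(operands[0])
        elif mnemonic in OPCODES:
            reg = REGISTERS.get(operands[0], 0) if operands else 0
            addr = symbols[operands[-1]] if operands else 0
            code.append((lc, OPCODES[mnemonic], reg, addr))
            lc += 1
    return symbols, code


symbols, code = assemble(SOURCE)
for lc, opcode, reg, addr in code:
    print(f"{lc} {opcode} {reg} {addr:03d}")
```

The source is scanned twice here because X, Y and RESULT are used before they are defined, so their addresses are only known after the first scan.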
Statement               LC      Opcode   Register operand   Memory operand
START 101
READ X                  101 +   09       0                  108
READ Y                  102 +   09       0                  109
MOVER AREG, X           103 +   04       1                  108
ADD AREG, Y             104 +   01       1                  109
MOVEM AREG, RESULT      105 +   05       1                  110
PRINT RESULT            106 +   10       0                  110
STOP                    107 +   00       0                  000
X DS 1                  108
Y DS 1                  109
RESULT DS 1             110
END
Variable Address
X 108
Y 109
RESULT 110
Required M/C Code
LC Opcode Register Address
101 09 0 108
102 09 0 109
103 04 1 108
104 01 1 109
105 05 1 110
106 10 0 110
107 00 0 000
108
109
110
111
Example: ALP and its equivalent Machine
Language Program
Analysis Phase
• Primary function of the Analysis phase is to build the
symbol table.
– It must determine the addresses with which the symbolic
names used in a program are associated
– It is possible to determine some addresses directly, like the
address of the first instruction in the program (i.e., START)
– Other addresses must be inferred
– To determine the address of a symbolic name we
need to fix the addresses of all program elements
preceding it through memory allocation.
• To implement memory allocation a data structure
called location counter is introduced.
Synthesis Phase: Example
Consider the following statement:
MOVER BREG, ONE
The following info is needed to synthesize machine instruction
for this stmt:
1. Address of the memory word with which name ONE is
associated [depends on the source program, hence made
available by the Analysis phase].
Symbol Address
N 103
• Since the instructions take different
amounts of memory, the length of each instruction is also
stored in the mnemonic table, in the “length” field
Data structures of an assembler
Mnemonic Table (used during the analysis and synthesis phases):
Mnemonic   Opcode   Length
ADD        01       1
SUB        02       1
Symbol Table:
Symbol   Address
N        104
AGAIN    113
[Figure legend: → data access, --> control access]
A simple Assembly Scheme
Design Specification of an assembler
Four steps are involved in designing the specification
of an assembler:
• Identify information necessary to perform a task.
• Design a suitable data structure to record info.
• Determine processing necessary to obtain and maintain
the info.
• Determine processing necessary to perform the task
Tasks Performed : Analysis Phase
• Isolate the label, mnemonic opcode and operand fields
of a statement.
Tasks Performed : Synthesis Phase
• Obtain machine opcode corresponding to the mnemonic from
the mnemonic table.
Forward Reference
• It is a reference to the entity which precedes
its definition in the program.
percent_profit := (profit * 100) / cost_price;
……..
……..
long profit;
Difficulties: Forward Reference
• Forward reference: reference to a label that is
defined later in the program.
Single Pass Translation
• The problem of forward references can be
handled using a technique called
backpatching.
START 100
MOVER AREG, X
L1 ADD BREG, ONE
ADD CREG, TEN
STOP
X DC ‘5’
ONE DC ‘1’
TEN DC ‘10’
END
START 100
MOVER AREG, X 100 04 1 ___
L1 ADD BREG, ONE 101 01 2 ___
ADD CREG, TEN 102 01 3 ___
STOP 103 00 0 000
X DC ‘5’ 104
ONE DC ‘1’ 105
TEN DC ‘10’ 106
END
Figure : TII
Machine Instruction After Backpatching
04 1 104
01 2 105
01 3 106
00 0 000
Backpatching
• The problem of forward references is handled
using a process called backpatching.
– Initially, the operand field of an instruction
containing a forward reference is left blank
– Ex: MOVER BREG, ONE can be only partially
synthesized since ONE is a forward reference
– The instruction opcode and address of BREG will be
assembled to reside in location 101
– To insert the second operand’s address later, an entry
is added to the Table of Incomplete Instructions (TII)
– A TII entry is a pair (<instruction address>,
<symbol>), which is (101, ONE) here
– When END statement is processed, the symbol table
would contain the addresses of all symbols defined in
the source program
– So TII would contain information of all forward
references
– Now each entry in TII is processed to complete the
instruction
– Ex: the entry (101, ONE) would be processed by
obtaining the address of ONE from symbol table and
inserting it in the operand field of the instruction with
assembled address 101.
– Alternatively, when definition of some symbol L is
encountered, all forward references to L can be
processed
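Backpatching with a TII can be sketched on a small single-pass assembler. The program below is the one used in the backpatching example (START 100, with X, ONE and TEN defined after use); the function `single_pass_assemble`, the tuple layout and the register codes AREG=1, BREG=2, CREG=3 are assumptions of this sketch.

```python
OPCODES = {"STOP": "00", "ADD": "01", "MOVER": "04"}   # subset of the mnemonic table
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3}          # assumed register codes

PROGRAM = [                       # (label, mnemonic, operands)
    (None, "START", ["100"]),
    (None, "MOVER", ["AREG", "X"]),
    ("L1", "ADD", ["BREG", "ONE"]),
    (None, "ADD", ["CREG", "TEN"]),
    (None, "STOP", []),
    ("X", "DC", ["5"]),
    ("ONE", "DC", ["1"]),
    ("TEN", "DC", ["10"]),
    (None, "END", []),
]


def single_pass_assemble(program):
    symbols = {}   # symbol table: name -> address
    tii = []       # Table of Incomplete Instructions: (instruction address, symbol)
    code = {}      # address -> [opcode, register code, operand address]
    lc = 0
    for label, mnemonic, operands in program:
        if mnemonic == "START":
            lc = int(operands[0])
            continue
        if mnemonic == "END":
            break
        if label:
            symbols[label] = lc
        if mnemonic in OPCODES:
            reg = REGISTERS[operands[0]] if operands else 0
            symbol = operands[1] if operands else None
            if symbol is None:
                address = 0
            elif symbol in symbols:
                address = symbols[symbol]   # backward reference: already known
            else:
                address = None              # forward reference: leave blank
                tii.append((lc, symbol))
            code[lc] = [OPCODES[mnemonic], reg, address]
        lc += 1   # every statement here, including DC, occupies one word
    # END reached: the symbol table is complete, so process each TII entry.
    for instruction_address, symbol in tii:
        code[instruction_address][2] = symbols[symbol]
    return code, tii


code, tii = single_pass_assemble(PROGRAM)
print(tii)
for address in sorted(code):
    opcode, reg, operand = code[address]
    print(address, opcode, reg, f"{operand:03d}")
```

This variant patches everything when END is processed; the alternative mentioned above would instead patch the pending entries for a symbol as soon as its definition is seen.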
Two Pass Translation
• Handles forward references easily.
• Requires 2 scans of the source program.
• LC processing is performed in the 1st pass and
symbols are stored in the symbol table.
• The second pass synthesizes the target program.
Two Pass Assembler
• Read from input line
– LABEL, OPCODE, OPERAND
[Figure: Source program → Pass 1 → Intermediate file → Pass 2 → Object codes]
General Design Procedure of Two Pass
Assembler
1. Specify the problem
2. Specify data structures
3. Define format of data structures
4. Specify algorithm
5. Look for modularity [capability of one
program to be subdivided into independent
programming units.]
6. Repeat 1 through 5 on modules.
Pass structure of assembler
[Figure: Source Program → Pass I → Intermediate Code → Pass II → Target Program; both passes use the Data Structures]
Design of a Two Pass Assembler
• Pass I:-
1. Separate the symbol, mnemonic opcode and
operand fields.
2. Build Symbol Table.
3. Perform LC Processing.
4. Construct Intermediate Representation.
• Pass II:-
1. Process IR to synthesize the target program.
Pass I
• Pass I uses the following data structures
1. Machine Opcode Table (OPTAB)
2. Symbol Table (SYMTAB)
3. Literal Table (LITTAB)
4. Pool Table (POOLTAB)
1. OPTAB contains opcode, class and opcode
length.
2. SYMTAB contains symbol and address.
3. LITTAB contains literal and address.
4. POOLTAB contains starting literal number of
each pool.
                              LC
      START   200
      MOVER   AREG, =‘5’      200
      MOVEM   AREG, X         201
L1    MOVER   BREG, =‘2’      202
      ORIGIN  L1+3
      LTORG                   205
                              206
NEXT  ADD     AREG, =‘1’      207
      SUB     BREG, =‘2’      208
      BC      LT, BACK        209
      LTORG                   210
                              211
BACK  EQU     L1              212
      ORIGIN  NEXT+5
      MULT    CREG, =‘4’      212
      STOP                    213
X     DS      1               214
      END
[Figure: step-by-step construction of the symbol table, literal table and pool table while Pass I processes the program above]
STOP 213
Symbol   Address     Literal   Address     Pool Table
X        ----        =‘5’      205         0
L1       202         =‘2’      206         2
NEXT     207         =‘1’      210         4
BACK     202         =‘2’      211
                     =‘4’      ----
X DS 1 214
Symbol   Address     Literal   Address     Pool Table
X        214         =‘5’      205         0
L1       202         =‘2’      206         2
NEXT     207         =‘1’      210         4
BACK     202         =‘2’      211
                     =‘4’      ----
END
Symbol   Address     Literal   Address     Pool Table
X        214         =‘5’      205         0
L1       202         =‘2’      206         2
NEXT     207         =‘1’      210         4
BACK     202         =‘2’      211         5
                     =‘4’      215
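The way Pass I fills SYMTAB, LITTAB and POOLTAB for this program can be sketched as below. This is a simplified illustration: the functions `evaluate` and `pass_one`, the tuple layout and the one-word instruction length are assumptions of the sketch, and only the address expressions that occur in this program (`SYMBOL` and `SYMBOL+n`) are handled.

```python
import re

SOURCE = [   # (label, mnemonic, operand text)
    (None, "START", "200"),
    (None, "MOVER", "AREG, ='5'"),
    (None, "MOVEM", "AREG, X"),
    ("L1", "MOVER", "BREG, ='2'"),
    (None, "ORIGIN", "L1+3"),
    (None, "LTORG", ""),
    ("NEXT", "ADD", "AREG, ='1'"),
    (None, "SUB", "BREG, ='2'"),
    (None, "BC", "LT, BACK"),
    (None, "LTORG", ""),
    ("BACK", "EQU", "L1"),
    (None, "ORIGIN", "NEXT+5"),
    (None, "MULT", "CREG, ='4'"),
    (None, "STOP", ""),
    ("X", "DS", "1"),
    (None, "END", ""),
]


def evaluate(expr, symtab):
    """Evaluate 'SYMBOL', 'SYMBOL+n' or a plain number."""
    name, _, offset = expr.partition("+")
    base = int(name) if name.isdigit() else symtab[name]
    return base + (int(offset) if offset else 0)


def pass_one(source):
    symtab = {}      # symbol -> address
    littab = []      # [literal, address] entries, in order of appearance
    pooltab = [0]    # index in littab of the first literal of each pool
    lc = 0
    for label, mnemonic, operand in source:
        if mnemonic == "START":
            lc = int(operand)
        elif mnemonic == "ORIGIN":
            lc = evaluate(operand, symtab)
        elif mnemonic == "EQU":
            symtab[label] = evaluate(operand, symtab)
        elif mnemonic in ("LTORG", "END"):
            # Allocate every literal of the current pool, then open a new pool.
            for entry in littab[pooltab[-1]:]:
                entry[1] = lc
                lc += 1
            pooltab.append(len(littab))
        else:
            if label:
                symtab[label] = lc
            literal = re.search(r"='(\d+)'", operand)
            if literal:
                current_pool = [entry[0] for entry in littab[pooltab[-1]:]]
                if literal.group(0) not in current_pool:
                    littab.append([literal.group(0), None])
            lc += int(operand) if mnemonic == "DS" else 1
    return symtab, littab, pooltab


symtab, littab, pooltab = pass_one(SOURCE)
print(symtab)
print(littab)
print(pooltab)
```

A literal is entered once per pool, so ='2' appears twice in LITTAB, once in the pool closed by each LTORG, and ='4' is only given an address when END closes the last pool.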
Specify the problem
Pass1: Define symbols & literals.
1) Determine length of m/c instruction [MOTGET1]
2) Keep track of Location Counter [LC]
3) Remember values of symbols [STSTO]
4) Process some pseudo ops[EQU,DS etc] [POTGET1]
5) Remember Literals [LITSTO]
Pass2: Generate object program
1) Look up value of symbols [STGET]
2) Generate instruction [MOTGET2]
3) Generate data (for DS, DC & literals)
4) Process pseudo ops[POTGET2]
Step 2. Data structure:-
Pass1: Databases
• Input source program
• “LC”, the location counter, used to keep track of each
instruction's address.
• M/c operation table (MOT) [Symbolic mnemonic &
length]
• Pseudo operation table (POT) [Symbolic mnemonic
& action]
• Symbol Table (ST) to store each label & its value.
• Literal Table (LT) to store each literal & its
location.
• Copy of the input, to be used later by Pass 2.
Step 2. Data structure:-
• Pass2: Databases
• Copy of source program input to Pass1.
• Location Counter (LC)
• MOT [Mnemonic, length, binary m/c op code, etc.]
• POT [Mnemonic & action to be taken in Pass2]
• ST [prepared by Pass1, label & value]
• Base Table [or register table] indicates which registers
are currently specified using the ‘USING’ pseudo op & what
their contents are.
• Literal table prepared by Pass1. [Lit name & value].
Machine Dependent and Machine
Independent features of Assembler
• M/C Dependent Features
– A] Instruction format & addr. mode:-
– B] Program Relocation
• Machine Independent Assembler Features
– 1) Literals
– 2) Symbol defining statements
– 3) Expressions
Assembler’s functions
• Convert mnemonic operation codes to their
machine language equivalents
• Convert symbolic operands to their equivalent
machine addresses
• Build the machine instructions in the proper
format
• Convert the data constants to internal
machine representations
• Write the object program and the assembly
listing