CC Assignment


Question no. 1
Describe the process of lexical analysis in compiler construction. Write a program that takes a source
code input and performs tokenization, identifying keywords, operators, and identifiers.

Answer.

Lexical Analysis
Lexical analysis is the first phase of a compiler. It takes source code written in a high-level
language as input and breaks it into smaller units, called tokens, that the later phases can process
easily. Tokens are the words and symbols of the program, such as keywords, variable names, numbers, and
punctuation. The component that performs this phase is called the lexical analyzer, or scanner, and it
can be implemented with a deterministic finite automaton (DFA). Its output is a sequence of tokens that
is sent to the parser for syntax analysis.
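
As a rough illustration of the DFA idea, the sketch below recognizes just two token classes, identifiers and integer numbers. The state names, the character classes, and the function names char_class and classify are assumptions made for this example; a real scanner would have many more states.

```python
# Minimal DFA sketch: states and transitions are illustrative assumptions.
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

# (current state, input class) -> next state; a missing entry means "reject".
TRANSITIONS = {
    ("start", "letter"): "identifier",
    ("start", "digit"): "number",
    ("identifier", "letter"): "identifier",
    ("identifier", "digit"): "identifier",
    ("number", "digit"): "number",
}
ACCEPTING = {"identifier", "number"}

def classify(lexeme):
    """Run the DFA over the lexeme; return the accepting state or None."""
    state = "start"
    for ch in lexeme:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return None
    return state if state in ACCEPTING else None
```

With this table, classify("abs_zero_Kelvin") ends in the identifier state, classify("273") ends in the number state, and a string such as "2x" is rejected because no transition leaves the number state on a letter.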

Token:
A lexical token is a sequence of characters that can be treated as a unit in the grammar of the
programming languages.

Lexeme
A lexeme is the sequence of input characters that matches a token's pattern and so makes up a single
token. E.g. “float”, “abs_zero_Kelvin”, “=”, “-”, “273”, “;”

Working of lexical analysis


1. Input Preprocessing: Clean the input by removing comments, extra whitespace, and irrelevant
characters.
2. Tokenization: Split the input into tokens by matching patterns using regular expressions.
3. Token Classification: Identify the type of each token (e.g., keyword, identifier, operator,
delimiter).
4. Token Validation: Ensure tokens follow the rules of the programming language (e.g., valid
variable names, correct operator syntax).
5. Output Generation: Produce a list of valid tokens for the next phase of compilation.

Program for performing tokenization
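
A minimal sketch of such a program in Python, using regular expressions as described in the steps above. The keyword set, the operator and delimiter patterns, and the names KEYWORDS, TOKEN_SPEC, and tokenize are assumptions chosen for this example; a full language would need a larger specification.

```python
import re

# Assumed keyword set; the assignment does not list the keywords to support.
KEYWORDS = {"int", "float", "if", "else", "while", "for", "return"}

# Ordered token patterns: earlier entries win when several could match.
TOKEN_SPEC = [
    ("Literal", r"\d+(?:\.\d+)?"),
    ("Identifier", r"[A-Za-z_]\w*"),
    ("Operator", r"==|!=|<=|>=|[+\-*/=<>]"),
    ("Delimiter", r"[;,(){}]"),
    ("Skip", r"\s+"),
    ("Mismatch", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (lexeme, type) pairs for the given source string."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "Skip":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "Mismatch":
            raise ValueError(f"Unexpected character {text!r} at position {match.start()}")
        if kind == "Identifier" and text in KEYWORDS:
            kind = "Keyword"  # reserved words look like identifiers until checked
        tokens.append((text, kind))
    return tokens

if __name__ == "__main__":
    for text, kind in tokenize("int a = 5; if (a > 10) return a + 1;"):
        print(f"Token: {text}, Type: {kind}")
```

Keywords are matched by the identifier pattern first and reclassified afterwards, which keeps the regular expression short and avoids treating a name such as "integer" as the keyword "int".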


Input source code
“int a = 5; if (a > 10) return a + 1;”

Output
Token: int, Type: Keyword

Token: a, Type: Identifier

Token: =, Type: Operator

Token: 5, Type: Literal

Token: ;, Type: Delimiter

Token: if, Type: Keyword

Token: (, Type: Delimiter

Token: a, Type: Identifier

Token: >, Type: Operator

Token: 10, Type: Literal

Token: ), Type: Delimiter

Token: return, Type: Keyword

Token: a, Type: Identifier

Token: +, Type: Operator

Token: 1, Type: Literal

Token: ;, Type: Delimiter

Question no. 2
Explain the phases of a compiler and provide examples of tasks performed in each phase. Illustrate how a
source code is transformed step-by-step through these phases until it becomes executable machine code.

Answer.

Phases of Compiler
A compiler operates in phases, each of which transforms the source program from one representation to
another. Every phase takes its input from the previous phase and feeds its output to the next phase of
the compiler.

There are 6 phases in a compiler. Each phase helps convert the high-level language into machine code.
The phases of a compiler are:

• Lexical analysis
• Syntax analysis
• Semantic analysis
• Intermediate code generator
• Code optimizer
• Code generator

Phase 1: Lexical Analysis
Lexical analysis is the first phase, in which the compiler scans the source code. The scan runs left to
right, character by character, and groups the characters into tokens.

Here, the character stream from the source program is grouped into meaningful sequences by identifying
the tokens. The lexical analyzer enters the corresponding tokens into the symbol table and passes each
token to the next phase.

The primary functions of this phase are:

• Identify the lexical units in the source code
• Classify lexical units into classes such as constants and reserved words, and enter them in
different tables
• Ignore comments in the source program
• Identify tokens that are not part of the language

Example:
x = y + 10

Tokens
x : identifier

= : Assignment operator

y : identifier

+ : Addition operator

10 : Number

Phase 2: Syntax Analysis


Syntax analysis is all about discovering structure in code. It determines whether or not a text follows
the expected format. The main aim of this phase is to check whether the source code written by the
programmer is syntactically correct.

Syntax analysis applies the grammar rules of the specific programming language, constructing a parse
tree from the tokens. In doing so it determines the structure of the source program under the grammar,
or syntax, of the language.

Here is a list of tasks performed in this phase:

• Obtain tokens from the lexical analyzer
• Check whether the expression is syntactically correct
• Report all syntax errors
• Construct a hierarchical structure known as a parse tree

Example
Any identifier or number is an expression.

If x is an identifier and y + 10 is an expression, then x = y + 10 is a statement.

Consider the parse tree for the statement x = y + 10.
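
One way to make this tree concrete is a small recursive-descent parser. The sketch below is illustrative only: the grammar (a statement is "identifier = expression", with + and * as the only operators) and the names Leaf, Node, and parse_statement are assumptions, not part of the assignment. Characters that match none of the token patterns are silently skipped here, which a real parser would report as errors.

```python
import re
from dataclasses import dataclass

OPERATORS = {"=", "+", "*"}

@dataclass
class Leaf:
    kind: str   # "id" for identifiers, "num" for numbers
    value: str

@dataclass
class Node:
    op: str       # operator stored in the interior node
    left: object  # left child (Leaf or Node)
    right: object # right child (Leaf or Node)

def parse_statement(text):
    """Parse 'identifier = expression' and return the tree root."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[=+*]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def operand():
        tok = peek()
        if tok is None or tok in OPERATORS:
            raise SyntaxError(f"expected identifier or number, got {tok!r}")
        advance()
        return Leaf("num" if tok.isdigit() else "id", tok)

    def term():
        # term -> operand ('*' operand)*
        node = operand()
        while peek() == "*":
            advance()
            node = Node("*", node, operand())
        return node

    def expression():
        # expression -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = Node("+", node, term())
        return node

    target = operand()
    if target.kind != "id" or peek() != "=":
        raise SyntaxError("expected 'identifier = expression'")
    advance()
    tree = Node("=", target, expression())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

For x = y + 10 the root is an "=" node whose left child is the leaf x and whose right child is a "+" node with the leaves y and 10. Splitting the grammar into term and expression is what gives * a higher precedence than +.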

In a parse tree:
• Interior node: a record with an operator field and two fields for children
• Leaf: a record with two or more fields; one for the token and the others for information about the
token

Once built, the tree is used to:
• Ensure that the components of the program fit together meaningfully
• Gather type information and check for type compatibility
• Check that operands are permitted by the source language

Phase 3: Semantic Analysis


Semantic analysis checks the semantic consistency of the code. It uses the syntax tree from the
previous phase along with the symbol table to verify that the given source code is semantically
consistent. It also checks whether the code conveys an appropriate meaning.

The semantic analyzer checks for type mismatches, incompatible operands, functions called with
improper arguments, undeclared variables, etc.

Functions of the semantic analysis phase are:

• Stores the gathered type information in the symbol table or syntax tree
• Performs type checking
• Reports a semantic error on a type mismatch for which no type conversion rule satisfies the
desired operation
• Collects type information and checks for type compatibility
• Checks whether the source language permits the operands

Example
float x = 20.2;

float y = x*30;

In the above code, the semantic analyzer will typecast the integer 30 to float 30.0 before multiplication.
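
The type check and implicit conversion described above can be sketched as follows. This is a simplified illustration, assuming only the types int and float and a hypothetical helper named check_multiply; the conversion name int_to_float is borrowed from the intermediate code example later in this answer.

```python
# Assumed widening order: an int operand may be converted to float, not the reverse.
NUMERIC_RANK = {"int": 0, "float": 1}

def check_multiply(left, right):
    """Type-check 'left * right', where each operand is a (type, text) pair.

    Returns the result type and the expression with any conversion made explicit.
    """
    (left_type, left_text), (right_type, right_text) = left, right
    for operand_type in (left_type, right_type):
        if operand_type not in NUMERIC_RANK:
            raise TypeError(f"operand of type {operand_type} is not allowed for '*'")

    # The wider of the two operand types becomes the result type.
    if NUMERIC_RANK[left_type] >= NUMERIC_RANK[right_type]:
        result_type = left_type
    else:
        result_type = right_type

    def widen(operand_type, text):
        return text if operand_type == result_type else f"int_to_float({text})"

    return result_type, f"{widen(left_type, left_text)} * {widen(right_type, right_text)}"
```

For the example, check_multiply(("float", "x"), ("int", "30")) yields the type float and the expression x * int_to_float(30), while an operand of an unsupported type raises an error, which corresponds to reporting a semantic error.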

Phase 4: Intermediate Code Generation


Once the semantic analysis phase is over, the compiler generates intermediate code for the target
machine. It represents a program for some abstract machine.

Intermediate code lies between the high-level language and the machine-level language. It needs to be
generated in a manner that makes it easy to translate into the target machine code.

Functions of intermediate code generation:

• It should be generated from the semantic representation of the source program
• Holds the values computed during the process of translation
• Helps translate the intermediate code into the target language
• Maintains the precedence ordering of the source language
• Holds the correct number of operands for each instruction

Example
total = count + rate * 5

Intermediate code using the three-address code method is:

t1 := int_to_float(5)

t2 := rate * t1

t3 := count + t2

total := t3
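
A hedged sketch of how such three-address code could be produced from an expression tree. The tuple representation of the tree, the function name generate_tac, and the rule that every integer literal is converted with int_to_float (which assumes the variables are floats) are all assumptions made for this illustration.

```python
def generate_tac(target, expr):
    """Emit three-address code for 'target = expr'.

    expr is a variable name (str), an integer literal (int),
    or a tuple (operator, left, right).
    """
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def emit(node):
        if isinstance(node, int):
            # Assumption: variables are floats, so integer literals are converted.
            temp = new_temp()
            code.append(f"{temp} := int_to_float({node})")
            return temp
        if isinstance(node, str):
            return node  # a variable needs no instruction of its own
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = new_temp()
        code.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    result = emit(expr)
    code.append(f"{target} := {result}")
    return code

if __name__ == "__main__":
    for line in generate_tac("total", ("+", "count", ("*", "rate", 5))):
        print(line)
```

Because the tree already encodes that * binds tighter than +, the generator only has to walk it bottom-up; each instruction ends up with at most one operator and two operands.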

Phase 5: Code Optimization


The next phase is code optimization, which works on the intermediate code. This phase removes
unnecessary lines of code and rearranges the sequence of statements to speed up the execution of the
program without wasting resources. The main goal of this phase is to improve the intermediate code so
that the generated code runs faster and occupies less space.

The primary functions of this phase are:

• Establishes a trade-off between execution speed and compilation speed
• Improves the running time of the target program
• Generates streamlined code that is still in intermediate representation
• Removes unreachable code and unused variables
• Moves statements whose results do not change inside a loop out of the loop

Example:
Consider the following code

a = int_to_float(10)

b = c * a

d = e + b

f = d

Can become

b = c * 10.0

f = e + b
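
The two rewrites in this example are commonly called constant folding (with propagation of the folded value) and copy elimination. The sketch below shows one simple way to perform them. The tuple format (dest, op, arg1, arg2), the function name optimize, and the live_out parameter are assumptions for illustration, and the copy elimination assumes a temporary is used only by the copy that directly follows its definition.

```python
def optimize(code, live_out):
    """Apply constant folding and copy elimination to three-address code.

    code is a list of (dest, op, arg1, arg2) tuples; live_out is the set of
    names whose values are still needed after this block.
    """
    # Pass 1: fold int_to_float(constant) and substitute the value into later uses.
    consts = {}
    folded = []
    for dest, op, a, b in code:
        a = consts.get(a, a)
        b = consts.get(b, b)
        if op == "int_to_float" and isinstance(a, int):
            consts[dest] = float(a)
            if dest not in live_out:
                continue  # the definition is no longer needed
            op, a = "copy", float(a)
        folded.append((dest, op, a, b))

    # Pass 2: merge 'x := y' into the preceding instruction that defined y.
    result = []
    for dest, op, a, b in folded:
        if op == "copy" and result and result[-1][0] == a and a not in live_out:
            previous = result.pop()
            result.append((dest,) + previous[1:])
        else:
            result.append((dest, op, a, b))
    return result

if __name__ == "__main__":
    block = [
        ("a", "int_to_float", 10, None),
        ("b", "*", "c", "a"),
        ("d", "+", "e", "b"),
        ("f", "copy", "d", None),
    ]
    for instruction in optimize(block, live_out={"f"}):
        print(instruction)
```

Run on the four instructions of the example with only f live at the end, the sketch leaves two instructions, b = c * 10.0 and f = e + b.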

Phase 6: Code Generation


Code generation is the last phase of a compiler. It takes its input from the code optimization phase
and produces the target code or object code as a result. The objective of this phase is to allocate
storage and generate relocatable machine code.

It also allocates memory locations for the variables. The instructions in the intermediate code are
converted into machine instructions. This phase converts the optimized intermediate code into the
target language.

The target language is the machine code. Therefore, all the memory locations and registers are also
selected and allotted during this phase. The code generated by this phase is executed to take inputs
and generate expected outputs.

Example
a = b + 60.0

could be translated to the following register instructions:

MOVF b, R1

ADDF #60.0, R1

MOVF R1, a
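
A hedged sketch of how a single three-address instruction could be lowered to this style of assembly. It assumes a hypothetical two-operand machine whose instructions take the form "OP source, destination", where # marks an immediate constant. The mnemonics MOVF, ADDF, and MULF come from the example; the function names and the fixed register R1 are assumptions.

```python
# Assumed mapping from intermediate-code operators to floating-point mnemonics.
OPCODES = {"+": "ADDF", "*": "MULF"}

def format_operand(value):
    """Render a float constant as an immediate (#60.0) and a name as-is."""
    return f"#{value}" if isinstance(value, float) else value

def generate(dest, op, left, right, register="R1"):
    """Translate 'dest = left op right' into three register instructions."""
    return [
        f"MOVF {format_operand(left)}, {register}",           # load the left operand
        f"{OPCODES[op]} {format_operand(right)}, {register}",  # combine with the right operand
        f"MOVF {register}, {dest}",                            # store the result
    ]

if __name__ == "__main__":
    for line in generate("a", "+", "b", 60.0):
        print(line)
```

A real code generator would also choose registers so that values are kept in them across instructions instead of always loading into and storing from R1.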
