CD Lab 1
Course Objectives:
To introduce LEX and YACC tools
To learn to develop algorithms to generate code for a target machine
To implement LL and LR parsers
Course Outcomes:
After completion of the course, students will be able to
Design, develop, and implement a compiler for any language
Use LEX and YACC tools for developing a scanner and a parser
Design and implement LL and LR parsers
Design algorithms to perform code optimization in order to improve the
performance of a program in terms of space and time complexity
List of Experiments:
1. Design and implement a lexical analyzer for a given language using C; the lexical
analyzer should ignore redundant spaces, tabs and new lines.
2. Implementation of Lexical Analyzer using Lex Tool
3. Generate YACC specification for a few syntactic categories.
a. Program to recognize a valid arithmetic expression that uses the operators +, -, * and /.
b. Program to recognize a valid variable which starts with a letter followed by any number
of letters or digits.
c. Implementation of Calculator using LEX and YACC
d. Convert the BNF rules into YACC form and write code to generate abstract syntax tree
4. Write a program to find the ε-closure of all states of any given NFA with ε transitions.
5. Write a program to convert an NFA with ε transitions to an NFA without ε transitions.
6. Write a program to convert an NFA to a DFA.
7. Write a program to minimize any given DFA.
8. Develop an operator precedence parser for a given language.
9. Write a program to find the FIRST and FOLLOW sets of any given grammar.
10. Construct a recursive descent parser for an expression.
11. Construct a Shift Reduce Parser for a given language.
12. Write a program to perform loop unrolling.
13. Write a program to perform constant propagation.
14. Implement Intermediate code generation for simple expressions.
References:
1. Compilers: Principles, Techniques and Tools, Second Edition, Alfred V. Aho, Monica S.
Lam, Ravi Sethi, Jeffrey D. Ullman, Pearson.
2. Compiler Construction-Principles and Practice, Kenneth C Louden, Cengage Learning.
3. Modern compiler implementation in C, Andrew W Appel, Revised edition, Cambridge
University Press.
4. The Theory and Practice of Compiler writing, J. P. Tremblay and P. G. Sorenson, TMH
5. Writing compilers and interpreters, R. Mak, 3rd edition, Wiley student edition.
Online Learning Resources/Virtual Labs:
http://cse.iitkgp.ac.in/~bivasm/notes/LexAndYaccTutorial.pdf
IMPORTANCE OF COMPILER DESIGN LAB
A compiler is software that takes as input a program written in a high-level language
and translates it into an equivalent program in a low-level language. Compilers teach us how
real-world applications work and how to design them.
Learning compilers gives us both the theoretical and practical knowledge that is crucial
for implementing a programming language. It gives you a new level of understanding of a
language, which helps you make better use of it (optimization is just one example).
Sometimes just using a compiler is not enough. You need to optimize the compiler itself for
your application.
Compilers have a general structure that can be applied in many other applications, from
debuggers to simulators to 3D applications to browsers and even a command shell.
Understanding compilers and how they work makes it much simpler to understand all the rest,
much as a deep understanding of mathematics helps you understand geometry or physics: we
cannot do physics without the math, not at the same level.
Just using something (read: tool, device, software, programming language) is usually
enough when everything goes as expected. But if something goes wrong, only a true
understanding of the inner workings and details will help to fix it.
Even more specifically, compilers are highly sophisticated systems, architecturally
speaking. If you can say that you have written a compiler yourself, there will be no doubt
about your capabilities as a programmer; there is little in the software realm you cannot do.
So, it is better to be a pilot who understands the mechanics of an airplane than one
who only knows how to fly. Every computer scientist can do much better with a knowledge
of compilers in addition to domain and technical knowledge.
The compiler design lab provides a deep understanding of how programming-language syntax
and semantics are translated into machine equivalents, along with hands-on knowledge of
compiler generation tools such as Lex and YACC.
Lex and Yacc can generate program fragments that solve the first task of compilation:
discovering the structure of the source.
The task of discovering the source structure again is decomposed into subtasks:
1. Split the source file into tokens (Lex).
2. Find the hierarchical structure of the program (Yacc).
Lex - A Lexical Analyzer Generator
Lex is a program generator designed for lexical processing of character input streams. It accepts
a high-level, problem oriented specification for character string matching, and produces a
program in a general purpose language which recognizes regular expressions. The regular
expressions are specified by the user in the source specifications given to Lex. The Lex written
code recognizes these expressions in an input stream and partitions the input stream into
strings matching the expressions. At the boundaries between strings program sections provided
by the user are executed. The Lex source file associates the regular expressions and the
program fragments. As each expression appears in the input to the program written by Lex, the
corresponding fragment is executed.
Lex helps write programs whose control flow is directed by instances of regular
expressions in the input stream. It is well suited for editor-script type transformations and for
segmenting input in preparation for a parsing routine.
Lex source is a table of regular expressions and corresponding program fragments. The
table is translated to a program which reads an input stream, copying it to an output stream
and partitioning the input into strings which match the given expressions. As each such string is
recognized the corresponding program fragment is executed. The recognition of the
expressions is performed by a deterministic finite automaton generated by Lex. The program
fragments written by the user are executed in the order in which the corresponding regular
expressions occur in the input stream.
The lexical analysis programs written with Lex accept ambiguous specifications and
choose the longest match possible at each input point. If necessary, substantial lookahead is
performed on the input, but the input stream will be backed up to the end of the current
partition, so that the user has general freedom to manipulate it.
Lex can generate analyzers in either C or Ratfor, a language which can be translated
automatically to portable Fortran. It is available on the PDP-11 UNIX, Honeywell GCOS, and IBM
OS systems. This manual, however, will only discuss generating analyzers in C on the UNIX
system, which is the only supported form of Lex under UNIX Version 7. Lex is designed to
simplify interfacing with Yacc, for those with access to this compiler-compiler system.
Lex generates programs to be used in simple lexical analysis of text. The input files
(standard input default) contain regular expressions to be searched for and actions written in C
to be executed when expressions are found.
A C source program, lex.yy.c is generated. This program, when run, copies unrecognized
portions of the input to the output, and executes the associated C action for each regular
expression that is recognized.
1. Problem Statement: Design a lexical analyzer. The lexical analyzer should ignore redundant
blanks, tabs and new lines. It should also ignore comments. Although the syntax specification
states that identifiers can be arbitrarily long, you may restrict the length to some reasonable value.
AIM: Write a C program to implement the design of a Lexical analyzer to recognize
the tokens defined by the given grammar.
ALGORITHM / PROCEDURE:
We make use of the following two functions in the process.
lookup() – takes a string as its argument and checks for its presence in the symbol table. If the
string is found, it returns its address; otherwise it returns NULL.
insert() – takes a string as its argument, inserts it into the symbol table, and returns the
corresponding address.
1. Start
2. Declare an array of characters and an input file to store the input.
3. Read a character from the input file into a character variable, say ‘c’.
4. If ‘c’ is a blank, do nothing.
5. If ‘c’ is a newline character, increment the line count (line = line + 1).
6. If ‘c’ is a digit, set tokenVal, the value assigned for a digit, and return NUMBER.
7. If ‘c’ begins a proper token, assign the token value.
8. Print the complete table with each token entered by the user and its associated token value.
9. Stop
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>

// true if ch is a delimiter (blank, operator or punctuation)
bool isDelimiter(char ch) { return strchr(" +-*/,;><=()[]{}", ch) != NULL; }

// true if ch is an operator
bool isOperator(char ch) { return strchr("+-*/><=", ch) != NULL; }

// true if str is a letter or '_' followed by letters, digits or '_'
bool isValidIdentifier(const char* str) {
    if (!isalpha((unsigned char)str[0]) && str[0] != '_')
        return false;
    for (int i = 1; str[i]; i++)
        if (!isalnum((unsigned char)str[i]) && str[i] != '_')
            return false;
    return true;
}

// true if str is a recognized keyword
bool isKeyword(const char* str) {
    const char* keywords[] = { "if", "else", "while", "do", "break", "continue",
        "int", "double", "float", "return", "char", "case", "long", "short",
        "switch", "unsigned", "void", "static", "struct", "goto" };
    for (size_t i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
        if (strcmp(str, keywords[i]) == 0)
            return true;
    return false;
}

// true if str consists only of digits (an integer literal)
bool isInteger(const char* str) {
    if (str[0] == '\0')
        return false;
    for (int i = 0; str[i]; i++)
        if (!isdigit((unsigned char)str[i]))
            return false;
    return true;
}

// extracts str[left..right] into freshly allocated memory
char* subString(const char* str, int left, int right) {
    char* subStr = (char*)malloc(right - left + 2);
    strncpy(subStr, str + left, right - left + 1);
    subStr[right - left + 1] = '\0';
    return subStr;
}

// scans str, splits it at delimiters and classifies each token
void parse(const char* str) {
    int left = 0, right = 0, len = strlen(str);
    while (right <= len && left <= right) {
        if (right < len && !isDelimiter(str[right])) {
            right++;
        } else if (right < len && isDelimiter(str[right]) && left == right) {
            if (isOperator(str[right]))
                printf("'%c' IS AN OPERATOR\n", str[right]);
            right++;
            left = right;
        } else if ((right == len || isDelimiter(str[right])) && left != right) {
            char* subStr = subString(str, left, right - 1);
            if (isKeyword(subStr))
                printf("'%s' IS A KEYWORD\n", subStr);
            else if (isInteger(subStr))
                printf("'%s' IS AN INTEGER\n", subStr);
            else if (isValidIdentifier(subStr))
                printf("'%s' IS A VALID IDENTIFIER\n", subStr);
            else
                printf("'%s' IS NOT A VALID IDENTIFIER\n", subStr);
            free(subStr);
            left = right;
        } else {
            right++;
            left = right;
        }
    }
}

// DRIVER FUNCTION
int main() {
    // maximum length of string is 100 here
    char str[100] = "int a = b + 1c; ";
    parse(str);
    return 0;
}
Output:
'int' IS A KEYWORD
'a' IS A VALID IDENTIFIER
'=' IS AN OPERATOR
'b' IS A VALID IDENTIFIER
'+' IS AN OPERATOR
'1c' IS NOT A VALID IDENTIFIER
[ Viva Questions ]
1. What is a lexical analyzer?
2. Which tool is used for lexical analysis?
3. What is the output of a lexical analyzer?
4. Which finite state machines are used in lexical analyzer design?
5. What is the role of regular expressions and grammars in a lexical analyzer?