CD Lab 1

This document describes a compiler design lab course that teaches students about compiler construction tools like LEX and YACC. The course objectives are to introduce LEX and YACC, teach algorithms for generating target machine code, and implement LL and LR parsers. After the course, students will be able to design and implement compilers, use LEX and YACC, design parsers, and perform code optimizations. The document lists experiments involving lexical analysis, parser implementation, automata conversions, and code optimizations. It provides references for further reading.

Uploaded by

shaik fareed

(20A05601P) COMPILER DESIGN LAB

Course Objectives:
• To introduce LEX and YACC tools
• To learn to develop algorithms to generate code for a target machine
• To implement LL and LR parsers
Course Outcomes:
After completion of the course, students will be able to
• Design, develop, and implement a compiler for any language
• Use LEX and YACC tools for developing a scanner and a parser
• Design and implement LL and LR parsers
• Design algorithms to perform code optimization in order to improve the performance of a program in terms of space and time complexity
List of Experiments:
1. Design and implement a lexical analyzer for a given language using C. The lexical analyzer should ignore redundant spaces, tabs and new lines.
2. Implementation of a lexical analyzer using the Lex tool.
3. Generate YACC specifications for a few syntactic categories.
a. Program to recognize a valid arithmetic expression that uses the operators +, -, * and /.
b. Program to recognize a valid variable, which starts with a letter followed by any number of letters or digits.
c. Implementation of a calculator using LEX and YACC.
d. Convert the BNF rules into YACC form and write code to generate an abstract syntax tree.
4. Write a program to find the ε-closure of all states of any given NFA with ε-transitions.
5. Write a program to convert an NFA with ε-transitions to an NFA without ε-transitions.
6. Write a program to convert an NFA to a DFA.
7. Write a program to minimize any given DFA.
8. Develop an operator precedence parser for a given language.
9. Write a program to compute the FIRST and FOLLOW sets of any given grammar.
10. Construct a recursive descent parser for an expression.
11. Construct a shift-reduce parser for a given language.
12. Write a program to perform loop unrolling.
13. Write a program to perform constant propagation.
14. Implement intermediate code generation for simple expressions.
References:
1. Compilers: Principles, Techniques and Tools, Second Edition, Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman, Pearson.
2. Compiler Construction: Principles and Practice, Kenneth C. Louden, Cengage Learning.
3. Modern Compiler Implementation in C, Andrew W. Appel, Revised edition, Cambridge University Press.
4. The Theory and Practice of Compiler Writing, J. P. Tremblay and P. G. Sorenson, TMH.
5. Writing Compilers and Interpreters, R. Mak, 3rd edition, Wiley student edition.
Online Learning Resources/Virtual Labs:
http://cse.iitkgp.ac.in/~bivasm/notes/LexAndYaccTutorial.pdf
IMPORTANCE OF COMPILER DESIGN LAB
A compiler is software that takes a program written in a high-level language as input and translates it into an equivalent program in a low-level language. Studying compilers teaches us how real-world applications work and how to design them.
Learning about compilers gives us both the theoretical and the practical knowledge needed to implement a programming language. It also gives you a new level of understanding of a language, so that you can make better use of it (optimization is just one example). Sometimes just using a compiler is not enough: you need to tune the compiler itself for your application.
Compilers have a general structure that can be applied in many other applications, from debuggers to simulators to 3D applications to browsers and even a command shell. Understanding compilers and how they work makes it much easier to understand all the rest, a bit like how a deep understanding of mathematics helps you to understand geometry or physics. We cannot do physics at the same level without the mathematics.
Just using something (a tool, device, software, or programming language) is usually enough when everything goes as expected. But if something goes wrong, only a true understanding of the inner workings and details will help to fix it.
More specifically, compilers are elaborate, architecturally sophisticated systems. If you can say that you have written a compiler by yourself, there will be little doubt about your capabilities as a programmer.
So it is better to be a pilot who understands the mechanics of the airplane than one who only knows how to fly. Every computer scientist can do much better with a knowledge of compilers in addition to domain and technical knowledge.
The compiler design lab provides a deep understanding of how the syntax and semantics of a programming language are used in translation to machine equivalents, along with knowledge of compiler generation tools such as LEX and YACC.

PRACTICE OF COMPILER WRITING WITH LEX/YACC


A compiler or interpreter for a programming language is often decomposed into two parts:
1. Read the source program and discover its structure.
2. Process this structure, e.g. to generate the target program.

Lex and Yacc can generate program fragments that solve the first task.
The task of discovering the source structure is in turn decomposed into subtasks:
1. Split the source file into tokens (Lex).
2. Find the hierarchical structure of the program (Yacc).
Lex - A Lexical Analyzer Generator

Lex is a program generator designed for lexical processing of character input streams. It accepts
a high-level, problem oriented specification for character string matching, and produces a
program in a general purpose language which recognizes regular expressions. The regular
expressions are specified by the user in the source specifications given to Lex. The Lex written
code recognizes these expressions in an input stream and partitions the input stream into
strings matching the expressions. At the boundaries between strings program sections provided
by the user are executed. The Lex source file associates the regular expressions and the
program fragments. As each expression appears in the input to the program written by Lex, the
corresponding fragment is executed.

Lex helps write programs whose control flow is directed by instances of regular
expressions in the input stream. It is well suited for editor-script type transformations and for
segmenting input in preparation for a parsing routine.

Lex source is a table of regular expressions and corresponding program fragments. The
table is translated to a program which reads an input stream, copying it to an output stream
and partitioning the input into strings which match the given expressions. As each such string is
recognized the corresponding program fragment is executed. The recognition of the
expressions is performed by a deterministic finite automaton generated by Lex. The program
fragments written by the user are executed in the order in which the corresponding regular
expressions occur in the input stream.
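
The automaton-based recognition described above can be illustrated with a small hand-written sketch. This is not the code Lex generates; it is a minimal table-driven DFA, assuming just two token classes (identifiers and unsigned integers). The names classify, char_class, next_state and the TOK_ constants are hypothetical and chosen only for this example.

```c
#include <ctype.h>

/* Hypothetical token classes for this sketch (not defined by Lex). */
enum { TOK_ERROR = 0, TOK_IDENT = 1, TOK_NUMBER = 2 };

/* Character classes: 0 = letter or '_', 1 = digit, 2 = anything else. */
static int char_class(unsigned char c)
{
    if (isalpha(c) || c == '_')
        return 0;
    if (isdigit(c))
        return 1;
    return 2;
}

/* Transition table. States: 0 = start, 1 = in identifier,
   2 = in number, 3 = dead (no token can match any more). */
static const int next_state[4][3] = {
    /* letter  digit  other */
    {  1,      2,     3 },   /* start      */
    {  1,      1,     3 },   /* identifier */
    {  3,      2,     3 },   /* number     */
    {  3,      3,     3 },   /* dead       */
};

/* Runs the DFA over the whole string and reports which accepting
   state, if any, it ends in. */
int classify(const char *s)
{
    int state = 0;

    for (; *s != '\0'; s++)
        state = next_state[state][char_class((unsigned char)*s)];

    if (state == 1)
        return TOK_IDENT;
    if (state == 2)
        return TOK_NUMBER;
    return TOK_ERROR;
}
```

With this table, classify("count1") yields TOK_IDENT and classify("42") yields TOK_NUMBER, while classify("1c") ends in the dead state and yields TOK_ERROR. A generated scanner uses the same idea, with a much larger table built from the regular expressions.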

The lexical analysis programs written with Lex accept ambiguous specifications and
choose the longest match possible at each input point. If necessary, substantial lookahead is
performed on the input, but the input stream will be backed up to the end of the current
partition, so that the user has general freedom to manipulate it.
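
The longest-match rule can be sketched in a few lines of C. The example below is only an illustration under simplifying assumptions: the patterns are fixed strings rather than regular expressions, and the names patterns and longest_match are hypothetical. It tries every pattern at the current input position and keeps the longest one that fits.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pattern list for this sketch. A real Lex specification
   would use regular expressions instead of fixed strings. */
static const char *const patterns[] = { "=", "==", "<", "<=", "<<", "<<=" };

/* Returns the length of the longest pattern that is a prefix of
   `input`, or 0 if no pattern matches at this position. */
size_t longest_match(const char *input)
{
    size_t best = 0;
    size_t count = sizeof patterns / sizeof patterns[0];

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(patterns[i]);

        /* Only a strictly longer match replaces the current best. */
        if (len > best && strncmp(input, patterns[i], len) == 0)
            best = len;
    }
    return best;
}
```

For the input "<<=x" the function returns 3, so the scanner would take "<<=" as one token instead of "<" followed by "<=". Because only a strictly longer match replaces the current best, a tie would keep the earlier pattern, which mirrors how Lex prefers the rule given first when two rules match the same number of characters.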
Lex can generate analyzers in either C or Ratfor, a language which can be translated
automatically to portable Fortran. It is available on the PDP-11 UNIX, Honeywell GCOS, and IBM
OS systems. This manual, however, will only discuss generating analyzers in C on the UNIX
system, which is the only supported form of Lex under UNIX Version 7. Lex is designed to
simplify interfacing with Yacc, for those with access to this compiler-compiler system.

Lex generates programs to be used in simple lexical analysis of text. The input files
(standard input default) contain regular expressions to be searched for and actions written in C
to be executed when expressions are found.
A C source program, lex.yy.c is generated. This program, when run, copies unrecognized
portions of the input to the output, and executes the associated C action for each regular
expression that is recognized.

The options have the following meanings.


-t Place the result on the standard output instead of in file lex.yy.c.
-v Print a one-line summary of statistics of the generated analyzer.
-n Opposite of -v; -n is the default.
-9 Adds code to be able to compile through the native C compilers.
EXAMPLE
This program converts upper case to lower case, removes blanks at the end of lines, and replaces multiple blanks with single blanks.
%%
[A-Z]   putchar(yytext[0]+'a'-'A');
[ ]+$   ;
[ ]+    putchar(' ');
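
To see what this specification asks the generated scanner to do, it may help to compare it with a hand-written C equivalent. The following is a rough sketch, not the output of Lex: the function name normalize is hypothetical, it works on a string in memory instead of standard input, and it also drops blanks at the very end of the input, which is an assumption made here for simplicity.

```c
/* Hypothetical helper: writes the transformed text to `out`, which
   must have room for at least strlen(in) + 1 bytes. */
void normalize(const char *in, char *out)
{
    while (*in != '\0') {
        if (*in == ' ') {
            /* Consume the whole run of blanks. */
            while (*in == ' ')
                in++;
            /* A run followed by a newline (or by the end of the
               input) is trailing and is dropped; any other run
               is replaced by one blank. */
            if (*in != '\n' && *in != '\0')
                *out++ = ' ';
        } else if (*in >= 'A' && *in <= 'Z') {
            /* Same arithmetic as the Lex action for [A-Z]. */
            *out++ = (char)(*in + 'a' - 'A');
            in++;
        } else {
            /* Everything else is copied unchanged. */
            *out++ = *in++;
        }
    }
    *out = '\0';
}
```

Given the input "Hello   World  " followed by a newline and "FOO  ", the function produces "hello world", a newline, and "foo". The three branches correspond to the three rules of the specification plus the default copy action.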

1. Problem Statement: Design a lexical analyzer. The lexical analyzer should ignore redundant blanks, tabs and new lines. It should also ignore comments. Although the syntax specification states that identifiers can be arbitrarily long, you may restrict the length to some reasonable value.
AIM: Write a C program to implement the design of a Lexical analyzer to recognize
the tokens defined by the given grammar.
ALGORITHM / PROCEDURE:
We make use of the following two functions in the process.
lookup() – it takes a string as its argument and checks for its presence in the symbol table. If the string is found, it returns the address; otherwise it returns NULL.
insert() – it takes a string as its argument, inserts it into the symbol table, and returns the corresponding address.
1. Start
2. Declare an array of characters and open an input file that holds the input.
3. Read a character from the input file and put it into a character variable, say ‘c’.
4. If ‘c’ is a blank, do nothing.
5. If ‘c’ is a new line character, set line = line + 1.
6. If ‘c’ is a digit, set the token value to the value of the digit and return ‘NUMBER’.
7. If ‘c’ is a proper token, assign the token value.
8. Print the complete table with each token entered by the user and its associated token value.
9. Stop
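
The two symbol table functions described above can be sketched as follows. This is a minimal illustration and not part of the lab program: it assumes a fixed-size table with linear search, and the limits MAX_SYMBOLS and MAX_NAME are arbitrary values picked for the example. Restricting the name length follows the allowance made in the problem statement.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64   /* assumed table capacity for this sketch */
#define MAX_NAME    32   /* assumed identifier length limit (with '\0') */

static char symtab[MAX_SYMBOLS][MAX_NAME];
static int symcount = 0;

/* Returns the address of the entry holding `name`, or NULL if the
   name is not in the table. Only the first MAX_NAME - 1 characters
   are significant. */
char *lookup(const char *name)
{
    for (int i = 0; i < symcount; i++)
        if (strncmp(symtab[i], name, MAX_NAME - 1) == 0)
            return symtab[i];
    return NULL;
}

/* Inserts `name` and returns the address of its entry. If the name
   is already present, the existing entry is returned; if the table
   is full, NULL is returned. */
char *insert(const char *name)
{
    char *entry = lookup(name);

    if (entry != NULL)
        return entry;
    if (symcount == MAX_SYMBOLS)
        return NULL;

    /* Copy at most MAX_NAME - 1 characters and terminate explicitly. */
    strncpy(symtab[symcount], name, MAX_NAME - 1);
    symtab[symcount][MAX_NAME - 1] = '\0';
    return symtab[symcount++];
}
```

A scanner would typically call lookup() for each identifier it reads and call insert() only when lookup() returns NULL. A production compiler would more likely use a hash table, but linear search keeps the idea visible.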

#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

// Returns 'true' if the character is a DELIMITER.
bool isDelimiter(char ch)
{
    if (ch == ' ' || ch == '+' || ch == '-' || ch == '*' ||
        ch == '/' || ch == ',' || ch == ';' || ch == '>' ||
        ch == '<' || ch == '=' || ch == '(' || ch == ')' ||
        ch == '[' || ch == ']' || ch == '{' || ch == '}')
        return (true);
    return (false);
}

// Returns 'true' if the character is an OPERATOR.
bool isOperator(char ch)
{
    if (ch == '+' || ch == '-' || ch == '*' ||
        ch == '/' || ch == '>' || ch == '<' ||
        ch == '=')
        return (true);
    return (false);
}

// Returns 'true' if the string is a VALID IDENTIFIER.
// Only the first character is checked: it must not be a digit or a delimiter.
bool validIdentifier(char* str)
{
    if (str[0] == '0' || str[0] == '1' || str[0] == '2' ||
        str[0] == '3' || str[0] == '4' || str[0] == '5' ||
        str[0] == '6' || str[0] == '7' || str[0] == '8' ||
        str[0] == '9' || isDelimiter(str[0]) == true)
        return (false);
    return (true);
}

// Returns 'true' if the string is a KEYWORD.
bool isKeyword(char* str)
{
    if (!strcmp(str, "if") || !strcmp(str, "else") ||
        !strcmp(str, "while") || !strcmp(str, "do") ||
        !strcmp(str, "break") || !strcmp(str, "continue") ||
        !strcmp(str, "int") || !strcmp(str, "double") ||
        !strcmp(str, "float") || !strcmp(str, "return") ||
        !strcmp(str, "char") || !strcmp(str, "case") ||
        !strcmp(str, "sizeof") || !strcmp(str, "long") ||
        !strcmp(str, "short") || !strcmp(str, "typedef") ||
        !strcmp(str, "switch") || !strcmp(str, "unsigned") ||
        !strcmp(str, "void") || !strcmp(str, "static") ||
        !strcmp(str, "struct") || !strcmp(str, "goto"))
        return (true);
    return (false);
}

// Returns 'true' if the string is an INTEGER (one or more digits).
// A '-' sign never reaches this function because parse() treats '-'
// as a delimiter.
bool isInteger(char* str)
{
    int i, len = strlen(str);

    if (len == 0)
        return (false);
    for (i = 0; i < len; i++) {
        if (str[i] < '0' || str[i] > '9')
            return (false);
    }
    return (true);
}

// Returns 'true' if the string is a REAL NUMBER
// (digits with exactly one decimal point).
bool isRealNumber(char* str)
{
    int i, len = strlen(str);
    bool hasDecimal = false;

    if (len == 0)
        return (false);
    for (i = 0; i < len; i++) {
        if (str[i] == '.') {
            // A second decimal point makes the string invalid.
            if (hasDecimal)
                return (false);
            hasDecimal = true;
        } else if (str[i] < '0' || str[i] > '9') {
            return (false);
        }
    }
    return (hasDecimal);
}

// Extracts the SUBSTRING str[left..right] into a newly allocated,
// NUL-terminated buffer.
char* subString(char* str, int left, int right)
{
    int i;
    char* subStr = (char*)malloc(
        sizeof(char) * (right - left + 2));

    for (i = left; i <= right; i++)
        subStr[i - left] = str[i];
    subStr[right - left + 1] = '\0';
    return (subStr);
}

// Parsing the input STRING.
void parse(char* str)
{
    int left = 0, right = 0;
    int len = strlen(str);

    // Stop at the terminating '\0' so that str[right] never reads past it.
    while (right < len) {
        if (isDelimiter(str[right]) == false)
            right++;

        if (isDelimiter(str[right]) == true && left == right) {
            // A single delimiter: report it only if it is an operator.
            if (isOperator(str[right]) == true)
                printf("'%c' IS AN OPERATOR\n", str[right]);

            right++;
            left = right;
        } else if ((isDelimiter(str[right]) == true && left != right)
                   || (right == len && left != right)) {
            // str[left..right-1] is a complete lexeme: classify it.
            char* subStr = subString(str, left, right - 1);

            if (isKeyword(subStr) == true)
                printf("'%s' IS A KEYWORD\n", subStr);
            else if (isInteger(subStr) == true)
                printf("'%s' IS AN INTEGER\n", subStr);
            else if (isRealNumber(subStr) == true)
                printf("'%s' IS A REAL NUMBER\n", subStr);
            else if (validIdentifier(subStr) == true
                     && isDelimiter(str[right - 1]) == false)
                printf("'%s' IS A VALID IDENTIFIER\n", subStr);
            else if (validIdentifier(subStr) == false
                     && isDelimiter(str[right - 1]) == false)
                printf("'%s' IS NOT A VALID IDENTIFIER\n", subStr);

            free(subStr); // release the buffer allocated by subString()
            left = right;
        }
    }
}

// DRIVER FUNCTION
int main(void)
{
    // maximum length of string is 100 here
    char str[100] = "int a = b + 1c; ";

    parse(str); // calling the parse function

    return (0);
}
Output :
'int' IS A KEYWORD
'a' IS A VALID IDENTIFIER
'=' IS AN OPERATOR
'b' IS A VALID IDENTIFIER
'+' IS AN OPERATOR
'1c' IS NOT A VALID IDENTIFIER
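
The problem statement also asks the analyzer to ignore tabs, new lines and comments, which the program above does not handle because it treats only the blank as white space. One possible way to cover this, sketched here under assumptions, is a preprocessing pass that runs before parse(). The function name strip_comments is hypothetical, only C-style /* ... */ comments are recognized, and comment markers inside string literals are not treated specially.

```c
/* Rewrites `src` into `dst`, replacing each comment, tab and new line
   with a single blank. `dst` must have room for at least
   strlen(src) + 1 bytes. */
void strip_comments(const char *src, char *dst)
{
    while (*src != '\0') {
        if (src[0] == '/' && src[1] == '*') {
            src += 2;
            /* Skip to the closing delimiter, or to the end of the
               input if the comment is not terminated. */
            while (*src != '\0' && !(src[0] == '*' && src[1] == '/'))
                src++;
            if (*src != '\0')
                src += 2;
            *dst++ = ' ';
        } else if (*src == '\t' || *src == '\n') {
            *dst++ = ' ';
            src++;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

The cleaned buffer can then be handed to parse(), which already skips blanks. Replacing a comment with a blank instead of deleting it keeps the tokens on either side of the comment separate.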

[ Viva Questions ]
1. What is a lexical analyzer?
2. Which compiler is used for lexical analysis?
3. What is the output of a lexical analyzer?
5. Which finite state machines are used in lexical analyzer design?
6. What is the role of regular expressions and grammars in a lexical analyzer?
