Lex and Yacc Examples Lab Task

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 6

Compiler Construction lab manual Spring 2018 By: Rabbia Mahum

Lab 8: Topics covered

 Lex and Yacc


 Examples
 Lab Task

Lex and Yacc.


If you want to use Lex with Yacc, note that what Lex writes is a program named yylex(), the name
required by Yacc for its analyzer. Normally, the default main program on the Lex library calls this routine,
but if Yacc is loaded, and its main program is used, Yacc will call yylex(). In this case each Lex rule should
end with

return(token);
where the appropriate token value is returned. An easy way to get access to Yacc's names for tokens is
to compile the Lex output file as part of the Yacc output file by placing the line # include "lex.yy.c" in the
last section of Yacc input. Supposing the grammar to be named ``good'' and the lexical rules to be
named ``better'' the UNIX command sequence can just be:

yacc good
lex better
cc y.tab.c -ly -ll
The Yacc library (-ly) should be loaded before the Lex library, to obtain a main program which invokes
the Yacc parser. The generations of Lex and Yacc programs can be done in either order.

9. Examples.
As a trivial problem, consider copying an input file while adding 3 to every positive number divisible by
7. Here is a suitable Lex source program

%%
int k;
[0-9]+ {
k = atoi(yytext);
if (k%7 == 0)
printf("%d", k+3);
else
printf("%d",k);
}
to do just that. The rule [0-9]+ recognizes strings of digits; atoi converts the digits to binary and stores
the result in k. The operator % (remainder) is used to check whether k is divisible by 7; if it is, it is
incremented by 3 as it is written out. It may be objected that this program will alter such input items as
49.63 or X7. Furthermore, it increments the absolute value of all negative numbers divisible by 7. To
avoid this, just add a few more rules after the active one, as here:

%%
int k;
-?[0-9]+ {
Compiler Construction lab manual Spring 2018 By: Rabbia Mahum

k = atoi(yytext);
printf("%d",
k%7 == 0 ? k+3 : k);
}
-?[0-9.]+ ECHO;
[A-Za-z][A-Za-z0-9]+ ECHO;
Numerical strings containing a ``.'' or preceded by a letter will be picked up by one of the last two rules,
and not changed. The if-else has been replaced by a C conditional expression to save space; the form
a?b:c means ``if a then b else c''.

For an example of statistics gathering, here is a program which histograms the lengths of words, where a
word is defined as a string of letters.

int lengs[100];
%%
[a-z]+ lengs[yyleng]++;
. |
\n ;
%%
yywrap()
{
int i;
printf("Length No. words\n");
for(i=0; i 0)
printf("%5d%10d\n",i,lengs[i]);
return(1);
}
This program accumulates the histogram, while producing no output. At the end of the input it prints
the table. The final statement return(1); indicates that Lex is to perform wrapup. If yywrap returns zero
(false) it implies that further input is available and the program is to continue reading and processing. To
provide a yywrap that never returns true causes an infinite loop.

As a larger example, here are some parts of a program written by N. L. Schryer to convert double
precision Fortran to single precision Fortran. Because Fortran does not distinguish upper and lower case
letters, this routine begins by defining a set of classes including both cases of each letter:

a [aA]
b [bB]
c [cC]
...
z [zZ]
An additional class recognizes white space:

W [ \t]*
The first rule changes ``double precision'' to ``real'', or ``DOUBLE PRECISION'' to ``REAL''.

{d}{o}{u}{b}{l}{e}{W}{p}{r}{e}{c}{i}{s}{i}{o}{n} {
printf(yytext[0]=='d'? "real" : "REAL");
}
Compiler Construction lab manual Spring 2018 By: Rabbia Mahum

Care is taken throughout this program to preserve the case (upper or lower) of the original program. The
conditional operator is used to select the proper form of the keyword. The next rule copies continuation
card indications to avoid confusing them with constants:

^" "[^ 0] ECHO;


In the regular expression, the quotes surround the blanks. It is interpreted as ``beginning of line, then
five blanks, then anything but blank or zero.'' Note the two different meanings of ^. There follow some
rules to change double precision constants to ordinary floating constants.

[0-9]+{W}{d}{W}[+-]?{W}[0-9]+ |
[0-9]+{W}"."{W}{d}{W}[+-]?{W}[0-9]+ |
"."{W}[0-9]+{W}{d}{W}[+-]?{W}[0-9]+ {
/* convert constants */
for(p=yytext; *p != 0; p++)
{
if (*p == 'd' || *p == 'D')
*p=+ 'e'- 'd';
ECHO;
}
After the floating point constant is recognized, it is scanned by the for loop to find the letter d or D. The
program than adds 'e'-'d', which converts it to the next letter of the alphabet. The modified constant,
now single-precision, is written out again. There follow a series of names which must be respelled to
remove their initial d. By using the array yytext the same action suffices for all the names (only a sample
of a rather long list is given here).

{d}{s}{i}{n} |
{d}{c}{o}{s} |
{d}{s}{q}{r}{t} |
{d}{a}{t}{a}{n} |
...
{d}{f}{l}{o}{a}{t} printf("%s",yytext+1);
Another list of names must have initial d changed to initial a:

{d}{l}{o}{g} |
{d}{l}{o}{g}10 |
{d}{m}{i}{n}1 |
{d}{m}{a}{x}1 {
yytext[0] =+ 'a' - 'd';
ECHO;
}
And one routine must have initial d changed to initial r:

{d}1{m}{a}{c}{h} {yytext[0] =+ 'r' - 'd';


To avoid such names as dsinx being detected as instances of dsin, some final rules pick up longer words
as identifiers and copy some surviving characters:

[A-Za-z][A-Za-z0-9]* |
[0-9]+ |
\n |
. ECHO;
Compiler Construction lab manual Spring 2018 By: Rabbia Mahum

Note that this program is not complete; it does not deal with the spacing problems in Fortran or with
the use of keywords as identifiers.

Lab Task: identify tokens through lex.

Solution:

 Go to tools and compile.


Compiler Construction lab manual Spring 2018 By: Rabbia Mahum

 Go to tools and build lex or you can do it by opening cmd and change directory as
 Cd path where you have saved your file

 Then follow step as in below figure and get output


Compiler Construction lab manual Spring 2018 By: Rabbia Mahum

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy