Compiler File
1. Lexical Analysis
• Lexical analysis is the first phase in translating a high-level language (HLL) program into
machine code.
• The lexical analyser scans the HLL source and groups its characters into tokens, which is
why it is also known as a scanner or tokenizer.
• The analyser is usually generated with lex (lexical analyser generator) or flex (fast lexical
analyser generator). The tool writes a C source file (lex.yy.c by default); compiling that file
gives the executable (a.out, or a file with the .exe extension) that performs the
tokenization.
• Every phase of the compiler reports any error it detects through the error handler.
• While scanning the input, the lexical analyser discards whitespace: blank spaces, tabs,
new lines, and indentation.
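The points above can be illustrated with a small hand-written scanner. This is only a sketch in C under simplifying assumptions: the token kinds and the names TokenKind, Token and next_token are made up for illustration and are not part of lex or flex, and only identifiers, unsigned integers and single-character symbols are recognised.

```c
#include <ctype.h>
#include <stddef.h>

/* Token kinds for this sketch; the names are illustrative only. */
typedef enum { TOK_IDENT, TOK_NUMBER, TOK_OTHER, TOK_EOF } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start; /* first character of the lexeme */
    size_t length;     /* number of characters in the lexeme */
} Token;

/* Return the next token starting at *cursor and advance *cursor past it. */
Token next_token(const char **cursor) {
    const char *p = *cursor;
    Token tok;

    /* Whitespace separates tokens but is never returned as a token. */
    while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r') p++;

    tok.start = p;
    if (*p == '\0') {
        tok.kind = TOK_EOF;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        tok.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_') p++;
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p)) p++;
    } else {
        tok.kind = TOK_OTHER; /* single-character operator or separator */
        p++;
    }
    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}
```

Calling next_token repeatedly on a string such as "sum = a1 + 42" yields an identifier, a symbol, an identifier, a symbol and a number, followed by TOK_EOF. A scanner generated by lex does the same job, but from regular-expression rules instead of hand-written loops.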
2. LEX
• Lex is a program that generates lexical analysers. It is commonly used with the YACC (Yet
Another Compiler-Compiler) parser generator.
• The lexical analyser is a program that transforms an input stream into a
sequence of tokens.
• Lex reads a specification of patterns and actions and produces, as output, C source code
that implements the lexical analyser.
• A Lex specification has three sections, separated by %% lines:
a. { definitions } %%
b. { rules } %%
c. { user subroutines }
• Definitions include declarations of constants, variables and regular definitions.
• Rules are statements of the form p1 {action1} p2 {action2} ... pn {actionn}, where each pi
is a regular expression and actioni describes what the lexical analyser should do when
pattern pi matches a lexeme.
• User subroutines are auxiliary procedures needed by the actions. They can be compiled
separately and loaded with the lexical analyser.
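When more than one pattern matches at the current position, Lex takes the longest match, and if the lengths are equal it takes the rule listed first. The following C sketch imitates that selection with hand-written matcher functions standing in for regular expressions. The names Rule, Matcher, select_rule and the three matchers are hypothetical and exist only for this illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* A rule pairs a matcher with a label. A matcher returns how many leading
   characters of s it accepts (0 means no match). All names are hypothetical. */
typedef size_t (*Matcher)(const char *s);
typedef struct { Matcher match; const char *label; } Rule;

static size_t match_if_keyword(const char *s) {
    return (s[0] == 'i' && s[1] == 'f') ? 2 : 0;
}
static size_t match_identifier(const char *s) {
    size_t n = 0;
    if (isalpha((unsigned char)s[0]) || s[0] == '_')
        while (isalnum((unsigned char)s[n]) || s[n] == '_') n++;
    return n;
}
static size_t match_integer(const char *s) {
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

/* Rules are listed in priority order, like the rules section of a Lex file. */
static const Rule rules[] = {
    { match_if_keyword, "keyword" },
    { match_identifier, "identifier" },
    { match_integer,    "integer" },
};

/* Pick the rule with the longest match; on a tie the earliest rule wins.
   Returns the label (NULL if nothing matches) and stores the match length. */
const char *select_rule(const char *s, size_t *length) {
    const char *best = NULL;
    size_t best_len = 0;
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i].match(s);
        if (n > best_len) { /* strict '>' keeps the earlier rule on ties */
            best_len = n;
            best = rules[i].label;
        }
    }
    *length = best_len;
    return best;
}
```

With these rules, the input "if" is reported as a keyword because the keyword rule comes first and both rules match two characters, while "iffy" is reported as an identifier because the identifier rule matches more characters. This is why keyword patterns are normally placed before the identifier pattern in a Lex file.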
Problem Statement 1:
Design a LEX Code to count the number of lines, spaces, tab meta-characters and remaining
characters in a given input pattern.
Source code:
%{
#include<stdio.h>
int total_lines=0,white_spaces=0,tab_meta_character=0,total_characters=0;
%}
%%
\n {total_lines++;}
" " {white_spaces++;}
\t {tab_meta_character++;}
[^ \t\n] {total_characters++;}
%%
int main(void){
printf("Enter the sentence: ");
yylex(); // scans the input and runs the action of each matching rule until end of input.
printf("number of lines : %d\n",total_lines);
printf("number of spaces : %d\n",white_spaces);
printf("total characters : %d\n",total_characters);
printf("number of tabs : %d\n",tab_meta_character);
return 0;
}
int yywrap(){
return 1;
}
Output:
Problem Statement 2:
Design a LEX Code to identify and print valid identifiers of C/C++ in a given input pattern.
Source code:
%{
#include <stdio.h>
%}
%%
^[a-zA-Z_][a-zA-Z0-9_]* printf("Valid Identifier\n");
^[^a-zA-Z_] printf("Invalid Identifier\n");
. ; /* Ignore other characters */
%%
int main(void) {
printf("Enter any identifier you want to check: \n");
yylex();
return 0;
}
int yywrap(){
return 1;
}
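The identifier rule can be cross-checked against a plain C function that applies the same test: the first character must be a letter or underscore, and every later character must be a letter, digit or underscore. This is a minimal sketch; the name is_valid_identifier is made up for illustration, and, like the Lex pattern, it does not reject reserved keywords.

```c
#include <ctype.h>
#include <stdbool.h>

/* Return true if s has the form [a-zA-Z_][a-zA-Z0-9_]* (keywords not excluded). */
bool is_valid_identifier(const char *s) {
    /* An empty string or a bad first character is rejected at once. */
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_')) return false;
    for (const char *p = s + 1; *p != '\0'; p++) {
        if (!(isalnum((unsigned char)*p) || *p == '_')) return false;
    }
    return true;
}
```

For example, is_valid_identifier("count_1") and is_valid_identifier("_x") are true, while is_valid_identifier("1abc") and is_valid_identifier("a-b") are false.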
Output:
Problem Statement 3:
Design a LEX Code to identify and print integer and float values in a given input pattern.
Source code:
%{
#include<stdio.h>
%}
%%
[0-9]+ {printf("Number entered is an integer\n");}
[0-9]*\.[0-9]+ {printf("Number entered is a floating point\n");}
.+ {printf("wrong number entered\n");}
%%
int main(void){
printf("Enter a number for validation: ");
yylex();
return 0;
}
int yywrap(){
return 1;
}
Output:
Problem Statement 4:
Design a LEX Code for tokenizing {Identify and print OPERATORS, SEPARATORS, KEYWORDS,
IDENTIFIERS}.
Source code:
%{
#include<stdio.h>
%}
%%
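auto|break|case|char|const|continue|default|do|double|else|enum|extern|float|for|goto|if|int|long|register|return|short|signed|sizeof|static|struct|switch|typedef|union|unsigned|void|volatile|while {printf("Keyword detected\n");}
[a-zA-Z_][a-zA-Z0-9_]* {printf("Identifier detected\n");}
[-+*/=<>!&|]+ {printf("Operator detected\n");}
[(){}\[\];,] {printf("Separator detected\n");}
[ \t\n]+ ; /* skip whitespace between tokens */
. ; /* ignore any other character */
%%
int main(void){
printf("Enter something: ");
yylex();
return 0;
}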
int yywrap(){
return 1;
}
Output:
Problem Statement 5:
Design a LEX Code to count and print the number of total characters, words and white spaces
in a given ‘Input.txt’ file.
Source code:
%{
#include<stdio.h>
int total_chars=0,total_words=0,total_whitespaces=0;
%}
%%
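[^ \t\n]+ {total_words++; total_chars+=yyleng; /* a word is a run of non-whitespace characters */}
[ \t\n] {total_whitespaces++; total_chars++; /* whitespace is also counted as a character */}
%%
int main(void){
yyin=fopen("question5.txt","r");
if(yyin==NULL){ perror("question5.txt"); return 1; }
yylex();
printf("Number of characters: %d\n",total_chars);
printf("Number of words: %d\n",total_words);
printf("Number of white spaces: %d\n",total_whitespaces);
fclose(yyin);
return 0;
}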
int yywrap(){
return 1;
}
Input:
Output:
Problem Statement 6:
Design a LEX Code to replace each run of white spaces in the ‘Input.txt’ file with a single blank
character and write the result to the ‘Output.txt’ file.
Source code:
%{
#include<stdio.h>
%}
%%
[ \t]+ fprintf(yyout," ");
.|\n fprintf(yyout,"%s",yytext);
%%
int yywrap(){
return 1;
}
int main(void){
yyin=fopen("question6_input.txt","r");
yyout=fopen("question6_output.txt","w");
if(yyin==NULL||yyout==NULL){ perror("fopen"); return 1; }
yylex();
fclose(yyin);
fclose(yyout);
return 0;
}
Input:
Output:
Problem Statement 7:
Design a LEX Code to remove the comments from any C program given at run time and store
the result in the ‘out.c’ file.
Source code:
%{
#include <stdio.h>
%}
%%
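"/*"([^*]|\*+[^*/])*\*+"/" { /* drop block comments, including ones ending in several stars */ }
"//".* { /* drop line comments; the newline itself is kept */ }
.|\n {fprintf(yyout,"%s", yytext);}
%%
int yywrap(){
return(1);
}
int main(int argc,char *argv[]){
if(argc<2){ fprintf(stderr,"usage: %s file.c\n",argv[0]); return 1; }
yyin=fopen(argv[1],"r");
yyout=fopen("out.c","w");
if(yyin==NULL||yyout==NULL){ perror("fopen"); return 1; }
yylex();
fclose(yyin);
fclose(yyout);
return 0;
}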
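What the comment rules do can also be written as a small state machine in plain C, which is handy for checking the scanner's output on short inputs. This is a sketch under simplifying assumptions: the function name strip_comments is hypothetical, the caller must supply an output buffer at least as large as the input, and, like the Lex rules, it does not treat comment markers inside string literals specially.

```c
#include <stddef.h>

/* Copy src to dst without C comments and return the number of characters
   written. dst must have room for strlen(src) + 1 bytes. */
size_t strip_comments(const char *src, char *dst) {
    enum { CODE, BLOCK, LINE } state = CODE;
    size_t out = 0;
    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i], next = src[i + 1];
        switch (state) {
        case CODE:
            if (c == '/' && next == '*') { state = BLOCK; i++; }      /* enter block comment */
            else if (c == '/' && next == '/') { state = LINE; i++; }  /* enter line comment */
            else dst[out++] = c;                                      /* ordinary code */
            break;
        case BLOCK:
            if (c == '*' && next == '/') { state = CODE; i++; }       /* leave block comment */
            break;
        case LINE:
            if (c == '\n') { state = CODE; dst[out++] = c; }          /* keep the newline */
            break;
        }
    }
    dst[out] = '\0';
    return out;
}
```

For the input "int a; /* x */ int b; // y" followed by a newline and "int c;", the function keeps the three declarations and the newline and drops both comments.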
Output:
Problem Statement 8:
Design a LEX Code to extract all HTML tags from an HTML file given at run time and store them
in a text file given at run time.
Source code:
%{
#include <stdio.h>
%}
%%
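"<"[^>]*">" {fprintf(yyout, "%s\n", yytext);}
.|\n ;
%%
int yywrap(){
return 1;
}
int main(int argc,char *argv[]){
if(argc<3){ fprintf(stderr,"usage: %s input.html output.txt\n",argv[0]); return 1; }
yyin=fopen(argv[1],"r");
yyout=fopen(argv[2],"w");
if(yyin==NULL||yyout==NULL){ perror("fopen"); return 1; }
yylex();
fclose(yyin);
fclose(yyout);
return 0;
}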
Output: