CC Lab Manual CS 20
Lab Manual
Table of Contents
Lab # 1: Basics of Compiler Construction
Lab # 2: Lexical Analyzer
Lab # 3: Input Buffering in Compiler Design
Lab # 4: Symbol Table
Lab # 5: Top-Down Parsing
Lab # 6: Bottom-Up Parsing
Lab # 7: FIRST and FOLLOW
Lab # 8: Semantic Analysis
Lab # 9: Shift Reduce Parser
Lab # 1
Basics of Compiler Construction:
A compiler translates the code written in one language to some other language without
changing the meaning of the program. It is also expected that a compiler should make the
target code efficient and optimized in terms of time and space.
Compiler design principles provide an in-depth view of the translation and optimization
process. Compiler design covers basic translation mechanisms and error detection and
recovery. It includes lexical, syntax, and semantic analysis as the front end, and code
generation and optimization as the back end.
Compiler Design:
Computers are a balanced mix of software and hardware. Hardware is just an electronic
device, and its functions are controlled by compatible software. Hardware
understands instructions in the form of electronic charge, which is the counterpart of binary
language in software programming. Binary language has only two symbols, 0 and 1. To
instruct the hardware, codes must be written in binary format, which is simply a series of
1s and 0s.
Language Processing System:
We have learnt that any computer system is made of hardware and software. The
hardware understands a language that humans cannot. So we write programs
in a high-level language, which is easier for us to understand and remember. These programs
are then fed into a series of tools and OS components to produce the desired code that can be
used by the machine. This is known as the Language Processing System.
Task of the Compiler:
The design of a particular compiler determines which (if any) intermediate language
programs actually appear as concrete text or data structures during compilation. Any
compilation can be broken down into two major tasks:
Analysis: discover the structure and the primitive operations of the source program.
Synthesis: construct an equivalent target program from the structure and operations
discovered during analysis.
Compiler Structure:
The decomposition of any problem identifies both tasks and data structures. For example,
in Section 1.2 we discussed the analysis and synthesis tasks. We mentioned that the
analyzer converted the source program into an abstract representation and that the
synthesizer obtained information from this abstract representation to guide its construction
of the target algorithm. Modules fall into a spectrum with single procedures at one end and
simple data objects at the other. Four points on this spectrum are important for our
purposes:
Procedure: An abstraction of a single "memoryless" action (i.e. an action with no
internal state). It may be invoked with parameters, and its effect depends only upon
the parameter values. (Example: A procedure to calculate the square root of a real
value.)
Package: An abstraction of a collection of actions related by a common internal
state. The declaration of a package is also its instantiation, and hence only one
instance is possible. (Example: The analysis or structure tree module of a compiler.)
Abstract data type: An abstraction of a data object on which a number of actions
can be performed. Declaration is separate from instantiation, and hence many
instances may exist. (Example: A stack abstraction providing the operations push,
pop, top, etc.; a small C++ sketch of such a stack follows this list.)
Variable: An abstraction of a data object on which exactly two operations, fetch
and store, can be performed. (Example: An integer variable in most programming
languages.)
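As a small C++ illustration of the abstract data type point (a sketch of ours, not from the original text): each declaration of the class below creates a separate instance with its own internal state, unlike a package.
#include <iostream>
#include <vector>

// A stack abstraction providing push, pop and top.
class Stack {
public:
    void push(int v) { items.push_back(v); }
    void pop() { items.pop_back(); }
    int top() const { return items.back(); }
    bool empty() const { return items.empty(); }
private:
    std::vector<int> items;
};

int main() {
    Stack a, b; // two independent instances of the same abstraction
    a.push(1);
    b.push(2);
    std::cout << a.top() << ' ' << b.top() << std::endl; // prints: 1 2
    return 0;
}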
Introduction of C++:
C++ is a high-level, general-purpose programming language created by Danish computer
scientist Bjarne Stroustrup. First released in 1985 as an extension of the C programming language,
it has since expanded significantly over time; as of 1997, C++ has object-oriented, generic,
and functional features, in addition to facilities for low-level memory manipulation.
C++ was designed with systems programming and embedded, resource-constrained software
and large systems in mind, with performance, efficiency, and flexibility of use as its design
highlights. C++ has also been found useful in many other contexts, with key strengths being
software infrastructure and resource-constrained applications, including desktop
applications, video games, servers (e.g. e-commerce, web search, or databases), and performance-
critical applications (e.g. telephone switches or space probes).
Language:
The C++ language has two main components: a direct mapping of hardware features provided
primarily by the C subset, and zero-overhead abstractions based on those mappings. Stroustrup
describes C++ as "a light-weight abstraction programming language [designed] for building and
using efficient and elegant abstractions" and "offering both hardware access and abstraction is
the basis of C++. Doing it efficiently is what distinguishes it from other languages."
C++ inherits most of C's syntax. The following is Bjarne Stroustrup's version of the Hello World
program that uses the C++ Standard Library stream facility to write a message to standard output.
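#include <iostream>

int main()
{
    std::cout << "Hello, world!\n";
}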
Standard library:
The C++ standard consists of two parts: the core language and the standard library. C++
programmers expect the latter on every major implementation of C++; it includes aggregate types
(vectors, lists, maps, sets, queues, stacks, arrays, tuples), algorithms (find, for each, binary search,
random shuffle, etc.), input/output facilities (iostream, for reading from and writing to the console
and files), filesystem library, localization support, smart pointers for automatic memory
management, regular expression support, multi-threading library, atomics support (allowing a
variable to be read or written to by at most one thread at a time without any external
synchronization), time utilities (measurement, getting current time, etc.), a system for converting
error reporting that does not use C++ exceptions into C++ exceptions, a random number
generator and a slightly modified version of the C standard library (to make it comply with the
C++ type system).
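As a brief illustration (a minimal sketch of ours, not an exhaustive tour), the following program combines three of these facilities: an aggregate type (vector), an algorithm (sort), and iostream output.
#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> values = {3, 1, 4, 1, 5};  // an aggregate type
    std::sort(values.begin(), values.end());    // an algorithm
    for (int v : values) std::cout << v << ' '; // iostream output
    std::cout << std::endl;                     // prints: 1 1 3 4 5
    return 0;
}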
Benefits of C++ Programming:
There are many reasons to use the C++ programming language to develop software and
applications, beyond its popularity and wide range of use cases. We highlight just a few of the
many advantages and benefits of using C++ below.
Compatibility with C:
Given that C++ is a derivative or extension of C and, in fact, uses practically all of the C syntax,
it is no surprise that C++ is highly compatible with C. If there is a valid C application, it is
compatible with C++ by its very nature.
Community:
As one of the oldest, most popular programming languages, C++ enjoys a vast community of
developers. A large portion of that community helps add to the continued functionality of C++ by
creating libraries that extend the language's natural abilities.
Portability:
C++ is platform-independent and portable, meaning it can run on any operating system or
machine. Because of this, programs developers create in C++ will not be limited to a single OS
environment or require any further coding to work on other operating systems. This increases the
programmer's audience reach and limits the number of iterations of an application a coder will
have to make.
Memory Management:
With C++, developers have complete control over memory management, which technically
counts as both an advantage and a disadvantage of programming in C++. It is a bonus to developers
because they have more control over memory allocation. It is a negative in that the coder must be
responsible for the management of memory, instead of giving that task to a garbage collector, as
other programming languages do.
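A minimal sketch of this trade-off (our own illustration, not part of the original lab text): manual allocation with new/delete, where the programmer must release memory, next to a smart pointer that releases it automatically.
#include <iostream>
#include <memory>

int main() {
    // Manual memory management: every new must be paired with a delete.
    int* raw = new int(42);
    std::cout << *raw << std::endl;
    delete raw; // forgetting this line would leak memory

    // A smart pointer releases its memory automatically at end of scope.
    std::unique_ptr<int> smart = std::make_unique<int>(42);
    std::cout << *smart << std::endl;
    return 0;
}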
Embedded Systems:
Because C++ is close to the hardware and allows direct control over memory and other
low-level resources, it is widely used for the embedded, resource-constrained software
mentioned above.
Scalability:
Scalability – the ability to scale an application to serve varying degrees of users or purposes – is
an important element to any modern piece of software. C++ programs are highly scalable and
capable of serving small amounts of data or large amounts of information.
Standard Template Library:
C++ has the Standard Template Library (STL) that provides libraries – or functions – of pre-
written code that developers can use to save time instead of writing common functionality in their
software. These libraries help coders work more efficiently, type fewer lines of code, write
programs faster, and avoid errors.
Speed:
C++ is a fast programming language in terms of both execution and compilation. The speed of
both is much quicker than with other general-purpose development languages. Further, C++ excels
at concurrency, which is important for web servers, databases, and other high-load server
applications.
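As a small sketch of that concurrency support (our own illustration using the standard <thread> facility):
#include <iostream>
#include <string>
#include <thread>
#include <vector>

int main() {
    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i) {
        // Build each line as one string so the output is not interleaved.
        workers.emplace_back([i] { std::cout << ("worker " + std::to_string(i) + "\n"); });
    }
    for (std::thread& t : workers) t.join(); // wait for all threads to finish
    return 0;
}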
Lab # 2
Lexical Analyzer:
A lexical analyzer, often referred to as a lexer or scanner, is a crucial component of a compiler
or interpreter. Its primary function is to read the source code of a program and break it down into
smaller units called tokens. These tokens are the basic building blocks of a programming language
and represent the smallest meaningful units in the code, such as keywords, identifiers, literals, and
operators.
It converts the high-level input program into a sequence of tokens.
Lexical analysis can be implemented with deterministic finite automata (DFA).
The output is a sequence of tokens that is sent to the parser for syntax analysis.
What is a token? A lexical token is a sequence of characters that can be treated as a unit in the
grammar of the programming language. Examples of tokens:
Type tokens (id, number, real, ...)
Punctuation tokens (IF, void, return, ...)
Alphabetic tokens (keywords)
Keywords: for, while, if, etc.
Identifiers: variable names, function names, etc.
Operators: '+', '++', '-', etc.
Separators: ',', ';', etc.
Example of Non-Tokens:
Comments, preprocessor directive, macros, blanks, tabs, newline, etc.
Lexeme: The sequence of characters matched by a pattern to form the corresponding token, or a
sequence of input characters that comprises a single token, is called a lexeme, e.g. "float",
"abs_zero_Kelvin", "=", "-", "273", ";".
1. Input preprocessing: This stage involves cleaning up the input text and preparing it for
lexical analysis. This may include removing comments, whitespace, and other non-essential
characters from the input text.
2. Tokenization: This is the process of breaking the input text into a sequence of tokens. This
is usually done by matching the characters in the input text against a set of patterns or regular
expressions that define the different types of tokens.
3. Token classification: In this stage, the lexer determines the type of each token. For
example, in a programming language, the lexer might classify keywords, identifiers,
operators, and punctuation symbols as separate token types.
4. Token validation: In this stage, the lexer checks that each token is valid according to the
rules of the programming language. For example, it might check that a variable name is a
valid identifier, or that an operator has the correct syntax.
5. Output generation: In this final stage, the lexer generates the output of the lexical analysis
process, which is typically a list of tokens. This list of tokens can then be passed to the next
stage of compilation or interpretation.
Advantages:
Efficiency
Flexibility
Error Detection
Recognition of operators/variables:
In a lexical analyzer or lexer, the recognition of operators and variables involves identifying and
categorizing sequences of characters in the source code into tokens that represent operators and
variables. Here's how this recognition typically works:
Operators Recognition:
Operators are symbols in the source code that represent various operations, such as addition,
subtraction, multiplication, and comparison. To recognize operators, the lexer typically follows
these steps:
It scans the source code character by character.
When it encounters a character that could be the start of an operator (e.g., "+", "-",
"*", "="), it begins forming a token.
The lexer continues scanning characters until it determines that the token is
complete or until it encounters a character that cannot be part of the operator.
Once the lexer identifies the operator token, it categorizes it as an operator token
and records the specific operator it represents (e.g., "+" for addition, "=" for
assignment).
Variables Recognition:
Variables are names given to values or objects in the program. To recognize variables, the lexer
generally follows these steps:
It scans the source code character by character.
When it encounters a character that could be the start of a variable name (e.g., a letter or
an underscore), it begins forming a token.
The lexer continues scanning characters until it determines that the token is complete or
until it encounters a character that cannot be part of a variable name (e.g., a space or
punctuation).
Once the lexer identifies the variable token, it categorizes it as a variable token and records
the variable's name.
Here's an example in C++ (a minimal sketch of the idea; the source string and the helper isOperatorChar are our own illustration):
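#include <iostream>
#include <string>
#include <cctype>

// Returns true for characters that can start or extend an operator in this sketch.
bool isOperatorChar(char c) {
    return c == '+' || c == '-' || c == '*' || c == '/' || c == '=';
}

int main() {
    std::string source = "sum = a + b;";
    size_t i = 0;
    while (i < source.size()) {
        char c = source[i];
        if (isspace(static_cast<unsigned char>(c))) { ++i; continue; }
        if (isalpha(static_cast<unsigned char>(c)) || c == '_') {
            // Variable name: letters, digits and underscores.
            size_t start = i;
            while (i < source.size() &&
                   (isalnum(static_cast<unsigned char>(source[i])) || source[i] == '_')) ++i;
            std::cout << "Variable: " << source.substr(start, i - start) << std::endl;
        } else if (isOperatorChar(c)) {
            // Operator: extend while the next character continues it (e.g. "++").
            size_t start = i;
            while (i < source.size() && isOperatorChar(source[i])) ++i;
            std::cout << "Operator: " << source.substr(start, i - start) << std::endl;
        } else {
            ++i; // skip separators such as ';' in this sketch
        }
    }
    return 0;
}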
Recognition of keywords/constants:
Recognition of keywords and constants in a lexical analyzer follows a similar pattern to the
recognition of operators and variables. Let's break down how the lexer typically identifies
keywords and constants in the source code:
Keywords Recognition:
Keywords are reserved words in the programming language that have a special meaning.
Examples of keywords include "if," "while," "int," "for," "return," etc. To recognize keywords, the
lexer generally follows these steps:
1. It scans the source code character by character, just like in the case of operators and
variables.
2. When it encounters a character that could be the start of a keyword (e.g., a letter), it begins
forming a token.
3. The lexer continues scanning characters until it determines that the token is complete or
until it encounters a character that cannot be part of a keyword (e.g., a space or
punctuation).
4. Once the lexer identifies the keyword token, it categorizes it as a "Keyword" token and
records the specific keyword it represents (e.g., "if" for conditional statements, "int" for
declaring integer types).
Here's an example in C++ (a minimal sketch; the keyword list shown is a small illustrative subset):
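#include <iostream>
#include <string>
#include <unordered_set>
#include <cctype>

int main() {
    // A small illustrative subset of the language's keywords.
    const std::unordered_set<std::string> keywords = {"if", "while", "int", "for", "return"};
    std::string source = "if (count) return count;";
    size_t i = 0;
    while (i < source.size()) {
        if (!isalpha(static_cast<unsigned char>(source[i]))) { ++i; continue; }
        // Collect a whole word, then decide whether it is a keyword.
        size_t start = i;
        while (i < source.size() && isalnum(static_cast<unsigned char>(source[i]))) ++i;
        std::string word = source.substr(start, i - start);
        if (keywords.count(word))
            std::cout << "Keyword: " << word << std::endl;
        else
            std::cout << "Identifier: " << word << std::endl;
    }
    return 0;
}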
Constants Recognition:
Constants in a programming language can be numeric literals (integers or floating-point
numbers), string literals, character literals, or other special values. To recognize constants, the
lexer typically follows these steps:
1. It scans the source code character by character, similar to the previous cases.
2. When it encounters a character that could be the start of a constant (e.g., a digit for numeric
literals or a quote for string literals), it begins forming a token.
3. The lexer continues scanning characters until it determines that the token is complete or
until it encounters a character that cannot be part of the constant (e.g., a space or
punctuation).
4. Once the lexer identifies the constant token, it categorizes it as a "Constant" token and
records the specific value it represents (e.g., "42" for an integer literal, "Hello, World!" for
a string literal).
Here's an example, given in C++ for consistency with the rest of this manual (a minimal sketch with an illustrative source string):
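#include <iostream>
#include <string>
#include <cctype>

int main() {
    std::string source = "x = 42; s = \"Hello, World!\";";
    size_t i = 0;
    while (i < source.size()) {
        char c = source[i];
        if (isdigit(static_cast<unsigned char>(c))) {
            // Numeric literal: consume digits (and a decimal point).
            size_t start = i;
            while (i < source.size() &&
                   (isdigit(static_cast<unsigned char>(source[i])) || source[i] == '.')) ++i;
            std::cout << "Numeric constant: " << source.substr(start, i - start) << std::endl;
        } else if (c == '"') {
            // String literal: consume up to the closing quote.
            size_t start = ++i;
            while (i < source.size() && source[i] != '"') ++i;
            std::cout << "String constant: " << source.substr(start, i - start) << std::endl;
            ++i; // skip the closing quote
        } else {
            ++i; // everything else is not a constant in this sketch
        }
    }
    return 0;
}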
CODE
C++ code to generate the lexical analyzer with the Recognition of operators/variables.
C++ code to generate the lexical analyzer with the Recognition of keywords/constants.
Activity
Generate a calculator which also shows the previous value.
Lab # 3
Input Buffering in Compiler Design:
The lexical analyzer has to access secondary memory each time to identify tokens. This is time-
consuming and costly. So the input strings are stored in a buffer and then scanned by the
lexical analyzer.
The lexical analyzer scans the input string from left to right one character at a time to identify
tokens. It uses two pointers to scan tokens −
Begin Pointer (bptr) − It points to the beginning of the string to be read.
Look Ahead Pointer (lptr) − It moves ahead to search for the end of the token.
Example − For the statement int a, b;
Both pointers start at the beginning of the string, which is stored in the buffer.
The character ("blank space") beyond the token ("int") has to be examined before
the token ("int") can be determined.
After processing the token ("int"), both pointers are set to the next token ('a'), and this
process is repeated for the whole program.
A buffer can be divided into two halves. If the lookahead pointer moves past the
halfway point of the first half, the second half is filled with new characters to be read. If the
lookahead pointer moves towards the right end of the second half, the first half
will be filled with new characters, and so on.
Sentinels − Each time the forward pointer is moved, a check must be made to ensure that
it has not moved off one half of the buffer; if it has, the other half must be reloaded. To
reduce these two tests per advance to one, a sentinel character (typically eof) is placed at
the end of each buffer half.
Buffer Pairs − A specialized buffering technique can decrease the amount of overhead
needed to process an input character. It uses two buffers, each of N characters, which are
reloaded alternately.
Two pointers, lexemeBegin and forward, are maintained. lexemeBegin points to the
beginning of the current lexeme, which is yet to be discovered. forward scans ahead until a
match for a pattern is discovered. Once the lexeme is determined, lexemeBegin is set to
the character directly after the lexeme just found, and forward is set to the character at its
right end.
Code
Code to build a compiler using the buffer.
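A minimal sketch of the two-pointer scan (our own illustration, assuming the input is already loaded into a single buffer; a full solution would add the second buffer half and sentinels):
#include <iostream>
#include <string>
#include <cctype>

int main() {
    std::string buffer = "int a, b;"; // input already loaded into the buffer
    size_t bptr = 0;                  // begin pointer
    while (bptr < buffer.size()) {
        if (isspace(static_cast<unsigned char>(buffer[bptr]))) { ++bptr; continue; }
        size_t lptr = bptr;           // lookahead pointer
        if (isalpha(static_cast<unsigned char>(buffer[lptr]))) {
            // Word-like token: keep advancing the lookahead pointer.
            while (lptr < buffer.size() &&
                   isalnum(static_cast<unsigned char>(buffer[lptr]))) ++lptr;
        } else {
            ++lptr;                   // single-character token such as ',' or ';'
        }
        std::cout << "Token: " << buffer.substr(bptr, lptr - bptr) << std::endl;
        bptr = lptr;                  // both pointers move past the token just found
    }
    return 0;
}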
Lab # 4
Symbol Table
A symbol table is an important data structure created and maintained by compilers in order
to store information about the occurrence of various entities such as variable names,
function names, objects, classes, interfaces, etc. The symbol table is used by both the analysis
and the synthesis parts of a compiler.
A symbol table may serve the following purposes depending upon the language in hand:
To store the names of all entities in a structured form in one place.
To verify if a variable has been declared.
To implement type checking by verifying that assignments and expressions are semantically correct.
To determine the scope of a name (scope resolution).
A symbol table is simply a table that can be either linear or a hash table. It maintains an
entry for each name in the following format:
<symbol name, type, attribute>
For example, if a symbol table has to store information about the following variable
declaration:
static int interest;
then it should store an entry such as:
<interest, int, static>
Implementation:
If a compiler is to handle a small amount of data, then the symbol table can be implemented
as an unordered list, which is easy to code, but it is only suitable for small tables. A symbol
table can be implemented in one of the following ways:
Linear (sorted or unsorted) list
Binary search tree
Hash table
Among these, symbol tables are mostly implemented as hash tables, where the source code
symbol itself is treated as the key for the hash function and the return value is the information
about the symbol.
Operations:
A symbol table, either linear or hash, should provide the following operations.
insert()
This operation is used more frequently by the analysis phase, i.e., the first half of the compiler,
where tokens are identified and names are stored in the table. This operation is used to add
information in the symbol table about unique names occurring in the source code. The
format or structure in which the names are stored depends upon the compiler in hand.
An attribute for a symbol in the source code is the information associated with that symbol.
This information contains the value, state, scope, and type of the symbol. The insert()
function takes the symbol and its attributes as arguments and stores the information in the
symbol table.
For example:
int a;
insert(a, int);
lookup()
The format of the lookup() function varies according to the programming language. The basic
format should match the following:
lookup(symbol)
This method returns 0 (zero) if the symbol does not exist in the symbol table. If the symbol
exists in the symbol table, it returns its attributes stored in the table.
Scope Management:
A compiler maintains two types of symbol tables: a global symbol table which can be
accessed by all the procedures and scope symbol tables that are created for each scope in
the program.
To determine the scope of a name, symbol tables are arranged in a hierarchical structure as
shown in the example below:
...
int value=10;
void pro_one()
{
int one_1;
int one_2;
{ \
int one_3; |_ inner scope 1
int one_4; |
} /
int one_5;
{ \
int one_6; |_ inner scope 2
int one_7; |
} /
}
void pro_two()
{
int two_1;
int two_2;
{ \
int two_3; |_ inner scope 3
int two_4; |
} /
int two_5;
}
...
The global symbol table contains names for one global variable (int value) and two
procedure names, which should be available to all the child nodes shown above. The names
mentioned in the pro_one symbol table (and all its child tables) are not available to the
pro_two symbol table and its child tables.
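One common way to realize this hierarchy (a sketch of ours, not the lab's prescribed design) is a stack of per-scope tables: lookup() searches from the innermost scope outwards, so names declared inside pro_one are invisible elsewhere.
#include <iostream>
#include <map>
#include <string>
#include <vector>

class ScopedSymbolTable {
public:
    void enterScope() { scopes.push_back({}); }
    void exitScope() { scopes.pop_back(); }
    void insert(const std::string& name, const std::string& type) { scopes.back()[name] = type; }
    // Search from the innermost scope outwards to the global scope.
    bool lookup(const std::string& name, std::string& type) const {
        for (auto it = scopes.rbegin(); it != scopes.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) { type = found->second; return true; }
        }
        return false;
    }
private:
    std::vector<std::map<std::string, std::string>> scopes;
};

int main() {
    ScopedSymbolTable table;
    table.enterScope();           // global scope
    table.insert("value", "int");
    table.enterScope();           // pro_one's scope
    table.insert("one_1", "int");
    std::string type;
    std::cout << (table.lookup("value", type) ? "value is visible" : "value not found") << std::endl;
    table.exitScope();            // leaving pro_one
    std::cout << (table.lookup("one_1", type) ? "one_1 is visible" : "one_1 is not visible") << std::endl;
    return 0;
}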
Code
Code to Create the Symbol Table.
#include <iostream>
#include <string>
#include <map>

// Information stored for each symbol: its type and its scope (e.g. "Global").
struct SymbolInfo { std::string type; std::string scope; };

class SymbolTable {
public:
    void insert(const std::string& name, const SymbolInfo& info) { table[name] = info; }
    // Returns the symbol's attributes, or nullptr if the name is not in the table.
    const SymbolInfo* lookup(const std::string& name) const {
        auto it = table.find(name);
        return it == table.end() ? nullptr : &it->second;
    }
private:
    std::map<std::string, SymbolInfo> table;
};

int main() {
    SymbolTable symbolTable;
    symbolTable.insert("value", {"int", "Global"});
    if (const SymbolInfo* info = symbolTable.lookup("value"))
        std::cout << "value : " << info->type << " (" << info->scope << ")" << std::endl;
    return 0;
}
Lab # 5
Parsing:
Parsing is the process of converting information from one form to another. The parser is
the component of the translator that organizes the linear text structure according to a set
of defined rules known as a grammar. Parsers are of two types:
Bottom-up parser
Top-down parser
Top-Down Parser:
A top-down parser in compiler design constructs a parse tree for an input string in preorder,
starting from the root. Equivalently, it builds a leftmost derivation for the input string:
starting from the grammar's start symbol, it chooses at each step a suitable production rule
so as to match the input string from left to right in sentential form.
Consider the lexical analyzer's input string 'acb' for the following grammar, using a
leftmost derivation.
S -> aAb
A -> cd | c
Starting from S, the parser applies S -> aAb and matches 'a'. For A it may first try A -> cd,
which matches 'c' but then fails on 'd' against 'b'; the parser backtracks, applies A -> c
instead, and the final 'b' matches, so the input is accepted.
Code
Code for top-down parsing.
#include <iostream>
#include <string>
#include <cctype>
#include <stdexcept>
#include <memory>

// Base class for nodes of the parse tree.
class TreeNode {
public:
    virtual ~TreeNode() = default;
    virtual double evaluate() const = 0;
    virtual void display(int depth) const = 0;
};

// A binary operator node: +, -, * or /.
class BinaryOpNode : public TreeNode {
public:
    BinaryOpNode(char op, std::shared_ptr<TreeNode> left, std::shared_ptr<TreeNode> right)
        : op(op), left(std::move(left)), right(std::move(right)) {}
    double evaluate() const override {
        double leftValue = left->evaluate();
        double rightValue = right->evaluate();
        if (op == '+') return leftValue + rightValue;
        if (op == '-') return leftValue - rightValue;
        if (op == '*') return leftValue * rightValue;
        if (rightValue == 0) {
            throw std::runtime_error("Division by zero");
        }
        return leftValue / rightValue;
    }
    void display(int depth) const override {
        std::cout << std::string(depth * 2, ' ') << "Binary Operator: " << op << std::endl;
        left->display(depth + 1);
        right->display(depth + 1);
    }
private:
    char op;
    std::shared_ptr<TreeNode> left;
    std::shared_ptr<TreeNode> right;
};

// A numeric literal node.
class NumberNode : public TreeNode {
public:
    explicit NumberNode(double value) : value(value) {}
    double evaluate() const override { return value; }
    void display(int depth) const override {
        std::cout << std::string(depth * 2, ' ') << "Number: " << value << std::endl;
    }
private:
    double value;
};

// Recursive-descent (top-down) parser for arithmetic expressions.
class Parser {
public:
    explicit Parser(const std::string& input) : input(input), position(0) {}
    std::shared_ptr<TreeNode> parse() {
        std::shared_ptr<TreeNode> result = expression();
        if (position != input.size()) throw std::runtime_error("Unexpected input after expression");
        return result;
    }
private:
    // expression -> term (('+' | '-') term)*
    std::shared_ptr<TreeNode> expression() {
        std::shared_ptr<TreeNode> left = term();
        while (position < input.size() && (input[position] == '+' || input[position] == '-')) {
            char op = input[position++];
            left = std::make_shared<BinaryOpNode>(op, left, term());
        }
        return left;
    }
    // term -> factor (('*' | '/') factor)*
    std::shared_ptr<TreeNode> term() {
        std::shared_ptr<TreeNode> left = factor();
        while (position < input.size() && (input[position] == '*' || input[position] == '/')) {
            char op = input[position++];
            left = std::make_shared<BinaryOpNode>(op, left, factor());
        }
        return left;
    }
    // factor -> '(' expression ')' | number
    std::shared_ptr<TreeNode> factor() {
        if (position < input.size() && input[position] == '(') {
            position++; // consume '('
            std::shared_ptr<TreeNode> result = expression();
            if (position >= input.size() || input[position] != ')') throw std::runtime_error("Expected ')'");
            position++; // consume ')'
            return result;
        } else if (position < input.size() && isdigit(static_cast<unsigned char>(input[position]))) {
            size_t end;
            double number = std::stod(input.substr(position), &end);
            position += end;
            return std::make_shared<NumberNode>(number);
        } else {
            throw std::runtime_error("Unexpected character in input");
        }
    }
    std::string input;
    size_t position;
};

int main() {
    std::string input;
    std::cout << "Enter an arithmetic expression: ";
    std::getline(std::cin, input);
    try {
        Parser parser(input);
        std::shared_ptr<TreeNode> parseTree = parser.parse();
        parseTree->display(0);
        std::cout << "Result: " << parseTree->evaluate() << std::endl;
    } catch (const std::exception& e) {
        std::cout << "Error: " << e.what() << std::endl;
    }
    return 0;
}
Lab # 6
Parsing:
As discussed in the previous lab, parsing is the process of converting information from one
form to another according to a set of defined rules known as a grammar. Parsers are of two
types:
Bottom-up parser
Top-down parser
Bottom-Up Parser:
Bottom-up parsers / shift-reduce parsers build the parse tree from the leaves to the root.
Bottom-up parsing can be defined as an attempt to reduce the input string w to the start
symbol of the grammar by tracing out the rightmost derivation of w in reverse.
E.g., for the grammar E -> E + E | id and the input id + id, a bottom-up parser performs
the reductions id + id => E + id => E + E => E, reaching the start symbol.
A general form of shift-reduce parsing is LR parsing. The L stands for scanning the input from
left to right and the R stands for constructing a rightmost derivation in reverse.
Benefits of LR parsing:
1. Many programming languages use some variation of an LR parser. It should be
noted that C++ and Perl are exceptions.
2. An LR parser can be implemented very efficiently.
3. Of all the parsers that scan their input from left to right, LR parsers detect syntactic
errors as soon as possible.
CODE
Code for Bottom-up Parser with the help of DFA.
#include <iostream>
#include <string>
#include <vector>
#include <cctype>

// States of a DFA that accepts strings of the form: number (operator number)*.
enum class State { START, NUMBER, OPERATOR, ERROR };

// Transition function of the DFA.
State transition(State currentState, char inputSymbol) {
    switch (currentState) {
    case State::START:
    case State::OPERATOR:
        // After the start or an operator, only a digit may follow.
        if (isdigit(static_cast<unsigned char>(inputSymbol))) {
            return State::NUMBER;
        }
        return State::ERROR;
    case State::NUMBER:
        if (isdigit(static_cast<unsigned char>(inputSymbol))) {
            return State::NUMBER;
        }
        if (inputSymbol == '+' || inputSymbol == '-' ||
            inputSymbol == '*' || inputSymbol == '/') {
            return State::OPERATOR;
        }
        return State::ERROR;
    default:
        return State::ERROR;
    }
}

// Run the DFA over the input, shifting each accepted symbol onto the parse stack.
std::vector<char> parseExpression(const std::string& input) {
    std::vector<char> parseStack;
    State currentState = State::START;
    for (char inputSymbol : input) {
        currentState = transition(currentState, inputSymbol);
        if (currentState == State::ERROR) {
            break;
        }
        parseStack.push_back(inputSymbol);
    }
    // A valid expression must end in a number (e.g. "1+2", not "1+").
    if (currentState != State::NUMBER) {
        parseStack.clear(); // Clear the stack if parsing fails
    }
    return parseStack;
}

int main() {
    std::string inputString;
    std::cout << "Enter an arithmetic expression: ";
    std::cin >> inputString;
    std::vector<char> parseStack = parseExpression(inputString);
    if (parseStack.empty()) {
        std::cout << "Error: Invalid expression" << std::endl;
    } else {
        std::cout << "Parse Stack:" << std::endl;
        for (char symbol : parseStack) {
            std::cout << symbol << std::endl;
        }
        std::cout << "Input String Length: " << inputString.length() << std::endl;
    }
    return 0;
}
Lab # 7
FIRST:
We saw the need for backtracking in the previous labs on parsing, and backtracking is a
complex process to implement. There is an easier way to avoid this problem:
If the compiler knew in advance what the first character of the string produced by a
production rule will be, it could compare it with the current character or token in the
input string and decide wisely which production rule to apply.
Example:
Let's take the following grammar:
S -> cAd
A -> bc|a
And the input string is “cad”.
Thus, in the example above, if the parser knew that after reading character 'c' in the input string
and applying S -> cAd, the next character in the input string is 'a', then it would ignore
the production rule A -> bc (because 'b' is the first character of the string produced by this
production rule, not 'a') and directly use the production rule A -> a (because 'a' is the
first character of the string produced by this production rule, and is the same as the current
character of the input string, which is also 'a').
FOLLOW:
The parser faces one more problem. Let us consider below grammar to understand this
problem.
A -> aBb
B -> c | ε
And suppose the input string is “ab” to parse.
As the first character in the input is a, the parser applies the rule A->aBb.
A
/| \
a B b
Now the parser checks the second character of the input string, which is b, and the
non-terminal to derive is B, but the parser can't get any string derivable from B that
has b as its first character.
But the grammar does contain a production rule B -> ε; if that is applied, then B will
vanish, and the parser gets the input "ab", as shown below. But the parser can apply it
only when it knows that the character that follows B in the production rule is the same as the
current character in the input.
Example:
In RHS of A -> aBb, b follows Non-Terminal B, i.e. FOLLOW(B) = {b}, and the
current input character read is also b. Hence the parser applies this rule. And it is able to
get the string “ab” from the given grammar.
A A
/ | \ / \
a B b => a b
|
ε
So FOLLOW can make a non-terminal vanish when needed to generate the string from
the parse tree.
The conclusion is that we need to find the FIRST and FOLLOW sets for a given grammar so
that the parser can properly apply the needed rule at the correct position. The code below
computes both sets for a small example grammar.
CODE
Implementation of the first and follow algorithm.
#include <iostream>
#include <set>
#include <map>
#include <vector>
#include <string>
#include <cctype>
using namespace std;

struct Production {
    char nonTerminal;
    string production; // "e" denotes epsilon
};

vector<Production> productions;

bool isNonTerminal(char c) { return isupper(c) != 0; }

set<char> computeFirst(char symbol); // forward declaration

// FIRST of a sentential form (a string of grammar symbols).
set<char> firstOfString(const string& alpha) {
    set<char> result;
    bool nullable = true;
    for (char symbol : alpha) {
        set<char> subFirst = computeFirst(symbol);
        for (char t : subFirst)
            if (t != 'e') result.insert(t); // add everything except epsilon
        if (subFirst.find('e') == subFirst.end()) {
            nullable = false;               // this symbol cannot vanish
            break;
        }
    }
    if (nullable) result.insert('e');       // every symbol can vanish
    return result;
}

// FIRST(X): the terminals that can begin a string derived from X.
set<char> computeFirst(char symbol) {
    set<char> firstSet;
    if (!isNonTerminal(symbol)) {           // terminal (or epsilon 'e')
        firstSet.insert(symbol);
        return firstSet;
    }
    for (const Production& production : productions) {
        if (production.nonTerminal != symbol) continue;
        set<char> subFirst = firstOfString(production.production);
        firstSet.insert(subFirst.begin(), subFirst.end());
    }
    return firstSet;
}

// FOLLOW(A): the terminals that can appear immediately after A; '$' marks end of input.
set<char> computeFollow(char nonTerminal) {
    set<char> followSet;
    if (nonTerminal == productions[0].nonTerminal)
        followSet.insert('$');              // the start symbol is followed by $
    for (const Production& production : productions) {
        const string& rhs = production.production;
        for (size_t pos = 0; pos < rhs.size(); ++pos) {
            if (rhs[pos] != nonTerminal) continue;
            set<char> subFirst = firstOfString(rhs.substr(pos + 1));
            for (char t : subFirst)
                if (t != 'e') followSet.insert(t);
            // If everything after A can vanish, FOLLOW(A) includes FOLLOW(LHS).
            if (subFirst.count('e') && production.nonTerminal != nonTerminal) {
                set<char> subFollow = computeFollow(production.nonTerminal);
                followSet.insert(subFollow.begin(), subFollow.end());
            }
        }
    }
    return followSet;
}

int main() {
    productions = {
        {'S', "AB"},
        {'A', "aA"},
        {'A', "e"},
        {'B', "bB"},
        {'B', "c"}
    };
    for (char nt : {'S', 'A', 'B'}) {
        cout << "FIRST(" << nt << ") = { ";
        for (char t : computeFirst(nt)) cout << t << ' ';
        cout << "}" << endl;
    }
    for (char nt : {'S', 'A', 'B'}) {
        cout << "FOLLOW(" << nt << ") = { ";
        for (char t : computeFollow(nt)) cout << t << ' ';
        cout << "}" << endl;
    }
    return 0;
}
Lab # 8
Semantic Analysis:
Semantic analysis is the third phase of the compiler. Semantic analysis makes sure
that the declarations and statements of the program are semantically correct. It is a collection of
procedures called by the parser as and when required by the grammar. Both the syntax tree
of the previous phase and the symbol table are used to check the consistency of the given
code. Type checking is an important part of semantic analysis, where the compiler makes sure
that each operator has matching operands.
Semantic Analyzer:
The semantic analyzer uses the syntax tree and the symbol table to check whether the given
program is semantically consistent with the language definition. It gathers type information and
stores it in either the syntax tree or the symbol table. This type information is subsequently used
by the compiler during intermediate-code generation.
Semantic Errors:
Errors recognized by the semantic analyzer are as follows:
Type mismatch
Undeclared variables
Reserved identifier misuse
Example:
float x = 10.1;
float y = x*30;
In the above example, the integer 30 will be type-cast to the float 30.0 before multiplication
by the semantic analyzer.
CODE
Implementation of semantic analyzer.
#include <iostream>
#include <string>
#include <unordered_set>
#include <unordered_map>
#include <sstream>

std::unordered_set<std::string> declaredVariables;
std::unordered_map<std::string, std::string> variableTypes;

// Add the variable to the symbol table and associate it with its type.
void declareVariable(const std::string& type, const std::string& name) {
    declaredVariables.insert(name);
    variableTypes[name] = type;
}

int main() {
    std::string input;
    while (true) {
        std::cout << "Enter 'declare <type> <name>' to declare a variable, or "
                     "'check <name>' to check if a variable is declared: ";
        if (!std::getline(std::cin, input) || input.empty())
            break;
        std::istringstream iss(input);
        std::string action;
        iss >> action;
        if (action == "declare") {
            std::string type, name;
            iss >> type >> name;
            declareVariable(type, name);
            std::cout << "Declared '" << name << "' of type " << type << std::endl;
        } else if (action == "check") {
            std::string name;
            iss >> name;
            if (declaredVariables.count(name))
                std::cout << "'" << name << "' is declared as " << variableTypes[name] << std::endl;
            else
                std::cout << "Semantic error: '" << name << "' is undeclared" << std::endl;
        }
    }
    return 0;
}
Lab # 9
Shift Reduce parser:
A shift-reduce parser attempts to construct the parse tree in the same manner as is
done in bottom-up parsing, i.e. the parse tree is constructed from the leaves (bottom) to the
root (up). A more general form of the shift-reduce parser is the LR parser.
This parser requires some data structures, i.e.
An input buffer for storing the input string.
A stack for storing and accessing the grammar symbols of the production rules.
Basic Operations:
Shift: This involves moving symbols from the input buffer onto the stack.
Reduce: If the handle appears on top of the stack, then it is reduced by using the
appropriate production rule, i.e. the RHS of a production rule is popped off the
stack and the LHS of the production rule is pushed onto the stack.
Accept: If only the start symbol is present on the stack and the input buffer is empty,
then the parsing action is called accept. When the accept action is obtained, it means
parsing has completed successfully.
Error: This is the situation in which the parser can perform neither a shift action nor a
reduce action, and not even an accept action.
CODE
Implementation of shift-reduce parser.
// Including Libraries
#include <bits/stdc++.h>
using namespace std;

// Driver Function: shift-reduce parser for the grammar E->2E2, E->3E3, E->4
int main()
{
    printf("GRAMMAR is -\nE->2E2 \nE->3E3 \nE->4\n");
    // a is the input string
    string a = "32423";
    string stk;   // the parse stack
    size_t i = 0; // position of the next input symbol
    // Printing the initial configuration
    printf("\n$%s\t%s$\t", stk.c_str(), a.c_str());
    while (true) {
        // Reduce whenever a handle appears on top of the stack.
        if (!stk.empty() && stk.back() == '4') {
            stk.back() = 'E';
            printf("Reduce E->4\n$%s\t%s$\t", stk.c_str(), a.c_str() + i);
        } else if (stk.size() >= 3 && stk[stk.size() - 2] == 'E' &&
                   stk[stk.size() - 3] == stk.back() &&
                   (stk.back() == '2' || stk.back() == '3')) {
            char d = stk.back();
            stk.erase(stk.size() - 3);
            stk += 'E';
            printf("Reduce E->%cE%c\n$%s\t%s$\t", d, d, stk.c_str(), a.c_str() + i);
        } else if (i < a.size()) {
            // Otherwise shift the next input symbol onto the stack.
            stk += a[i++];
            printf("Shift\n$%s\t%s$\t", stk.c_str(), a.c_str() + i);
        } else {
            break; // neither shift nor reduce is possible
        }
    }
    printf("%s\n", stk == "E" ? "Accept" : "Error");
    return 0;
}