Chapter 3
Each interior node represents an operation.
The children of the node represent the arguments of the operation.
Key concepts of Syntax Analysis
Syntax: the set of rules, principles, and processes that
govern the structure of sentences in a given language,
specifically word order and hierarchical structure.
Parsing: the process of analyzing a sequence of input tokens
(words or symbols) to determine its grammatical structure.
This structure is often represented as a parse tree or syntax
tree.
Grammar: a formal grammar defines the syntactic rules of a
language. Common types of grammars used in parsing include
context-free grammars (CFGs) and regular grammars.
Parse tree: a tree representation that depicts the
syntactic structure of a string according to some formal
grammar. The tree's nodes represent syntactic categories, and
the edges represent the relationships between these categories.
Syntax Analysis
Figure: Syntax Tree for Assignment Statement
Context-free grammar (CFG)
Context-Free Grammars (CFGs) are a type of formal grammar used
to define the syntax of programming languages and other formal
languages.
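CFGs are good for describing the syntactic structure of programs.
They can define the context-free languages, a strict superset of the
regular languages, i.e. they are more powerful than regular expressions.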
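Formally, a CFG is a 4-tuple G = (V, T, P, S):
V = variables or nonterminals, a finite set
T = terminals or tokens, a finite set
P = productions, a finite set
S = start symbol, S ∈ V
Productions have the form A → α, where
A ∈ V: a single nonterminal
α: a string of terminals and/or nonterminals
• Rules say how to rewrite nonterminals (beginning with
the start symbol) into terminals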
Context-free grammar (CFG)
The four components of an example grammar:
V = {S}
T = {a, b, e}
P: S → aSbS | bSaS | e (3 productions)
S = S (the start symbol)
Notational Conventions
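To avoid always having to state that "these are the terminals,"
"these are the nonterminals," and so on, we shall employ the
following notational conventions with regard to grammars.
1. These symbols are terminals:
a) Lower-case letters early in the alphabet such as a, b, c.
b) Operator symbols such as +, -, *, / etc.
c) Punctuation symbols such as parentheses, comma.
d) The digits 0, 1, ..., 9.
e) Boldface strings such as id or if.
2. These symbols are nonterminals:
a) Upper-case letters early in the alphabet such as A, B, C.
b) The letter S, which, when it appears, is usually the
start symbol.
c) Lower-case italic names such as expr or stmt.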
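3. Lower-case letters late in the alphabet, chiefly u, v, ..., z, represent
strings of terminals.
4. Lower-case Greek letters, α, β, γ for example, represent
strings of grammar symbols. Thus, a generic production could be
written as A → α, indicating that there is a single nonterminal A
on the left of the arrow (the left side of the production)
and a string of grammar symbols α to the right of
the arrow (the right side of the production).
5. If A → α1, A → α2, ..., A → αk are all productions with A
on the left (we call them A-productions), we may
write A → α1 | α2 | ... | αk. We call α1, α2, ..., αk the
alternatives for A.
6. Unless otherwise stated, the left side of the first production is
the start symbol.
Using these shorthands, we could write the grammar of
expressions concisely as:
Expr → Expr A Expr | (Expr) | id
A → + | - | * | / | %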
Derivation
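Productions are treated as rewriting rules to generate a string; this
process is called derivation.
Show that a sentence is in the language (of the grammar):
• Start with the start symbol
• Repeatedly replace one of the nonterminals by a right side
of a production
• Stop when the sentence contains only terminals
At each step, we choose a nonterminal to replace.
• This choice can lead to different derivations.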
Left-most derivation
A leftmost derivation of a sentential form is one
in which rules transforming the leftmost nonterminal are
always applied.
Grammar:
Expr → Expr + Expr
Expr → Expr * Expr
Expr → id
V = {Expr}, T = {id, +, *}, P = 3 productions, S = Expr
String: id + id * id (for example 2 + 3 * 4)
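The leftmost derivation of id + id * id can be traced step by step. The sketch below is only an illustration: the helper name `leftmost_step` and the chosen order of productions are assumptions, and a different order of productions gives a different derivation of the same string.

```python
# Grammar from the slide: Expr -> Expr + Expr | Expr * Expr | id
NONTERMINALS = {"Expr"}

def leftmost_step(form, body):
    """Replace the leftmost nonterminal in `form` with the production body."""
    for i, symbol in enumerate(form):
        if symbol in NONTERMINALS:
            return form[:i] + body + form[i + 1:]
    raise ValueError("no nonterminal left to rewrite")

# One possible sequence of production bodies (an assumption of this sketch).
choices = [
    ["Expr", "+", "Expr"],
    ["id"],
    ["Expr", "*", "Expr"],
    ["id"],
    ["id"],
]

form = ["Expr"]
steps = [form]
for body in choices:
    form = leftmost_step(form, body)
    steps.append(form)

for s in steps:
    print(" ".join(s))
```

Each printed line is a sentential form; the last one contains only terminals, so it is a sentence of the grammar.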
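Left factoring
A predictive parser must be able to choose between the alternatives of a
nonterminal by looking at the next input symbol.
Sometimes, we can transform a grammar to have this property:
For each nonterminal A, find the longest prefix α common to two or
more of its alternatives.
If α ≠ ε, replace A → αβ1 | αβ2 | ... | αβn | γ with
A → αA' | γ
A' → β1 | β2 | ... | βn, where a βi may be ε
Example:
<stmt> ::= if < > then <stmt>
| if < > then <stmt> else <stmt>
| <otherstmt>
<stmt> ::= <otherstmt>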
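One left-factoring step can be sketched as follows. This is a minimal illustration, not a full algorithm: it factors only the single longest shared prefix once, the function names are hypothetical, and the symbol `cond` stands in for the condition nonterminal whose name is missing on the slide.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]

def left_factor_once(head, alternatives, new_name):
    """Factor the longest prefix shared by two or more alternatives.

    Returns a dict of productions. An empty list stands for epsilon.
    The grammar is returned unchanged if no two alternatives share a prefix.
    """
    best = []
    for i in range(len(alternatives)):
        for j in range(i + 1, len(alternatives)):
            prefix = common_prefix(alternatives[i], alternatives[j])
            if len(prefix) > len(best):
                best = prefix
    if not best:
        return {head: alternatives}
    n = len(best)
    factored = [alt[n:] for alt in alternatives if alt[:n] == best]
    others = [alt for alt in alternatives if alt[:n] != best]
    return {head: [best + [new_name]] + others, new_name: factored}

# "cond" is a placeholder name (assumption); "stmt_tail" is the new nonterminal.
stmt_alternatives = [
    ["if", "cond", "then", "stmt"],
    ["if", "cond", "then", "stmt", "else", "stmt"],
    ["otherstmt"],
]
result = left_factor_once("stmt", stmt_alternatives, "stmt_tail")
for head, bodies in result.items():
    print(head, "->", " | ".join(" ".join(b) or "epsilon" for b in bodies))
```

After factoring, the parser only has to decide between `else` and epsilon once it has already consumed the shared prefix.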
means in the
equal
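Ambiguous grammar:
E → E + E
E → id
Rewritten grammar:
E → E + T | T
T → id
String: id + id + id
Parse tree with the rewritten grammar (figure): the root E expands to E + T,
the left E expands again to E + T, and each remaining E and T derives id,
so the string is grouped as (id + id) + id.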
Resolving ambiguity
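Concerned about the priority of operators: an operator that appears
lower in the parse tree gets the higher priority.
Parse tree for id + id * id (figure): the root E expands to E + T; the left E
derives id through T and F, and the right T expands to T * F, where T and F
each derive id.
With 2 + 3 * 4 the tree gives 14 = 2 + (3 * 4).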
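The effect of the layered nonterminals can be checked with a small evaluator. This is a sketch under assumptions: it takes the grammar to be E → E + T | T, T → T * F | F, F → number (read off the tree above), and it replaces the left-recursive rules with loops so that a simple recursive-descent routine can follow them.

```python
def evaluate(tokens):
    """Evaluate a token list such as [2, '+', 3, '*', 4]."""
    pos = 0

    def factor():            # F -> number
        nonlocal pos
        value = tokens[pos]
        pos += 1
        return value

    def term():              # T -> T * F | F, written as a loop
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            value *= factor()
        return value

    def expr():              # E -> E + T | T, written as a loop
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            value += term()
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return result

print(evaluate([2, "+", 3, "*", 4]))  # 14
```

Because `*` is handled one level deeper than `+`, the multiplication is completed before the addition sees its right operand.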
Resolving ambiguity
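Concerned about the priority of operators.
Ambiguous grammar:
E → E or E
E → E and E
E → not E
Rewritten grammar:
E → E or T | T
T → T and F | F
F → not F | true | false
String: id or id and (not id)
Parse tree (figure): the root E expands to E or T, the right T expands to
T and F, and the last F expands to not F.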
Top-down parsers:
• Start at the root of the derivation tree and fill in
• Pick a production and try to match the input
• Some grammars are backtrack-free (predictive)
Bottom-up parsers:
• Start at the leaves and fill in
• Up to a valid for
• Use a stack to store both state and sentential forms
Top-Down Parser
A top-down parser tries to create a parse tree from the root towards
the leaves, scanning the input from left to right.
It can also be viewed as finding a leftmost derivation for an
input string.
Grammar:
E -> TE’
E’ -> +TE’ | Ɛ
T -> FT’
T’ -> *FT’ | Ɛ
F -> (E) | id
Figure: Top-down parsing. A sequence of parse trees for the input
id + id * id, each obtained from the previous one by a leftmost
derivation step (lm).
Bottom-up parser
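Constructs a parse tree for an input string beginning at the leaves
(the bottom) and working towards the root (the top).
Grammar:
T -> T * F | F
F -> id
Figure: Bottom-up parsing. A sequence of partial parse trees for the
input id * id, built from the leaves upwards.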
Top-down parser
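First() and Follow()
First(α) is the set of terminals that begin strings derived from α.
• If α ⇒* Ɛ then Ɛ is also in First(α)
• In predictive parsing, when we have A → α | β, if First(α) and
First(β) are disjoint sets then we can select the appropriate
A-production by looking at the next input symbol
Follow(A), for any nonterminal A, is the set of terminals a that
can appear immediately after A in some sentential form.
• If we have S ⇒* αAaβ for some α and β, then a is in Follow(A)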
Production      First()          Follow()
S → Bb | Cd     {a, b, c, d}     {$}
B → aB | Ɛ      {a, Ɛ}           {b}
C → cC | Ɛ      {c, Ɛ}           {d}
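The First and Follow sets in the table can be computed by repeating the rules until nothing changes. The sketch below is one way to do it, not the only one; the function names and the dictionary representation of the grammar (an empty list stands for Ɛ) are assumptions made for this example.

```python
EPS = "ε"
END = "$"

def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)  # every symbol can vanish, or the sequence is empty
    return result

def compute_first_follow(grammar, start):
    """grammar maps each nonterminal to a list of bodies (lists of symbols)."""
    terminals = {s for bodies in grammar.values() for b in bodies for s in b
                 if s not in grammar}
    first = {t: {t} for t in terminals}
    first.update({nt: set() for nt in grammar})
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)

    changed = True
    while changed:  # repeat until no set grows any more
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new_first = first_of_string(body, first)
                if not new_first <= first[head]:
                    first[head] |= new_first
                    changed = True
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue  # FOLLOW is defined for nonterminals only
                    rest = first_of_string(body[i + 1:], first)
                    new_follow = rest - {EPS}
                    if EPS in rest:
                        new_follow |= follow[head]
                    if not new_follow <= follow[sym]:
                        follow[sym] |= new_follow
                        changed = True
    return first, follow

grammar = {
    "S": [["B", "b"], ["C", "d"]],
    "B": [["a", "B"], []],
    "C": [["c", "C"], []],
}
first, follow = compute_first_follow(grammar, "S")
for nt in grammar:
    print(nt, sorted(first[nt]), sorted(follow[nt]))
```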
Construction of predictive parsing table
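• For each production A → α in the grammar do the following:
1. For each terminal a in First(α), add A → α to M[A, a].
2. If Ɛ is in First(α), then for each terminal b in Follow(A), add
A → α to M[A, b].
3. If Ɛ is in First(α) and $ is in Follow(A), add A → α
to M[A, $] as well.
• If, after performing the above, there is no production
in M[A, a], then set M[A, a] to error.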
Example
• For the grammar:
Production          First()      Follow()
E → TE’             {id, (}      {$, )}
E’ → +TE’ | Ɛ       {+, Ɛ}       {$, )}
T → FT’             {id, (}      {+, $, )}
T’ → *FT’ | Ɛ       {*, Ɛ}       {+, $, )}
F → id | (E)        {id, (}      {*, +, $, )}
Construction of predictive parsing table
       id          +            *            (           )          $
E      E → TE’                               E → TE’
E’                 E’ → +TE’                             E’ → Ɛ     E’ → Ɛ
T      T → FT’                               T → FT’
T’                 T’ → Ɛ       T’ → *FT’                T’ → Ɛ     T’ → Ɛ
F      F → id                                F → (E)
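The table above can be reproduced from the First and Follow sets. The following is a minimal sketch: the sets are copied from the example rather than computed, the names `GRAMMAR`, `FIRST`, `FOLLOW` and `build_table` are assumptions of this example, and an empty list stands for Ɛ.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["id"], ["(", "E", ")"]],
}
# First and Follow sets copied from the example table.
FIRST = {"E": {"id", "("}, "E'": {"+", EPS}, "T": {"id", "("},
         "T'": {"*", EPS}, "F": {"id", "("}}
FOLLOW = {"E": {END, ")"}, "E'": {END, ")"}, "T": {"+", END, ")"},
          "T'": {"+", END, ")"}, "F": {"*", "+", END, ")"}}

def first_of_body(body):
    """FIRST of a production body."""
    result = set()
    for sym in body:
        sym_first = FIRST.get(sym, {sym})  # FIRST of a terminal is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def build_table():
    """Map (nonterminal, terminal) to a production body; missing keys mean error."""
    table = {}
    for head, bodies in GRAMMAR.items():
        for body in bodies:
            body_first = first_of_body(body)
            targets = body_first - {EPS}
            if EPS in body_first:
                targets |= FOLLOW[head]  # covers the $ case as well
            for terminal in targets:
                if (head, terminal) in table:
                    raise ValueError(f"conflict at M[{head}, {terminal}]")
                table[(head, terminal)] = body
    return table

TABLE = build_table()
for key in sorted(TABLE):
    print(key, "->", " ".join(TABLE[key]) or EPS)
```

A conflict (two productions in one cell) would raise an error here, which is one way to notice that a grammar is not LL(1).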
Construction of predictive parsing table
Stack        Input              Output
$E           id + id * id $
$E’T         id + id * id $     E → TE’
$E’T’F       id + id * id $     T → FT’
$E’T’id      id + id * id $     F → id
$E’T’        + id * id $
$E’          + id * id $        T’ → Ɛ
$E’T+        + id * id $        E’ → +TE’
$E’T         id * id $
$E’T’F       id * id $          T → FT’
$E’T’id      id * id $          F → id
$E’T’        * id $
$E’T’F*      * id $             T’ → *FT’
$E’T’F       id $
$E’T’id      id $               F → id
$E’T’        $
$E’          $                  T’ → Ɛ
$            $                  E’ → Ɛ
Figure: parse tree for id + id * id built from the productions in the
Output column.
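The moves in the trace can be reproduced by a table-driven loop. This sketch hard-codes the parsing table of the example (an empty list stands for Ɛ); the function name `ll1_parse` and the choice to return the list of productions used are assumptions of this example.

```python
END = "$"
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", END): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", END): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {head for head, _ in TABLE}

def ll1_parse(tokens, start="E"):
    """Return the productions used, in order, or raise SyntaxError."""
    stack = [END, start]          # the top of the stack is the end of the list
    tokens = tokens + [END]
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, look))
            if body is None:
                raise SyntaxError(f"no entry M[{top}, {look}]")
            output.append((top, body))
            stack.extend(reversed(body))  # leftmost symbol ends up on top
        elif top == look:
            pos += 1                      # match a terminal (or the final $)
        else:
            raise SyntaxError(f"expected {top}, found {look}")
    return output

for head, body in ll1_parse(["id", "+", "id", "*", "id"]):
    print(head, "->", " ".join(body) or "ε")
```

The printed productions appear in the same order as the Output column of the trace, which is also the order of a leftmost derivation.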
Characteristics of LL(1) Grammar
Left-to-Right Scanning: The parser reads the input from
left to right.
Leftmost Derivation: The parser constructs the leftmost
derivation of the sentence.
One Lookahead Token: The parser uses one token of
lookahead to make parsing decisions.
No Left Recursion: The grammar should not have left
recursion, as it can lead to infinite recursion in the
parser.
Non-Ambiguous: Each production should be
unambiguously chosen based on the current nonterminal
and lookahead token.
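The "no left recursion" condition is easy to check for the immediate case, where a production of A starts with A itself. The sketch below covers only that case (indirect left recursion through other nonterminals is not detected), and the function name and grammar representation are assumptions of this example.

```python
def immediately_left_recursive(grammar):
    """Return the nonterminals A that have a production of the form A -> A ..."""
    return {head for head, bodies in grammar.items()
            if any(body and body[0] == head for body in bodies)}

# An empty list stands for an epsilon body.
left_recursive = {"E": [["E", "+", "T"], ["T"]], "T": [["id"]]}
rewritten = {"E": [["T", "E'"]], "E'": [["+", "T", "E'"], []], "T": [["id"]]}

print(immediately_left_recursive(left_recursive))  # {'E'}
print(immediately_left_recursive(rewritten))       # set()
```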
Bottom Up parsing
A bottom-up parser constructs the parse tree from the bottom to the
top, i.e. from the leaves to the root.
It is the process of reducing the input string to the start symbol of
the grammar.
• It constructs the parse tree in reverse, which means it uses the
reverse of a rightmost derivation (RMD) to reduce the input string
• The most popular bottom-up parser is the LR parser
• The main objective of a bottom-up parser is to construct a parse tree
starting from the input string and proceeding upward to the
start symbol of the grammar
• Steps
Parsing starts with the input string
Scan the input from left to right
Detect the handle
Apply a production rule to reduce the handle
The procedure continues until the input is reduced to the start symbol
Shift-reduce parser
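• The idea is to shift some symbols of input to the stack
until a reduction can be applied
• At each reduction step, a specific substring matching the
body of a production is replaced by the nonterminal at the head of
the production
• The key decisions during bottom-up parsing are about when to reduce
and about what production to apply
• A reduction is a reverse of a step in a derivation
• The goal of a bottom-up parser is to construct a derivation in
reverse: that means a rightmost derivation traced backwards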
Types of Bottom up parser
Shift-reduce parser
• A handle is a substring that matches the body of a
production and whose reduction represents one
step along the reverse of a rightmost derivation
Shift-reduce parser
A stack is used to hold grammar symbols.
The handle always appears on top of the stack.
Shift-reduce parser
Stack            Input              Action
$                id + id * id $     shift
$ id             + id * id $        reduce by E → id
$ E              + id * id $        shift
$ E +            id * id $          shift
$ E + id         * id $             reduce by E → id
$ E + E          * id $             shift
$ E + E *        id $               shift
$ E + E * id     $                  reduce by E → id
$ E + E * E      $                  reduce by E → E * E
$ E + E          $                  reduce by E → E + E
$ E              $                  accept
Fig. Configurations of Shift Reduce Parser on id + id * id
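The configurations above can be generated by a small loop over a stack. This is a sketch under assumptions: the grammar E → E + E | E * E | id is ambiguous, so the code decides between shift and reduce with a hypothetical `PRECEDENCE` table (* above +, equal operators reduce first), and it handles only the tokens id, + and *.

```python
PRECEDENCE = {"+": 1, "*": 2}  # assumption: * binds tighter than +

def shift_reduce(tokens):
    """Return a list of (stack, remaining input, action) configurations."""
    stack = ["$"]
    rest = tokens + ["$"]
    trace = []
    while True:
        look = rest[0]
        if stack[-1] == "id":
            action = "reduce by E -> id"
        elif (len(stack) >= 4 and stack[-1] == "E" and stack[-3] == "E"
              and stack[-2] in PRECEDENCE
              and PRECEDENCE[stack[-2]] >= PRECEDENCE.get(look, 0)):
            # The handle E op E is on top and the next operator does not bind tighter.
            action = f"reduce by E -> E {stack[-2]} E"
        elif stack == ["$", "E"] and look == "$":
            action = "accept"
        elif look != "$":
            action = "shift"
        else:
            raise SyntaxError("cannot shift or reduce")
        trace.append((" ".join(stack), " ".join(rest), action))
        if action == "accept":
            return trace
        if action == "shift":
            stack.append(rest.pop(0))
        elif action == "reduce by E -> id":
            stack[-1] = "E"
        else:
            del stack[-3:]        # pop the handle E op E
            stack.append("E")     # push the head of the production

for stack_text, input_text, action in shift_reduce(["id", "+", "id", "*", "id"]):
    print(f"{stack_text:<14} {input_text:<16} {action}")
```

The handle is always found at the top of the stack, so a reduction never needs to look deeper than the last three symbols.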
Operator Precedence parsing
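E → EAE | id
A → * | +
The first production is not an operator grammar because the nonterminals
E, A and E are adjacent, but we can change it into an operator grammar
by substituting A:
E → EAE | id, A → + | *   becomes   E → E + E | E * E | id
In an operator grammar, two nonterminals are always separated by a
terminal (an operator), for example:
E → E * E
E → E + E
E → E - E
E → E ^ E
E → id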
Operator Precedence parsing
There are three possible precedence
relations:
a ∙> b: terminal a has higher
precedence than terminal b
a <∙ b: terminal a has lower
precedence than terminal b
a = b: terminals a and b have the
same precedence
Grammar:
E → E or E
E → E and E
E → not E
E → id
String: id or id and id
Construct of Operator Precedence Relation table
Grammar:
E → E + E
E → E * E
E → id
When we construct the relation table, id has the highest precedence
and $ has the lowest precedence. If + meets +, give higher precedence
to the one on the left because we apply left associativity, and
similarly for *.
       id     +      *      $
id     --     ∙>     ∙>     ∙>
+      <∙     ∙>     <∙     ∙>
*      <∙     ∙>     ∙>     ∙>
$      <∙     <∙     <∙     accepted
Example: id + id * id $
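The relation table can drive a parser directly: shift while the stack terminal yields precedence (<∙) to the input, and reduce when it takes precedence (∙>). The sketch below tracks terminals only, which is a simplification, so some malformed inputs are not rejected; the names `RELATION` and `operator_precedence_parse` are assumptions of this example, and rows are read as the terminal on the stack, columns as the next input terminal.

```python
LT, GT = "<", ">"
# RELATION[a][b]: a is the topmost stack terminal, b is the next input terminal.
RELATION = {
    "id": {"+": GT, "*": GT, "$": GT},
    "+":  {"id": LT, "+": GT, "*": LT, "$": GT},
    "*":  {"id": LT, "+": GT, "*": GT, "$": GT},
    "$":  {"id": LT, "+": LT, "*": LT},
}

def operator_precedence_parse(tokens):
    """Return the handles (as terminal lists) in the order they are reduced."""
    stack = ["$"]
    rest = tokens + ["$"]
    pos = 0
    handles = []
    while True:
        a, b = stack[-1], rest[pos]
        if a == "$" and b == "$":
            return handles                     # accepted
        relation = RELATION.get(a, {}).get(b)
        if relation == LT:
            stack.append(b)                    # shift
            pos += 1
        elif relation == GT:
            handle = [stack.pop()]             # reduce: pop back to the nearest <
            while RELATION.get(stack[-1], {}).get(handle[0]) != LT:
                handle.insert(0, stack.pop())
            handles.append(handle)
        else:
            raise SyntaxError(f"no relation between {a} and {b}")

print(operator_precedence_parse(["id", "+", "id", "*", "id"]))
```

For id + id * id the `*` handle is reduced before the `+` handle, which matches the higher precedence of `*` in the table.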
Construct of Operator Precedence Function table
E → E - E
E → E / E
E → id
E → E or E
E → E and E
E → id
String: id or id and id
Construct of Operator Precedence Function table
From Relation table