3 - Grammars
Radu Prodan
SYNTACTIC ANALYSIS
CONTEXT-FREE GRAMMARS
20.03.2024 R. Prodan, Compiler Construction, Summer Semester 2024 1
Phases of a Compiler
[Figure: compiler phases from Source Code to Target Code, grouped into Front-end and Back-end. Boxes: Lexical Analyser, Syntactic Analyser, Semantic Analyser, two Optimisers, Code Generator. Supporting components: Literal Table, Symbol Table, Error Handler]
▪ Context-free grammars
▪ Ambiguous grammars
▪ Extended BNF
▪ Conclusions
▪ Context-free grammar
– Grammar rules define the syntax of a programming language
– A parser operates similarly to a scanner recognising regular expressions
[Figure: sequence of tokens → parser → syntax tree]
▪ Error handling
– Scanners consume incorrect characters and generate an error token
▪ Error recovery
– Infer possible correct code from incorrect code and continue parsing
▪ (34 – 3) * 42
– Corresponds to a legal string of seven tokens
– ( number – number ) * number
▪ (34 – 3 * 42
– Not a legal expression because of a missing right parenthesis
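The token view of these two examples can be sketched in C. This is a minimal illustration, not part of the lecture material: the function name `count_tokens` is hypothetical, and the ASCII `-` stands in for the minus sign. Both strings pass token counting; only a parser that checks the grammar would notice the missing right parenthesis.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Counts the tokens of an arithmetic expression (hypothetical helper).
   A run of digits is one `number` token; each of ( ) + - * is one token.
   Returns -1 if any other non-blank character is found. */
int count_tokens(const char *s) {
    assert(s != NULL);
    int n = 0;
    while (*s != '\0') {
        if (isspace((unsigned char)*s)) {
            s++;                                   /* blanks separate tokens */
        } else if (isdigit((unsigned char)*s)) {
            while (isdigit((unsigned char)*s)) s++; /* one number token */
            n++;
        } else if (*s == '(' || *s == ')' || *s == '+' || *s == '-' || *s == '*') {
            s++;
            n++;
        } else {
            return -1;                             /* not a token of this language */
        }
    }
    return n;
}
```

Under these assumptions, `count_tokens("(34 - 3) * 42")` yields 7, matching the seven tokens listed above, while `count_tokens("(34 - 3 * 42")` yields 6 without reporting any error.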
▪ Symbol set: T ∪ N
program-heading → . . .
program-block → . . .
. . .
▪ BNF form
– number → digit | digit number
– digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
▪ Repetition specified by
– * in regular expressions
– Recursion in BNF grammar rules
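The contrast between repetition by `*` and repetition by recursion can be sketched in C. The two function names are hypothetical; the first follows the BNF rule for number literally, the second follows the regular expression digit digit*.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* digit -> 0 | 1 | ... | 9 */
int is_digit_char(char c) {
    return isdigit((unsigned char)c) != 0;
}

/* number -> digit | digit number, written as direct recursion */
int is_number_rec(const char *s) {
    assert(s != NULL);
    if (!is_digit_char(*s)) return 0;
    if (s[1] == '\0') return 1;        /* number -> digit        */
    return is_number_rec(s + 1);       /* number -> digit number */
}

/* The same language as the regular expression digit digit*, using a loop */
int is_number_loop(const char *s) {
    assert(s != NULL);
    if (!is_digit_char(*s)) return 0;
    while (is_digit_char(*s)) s++;
    return *s == '\0';
}
```

Both functions accept exactly the non-empty digit strings, which illustrates that this particular recursion adds no power beyond a regular expression.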
▪ Language generated by G
– L(G) = { w ∈ T* | S ⇒*_G w }
▪ Example
(1) exp ⇒ exp op exp
(2)     ⇒ number op exp
(3)     ⇒ number + exp
(4)     ⇒ number + number
[Parse tree with numbered nodes: 1 exp at the root with children 2 exp, 3 op, 4 exp; leaves number + number]
▪ Leftmost derivation
– Leftmost nonterminal is replaced at each derivation step
– Preorder numbering
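Leftmost versus Rightmost Derivation
▪ Rightmost derivation
– Rightmost nonterminal is replaced at each derivation step
– Corresponds to a postorder numbering in reverse
[Parse tree with numbered nodes: 1 exp at the root with children 4 exp, 3 op, 2 exp; leaves number + number]
▪ AST for 3+4
[Tree: + at the root with children 3 and 4]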
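The AST for 3+4 can be sketched as a small C data structure. The type and function names (`AstNode`, `new_num`, `new_op`, `eval`, `free_ast`) are assumptions made for this illustration and do not come from the lecture.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { NumK, OpK } AstKind;     /* hypothetical node kinds */

typedef struct AstNode {
    AstKind kind;
    int value;                          /* used when kind == NumK */
    char op;                            /* used when kind == OpK  */
    struct AstNode *left, *right;       /* operands of an operator node */
} AstNode;

AstNode *new_num(int value) {
    AstNode *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->kind = NumK; n->value = value; n->op = '\0';
    n->left = n->right = NULL;
    return n;
}

AstNode *new_op(char op, AstNode *left, AstNode *right) {
    assert(left != NULL && right != NULL);
    AstNode *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->kind = OpK; n->value = 0; n->op = op;
    n->left = left; n->right = right;
    return n;
}

/* Evaluates the tree bottom-up: children first, then the operator. */
int eval(const AstNode *n) {
    assert(n != NULL);
    if (n->kind == NumK) return n->value;
    int l = eval(n->left), r = eval(n->right);
    switch (n->op) {
    case '+': return l + r;
    case '-': return l - r;
    default:  return l * r;             /* '*' */
    }
}

void free_ast(AstNode *n) {
    if (n == NULL) return;
    free_ast(n->left);
    free_ast(n->right);
    free(n);
}
```

With these definitions, `new_op('+', new_num(3), new_num(4))` builds the tree with + at the root. Unlike the parse tree, it keeps no exp or op nodes and records only what is needed to evaluate the expression.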
▪ E → ( E )
– Nonterminals: E
– Terminals: ( and )
– L(G) = ∅
• The grammar is missing a non-recursive case (base case), so no derivation ever ends in a string of terminals
▪ ε-production
– empty → ε
– empty →
▪ With ε-productions
statement → if-stmt | other
if-stmt → if ( exp ) statement else-part
else-part → else statement | ε
exp → 0 | 1
Statement Grammar AST
typedef enum { ExpK, StmtK } NodeKind;
typedef enum { Zero, One } ExpKind;
typedef enum { IfK, OtherK } StmtKind;
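One possible node layout that uses these three enumerations is sketched below. The struct `STreeNode`, its field names and the constructor functions are assumptions for illustration; the enumerations are repeated only so that the block compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { ExpK, StmtK } NodeKind;
typedef enum { Zero, One } ExpKind;
typedef enum { IfK, OtherK } StmtKind;

/* Hypothetical AST node for the simplified statement grammar. */
typedef struct STreeNode {
    NodeKind kind;
    ExpKind ekind;                 /* meaningful when kind == ExpK  */
    StmtKind skind;                /* meaningful when kind == StmtK */
    struct STreeNode *test;        /* condition of an if-statement  */
    struct STreeNode *thenpart;    /* statement after the condition */
    struct STreeNode *elsepart;    /* NULL when there is no else    */
} STreeNode;

STreeNode *new_node(NodeKind kind) {
    STreeNode *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->kind = kind; n->ekind = Zero; n->skind = OtherK;
    n->test = n->thenpart = n->elsepart = NULL;
    return n;
}

STreeNode *new_exp(ExpKind ekind) {
    STreeNode *n = new_node(ExpK);
    n->ekind = ekind;
    return n;
}

STreeNode *new_other(void) {
    return new_node(StmtK);        /* skind defaults to OtherK */
}

STreeNode *new_if(STreeNode *test, STreeNode *thenpart, STreeNode *elsepart) {
    assert(test != NULL && test->kind == ExpK);
    assert(thenpart != NULL);
    STreeNode *n = new_node(StmtK);
    n->skind = IfK;
    n->test = test; n->thenpart = thenpart; n->elsepart = elsepart;
    return n;
}

/* Counts the nodes of a tree; an absent else-part contributes nothing. */
int count_nodes(const STreeNode *n) {
    if (n == NULL) return 0;
    return 1 + count_nodes(n->test) + count_nodes(n->thenpart)
             + count_nodes(n->elsepart);
}
```

For example, `new_if(new_exp(Zero), new_other(), NULL)` would represent `if (0) other`, with the missing else-part stored as a null pointer rather than as a separate empty node.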
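[Parse tree for s ; s ; s: stmt-sequence expands to stmt ; stmt-sequence, which expands again to stmt ; stmt-sequence and finally to stmt]
▪ Variable number of children
[AST: one seq node with three children s, s, s]
▪ Leftmost-child right-sibling
[AST: one seq node pointing to its leftmost child s, with each s linked to its right sibling]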
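The leftmost-child right-sibling representation can be sketched in C with two pointers per node, whatever the number of children. The names `Node`, `new_node`, `add_child` and `child_count` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    const char *label;       /* e.g. "seq" or "s" */
    struct Node *child;      /* leftmost child, or NULL */
    struct Node *sibling;    /* next sibling to the right, or NULL */
} Node;

Node *new_node(const char *label) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->label = label;
    n->child = NULL;
    n->sibling = NULL;
    return n;
}

/* Appends c as the last (rightmost) child of parent. */
void add_child(Node *parent, Node *c) {
    assert(parent != NULL && c != NULL);
    if (parent->child == NULL) {
        parent->child = c;
        return;
    }
    Node *p = parent->child;
    while (p->sibling != NULL) p = p->sibling;
    p->sibling = c;
}

/* Walks the sibling chain to count the children of parent. */
int child_count(const Node *parent) {
    assert(parent != NULL);
    int n = 0;
    for (const Node *p = parent->child; p != NULL; p = p->sibling) n++;
    return n;
}
```

A seq node with three statements is then built by three calls to `add_child`. The node type stays the same size for any sequence length, at the cost of a linear walk to reach the last child.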
▪ Non-associative operation
– A sequence of more than one operator is not allowed
• 34 – 3 – 42 or 34 – 3 * 42 are illegal
– Only fully parenthesized expressions are legal
• (34 – 3) – 42, 34 – (3 * 42)
Ambiguity Removal
▪ Replace one recursion with base case
▪ Bracketing keyword
▪ Inessential ambiguity
– The semantics do not depend on the disambiguating rule
▪ Optional constructs
▪ Right recursive: A → α A | β
– Kleene closure in regular expressions: A → α* β
– EBNF: A → { α } β
[Syntax diagrams for repetition, A → { B }, and for an optional construct, A → [ B ]]
▪ EBNF expression grammar
exp → term { addop term }
addop → + | –
term → factor { mulop factor }
mulop → *
factor → ( exp ) | number
[Syntax diagrams for term, mulop and factor]
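The EBNF expression grammar maps naturally onto a recursive-descent routine per rule, with each `{ ... }` becoming a loop. The following is a minimal sketch under assumptions: the function names are hypothetical, the input is a character string rather than a token stream, and the ASCII `-` stands in for the minus sign.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *cur;   /* next unread character */
static int ok;            /* cleared on the first syntax error */

static void skip_blanks(void) {
    while (isspace((unsigned char)*cur)) cur++;
}

static long parse_exp(void);

/* factor -> ( exp ) | number */
static long parse_factor(void) {
    skip_blanks();
    if (*cur == '(') {
        cur++;
        long v = parse_exp();
        skip_blanks();
        if (*cur == ')') cur++; else ok = 0;   /* missing right parenthesis */
        return v;
    }
    if (isdigit((unsigned char)*cur)) {
        long v = 0;
        while (isdigit((unsigned char)*cur)) v = v * 10 + (*cur++ - '0');
        return v;
    }
    ok = 0;
    return 0;
}

/* term -> factor { mulop factor }, the braces become a loop */
static long parse_term(void) {
    long v = parse_factor();
    for (skip_blanks(); ok && *cur == '*'; skip_blanks()) {
        cur++;
        v *= parse_factor();
    }
    return v;
}

/* exp -> term { addop term }, evaluated from left to right */
static long parse_exp(void) {
    long v = parse_term();
    for (skip_blanks(); ok && (*cur == '+' || *cur == '-'); skip_blanks()) {
        char op = *cur++;
        long t = parse_term();
        v = (op == '+') ? v + t : v - t;
    }
    return v;
}

/* Returns 1 and stores the value in *out if s is a legal expression. */
int evaluate(const char *s, long *out) {
    assert(s != NULL && out != NULL);
    cur = s;
    ok = 1;
    long v = parse_exp();
    skip_blanks();
    if (!ok || *cur != '\0') return 0;
    *out = v;
    return 1;
}
```

Because the loops accumulate from left to right, `34 - 3 - 42` is evaluated as `(34 - 3) - 42`, and because mulop is handled one level below addop, `*` binds tighter than `+` and `-`. The EBNF itself does not state associativity, so the left-to-right choice is made by this sketch.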
Syntax Diagrams for Simplified Grammar of If-Statements
▪ BNF grammar
statement → if-stmt | other
if-stmt → if ( exp ) statement
        | if ( exp ) statement else statement
exp → 0 | 1
▪ EBNF grammar
statement → if-stmt | other
if-stmt → if ( exp ) statement [ else statement ]
exp → 0 | 1
[Syntax diagrams for statement, if-stmt and exp]
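The optional `[ else statement ]` part of the EBNF grammar corresponds to a single `if` in a recursive-descent recogniser. The sketch below rests on assumptions: all function names are hypothetical, and keywords are matched as plain prefixes of the input string, which a real scanner would not do.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static const char *pos;   /* next unread character */

/* Skips blanks, then consumes word if the input continues with it. */
static int accept(const char *word) {
    while (isspace((unsigned char)*pos)) pos++;
    size_t n = strlen(word);
    if (strncmp(pos, word, n) != 0) return 0;
    pos += n;
    return 1;
}

static int statement(void);

/* if-stmt -> if ( exp ) statement [ else statement ]
   Called after the keyword "if" has been consumed. */
static int if_rest(void) {
    if (!accept("(")) return 0;
    if (!accept("0") && !accept("1")) return 0;   /* exp -> 0 | 1 */
    if (!accept(")")) return 0;
    if (!statement()) return 0;
    if (accept("else")) return statement();       /* optional part */
    return 1;
}

/* statement -> if-stmt | other */
static int statement(void) {
    if (accept("if")) return if_rest();
    return accept("other");
}

/* Returns 1 if s is a complete statement of the simplified grammar. */
int is_statement(const char *s) {
    assert(s != NULL);
    pos = s;
    if (!statement()) return 0;
    while (isspace((unsigned char)*pos)) pos++;
    return *pos == '\0';
}
```

Since the innermost active `if_rest` call is the first to see an `else`, this recogniser attaches each else to the closest preceding if. That is the behaviour of this particular sketch rather than a property stated by the EBNF rule.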
Agenda
▪ Introduction
▪ Context-free grammars
▪ Ambiguous grammars
▪ Extended BNF
▪ Conclusions
▪ Ambiguous grammars