
Document From Aditya Tripathi

This document provides an overview of compiler design, detailing the role of compilers in translating high-level programming languages into machine code. It covers key concepts such as the phases of compilation, lexical analysis, data structures used in compilers, and tools like LEX and YACC for generating lexical analyzers and parsers. Additionally, it discusses context-free grammars and their capabilities and limitations in representing programming language syntax.


UNIT-I: Introduction to Compiler Design

1. Introduction to Compiler
A compiler is a software program that translates code written in a high-level programming language (like
C, Java) into machine code (binary) that a computer can execute.
Example:
#include <stdio.h>

int main(void) {
    printf("Hello, world!\n");
    return 0;
}
A compiler converts this into assembly or machine language the CPU understands.
Reference Links:
• GeeksforGeeks - Introduction to Compiler Design

• TutorialsPoint - Compiler Basics

2. Analysis of Source Program


This refers to how the compiler understands and processes the input program. It typically involves:

• Lexical Analysis: Breaking the source code into a stream of tokens.


• Syntax Analysis: Checking the grammatical structure of the token stream against the language’s
syntax rules.
• Semantic Analysis: Verifying the meaning and consistency of the code, such as type checking
and variable declaration.
Example: For the code int a = b + 1;
• Lexical Analysis: tokens → int, a, =, b, +, 1, ;
• Syntax Analysis: confirms the tokens form a valid declaration with an initializer expression.
• Semantic Analysis: checks that b is declared and that its type is compatible with int.
Reference Links:

• GeeksforGeeks - Phases of a Compiler


• Stanford University - Compiler Analysis Phases (PDF)

3. Phases and Passes of a Compiler


Phases of Compilation (in order):
1. Lexical Analysis
2. Syntax Analysis

3. Semantic Analysis
4. Intermediate Code Generation
5. Code Optimization

6. Code Generation
7. Symbol Table Management (interacts with all phases)
8. Error Handling (interacts with all phases)

Passes:
• Single Pass Compiler: Processes the source code once, typically for simpler languages.

• Multi-Pass Compiler: Processes the code in multiple stages, often separating declaration processing from code generation, allowing for more complex optimizations and language features.

Reference Links:
• GeeksforGeeks - Phases of a Compiler
• Computer Science Stack Exchange - Difference between phases and passes of a compiler

4. Bootstrapping
Bootstrapping is a technique where a compiler for a language is written in the very language it is
intended to compile. This is common for developing new compilers or porting existing ones.
Example: Writing a C compiler using the C language itself.
Reference Links:

• Wikipedia - Bootstrapping (compilers)


• GeeksforGeeks - Bootstrapping in Compiler Design

5. Lexical Analyzers (Scanners)


The lexical analyzer (lexer or scanner) is the first phase of a compiler. It reads the stream of characters
from the source code and groups them into meaningful units called tokens.
Example: Input: int x = 5; Output Tokens: keyword(int), identifier(x), operator(=),
constant(5), delimiter(;)
Reference Links:

• GeeksforGeeks - Lexical Analysis in Compiler Design


• TutorialsPoint - Compiler Design - Lexical Analysis

6. Data Structures in Compilation


Compilers heavily rely on various data structures to manage and process information about the source
program.
• Symbol Table: Stores information about identifiers (variables, functions, classes), their types,
scope, and other attributes. It’s used across almost all phases.

• Parse Trees (Concrete Syntax Trees): A tree representation of the syntactic structure of the
input string, reflecting the derivation steps from a grammar.
• Abstract Syntax Trees (ASTs): A simplified and abstract representation of the program’s
structure, omitting concrete syntax details like parentheses, and directly representing the logical
structure.

• Hash Tables: Often used for efficient symbol table lookups.


• Directed Acyclic Graphs (DAGs): Used in optimization to represent expressions and identify
common subexpressions.
Reference Links:

• GeeksforGeeks - Symbol Table in Compiler Design


• GeeksforGeeks - Parse Tree vs Abstract Syntax Tree

7. LEX: Lexical Analyzer Generator
LEX (or Flex, its GNU version) is a tool that generates lexical analyzers. It takes a set of regular
expressions (patterns for tokens) and corresponding actions as input and produces C code for a lexer.
Example Lex program snippet (extended here into a complete, runnable specification):
%option noyywrap
%%
[0-9]+      { printf("Number found: %s\n", yytext); }
[a-zA-Z]+   { printf("Word found: %s\n", yytext); }
.|\n        { /* ignore anything else */ }
%%
int main(void) { return yylex(); }

Reference Links:
• GeeksforGeeks - Lexical Analyzer using LEX for C and C++
• TutorialsPoint - Lex & Yacc Tutorial

8. Input Buffering
To optimize the reading of source code characters, lexical analyzers often use input buffering techniques.
This avoids frequent disk I/O operations.
• Two-Buffer Scheme: Divides the input buffer into two halves. When one half is processed, the
next characters are read into the other half.
• Sentinels: A special character (sentinel) is placed at the end of each buffer half to eliminate the
need for checking the end of the buffer in every character read, speeding up the process.
Reference Links:

• GeeksforGeeks - Input Buffering in Compiler Design


• TutorialsPoint - Compiler Design - Input Buffering

9. Specification and Recognition of Tokens


• Specification of Tokens: Tokens are typically specified using regular expressions. Regular
expressions are a powerful notation for describing patterns in text.
– Examples:
∗ Keywords: int|float|char
∗ Identifiers: [a-zA-Z_][a-zA-Z0-9_]*
∗ Integer literals: [0-9]+
• Recognition of Tokens: Regular expressions are implemented using finite automata.
– Non-deterministic Finite Automata (NFA): Can have multiple transitions for the same
input symbol or empty transitions.
– Deterministic Finite Automata (DFA): For each state and input symbol, there’s exactly
one transition. NFAs can be converted to DFAs, and DFAs are used to efficiently recognize
tokens.

Reference Links:
• GeeksforGeeks - Specification of Tokens in Compiler Design
• NPTEL - Finite Automata and Regular Expressions (Lecture on Automata Theory relevant here)

10. YACC: Yet Another Compiler Compiler
YACC (Yet Another Compiler Compiler, or Bison, its GNU version) is a parser generator. It takes
a context-free grammar specification as input and generates C code for a parser (syntax analyzer). This
parser builds a parse tree or an AST from the token stream provided by the lexer.
Example YACC program snippet (for a simple expression grammar; the %left declaration, added here, resolves the ambiguity of the two recursive rules):
%token NUMBER
%left '+' '-'
%%
expr: expr '+' expr
    | expr '-' expr
    | NUMBER
    ;
%%
Reference Links:
• GeeksforGeeks - YACC Tutorial
• TutorialsPoint - Lex & Yacc Tutorial

11. The Syntactic Specification of Programming Languages: Context-Free Grammars (CFG)
Context-Free Grammars (CFGs) are a formal system used to describe the syntax (structure) of
programming languages. They consist of:
• Terminals: The basic symbols of the language (e.g., tokens like int, +, if, id).
• Non-terminals: Variables representing syntactic categories (e.g., Statement, Expression, Declaration).
• Start Symbol: A special non-terminal that represents the entire program or the highest-level
syntactic category.
• Production Rules: Rules that define how non-terminals can be replaced by sequences of terminals
and other non-terminals.
Example CFG for arithmetic expressions:
• E → E + T | T
• T → T * F | F
• F → (E) | id
Reference Links:
• GeeksforGeeks - Context-Free Grammar (CFG) in Compiler Design
• TutorialsPoint - Compiler Design - Syntax Analysis (CFG)

12. Derivation and Parse Trees


• Derivation: A sequence of applications of production rules to derive a string of terminals from
the start symbol of a grammar. It shows how a sentence in the language is generated.
– Leftmost Derivation: Always expands the leftmost non-terminal.
– Rightmost Derivation: Always expands the rightmost non-terminal.
• Parse Trees: A graphical representation of a derivation. Each internal node is a non-terminal,
each leaf node is a terminal, and the children of a node represent the symbols on the right-hand
side of a production rule applied to the node’s non-terminal.

Example: For the expression a + b * c using the CFG above: A parse tree visually represents the
hierarchical structure:

            E
          / | \
         E  +  T
         |    /|\
         T   T * F
         |   |   |
         F   F   c
         |   |
         a   b

Note: the subtree T → T * F under the + node shows that b * c groups together before the addition.
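For the same string a + b * c, the leftmost derivation (always expanding the leftmost non-terminal, with a, b, c standing for id) runs:

```latex
E \Rightarrow E + T
  \Rightarrow T + T
  \Rightarrow F + T
  \Rightarrow a + T
  \Rightarrow a + T * F
  \Rightarrow a + F * F
  \Rightarrow a + b * F
  \Rightarrow a + b * c
```

Reading the derivation top-down reproduces exactly the parse tree shown above: each step expands one internal node into its children.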


Reference Links:
• GeeksforGeeks - Derivation and Parse Tree
• YouTube - Parse Tree in Compiler Design (search for "Parse Tree in Compiler Design" to find relevant videos)

13. Capabilities of CFG


• What CFGs can represent:

– The recursive nature of programming language constructs (e.g., nested expressions, loops,
conditional statements, function calls).
– Hierarchical structures of programs.
– Most of the syntactic structure of typical programming languages.

• What CFGs cannot represent (limitations):


– Context-sensitive aspects: Rules that depend on the surrounding context (e.g., a variable
must be declared before use, an array index must be an integer). These are typically handled
by the semantic analysis phase.
– Agreement in numbers (e.g., the number of parameters in a function call must match the definition).
– Type compatibility rules.

Reference Links:
• Stack Overflow - What are the limitations of Context-Free Grammars?
• TutorialsPoint - Compiler Design - Context-Free Grammar Limitations (often implied in the CFG
section’s examples)
