COME4107 Course Outline
COURSE SYLLABUS/OUTLINE*
Faculty/School: NAHPI Department: Computer Engineering (COME).
Option/Specialty: B.Eng. Year: 2021/2022
Course Code: COME4107 Course Title: FORMAL LANGUAGE THEORY & COMPILER
CONSTRUCTION
Credit Value: 4 Semester: One L/T/P: 20/10/10
Instructor(s): KONGNYU DEREK NDI
Hall Location: NAHPI HALL Assigned Hours: 40
Email: kongnyun@gmail.com Tel: 670077353
Description:
Objectives:
• Understand how a compiler works, specifically the analysis of a program into atomic pieces and the
subsequent synthesis into an equivalent program.
• Gain experience in the construction of a large programming project that draws upon several
previous courses.
• Develop personal responsibility for a large project, including time management and testing.
• Build a compiler for a non-trivial programming language.
• Describe the phases of compilation.
• Specify regular expressions for matching tokens in a language.
• Show the equivalence between regular expressions, NFAs, and DFAs.
• Specify and disambiguate context-free grammars.
• Specify a type system for a language, including type equivalence, and use it to correctly type-check
expressions in a language.
• Apply fundamentals of storage allocation strategies toward run-time management of data.
• Generate correct assembly code for simple expressions and statements in a programming language.
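As an illustration of the objective on specifying regular expressions for tokens, the following is a minimal Python sketch of a scanner. The token names (NUMBER, IDENT, OP, SKIP) and the tiny expression language are assumptions made for this example only; they are not the course's project language.

```python
import re

# Hypothetical token specification for a tiny expression language.
# Each pair is (token kind, regular expression describing its lexemes).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]

# Combine the patterns into one alternation with a named group per token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, lexeme) pairs; raise ValueError on an unknown character."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace is matched but not emitted
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, tokenize("x = 3 + 42") yields an IDENT, an OP, a NUMBER, an OP, and a NUMBER, in that order. A scanner generator such as Flex builds a DFA from a similar specification rather than trying the alternatives one by one.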
Content: Introduction to formal language concepts: regular expressions and context-free grammars.
Compiler organization and construction. Lexical analysis and implementation of scanners. Top-down and
bottom-up parsing and implementation of top-down parsers. An overview of symbol table arrangement,
run-time memory allocation, intermediate forms, optimization, and code generation.
• Introduction: Definition and functions of a compiler; other associated terms, e.g. text formatters
and text editors.
• Theory of automata and its application to lexical analysis; Lexical-Analyser Generator, the Lex
(Flex) compiler.
• Formal grammars and their application to syntax analysis; BNF notation.
• The Syntactic specification of Languages: Context Free Grammar (CFG), Derivation and Parse
Trees, Capabilities of CFG.
• Basic Parsing Techniques: Parsers, Shift Reduce Parsing, Operator precedence parsing, top down
Parsing, Predictive Parsers.
• Automatic Construction of efficient Parsers: LR Parsers, the canonical collection of LR(0) items,
constructing SLR Parsing Tables, constructing canonical LR and LALR parsing tables, An Automatic
Parser Generator (YACC/Bison).
• Symbol Tables: Data structures for symbol tables, representing scope information.
• Run-Time Administration: Implementation of a simple stack allocation scheme, storage allocation in
block-structured languages.
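As an illustration of the top-down (predictive) parsing topic listed above, the following is a minimal Python sketch of a recursive descent parser that evaluates integer arithmetic. The grammar and the function name parse_expression are assumptions chosen for this example; the grammar is written without left recursion so that each nonterminal maps directly to one function.

```python
import re


def parse_expression(text):
    """Evaluate an integer expression with a hand-written predictive parser.

    Assumed grammar (left recursion already eliminated):
        expr   -> term   (('+' | '-') term)*
        term   -> factor (('*' | '/') factor)*
        factor -> NUMBER | '(' expr ')'
    """
    # Simplified scanner: characters outside the token set are silently skipped.
    tokens = re.findall(r"\d+|[+\-*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value //= factor()  # integer (floor) division, chosen for this sketch
        return value

    def factor():
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return value
        if tok.isdigit():
            return int(advance())
        raise SyntaxError(f"unexpected token {tok!r}")

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result
```

The loops in expr and term make the operators left-associative, and placing '*' and '/' one level below '+' and '-' gives them higher precedence. A full compiler would build a syntax tree at each step instead of computing a value.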
Outcomes: After completing this course, students will be able to design and implement a compiler.
Pre-Requisites: Programming experience Mode of Instruction: Lectures/Presentations/Tutorials
Mode of Delivery: Face-to-Face Assessment and Grading Scale: CA: 30%, EXAM: 70%
Text Books, Reading Materials, and References:
1. S. Hollos and J. R. Hollos, “Finite Automata and Regular Expressions: Problems and Solutions”, Abrazol
Publishing, 2013. A collection of clever little problems and solutions relating to automata and state
machines, useful if you are looking for more problems to work on.
2. I. Sommerville, “Software Engineering”, 7th edition, Pearson Education.
Teaching Plan (Activities include Lectures, Tutorials, Individual/Group Work, CA, Quizzes)
Time Slot (# hours)   Description of Lesson (Topic/Sub-Topic/Activity)   Activity (L T P)
1 2hrs Introduction: Overview of language processing system & compiler structure * *
Content: What is a compiler? Why should you study compilers? What is * *
the best way to learn about compilers? What language should you use?
Stages within a compiler; examples, illustrations, and exercises.
2 Chapter 1: Formal Languages and Automata Theory * *
2hrs Module 1: General Concepts * *
Unit 1: Alphabets, Strings, and Representations
Unit 2: Formal Grammars
3 Unit 3: Formal Languages * *
2hrs Unit 4: Automata Theory * *
4 Chapter 2: Compiler Toolchain * *
2hrs Content: Stages within a compiler, example compilation, exercises * *
5 Chapter 3: Scanning * *
2hrs Content: 3.1 Kinds of Tokens, 3.2 Hand-made Scanner, 3.3 Regular * *
Expressions; 3.4 Finite Automata (a) Deterministic Finite Automata (b)
Nondeterministic Finite Automata
6 Content: 3.5 Conversion Algorithms * *
2hrs 3.5.1 Converting REs to NFAs * *
3.5.2 Converting NFAs to DFAs
3.5.3 Minimizing DFAs
3.6 Limits of Finite State Automata
3.7 Using a Scanner Generator
3.8 Practical Considerations
3.9 Exercises
7 Chapter 4: Parsing * *
2hrs Content: * *
4.1 Overview
4.2 Context Free Grammars
4.2.1 Deriving Sentences
4.2.2 Ambiguous Grammars
8 4.3: LL Grammars * *
2hrs 4.3.1 Eliminating Left Recursion * *
4.3.2 Eliminating Common Left Prefixes
4.3.3 First and Follow Sets
4.3.4 Recursive Descent Parsing
4.3.5 Table Driven Parsing
9 2hrs 4.4: LR Grammars
4.4.1 Shift-Reduce Parsing
4.4.2 The LR(0) Automaton
4.4.3 LR(1) Parsing
4.4.4 LALR Parsing
4.5 Grammar Classes Revisited
4.6 The Chomsky Hierarchy
4.7 Exercises
10 CA. * *
2hrs * *
11 Chapter 5: Parsing in Practice * *
2hrs 5.1 The Bison Parser Generator * *
5.2 Expression Validator
5.3 Expression Interpreter
5.4 Expression Trees
5.5 Exercises
12 Chapter 6: Abstract Syntax Tree * *
2hrs 6.1 Overview * *
6.2 Declarations
6.3 Statements
6.4 Expressions
6.5 Types
6.6 Putting it All Together
6.7 Building the AST
6.8 Exercises
13 Chapter 7: Semantic Analysis * *
2hrs 7.1 Overview of Type Systems * *
7.2 Designing a Type System
7.3 The B-Minor Type System
7.4 The Symbol Table
7.5 Name Resolution
7.6 Implementing Type Checking
7.7 Error Messages
7.8 Exercises
14 Chapter 8: Intermediate Representations * *
2hrs 8.1 Introduction * *
8.2 Abstract Syntax Tree
8.3 Directed Acyclic Graph
8.4 Control Flow Graph
8.5 Static Single Assignment Form
8.6 Linear IR
8.7 Stack Machine IR
15 8.8 Examples * *
2hrs 8.8.1 GIMPLE – GNU Simple Representation * *
8.8.2 LLVM – Low Level Virtual Machine
8.8.3 JVM – Java Virtual Machine
8.9 Exercises
16 2hrs Chapter 9: Memory Organization
9.1 Introduction
9.2 Logical Segmentation
9.3 Heap Management
9.4 Stack Management
9.4.1 Stack Calling Convention
9.4.2 Register Calling Convention
9.5 Locating Data
9.6 Program Loading
*Note: This course outline/teaching plan is subject to modification by the instructor. Students will be notified of any
modifications accordingly.