NLP-Lesson-plan-final
Course Plan
Prerequisites:
Fundamental knowledge of Exploratory Data Analysis and the fundamentals of Neural Network building.
SCHOOL OF COMPUTER SCIENCE AND ENGINEERING
Course Title: Natural Language Processing with Neural Network Models Semester: 5
Course Code: 23ECSE315 Year: 2024-25
E.g., 1.2.3 represents Program Outcome '1', Competency '2' and Performance Indicator '3'.
Course Content
Unit – 1
Unit – 2
Chapter 3. Transformer Networks, Memory Networks: Transformer Networks and CNNs, Advanced Architectures and Memory Networks. 07 hrs
Text Books:
1. Yoav Goldberg. A Primer on Neural Network Models for Natural Language Processing, 2016.
References:
1. Dan Jurafsky and James H. Martin. Speech and Language Processing (3rd ed. draft).
2. Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press.
Evaluation Scheme
ISA Scheme
Assessment        Conducted for (marks)   Weightage in Marks
ISA-1 (Theory)    30                      33 (ISA-1 and ISA-2 combined)
ISA-2 (Theory)    30
Lab Experiments   15                      17 (Lab Experiments and Certification combined)
Certification     15
Total                                     50
Laboratory Plan
List of Exercises
Expt. No. | Brief description about the experiment/job | No. of Lab Slots
1. | Implement Bag of Words and TF-IDF vectorization on sample text data, and visualize the feature vectors using t-SNE plots (a minimal sketch follows this list). | 2
2. | Train a Word2Vec model on the text corpus and calculate cosine similarity between words. | 2
3. | Train RNN models for sentiment analysis and demonstrate the functionality of LSTM and GRU networks. Demonstrate the functionality of grid-based hyperparameter tuning and early stopping. | 2
6. | Build a 1D CNN model for text classification, experimenting with different kernel sizes to analyze how the CNN captures local patterns. | 2
7. | Develop a multi-task learning model for sentiment analysis and entity recognition, and analyze how multi-task learning enhances generalization. | 2
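The following is a minimal sketch of Exercise 1, assuming scikit-learn and matplotlib are available; the short corpus, variable names, and t-SNE settings are illustrative placeholders, not the prescribed lab solution.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Hypothetical sample documents; the lab would substitute its own text data.
docs = [
    "natural language processing with neural networks",
    "deep learning models for text classification",
    "word embeddings capture semantic similarity between words",
    "recurrent networks model sequential text data",
    "convolutional networks capture local n-gram patterns",
    "attention lets models focus on relevant tokens",
]

bow = CountVectorizer().fit_transform(docs)       # Bag-of-Words count vectors
tfidf = TfidfVectorizer().fit_transform(docs)     # TF-IDF weighted vectors

# Project the TF-IDF vectors to 2-D with t-SNE (perplexity must stay below the sample count).
# The same call could be made on bow.toarray() to compare the two representations.
coords = TSNE(n_components=2, perplexity=3, random_state=0).fit_transform(tfidf.toarray())
plt.scatter(coords[:, 0], coords[:, 1])
for i, (x, y) in enumerate(coords):
    plt.annotate(f"doc{i}", (x, y))
plt.title("t-SNE of TF-IDF document vectors")
plt.show()
```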
Code Modularity and Reusability (4 M) (PI Code: 2.3.1 | 1)
  4 M: Code is well-structured, modular, and can be reused for different datasets or tasks.
  2-3 M: Code is partially modular but lacks full reusability for broader applications.
  1-0 M: Code is not modular and has low reusability.

Analysis and interpretation of results with justification (3 M) (PI Code: 2.3.1 | 1)
  3 M: Results are accurate, thoroughly analyzed, and the performance is justified based on metrics.
  2 M: Results are mostly accurate, but analysis lacks depth or justification.
  1-0 M: Results are inaccurate or poorly analyzed, with no justification.
Analysis and interpretation of results with justification (5 M) (PI Code: 2.3.1 | 1)
  4-5 M: All the parameters are analyzed with correct interpretation of results and justification.
  2-3 M: Parameters are analyzed with moderate interpretation of results and justification.
  0-1 M: Parameters are analyzed with incorrect interpretation of results.
Unit II
Chapter No. 3: Transformer Networks, Memory Networks (8 | - | 1.5 | 1.5 | 1 | 1)
Note
1. Each question carries 15 marks and may consist of sub-questions.
2. Mixing of sub-questions from different chapters within a unit (only for Unit I and Unit II) is allowed in ISA-I, ISA-II and ESA.
Course Code and Title: 23ECSE315 / Natural Language Processing with Neural Network Models
Chapter Number and Title: 1. Introduction to NLP and Deep Learning Planned Hours: 7 hrs
Learning Outcomes: -
At the end of the topic the student should be able to:
TLOs COs BL CA Code
1. Explain the basics of natural language processing and deep learning. CO1 L2 1.1
2. Describe the applications of natural language processing. CO1 L2 1.1
4. Differentiate between the continuous bag-of-words (CBOW) model and the skip-gram model. CO1 L3 1.1
Lesson Schedule
Class No. - Portion covered per hour / per Class
1. Introduction to Natural Language Processing and deep learning.
2. Applications of Natural Language Processing
3. Applications of Natural Language Processing contd.
4. Word2vec introduction.
5. Word2vec introduction contd.
6. Word2vec objective function gradients.
7. Word2vec example
Review Questions
Sl. No. - Questions TLOs BL PI Code
1. Explain how the terms bigram and trigram denote n-gram language models with n = 2 and n = 3, respectively. TLO1 L2 1.1.3
2. Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words. Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. Justify this statement with an example (a minimal training sketch follows this list). TLO1 L3 1.1.3
3. Differentiate between intrinsic and extrinsic word vector evaluation. TLO3 L3 1.1.3
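Review Question 2 describes Word2vec's shallow two-layer setup; the sketch below, assuming gensim is installed, shows one hedged way to train a small skip-gram model and query cosine similarity (the corpus, dimensions, and hyperparameters are illustrative, not prescribed).

```python
from gensim.models import Word2Vec

# Hypothetical tokenised corpus; a real run would use a much larger body of text.
sentences = [
    ["natural", "language", "processing", "with", "neural", "networks"],
    ["neural", "networks", "learn", "word", "embeddings"],
    ["word", "embeddings", "place", "similar", "words", "close", "together"],
    ["skip", "gram", "predicts", "context", "words", "from", "a", "target", "word"],
]

# sg=1 selects the skip-gram model; sg=0 would select CBOW. vector_size is the embedding dimension.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=100, seed=1)

# Cosine similarity between two in-vocabulary words, and nearest neighbours of a word.
print(model.wv.similarity("neural", "networks"))
print(model.wv.most_similar("word", topn=3))
```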
Course Code and Title: 23ECSE315 / Natural Language Processing with Neural Network Models
Chapter Number and Title: 2. Recurrent Neural Networks, Machine Translation, Seq2Seq and Attention. Planned Hours: 8 hrs
Learning Outcomes: -
At the end of the topic the student should be able to:
4. Explain how the vanishing gradient problem arises as a difficulty in training artificial neural networks. CO2 L3 13.1.1
5. Discuss Machine Translation. CO3 L1 13.1
Lesson Schedule
Class No. - Portion covered per hour
1. Recurrent Neural Networks and Language Models.
2. Recurrent Neural Networks and Language Models contd.
3. Vanishing Gradients.
4. Fancy RNNs
5. Machine Translation.
6. Seq2Seq and Attention.
7. Seq2Seq and Attention contd.
8. Advanced Attention.
Review Questions
Sr. No. - Questions TLO BL PI Code
1. Recurrent Neural Networks (RNNs) add an interesting twist to basic neural networks. A vanilla neural network takes in a fixed-size vector as input, which limits its usage in situations that involve a 'series' type input with no predetermined size. A Recurrent Neural Network remembers the past, and its decisions are influenced by what it has learnt from the past. Like basic networks, RNNs learn during training; in addition, they remember things learnt from prior input(s) while generating output(s). Justify this with an example. TLO1 L3 13.1.1
2. Illustrate the vanishing gradient problem: what it is, why it happens, and why it is bad for RNNs (a minimal numerical sketch follows this list). TLO4 L3 13.1.1
3. Explain how the vanishing gradient problem arises when, during backpropagation, the error signal used to train the network decreases exponentially, even though the whole point of an RNN is to keep track of long-term dependencies. TLO1 L3 13.1.1
4. Differentiate between soft attention and hard attention. TLO5 L2 13.1.1
5. Justify how sequence prediction was classically handled as a structured prediction task. TLO6 L2 13.1.1
6. Describe how the coefficients (attention vector) used for blending are produced. TLO7 L3 13.1.1
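As an aid to Review Questions 2 and 3, the following numpy sketch (toy dimensions and randomly chosen weights, purely illustrative) backpropagates a unit error through a vanilla tanh RNN and prints how quickly its norm shrinks.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, steps = 32, 50
W_hh = rng.normal(scale=0.1, size=(hidden, hidden))   # small recurrent weight matrix
xs = rng.normal(size=(steps, hidden))                 # inputs assumed already projected to hidden size

# Forward pass: record every hidden state.
h, states = np.zeros(hidden), []
for x in xs:
    h = np.tanh(W_hh @ h + x)
    states.append(h)

# Backward pass: push a unit error from the last time step towards the first.
# Each step multiplies by W_hh^T and the local tanh derivative, so the norm decays geometrically.
grad = np.ones(hidden)
for t in range(steps - 1, -1, -1):
    grad = W_hh.T @ (grad * (1.0 - states[t] ** 2))
    if t % 10 == 0:
        print(f"time step {t:2d}: gradient norm = {np.linalg.norm(grad):.2e}")
```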
Course Code and Title: 23ECSE315 / Natural Language Processing with Neural Network Models
Chapter Number and Title: 3. Transformer Networks, Memory Networks. Planned Hours: 7 hrs
Learning Outcomes: -
At the end of the topic the student should be able to:
Lesson Schedule
Review Questions
Sr. No. - Questions TLO BL PI Code
1. Example: for "the country of my birth", a convolution computes vectors for: the country, country of, of my, my birth, the country of, country of my, of my birth, the country of my, country of my birth. What is the single-layer convolution for the above example? TLO1 L3 13.1.1
2. Illustrate the core idea behind the Transformer model as a self-attention model. With a neat diagram, explain the ability to attend to different positions of the input sequence to compute a representation of that sequence, and how the Transformer creates stacks of self-attention layers. Also, explain in detail scaled dot-product attention and multi-head attention in Transformers (a minimal sketch of scaled dot-product attention follows this list). TLO3 L3 13.1.1
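For Review Question 2, this numpy sketch implements scaled dot-product self-attention on assumed toy shapes (sequence length 4, model dimension 8); multi-head attention would repeat the same computation with several smaller projections and concatenate the results.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of every query with every key
    weights = softmax(scores, axis=-1)   # attention distribution over positions
    return weights @ V, weights          # weighted blend of the values, plus the attention map

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
X = rng.normal(size=(seq_len, d_model))

# Self-attention: queries, keys and values are all projections of the same input sequence.
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out, attn = scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v)
print(attn.round(2))   # each row sums to 1: how much each position attends to every other position
```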
Course Code and Title: 23ECSE315 / Natural Language Processing with Neural Network Models
Learning Outcomes: -
At the end of the topic the student should be able to:
Lesson Schedule
Class No. - Portion covered per hour
1. Reinforcement Learning for NLP.
2. Reinforcement Learning for NLP contd.
3. Semi-supervised Learning for NLP.
4. Semi-supervised Learning for NLP contd.
5. Future of NLP Models.
6. Future of NLP Models contd.
7. Multi-task Learning.
8. Multi-task Learning contd.
9. QA Systems.
Review Questions
4. Illustrate how what is learned for each task can help other tasks be learned better in multi-task learning (a minimal shared-encoder sketch follows). TLO1 L3 13.1.1
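To illustrate Question 4 (and Lab Exercise 7), here is a hedged PyTorch sketch of hard parameter sharing: a single LSTM encoder feeds both a sentence-level sentiment head and a token-level entity-recognition head, so gradients from either task update the shared representation. All sizes and the class name SharedEncoderMTL are hypothetical.

```python
import torch
import torch.nn as nn

class SharedEncoderMTL(nn.Module):
    """Shared encoder with two task-specific heads (hypothetical sizes)."""
    def __init__(self, vocab_size=5000, emb_dim=64, hidden=64, n_sentiments=2, n_entity_tags=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True)   # shared across both tasks
        self.sentiment_head = nn.Linear(hidden, n_sentiments)       # one label per sentence
        self.ner_head = nn.Linear(hidden, n_entity_tags)            # one label per token

    def forward(self, token_ids):
        states, (h_n, _) = self.encoder(self.embed(token_ids))
        return self.sentiment_head(h_n[-1]), self.ner_head(states)

model = SharedEncoderMTL()
tokens = torch.randint(0, 5000, (8, 20))          # hypothetical batch: 8 sentences of 20 token ids
sentiment_logits, ner_logits = model(tokens)
print(sentiment_logits.shape, ner_logits.shape)   # torch.Size([8, 2]) torch.Size([8, 20, 5])
```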
3b. Explain how the terms bigram and trigram denote n-gram language models with n = 2 and n = 3, respectively. [8 marks | CO2 | L3 | 13 | 1.1.3]