NLP-Lesson-plan-final


SCHOOL OF COMPUTER SCIENCE AND ENGINEERING

Course Plan

Semester: V Year: 2024-25

Program: Bachelor of Engineering

Course Title: Natural Language Processing with Neural Network Models    Course Code: 23ECSE315
Lesson Plan Author: Shankar Biradar Date:

Checked By: Date:

Prerequisites:
Fundamental knowledge of exploratory data analysis and the fundamentals of building neural networks.

Course Outcomes (COs):


At the end of the course the student should be able to:

i. Understand natural language processing and deep learning.


ii. Describe recurrent neural networks.
iii. Apply machine translation, seq2seq and attention.
iv. Understand transformer networks and memory networks.
v. Illustrate reinforcement learning in natural language processing.


Course Articulation Matrix: Mapping of Course Outcomes (COs) with Program Outcomes (POs)

Course Title: Natural Language Processing with Neural Network Models Semester: 5
Course Code: 23ECSE315 Year: 2024-25

Course Outcomes (COs) | Mapping to Program Outcomes (POs 1-14)
i. Understand natural language processing and deep learning. | H
ii. Describe Recurrent Neural Networks (RNN). | M, M
iii. Apply machine translation, seq2seq and attention. | M, M
iv. Understand transformer networks and memory networks. | M, M
v. Illustrate reinforcement learning in natural language processing. | M, L

Degree of compliance    L: Low    M: Medium    H: High


Competency addressed in the Course and corresponding Performance Indicators

Competency | Performance Indicators
1.1: Demonstrate competence in mathematics. | 1.1.3: Apply numerical analysis, linear algebra, probability & queuing theory, and statistics to solve problems.
13.1: Demonstrate the knowledge required in the domain of data engineering to develop computer-based solutions. | 13.1.1: Identify the source and type of data required for analysis and knowledge discovery. 13.1.2: Apply suitable data engineering techniques or tools to achieve data consistency.

Eg: 1.2.3 represents Program Outcome '1', Competency '2' and Performance Indicator '3'.


Course Content

Course Title: Natural Language Processing with Neural Network Models    Course Code: 23ECSE315

L-T-P: 2-0-1    Credits: 3    Contact Hrs: 4 hrs/week

ISA Marks: 50    ESA Marks: 50    Total Marks: 100

Teaching Hrs: 30    Practical Hrs: 28    Exam Duration: 3 hrs

Unit – 1

1. Introduction to NLP and Deep Learning
Introduction to Natural Language Processing, Applications of Natural Language Processing, Word2vec introduction, Word2vec objective function gradients. 07 hrs

2. Recurrent Neural Networks, Machine Translation, Seq2Seq and Attention
Recurrent Neural Networks and Language Models, Vanishing Gradients, Fancy RNNs, Machine Translation, Seq2Seq and Attention, Advanced Attention. 08 hrs

Unit – 2

3. Transformer Networks, Memory Networks
Transformer Networks and CNNs, Advanced Architectures and Memory Networks. 07 hrs

4. Reinforcement Learning for NLP applications
Reinforcement Learning for NLP, Semi-supervised Learning for NLP, Future of NLP Models, Multi-task Learning and QA Systems. 08 hrs

Text Books:

1. Yoav Goldberg. A Primer on Neural Network Models for Natural Language Processing, 2016.

References:

1. Dan Jurafsky and James H. Martin. Speech and Language Processing (3rd ed. draft).
2. Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press.


Evaluation Scheme

ISA Scheme

Assessment | Conducted for marks | Weightage in Marks
ISA-1 (Theory) | 30 | 33 (ISA-1 and ISA-2 together)
ISA-2 (Theory) | 30 |
Lab Experiments | 15 | 17 (Lab Experiments and Certification together)
Certification | 15 |
Total | | 50

End-Semester Assessment Scheme

Assessment | Conducted for marks | Weightage in Marks
Theory | 60 | 33
Laboratory | 20 | 17
Total | | 50

Laboratory Plan
List of Exercises

Expt. No. | Brief description about the experiment/job | No. of Lab Slots
1. Implement Bag of Words and TF-IDF vectorization on sample text data, and visualize the feature vectors using t-SNE plots (a minimal starting sketch follows this list). | 2
2. Train a Word2Vec model on a text corpus and calculate cosine similarity between words. | 2
3. Train RNN models for sentiment analysis and demonstrate the functionality of LSTM and GRU networks. Demonstrate the functionality of grid-based hyperparameter tuning and early stopping. | 2
4. Construct a Seq2Seq model with attention for language translation. | 2
5. Demonstrate the functionality of a Transformer-based encoder and decoder network, and visualize the attention heads and maps. | 2
6. Build a 1D CNN model for text classification, experimenting with different kernel sizes to analyze how the CNN captures local patterns. | 2
7. Develop a multi-task learning model for sentiment analysis and entity recognition, and analyze how multi-task learning enhances generalization. | 2
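As a starting point for Exercise 1, the sketch below shows one possible workflow, assuming scikit-learn and matplotlib are available; the toy corpus, variable names, and t-SNE settings are illustrative assumptions rather than prescribed lab material.

# Minimal, illustrative sketch for Exercise 1; the corpus and settings are assumptions.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

corpus = [
    "natural language processing with neural networks",
    "deep learning models for natural language",
    "word embeddings capture word meaning",
    "recurrent networks model sequences of words",
    "attention improves sequence to sequence translation",
]

# Bag of Words: raw term counts per document.
bow = CountVectorizer().fit_transform(corpus)
# TF-IDF: term counts re-weighted by inverse document frequency.
tfidf = TfidfVectorizer().fit_transform(corpus)

# Project the TF-IDF document vectors to 2-D with t-SNE for visualization.
# perplexity must be smaller than the number of samples.
coords = TSNE(n_components=2, perplexity=2, random_state=0).fit_transform(tfidf.toarray())

plt.scatter(coords[:, 0], coords[:, 1])
for i, (x, y) in enumerate(coords):
    plt.annotate(f"doc {i}", (x, y))
plt.title("t-SNE of TF-IDF document vectors")
plt.show()

The same projection can be repeated on the Bag of Words matrix (bow) to compare the two feature spaces.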


In Semester Assessment (ISA) Rubrics for Practical (P) component (evaluated for 10 M)

Application of Techniques (3 M) [PI 2.3.1, CO 1]
- Excellent (3 M): Appropriately and successfully applies techniques like image classification, object detection, and segmentation in real-world datasets.
- Good (2 M): Techniques are applied with some minor errors but demonstrate an understanding of the concepts.
- Fair (1-0 M): Poor or incorrect application of techniques with little to no understanding of core concepts.

Code Modularity and Reusability (4 M) [PI 2.3.1, CO 1]
- Excellent (4 M): Code is well-structured, modular, and can be reused for different datasets or tasks.
- Good (2-3 M): Code is partially modular but lacks full reusability for broader applications.
- Fair (1-0 M): Code is not modular and has low reusability.

Analysis and interpretation of results with justification (3 M) [PI 2.3.1, CO 1]
- Excellent (3 M): Results are accurate, thoroughly analyzed, and the performance is justified based on metrics.
- Good (2 M): Results are mostly accurate, but analysis lacks depth or justification.
- Fair (1-0 M): Results are inaccurate or poorly analyzed, with no justification.


End Semester Assessment (ESA) Rubrics for Practical (P) component (evaluated for 20 M)

Write up (5 M) [PI 2.3.1, CO 1]
- Excellent (4-5 M): Method is completely correct, with all functions working as intended.
- Good (2-3 M): Method is correct with minimal errors.
- Fair (0-1 M): No appropriate method is used.

Implementation of methods using coding standards (10 M) [PI 2.3.1, CO 1]
- Excellent (8-10 M): Code is written according to industrial coding standards.
- Good (5-7 M): Code is moderately written using coding standards.
- Fair (1-4 M): Code is not written to coding standards.

Analysis and interpretation of results with justification (5 M) [PI 2.3.1, CO 1]
- Excellent (4-5 M): All parameters are analyzed with correct interpretation of results and justification.
- Good (2-3 M): Parameters are analyzed with moderate interpretation of results and justification.
- Fair (0-1 M): Parameters are analyzed with incorrect interpretation of results.

Course Unitization for ISA and ESA

Topics / Chapters | No. of Teaching Hours | Theory: Questions in ISA-1 | Theory: Questions in ISA-2 | Theory: Questions in ESA | P Component: Questions in ISA-1 | P Component: Questions in ESA

Unit I
Chapter No. 1: Introduction to NLP and Deep Learning | 7 | 1.5 | - | 1.5 | 1 | -
Chapter No. 2: Recurrent Neural Networks, Machine Translation, Seq2Seq and Attention | 8 | 1.5 | - | 1.5 | 1 | 1

Unit II
Chapter No. 3: Transformer Networks, Memory Networks | 8 | - | 1.5 | 1.5 | 1 | 1
Chapter No. 4: Reinforcement Learning for NLP applications | 7 | - | 1.5 | 1.5 | 1 | -

Note
1. Each question carries 15 marks and may consist of sub-questions.
2. Mixing of sub-questions from different chapters within a unit (only for Unit I and Unit II) is allowed in ISA-I, ISA-II, and ESA.

Date: 27/10/24    Head, SoCSE


Course Assessment Plan

Course Title: Natural Language Processing with Neural Network Models    Code: 23ECSE315

Course outcomes (COs) | Weightage in assessment | ISA-I | ISA-II | LAB | ESA
1. Explain natural language processing and deep learning. | 20% | ✓ | | ✓ | ✓
2. Describe Recurrent Neural Networks. | 15% | ✓ | | ✓ | ✓
3. Apply machine translation, Seq2Seq and Attention. | 25% | ✓ | | ✓ | ✓
4. Explain transformer networks and memory networks. | 20% | | ✓ | ✓ | ✓
5. Illustrate reinforcement learning. | 20% | | ✓ | | ✓
Weightage | | 15% | 15% | 20% | 50%


Chapter wise Plan

Course Code and Title: 23ECSE315 / Natural Language Processing with Neural Network Models
Chapter Number and Title: 1. Introduction to NLP and Deep Learning Planned Hours: 7 hrs

Learning Outcomes: -
At the end of the topic the student should be able to:

TLO's CO's BL CA Code
1. Explain the basics of natural language processing and deep learning. CO1 L2 1.1
2. Describe the applications of natural language processing. CO1 L2 1.1

3. Explain word2vec model in vector space. CO1 L2 1.1

4. Differentiate between the continuous bag of words (CBOW) model and the skip-gram model. CO1 L3 1.1

5. Illustrate word2vec objective function gradients. CO1 L3 1.1

Lesson Schedule
Class No. - Portion covered per hour / per Class
1. Introduction to Natural Language Processing and deep learning.
2. Applications of Natural Language Processing
3. Applications of Natural Language Processing contd.
4. Word2vec introduction.
5. Word2vec introduction contd.
6. Word2vec objective function gradients.
7. Word2vec example

Review Questions
Sl. No. - Questions TLOs BL PI Code
1. Explain how the terms bigram and trigram denote n-gram language models with n = 2 and n = 3, respectively. TLO1 L2 1.1.3
2. Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct the linguistic contexts of words. Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in that space. Justify this description with an example (a minimal code sketch follows these questions). TLO1 L3 1.1.3

3. Differentiate between intrinsic word vector evaluation and extrinsic word vector evaluation. TLO3 L3 1.1.3
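Review question 2 describes Word2vec as a shallow, two-layer network trained to reconstruct the linguistic contexts of words. The sketch below illustrates that idea, and laboratory Exercise 2, assuming the gensim library; the toy corpus and hyperparameters are illustrative assumptions.

# Minimal, illustrative Word2vec sketch; the corpus and hyperparameters are assumptions.
from gensim.models import Word2Vec

sentences = [
    ["neural", "networks", "learn", "word", "representations"],
    ["word2vec", "learns", "word", "vectors", "from", "context"],
    ["recurrent", "networks", "model", "word", "sequences"],
]

# sg=1 selects the skip-gram objective, which maximizes the average log-probability
# of the context words given the centre word within a fixed window.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=200)

# Cosine similarity between two learned word vectors (cf. laboratory Exercise 2).
print(model.wv.similarity("word", "vectors"))
print(model.wv.most_similar("networks", topn=3))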

Course Code and Title: 23ECSE315 / Natural Language Processing with Neural Network Models
Chapter Number and Title: 2. Recurrent Neural Networks, Machine Translation, Seq2Seq and Attention    Planned Hours: 8 hrs

Learning Outcomes: -
At the end of the topic the student should be able to:

TLO's CO's BL CA Code


1. Explain Recurrent Neural Networks. CO2 L2 13.1.1

2. Explain the context of language models. CO2 L3 13.1.1

3. Illustrate Fancy Recurrent Neural Networks. CO2 L2 13.1.1

4. Explain how the vanishing gradient problem is a difficulty found in training artificial neural networks. CO2 L3 13.1.1

5. Discuss Machine Translation. CO3 L1 13.1

6. Explain the issue seq2seq models face with long inputs. CO3 L3 13.1

7. Describe advanced attention. CO3 L2 13.1

Lesson Schedule
Class No. - Portion covered per hour
1. Recurrent Neural Networks and Language Models.
2. Recurrent Neural Networks and Language Models contd.
3. Vanishing Gradients.
4. Fancy RNNs
5. Machine Translation.
6. Seq2Seq and Attention.
7. Seq2Seq and Attention contd.
8. Advanced Attention.


Review Questions
Sr. No. - Questions TLO BL PI Code
1. Recurrent Neural Networks (RNNs) add an interesting twist to basic neural networks. A vanilla neural network takes a fixed-size vector as input, which limits its use in situations that involve a 'series' type of input with no predetermined size. An RNN remembers the past, and its decisions are influenced by what it has learnt from the past: in addition to learning during training, it remembers things learnt from prior input(s) while generating output(s). Justify this with an example. TLO1 L3 13.1.1
2. Illustrate the vanishing gradient problem with respect to what it is, why it happens, and why it is bad for RNNs. TLO4 L3 13.1.1
3. Explain how the vanishing gradient problem arises when, during backpropagation, the error signal used to train the network decreases exponentially, even though the whole point of an RNN is to keep track of long-term dependencies. TLO1 L3 13.1.1
4. Differentiate between soft attention and hard attention. TLO5 L2 13.1.1
5. Justify how sequence prediction was classically handled as a structured prediction task. TLO6 L2 13.1.1
6. Describe how to produce the coefficients (attention vector) for blending (see the sketch after these questions). TLO7 L3 13.1.1
7. Compare dot product attention with content-based attention. TLO7 L3 13.1.1
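Review questions 6 and 7 ask how the blending coefficients (the attention vector) are produced. The sketch below shows one plain-numpy reading of dot-product attention; the dimensions and random vectors are illustrative assumptions, not course-mandated values.

# Minimal, illustrative sketch of (scaled) dot-product attention; shapes are assumptions.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

d = 8                                    # hidden size (assumed)
decoder_state = np.random.randn(d)       # query: current decoder hidden state
encoder_states = np.random.randn(5, d)   # one vector per source token (keys/values)

# Dot-product scores between the query and each encoder state,
# scaled by sqrt(d) as in scaled dot-product attention.
scores = encoder_states @ decoder_state / np.sqrt(d)

# The attention vector: blending coefficients that are non-negative and sum to 1.
alpha = softmax(scores)

# Context vector: attention-weighted sum of the encoder states.
context = alpha @ encoder_states
print(alpha, context.shape)

Content-based attention scores, by contrast, are typically computed from a similarity measure such as cosine similarity between the query and each key, which is the comparison question 7 asks for.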

Course Code and Title: 23ECSE315 / Natural Language Processing with Neural Network Models
Chapter Number and Title: 3. Transformer Networks, Memory Networks Planned Hours:7 hrs

Learning Outcomes: -
At the end of the topic the student should be able to:

TLO's CO's BL CA Code


1 Explain Transformer Networks and CNNs. CO4 L2 13.1
2 Discuss the Tree Recursive Neural Networks. CO4 L2 13.1

3 Discuss the Advanced Architectures and Memory Networks. CO4 L2 13.1

Lesson Schedule

Class No. - Portion covered per hour


1. Transformer Networks and CNNs.
2. Transformer Networks and CNNs contd.
3. Transformer Networks and CNNs contd.
4. Advanced Architectures and Memory Networks
5. Advanced Architectures and Memory Networks contd.
6. Advanced Architectures and Memory Networks contd.
7. Advanced Architectures and Memory Networks contd.

Review Questions
Sr. No. - Questions TLO BL PI Code
1. Example: for the phrase "the country of my birth", vectors are computed for: the country, country of, of my, my birth, the country of, country of my, of my birth, the country of my, country of my birth. What is the single-layer convolution for the above example (a minimal sketch follows these questions)? TLO1 L3 13.1.1

2. Illustrate the core idea behind the Transformer model as a self-attention model. With a neat diagram, explain the ability to attend to different positions of the input sequence to compute a representation of that sequence, and how the Transformer creates stacks of self-attention layers. Also, explain in detail scaled dot-product attention and multi-head attention in Transformers. TLO3 L3 13.1.1

3. What do you mean by an advanced architecture? Illustrate how deep learning algorithms consist of such a diverse set of models in comparison to a single traditional machine learning algorithm. With an example, show the flexibility that neural networks provide when building a full-fledged end-to-end model. TLO3 L2 13.1.1
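Review question 1 asks for the single-layer convolution over the word windows of "the country of my birth". The sketch below spells that computation out in plain numpy; the word vectors, filter weights, and window size are random, illustrative assumptions.

# Minimal, illustrative single-layer convolution over word windows; values are assumptions.
import numpy as np

words = ["the", "country", "of", "my", "birth"]
d = 4                                             # word-vector dimension (assumed)
k = 2                                             # window size: bigrams such as "the country"
vecs = {w: np.random.randn(d) for w in words}     # toy embedding table

W = np.random.randn(1, k * d)                     # one convolution filter
b = np.zeros(1)

# Slide the window over the sentence: the vectors in each window are concatenated
# and passed through the filter with a tanh non-linearity, giving one feature per window.
for i in range(len(words) - k + 1):
    window = np.concatenate([vecs[w] for w in words[i : i + k]])
    feature = np.tanh(W @ window + b)
    print(words[i : i + k], feature)

Larger windows (k = 3, 4, ...) produce the trigram and higher-order features listed in the question; stacking several filters of different widths is the basis of the 1D CNN used in laboratory Exercise 6.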

Course Code and Title: 23ECSE315 / Natural Language Processing with Neural Network Models


Chapter Number and Title: 4. Reinforcement Learning Planned Hours: 8 hrs

Learning Outcomes: -
At the end of the topic the student should be able to:

Topic Learning Outcomes COs BL CA Code


1. Explain Reinforcement Learning for NLP. CO5 L2 13.1
2. Apply Semi-supervised Learning for NLP CO5 L2 13.1

Lesson Schedule
Class No. - Portion covered per hour
1. Reinforcement Learning for NLP.
2. Reinforcement Learning for NLP contd.
3. Semi-supervised Learning for NLP.
4. Semi-supervised Learning for NLP contd.
5. Future of NLP Models.
6. Future of NLP Models contd.
7. Multi-task Learning.
8. Multi-task Learning contd.
9. QA Systems.

Review Questions

Sl. No. - Questions TLOs BL PI Code

1. Illustrate how reinforcement learning is used in NLP to obtain quantitative results. TLO3 L2 13.1.1

2. Explain text summarization as the process of automatically generating natural language summaries from an input document while retaining the important points, with reference to extractive summarization. TLO2 L2 13.1.1

3. What is multi-task learning? Explain how inductive transfer improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias (a minimal sketch follows these questions). TLO3 L2 13.1.1

4. Illustrate how what is learned for one task can help other tasks be learned better in multi-task learning. TLO1 L3 13.1.1

5. Compare information-retrieval (IR)-based question answering with knowledge-based question answering. TLO2 L3 13.1.2
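Review questions 3 and 4 and laboratory Exercise 7 concern multi-task learning. The sketch below shows one common form, hard parameter sharing with a shared encoder and two task heads, assuming PyTorch; the vocabulary size, dimensions, and label counts are illustrative assumptions.

# Minimal, illustrative hard-parameter-sharing multi-task model; sizes are assumptions.
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self, vocab_size=5000, emb_dim=64, hidden=64,
                 n_sentiments=2, n_entity_tags=5):
        super().__init__()
        # Shared layers: updated by gradients from both tasks (hard parameter sharing).
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True)
        # Task-specific heads.
        self.sentiment_head = nn.Linear(hidden, n_sentiments)  # sentence-level sentiment
        self.entity_head = nn.Linear(hidden, n_entity_tags)    # token-level entity tags

    def forward(self, token_ids):
        states, (h_n, _) = self.encoder(self.embed(token_ids))
        sentiment_logits = self.sentiment_head(h_n[-1])  # one prediction per sentence
        entity_logits = self.entity_head(states)         # one prediction per token
        return sentiment_logits, entity_logits

model = MultiTaskModel()
batch = torch.randint(0, 5000, (8, 12))        # 8 sentences of 12 token ids each
sent_logits, ent_logits = model(batch)
print(sent_logits.shape, ent_logits.shape)     # torch.Size([8, 2]) torch.Size([8, 12, 5])

Because gradients from both task losses update the shared embedding and encoder weights, each task acts as an inductive bias for the other, which is the effect question 3 refers to.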


Question Paper Title: Model Question Paper for ISA-1

Total Duration (H:M): 1:00    Course: Natural Language Processing with Neural Network Models (23ECSE315)    Maximum Marks: 30
Note: Answer any two full questions

Q.No. | Questions | Marks | CO | BL | PO | PI Code
1a | Explain the different types of Recurrent Neural Network (RNN) models with block diagrams. List their real-world applications. | 7 | CO2 | L2 | 1, 13 | 13.1.1
1b | Explain how the vanishing gradient problem is a difficulty found in training artificial neural networks. Explain any two applications of natural language processing. | 8 | CO2 | L2 | 13 | 13.1.1
2a | Define natural language processing (NLP). Explain the real-world applications of NLP. | 7 | CO1 | L2 | 1 | 1.1.1
2b | Considering an example, illustrate and explain how word2vec captures the context of a word in a document, its semantic and syntactic similarity, and its relations with other words. | 8 | CO2 | L3 | 13 | 13.1.1
3a | Why is natural language processing hard? Justify with appropriate examples. | 7 | CO1 | L2 | 13 | 13.1.1
3b | Explain how the terms bigram and trigram denote n-gram language models with n = 2 and n = 3, respectively. | 8 | CO2 | L3 | 13 | 1.1.3

Question Paper Title: Model Question Paper for ISA-II

Total Duration (H:M): 1:00    Course: Natural Language Processing with Neural Network Models (23ECSE315)    Maximum Marks: 30
Note: Answer any two full questions

Q.No. | Questions | Marks | CO | BL | PO | PI Code
1a | Illustrate self-attention (sometimes called intra-attention), an attention mechanism relating different positions of a single sequence in order to compute a representation of the sequence, in terms of query, key and value vectors. | 7 | CO3 | L3 | 13 | 13.1.1
1b | With an example, in machine translation: (i) compare and contrast word-based translation with phrase-based translation; (ii) compare dot product attention with content-based attention. | 8 | CO3 | L2 | 13 | 13.1.1
2a | In knowledge-based pronominal coreference, illustrate supervised machine learning for pronominal anaphora resolution. | 7 | CO4 | L3 | 13 | 13.1.1
2b | Illustrate how the power of recurrent neural networks is improved when they are implemented with long short-term memory (LSTM) networks. Show an example where LSTMs perform better than plain RNNs. | 8 | CO4 | L3 | 13 | 13.1.1
3a | Explain mapping from one sequence to another using an encoder/decoder with attention model architecture in seq2seq learning. | 7 | CO4 | L2 | 13 | 13.1.1
3b | Compare dot product attention with content-based attention. | 8 | CO4 | L3 | 1 | 1.1.3
