
Natural Language Processing Application
Week 3: Language Model
❏ Introduction to n-gram
❏ Estimating N-gram Probabilities
❏ N-gram model evaluation
❏ Smoothing techniques
INTRODUCTION TO N-GRAM
Introduction to n-gram
❏ Probabilistic Language Model: assigns a probability to a sentence
❏ Machine Translation:
❏ P(ngôn ngữ tự nhiên) > P(ngôn ngữ nhiên tự) (the correct Vietnamese word order for "natural language" outranks the scrambled order)
❏ Spelling error detection and correction:
❏ P(ngôn ngữ tự nhiên) > P(gôn ngữ tu nhiên) (the correctly spelled phrase outranks the misspelled one)
❏ Text summarization
❏ Question-Answering
❏ ...
Introduction to n-gram (cont)
❏ Probability of a sentence (or a sequence of words):
❏ P(W) = P(w1, w2, w3, ..., wn)
❏ Probability of the next word in a sentence (or a sequence of words):
❏ P(wn | w1, w2, ..., wn-1)
❏ A model that computes P(W) or P(wn | w1, w2, ..., wn-1) is called a Language Model

Introduction to n-gram (cont)
❏ The Chain Rule of Probability:
❏ Two variables: P(x1,x2) = P(x1)P(x2|x1)
❏ Three variables: P(x1,x2,x3) = P(x1)P(x2|x1)P(x3|x1x2)
❏ Four variables: P(x1,x2,x3,x4) = P(x1)P(x2|x1)P(x3|x1x2)P(x4|x1x2x3)
❏ …
❏ N variables: P(x1,x2,x3,…,xn) = P(x1)P(x2|x1)P(x3|x1,x2)…P(xn|x1,…,xn-1)
Introduction to n-gram (cont)
❏ The Chain Rule of Probability:

❏ Example:
❏ P(ngôn ngữ tự nhiên) = P(ngôn) x P(ngữ|ngôn) x P(tự|ngôn ngữ)
x P(nhiên|ngôn ngữ tự)
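Below is a minimal Python sketch of this decomposition. The cond_prob function is a hypothetical stand-in for whatever conditional-probability estimate is available; it is not defined in the slides.

```python
def sentence_prob(words, cond_prob):
    """Chain rule: P(w1, ..., wn) = P(w1) * P(w2|w1) * ... * P(wn|w1, ..., wn-1).

    cond_prob(word, history) is a hypothetical function returning P(word | history).
    """
    p = 1.0
    for i, word in enumerate(words):
        p *= cond_prob(word, words[:i])   # multiply in P(wi | w1 ... wi-1)
    return p

# Example call mirroring the slide:
# sentence_prob(["ngôn", "ngữ", "tự", "nhiên"], cond_prob)
```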
Introduction to n-gram (cont)
❏ Estimating the probabilities by counting:
❏ P(ngôn) = count(ngôn)/N, where N is the total number of words in the corpus
❏ P(ngữ|ngôn) = count(ngôn ngữ)/count(ngôn)
❏ P(tự|ngôn ngữ) = count(ngôn ngữ tự)/count(ngôn ngữ)
❏ P(nhiên|ngôn ngữ tự) = count(ngôn ngữ tự nhiên)/count(ngôn ngữ tự)
❏ Comment:
❏ There are far too many possible word histories
❏ There is never enough data to estimate all of them reliably
Introduction to n-gram (cont)
❏ Markov Assumption:
❏ P(nhiên|ngôn ngữ tự) ≈ P(nhiên|tự) (condition only on the previous word)
Or
❏ P(nhiên|ngôn ngữ tự) ≈ P(nhiên|ngữ tự) (condition on the two previous words)
Introduction to n-gram (cont)
❏ Markov Assumption (general form): P(wi | w1, w2, ..., wi-1) ≈ P(wi | wi-k, ..., wi-1), i.e. condition only on the previous k words
Introduction to n-gram (cont)
❏ Unigram model (1-gram): P(w1, w2, ..., wn) ≈ P(w1) P(w2) ... P(wn)
❏ Automatically generated sentences from a unigram model:

ở trận, fabregas, cesc bàn dusan và cầu kiến emmanuel thứ reyes, utd trong sau tạo anh ngoại anh thủ pogba man một jose xuất ở này.
là tadic adebayor, thủ harry dennis santi cazorola, nhóm thành bergkamp, bốn tiên bảy hạng kane. đầu cầu hiện antonio
Introduction to n-gram (cont)
❏ Bigram model (2-gram): P(wi | w1, w2, ..., wi-1) ≈ P(wi | wi-1)
❏ Automatically generated sentences from a bigram model:

anh thành cầu ở ngoại hạng anh tạo bàn trong một trận, sau dennis
bergkamp, và harry kane.
pogba là man utd đầu xuất hiện ở nhóm này.
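For illustration, here is a minimal sketch of how such sentences can be sampled from a bigram model. The tiny corpus, the <s>/</s> boundary markers, and all counts below are invented for the example; they are not taken from the slides.

```python
import random
from collections import defaultdict

# A made-up miniature corpus with sentence-boundary markers.
corpus = [
    ["<s>", "pogba", "là", "cầu", "thủ", "man", "utd", "</s>"],
    ["<s>", "harry", "kane", "là", "cầu", "thủ", "anh", "</s>"],
]

# counts[w1][w2] = how often w2 follows w1
counts = defaultdict(lambda: defaultdict(int))
for sent in corpus:
    for w1, w2 in zip(sent, sent[1:]):
        counts[w1][w2] += 1

def generate(max_len=20):
    """Sample one word at a time, each conditioned only on the previous word."""
    word, sentence = "<s>", []
    for _ in range(max_len):
        nexts, freqs = zip(*counts[word].items())
        word = random.choices(nexts, weights=freqs)[0]
        if word == "</s>":
            break
        sentence.append(word)
    return " ".join(sentence)

print(generate())
```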
Introduction to n-gram (cont)
❏ Extension: trigram, 4-gram, 5-gram, ...
❏ Comment:
❏ Language has long-distance dependencies that a short n-gram window cannot capture
❏ Example: "Chiếc máy tính mà tôi vừa đưa vào phòng máy trên tầng năm đã bị hỏng." ("The computer that I just moved into the machine room on the fifth floor is broken."; the subject and its verb phrase are far apart)
❏ However, n-gram models work well enough in most cases
ESTIMATING N-GRAM PROBABILITIES
Estimating n-gram probabilities
❏ Maximum Likelihood Estimation (MLE):
❏ For a bigram: P(wi | wi-1) = count(wi-1 wi) / count(wi-1)
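A minimal Python sketch of this estimate; the toy corpus is hypothetical, and in practice the counts come from a large training set.

```python
from collections import Counter

# Hypothetical toy corpus, for illustration only.
tokens = "ngôn ngữ tự nhiên và ngôn ngữ lập trình".split()

unigram = Counter(tokens)                  # count(w)
bigram = Counter(zip(tokens, tokens[1:]))  # count(w_prev w)

def p_mle(w, prev):
    """Maximum likelihood estimate: P(w | prev) = count(prev w) / count(prev)."""
    return bigram[(prev, w)] / unigram[prev]

print(p_mle("ngữ", "ngôn"))   # 2/2 = 1.0
print(p_mle("tự", "ngữ"))     # 1/2 = 0.5
```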
Estimating n-gram probabilities (cont)
❏ Example: bigram counts over a corpus of 9,222 sentences
❏ Normalize by the unigram counts to obtain bigram probabilities
Estimating n-gram probabilities (cont)
❏ What the estimated probabilities tell us:
❏ P(english|want) = .0011
❏ P(chinese|want) = .0065
❏ P(to|want) = .66
❏ P(eat|to) = .28
❏ P(food|to) = 0
❏ P(want|spend) = 0
❏ P(i|<s>) = .25
Estimating n-gram probabilities (cont)
❏ Problems with multiplying many small probabilities:
❏ Underflow
❏ Slow
❏ Transform the multiplication into an addition of log-probabilities:
❏ log(p1 × p2 × p3 × p4) = log p1 + log p2 + log p3 + log p4
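A minimal sketch of the log-space trick; the probability values are placeholders for one sentence's conditional probabilities, not numbers from the slides.

```python
import math

# Placeholder per-word conditional probabilities of some sentence.
probs = [0.25, 0.66, 0.28, 0.0011]

# The direct product can underflow for long sentences; the sum of logs cannot.
product = math.prod(probs)
log_sum = sum(math.log(p) for p in probs)

print(product)            # product of probabilities
print(math.exp(log_sum))  # the same value, recovered from the sum of logs
```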
Estimating n-gram probabilities (cont)
❏ Language Modeling Toolkits:
❏ SRILM
❏ IRSTLM
❏ KenLM
❏ ...
MODEL EVALUATION
Model Evaluation
❏ A language model should separate "good" sentences from "not good" ones:
❏ Assign a higher probability to "real" or "frequently seen" sentences than to "ungrammatical" or "rarely seen" ones
❏ The model's parameters are trained on a training set
❏ The model's performance is tested on unseen data
❏ A test set is an unseen dataset, separate from the training set
❏ An evaluation metric shows how well the model does on the test set
Model Evaluation (cont)
❏ Extrinsic Evaluation: to compare models A and B
❏ Give each model a task:
❏ Spelling correction, Machine Translation, ...
❏ Run the task and measure an accuracy for A and for B:
❏ How many misspelled words are corrected properly
❏ How many words are translated correctly
❏ Compare the accuracy of A and B
Model Evaluation (cont)
❏ Extrinsic Evaluation:
❏ Time consuming (running the task can take days or even weeks)
❏ Therefore an intrinsic evaluation, perplexity, is sometimes used instead
❏ But perplexity is a bad approximation of task performance:
❏ Unless the test data looks like the training data
❏ So it is mainly useful in pilot experiments
Model Evaluation (cont)
❏ Perplexity:
❏ How well can we predict the next word?
❏ A unigram model is not good at this task, since it ignores the preceding words
❏ A good model is one that assigns a higher probability to the word that actually occurs
Model Evaluation (cont)
❏ Perplexity:
❏ The best language model is the one that best predicts an unseen test set
❏ Perplexity is the inverse probability of the test set, normalized by the number of words:
❏ PP(W) = P(w1, w2, ..., wN)^(-1/N)
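A minimal Python sketch of this computation, done via the average log probability to avoid underflow; the per-word probabilities are placeholders.

```python
import math

def perplexity(word_probs):
    """PP(W) = P(w1 ... wN) ** (-1/N), computed from the average log probability."""
    n = len(word_probs)
    avg_log_p = sum(math.log(p) for p in word_probs) / n
    return math.exp(-avg_log_p)

# Placeholder conditional probabilities for a 4-word test sequence.
print(perplexity([0.25, 0.66, 0.28, 0.0011]))
```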
Model Evaluation (cont)
❏ Perplexity:
❏ Recognizing the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 (each equally likely):
❏ Perplexity = 10
❏ Recognizing one of 30,000 names (each equally likely):
❏ Perplexity = 30,000
❏ A system that recognizes one of:
❏ Management (probability 1/4)
❏ Business (probability 1/4)
❏ Assistance (probability 1/4)
❏ One of 30,000 names (the remaining 1/4, combined)
❏ Perplexity ≈ 53
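The value 53 can be reproduced with a quick calculation, under the assumption that the remaining 1/4 of the probability mass is split evenly over the 30,000 names (1/120,000 each) and that the four branches occur equally often in the test data.

```python
# Assumption: three words at probability 1/4 each, any single name at 1/120000,
# all four branches equally frequent in the test data.
pp = ((1 / 4) ** (3 / 4) * (1 / 120_000) ** (1 / 4)) ** -1
print(round(pp))   # -> 53
```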
Model Evaluation (cont)
❏ Perplexity:
❏ Lower perplexity = better model
❏ Example (WSJ): models trained on a 3M-word training set and evaluated on a 1.5M-word test set; higher-order n-grams (bigram, trigram) reach lower perplexity than unigrams
SMOOTHING TECHNIQUES
Smoothing Techniques
❏ Shakespeare corpus:
❏ N = 884,647 tokens, V = 29,066 word types
❏ About 300K distinct bigrams actually occur, out of V² ≈ 844M possible bigrams
=> 99.96% of the possible bigrams were never seen (and would get probability 0)
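A quick check of the 99.96% figure, using the counts quoted above (the 300K count of observed bigrams is approximate).

```python
V = 29_066              # vocabulary size (word types)
seen_bigrams = 300_000  # distinct bigrams observed in the corpus (approximate)
possible = V ** 2       # about 844 million possible bigrams
unseen_share = 1 - seen_bigrams / possible
print(f"{100 * unseen_share:.2f}% of possible bigrams never occur")   # -> 99.96%
```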
Smoothing Techniques (cont)
❏ The problem:
Smoothing Techniques (cont)
❏ The problem:
❏ Bigrams that never occurred in training get probability zero
❏ They assign probability zero to the whole test set
❏ So we cannot compute perplexity (division by zero)
Smoothing Techniques (cont)
❏ Add-one (Laplace) Smoothing:
❏ Statistics estimated from counts are sparse: many events never occur in training
❏ Move some probability mass to unseen events so the model generalizes better
Smoothing Techniques (cont)
❏ Add-one (Laplace) Smoothing:
❏ Pretend we saw each word one more time than we did
❏ Add one to all the counts: P_add1(wi | wi-1) = (count(wi-1 wi) + 1) / (count(wi-1) + V), where V is the vocabulary size
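A minimal sketch of the smoothed estimate; the toy corpus is hypothetical, and V is taken to be the number of observed word types.

```python
from collections import Counter

# Hypothetical toy corpus, for illustration only.
tokens = "ngôn ngữ tự nhiên và ngôn ngữ lập trình".split()
unigram = Counter(tokens)
bigram = Counter(zip(tokens, tokens[1:]))
V = len(unigram)   # vocabulary size

def p_add_one(w, prev):
    """Add-one (Laplace): add 1 to every bigram count and V to the denominator."""
    return (bigram[(prev, w)] + 1) / (unigram[prev] + V)

print(p_add_one("ngữ", "ngôn"))   # seen bigram: (2 + 1) / (2 + 7)
print(p_add_one("máy", "ngôn"))   # unseen bigram: (0 + 1) / (2 + 7), no longer zero
```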
Smoothing Techniques (cont)
❏ Other techniques (a sketch of simple interpolation follows below):
❏ Recursive interpolation
❏ Backoff
❏ Good-Turing
❏ Kneser-Ney
❏ ...
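As a taste of these methods, here is a minimal sketch of simple linear interpolation; the λ weights and the probability functions passed in are placeholders, not values or code from the slides.

```python
# Placeholder interpolation weights; in practice they are tuned on held-out data
# and must sum to 1.
LAMBDAS = (0.6, 0.3, 0.1)

def p_interpolated(w, prev2, prev1, p_tri, p_bi, p_uni):
    """Linear interpolation of trigram, bigram, and unigram estimates:
    P(w | prev2 prev1) ≈ λ3*P(w | prev2 prev1) + λ2*P(w | prev1) + λ1*P(w).

    p_tri, p_bi, p_uni are hypothetical probability functions (e.g. MLE estimates).
    """
    l3, l2, l1 = LAMBDAS
    return l3 * p_tri(w, prev2, prev1) + l2 * p_bi(w, prev1) + l1 * p_uni(w)
```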
