Naive Bayes With Sentiment Classification

The document discusses text classification, focusing on the Naïve Bayes classifier, which uses Bayes' theorem for categorizing texts into predefined classes. It covers various applications such as spam detection, sentiment analysis, and authorship identification, as well as the challenges and methodologies involved in training and implementing the classifier. Additionally, it highlights the importance of feature selection and the handling of unknown words and negation in sentiment classification tasks.

Uploaded by

szayon469


Text Classification and Naïve Bayes

The Task of Text Classification
Dan Jurafsky

Is this spam?

Who wrote which Federalist papers?

• 1787-8: anonymous essays try to convince New York to ratify the U.S. Constitution: Jay, Madison, Hamilton.
• Authorship of 12 of the letters in dispute
• 1963: solved by Mosteller and Wallace using Bayesian methods

[Portraits: James Madison, Alexander Hamilton]

Positive or negative movie review?

• unbelievably disappointing
• Full of zany characters and richly applied satire, and some great plot twists
• this is the greatest screwball comedy ever filmed
• It was pathetic. The worst part about it was the boxing scenes.

What is the subject of this article?

Text Classification
• Assigning subject categories, topics, or genres
• Spam detection
• Authorship identification
• Age/gender identification
• Language Identification
• Sentiment analysis
• …

Text Classification: definition

• Input:
• a document d
• a fixed set of classes C = {c1, c2,…, cJ}

• Output: a predicted class c ∈ C

Classification Methods: Hand-coded rules
• Rules based on combinations of words or other features
• spam: black-list-address OR (“dollars” AND “have been selected”)
• Accuracy can be high
• If rules carefully refined by expert
• But building and maintaining these rules is expensive
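A hand-coded rule like the spam example above can be sketched in a few lines of Python. The blacklist contents and the function name `is_spam` are assumptions made for illustration; the slide only names the features.

```python
# Hypothetical blacklist; the slide only mentions a "black-list-address" feature.
BLACKLIST = {"promo@example.com"}

def is_spam(sender: str, body: str) -> bool:
    """Rule: black-list-address OR ("dollars" AND "have been selected")."""
    text = body.lower()
    return sender.lower() in BLACKLIST or (
        "dollars" in text and "have been selected" in text
    )
```

Every new spam pattern needs another hand-written clause, which is where the maintenance cost comes from.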

Classification Methods:
Supervised Machine Learning
• Input:
• a document d
• a fixed set of classes C = {c1, c2,…, cJ}
• A training set of m hand-labeled documents (d1,c1),....,(dm,cm)
• Output:
• a learned classifier γ: d → c

Classification Methods:
Supervised Machine Learning
• Any kind of classifier
• Naïve Bayes
• Logistic regression
• Support-vector machines
• k-Nearest Neighbors

• …

Text Classification and Naive Bayes
The Naive Bayes Classifier
Naive Bayes Intuition

Simple ("naive") classification method based on Bayes rule
Relies on very simple representation of document
◦ Bag of words
The Bag of Words Representation

[Figure: a review reduced to a bag of word counts and passed to the classifier, γ(bag of words) = c]
seen 2
sweet 1
whimsical 1
recommend 1
happy 1
... ...
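As a minimal sketch of the representation, a bag of words is just a table of word counts with position discarded. The tokenizer regex and the sample review text below are assumptions for illustration, not taken from the slides.

```python
import re
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Lowercase, split into word tokens, and count them; order is discarded."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# Made-up review text, chosen so that "seen" occurs twice.
review = "I have seen it twice; sweet and whimsical. Seen it? I recommend it."
bag = bag_of_words(review)
```

Two texts containing the same words in a different order map to the same bag, which is exactly the information the classifier gives up.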
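Bayes’ Rule Applied to Documents and Classes

• For a document d and a class c

$$P(c \mid d) = \frac{P(d \mid c)\,P(c)}{P(d)}$$

Naive Bayes Classifier (I)

$$c_{MAP} = \underset{c \in C}{\operatorname{argmax}}\; P(c \mid d)$$
MAP is “maximum a posteriori” = most likely class

$$= \underset{c \in C}{\operatorname{argmax}}\; \frac{P(d \mid c)\,P(c)}{P(d)}$$
Bayes Rule

$$= \underset{c \in C}{\operatorname{argmax}}\; P(d \mid c)\,P(c)$$
Dropping the denominator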
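Naive Bayes Classifier (II)

$$c_{MAP} = \underset{c \in C}{\operatorname{argmax}}\; P(d \mid c)\,P(c)$$
"Likelihood": P(d | c); "Prior": P(c)

$$= \underset{c \in C}{\operatorname{argmax}}\; P(x_1, x_2, \ldots, x_n \mid c)\,P(c)$$
Document d represented as features x1..xn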
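Naïve Bayes Classifier (IV)

$$c_{MAP} = \underset{c \in C}{\operatorname{argmax}}\; P(x_1, x_2, \ldots, x_n \mid c)\,P(c)$$

P(x1, x2, …, xn | c): O(|X|^n · |C|) parameters. Could only be estimated if a very, very large number of training examples was available.

P(c): How often does this class occur? We can just count the relative frequencies in a corpus.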
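Multinomial Naive Bayes Independence Assumptions

Bag of Words assumption: Assume position doesn’t matter

Conditional Independence: Assume the feature probabilities P(xi|cj) are independent given the class c.

$$P(x_1, \ldots, x_n \mid c) = P(x_1 \mid c) \cdot P(x_2 \mid c) \cdots P(x_n \mid c)$$

Multinomial Naive Bayes Classifier

$$c_{MAP} = \underset{c \in C}{\operatorname{argmax}}\; P(x_1, x_2, \ldots, x_n \mid c)\,P(c)$$

$$c_{NB} = \underset{c \in C}{\operatorname{argmax}}\; P(c) \prod_{x \in X} P(x \mid c)$$

Applying Multinomial Naive Bayes Classifiers to Text Classification

positions ← all word positions in test document

$$c_{NB} = \underset{c_j \in C}{\operatorname{argmax}}\; P(c_j) \prod_{i \in positions} P(x_i \mid c_j)$$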

Problems with multiplying lots of probs
There's a problem with this:

Multiplying lots of probabilities can result in floating-point underflow!

.0006 * .0007 * .0009 * .01 * .5 * .000008….
Idea: Use logs, because log(ab) = log(a) + log(b)
We'll sum logs of probabilities instead of multiplying probabilities!
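The underflow is easy to reproduce. The sketch below repeats the six probabilities from the slide 30 times (the repetition count is an arbitrary assumption, standing in for a long document) and compares the direct product with the sum of logs.

```python
import math

# The six factors from the slide, repeated to mimic a long document.
probs = [0.0006, 0.0007, 0.0009, 0.01, 0.5, 0.000008] * 30

product = 1.0
for p in probs:
    product *= p  # shrinks below the smallest representable double and becomes 0.0

log_sum = sum(math.log(p) for p in probs)  # stays a finite negative number
```

Once `product` reaches 0.0, every class scores the same and the comparison is meaningless, while `log_sum` still ranks classes correctly.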
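We actually do everything in log space
Instead of this:

$$c_{NB} = \underset{c_j \in C}{\operatorname{argmax}}\; P(c_j) \prod_{i \in positions} P(x_i \mid c_j)$$

This:

$$c_{NB} = \underset{c_j \in C}{\operatorname{argmax}} \left[ \log P(c_j) + \sum_{i \in positions} \log P(x_i \mid c_j) \right]$$

Notes:
1) Taking log doesn't change the ranking of classes!
The class with highest probability also has highest log probability!
2) It's a linear model:
Just a max of a sum of weights: a linear function of the inputs
So Naive Bayes is a linear classifier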
Text Classification and Naïve Bayes
Naive Bayes: Learning
Sec.13.3

Learning the Multinomial Naive Bayes Model

First attempt: maximum likelihood estimates

◦ simply use the frequencies in the data

$$\hat{P}(c_j) = \frac{N_{c_j}}{N_{total}}$$
Parameter estimation

$$\hat{P}(w_i \mid c_j) = \frac{count(w_i, c_j)}{\sum_{w \in V} count(w, c_j)}$$

fraction of times word wi appears among all words in documents of topic cj

Create mega-document for topic j by concatenating all docs in this topic
◦ Use frequency of w in mega-document
Sec.13.3

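Problem with Maximum Likelihood

What if we have seen no training documents with the word fantastic and classified in the topic positive (thumbs-up)?

$$\hat{P}(\text{fantastic} \mid \text{positive}) = \frac{count(\text{fantastic}, \text{positive})}{\sum_{w \in V} count(w, \text{positive})} = 0$$

Zero probabilities cannot be conditioned away, no matter the other evidence!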
Laplace (add-1) smoothing for Naïve Bayes

$$\hat{P}(w_i \mid c) = \frac{count(w_i, c) + 1}{\left(\sum_{w \in V} count(w, c)\right) + |V|}$$
Multinomial Naïve Bayes: Learning

• From training corpus, extract Vocabulary

• Calculate P(cj) terms
◦ For each cj in C do
    docsj ← all docs with class = cj
    P(cj) ← |docsj| / |total # documents|

• Calculate P(wk | cj) terms
◦ Textj ← single doc containing all docsj
◦ For each word wk in Vocabulary
    nk ← # of occurrences of wk in Textj
    P(wk | cj) ← (nk + 1) / (n + |Vocabulary|), where n is the number of tokens in Textj
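The learning procedure above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the slides' own code: the function names `train_nb` and `classify` are made up, documents are assumed to be pre-tokenized lists, probabilities are stored as logs, and test words outside the vocabulary are skipped.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (tokens, label). Returns (log_prior, log_likelihood, vocab)."""
    vocab = {w for tokens, _ in docs for w in tokens}
    doc_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)  # per-class counts, i.e. the "mega-document" Text_j
    for tokens, label in docs:
        word_counts[label].update(tokens)

    log_prior, log_likelihood = {}, {}
    for c in doc_counts:
        log_prior[c] = math.log(doc_counts[c] / len(docs))
        n = sum(word_counts[c].values())  # number of tokens in Text_j
        for w in vocab:
            # add-1 smoothing: (n_k + 1) / (n + |Vocabulary|)
            log_likelihood[w, c] = math.log((word_counts[c][w] + 1) / (n + len(vocab)))
    return log_prior, log_likelihood, vocab

def classify(tokens, log_prior, log_likelihood, vocab):
    """Return the class with the highest log prior plus summed log likelihoods."""
    scores = {
        c: log_prior[c] + sum(log_likelihood[w, c] for w in tokens if w in vocab)
        for c in log_prior
    }
    return max(scores, key=scores.get)

# Hypothetical toy data, for illustration only.
toy = [(["great", "fun"], "pos"), (["boring", "plot"], "neg"), (["boring"], "neg")]
model = train_nb(toy)
```

With this toy data, `classify(["great", "fun"], *model)` picks the class whose smoothed word probabilities best explain the tokens.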
Unknown words
What about unknown words
◦ that appear in our test data
◦ but not in our training data or vocabulary?
We ignore them
◦ Remove them from the test document!
◦ Pretend they weren't there!
◦ Don't include any probability for them at all!
Why don't we build an unknown word model?
◦ It doesn't help: knowing which class has more unknown words is not generally helpful!
Stop words
Some systems ignore stop words
◦ Stop words: very frequent words like the and a.
◦ Sort the vocabulary by word frequency in training set
◦ Call the top 10 or 50 words the stopword list.
◦ Remove all stop words from both training and test sets
◦ As if they were never there!

But removing stop words doesn't usually help
• So in practice most NB algorithms use all words and don't use stopword lists
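For systems that do use a stopword list, the frequency-based recipe above might look like the sketch below. The function names and the default `k=10` are assumptions for illustration.

```python
from collections import Counter

def top_k_stopwords(docs, k=10):
    """Call the k most frequent training words the stopword list."""
    freq = Counter(w for tokens in docs for w in tokens)
    return {w for w, _ in freq.most_common(k)}

def remove_stopwords(tokens, stopwords):
    """Drop stop words, as if they were never there."""
    return [w for w in tokens if w not in stopwords]
```

The same `stopwords` set would be applied to both training and test documents.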
Text Classification and Naive Bayes
Sentiment and Binary Naive Bayes
Let's do a worked sentiment example!
A worked sentiment example with add-1 smoothing
1. Prior from training:
$$\hat{P}(c_j) = \frac{N_{c_j}}{N_{total}}$$
P(-) = 3/5
P(+) = 2/5
2. Drop "with"
3. Likelihoods from training:
$$p(w_i \mid c) = \frac{count(w_i, c) + 1}{\left(\sum_{w \in V} count(w, c)\right) + |V|}$$
4. Scoring the test set:
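The four steps can be traced in code. The training and test sentences are not reproduced in this text, so the data below is an assumption: it is the five-document example from the accompanying textbook chapter (Jurafsky and Martin), which matches the priors 3/5 and 2/5 and the dropped word "with". Exact fractions are used so each step can be checked by hand.

```python
from collections import Counter
from fractions import Fraction

# Assumed training set (textbook example; not shown in the text above).
train = [
    ("-", "just plain boring"),
    ("-", "entirely predictable and lacks energy"),
    ("-", "no surprises and very few laughs"),
    ("+", "very powerful"),
    ("+", "the most fun film of the summer"),
]
test = "predictable with no fun"  # assumed test sentence

vocab = {w for _, text in train for w in text.split()}
counts = {"-": Counter(), "+": Counter()}
n_docs = Counter()
for label, text in train:
    counts[label].update(text.split())
    n_docs[label] += 1

def prior(c):
    # Step 1: N_c / N_total
    return Fraction(n_docs[c], len(train))

def likelihood(w, c):
    # Step 3: add-1 smoothed estimate
    return Fraction(counts[c][w] + 1, sum(counts[c].values()) + len(vocab))

def score(c):
    # Step 4: prior times the likelihood of each known test word
    s = prior(c)
    for w in test.split():
        if w in vocab:  # Step 2: "with" is not in the vocabulary, so it is dropped
            s *= likelihood(w, c)
    return s

prediction = max(["-", "+"], key=score)
```

Under these assumptions the negative class gets the higher score, so the test sentence is labeled negative.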
Optimizing for sentiment analysis
For tasks like sentiment, word occurrence seems to be more important than word frequency.
◦ The occurrence of the word fantastic tells us a lot
◦ The fact that it occurs 5 times may not tell us much more.
Binary multinomial naive Bayes, or binary NB
◦ Clip our word counts at 1
◦ Note: this is different from Bernoulli naive Bayes; see the textbook at the end of the chapter.
Binary Multinomial Naïve Bayes: Learning
• From training corpus, extract Vocabulary

• Calculate P(cj) terms
◦ For each cj in C do
    docsj ← all docs with class = cj

• Calculate P(wk | cj) terms
◦ Remove duplicates in each doc:
    • For each word type w in docj
    • Retain only a single instance of w
◦ Textj ← single doc containing all docsj
◦ For each word wk in Vocabulary
    nk ← # of occurrences of wk in Textj
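Binary Multinomial Naive Bayes on a test document d
First remove all duplicate words from d
Then compute NB using the same equation:

$$c_{NB} = \underset{c_j \in C}{\operatorname{argmax}}\; P(c_j) \prod_{i \in positions} P(w_i \mid c_j)$$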
Binary multinomial naive Bayes

Counts can still be 2! Binarization is within-doc!
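A short sketch makes the within-document point concrete. The function name and the two toy documents are assumptions for illustration.

```python
from collections import Counter

def binarized_counts(docs):
    """docs: token lists from one class. Each word type counts once per document."""
    total = Counter()
    for tokens in docs:
        total.update(set(tokens))  # duplicates removed within this document only
    return total

# Hypothetical documents of one class: "great" appears in both.
docs = [["great", "great", "film"], ["great", "plot"]]
counts = binarized_counts(docs)
```

Here "great" ends up with a count of 2 even though each document contributes at most 1, because the clipping happens per document and not across the class.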

Text Classification and Naive Bayes
More on Sentiment Classification
Sentiment Classification: Dealing with Negation
I really like this movie
I really don't like this movie

Negation changes the meaning of "like" to negative.

Negation can also change negative to positive-ish
◦ Don't dismiss this film
◦ Doesn't let us get bored
Sentiment Classification: Dealing with Negation
Das, Sanjiv and Mike Chen. 2001. Yahoo! for Amazon: Extracting market sentiment from stock message boards. In Proceedings of the Asia Pacific Finance Association Annual Conference (APFA).
Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP-2002, 79-86.

Simple baseline method:

Add NOT_ to every word between negation and following punctuation:

didn’t like this movie , but I

didn’t NOT_like NOT_this NOT_movie , but I
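The baseline can be sketched as a single pass over the tokens. The list of negation triggers and the punctuation set below are assumptions; the slides only show the "didn't" example.

```python
# Assumed negation triggers and clause-ending punctuation.
NEGATIONS = {"not", "no", "never"}
PUNCT = set(".,;:!?")

def mark_negation(tokens):
    """Prefix NOT_ to every token between a negation and the next punctuation mark."""
    out, negated = [], False
    for tok in tokens:
        if tok in PUNCT:
            negated = False  # punctuation closes the negation scope
            out.append(tok)
        elif negated:
            out.append("NOT_" + tok)
        else:
            out.append(tok)
            low = tok.lower()
            if low in NEGATIONS or low.endswith(("n't", "n’t")):
                negated = True
    return out
```

For example, `mark_negation("didn't like this movie , but I".split())` marks "like", "this" and "movie" and leaves ", but I" untouched.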

Sentiment Classification: Lexicons
Sometimes we don't have enough labeled training data
In that case, we can make use of pre-built word lists
Called lexicons
There are various publicly available lexicons
MPQA Subjectivity Cues Lexicon
Theresa Wilson, Janyce Wiebe, and Paul Hoffmann (2005). Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. Proc. of HLT-EMNLP-2005.

Riloff and Wiebe (2003). Learning extraction patterns for subjective expressions. EMNLP-2003.

Home page: https://mpqa.cs.pitt.edu/lexicons/subj_lexicon/

6885 words from 8221 lemmas, annotated for intensity (strong/weak)
◦ 2718 positive
◦ 4912 negative
+ : admirable, beautiful, confident, dazzling, ecstatic, favor, glee, great
− : awful, bad, bias, catastrophe, cheat, deny, envious, foul, harsh, hate
The General Inquirer
Philip J. Stone, Dexter C Dunphy, Marshall S. Smith, Daniel M. Ogilvie. 1966. The General Inquirer: A Computer Approach to Content Analysis. MIT Press

◦ Home page: http://www.wjh.harvard.edu/~inquirer

◦ List of Categories: http://www.wjh.harvard.edu/~inquirer/homecat.htm
◦ Spreadsheet: http://www.wjh.harvard.edu/~inquirer/inquirerbasic.xls
Categories:
◦ Positiv (1915 words) and Negativ (2291 words)
◦ Strong vs Weak, Active vs Passive, Overstated versus Understated
◦ Pleasure, Pain, Virtue, Vice, Motivation, Cognitive Orientation, etc
Free for Research Use
Using Lexicons in Sentiment Classification

Add a feature that gets a count whenever a word from the lexicon occurs
◦ E.g., a feature called "this word occurs in the positive lexicon" or "this word occurs in the negative lexicon"
Now all positive words (good, great, beautiful, wonderful) or negative words count for that feature.
Using 1-2 features isn't as good as using all the words.
• But when training data is sparse or not representative of the test set, dense lexicon features can help
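A sketch of the two lexicon features follows. The word sets are tiny stand-ins assembled from words mentioned on the slides, not the actual MPQA or General Inquirer lists, and the feature names are made up.

```python
# Tiny stand-in lexicons for illustration; a real system would load a full word list.
POSITIVE = {"admirable", "beautiful", "great", "good", "wonderful"}
NEGATIVE = {"awful", "bad", "harsh", "hate"}

def lexicon_features(tokens):
    """Count how many tokens fall in each lexicon."""
    return {
        "in_positive_lexicon": sum(t.lower() in POSITIVE for t in tokens),
        "in_negative_lexicon": sum(t.lower() in NEGATIVE for t in tokens),
    }
```

All positive words share one feature, so a word never seen in training still contributes evidence as long as it is in the lexicon.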
Naive Bayes in Other tasks: Spam Filtering

SpamAssassin Features:
◦ Mentions millions of (dollar) ((dollar) NN,NNN,NNN.NN)
◦ From: starts with many numbers
◦ Subject is all capitals
◦ HTML has a low ratio of text to image area
◦ "One hundred percent guaranteed"
◦ Claims you can be removed from the list
Naive Bayes in Language ID
Determining what language a piece of text is written in.
Features based on character n-grams do very well
Important to train on lots of varieties of each language (e.g., American English varieties like African-American English, or English varieties around the world like Indian English)
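Character n-gram features are simple to extract. The sketch below uses trigrams with one space of padding on each side; both choices are assumptions, since the slides do not fix n or a padding scheme.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Count overlapping character n-grams, padding with a space at each end."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
```

These counts can then play the role that word counts play in the classifiers above.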
Summary: Naive Bayes is Not So Naive
Very fast, low storage requirements
Works well with very small amounts of training data
Robust to irrelevant features
◦ Irrelevant features cancel each other without affecting results
Very good in domains with many equally important features
◦ Decision trees suffer from fragmentation in such cases, especially if little data
Optimal if the independence assumptions hold: if the assumed independence is correct, then it is the Bayes Optimal Classifier for the problem
A good dependable baseline for text classification
◦ But we will see other classifiers that give better accuracy
Slide from Chris Manning
Text Classification and Naïve Bayes

Naïve Bayes: Relationship to Language Modeling

Generative Model for Multinomial Naïve Bayes

c=China

X1=Shanghai X2=and X3=Shenzhen X4=issue X5=bonds


Naïve Bayes and Language Modeling

• Naïve Bayes classifiers can use any sort of feature
• URL, email address, dictionaries, network features
• But if, as in the previous slides
• We use only word features
• we use all of the words in the text (not a subset)
• Then
• Naïve Bayes has an important similarity to language modeling.
Sec.13.2.1

Each class = a unigram language model

• Assigning each word: P(word | c)
• Assigning each sentence: P(s|c) = Π P(word|c)

Class pos
0.1 I
0.1 love
0.01 this
0.05 fun
0.1 film
…

I love this fun film
0.1 0.1 0.01 0.05 0.1

P(s | pos) = 0.0000005
Sec.13.2.1

Naïve Bayes as a Language Model

• Which class assigns the higher probability to s?

Word   Model pos   Model neg
I      0.1         0.2
love   0.1         0.001
this   0.01        0.01
fun    0.05        0.005
film   0.1         0.1

s = I love this fun film
pos: 0.1 0.1 0.01 0.05 0.1
neg: 0.2 0.001 0.01 0.005 0.1

P(s|pos) > P(s|neg)
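The comparison can be checked numerically using the word probabilities given for the two models. The dictionary and function names are assumptions for illustration.

```python
import math

# Unigram probabilities P(word | class) as listed for the two models.
pos = {"I": 0.1, "love": 0.1, "this": 0.01, "fun": 0.05, "film": 0.1}
neg = {"I": 0.2, "love": 0.001, "this": 0.01, "fun": 0.005, "film": 0.1}

def sentence_prob(sentence, model):
    """P(s | c) as the product of the unigram probabilities of its words."""
    return math.prod(model[w] for w in sentence.split())

s = "I love this fun film"
p_pos, p_neg = sentence_prob(s, pos), sentence_prob(s, neg)
```

The positive model assigns the sentence a probability of about 5e-7 and the negative model about 1e-9, so P(s|pos) > P(s|neg).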
Text Classification and Naive Bayes
Precision, Recall, and F1
Evaluating Classifiers: How well does our classifier work?
Let's first address binary classifiers:
• Is this email spam?
spam (+) or not spam (-)
• Is this post about Delicious Pie Company?
about Del. Pie Co (+) or not about Del. Pie Co(-)

We'll need to know

1. What did our classifier say about each email or post?
2. What should our classifier have said, i.e., the correct answer, usually as defined by humans ("gold label")
First step in evaluation: The confusion matrix
Accuracy on the confusion matrix
Why don't we use accuracy?
Accuracy doesn't work well when we're dealing with uncommon or imbalanced classes
Suppose we look at 1,000,000 social media posts to find Delicious Pie-lovers (or haters)
• 100 of them talk about our pie
• 999,900 are posts about something unrelated
Imagine the following simple classifier
Every post is "not about pie"
Accuracy re: pie posts
100 posts are about pie; 999,900 aren't
Why don't we use accuracy?
Accuracy of our "nothing is pie" classifier
999,900 true negatives and 100 false negatives
Accuracy is 999,900/1,000,000 = 99.99%!
But useless at finding pie-lovers (or haters)!!
Which was our goal!
Accuracy doesn't work well for unbalanced classes
Most tweets are not about pie!
Instead of accuracy we use precision and recall

Precision: % of selected items that are correct

Recall: % of correct items that are selected
Precision/Recall aren't fooled by the "just call everything negative" classifier!
Stupid classifier: Just say no: every tweet is "not about pie"
• 100 tweets talk about pie, 999,900 tweets don't
• Accuracy = 999,900/1,000,000 = 99.99%
But the Recall and Precision for this classifier are terrible:
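The numbers for the "nothing is pie" classifier can be worked out from its confusion-matrix counts: no true positives, no false positives, 100 false negatives and 999,900 true negatives. The helper functions below are an illustrative sketch; returning `None` for an undefined ratio is a design choice made here, not something the slides prescribe.

```python
def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

def precision(tp, fp):
    # Undefined (0/0) when the classifier selects nothing at all.
    return tp / (tp + fp) if tp + fp else None

def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else None

# "Every post is not about pie": 100 pie posts missed, 999,900 correctly rejected.
tp, fp, fn, tn = 0, 0, 100, 999_900
acc, prec, rec = accuracy(tp, fp, fn, tn), precision(tp, fp), recall(tp, fn)
```

Accuracy comes out at 99.99%, while recall is 0 (none of the 100 pie posts is found) and precision has nothing to measure because no post was selected.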
A combined measure: F1

F1 is a combination of precision and recall.

F1 is a special case of the general "F-measure"

F-measure is the (weighted) harmonic mean of precision and recall

F1 is a special case of F-measure with β=1, α=½
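In terms of β, the weighted harmonic mean is commonly written F_β = (β² + 1)PR / (β²P + R), with β² = (1 − α)/α, so β = 1 (α = ½) gives F1 = 2PR / (P + R). A minimal sketch follows; the function name and the convention of returning 0.0 when both inputs are zero are assumptions.

```python
def f_measure(p, r, beta=1.0):
    """Weighted harmonic mean of precision p and recall r; beta=1 gives F1."""
    if p == 0 and r == 0:
        return 0.0  # convention chosen here to avoid dividing by zero
    b2 = beta ** 2
    return (b2 + 1) * p * r / (b2 * p + r)
```

Values of β above 1 weight recall more heavily and values below 1 weight precision more heavily.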

Suppose we have more than 2 classes?

Lots of text classification tasks have more than two classes.

◦ Sentiment analysis (positive, negative, neutral), named entities (person, location, organization)

We can define precision and recall for multiple classes like this 3-way email task:
How to combine P/R values for different classes:
Microaveraging vs Macroaveraging
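The difference between the two averages shows up clearly in code. The per-class counts below are illustrative assumptions for a 3-way email task (the confusion matrix itself is not reproduced in this text), and only precision is shown.

```python
# Illustrative (true positive, false positive) counts per class; assumed values.
counts = {"urgent": (8, 11), "normal": (60, 55), "spam": (200, 33)}

def macro_precision(counts):
    """Compute precision per class, then average the per-class values."""
    per_class = [tp / (tp + fp) for tp, fp in counts.values()]
    return sum(per_class) / len(per_class)

def micro_precision(counts):
    """Pool the counts of all classes, then compute a single precision."""
    tp = sum(tp for tp, _ in counts.values())
    fp = sum(fp for _, fp in counts.values())
    return tp / (tp + fp)

macro, micro = macro_precision(counts), micro_precision(counts)
```

The microaverage is dominated by the large "spam" class, while the macroaverage gives the small "urgent" class equal weight, so it is lower here.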
Text Classification and Naive Bayes
Avoiding Harms in Classification
Harms of classification
Classifiers, like any NLP algorithm, can cause harms
This is true for any classifier, whether Naive Bayes or other algorithms
Representational Harms
• Harms caused by a system that demeans a social group
• Such as by perpetuating negative stereotypes about them.
• Kiritchenko and Mohammad 2018 study
• Examined 200 sentiment analysis systems on pairs of sentences
• Identical except for names:
• common African American (Shaniqua) or European American (Stephanie).
• Like "I talked to Shaniqua yesterday" vs "I talked to Stephanie yesterday"
• Result: systems assigned lower sentiment and more negative emotion to sentences with African American names
• Downstream harm:
• Perpetuates stereotypes about African Americans
• African Americans treated differently by NLP tools like sentiment (widely
Harms of Censorship
• Toxicity detection is the text classification task of detecting hate speech, abuse, harassment, or other kinds of toxic language.
• Widely used in online content moderation
• Toxicity classifiers incorrectly flag non-toxic sentences that simply
mention minority identities (like the words "blind" or "gay")
• women (Park et al., 2018),
• disabled people (Hutchinson et al., 2020)
• gay people (Dixon et al., 2018; Oliva et al., 2021)
• Downstream harms:
• Censorship of speech by disabled people and other groups
• Speech by these groups becomes less visible online
• Writers might be nudged by these algorithms to avoid these words
Performance Disparities
1. Text classifiers perform worse on many languages of the world due to lack of data or labels
2. Text classifiers perform worse on varieties of even high-resource languages like English
• Example task: language identification, a first step in NLP pipeline ("Is this post in English or not?")
• English language detection performance worse for writers who are African American (Blodgett and O'Connor 2017) or from India (Jurgens et al., 2017)
Harms in text classification
• Causes:
• Issues in the data; NLP systems amplify biases in training data
• Problems in the labels
• Problems in the algorithms (like what the model is trained to optimize)
• Prevalence: The same problems occur throughout NLP (including large language models)
• Solutions: There are no general mitigations or solutions
• But harm mitigation is an active area of research
• And there are standard benchmarks and tools that we can use for measuring some of the harms
