
electronics

Article
Short Text Sentiment Classification Using Bayesian and Deep
Neural Networks
Zhan Shi and Chongjun Fan *

Business School, University of Shanghai for Science & Technology, Shanghai 200093, China;
211420079@st.usst.edu.cn
* Correspondence: fan_chongjun@163.com

Abstract: Previous multi-layer learning networks easily fall into local extrema during supervised learning. If the training samples sufficiently cover future samples, the learned multi-layer weights can be used reliably to predict new test samples. This paper studies machine short text sentiment classification based on Bayesian networks and deep neural network algorithms. It first introduces the Bayesian network and deep neural network algorithms, and then analyzes comments from Twitter, Weibo, and other popular platforms for emotional communication. Using modeling technology, popular reviews are classified over unigrams, bigrams, parts of speech, dependency labels, and triplet dependencies. The results show that the classification accuracy ranges from a minimum of 0.8116 to a maximum of 0.87. These values are obtained when the triplet dependency feature has 12,000 input nodes and the reconstruction error of the restricted Boltzmann machine is bounded between 7.3175 and 26.5429, with an average classification accuracy of 0.8301. This illustrates the advantages of triplet dependency features for text representation in text sentiment classification tasks, and shows that Bayesian networks and deep neural networks offer good advantages in short text sentiment classification.

Keywords: Bayesian network; deep neural network algorithms; text sentiment analysis; machine learning

1. Introduction
Sentiment analysis has a long research history in the field of natural language processing. In the past, most methods were at least partially based on domain knowledge. Since then, machine learning-based methods have become the mainstream approach to sentiment analysis.
Within sentiment analysis, sentiment classification is the most important task. Based on the emotional information expressed in a text, it divides the text into two or more categories, that is, a division of the attitudes, views, and tendencies of the text's authors. Sentiment classification is a newer research direction with important application value in opinion mining, information prediction, comment classification, spam filtering, part-of-speech tagging, public opinion monitoring, and more.
On blog and Weibo data, support vector machines and the multinomial naive Bayes model have been tested respectively. It was found that on long texts (blogs), SVMs perform better, while on short texts (microblogs and Twitter), multinomial naive Bayes models outperform.
Based on the analysis of sentiment data streams with association rules, research on the major events of 2010 found that new training data are continuously obtained in the data stream, and studied how to automatically analyze users' opinions and emotions in a real-time environment.
This paper mainly introduces the deep neural network algorithm, the Bayesian regularization deep belief network, and machine learning text sentiment classification; it tests the role of the meta-learning method based on deep belief networks in text sentiment classification, carries out experimental research and analysis, and draws conclusions.


The innovation of this paper is to use the deep neural network algorithm to establish the BR-DBN model and test its performance. The results show that the model is suitable for discriminative classification problems; the experimental research and analysis in the experimental part are closely linked to this model.

2. Related Work
Text sentiment analysis plays an important role in social network information mining. It is also the theoretical foundation for personalized recommendation, interest circle classification, and public opinion analysis. Chang G therefore proposed a fine-grained short text sentiment analysis method based on machine learning. To improve feature selection and weighting, he proposed a sentiment analysis feature selection algorithm, N-CHI, and a weight calculation method, W-TF-IDF, which are more suitable for feature extraction, and improved the proportion and weight of sentiment words among the feature words through experiments [1]. In addition to the traditional document classification feature set, it is also possible to extract the comments of certain posts as part of the microblog features, based on the relationship between the commenter and the poster, by constructing a microblog social network as input information. Sun X proposed a Deep Belief Network (DBN) model and a multimodal feature extraction method to extend the features and dimensions of short texts for Chinese microblog sentiment classification [2]. Emotions can be expressed in many ways, such as facial expressions and gestures, speech, and written text. Sentiment analysis in text documents is essentially a content-based classification problem involving concepts from the fields of natural language processing and machine learning. Joshi S discussed techniques used in sentiment recognition and sentiment analysis based on textual data [3]. In recent years, sentiment analysis research has gained huge impetus on English text data; however, few studies have focused on Nepali text data, which is the focus of the work of Piryani R, who explored machine learning methods and proposed a dictionary-based approach to sentiment analysis of tweets written in Nepali using linguistic features and lexical resources [4]. Text classification is a central task in natural language processing, aiming to classify text documents into predefined classes or categories. It needs appropriate functions to describe the content and meaning of text documents and map them to target categories. Existing text feature representations depend on the weighted representation of document terms. Therefore, choosing an appropriate term weighting method is very important and helps to improve the effectiveness of classification tasks. Attieh J provides a new text classification framework for category-based feature engineering [5]. The naive Bayesian learning algorithm is widely used in many fields, especially text classification. However, when it is used in fields that violate its naive assumptions, or when the training set is too small to find an accurate probability estimate, its performance declines. El Hindi K M proposed a lazily fine-tuned naive Bayesian method to solve these two problems [6]. In recent years, deep learning models have been successfully applied to text emotion analysis. However, category imbalance and unlabeled corpora still limit the accuracy of text emotion classification. To overcome these two problems, Jiang W proposed a new text sentiment analysis classification model [7].
Sentiment analysis of online content related to electronic news, products, services, etc., has become very important in this digital age for improving the quality of the services provided. Machine learning-based, knowledge-based, and hybrid methods are the three main approaches to sentiment analysis. The system proposed by Divate MS is a polarity-based sentiment analysis of Marathi electronic news [8]. Twitter is an online blogging site that provides a platform for people to share and talk about their thoughts on troubles, events, merchandise, and other ideas. Bhagat C proposed that the most important goal is a comprehensive understanding of how machine learning strategies are used in sentiment analysis in order to get better results on short texts [9]. Sentiment analysis is one of the main fields of natural language processing, and its main task is to extract sentiments, opinions, attitudes, and emotions from subjective texts. Due to its importance in decision-making and people's trust in website reviews, there are many academic studies addressing the sentiment analysis (SA) problem.
Albayati A Q proposed deep learning to explore powerful machine learning techniques, with their emerging feature representations and ability to discriminate data, resulting in state-of-the-art prediction results [10]. Social network data are unstructured and unpredictable, and contain idioms, jargon, and dynamic themes. Machine learning algorithms for traffic event detection may not be able to extract valuable information from social network data. Farman Ali proposed a real-time monitoring framework based on social networks for traffic accident detection and condition analysis, using ontology and latent Dirichlet allocation together with bidirectional long short-term memory [11]. In the emotional attitude extraction task, the goal is to identify the "attitude", the emotional relationship between the entities mentioned in the text. Rusnachenko N studied attention-based context encoders for the emotional attitude extraction task [12]. The views put forward by these scholars are all in line with the current situation of emotional texts, and this research has great significance. However, they all overlooked a very important point: they did not clarify their research objects. Therefore, this paper focuses on an investigation and analysis combining algorithm experiments with actual research objects.

3. Bayesian Network and Deep Neural Network Algorithm
3.1. Deep Neural Network Algorithm
Deep learning is a research field of machine learning. It studies the distribution rules of data so that the machine can have the same learning ability as humans, with a certain recognition ability for images and sounds. In recent years, deep learning has achieved great success in computer vision, speech recognition, data mining, and many other fields, and it can also handle difficulties that traditional methods cannot overcome. Therefore, it has become a new research hotspot. Deep learning uses more complex neural networks to solve problems. Face recognition technology is ubiquitous in daily life, and face recognition is a relatively important deep learning direction. For a neural network, a face is like a data matrix: the top layer is used to extract facial features, and the bottom layer is used to recognize facial features.
(1) Deep self-encoding network
An encoder is a device that encodes a signal or data into a form that can be communicated, transmitted, and stored. The encoder converts angular displacement and linear displacement into electrical signals. According to the reading mode, encoders can be divided into contact and non-contact types; according to the working principle of coding, coding devices can be divided into incremental and absolute-value types. The encoder discussed here appeared in the 1980s and is an unsupervised learning algorithm. Its basic idea is to reproduce the network's input at the output as closely as possible. The encoding process maps the input layer to the hidden layer, and the decoding process maps the hidden layer to the output layer. Figure 1 shows the overall framework of this paper.

Figure 1. Overall framework.
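To make the encode-decode process above concrete, the following is a minimal sketch of a single-hidden-layer autoencoder trained to reproduce its input. The layer sizes, sigmoid activation, learning rate, and toy data are illustrative assumptions, not the configuration used in this paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_visible, n_hidden = 20, 5                      # illustrative layer sizes
W1 = rng.normal(0, 0.1, (n_visible, n_hidden))   # encoder weights
b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_hidden, n_visible))   # decoder weights
b2 = np.zeros(n_visible)

X = rng.random((100, n_visible))                 # toy input data
lr = 0.5
for _ in range(2000):
    H = sigmoid(X @ W1 + b1)                     # encoding: input layer -> hidden layer
    Y = sigmoid(H @ W2 + b2)                     # decoding: hidden layer -> output layer
    dY = (Y - X) * Y * (1 - Y)                   # gradient of squared reconstruction error
    dH = (dY @ W2.T) * H * (1 - H)               # backpropagated to the hidden layer
    W2 -= lr * H.T @ dY / len(X)
    b2 -= lr * dY.mean(axis=0)
    W1 -= lr * X.T @ dH / len(X)
    b1 -= lr * dH.mean(axis=0)

recon = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
print("reconstruction MSE:", np.mean((recon - X) ** 2))
```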

(2) Deep belief network


In the deep belief network, the energy function refers to the function used to calculate the "energy" or "cost" of each state in the network. The energy function is one of the core concepts of the deep belief network: it describes the complexity of the model and the model's fit to the data. During training, the network parameters are iteratively optimized to minimize the energy function on the training data. When the network parameters are fixed, the energy function can be used to evaluate whether new data conform to the network's distribution. Suppose layer d has m visible units and layer g has n hidden units. Then the energy function between the visible layer nodes and the hidden layer nodes (d, g) is:

$$P(d, g|\alpha) = -\sum_{j=1}^{m} x_j d_j - \sum_{i=1}^{n} y_i g_i - \sum_{j=1}^{m} \sum_{i=1}^{n} d_j w_{ji} g_i \qquad (1)$$

The joint probability distribution of (d, g) can be obtained as:

$$E(d, g|\alpha) = \frac{p^{-P(d,g|\alpha)}}{z(\alpha)} \qquad (2)$$

in which

$$z(\alpha) = \sum_{d,g} p^{-P(d,g|\alpha)} \qquad (3)$$

Then the likelihood functions $E(d|\alpha)$ and $E(g|\alpha)$ can be expressed as:

$$E(d|\alpha) = \frac{1}{z(\alpha)} \sum_{g} p^{-P(d,g|\alpha)} \qquad (4)$$

$$E(g|\alpha) = \frac{1}{z(\alpha)} \sum_{d} p^{-P(d,g|\alpha)} \qquad (5)$$

In addition, the conditional probabilities $E(d|g;\alpha)$ and $E(g|d;\alpha)$ of the visible layer and hidden layer can also be obtained as:

$$E(d|g;\alpha) = \frac{E(d, g|\alpha)}{E(g|\alpha)} = \frac{\frac{1}{z(\alpha)} p^{-P(d,g|\alpha)}}{\frac{1}{z(\alpha)} \sum_{d} p^{-P(d,g|\alpha)}} = \frac{p^{-P(d,g|\alpha)}}{\sum_{d} p^{-P(d,g|\alpha)}} \qquad (6)$$

$$E(g|d;\alpha) = \frac{E(d, g|\alpha)}{E(d|\alpha)} = \frac{\frac{1}{z(\alpha)} p^{-P(d,g|\alpha)}}{\frac{1}{z(\alpha)} \sum_{g} p^{-P(d,g|\alpha)}} = \frac{p^{-P(d,g|\alpha)}}{\sum_{g} p^{-P(d,g|\alpha)}} \qquad (7)$$

Since there are no connections within the hidden layer or within the visible layer, the activation functions can be derived from Equations (6) and (7), respectively:

$$E(d_j = 1|g;\alpha) = \frac{1}{1 + p^{-x_j - \sum_i w_{ji} g_i}} \qquad (8)$$

$$E(g_i = 1|d;\alpha) = \frac{1}{1 + p^{-y_i - \sum_j w_{ji} d_j}} \qquad (9)$$
Learning an RBM amounts to determining the parameter values that best fit the training data. These values can be obtained by gradient ascent, maximizing the likelihood function. To simplify the calculation, the logarithm is taken, and the key step is to find the partial derivative with respect to α, namely:

$$\frac{\partial \ln E(d|\alpha)}{\partial \alpha} = \left\langle \frac{\partial P(d, g|\alpha)}{\partial \alpha} \right\rangle_{E(g|d,\alpha)} - \left\langle \frac{\partial P(d, g|\alpha)}{\partial \alpha} \right\rangle_{E(d|g,\alpha)} \qquad (10)$$


Since $\alpha = \{w_{ji}, x_j, y_i\}$, the partial derivative with respect to $w_{ji}$ can be obtained as:

$$\frac{\partial \ln E(d|\alpha)}{\partial w_{ji}} = \langle d_j g_i \rangle_{E(g|d,\alpha)} - \langle d_j g_i \rangle_{E(d,g|\alpha)} \qquad (11)$$

Currently, in RBM algorithms, fast approximate algorithms (such as contrastive divergence) are usually used to sample the reconstructed data and update the parameter values.
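As a concrete illustration of such fast approximate training, here is a minimal sketch of one-step contrastive divergence (CD-1) for a binary RBM, mirroring Equations (8), (9), and (11) with the base p read as e. The sizes, learning rate, and toy data are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
m, n = 12, 6                                   # visible units d_j, hidden units g_i
W = rng.normal(0, 0.01, (m, n))                # weights w_ji
x = np.zeros(m)                                # visible biases x_j
y = np.zeros(n)                                # hidden biases y_i
lr = 0.1

data = (rng.random((50, m)) < 0.3).astype(float)   # toy binary training batch
for epoch in range(100):
    g0 = sigmoid(data @ W + y)                 # Eq. (9): P(g_i = 1 | d)
    g0_s = (rng.random(g0.shape) < g0).astype(float)
    d1 = sigmoid(g0_s @ W.T + x)               # Eq. (8): P(d_j = 1 | g), reconstruction
    g1 = sigmoid(d1 @ W + y)
    # Eq. (11): data-driven minus reconstruction-driven correlations
    W += lr * (data.T @ g0 - d1.T @ g1) / len(data)
    x += lr * (data - d1).mean(axis=0)
    y += lr * (g0 - g1).mean(axis=0)
    recon_err = np.sum((data - d1) ** 2) / len(data)

print("reconstruction error:", recon_err)
```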
3.2. Bayesian Regularization Deep Belief Networks

The Bayesian regularization (BR) method is a Bayesian inference method for determining the hyperparameters in a regularization scheme. At present, it has been applied in research on image recognition, medicine, economics, and other fields. The Bayesian regularization algorithm provides a new idea for improving the generalization ability of neural networks and has been widely recognized in many fields of research. Its network structure diagram is shown in Figure 2.

Figure 2. Bayesian network structure diagram.

The purpose of this paper is to apply the Bayesian regularization algorithm to the RBM algorithm to improve the generalization ability of the DBN. Suppose the error function is:

$$Q = \frac{1}{2} \sum_{v=1}^{V} \sum_{m=1}^{M} (b_{mv} - c_{mv})^2 \qquad (12)$$

where V represents the number of output nodes; M represents the number of training samples; $b_{mv}$ represents the expected output value; and $c_{mv}$ represents the actual output value. Then, using the regularization method, the training function becomes:

$$P = \beta Q + \varphi Q_W \qquad (13)$$



In the formula, P is the new learning function, β and ϕ are the hyperparameters that
determine the distribution of parameters such as weights and thresholds.
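A minimal sketch of evaluating the regularized objective of Equations (12) and (13) follows. Since the paper does not spell out the form of the weight term $Q_W$, it is assumed here to be the usual sum of squared weights; the function name and toy values are hypothetical.

```python
import numpy as np

def br_objective(expected, actual, weights, beta, phi):
    """Eq. (12)-(13): squared-error term Q plus a weight-penalty term Q_W.

    Q_W is taken here as half the sum of squared weights; the paper does
    not state its exact form, so this is an assumption."""
    Q = 0.5 * np.sum((expected - actual) ** 2)   # Eq. (12)
    Q_W = 0.5 * np.sum(weights ** 2)             # assumed weight penalty
    return beta * Q + phi * Q_W                  # Eq. (13)

# toy usage with two samples and two output nodes
b = np.array([[1.0, 0.0], [0.0, 1.0]])          # expected outputs b_mv
c = np.array([[0.9, 0.2], [0.1, 0.8]])          # actual outputs c_mv
w = np.random.default_rng(0).normal(size=10)    # flattened network weights
print(br_objective(b, c, w, beta=1.0, phi=0.01))
```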
Hyperparameter optimization refers to the process of finding the optimal combination of hyperparameters in machine learning to improve the performance and effectiveness of models. Common hyperparameter optimization methods include grid search, random search, Bayesian optimization, and automated machine learning. Among them, grid search is a brute-force method that trains on all possible hyperparameter combinations; random search randomly samples hyperparameters and can balance computational cost against effect; Bayesian optimization is an optimization method based on Bayes' theorem that gradually adjusts the value ranges of the hyperparameters according to the performance of known combinations in order to find the optimum; automated machine learning can automatically select the optimal model and hyperparameter combination and can transfer learning across multiple tasks, thus improving the generalization ability of the model. A two-stage search in this spirit is sketched below.
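The sketch below illustrates the random-search-then-local-grid-search strategy mentioned above (and used again in Section 3.3) for the two hyperparameters β and φ. The validation_error function is a hypothetical stand-in for training a BR-DBN and measuring validation error.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

def validation_error(beta, phi):
    # Hypothetical stand-in: in practice this would train a BR-DBN with
    # (beta, phi) and return its validation error.
    return (np.log10(beta)) ** 2 + (np.log10(phi) + 2.0) ** 2 + rng.normal(0, 0.01)

# stage 1: random search over wide log-uniform ranges
candidates = [(10 ** rng.uniform(-3, 1), 10 ** rng.uniform(-5, 0)) for _ in range(50)]
best_beta, best_phi = min(candidates, key=lambda c: validation_error(*c))

# stage 2: local grid search around the best random candidate
grid = itertools.product(
    best_beta * np.logspace(-0.5, 0.5, 5),
    best_phi * np.logspace(-0.5, 0.5, 5),
)
best_beta, best_phi = min(grid, key=lambda c: validation_error(*c))
print("selected hyperparameters:", best_beta, best_phi)
```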

3.3. Bayesian Regularized Deep Belief Network Model
(1) Model construction
This paper constructs a BR-DBN model whose bottom is a stack of multiple layers of Bayesian-regularized RBMs (BR-RBMs). The framework is shown in Figure 3. Back-propagation calculates the partial derivatives of each layer in the reverse direction according to the loss function, so as to update the parameters.

Figure 3. BR-DBN model structure.

(2) Model training
The completed state of each layer of BR-RBM is used as the input of the next layer of BR-RBM, and the process is repeated until the pre-training of all BR-RBM layers is completed [13].
Assuming that the BR-DBN network consists of m layers of BR-RBM, and since the tuning phase starts from the last layer of the BR-DBN, let the output vector of the last layer be $f^m(a)$, where a is the initial sample. Then $f^m(a)$ is:

$$f^m(a) = \frac{1}{1 + p^{(y^m + w^m f^{m-1}(a))}} \qquad (14)$$

In the formula, $y^m$ and $w^m$ are the bias value and weight of the mth layer BR-RBM, respectively, and $f^{m-1}$ is the output vector of the (m − 1)th layer. After the forward m-layer BR-RBM learning, the category to which the jth sample belongs can be determined. The probability of $b_j \in (1, 2, \ldots, k)$ is:

$$q\left(b_j = r \mid f^m(a_j), D^m, k^m\right) = \frac{p^{D_r^m f^m(a_j) + k^m}}{\sum_{r=1}^{k} p^{D_r^m f^m(a_j) + k^m}} \qquad (15)$$

In the formula, D is a parameter coefficient, and the category corresponding to the maximum probability is the category judged by the BPNN. The error function of the mth layer is:

$$S(\varepsilon^m) = -\frac{1}{n} \sum_{j=1}^{n} \sum_{r=1}^{k} 1\{b_j = r\} \log \frac{p^{D_r^m f^m(a_j) + k^m}}{\sum_{r=1}^{k} p^{D_r^m f^m(a_j) + k^m}} \qquad (16)$$

In the formula, $\varepsilon^m = \{w^m, y^m, k^m, D^m\}$, and $1\{b_j = r\}$ is the logical indicator function: when $b_j = r$ its value is 1, and when $b_j \neq r$ its value is 0. To find the minimum of the error, gradient descent is used, with the partial derivatives of the parameters as follows:

$$\nabla_{\varepsilon^m} S(\varepsilon^m) = -\frac{1}{n} \sum_{j=1}^{n} \left[ f^m(a_j) \left( 1\{b_j = r\} - \hat{g}(a_j) \right) \right] \qquad (17)$$

If the number of hyperparameters is very large, random search is used to find promising hyperparameter combinations, and a local grid search then selects the optimum. Next, the hyperparameters are updated by:

$$\varepsilon^m = \varepsilon^m - \beta \nabla_{\varepsilon^m} S(\varepsilon^m) \qquad (18)$$

In the formula, β represents the learning rate.
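A minimal sketch of the top-layer classification and fine-tuning step of Equations (15)–(18) follows, using random vectors in place of the BR-RBM outputs $f^m(a_j)$. The shapes, the learning rate, and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, feat_dim, k = 100, 50, 2                  # samples, top-layer features, classes
F = rng.random((n, feat_dim))                # stand-in for f^m(a_j) outputs
labels = rng.integers(0, k, size=n)          # class labels b_j
D = rng.normal(0, 0.01, (feat_dim, k))       # parameter coefficients D_r^m
bias = np.zeros(k)                           # bias term k^m

beta = 0.1                                   # learning rate in Eq. (18)
for step in range(200):
    logits = F @ D + bias
    q = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)   # Eq. (15)
    onehot = np.eye(k)[labels]               # indicator 1{b_j = r}
    S = -np.mean(np.sum(onehot * np.log(q), axis=1))                 # Eq. (16)
    grad_D = -F.T @ (onehot - q) / n         # Eq. (17) w.r.t. D
    grad_b = -(onehot - q).mean(axis=0)
    D -= beta * grad_D                       # Eq. (18)
    bias -= beta * grad_b

pred = q.argmax(axis=1)                      # class with maximum probability
print("final error:", S, "train accuracy:", np.mean(pred == labels))
```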


(3) Model performance test
The BR-DBN model constructed in this paper is used to discriminate and classify several commonly used standard datasets. W, x, and y are initialized with small random values drawn from a Gaussian distribution. The initial learning rate is set to 0.1 and the learning rate variation coefficient to 0.01; the test results are shown in Table 1 [14].

Table 1. Classification test results on different standard datasets.

Data Set Training Set Test Set Average Classification Error Rate %
Iris 100 50 1.97
Seeds 150 60 3.46
Perfume Data 320 150 2.87
Four class 500 200 2.59

As can be seen from Table 1, the BR-DBN model achieves a low average error rate on the different datasets, and the results show that the model is suitable for discriminative classification problems. The error rate is the proportion of incorrectly classified samples among the total number of samples.

4. Machine Text Emotion Classification Experiment Based on Deep Belief Network


4.1. Experimental Design
The manual-annotation-based supervised learning algorithm in this paper uses a language model. Language models can be probabilistic or non-probabilistic. Twitter sentiment analysis is in fact a classification problem. In order to use language models for Twitter sentiment analysis, this article merges Twitter short texts from the same class (positive or negative) into one large document. The learning process of the sentiment language model is similar to the subjectivity classification problem, except that the categories become subjective and objective [15].
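A minimal sketch of this idea, merging same-class tweets into one document and scoring a new tweet under per-class unigram language models, is shown below. The tiny corpus and add-one smoothing are illustrative assumptions rather than the paper's exact setup.

```python
import math
from collections import Counter

def train_lm(docs):
    """Unigram language model with add-one smoothing over one class's merged document."""
    counts = Counter(w for d in docs for w in d.split())
    total = sum(counts.values())
    vocab = len(counts) + 1
    return lambda w: (counts[w] + 1) / (total + vocab)

# toy "large documents" built from same-class tweets
pos_lm = train_lm(["love this phone", "great battery great screen"])
neg_lm = train_lm(["hate the update", "battery life is terrible"])

def classify(tweet):
    pos = sum(math.log(pos_lm(w)) for w in tweet.split())
    neg = sum(math.log(neg_lm(w)) for w in tweet.split())
    return "positive" if pos > neg else "negative"

print(classify("great phone battery"))   # -> positive
```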
Recently, in the field of Twitter short text sentiment analysis, more and more learning algorithms require no manually annotated data [16]. Such algorithms learn classifiers from training data with noisy annotations, which are emoticons or other specific markers. The advantage of these learning algorithms is that they eliminate the heavy manual annotation process: large amounts of training data with noisy annotation information can be obtained automatically through programs, including Twitter's open API or other existing Twitter sentiment analysis websites. Although a number of noise-annotation-based algorithms have been proposed, these methods still have some flaws.
First, none of these methods solves the problem of subjectivity classification very well. Second, they all need to crawl a large number of Twitter short texts and store them locally; considering that Twitter crawling is rate-limited, this is a time-consuming and inefficient approach. Third, because the annotation information is inherently noisy, a classifier trained only on such noisily annotated data has limited accuracy. Fourth, at present, few models can effectively use manual annotation information and noisy annotation information simultaneously, integrating the two kinds of information into one framework.
For subjectivity classification, the two categories are subjective and objective. This paper assumes that tweets containing ":)" or ":(" are subjectively colored by the publisher. Therefore, the search phrase posted to the Twitter search API constructed in this paper is ":)" or ":(", which is used to estimate the subjective sentiment language model [17].
For the language model of objective sentiment, it is more difficult to estimate the occurrence probability of words in the objective category than in the subjective category. To the best of our knowledge, no academics have made valid assumptions about objective tweets. This paper tried the hypothesis that a tweet containing no emoticons is likely to be objective; the experimental results show that this hypothesis is not satisfactory. It also tried using hashtags, such as "#jobs", as tags for objective tweets, but there are problems with this assumption. For one, the number of tweets containing specific hashtags is limited. Second, the sentiment of tweets can bias specific hashtags, such as "#jobs", without ensuring objectivity.
In this paper, we propose a novel hypothesis to label objective tweets: if a tweet contains an objective url link, it is more likely to be objective. Based on our observations, we find that if url links come from image sites (such as twitpic.com) or video sites (such as youtube.com), the tweets are likely to be subjective, but if the url link comes from a news site, there is a good chance the tweet is objective. Therefore, if a url link does not come from a picture website or a video website, this article calls it an objective url link. Based on the above findings and assumptions, the search phrase submitted to the Twitter search API constructed in this paper includes "filter:links", which indicates that the returned tweets contain url links. This paper does this to obtain objective tweets [18].
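A minimal sketch of this objective-url heuristic follows. The helper name is hypothetical, and the domain list is illustrative (the paper names twitpic.com and youtube.com as examples).

```python
from urllib.parse import urlparse

# Domains whose links tend to mark subjective tweets; the paper cites
# twitpic.com and youtube.com as examples, so this list is illustrative.
MEDIA_SITES = {"twitpic.com", "youtube.com"}

def is_objective_url(url: str) -> bool:
    """A url is treated as 'objective' if it is not from a picture or video site."""
    host = urlparse(url).netloc.lower()
    host = host[4:] if host.startswith("www.") else host
    return host not in MEDIA_SITES

print(is_objective_url("http://www.youtube.com/watch?v=abc"))   # False: likely subjective
print(is_objective_url("http://news.example.com/story/123"))    # True: likely objective
```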
Considering that algorithms based on manual annotation and algorithms using only noisy annotation information both have their own disadvantages, the best strategy is to use both kinds of information, employing the two types of training data together for model training. How to seamlessly integrate these two different types of data into a unified framework is the challenge addressed in this chapter. This paper presents a brand new model, the emoticon-smoothed language model (ESLAM). The main contributions of ESLAM are as follows:
After training the language model on manually annotated data, ESLAM smooths it using training data annotated with emoticons. Thus, ESLAM seamlessly integrates manual and noisy annotated data into a unified probabilistic model framework.
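One plausible form of this smoothing is a linear interpolation of the two word distributions, sketched below. The paper does not give the exact formula here, so the interpolation weight, the fallback probability, and the helper name are assumptions for illustration.

```python
def smoothed_prob(word, manual_lm, noisy_lm, lam=0.7):
    """Interpolate the manually annotated language model with the emoticon
    (noisy) language model; lam and the 1e-6 floor are assumed values,
    not taken from the paper."""
    return lam * manual_lm.get(word, 1e-6) + (1 - lam) * noisy_lm.get(word, 1e-6)

manual_lm = {"good": 0.02, "bad": 0.001}        # P(w | positive) from labeled tweets
noisy_lm = {"good": 0.018, "gr8": 0.004}        # P(w | positive) from ":)" tweets
print(smoothed_prob("gr8", manual_lm, noisy_lm))  # slang covered by the noisy data
```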
The large amount of noisy annotation data allows the ESLAM language model to handle misspelled words, slang, tone words, abbreviations, and various other out-of-vocabulary words. This ability is not found in common supervised learning models based on manual annotation alone.
In addition to discriminating between positive and negative polarity, ESLAM can also be used for subjectivity classification; previous noise-annotation-based algorithms cannot.
Most noise-annotation-based learning algorithms need to crawl a large number of Twitter short texts and store them locally; considering that Twitter crawling is limited in access frequency, this is a time-consuming, storage-consuming, and inefficient approach [19]. The ESLAM in this paper proposes an innovative and simple method to directly estimate the probability of each word in the language model by using Twitter's open API, without the need to download any original text from Twitter.
Experiments on real data from Twitter show that ESLAM can effectively integrate artificial and noisy annotation information and works better than other algorithmic models that use only one of these types of information.
To test the role of meta-learning methods based on deep belief networks in text emotion classification, two sets of contrast experiments were used. The first group compares the results of the meta-learning method of the deep belief network with those of the deep belief network acting directly on the text feature vector in text emotion classification, and compares the results of meta-learning with fixed rules in text emotion classification.
The deep belief network acting directly on the text determines its emotional classification. The work process consists of three parts: text pre-processing, text feature selection, and learning in the deep neural network. The process is shown in Figure 4.
Figure 4. Flow chart of emotion classification.


Figure 4 shows the process from the original text to the classification results obtained using the deep belief network. First, the preprocessing steps of sentence splitting, polarity annotation, and word segmentation are carried out. Then, on the basis of word segmentation, the unigrams, bigrams, parts of speech, dependency labels, triplet dependency relationships, and other characteristics of the sentence are extracted to form the text representation vector. In general, for a feature vector space of large dimension, the information gain feature selection algorithm is used (a sketch is given below). Finally, the training set is used to train the deep belief network to obtain the determined network structure. The network structure was tested with the test set, and the final deep belief network was determined by adjusting the corresponding network structure and parameters [20].
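For reference, the information gain criterion mentioned above can be computed as the reduction in label entropy from conditioning on a feature's presence. The sketch below, with a toy corpus, is one standard formulation and is not taken verbatim from the paper.

```python
import math
from collections import Counter

def information_gain(docs, labels, feature):
    """IG of a binary 'feature appears in document' indicator, for feature selection."""
    def entropy(ls):
        n = len(ls)
        return -sum(c / n * math.log2(c / n) for c in Counter(ls).values()) if ls else 0.0
    with_f = [l for d, l in zip(docs, labels) if feature in d]
    without_f = [l for d, l in zip(docs, labels) if feature not in d]
    n = len(labels)
    cond = (len(with_f) / n) * entropy(with_f) + (len(without_f) / n) * entropy(without_f)
    return entropy(labels) - cond

# toy documents as sets of extracted features
docs = [{"good", "battery"}, {"bad", "screen"}, {"good", "screen"}, {"bad", "battery"}]
labels = ["pos", "neg", "pos", "neg"]
# rank candidate features by IG and keep the top-k as network inputs
ranking = sorted({f for d in docs for f in d},
                 key=lambda f: -information_gain(docs, labels, f))
print(ranking)
```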
This paper uses the publicly available Sanders dataset containing 5513 manually annotated tweets. The tweets are all about one of four themes, namely Apple, Google, Microsoft, and Twitter. After removing non-English tweets and junk tweets, 3723 tweets remain. The larger the index value, the more accurate the text emotion classification results. Six sets of different features were selected, including unigrams, bigrams, parts of speech, dependency labels, combined features with emotion score, and triplet dependency features. The dimensions of each feature set were network inputs of the 1000, 2000, 4000, 6000, 8000, 12,000, or 14,000 items with the highest information gain scores. The numbers of network layers and their corresponding hidden layer nodes are shown in Table 2.

Table 2. Deep belief network structure settings.

Number of Hidden Layers Network Structure


2 X-600-300
3 X-600-300-100
5 X-2000-1000-500-200-100

In Table 2, X represents the number of input nodes; for example, the 2-layer network structure X-600-300 indicates a first hidden layer of 600 nodes and a second hidden layer of 300 nodes.
The experimental results record the classification accuracy and reconstruction error of different feature dimensions under different network structures: the accuracy of the networks DBN: X-2000-1000-500-200-100, DBN: X-600-300-100, DBN: X-600-300, and BP: X-600 in different dimensions of the six feature sets. The reconstruction errors are numbered according to the corresponding number of hidden layers.

4.2. Classification and Calculation


Experimental calculations were performed according to the experimental flow in Figure 5. The experimental results record the classification accuracy and reconstruction error of different feature dimensions under different network structures, namely the results of the networks DBN: X-2000-1000-500-200-100, DBN: X-600-300-100, DBN: X-600-300, and BP: X-600 in different dimensions of the six feature sets. The reconstruction errors are numbered according to the corresponding number of hidden layers [21].
The minimum, maximum, and mean values of the source data results were calculated.
Statistical results are presented in Tables 3–6.

Table 3. DBN: X-2000-1000-500-200-100 statistical results.

Exact Value  Reconstruction Error 1  Reconstruction Error 2  Reconstruction Error 3  Reconstruction Error 4  Reconstruction Error 5  Time (s)
minimum        0.8058  9.4408   0.6078   2.4241  2.4355  1.4961  1696.6
maximum value  0.8692  22.5905  10.3566  5.2308  5.2445  4.3040  4970.7
average value  0.8303  16.0944  8.7713   3.9798  4.1796  3.0649  3208.9

Table 4. DBN: X-600-300-100 statistical results.

Exact Value Reconstruction Error 1 Reconstruction Error 2 Reconstruction Error 3 Time (s)
minimum 0.8116 7.3175 2.7168 1.3974 166.3
maximum value 0.8700 26.5429 6.2811 4.4129 1503.6
average value 0.8301 15.9288 4.9615 2.9398 763.4

Table 5. DBN: X-600-300 statistical results.

Exact Value Reconstruction Error 1 Reconstruction Error 2 Time (s)


minimum 0.8000 7.3296 2.7044 142.2
maximum value 0.8700 26.5921 9.5712 1117.5
average value 0.8327 16.1251 5.0907 717.4
Table 6. BP: X-600 statistical results.

Exact Value  Time (s)
minimum        0.8133  45.35
maximum value  0.8641  1117.5
average value  0.8333  322.3

Figure 5. DBN: X-2000-1000-500-200-100 classification accuracy.
Table 3 shows the results of the network DBN: X-2000-1000-500-200-100. Among them, the minimum classification accuracy was 0.8058 and the maximum was 0.8692, obtained when the triplet dependency feature dimension was 14,000; the average classification accuracy of the 5-layer deep belief network was 0.8303. The first-layer restricted Boltzmann machine reconstruction error for a single training set is 9.4408 to 22.5905, with an average of 16.0944, increasing with the number of input nodes. The reconstruction error in the second layer ranged from 6.0798 to 10.3566, with an average value of 8.7713, a large reduction from the first layer. The reconstruction error of the third layer ranges from 2.4241 to 5.2308, with an average of 3.9798, again reduced from the second layer. The fourth-layer reconstruction error ranged from 2.4355 to 5.2445, with an average of 4.1796, showing little change from the third layer. The reconstruction error in the fifth layer ranges from 1.4961 to 4.3040, with an average of 3.0649, decreasing compared with the previous layer. The network running time ranged from 1696.6 to 4970.7 s, increasing with the number of input nodes.
Table 4 shows the results of the network DBN: X-600-300-100. Among them, the minimum classification accuracy is 0.8116 and the maximum is 0.87, obtained when the triplet dependency feature input node count is 12,000; the average classification accuracy is 0.8301. The reconstruction error of the first-layer restricted Boltzmann machine ranges between 7.3175 and 26.5429, increasing with the number of input nodes. The reconstruction error of the second layer ranges from 2.7168 to 6.2811, clearly reduced from the previous layer. The reconstruction error of the third layer ranges from 1.3974 to 4.4129, again reduced compared with the previous layer. The running time ranged from 166.3 s to 1503.6 s.
Table 5 shows the results of the network DBN: X-600-300. Among them, the minimum classification accuracy was 0.8116 and the maximum was 0.87, obtained when the triplet dependency feature input node count was 14,000; the average classification accuracy was 0.8326. The reconstruction error of the first layer on the training set ranges from 7.3296 to 26.5921, increasing with more input nodes. The reconstruction error in the second layer ranges from 2.7044 to 9.5712, lower than that of the previous layer. The running time ranged from 142.2 s to 1409.4 s.
Table 6 shows the results of the network BP: X-600. Among them, the minimum classification accuracy was 0.8133 and the maximum was 0.8641, obtained when the triplet dependency feature input dimension was 14,000; the average classification accuracy was 0.8333. The running time ranged from 45.35 s to 1117.5 s.

4.3. Experimental Results
(1) Effect of different feature sets on the classification accuracy
As can be seen from Figure 5, the average classification accuracies of the different feature sets are 0.81, 0.8381, 0.8195, 0.8152, 0.8220, and 0.8620, and the highest classification accuracies are 0.8142, 0.8433, 0.8308, 0.825, 0.8342, and 0.8692, respectively. The triplet dependency feature representation achieves the highest classification accuracy, followed by the combined unigram and bigram features.
As can be seen from Figure 6, the average classification accuracies for the different feature sets are 0.8202, 0.8379, 0.8214, 0.8175, 0.8215, and 0.8585, and the highest classification accuracies obtained are 0.8291, 0.845, 0.8283, 0.8325, 0.8375, and 0.87.

Figure 6. DBN: X-600-300-100 classification accuracy.
As can be seen from Figure 7, when the triplet dependency feature dimension is 4000 or above, its classification accuracy exceeds that of the other features and feature combinations. Second, the combined unigram and bigram features achieve good classification results in some dimensions. On the basis of bigrams, the classification accuracies after adding part-of-speech, dependency label, and emotion score features differ little, and the average classification accuracy of unigram features alone is the lowest. According to the calculation, the average classification accuracies of the different feature sets are, successively, 0.8106, 0.8341, 0.8272, 0.8243, 0.8272, and 0.8605, and the best classification accuracies on the different feature sets are 0.82, 0.8441, 0.8341, 0.8308, 0.8341, and 0.87. These conclusions are consistent with those for DBN: X-2000-1000-500-200-100 and DBN: X-600-300-100.

Figure 7. DBN: X-600-300 classification accuracy.
As can be seen from Figure 8, when the triplet dependency feature dimension is 4000 or above, the classification accuracy exceeds that of the other features and feature combinations. The average classification accuracy of bigram features is higher than that obtained by adding part-of-speech, dependency label, and emotion score features, and the lowest average classification accuracy is on unigram features. According to the calculation, the average classification accuracies of the different feature sets are 0.8196, 0.8349, 0.8307, 0.8276, 0.8306, and 0.8479, and the highest classification accuracies obtained are 0.8275, 0.8416, 0.8425, 0.8375, 0.8391, and 0.8641.
(2) Analysis and comparison of deep belief network and BP network
The deep belief network is composed of multilayer restricted Boltzmann machines in stacked form; the initial weights of the network are learned by the restricted Boltzmann machine algorithm and then adjusted by the BP algorithm according to the label data. In the plain BP network, however, the initial values are randomly assigned and adjusted only by the BP algorithm, which can lead to non-convergence as the error declines. This section analyzes the classification accuracy and convergence of the deep belief networks and BP networks in the experiments.
The deep belief network structures with different numbers of layers were compared with the classification accuracy of the BP network over the following 32 feature sets: unigram 4000 and 6000; +bigram 4000, 6000, 8000, 10,000, 12,000, and 14,000; +part of speech 4000, 6000, 8000, 10,000, 12,000, and 14,000; +dependency label 4000, 6000, 8000, 10,000, 12,000, and 14,000; +emotion score 4000, 6000, 8000, 10,000, 12,000, and 14,000; and triplet dependency 4000, 6000, 8000, 10,000, 12,000, and 14,000. The comparison of the obtained results is shown in Figure 9.

Figure 8. BP: X-600 classification accuracy.
Figure 9. Emotional analysis line chart (classification accuracy across the 32 feature sets for DBN: X-2000-1000-500-200-100, DBN: X-600-300-100, DBN: X-600-300, and BP: X-600).

The classification accuracies of the three different deep belief network structures in Figure 9 follow almost the same trend at each input node. In the comparison between the deep belief networks and BP, from the 11th to the 26th feature set BP: X-600 is better than the other networks, while from the 28th to the 32nd feature set BP: X-600 has the lowest classification accuracy, indicating that BP learns less under complex features than the deep belief network [22].
The BP network algorithm is essentially a gradient descent method, and the high-dimensional network input together with the nature of text emotion classification makes the optimization objective function very complex; as a result, a "zigzag" phenomenon appears during optimization. When a neuron's output is close to 0 or 1, the error-driven weight change is small and error propagation stalls, leading to non-convergence of the network. In the deep belief network, by contrast, the weights are initialized by the restricted Boltzmann machine, which avoids the non-convergence caused by vanishing errors in the gradient descent algorithm [23].
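The saturation effect described above can be seen numerically: the sigmoid's gradient factor s(z)(1 − s(z)) collapses as the output approaches 0 or 1, which is what stalls BP's weight updates. A toy computation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The backpropagated weight update is scaled by s(z) * (1 - s(z));
# as the neuron output saturates toward 0 or 1, this factor vanishes.
for z in [0.0, 2.0, 5.0, 10.0]:
    s = sigmoid(z)
    print(f"output {s:.5f} -> gradient factor {s * (1 - s):.6f}")
```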

5. Conclusions
The main work and conclusions are as follows:
(1) According to the characteristics of Chinese text, by analyzing the theory, characteristics, and generation methods of dependency syntactic relations, the process of constructing Chinese triplet dependency features was derived. The dependency syntactic relations of many Chinese sentences were analyzed and summarized, and rules were formulated for Chinese sentences that do not affect the structure of the dependency tree. Merge and delete algorithms for redundant and useless nodes are presented. The above method is applied to Chinese hotel review data, book review data, and laptop review data, effectively realizing the conversion of texts into triplet dependency features [24,25].
(2) The accuracy of text emotion classification was compared using combinations of the proposed ternary dependency features and common text representation features, including unigrams, bigrams, parts of speech, dependency labels, and emotion scores [26,27]. To this end, two sets of experiments were designed for comparative analysis: one calculates the emotion score of each comment sentence on three datasets based on semantic methods, and one uses the features extracted from the three datasets with machine learning and the k-nearest neighbor classification algorithm [28]. Meanwhile, the text feature representations of the different feature sets were dimension-reduced, and feature vector spaces of different dimensions were used with traditional machine learning algorithms. Experimental results show that the triplet dependency feature representation method is effective in text emotion classification, with much better results than emotion dictionary scores based on semantic methods; the classification accuracy reaches 84 to 86% on large-scale data with the SVM classification algorithm, an increase of 2 to 3% over existing features. However, it was also found that the triplet dependency feature leads to growth of the feature dimension, and determining an appropriate reduced dimension remains a difficult problem. Due to the limitations of time and technology, this paper has not carried out a detailed analysis of all the problems encountered in the emotional classification of short texts, which will be further discussed in the future.

Author Contributions: Formal analysis, C.F.; Writing—original draft, Z.S. All authors have read and
agreed to the published version of the manuscript.
Funding: This research received no external funding.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Chang, G.; Huo, H. A method of fine-grained short text sentiment analysis based on machine learning. Neural Netw. World 2018,
28, 325–344. [CrossRef]
2. Sun, X.; Peng, X.; Hu, M. Extended Multi-modality Features and Deep Learning Based Microblog Short Text Sentiment Analysis.
Dianzi Yu Xinxi Xuebao/J. Electron. Inf. Technol. 2017, 39, 2048–2055. [CrossRef]
3. Joshi, S.; Deshpande, D. Twitter Sentiment Analysis System. Int. J. Comput. Appl. 2018, 180, 35–39. [CrossRef]
4. Piryani, R.; Piryani, B.; Singh, V.K.; Pinto, D. Sentiment analysis in Nepali: Exploring machine learning and lexicon-based
approaches. J. Intell. Fuzzy Syst. 2020, 39, 2201–2212. [CrossRef]
5. Attieh, J.; Tekli, J. Supervised term-category feature weighting for improved text classification. Knowl. Based Syst. 2023, 261, 110215.
[CrossRef]
6. El Hindi, K.M.; Aljulaidan, R.R.; AlSalman, H. Lazy fine-tuning algorithms for naïve Bayesian text classification. Appl. Soft
Comput. 2020, 96, 106652. [CrossRef]
7. Jiang, W.; Zhou, K.; Xiong, C.; Guodong, D.; Chubin, O.; Zhang, J. KSCB: A novel unsupervised method for text sentiment
analysis. Appl. Intell. 2023, 53, 301–311. [CrossRef]

8. Divate, M.S. Sentiment analysis of Marathi news using LSTM. Int. J. Inf. Technol. 2021, 13, 2069–2074. [CrossRef]
9. Bhagat, C.; Mane, D. Survey On Text Categorization Using Sentiment Analysis. Int. J. Sci. Technol. Res. 2019, 8, 1189–1195.
10. Albayati, A.Q.; Al_Araji, A. Arabic Sentiment Analysis (ASA) Using Deep Learning Approach. Univ. Baghdad Eng. J. 2020, 26, 85–93.
[CrossRef]
11. Ali, F.; Ali, A.; Imran, M.; Naqvi, R.A.; Siddiqi, M.H.; Kwak, K.-S. Traffic accident detection and condition analysis based on social
networking data. Accid. Anal. Prev. 2021, 151, 105973. [CrossRef] [PubMed]
12. Rusnachenko, N.; Loukachevitch, N. Attention-Based Neural Networks for Sentiment Attitude Extraction using Distant Su-
pervision. In Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics, Biarritz, France,
30 June–3 July 2020; pp. 159–168. [CrossRef]
13. Gallego, F.O.; Corchuelo, R. Torii: An aspect-based sentiment analysis system that can mine conditions. Software 2020, 50, 47–64.
[CrossRef]
14. Chen, J.; Yan, S.; Wong, K.C. Verbal aggression detection on Twitter comments: Convolutional neural network for short-text
sentiment analysis. Neural Comput. Appl. 2018, 3, 10809–10818. [CrossRef]
15. Rehman, A.U.; Malik, A.K.; Raza, B. A Hybrid CNN-LSTM Model for Improving Accuracy of Movie Reviews Sentiment Analysis.
Multimed. Tools Appl. 2019, 78, 26597–26613. [CrossRef]
16. Karthik, E.; Sethukarasi, T. Sarcastic user behavior classification and prediction from social media data using firebug swarm
optimization-based long short-term memory. J. Supercomput. 2021, 78, 5333–5357. [CrossRef]
17. Wang, X.; Zhang, H.; Xu, Z. Public Sentiments Analysis Based on Fuzzy Logic for Text. Int. J. Softw. Eng. Knowl. Eng. 2016, 26,
1341–1360. [CrossRef]
18. Ashok, K.J.; Trueman, T.E.; Cambria, E. A Convolutional Stacked Bidirectional LSTM with a Multiplicative Attention Mechanism
for Aspect Category and Sentiment Detection. Cogn. Comput. 2021, 13, 1423–1432.
19. Roseline, V.; Chellam, G.H. Sentiment Classification Using PS-POS Embedding with Bilstm-CRF and Attention. Int. J. Future
Gener. Commun. Netw. 2020, 13, 3520–3526.
20. Han, H.; Bai, X.; Ping, L. Augmented sentiment representation by learning context information. Neural Comput. Appl. 2019, 31,
8475–8482. [CrossRef]
21. Sengan, S.P.; Sagar, V.; Khalaf, O.I.; Dhanapal, R. The optimization of reconfigured real-time datasets for improving classification
performance of machine learning algorithms. Math. Eng. Sci. Aerosp. 2021, 12, 43–54.
22. Roseline, V.; Herenchellam, D. PS-POS Embedding Target Extraction Using CRF and BiLSTM. Int. J. Adv. Sci. Technol. 2020, 29,
10984–10995.
23. Bashar, M.A.; Nayak, R.; Luong, K. Progressive domain adaptation for detecting hate speech on social media with small training
set and its application to COVID-19 concerned posts. Soc. Netw. Anal. Min. 2021, 11, 69. [CrossRef] [PubMed]
24. Huan, J.L.; Sekh, A.A.; Quek, C.; Prasad, D.K. Emotionally charged text classification with deep learning and sentiment semantic.
Neural Comput. Appl. 2021, 34, 2341–2351. [CrossRef]
25. Yan, Z.; Cao, W.; Ji, J. Social behavior prediction with graph U-Net+. Discov. Internet Things 2021, 1, 18. [CrossRef]
26. Brooke, J.; Hammond, A.; Hirst, G. Using models of lexical style to quantify free indirect discourse in modernist fiction. Lit.
Linguist. Comput. 2017, 32, 234–250. [CrossRef]
27. Kumar, M.; Aggarwal, J.; Rani, A.; Stephan, T.; Shankar, A.; Mirjalili, S. Secure video communication using firefly optimization
and visual cryptography. Artif. Intell. Rev. 2021, 55, 2997–3017. [CrossRef]
28. Lu, H.; Wang, S.S.; Zhou, Q.W.; Zhao, Y.N.; Zhao, B.Y. Damage and control of major poisonous plants in the western grasslands of
China? a review. Rangel. J. 2012, 34, 329. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
