A Deep Learning Approach For Sentiment Analysis in Spanish Tweets
1 Introduction
Semantic analysis has opened up several fields of research in NLP. In turn, these
new fields have helped the development of comprehension systems, which include,
as explained in [1], cross- and multi-domain sentiment analysis, aspect-based
sentiment analysis, fake news identification, classification of semantic relations,
and question answering for non-factoid questions, among others. Liu [9] defines
sentiment analysis, or opinion mining, as the computational study of people's
opinions, sentiments, emotions, appraisals, and attitudes towards entities such as
products, services, organizations, individuals, events, topics, and their attributes.
Among sentiment analysis tasks, the analysis of tweets at the document level stands
out as a long-addressed problem, due to the large amount of information about multiple
topics generated in a short time and its easy access (unlabeled data). Specific tasks
linked to this problem have raised the interest of the NLP community for several years
[16]. Automatic sentiment detection in tweets is a powerful and useful tool
for social network analysis, advertising analysis, and many other applications.
In this paper, we propose a CNN-based model that automatically processes
short texts from Task 1 of TASS 2017 [1], which consists of tweets in
Spanish, and detects whether a tweet expresses a polarity (positive, negative, neutral,
or none) about a specific topic. The rest of the paper is organized as follows. Section 2
covers related work in the area. Section 3 presents our proposal (preprocessing
method and architecture design) in detail. Section 4 includes the final results and
their analysis, and Section 5 presents our conclusions and future work.
2 Related Studies
Pang and Lee [13] and Liu [8] provided an introduction to the Sentiment
Analysis area. Zhang et al. [20] published a very complete state of the
art in sentiment analysis using deep learning approaches. They explained that
sentiment analysis can be represented as a classification problem (classifying
a text document into a set of predefined categories) and therefore addressed
with different methods; they also mention that black-box models such as
neural networks and deep neural networks have become increasingly popular.
Regarding short-text analysis, there are many papers that show relevant
results in real-life applications using tweets in different languages.
Rodrigues Barbosa et al. [15] evaluated Twitter hashtags in sentiment analysis for the
2010 Brazilian presidential elections. To do so, they analyzed 10,173,382 tweets
annotated with four labels (Positive, Negative, Ambiguous, and Neutral) for hashtags
about candidates or events around election day. They concluded that
trends on Twitter over time were in accordance with the general feeling of the
population. They also verified that information spreads on Twitter following a
social graph model and that people make their decisions, consciously or not, depending
on the feelings and choices of their contacts on Twitter.
Go et al. [3] introduced a method to classify Twitter messages. Positive
and negative tweets are separated using emoticon labels: ":) / :-)" or ":( /
:-(". They collected 80,000 positive and 80,000 negative tweets as a training
set. In the preprocessing step, emoticons were removed from the training data because
of their negative impact on the precision of the SVM and Maximum Entropy (ME)
classifiers, although their removal has an insignificant effect on the Naive Bayes classifier.
Then, they represented sentences by unigrams (word by word), bigrams (two words),
unigrams plus bigrams, and part-of-speech features extracted by well-known descriptors.
Their best accuracy with SVM was 82.9% using unigrams, while ME and Naive Bayes
reached 82.7% using unigrams plus bigrams, these being the best results for each method.
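To make the feature setup concrete, the following is a minimal sketch of such an n-gram pipeline using scikit-learn; the tiny training lists and the choice of a Naive Bayes classifier are illustrative assumptions, not the exact configuration of [3]:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical emoticon-labeled examples (distant supervision as in [3]),
# already stripped of the emoticons themselves; 1 = positive, 0 = negative.
train_texts = ["i love this phone", "worst service ever", "great game tonight"]
train_labels = [1, 0, 1]

# Unigram plus bigram features, as in the "unigram-bigram" configuration described above.
model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2), lowercase=True),
    MultinomialNB(),
)
model.fit(train_texts, train_labels)
print(model.predict(["love the new update"]))  # likely [1] with this toy data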
Kim [7] and Wang et al. [19] presented, respectively, their attempts to use
convolutional (CNN) and recurrent (RNN) neural networks
for polarity classification in short texts, achieving quite inspiring results that
define standard architectures for this problem. The CNN architecture converges
quickly and presents, in most cases, remarkable performance
on sentence classification. On the other hand, an RNN usually converges slowly but
interprets sequences of words better, which is useful for text because it can
capture the context of a sentence. Memory loss, i.e. the vanishing gradient,
is a problem for RNNs, so a residual network or a recurrent Long Short-Term
Memory (LSTM) network [19] is used to capture the particular functions of
words while avoiding the memory-loss problem.
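As an illustration of the recurrent alternative (not the exact architecture of [19]), a compact LSTM classifier over word indices can be sketched in PyTorch; the vocabulary size, dimensions, and the four polarity classes are placeholder assumptions:

import torch
import torch.nn as nn

class LSTMSentimentClassifier(nn.Module):
    """Sketch: embeddings -> LSTM -> last hidden state -> polarity logits."""
    def __init__(self, vocab_size=10000, embed_dim=100, hidden_dim=128, num_classes=4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):               # token_ids: (batch, seq_len)
        embedded = self.embedding(token_ids)    # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.lstm(embedded)    # hidden: (1, batch, hidden_dim)
        return self.fc(hidden[-1])              # logits: (batch, num_classes)

# Smoke test with a dummy batch of 2 tweets, 12 token ids each.
logits = LSTMSentimentClassifier()(torch.randint(0, 10000, (2, 12)))
print(logits.shape)  # torch.Size([2, 4])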
In sentiment analysis of tweets at the document level, Hassan and Mahmood [5] proposed
merging CNN and LSTM-RNN models for short texts, since the LSTM avoids the
vanishing gradient problem (depending on the text size) while the CNN works
better for very short texts, which is usually the size of tweets. On the IMDB opinion
database, they achieved 88.3% accuracy in binary classification using a single
word-embedding channel. Severyn and Moschitti [17] explored CNN solutions on a
Twitter database, obtaining 84.79% accuracy at the phrase level and 64.59% at the
message level.
As can be seen, most of these works are based on English datasets. For Spanish there
are few works, and they define the state of the art on the TASS datasets. Navas-Loro
and Rodríguez-Doncel [12] and Martínez-Cámara et al. [10] summarize most of the works
and methods developed during the TASS 2017 competition. In TASS 2017, the best results
were obtained by neural network models. Hurtado et al. [6] obtained 60.70% accuracy on
the InterTASS corpus and 72.50% on the General Corpus using a fully connected neural
network with ReLU activations, a dropout layer (p = 0.3), and polarity-specific
embeddings.
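As a rough sketch of that kind of feed-forward classifier (ReLU activations and dropout with p = 0.3), written here in PyTorch; the layer sizes and the 300-dimensional tweet-embedding input are assumptions, not the actual ELiRF-UPV configuration:

import torch
import torch.nn as nn

# Only the ReLU + Dropout(p=0.3) pattern is taken from [6]; sizes below are hypothetical.
ffn = nn.Sequential(
    nn.Linear(300, 128),   # pre-computed tweet embedding -> hidden layer
    nn.ReLU(),
    nn.Dropout(p=0.3),
    nn.Linear(128, 4),     # four polarity classes (P, N, NEU, NONE)
)

logits = ffn(torch.randn(1, 300))  # one hypothetical 300-d tweet embedding
print(logits.shape)                # torch.Size([1, 4])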
4 Methodology
In this section, we present the preprocessing methodology we applied and the
architecture we designed.
4.1 Preprocessing
Based on Severyn and Moschitti [17] and Navas-Loro and Rodríguez-Doncel [12],
we created a tokenizer that handles trivial terms and repeated words following
these steps (a minimal sketch is given after the list):
– Delete URLs, extra blank spaces, special characters and repeated words.
– Change words to lowercase.
– Replace laugh expressions (like ’jajaja’, ’haha’, ’LOL’, etc.) by ’ja’.
– Replace colloquialisms by formal expressions (e.g. ’por’ instead of ’x’).
– Create a stop words dictionary to delete trivial words.
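A minimal sketch of this cleaning step in Python follows; the stop-word set and colloquialism map below are small hypothetical stand-ins for the actual dictionaries:

import re

# Hypothetical, abbreviated dictionaries; the real stop-word list and colloquialism map are larger.
STOPWORDS = {"de", "la", "que", "el", "con"}
COLLOQUIALISMS = {"x": "por", "q": "que", "tb": "también"}

def preprocess_tweet(text: str) -> str:
    text = text.lower()                                      # change words to lowercase
    text = re.sub(r"https?://\S+", " ", text)                # delete URLs
    text = re.sub(r"[^a-záéíóúüñ\s]", " ", text)             # delete special characters
    text = re.sub(r"\b(?:ja|ha|he|hi|lol)+\b", "ja", text)   # replace laugh expressions by 'ja'
    tokens = [COLLOQUIALISMS.get(t, t) for t in text.split()]  # formalize colloquialisms
    tokens = [t for t in tokens if t not in STOPWORDS]          # drop trivial (stop) words
    tokens = [t for i, t in enumerate(tokens) if i == 0 or t != tokens[i - 1]]  # drop repeated words
    return " ".join(tokens)

print(preprocess_tweet("Me gusta jugar fútbol con mis amigos!! http://t.co/abc"))
# expected output: "me gusta jugar fútbol mis amigos"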
Fig. 1. Assuming that the dictionary size (D) is 8, the encoding size (E) is 4, and the
four convolution layers are <1, 2, 3, 4> x E, the preprocessed tweet 'me gusta jugar
fútbol mis amigos' is classified following the pipeline.
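A minimal sketch of this kind of multi-width convolution block in PyTorch, using the toy sizes from Fig. 1 (D = 8, E = 4, filter heights 1 to 4); the number of filters per width and the pooling and classification head are assumptions rather than the exact published configuration:

import torch
import torch.nn as nn
import torch.nn.functional as F

class TweetCNN(nn.Module):
    """Sketch: embed tokens, run parallel <1,2,3,4> x E convolutions, max-pool, classify."""
    def __init__(self, dict_size=8, embed_dim=4, n_filters=16, widths=(1, 2, 3, 4), num_classes=4):
        super().__init__()
        self.embedding = nn.Embedding(dict_size, embed_dim)
        # Each kernel spans `w` consecutive words and the full embedding width E.
        self.convs = nn.ModuleList(
            [nn.Conv2d(1, n_filters, kernel_size=(w, embed_dim)) for w in widths]
        )
        self.fc = nn.Linear(n_filters * len(widths), num_classes)

    def forward(self, token_ids):                   # (batch, seq_len)
        x = self.embedding(token_ids).unsqueeze(1)  # (batch, 1, seq_len, E)
        pooled = [F.relu(conv(x)).squeeze(3).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))    # (batch, num_classes)

# Dummy run: a batch of 2 padded tweets, 6 tokens each, token ids in [0, 8).
logits = TweetCNN()(torch.randint(0, 8, (2, 6)))
print(logits.shape)  # torch.Size([2, 4])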
In the second experiment, we selected the best tuples based on Table 2 and then
tuned the parameters of each tuple to obtain optimal results. We ran each tuple
ten times in order to obtain the best, worst, and average accuracies. The results
of the second experiment are shown in Table 3.
Table 4 presents the results for the InterTASS and General corpora. In the contest,
training and testing data were available in separate packages, so the results
presented in Table 4 refer to the testing accuracy; we compare our results
(CNN-EMOTIC) against the state of the art (*).
Table 4. Comparative results in TASS 2017 for sentiment analysis, from [1]; (*) marks
the best results in the contest and our results are shown in bold.

Proposed system        InterTASS corpus   General corpus
CNN-EMOTIC             0.615              0.741
ELiRF-UPV-run1         0.607              0.666
RETUYT-svm cnn         0.596              0.674
ELiRF-UPV-run3         0.597              0.725*
jacerong-run-2         0.602              0.701
jacerong-run-1         0.608*             0.706
INGEOTECevodag-001     0.507              0.514
The results presented in this paper show that the proposed approach is effective
for sentiment analysis of tweets at the document level in Spanish. Based on our
experiments, the CNN-based model reaches testing accuracies of 61.82% and 73.22%
on the InterTASS and General corpora, respectively. During the architecture design,
we started from a well-known CNN-based model from the state of the art but used
different convolutional tuples. After many runs we concluded that the <1, 2> tuple
is the best combination; this can be explained if we consider the unigram (<1>)
representation as the weight of each word and the bigram (<2>) representation as
the weight of the context in short texts. The 3-channel input allows a more accurate
word-vector representation of the tweet. This improvement was also made possible by
importing the statistical meaning of emoticons [18] into our preprocessing step.
During testing, those factors produced a slight but important improvement (from
59.3% and 70.7% to 61.58% and 74.14% on the InterTASS and General corpora,
respectively). To improve our current results, we plan to integrate semantic windows
and an entropy-based model for longer texts, considering splitting words and emoticons
according to context (not just for sentiment analysis but also for aspect-based
sentiment analysis).
References
2. Bojanowski, P., Grave, E., Joulin, A., Mikolov, T.: Enriching word vectors with
subword information. arXiv preprint arXiv:1607.04606 (2016)
3. Go, A., Bhayani, R., Huang, L.: Twitter sentiment classification using distant
supervision. CS224N Project Report, Stanford 1(12) (2009)
4. Goodfellow, I., Bengio, Y., Courville, A.: Deep learning, vol. 1. MIT Press,
Cambridge (2016)
5. Hassan, A., Mahmood, A.: Deep learning approach for sentiment analysis of short
texts. In: Control, Automation and Robotics (ICCAR), 2017 3rd International
Conference on. pp. 705–710. IEEE (2017)
6. Hurtado Oliver, L., Pla, F., González Barba, J.: ELiRF-UPV en TASS 2017: Análisis de
sentimientos en Twitter basado en aprendizaje profundo. p. 6 (09 2017)
7. Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint
arXiv:1408.5882 (2014)
8. Liu, B.: Sentiment analysis and opinion mining. Synthesis lectures on human lan-
guage technologies 5(1), 1–167 (2012)
9. Liu, B.: Sentiment analysis: Mining opinions, sentiments, and emotions. Cambridge
University Press (2015)
10. Martínez-Cámara, E., Díaz-Galiano, M., García-Cumbreras, M., García-Vega, M.,
Villena-Román, J.: Overview of TASS 2017. In: Proceedings of TASS 2017: Workshop
on Semantic Analysis at SEPLN (TASS 2017). vol. 1896 (2017)
11. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word repre-
sentations in vector space. arXiv preprint arXiv:1301.3781 (2013)
12. Navas-Loro, M., Rodríguez-Doncel, V.: OEG at TASS 2017: Spanish sentiment anal-
ysis of tweets at document level
13. Pang, B., Lee, L., et al.: Opinion mining and sentiment analysis. Foundations and
Trends in Information Retrieval 2(1–2), 1–135 (2008)
14. Pennington, J., Socher, R., Manning, C.: Glove: Global vectors for word repre-
sentation. In: Proceedings of the 2014 conference on empirical methods in natural
language processing (EMNLP). pp. 1532–1543 (2014)
15. Rodrigues Barbosa, G.A., Silva, I.S., Zaki, M., Meira Jr, W., Prates, R.O., Veloso,
A.: Characterizing the effectiveness of twitter hashtags to detect and track on-
line population sentiment. In: CHI’12 Extended Abstracts on Human Factors in
Computing Systems. pp. 2621–2626. ACM (2012)
16. Rosá, A., Chiruzzo, L., Etcheverry, M., Castro, S.: RETUYT en TASS 2017: Análisis
de sentimientos de tweets en español utilizando SVM y CNN. Proceedings of TASS
(2017)
17. Severyn, A., Moschitti, A.: Twitter sentiment analysis with deep convolutional
neural networks. In: Proceedings of the 38th International ACM SIGIR Conference
on Research and Development in Information Retrieval. pp. 959–962. ACM (2015)
18. Wang, H., Castanon, J.A.: Sentiment expression via emoticons on social media.
arXiv preprint arXiv:1511.02556 (2015)
19. Wang, X., Liu, Y., Sun, C., Wang, B., Wang, X.: Predicting polarities of tweets
by composing word embeddings with long short-term memory. In: Proceedings of
the 53rd Annual Meeting of the Association for Computational Linguistics and
the 7th International Joint Conference on Natural Language Processing (Volume
1: Long Papers). vol. 1, pp. 1343–1353 (2015)
20. Zhang, L., Wang, S., Liu, B.: Deep learning for sentiment analysis: A survey. Wiley
Interdisciplinary Reviews: Data Mining and Knowledge Discovery p. e1253 (2018)