Abstract
Document sentiment classification is the task of classifying a document by the polarity of its opinion: positive (favorable) or negative (unfavorable). We propose using syntactic relations between words in sentences for document sentiment classification. Specifically, we use text mining techniques to extract frequent word sub-sequences and dependency sub-trees from the sentences of a document dataset, and use them as features for support vector machines. In experiments on movie review datasets, our classifiers obtained the best results published to date on these data.
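As an illustration of the word sub-sequence features described above, the following is a minimal sketch and not the implementation used in the paper. It assumes that a sub-sequence keeps word order but may skip words, that support is counted per sentence, and that each pattern becomes one binary feature. Brute-force enumeration stands in for an efficient sequential pattern miner, and the function names, the toy corpus, and the `min_support` and `max_len` values are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def subsequences(tokens, max_len):
    """Return the set of word sub-sequences (order kept, gaps allowed) up to max_len words."""
    found = set()
    for n in range(1, max_len + 1):
        for idx in combinations(range(len(tokens)), n):
            found.add(tuple(tokens[i] for i in idx))
    return found


def mine_frequent(sentences, min_support, max_len=2):
    """Keep sub-sequences that occur in at least min_support sentences."""
    support = Counter()
    for tokens in sentences:
        # Each sentence contributes at most 1 to the support of a pattern.
        support.update(subsequences(tokens, max_len))
    return sorted(p for p, c in support.items() if c >= min_support)


def featurize(document, patterns, max_len=2):
    """Binary vector: 1 if any sentence of the document contains the pattern."""
    present = set()
    for tokens in document:
        present |= subsequences(tokens, max_len)
    return [1 if p in present else 0 for p in patterns]


# Toy corpus (assumed data): each document is a list of tokenized sentences.
docs = [
    [["the", "film", "is", "good"]],
    [["the", "film", "is", "not", "good"]],
    [["a", "good", "film"]],
]
all_sentences = [s for d in docs for s in d]
patterns = mine_frequent(all_sentences, min_support=2)
vectors = [featurize(d, patterns) for d in docs]
```

Each row of `vectors` could then be passed, together with a polarity label, to any support vector machine implementation. Note that the first two toy documents receive identical vectors at this support threshold even though one is negated, which shows how much the choice of `min_support` and `max_len` shapes the resulting feature set.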
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Matsumoto, S., Takamura, H., Okumura, M. (2005). Sentiment Classification Using Word Sub-sequences and Dependency Sub-trees. In: Ho, T.B., Cheung, D., Liu, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2005. Lecture Notes in Computer Science, vol 3518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11430919_37
DOI: https://doi.org/10.1007/11430919_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26076-9
Online ISBN: 978-3-540-31935-1
eBook Packages: Computer Science, Computer Science (R0)