Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)


Front. Comput. Sci.    2015, Vol. 9 Issue (2) : 171-184    https://doi.org/10.1007/s11704-014-4085-7
RESEARCH ARTICLE
Structural information aware deep semi-supervised recurrent neural network for sentiment analysis
Wenge RONG1,2,Baolin PENG1,Yuanxin OUYANG1,2,*(),Chao LI1,2,Zhang XIONG1,2
1. School of Computer Science and Engineering, Beihang University, Beijing 100191, China
2. Research Institute of Beihang University in Shenzhen, Shenzhen 518057, China
Abstract

With the development of the Internet, people are increasingly likely to post and propagate opinions online, and sentiment analysis has become an important challenge in understanding the polarity beneath these comments. Many approaches from the natural language processing perspective have been employed for this task, the most widely used being bag-of-words and semantic-oriented analysis methods. In this research, we further investigate the structural information among words, phrases, and sentences within comments to conduct sentiment analysis. The idea is inspired by the fact that structural information plays an important role in identifying the overall polarity of a statement. As a result, a novel sentiment analysis model based on a recurrent neural network is proposed, which takes a partial document as input and then predicts the sentiment label distribution of the subsequent parts rather than the next word. The proposed method learns word representations and the sentiment distribution simultaneously. Experimental studies have been conducted on commonly used datasets, and the results show its promising potential.
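The abstract's core idea can be illustrated with a minimal sketch: an Elman-style recurrent network reads a partial document (a prefix of word indices) and, instead of predicting the next word, emits a distribution over sentiment labels, with the word embeddings trained jointly with the classifier. All names, dimensions, and initialization choices below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical sketch of the model described in the abstract. Toy sizes;
# in practice the embeddings and weights would be learned jointly by
# backpropagation through time on labeled and unlabeled comments.
rng = np.random.default_rng(0)
vocab, embed, hidden, labels = 50, 8, 16, 2

E = rng.normal(0, 0.1, (vocab, embed))     # word embeddings (learned jointly)
W_xh = rng.normal(0, 0.1, (embed, hidden)) # input-to-hidden weights
W_hh = rng.normal(0, 0.1, (hidden, hidden))# recurrent weights carry structure
W_hy = rng.normal(0, 0.1, (hidden, labels))# hidden-to-label weights

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def sentiment_distribution(word_ids):
    """Run the recurrence over a partial document and return P(label)."""
    h = np.zeros(hidden)
    for w in word_ids:
        # Elman recurrence: the hidden state accumulates sequential
        # (structural) information from the words seen so far.
        h = np.tanh(E[w] @ W_xh + h @ W_hh)
    return softmax(h @ W_hy)  # sentiment distribution, not next-word probs

p = sentiment_distribution([3, 17, 42])
```

The design choice highlighted in the abstract is the output layer: a conventional RNN language model would map `h` to a distribution over the vocabulary, whereas here `W_hy` maps it to the sentiment label space, so every prefix of the document yields a sentiment prediction.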

Keywords: sentiment analysis; recurrent neural network; deep learning; machine learning
Corresponding Author(s): Yuanxin OUYANG   
Issue Date: 07 April 2015
 Cite this article:   
Wenge RONG,Baolin PENG,Yuanxin OUYANG, et al. Structural information aware deep semi-supervised recurrent neural network for sentiment analysis[J]. Front. Comput. Sci., 2015, 9(2): 171-184.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-014-4085-7
https://academic.hep.com.cn/fcs/EN/Y2015/V9/I2/171