Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Front. Comput. Sci. 2014, Vol. 8, Issue 2: 279-288. https://doi.org/10.1007/s11704-014-3043-8
RESEARCH ARTICLE
Entity attribute discovery and clustering from online reviews
Qingliang MIAO1,*, Qiudan LI2, Daniel ZENG2, Yao MENG1, Shu ZHANG1, Hao YU3
1. Fujitsu Research & Development Center, Beijing 100025, China
2. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
3. Ricoh Software Research Center (Beijing) Co., Ltd., Beijing 100082, China
Abstract

The rapidly growing volume of user-generated content (UGC) is a rich source of information for managing the reputation of entities, products, and services. Taking online product reviews as a concrete example, customers usually give opinions on multiple attributes of a product, so the challenge is to automatically extract and cluster the attributes they mention. In this paper, we investigate efficient attribute extraction models using a semi-supervised approach. Specifically, we formulate attribute extraction as a sequence labeling task and design a bootstrapping scheme that trains the extraction models from a small quantity of labeled reviews and a larger number of unlabeled reviews. In addition, we propose a clustering by committee (CBC) approach to cluster attributes according to their semantic similarity. Experimental results on real-world datasets show that the proposed approach is effective.
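
The bootstrapping idea in the abstract can be illustrated with a minimal self-training sketch. This is not the paper's model: a simple frequency-based tagger stands in for the extraction model, and the BIO tag set (`B-ATTR`, `I-ATTR`, `O`), the confidence threshold, the toy reviews, and all function names are assumptions made for illustration. The loop trains on the labeled seed, tags the unlabeled reviews, and moves confidently tagged ones into the training set before retraining.

```python
from collections import Counter, defaultdict

def features(words, i):
    """Word identity plus a (previous word, next word) context feature."""
    prev_w = words[i - 1].lower() if i > 0 else "<s>"
    next_w = words[i + 1].lower() if i + 1 < len(words) else "</s>"
    return [f"w={words[i].lower()}", f"ctx={prev_w}_{next_w}"]

def train(tagged_sentences):
    """Count how often each feature co-occurs with each tag."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        words = [w for w, _ in sentence]
        for i, (_, tag) in enumerate(sentence):
            for feat in features(words, i):
                counts[feat][tag] += 1
    return counts

def predict(model, words):
    """Tag each word by feature votes; sentence confidence is the weakest token."""
    tagged, confidence = [], 1.0
    for i, word in enumerate(words):
        votes = Counter()
        for feat in features(words, i):
            votes.update(model.get(feat, {}))
        if votes:
            tag, count = votes.most_common(1)[0]
            confidence = min(confidence, count / sum(votes.values()))
        else:
            tag, confidence = "O", 0.0  # no evidence at all for this token
        tagged.append((word, tag))
    return tagged, confidence

def bootstrap(labeled, unlabeled, threshold=0.9, max_rounds=5):
    """Self-training: promote confidently tagged sentences, then retrain."""
    training, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = train(training)
        scored = [(words, *predict(model, words)) for words in pool]
        accepted = [tagged for _, tagged, conf in scored if conf >= threshold]
        if not accepted:
            break
        training.extend(accepted)
        pool = [words for words, _, conf in scored if conf < threshold]
    return train(training)

def extract_attributes(model, words):
    """Join B-ATTR/I-ATTR runs into attribute phrases."""
    tagged, _ = predict(model, words)
    attributes, current = [], []
    for word, tag in tagged:
        if tag == "I-ATTR" and current:
            current.append(word)
            continue
        if current:
            attributes.append(" ".join(current))
        current = [word] if tag == "B-ATTR" else []
    if current:
        attributes.append(" ".join(current))
    return attributes

# Toy data (hypothetical): two labeled reviews and two unlabeled ones.
labeled = [
    [("the", "O"), ("battery", "B-ATTR"), ("is", "O"), ("great", "O")],
    [("the", "O"), ("lens", "B-ATTR"), ("is", "O"), ("sharp", "O")],
]
unlabeled = [["the", "screen", "is", "great"], ["the", "zoom", "is", "sharp"]]

seed_model = train(labeled)
final_model = bootstrap(labeled, unlabeled)
print(extract_attributes(seed_model, ["screen", "looks", "dim"]))
print(extract_attributes(final_model, ["screen", "looks", "dim"]))
```

In this toy run the seed model has never seen "screen" and finds nothing in the test sentence. After bootstrapping, the context "the ... is" has let the loop label "screen" in an unlabeled review, so the retrained model recognizes it in a new context.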

Keywords: opinion mining, attribute extraction, attribute clustering
Corresponding Author(s): Qingliang MIAO   
Issue Date: 24 June 2014
 Cite this article:   
Qingliang MIAO,Qiudan LI,Daniel ZENG, et al. Entity attribute discovery and clustering from online reviews[J]. Front. Comput. Sci., 2014, 8(2): 279-288.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-014-3043-8
https://academic.hep.com.cn/fcs/EN/Y2014/V8/I2/279