Identification of Implicit Features using Persian Language Rules and Sentiments Clustering

Document Type: Original Article

Authors

1 Faculty of Engineering, Yazd University, Yazd, Iran

2 Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran

Abstract

Typically, when someone wants to buy a product online, they read the comments and notes that others have written about it. These reviews strongly influence the decision to buy or not to buy the product. Sentiment analysis, or opinion mining, is an active research topic in computer science. Its main purpose is to extract the opinions of individuals about the characteristics of an entity such as a product. In this research, an unsupervised opinion mining method is proposed for Persian product reviews, based on implicit feature extraction, which is a critical step in sentiment analysis. Most previous studies rely on statistical information alone to create a co-occurrence matrix and determine the implicit features. In this paper, we combine syntactic rules and sentiment clustering with statistical information to construct an efficient co-occurrence matrix between features and sentiment words. Evaluation results on a real-world dataset, extracted from the Digikala website, indicate that the proposed method achieves higher recall and precision than previous methods.
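The purely statistical baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the proposed method: it uses only sentence-level co-occurrence counts and omits the syntactic rules and sentiment clustering. The function names, the English toy reviews, and the feature and sentiment word lists are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_cooccurrence(sentences, features, sentiment_words):
    """Count how often each sentiment word shares a sentence with each feature."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        present_features = [t for t in tokens if t in features]
        present_sentiments = [t for t in tokens if t in sentiment_words]
        for sentiment in present_sentiments:
            for feature in present_features:
                counts[sentiment][feature] += 1
    return counts


def infer_implicit_feature(tokens, counts, features):
    """Return the feature most associated with the sentence's sentiment words.

    Returns None when the sentence already names a feature explicitly or when
    none of its words has been seen with any feature.
    """
    if any(t in features for t in tokens):
        return None  # the feature is explicit; nothing to infer
    scores = Counter()
    for t in tokens:
        scores.update(counts.get(t, {}))
    if not scores:
        return None
    return scores.most_common(1)[0][0]


# Hypothetical toy data (English stand-ins for tokenized Persian reviews).
features = {"price", "screen", "battery"}
sentiment_words = {"expensive", "cheap", "bright", "sharp", "durable"}
reviews = [
    "the price is expensive".split(),
    "the screen is bright and sharp".split(),
    "the battery is durable".split(),
    "expensive price but bright screen".split(),
]

counts = build_cooccurrence(reviews, features, sentiment_words)
print(infer_implicit_feature("it is too expensive".split(), counts, features))
print(infer_implicit_feature("very bright".split(), counts, features))
```

In this sketch a sentence such as "it is too expensive" names no feature, so the sentiment word "expensive" is mapped to the feature it co-occurs with most often in the toy reviews. Raw counts are the simplest possible association score; a real system would likely normalize them or use a statistical association measure.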
