A Set of Statistical Features for Evaluating Interactive Question Answering

Document Type : Original Article

Authors

Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran

Abstract

Evaluation plays an important role in interactive question answering (IQA) systems, yet there is practically no general methodology for evaluating them. The main difficulty in designing an assessment method for IQA systems is that the interaction part can rarely be predicted, so humans need to be involved in the evaluation process. In this paper, an appropriate model is presented by introducing a set of built-in features for evaluating IQA systems. To conduct the evaluation, four IQA systems were considered, and a database of the conversations exchanged between users and these systems was collected. After preprocessing the conversations, their statistical characteristics were extracted, and a characteristics matrix was formed on that basis. Finally, using a support vector machine (SVM), the human judgments were divided into two groups. The correlation coefficient between the human judgments and the proposed feature set indicated the high accuracy of the proposed features in evaluating IQA systems.
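The feature-extraction and correlation steps described above can be illustrated with a minimal sketch. The abstract does not name the individual features, so the three used here (turn count, mean words per user turn, mean words per system turn) are hypothetical stand-ins, and the conversations and human scores are made-up toy data. The SVM step is omitted because it would need a third-party library such as scikit-learn.

```python
from math import sqrt
from statistics import mean


def extract_features(conversation):
    """Map one conversation to a row of hypothetical statistical features.

    `conversation` is a list of (speaker, text) pairs, where speaker is
    "user" or "system". The feature choice is an assumption, not the
    feature set used in the paper.
    """
    user_lengths = [len(text.split()) for who, text in conversation if who == "user"]
    system_lengths = [len(text.split()) for who, text in conversation if who == "system"]
    return [
        float(len(conversation)),                        # number of turns
        mean(user_lengths) if user_lengths else 0.0,     # mean words per user turn
        mean(system_lengths) if system_lengths else 0.0, # mean words per system turn
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (sx * sy)


# Toy conversations and human ratings (illustrative only).
conversations = [
    [("user", "what is question answering"), ("system", "a retrieval task")],
    [("user", "who evaluates it"), ("system", "do you mean automatic or human evaluation"),
     ("user", "human"), ("system", "trained assessors rate each answer")],
    [("user", "define precision"), ("system", "which field"), ("user", "retrieval"),
     ("system", "relevant retrieved over retrieved"), ("user", "thanks"), ("system", "welcome")],
]
human_scores = [2.0, 4.0, 3.0]

# Characteristics matrix: one row per conversation, one column per feature.
matrix = [extract_features(c) for c in conversations]

# Correlation between the first feature column (turn count) and human scores.
turn_counts = [row[0] for row in matrix]
correlation = pearson(turn_counts, human_scores)
```

In a fuller pipeline, the rows of `matrix` would be the inputs to a two-class classifier, and the correlation would be computed for each feature column in turn.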
