Improving Linked Data Quality Assessment and Fusion by a Conflict Resolution Approach

Editorial

Authors

1 Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran

2 Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran

3 School of Computer Science, Carleton University, Ottawa, Canada

Abstract

Semantic web technology and decision making based on linked data are advancing steadily. Linked data are managed as decentralized sources, and their quality is a serious concern. Assessing the quality of linked data is key to adopting them in different fields, because each data set is developed by a different group using different methods and tools. As a result, such data are more diverse, both qualitatively and quantitatively, than data generated by official organizations and firms. In this paper, we first review and evaluate the dimensions and measures for assessing the quality of data, especially linked data. We then present a novel framework for improving linked data quality assessment and data fusion, on the premise that higher-quality input data yield better fusion results. Finally, we introduce six rules for handling data conflicts and a new metric for assessing the granularity level of data (GLQM), and we use several tools to assess data quality within the proposed framework.

Keywords

