A Hierarchical Method For Content-Structured Graph Clustering

Document Type : Original Article

Authors

1 Iran University of Science and Technology, Tehran, Iran

2 Emam Hossein Comprehensive University, Tehran, Iran

Abstract

Entities in social networks have content in addition to relationships with one another. Such networks can be modeled as enriched graphs, in which nodes may also carry text. Graph clustering is one of the important steps in analyzing social networks. Nevertheless, most existing graph clustering methods focus on either the content or the structural aspect alone. Content-structural graph clustering algorithms consider the structure and the content of the graph simultaneously. The main aim of this paper is to obtain well-connected clusters (structure) whose nodes have homogeneous attribute values (content). The proposed algorithm, called RSL-Cluster, performs the clustering by hierarchically removing edges whose weight is lower than the average similarity of nodes. This process continues until the user’s desired number of clusters is reached. A comparison with three well-known content-structural clustering algorithms indicates that the proposed method performs well. The evaluation uses structural, content, and content-structural measures.
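
The edge-removal idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' RSL-Cluster implementation: the abstract does not specify how edge weights or the average similarity are computed, so the sketch assumes that each edge is weighted by the Jaccard similarity of its endpoints' content and that the threshold is the mean weight of the edges still present. The names `jaccard`, `components`, and `prune_clusters` are hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two content sets (an assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            comp.add(n)
            for m in adj[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        comps.append(comp)
    return comps


def prune_clusters(content, edges, k):
    """Remove below-average edges round by round until at least k components exist."""
    nodes = list(content)
    # Assumption: edge weight = content similarity of the two endpoints.
    weights = {(u, v): jaccard(content[u], content[v]) for u, v in edges}
    while weights and len(components(nodes, weights)) < k:
        avg = sum(weights.values()) / len(weights)
        # Weakest edges first, so a round can stop as soon as k clusters appear.
        weak = sorted((e for e, w in weights.items() if w < avg), key=weights.get)
        if not weak:
            break  # all remaining edges are equally strong; nothing left to prune
        for e in weak:
            del weights[e]
            if len(components(nodes, weights)) >= k:
                break
    return components(nodes, weights)


# Toy enriched graph: two topical groups joined by one weak edge.
content = {
    "a": {"ml", "graphs"}, "b": {"ml", "graphs"}, "c": {"ml", "graphs"},
    "d": {"cooking"}, "e": {"cooking"},
}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
clusters = prune_clusters(content, edges, k=2)
```

In this toy graph the only edge below the mean weight is `("c", "d")`, so removing it separates the two content groups. When no remaining edge falls below the average, the sketch stops early and may return fewer than `k` clusters; how the published method handles that case is not stated in the abstract.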
