Using Clustering and a Markov Model to Predict a User's Next Web Request

Authors

1 M.Sc. student, Faculty of Electrical and Computer Engineering, University of Tabriz

2 Assistant Professor, Faculty of Electrical and Computer Engineering, University of Tabriz

3 Associate Professor, Faculty of Electrical and Computer Engineering, University of Tabriz

Abstract

Various methods have been proposed to reduce the latency perceived by users. Web prefetching is one of them: it uses a prediction algorithm to predict the user's future activity. This paper presents a method for predicting users' next requests that can be used for web prefetching. The method combines a Markov model with clustering to build a model that predicts the user's next request, and it also uses the temporal properties of accesses in the prediction. Implementation results show that its predictions improve on those of the Markov model.
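As a rough illustration of the idea summarized above, the following minimal sketch builds one first-order Markov model per session cluster and falls back to a global model when a cluster has never seen the current page. It is not the authors' implementation: the function names, the toy sessions, and the precomputed cluster labels are all hypothetical, the clustering step itself is assumed to have been done elsewhere, and the temporal properties of accesses are not modeled.

```python
from collections import Counter, defaultdict


def build_markov(sessions):
    """Count first-order transitions (page -> next page) over all sessions."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            transitions[current][nxt] += 1
    return transitions


def predict_next(transitions, current_page):
    """Return the most frequent successor of current_page, or None if unseen."""
    successors = transitions.get(current_page)
    if not successors:
        return None
    return successors.most_common(1)[0][0]


def build_cluster_models(sessions, labels):
    """Build one Markov model per cluster, plus a global fallback model."""
    by_cluster = defaultdict(list)
    for session, label in zip(sessions, labels):
        by_cluster[label].append(session)
    models = {label: build_markov(group) for label, group in by_cluster.items()}
    return models, build_markov(sessions)


def predict_with_clusters(models, global_model, label, current_page):
    """Predict with the cluster's model; fall back to the global model."""
    prediction = predict_next(models.get(label, {}), current_page)
    if prediction is None:
        prediction = predict_next(global_model, current_page)
    return prediction


# Hypothetical toy sessions and cluster labels (assumed to come from a
# separate clustering step that is not shown here).
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "sports"],
    ["home", "shop", "cart"],
    ["home", "shop", "cart"],
    ["home", "shop", "checkout"],
]
labels = [0, 0, 1, 1, 1]
models, global_model = build_cluster_models(sessions, labels)
```

In this toy data the global model would predict "shop" after "home", because that transition is the most frequent overall, whereas the model for cluster 0 predicts "news". This is the kind of difference a cluster-specific model is meant to capture.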

Keywords

