One-class Classifier Based on Support Vectors for Noisy Data by Using Chaotic Krill Herd Algorithm and Local Density

Authors

Faculty of Computer Engineering and Information Technology, Sadjad University of Technology, Mashhad, Iran

Abstract

The purpose of one-class classification is to detect target data and separate them from outliers. The support vector data description (SVDD) classifier is one such method. It constructs a hypersphere in feature space that encloses the target data, and the surface of the hypersphere serves as the decision boundary between target and outlier data. Determining an appropriate radius and center for the sphere is an optimization problem. The method faces two challenges: noise in the data set, and the fact that data density is not taken into account when choosing the center. Both lead to errors in the decision boundary. The proposed classifier (KH-SVDD) searches for an appropriate center of the sphere using the chaotic krill herd optimization algorithm. In addition, a weight is computed from the local density of the data points to control how strongly each point influences the classifier boundary. This weight serves as an auxiliary parameter for distinguishing target data from noise. Experimental comparisons with state-of-the-art methods show the superiority of the proposed method in noise detection.
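
The idea of a density-weighted enclosing sphere can be illustrated with a rough sketch. This is not the KH-SVDD algorithm: the center is taken as a density-weighted mean instead of being found by a chaotic krill herd search, no kernel is used, and the function names, the neighbourhood size `k`, and the `coverage` fraction are all assumptions made for illustration only.

```python
import numpy as np


def local_density_weights(X, k=5, eps=1e-12):
    """Weight in (0, 1] per point; dense neighbourhoods give weights near 1.

    Assumption: density is approximated by the mean distance to the
    k nearest neighbours (k is an illustrative choice).
    """
    # Pairwise Euclidean distances, shape (n, n).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # After sorting each row, column 0 is the zero distance to the point itself.
    knn_mean = np.sort(d, axis=1)[:, 1:k + 1].mean(axis=1)
    # Smaller mean neighbour distance means higher local density.
    return (knn_mean.min() + eps) / (knn_mean + eps)


def fit_sphere(X, weights, coverage=0.9):
    """Return (center, radius) of a sphere enclosing `coverage` of the total weight."""
    # Assumption: a density-weighted mean stands in for the optimizer's center search.
    center = np.average(X, axis=0, weights=weights)
    dist = np.linalg.norm(X - center, axis=1)
    # Grow the sphere outwards until the enclosed points hold the requested weight.
    order = np.argsort(dist)
    cum = np.cumsum(weights[order]) / weights.sum()
    idx = min(np.searchsorted(cum, coverage), len(dist) - 1)
    return center, dist[order][idx]


def is_target(X, center, radius):
    """True for points inside or on the sphere, False for outliers."""
    return np.linalg.norm(X - center, axis=1) <= radius


# Synthetic demonstration data (not from the paper): a dense target cloud
# plus a few sparse noise points far from it.
rng = np.random.default_rng(0)
target = rng.normal(0.0, 1.0, size=(200, 2))
noise = rng.uniform(6.0, 10.0, size=(10, 2))
X = np.vstack([target, noise])

w = local_density_weights(X)
center, radius = fit_sphere(X, w)
print("center:", center, "radius:", radius)
print("noise points accepted:", int(is_target(noise, center, radius).sum()))
```

Because the sparse noise points receive small weights, they pull the center and the radius far less than they would in an unweighted fit, which is the effect the abstract attributes to the local density weight.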

Keywords

