Semi-supervised Sparse Feature Selection based on Hessian Regularization and Fisher Discriminant Analysis

Document Type: Original Article

Author

Department of Computer Engineering, Faculty of Engineering, Ardakan University, P.O. Box 184, Ardakan, Iran

Abstract

Feature selection is one of the most important techniques in machine learning and pattern recognition; it eliminates redundant features and selects a suitable subset of features, which avoids overfitting during model building and improves model performance. In many applications, obtaining labeled data is costly and time-consuming, while unlabeled data are readily available. Semi-supervised feature selection methods can therefore exploit both labeled and unlabeled data during the feature selection process. In this paper, a semi-supervised sparse feature selection method is proposed based on Hessian regularization and Fisher discriminant analysis, which selects appropriate features using the labeled data together with the local structure of both labeled and unlabeled data. In the proposed method, an objective function based on a semi-supervised scatter matrix and the ℓ2,1-norm is presented for feature selection, which takes the correlation among features into account. An iterative algorithm is used to solve the proposed objective function, and its convergence is proved theoretically and verified experimentally. Results of experiments on five data sets indicate that the proposed method selects relevant features more effectively than the other methods compared in this paper.
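To make the ℓ2,1-norm machinery described above concrete, the sketch below ranks features with the standard iteratively reweighted ℓ2,1-minimization scheme (in the style of Nie et al.'s joint ℓ2,1-norm minimization). It is a minimal illustration, not the authors' method: it substitutes a plain least-squares data term for the paper's semi-supervised scatter-matrix and Hessian regularization terms, and the function name l21_feature_selection and the gamma setting are illustrative assumptions.

import numpy as np

def l21_feature_selection(X, Y, gamma=1.0, n_iter=50, eps=1e-8):
    """Rank features by solving  min_W ||X W - Y||_F^2 + gamma * ||W||_{2,1}
    with the standard iteratively reweighted scheme.

    X : (n_samples, n_features) data matrix
    Y : (n_samples, n_classes) label indicator matrix
    Returns feature indices sorted by decreasing row norm of W.
    """
    n, d = X.shape
    D = np.eye(d)  # reweighting matrix; D_ii approximates 1 / (2 ||w^i||_2)
    for _ in range(n_iter):
        # Closed-form update of W with D fixed (a ridge-like linear solve).
        W = np.linalg.solve(X.T @ X + gamma * D, X.T @ Y)
        # Update D from the row norms of W; small rows shrink toward zero,
        # which is what makes the ℓ2,1 penalty select features jointly.
        row_norms = np.sqrt((W ** 2).sum(axis=1)) + eps
        D = np.diag(1.0 / (2.0 * row_norms))
    scores = np.sqrt((W ** 2).sum(axis=1))  # near-zero rows -> discarded features
    return np.argsort(-scores)

# Toy usage: 100 samples, 20 features, 3 classes with one-hot labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
y = rng.integers(0, 3, size=100)
Y = np.eye(3)[y]
print(l21_feature_selection(X, Y, gamma=0.5)[:5])  # top-5 feature indices

Because the ℓ2,1-norm penalizes whole rows of W, features are kept or dropped across all classes at once, which is how the objective accounts for correlation among features; the paper's full objective replaces the least-squares term with its semi-supervised scatter-matrix and Hessian terms but is solved by the same kind of iterative reweighting.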

Keywords

