A Distributed Minimum Redundancy Maximum Relevance Feature Selection Approach

Document Type: Original Article

Authors

Department of Computer Engineering, Faculty of Engineering, Arak University, Arak, Iran.

Abstract

Feature selection (FS) is used in almost all data mining applications and offers benefits such as reduced computation and storage costs. Most current feature selection algorithms work only in a centralized manner; however, this approach does not scale effectively to high-dimensional datasets. In this paper, we propose a distributed version of the Minimum Redundancy Maximum Relevance (mRMR) algorithm. The proposed algorithm solves the problem in six steps: it partitions the dataset horizontally into subsets, selects relevant features and eliminates redundant ones within each subset, and finally merges the per-subset results into a single feature set. We evaluate the performance of the proposed method on several datasets. The results show that the suggested method improves classification accuracy and reduces runtime.
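The partition–select–merge procedure described above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: features are assumed discrete, mutual information is estimated from empirical counts, and the merge step is a simple vote over per-partition selections (the paper's actual six-step merge may differ). All function names here (`mutual_info`, `mrmr`, `distributed_mrmr`) are illustrative.

```python
from collections import Counter
from math import log2

def mutual_info(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum((c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

def mrmr(data, labels, k):
    """Select k feature indices by the incremental mRMR criterion:
    maximize MI(feature, label) minus mean MI with already-selected features."""
    n_feats = len(data[0])
    cols = [[row[j] for row in data] for j in range(n_feats)]
    relevance = [mutual_info(col, labels) for col in cols]
    selected = []
    while len(selected) < k:
        best, best_score = None, float("-inf")
        for j in range(n_feats):
            if j in selected:
                continue
            redundancy = (sum(mutual_info(cols[j], cols[s]) for s in selected)
                          / len(selected)) if selected else 0.0
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected

def distributed_mrmr(data, labels, k, n_parts=2):
    """Horizontally partition the rows, run mRMR on each partition,
    then merge by keeping the k features selected most often."""
    votes = Counter()
    for p in range(n_parts):
        rows = [i for i in range(len(data)) if i % n_parts == p]
        part = [data[i] for i in rows]
        part_labels = [labels[i] for i in rows]
        for j in mrmr(part, part_labels, k):
            votes[j] += 1
    return sorted(j for j, _ in votes.most_common(k))
```

Because each partition holds all features but only a fraction of the rows, the per-partition mRMR runs are independent and could be executed in parallel, which is the source of the runtime reduction the abstract reports.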

Keywords

