Improving Microarray Data Classification Using Optimized Clustering-Based Hybrid Gene Selection Algorithm

D. Saravanakumar and Dr. S.K. Mahendran

The primary goal of a gene selection algorithm is to find a small, closely related gene set that can improve the accuracy of a tumor classification system. In this work, an enhanced hybrid gene selection algorithm that combines clustering, filter-based and wrapper-based algorithms is proposed. Initially, the redundant data present in the microarray dataset are removed using an optimized K-Means clustering algorithm. This is followed by the use of a hybrid filter-based algorithm to reduce the size of the gene dataset. The hybridization combines three well-known filter-based algorithms, namely Fisher score, mutual information and information gain. The candidate gene set obtained from this algorithm is reduced in size and consists of only relevant and non-redundant data. Finally, a wrapper-based algorithm is enhanced and applied to obtain the optimal gene set. The enhanced algorithm considers class-dependent and class-independent techniques, and the classification is performed using a support vector machine classifier. Experimental results on six cancer datasets show that the proposed gene selection algorithm has a positive impact on tumor classification and improves its accuracy.
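As an illustration of the filter stage described above, the following minimal sketch ranks genes by Fisher score, one of the three filter criteria named in the abstract. It is not the authors' implementation: the function names `fisher_scores` and `top_k_genes`, the data layout (one list of expression values per sample) and the formula used (class-size-weighted between-class scatter divided by class-size-weighted within-class variance) are assumptions. The clustering step, the other two filters, the way the filters are combined, and the wrapper stage are not shown.

```python
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per gene (hypothetical helper).

    X: list of samples, each a list of expression values (one per gene).
    y: list of class labels, one per sample.
    """
    n_genes = len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(n_genes):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0  # weighted spread of class means around the overall mean
        within = 0.0   # weighted variance inside each class
        for c in classes:
            values = [row[j] for row, label in zip(X, y) if label == c]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        if within > 0:
            scores.append(between / within)
        elif between > 0:
            # Classes differ and have zero internal variance: perfectly separating gene.
            scores.append(float("inf"))
        else:
            # Constant gene: carries no class information.
            scores.append(0.0)
    return scores


def top_k_genes(X, y, k):
    """Return the indices of the k genes with the highest Fisher score."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]
```

A higher score means the class means of a gene are far apart relative to its spread within each class, so such genes are kept as candidates. In a full pipeline of the kind the abstract describes, this ranking would be combined with the mutual information and information gain rankings before the wrapper stage; how that combination is done is not specified here.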

Volume 12 | 03-Special Issue

Pages: 486-495

DOI: 10.5373/JARDCS/V12SP3/20201283