

Big data refers to large volumes of data, much of it unstructured, such as images, videos and social media data. Recently, the use of big data analytics in medical image processing has increased rapidly. Tumor detection and the classification of leukemia cells are challenging tasks in medical image processing, as manual data analysis is time consuming and often inaccurate. In big data applications, feature reduction plays a prominent role in eliminating irrelevant features and building a good learning model. In this paper, clustering based on the Backtracking Search Optimization Algorithm (BSA) is used to segment the nucleus. Several types of features, including shape, texture and colour based features, are extracted to describe the segmented nucleus. A Modified Dominance Soft Set-based Feature Reduction Algorithm (MDSSA) is then designed to select the most prominent features for leukemia image classification. The results of MDSSA are tested using Analysis of Variance (ANOVA). On the feature-extracted datasets, MDSSA selected 17% of the features, and this subset showed more promising performance than the subsets chosen by existing feature reduction algorithms. The proposed method also reduces the computation time for further analysis of big data. The ANOVA results confirm that the proposed feature reduction method is more efficient than the other methods.
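As an illustration of the kind of test mentioned above, the following is a minimal sketch of a one-way ANOVA F statistic in plain Python. It is not the authors' evaluation code: the function name `one_way_anova_f` is hypothetical, and the input numbers are toy values chosen only to make the arithmetic easy to follow, not results from the paper. Each group is assumed to hold repeated measurements (for example, per-run scores) for one feature reduction method.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples.

    Hypothetical helper for illustration only; each inner list is one group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k

    if ss_within == 0:
        # No within-group spread: F is undefined or unbounded.
        f_stat = float("inf") if ss_between > 0 else float("nan")
    else:
        f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy numbers (assumed, not taken from the paper): three groups of three values.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(round(f_stat, 6), df_b, df_w)
```

A large F relative to the critical value of the F distribution with `df_between` and `df_within` degrees of freedom suggests that at least one group mean differs from the others; the p-value would normally be obtained from a statistics library rather than computed by hand.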