Discretization transforms continuous attributes into discrete ones. It makes the learning step of many classification algorithms both faster and more accurate. Although many efficient supervised discretization methods have been proposed, unsupervised methods such as Equal Width Discretization (EWD) and Equal Frequency Discretization (EFD) remain in use, especially for datasets in which class labels are not available. Each of these algorithms has its drawbacks. To improve the classification accuracy of EWD, this paper proposes a new method based on adjustable intervals. The new method is tested on benchmark datasets from the UCI repository of machine learning databases, and the C4.5 classification algorithm is then used to measure classification accuracy. The experimental results show that the method improves classification accuracy by about 5% compared to the conventional EWD and EFD methods, and that it performs as well as the supervised Entropy Minimization Discretization (EMD) method.
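For readers unfamiliar with the two unsupervised baselines, the following is a minimal sketch of conventional EWD and EFD in plain Python. It is not the adjustable-interval method proposed in the paper, which the abstract does not specify. The function names (`equal_width_cuts`, `equal_frequency_cuts`, `discretize`), the number of intervals `k`, and the sample data are all illustrative assumptions.

```python
from bisect import bisect_right


def equal_width_cuts(values, k):
    """Return k-1 cut points splitting [min, max] into k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """Return k-1 cut points so each interval holds roughly len(values)/k items."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def discretize(values, cuts):
    """Map each value to the index of its interval (0 .. len(cuts))."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical sample with one outlier (100.0), chosen only for illustration.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 100.0]

ewd_bins = discretize(data, equal_width_cuts(data, 4))
efd_bins = discretize(data, equal_frequency_cuts(data, 4))

print(ewd_bins)  # [0, 0, 0, 0, 0, 0, 0, 3]
print(efd_bins)  # [0, 0, 1, 1, 2, 2, 3, 3]
```

On this sample the single outlier stretches the EWD range so that seven of the eight values fall into one interval, a commonly cited weakness of equal-width binning. EFD spreads the values evenly but ignores how far apart they are.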