Predictive Toxicology (PT) attempts to describe the relationships between the structure of chemical compounds and biological and toxicological processes. The most important issue in real-world PT problems is the very large number of chemical descriptors. A secondary issue is data quality, since irrelevant, redundant, noisy, and unreliable data have a negative impact on prediction results. The pre-processing step of Data Mining addresses both complexity reduction and data quality improvement through feature selection, data cleaning, and noise reduction. In this paper, we present some of the issues that should be taken into account when preparing data before the actual knowledge discovery is performed.
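As an illustration of the kind of pre-processing described above, the following minimal sketch drops constant descriptors (irrelevant for prediction) and descriptors that are almost perfectly correlated with one already kept (redundant). It is not the method of this paper: the function names, the toy descriptor names and values, and the 0.95 correlation threshold are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def select_descriptors(descriptors, max_abs_corr=0.95):
    """Return the names of descriptors kept after two simple filters.

    descriptors: dict mapping a descriptor name to its values, one per compound.
    max_abs_corr: hypothetical redundancy threshold (an assumption, not from the paper).
    """
    kept = []
    for name, values in descriptors.items():
        # Filter 1: a constant descriptor carries no information.
        if max(values) == min(values):
            continue
        # Filter 2: skip descriptors nearly collinear with one already kept.
        if any(abs(pearson(values, descriptors[k])) > max_abs_corr for k in kept):
            continue
        kept.append(name)
    return kept


# Toy data for four hypothetical compounds (values are made up).
toy = {
    "mol_weight": [180.2, 46.1, 78.1, 151.2],
    "mol_weight_x2": [360.4, 92.2, 156.2, 302.4],  # exact multiple: redundant
    "const_flag": [1.0, 1.0, 1.0, 1.0],            # constant: irrelevant
    "logp": [1.2, -0.3, 2.1, 0.5],
}

selected = select_descriptors(toy)
print(selected)
```

The greedy order-dependent filter is only one possible design choice; it keeps the first descriptor of each correlated group, so the result depends on the order in which descriptors are listed.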
IOS Press, Inc.