Medical care data is a valuable resource that can be used for many purposes, including managing and planning for future health needs as well as clinical research. However, the heterogeneity and complexity of medical data can be an obstacle to applying data mining techniques, so much of the potential value of this data goes untapped. In this paper we develop a methodology that reduces the dimensionality of primary care data in order to make it more amenable to visualisation, mining and clustering. The methodology combines ontology-based semantic similarity with principal component analysis (PCA) to map the data into an appropriate and informative low-dimensional space. Throughout the study, we had access to anonymised patient data from primary care in Salford, UK. Our results show that diagnosis codes in primary care data can be used to map patients into an informative low-dimensional space, which in turn supports further data exploration and medical hypothesis formulation.
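As an illustration of the general idea only, the following minimal sketch shows one way that ontology-based similarity between diagnosis codes could feed into PCA. Everything in it is an assumption made for illustration: the toy code hierarchy, the Wu-Palmer-style similarity measure, the choice to represent each patient by their best-matching similarity to every code in the vocabulary, and all function names. It is not the pipeline or the data used in the study.

```python
import numpy as np

# Hypothetical toy hierarchy (child -> parent); not a real clinical coding scheme.
PARENT = {
    "cardio": "root", "resp": "root",
    "angina": "cardio", "mi": "cardio",
    "asthma": "resp", "copd": "resp",
}

def ancestors(code):
    """Return the path from a code up to the root, inclusive."""
    path = [code]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def code_similarity(a, b):
    """Wu-Palmer-style similarity (assumed measure), with the root at depth 1."""
    path_a, path_b = ancestors(a), ancestors(b)
    # First node on a's upward path that also lies on b's path: the deepest shared ancestor.
    lca = next(c for c in path_a if c in path_b)
    depth_lca = len(ancestors(lca))
    return 2 * depth_lca / (len(path_a) + len(path_b))

def patient_features(patients, vocabulary):
    """One row per patient: best similarity of any of their codes to each vocabulary code."""
    return np.array([
        [max(code_similarity(c, v) for c in codes) for v in vocabulary]
        for codes in patients
    ])

def pca_project(X, n_components=2):
    """Project rows of X onto the leading principal components via SVD."""
    Xc = X - X.mean(axis=0)                     # centre each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Made-up patients, each described by a set of diagnosis codes.
patients = [{"angina", "mi"}, {"mi"}, {"asthma"}, {"asthma", "copd"}]
vocabulary = sorted(["angina", "mi", "asthma", "copd"])

X = patient_features(patients, vocabulary)
coords = pca_project(X, n_components=2)         # one 2-D point per patient
```

In this toy setting, patients with related codes receive similar feature rows even when they share no code exactly, so they land close together in the projected space. That is the property that makes a semantic measure more informative here than a plain one-hot encoding of codes.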