Automatic document classification is a common problem that has been successfully addressed with machine learning methods. However, these methods require extensive training data, which is not always readily available. Additionally, in privacy-sensitive settings, transferring and reusing trained machine learning models is not an option because sensitive information could potentially be reconstructed from the model. Therefore, we propose a transfer learning method that uses ontologies to normalize the feature space of text classifiers into a controlled vocabulary. This ensures that the trained models do not contain personal data and can be widely reused without violating the GDPR. Furthermore, the ontologies can be enriched so that the classifiers can be transferred to contexts with different terminology without additional training. Applying classifiers trained on medical documents to medical texts written in colloquial language shows promising results and highlights the potential of the approach. GDPR compliance by design opens up many further application domains for transfer-learning-based solutions.
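The normalization idea can be illustrated with a minimal sketch. It is not the authors' implementation: the toy ontology, the concept identifiers, the colloquial synonym table, and the `normalize` helper are all hypothetical and chosen only for illustration. The sketch assumes the ontology can be flattened into a mapping from surface terms to concept identifiers, and that only those identifiers are kept as classifier features.

```python
import re
from collections import Counter

# Hypothetical toy ontology: surface term -> concept identifier.
ONTOLOGY = {
    "myocardial infarction": "C_HEART_ATTACK",
    "hypertension": "C_HIGH_BLOOD_PRESSURE",
    "cephalalgia": "C_HEADACHE",
}

# Hypothetical enrichment: colloquial synonyms mapped to the same concepts.
COLLOQUIAL = {
    "heart attack": "C_HEART_ATTACK",
    "high blood pressure": "C_HIGH_BLOOD_PRESSURE",
    "headache": "C_HEADACHE",
}


def normalize(text, ontology):
    """Return counts of ontology concepts found in text.

    Every token that is not an ontology term (names, dates, free text)
    is discarded, so only controlled-vocabulary features remain.
    """
    remaining = text.lower()
    counts = Counter()
    # Match longer terms first so multi-word terms are consumed before
    # any shorter term that might overlap with them.
    for term in sorted(ontology, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        remaining, n_matches = re.subn(pattern, " ", remaining)
        if n_matches:
            counts[ontology[term]] += n_matches
    return counts


# A clinical sentence, normalized with the base ontology.
clinical = normalize(
    "Patient John Doe had a myocardial infarction and hypertension.",
    ONTOLOGY,
)

# A colloquial sentence, normalized with the enriched ontology.
enriched = {**ONTOLOGY, **COLLOQUIAL}
colloquial = normalize(
    "My dad had a heart attack, and he has high blood pressure.",
    enriched,
)
```

In this sketch both sentences map to the same concept counts, and the patient name never reaches the feature space. A classifier trained on such features would therefore see only concept identifiers, and enriching the term mapping would be enough to apply it to colloquial text without retraining.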