Background: Tagging text data with codes that represent biomedical concepts plays an important role in medical data management and analysis. Problems arise when an ambiguous word is linked to several concepts.
Objectives and Methods: This study investigates word sense disambiguation based on word embeddings and recurrent convolutional neural networks. It focuses on terms that are mapped to multiple concepts of the Unified Medical Language System (UMLS).
Results: We created 20 text processing pipelines, each disambiguating the sense of one word, and trained them on a subset of the MeSH Word Sense Disambiguation (MSH WSD) data set. The pipelines were then tested on a disjoint subset of the MSH WSD data. Most pipelines achieved good or even excellent results (70% of the pipelines achieved at least 90% accuracy, 40% achieved at least 98% accuracy). One poorly performing outlier was detected.
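The reported shares can be translated back into pipeline counts. The following is a minimal sketch, not the authors' evaluation code: the function names `implied_count` and `share_at_least` are hypothetical, and only the figures 20, 70% and 40% come from the abstract.

```python
from typing import Sequence

N_PIPELINES = 20  # number of per-word pipelines reported in the abstract


def implied_count(percent: int, total: int = N_PIPELINES) -> int:
    """Number of pipelines that a reported percentage corresponds to."""
    return percent * total // 100


def share_at_least(accuracies: Sequence[float], threshold: float) -> float:
    """Fraction of pipelines whose test accuracy meets the threshold.

    Hypothetical helper: shows how such a share could be computed from
    per-pipeline accuracies, which the abstract does not list.
    """
    if not accuracies:
        raise ValueError("need at least one accuracy value")
    return sum(a >= threshold for a in accuracies) / len(accuracies)


print(implied_count(70))  # pipelines with at least 90% accuracy
print(implied_count(40))  # pipelines with at least 98% accuracy
```

Under these figures, 14 of the 20 pipelines reached at least 90% accuracy and 8 reached at least 98%.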
Conclusion: The proposed approach can serve as a basis for a scaled-up system that combines pipelines for many ambiguous words. The methods used here have recently proved very successful in other fields of text understanding and can be expected to scale up as more training data becomes available.
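One way to picture a system that combines per-word pipelines is a simple dispatcher that routes each ambiguous word to its own model. The sketch below is an illustration under stated assumptions, not the design described by the authors: the class `Disambiguator`, the keyword rule in `toy_cold_pipeline`, the example word and the sense labels are all hypothetical, and the toy rule merely stands in for a trained neural pipeline.

```python
from typing import Callable, Dict, Optional

# A pipeline maps the context of an ambiguous word to a sense label.
Pipeline = Callable[[str], str]


def toy_cold_pipeline(context: str) -> str:
    """Hypothetical stand-in for a trained per-word model (keyword rule only)."""
    lowered = context.lower()
    if "temperature" in lowered or "weather" in lowered:
        return "SENSE_LOW_TEMPERATURE"  # placeholder label, not a real UMLS code
    return "SENSE_COMMON_COLD"  # placeholder label, not a real UMLS code


class Disambiguator:
    """Routes each ambiguous word to the pipeline registered for it."""

    def __init__(self) -> None:
        self._pipelines: Dict[str, Pipeline] = {}

    def register(self, word: str, pipeline: Pipeline) -> None:
        self._pipelines[word.lower()] = pipeline

    def disambiguate(self, word: str, context: str) -> Optional[str]:
        """Return a sense label, or None if no pipeline covers the word."""
        pipeline = self._pipelines.get(word.lower())
        return pipeline(context) if pipeline else None


system = Disambiguator()
system.register("cold", toy_cold_pipeline)
print(system.disambiguate("cold", "Exposure to cold weather worsened symptoms."))
print(system.disambiguate("heart", "No pipeline is registered for this word."))
```

Keeping one independent pipeline per word means a new ambiguous word can be added by registering another model, without retraining the existing ones.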