The objective of this study is to develop a method for clinical abbreviation disambiguation using deep contextualized representations and cluster analysis. We used the pre-trained BioELMo language model to generate a contextualized word vector for the abbreviation in each instance. Principal component analysis was then applied to the word vectors to reduce their dimensionality. K-Means clustering was performed separately for each abbreviation, and each cluster was assigned a sense by majority vote over the annotations of its instances. Our method achieved an average accuracy of approximately 95% across 74 abbreviations. Simulation showed that annotating 5 samples per cluster was sufficient to determine the cluster's sense.
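The sense-assignment step can be illustrated with a minimal sketch. It assumes the cluster index of each instance is already available (for example from K-Means on the reduced vectors) and covers only the majority vote over a small annotated sample per cluster. The function names, the `annotate` callback, and the toy "RA" data are hypothetical and are not taken from the study.

```python
import random
from collections import Counter, defaultdict


def assign_cluster_senses(cluster_ids, annotate, samples_per_cluster=5, seed=0):
    """Label each cluster with the majority sense of a small annotated sample.

    cluster_ids: one cluster index per instance (assumed to come from K-Means).
    annotate: hypothetical callback that returns the sense of an instance index,
              standing in for a human annotator.
    """
    rng = random.Random(seed)

    # Group instance indices by cluster.
    members = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        members[cid].append(idx)

    senses = {}
    for cid, idxs in members.items():
        # Annotate at most `samples_per_cluster` randomly chosen instances.
        sample = rng.sample(idxs, min(samples_per_cluster, len(idxs)))
        votes = Counter(annotate(i) for i in sample)
        senses[cid] = votes.most_common(1)[0][0]
    return senses


def predict(cluster_ids, senses):
    """Every instance inherits the sense assigned to its cluster."""
    return [senses[cid] for cid in cluster_ids]


# Invented toy data for an abbreviation such as "RA": two clusters of six
# instances, with one instance in cluster 0 that belongs to the other sense.
cluster_ids = [0] * 6 + [1] * 6
gold = ["rheumatoid arthritis"] * 5 + ["right atrium"] + ["right atrium"] * 6

senses = assign_cluster_senses(cluster_ids, lambda i: gold[i])
predicted = predict(cluster_ids, senses)
accuracy = sum(p == g for p, g in zip(predicted, gold)) / len(gold)
print(senses, round(accuracy, 3))
```

In this toy case the one misclustered instance cannot change the vote for cluster 0, so both clusters receive the expected sense and only that instance is mislabeled. In practice, the number of annotated samples needed would depend on how pure the clusters are.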
IOS Press, Inc.