Natural Language Processing (NLP) is a powerful technique for extracting valuable information from unstructured electronic health records (EHRs). However, NLP requires high-quality annotated datasets. To date, effective methods to guide the manual annotation of unstructured datasets are lacking, which can hinder NLP performance. This study therefore develops a five-step workflow for manually annotating unstructured datasets: (1) annotator training and familiarisation with the text corpus, (2) vocabulary identification, (3) annotation schema development, (4) annotation execution, and (5) result validation. The workflow was then applied to annotate agitation symptoms in the unstructured EHRs of 40 Australian residential aged care facilities. The annotated corpus achieved an accuracy of 96%. This suggests that the proposed annotation workflow can be used in manual data processing to build annotated training corpora for developing NLP algorithms.
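The abstract does not state how the accuracy figure was calculated. As an illustration only, the sketch below assumes that result validation compares each annotated label with a reviewer's reference label and reports the proportion that agree. The function name `annotation_accuracy` and the sample labels are hypothetical and are not taken from the study.

```python
def annotation_accuracy(annotated, reference):
    """Return the fraction of items whose annotated label matches the reference label."""
    if len(annotated) != len(reference):
        raise ValueError("label lists must have the same length")
    if not annotated:
        raise ValueError("cannot compute accuracy on an empty sample")
    # Count positions where the annotator and the reference agree.
    matches = sum(a == r for a, r in zip(annotated, reference))
    return matches / len(annotated)


# Hypothetical labels for five sampled notes (illustrative, not data from the study).
annotated = ["agitation", "none", "agitation", "none", "agitation"]
reference = ["agitation", "none", "agitation", "agitation", "agitation"]

accuracy = annotation_accuracy(annotated, reference)
print(f"accuracy = {accuracy:.0%}")
```

Simple agreement of this kind treats every label equally. If the validation were instead based on inter-annotator agreement, a chance-corrected statistic would be a more suitable choice.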