Processing unstructured clinical text is often necessary to support biomedical tasks such as matching patients to clinical trials. Domain-specific language models are among the methods that have been built to use this free-text information. This study evaluated the performance of Bidirectional Encoder Representations from Transformers (BERT) models in assessing the similarity between clinical trial texts. We compared an unstructured aggregated summary of clinical trials reviewed at the Johns Hopkins Molecular Tumor Board with the ClinicalTrials.gov records, focusing on the titles and eligibility criteria. Seven pretrained BERT-based models were used in our analysis. Of the six biomedical-domain-specific models, only SciBERT outperformed the original BERT model, accurately assigning higher similarity scores to matched than to mismatched trials. This finding is promising and suggests that BERT and, likely, other language models may support patient-trial matching.
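The matched-versus-mismatched comparison described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study: the abstract does not name the similarity measure, so cosine similarity is used here as a common choice; the vectors are made-up toy values standing in for BERT embeddings of trial texts; and the function names `cosine_similarity` and `mean_similarities` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_similarities(summary_vecs, record_vecs):
    """Return (mean matched, mean mismatched) similarity.

    Assumption: summary i and record i describe the same trial (a matched pair);
    every other combination is treated as a mismatched pair.
    """
    matched, mismatched = [], []
    for i, s in enumerate(summary_vecs):
        for j, r in enumerate(record_vecs):
            score = cosine_similarity(s, r)
            (matched if i == j else mismatched).append(score)
    return sum(matched) / len(matched), sum(mismatched) / len(mismatched)


# Toy stand-ins for embeddings of tumor board summaries and registry records.
summary_vecs = [[1.0, 0.0, 0.2], [0.1, 1.0, 0.0]]
record_vecs = [[0.9, 0.1, 0.3], [0.0, 0.8, 0.1]]

mean_matched, mean_mismatched = mean_similarities(summary_vecs, record_vecs)
print(f"matched: {mean_matched:.3f}, mismatched: {mean_mismatched:.3f}")
```

Under this sketch, a model would be judged useful for the task if the mean matched similarity clearly exceeds the mean mismatched similarity. In practice the toy vectors would be replaced by embeddings produced by each pretrained model.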