Background: Healthcare providers and other institutions collect huge amounts of data. However, data protection regulations limit the utilisation of these data for secondary purposes such as research. In scenarios where data from several sources lack universal identifiers, record linkage methods need to be applied to obtain a comprehensive dataset.
Objectives: In this study, we aimed to link two datasets comprising data from ergometric performance tests in order to obtain reference values for free-text annotations and thereby assess the data quality of those annotations.
Methods: We applied an iterative, distance-based time series record linkage algorithm to find corresponding entries in the two given datasets. Subsequently, we assessed the resulting matching rate. The algorithm was implemented in MATLAB.
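As an illustration only, the following Python sketch shows one way an iterative, distance-based linkage of time series could work. The abstract does not specify the distance measure, the acceptance threshold, or the iteration scheme, so everything here is an assumption: the mean absolute difference as distance, the hypothetical `max_distance` cut-off, the greedy closest-pair-first strategy, and the toy data. It is not the MATLAB implementation used in the study.

```python
import math


def series_distance(a, b):
    """Mean absolute difference between two series (assumed measure).

    Series of different length are treated as incomparable (assumption).
    """
    if len(a) != len(b) or not a:
        return math.inf
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def link_records(patients, ergometries, max_distance):
    """Greedy one-to-one linkage: accept the closest remaining pair first."""
    # Sorting all candidate pairs by distance and walking through them is
    # equivalent to repeatedly picking the closest pair still unlinked.
    candidates = sorted(
        (series_distance(p, e), pid, eid)
        for pid, p in patients.items()
        for eid, e in ergometries.items()
    )
    links = {}
    used_ergometries = set()
    for dist, pid, eid in candidates:
        if dist > max_distance:
            break  # all remaining pairs are even further apart
        if pid in links or eid in used_ergometries:
            continue  # one side was already linked in an earlier step
        links[pid] = eid
        used_ergometries.add(eid)
    return links


def matching_rate(links, patients):
    """Share of patient records for which a counterpart was found."""
    return len(links) / len(patients) if patients else 0.0


# Toy data (hypothetical): workload values in watts per test stage.
patients = {
    "p1": [50, 100, 150],
    "p2": [25, 50, 75, 100],
    "p3": [60, 120],
}
ergometries = {
    "e1": [25, 50, 75, 101],
    "e2": [50, 100, 149],
    "e3": [200, 300],
}

links = link_records(patients, ergometries, max_distance=5.0)
rate = matching_rate(links, patients)
print(links, round(rate, 3))
```

In this toy example, "p3" stays unlinked because its only comparable candidate lies beyond the threshold, so the matching rate is two out of three. Without a gold standard, such a rate only describes how many records were linked, not how many links are correct.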
Results: Our record linkage algorithm achieved a matching rate of 74.5% when matching patients' records with their ergometry records. The highest rate of appropriate free-text annotations was 87.9%.
Conclusion: For the given scenario, our algorithm matched 74.5% of the patients. However, we had no gold standard for validating our results. Most of the free-text annotations contained the expected values.
IOS Press, Inc.