Ebook: German Medical Data Sciences 2022 – Future Medicine: More Precise, More Integrative, More Sustainable!
The aim of medical research has always been to gain scientific knowledge which will serve to improve the diagnosis, therapy and prevention of diseases. It is also becoming increasingly important to take account of the changing circumstances of medical care. Factors such as the ageing of society and the recent pandemic have not only led to greater use of medical care, but have also put the human resources and infrastructural basis of the health system under great pressure. Such developments call for science-based solutions which can better adapt medical action to the needs of patients to ensure that medicine remains affordable and accessible for all.
This book is the 6th volume of the German Medical Data Science series in Studies in Health Technology and Informatics and presents the proceedings of the joint conference of the 67th Annual Meeting of the German Association of Medical Informatics, Biometry, and Epidemiology (GMDS) and the 14th Annual Meeting of the Technology and Methods Platform for Networked Medical Research (TMF). The conference was entitled Medicine in Transition - More Precise, More Integrative, More Sustainable. It was due to be held from 21-25 August 2022 in Kiel, Germany, but was changed to an online event on the same dates due to an increasing surge in cases of coronavirus. The pandemic has not only disrupted the planning of many events, it has also impressively demonstrated the importance of technical and methodological aspects of digitization. The 13 papers included here address the challenges of and opportunities for the digitization so vital for the functionality of the modern healthcare system, and the book will be of interest to all those involved in the planning and delivery of healthcare.
At the time of submission of the contributions, we assumed that we would be able to hold the joint meeting of TMF and GMDS as a live event. Despite this prospect, the number of paper submissions decreased, especially the number of full papers (Table 1). One possible reason for the decrease may be the proximity of the deadline to that of MIE 2022 (the EFMI conference Medical Informatics Europe 2022 in Nice, for which 65 full papers were submitted from Germany) and to the deadline for the submission of applications in the Medical Informatics Initiative.
Acceptance rates among the full papers submitted were 100% (1 of 1) for contributions to GMS MIBE and 44% (14 of 32) for contributions to Studies in Health Technology and Informatics (Stud HTI). For comparison, the acceptance rate for full papers at MIE 2022 was 54% (147 of 271). One contribution to GMS MIBE was withdrawn during the review process, and one contribution to Stud HTI was withdrawn after final acceptance. The complete review process is shown in Fig. 1. Due to technical constraints, the figure and the statistics do not include the withdrawn abstracts.
We wish you an exciting conference and inspiring reading of the proceedings.
(Editor in Chief GMDS in Stud HTI) Rainer Röhrig
(Bioinformatics and Systems Medicine) Tim Beißbarth
(Biometrics) Verena Hoffmann
(Medical Informatics) Ursula Hübner
(Bioinformatics and Systems Medicine) Nils Grabe
(Editor in Chief GMS MIBE) Petra Knaup-Gregori
(Epidemiology) Jochem König
(Medical Informatics) Ulrich Sax
(Chair of SPC) Björn Schreiweis
(Congress Secretary) Martin Sedlmayr
Chronic wounds have a significant impact on patients' health-related quality of life (HRQoL) and on healthcare expenditures. Wound management gives rise to various complex decision-making scenarios. Clinical decision support systems (CDSS) can relieve healthcare providers in these complex decision-making processes and improve the quality of care. In our study, we used the Decision Model and Notation (DMN) standard as a knowledge representation format to implement a knowledge base for chronic wound material recommendation in phase-based therapy. The resulting decision model is theoretically consistent and sustainable. With this study, we also emphasize the need for a semantic interoperability framework. This opens up further research possibilities regarding the improvement of the model and the value of DMN for decision models in clinical fields.
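To illustrate the kind of knowledge representation described above, the following is a minimal sketch of a decision table evaluated with a DMN-style UNIQUE hit policy. The input names (phase, exudate), the rule set and the outputs (material_A to material_C) are hypothetical placeholders with no clinical meaning; they are not taken from the study's knowledge base.

```python
# Hypothetical decision table: each rule maps input conditions to one output.
# None means "any value", like the "-" entry in a DMN decision table.
RULES = [
    ({"phase": "cleansing", "exudate": "high"}, "material_A"),
    ({"phase": "cleansing", "exudate": "low"}, "material_B"),
    ({"phase": "granulation", "exudate": None}, "material_C"),
]


def decide(inputs, rules=RULES):
    """Evaluate the table; at most one rule may match (UNIQUE hit policy)."""
    matches = [
        output
        for conditions, output in rules
        if all(v is None or inputs.get(k) == v for k, v in conditions.items())
    ]
    if len(matches) > 1:
        raise ValueError("UNIQUE hit policy violated: several rules match")
    return matches[0] if matches else None


recommendation = decide({"phase": "granulation", "exudate": "high"})
```

Keeping the rules as data rather than as nested conditionals is what makes such a table checkable for overlaps and gaps, which is one reading of the consistency property mentioned in the abstract.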
Verbal probabilities such as “likely” or “probable” are commonly used to describe situations of uncertainty or risk and are easy and natural to most people. Numerous studies are devoted to the translation of verbal probability expressions to numerical probabilities.
The present work aims to summarize existing research on the numerical interpretation of verbal probabilities. This was accomplished by means of a systematic literature review and meta-analysis conducted in accordance with the MOOSE guidelines for meta-analysis of observational studies in epidemiology. Studies were included if they provided empirical assignments of verbal probabilities to numerical values.
The literature search identified 181 publications and finally led to 21 included articles and the processing of 35 verbal probability expressions. Sample sizes of the studies ranged from 11 to 683 participants, and the studies covered a period of half a century from 1967 to 2018. In half of the studies, verbal probabilities were presented in a neutral context, followed by a medical context. Mean values of the verbal probabilities ranged from 7.24% for the term “impossible” up to 94.79% for the term “definite”.
According to the results, there is a common ‘across-study’ consensus of 35 probability expressions for describing different degrees of probability, whose numerical interpretation follows a linear course. However, heterogeneity between studies was considerable and should be regarded as a limiting factor.
In Germany, current COVID-19 cases are managed and reported by the local health authorities. The workload of their employees during the pandemic is high, especially in periods of high infection numbers. In this work, a decision support toolkit for local health authorities is introduced. A demonstrator web application was developed with the R Shiny framework and is publicly accessible online. It contains five separate tools based on statistical models for specific use cases and corresponding questions concerning COVID-19 cases and their contacts. The underlying statistical methods have been implemented in a new open-source R package. The toolkit has the potential to support local health authorities’ employees in their daily work. A simulation-based validation of the statistical models and a usability evaluation of the demonstrator application in a user study will be carried out in the future.
Machine learning-based disease classification has already achieved remarkable results in medicine: for example, models can find a tumor in computed tomography images at least as accurately as experts in the field. Since the development and widespread use of actigraphy watches, activity data has been used as a basis for diagnosing various diseases such as depression or Alzheimer’s disease. In this study, we use a dataset with activity measurements of mentally ill and healthy people, calculate various features and achieve a classification accuracy of over 78%. The paper describes and motivates the features used, discusses differences between healthy, bipolar II and unipolar participants and compares several well-known machine learning classifiers on different classification tasks and with different feature sets.
Recent advances in machine learning show great potential for automatic detection of abnormalities in electroencephalography (EEG). While simple and interpretable models combined with expert-comprehensible input features offer full control of the decision-making process, these methods commonly lag behind complex deep learning and feature extraction methods in terms of performance. Here we study the feasibility of a bridging solution, where deep learning is combined with interpretable input and an algorithm computing the importance of particular EEG features in the decision process. We built a convolutional neural network with multi-channel EEG frequency bands as input and investigated four different methods for feature importance attribution: Layer-wise Relevance Propagation (LRP), DeepLIFT, Integrated Gradients (IG) and Guided GradCAM. Our analysis showed consistency between the first three methods and deviating attributions from the fourth, suggesting the importance of using a package of methods together to ensure the robustness of medical interpretation.
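As an illustration of one of the attribution methods named above, the following is a minimal sketch of Integrated Gradients on a toy logistic model. The model, its weights and the four input values (standing in for band powers) are assumptions made for this example; the study itself applied the method to a convolutional neural network, which is not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy logistic model standing in for the network output."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def gradient(x, w, b):
    """Analytic gradient of the logistic model with respect to the input."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate the path integral of the gradient with a midpoint sum."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        total = [t + g for t, g in zip(total, gradient(point, w, b))]
    return [(xi - bi) * t / steps for xi, bi, t in zip(x, baseline, total)]


# Hypothetical weights and inputs for four features (e.g. four frequency bands).
w, b = [0.8, -0.5, 0.0, 1.2], -0.3
x, baseline = [1.0, 2.0, 3.0, 0.5], [0.0, 0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, w, b)
```

A useful sanity check is the completeness property: the attributions should sum approximately to the difference between the model output at the input and at the baseline.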
The integration of routine medical care data into research endeavors promises great value. However, access to this extra-domain data is constrained by numerous technical and legal requirements. The German Medical Informatics Initiative (MII), initiated by the Federal Ministry of Education and Research (BMBF), is making progress in setting up Medical Data Integration Centers to consolidate data stored in clinical primary information systems. Unfortunately, many research questions require cross-organizational data sources, as one organization’s data is insufficient, especially in rare disease research. A first step for research projects exploring possible multi-centric study designs is to perform a feasibility query, i.e., a cohort size calculation transcending organizational boundaries. Existing solutions for this problem, like the previously introduced feasibility process for the MII’s HiGHmed consortium, perform well for most use cases. However, there exist use cases where neither centralized data repositories nor Trusted Third Parties are acceptable for data aggregation. Based on open standards, such as BPMN 2.0 and HL7 FHIR R4, as well as the cryptographic techniques of secure Multi-Party Computation, we introduce a fully automated, decentralized feasibility query process without any central component or Trusted Third Party. The open source implementation of the proposed solution is intended as a plugin process to the HiGHmed Data Sharing Framework. The process’s concept and underlying algorithms can also be used independently.
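The idea of computing a joint cohort size without a central component can be sketched with additive secret sharing, one common building block of secure Multi-Party Computation. The abstract does not specify the protocol used, so the scheme below, the modulus and the example counts are assumptions for illustration only; the sketch also simulates all parties in one process and omits the network and BPMN/FHIR layers.

```python
import secrets

PRIME = 2**61 - 1  # modulus for the shares; chosen arbitrarily for this sketch


def share(value: int, n_parties: int) -> list[int]:
    """Split value into n additive shares that sum to value modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(local_counts: list[int]) -> int:
    """Simulate the parties: share counts, sum received shares, combine."""
    n = len(local_counts)
    # Row i holds the shares that organization i sends, one per organization.
    distributed = [share(count, n) for count in local_counts]
    # Organization j adds up the shares it received (column j).
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Only the partial sums are published; their total is the cohort size.
    return sum(partial_sums) % PRIME


counts = [12, 7, 30]  # hypothetical local cohort sizes
total = secure_sum(counts)
```

Each individual share is a uniformly random number, so no single organization learns another organization's local count from the shares it receives; only the overall sum is revealed.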
The provision of knowledge through clinical practice guidelines and hospital-specific standard operating procedures (SOPs) is ubiquitous in the medical context and in the treatment of melanoma patients. However, these knowledge sources are only available in unstructured text form and without any contextual link to real patient data. The aim of our project is to provide model-based decision support for the next treatment step based on the actual data and position of a patient.
First, we identified the passages of the SOP for melanoma that are necessary for qualified decision-making at the point of care. In parallel, the patient-specific contextual reference data at the decision points was considered and represented by FHIR (Fast Healthcare Interoperability Resources) resources. The decision algorithm was then formalized using BPMN modeling with FHIR annotations. Validation was provided by medical experts (dermato-oncologists from University Hospital Essen).
The resulting BPMN model is presented here with the diagnostic procedure of sentinel lymph node excision as an example snippet from the whole algorithm. Each decision point is annotated with FHIR resources covering the patient data and preparing the context sensitivity of the model.
Modeling guideline-based information into a decision algorithm that can be presented at the point of care with contextual reference may have the potential to support patient-specific clinical decision-making. For patients beyond a certain disease status, such as in the metastatic setting, modeling becomes highly tailored to specific patient cases and to alternative and individualized treatment options.
Within the scope of the two NUM projects CODEX and RACOON, we present a preliminary technical concept for the collaborative documentation of clinical and radiological COVID-19 data, together with the findings of the preceding requirements analysis. First, we provide an overview of NUM and its two projects CODEX and RACOON, including the GECCO data set. Furthermore, we describe the foundation for the increased collaboration of both projects, which was additionally supported by a survey conducted at University Hospital Frankfurt. Based on the survey results, mint Lesion™, developed by Mint Medical and used at all project sites within RACOON, was selected as the “Electronic Data Capture” (EDC) system for CODEX. Moreover, to avoid duplicate entry of GECCO data into both EDC systems, an early effort was made to devise a collaborative and efficient technical approach that reduces the workload for the medical documentalists. As a first step, we present a preliminary technical concept representing the current and possible future data workflow of CODEX and RACOON. This concept includes a software component to synchronize GECCO data sets between the two EDC systems using the HL7 FHIR standard. In combination with the presented synchronization component, our first approach to the collaborative use of an EDC system and its medical documentalists could benefit all participating project sites of CODEX and RACOON by reducing the overall documentation workload.
We describe the creation of GRASCCO, a novel German-language corpus composed of some 60 clinical documents with more than 43,000 tokens. GRASCCO is a synthetic corpus resulting from a series of alienation steps to obfuscate privacy-sensitive information contained in real clinical documents, the true origin of all GRASCCO texts. Therefore, it is publicly shareable without any legal restrictions. We also explore whether this corpus still represents common clinical language use by comparison with a real (non-shareable) clinical corpus we developed as a contribution to the Medical Informatics Initiative in Germany (MII) within the SMITH consortium. We find evidence that such a claim can indeed be made.
Next-generation sequencing methods continuously provide clinicians and researchers in precision oncology with growing numbers of genomic variants found in cancer. However, manually interpreting the list of variants to identify reliable targets is an inefficient and cumbersome process that does not scale with the increasing number of cases. Support by computer systems is needed for the analysis of large-scale experiments and clinical studies to identify new targets and therapies, and user-friendly applications are needed in molecular tumor boards to support clinicians in their decision-making processes. The MTB-Report tool annotates, filters and sorts genetic variants with information from public databases, providing evidence on actionable variants in both scenarios. A web interface supports medical doctors in the tumor board, and a command line mode allows batch processing of large datasets. The MTB-Report tool is available as an R implementation as well as a Docker image to provide a tool that runs out-of-the-box. Moreover, containerization ensures a stable application that delivers reproducible results over time. A public version of the web interface is available at: http://mtb.bioinf.med.uni-goettingen.de/mtb-report.
The interaction between nurses and physicians in the primary care setting is challenging with regard to structural, process and technical barriers. In order to overcome these barriers, the eMedCare project was launched and a commercial system was implemented.
This study aimed at a formative evaluation of the project. The findings were to be used retrospectively to understand why the project failed.
To this end, two rounds of qualitative interviews were performed with 10 and 8 healthcare providers, respectively.
The interviews revealed a mixed benefit. Difficulties arose because the initial aim to monitor patients shifted towards improving the communication between the providers, partly due to the poor usability of the monitoring system. Additional workload was imposed because the system was not interoperable with the institutional IT systems.
Projects with an unclear or shifting vision and focus seem to be susceptible to failure. The secure communication applications could have been realised on the intended scale if the national Telematikinfrastruktur had been in place.
Electronic health records (EHRs) are part of the daily work of physicians in Germany. This study surveyed the satisfaction of a small group of physicians in German university hospitals with their EHR systems, with a focus on usability.
Data were collected by means of an online survey addressed to all physicians working at university hospitals in Germany.
The study showed that users are not satisfied with EHR systems (Grade 3.62). They pointed out some general problems but also acknowledged many advantages of these systems.
EHR systems have to evolve and adapt to users’ tasks. They have to become faster, and low error rates must be achieved. Existing infrastructure must be improved and rolled out to users, especially at a time when digital healthcare services are gaining importance.
Data quality in health research encompasses a broad range of aspects and indicators. While some indicators are generic and can be calculated without domain knowledge, others require information about a specific data element. Even more complex are indicators addressing contradictions that stem from implausible combinations of multiple data elements. In this paper, we investigate how contradictions within interdependent categorical data can be identified and whether they give additional information about possible quality issues, their cause, and mitigation options. The 19 data elements that represent four biosample types including their pre-analytic states within the DZHK Biobanking basic set are exported to the CDISC Operational Data Model (ODM), transformed and loaded into a tranSMART instance. Through the implementation of a data quality assessment workflow as a SmartR plug-in, statistical information about the domain-specific consistency of interdependent values is retrieved, assessed, and visualized. Data quality indicators have been selected for the assessment according to common recommendations found in the literature. Different contradictions could be discovered in the dataset, including mismatches of interdependent values in the pre-analytic states of blood and urine samples, as well as of primary and aliquoted samples. The overall assessment rating shows that 99.61% of the interdependent values are free of contradictions. However, measures within the EDC design to avoid contradictions may result in overestimated missing rates in automatic, item-based quality assessment checks. Through consistency checks on interdependent categorical features, we demonstrated that consistency flaws can be found in the categorical data of biobanking metadata and that they can help to detect issues in the data entry process.
Our approach underscores the importance of domain knowledge in the definition of the consistency rules, but also of knowledge about the EDC implementation of such rules, in order to account for their impact on item-based quality indicators.
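A rule-based consistency check of this kind can be sketched as follows. The data element names, their values and the two rules are hypothetical and are not taken from the DZHK Biobanking basic set; the sketch also runs on plain Python dictionaries rather than on a tranSMART instance or a SmartR plug-in.

```python
def find_contradictions(records, rules):
    """Return (record index, rule name) for every violated consistency rule."""
    findings = []
    for index, record in enumerate(records):
        for name, is_consistent in rules.items():
            if not is_consistent(record):
                findings.append((index, name))
    return findings


def consistency_rate(records, rules):
    """Share of rule checks that passed, across all records."""
    checks = len(records) * len(rules)
    if checks == 0:
        return 1.0
    return 1.0 - len(find_contradictions(records, rules)) / checks


# Hypothetical rules on interdependent categorical elements.
rules = {
    "no_blood_means_no_centrifugation": lambda r: (
        r["blood_collected"] == "yes" or r["blood_centrifuged"] == "n/a"
    ),
    "no_primary_means_no_aliquots": lambda r: (
        r["primary_sample"] == "yes" or r["aliquots"] == "no"
    ),
}

records = [
    {"blood_collected": "yes", "blood_centrifuged": "yes", "primary_sample": "yes", "aliquots": "yes"},
    {"blood_collected": "no", "blood_centrifuged": "n/a", "primary_sample": "yes", "aliquots": "no"},
    {"blood_collected": "no", "blood_centrifuged": "yes", "primary_sample": "yes", "aliquots": "yes"},
    {"blood_collected": "yes", "blood_centrifuged": "no", "primary_sample": "no", "aliquots": "no"},
]

contradictions = find_contradictions(records, rules)
rate = consistency_rate(records, rules)
```

In this toy data, the third record claims centrifugation of blood that was never collected, so one of the eight rule checks fails. Writing each rule as a named predicate keeps the domain knowledge explicit and makes it easy to report which rule a record violates.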