Ebook: German Medical Data Sciences: Visions and Bridges
We live in an age characterized by computerized information. Ubiquitous information technology has profoundly changed our healthcare systems, and healthcare professionals who are not adequately trained to deal with it can all too easily be overwhelmed by the complexity and magnitude of the data. This demands new skills from physicians as well as novel ways to provide medical knowledge. Selecting and assessing relevant information presents a challenge which can only be met by bridging the various disciplines in healthcare and the data sciences.
This book presents the proceedings of the 62nd annual meeting of the German Association of Medical Informatics, Biometry and Epidemiology (German Medical Data Sciences – GMDS 2017): Visions and Bridges, held in Oldenburg, Germany, in September 2017. The 242 submissions to the conference included 77 full papers, of which 42 were accepted for publication here after rigorous review. These are divided into 7 sections: teaching and training; epidemiological surveillance, screening and registration; research methods; IT infrastructure for biomedical research/data integration centers; healthcare information systems; interoperability – standards, terminologies, classification; and biomedical informatics, innovative algorithms and signal processing.
The book provides a vision for healthcare in the information age, and will be of interest to all those concerned with improving clinical decision making and the effectiveness and efficiency of health systems using data methods and technology.
We are living in the information age, characterized by the shift from traditional industry to an economy based on information computerization. Ten years after the first iPhone appeared on the market, digitization has become habitual: there is hardly any professional or private area in this world not affected by digitization. This also applies to health care.
Our vision for healthcare in the information age is to improve clinical decision making and the effectiveness and efficiency of health systems by data methods and technology:
• Integrate available data and information for biomedical research.
• Provide information, knowledge and decision support to patients and healthcare professionals.
As medical data scientists, we develop methods for clinical research and provide them to clinical researchers, manufacturers and clinicians. In the information age, we are important actors in the health care systems.
Ubiquitous information technology profoundly changes the sociotechnical system of health care: patients are empowered by information technology, while health care professionals, not being trained to deal with it, are overwhelmed by the complexity and magnitude of the data. This accelerates shared decision-making, which requires other skills from physicians as well as new ways of providing medical knowledge. As of now, the availability of information is not the crucial point. Rather, selecting relevant information and assessing its quality remain a challenge.
Information technology also has an impact on hierarchy and communication paths in hospitals. As with the development, introduction and evaluation of new medical procedures, we must understand all effects on the target system when applying information technology. This is an obligatory prerequisite for determining the benefit and for being able to control the risks of side effects. We have to face the fact that the increasing impact of information technology on patient care is directly linked to our own commitment to apply the principles of evidence-based medicine.
We can only master this challenge together: it is necessary for all stakeholders, scientists, clinicians and patients, to work together in research and health care. We need to bridge!
• Bridging the various disciplines in the Data Sciences
• Bridging data scientists and clinicians
• Bridging different healthcare professionals
• Bridging science and society
• Bridging providers and patients!
GMDS 2017 catalyzes constructing these bridges.
Finally, some data: Two hundred and forty contributions were submitted, among them 77 full papers. These were reviewed in a two-stage interdisciplinary peer-review process: a total of 766 reviews, some of which were very comprehensive, were produced by 186 reviewers. Of the full papers, 42 were accepted for publication in this volume of Studies in Health Technology and Informatics. We cordially thank all authors and reviewers for this work at the scientific core of the conference.
Rainer Röhrig
Antje Timmer
Harald Binder
Ulrich Sax
Systematic health IT evaluation studies are needed to ensure system quality and safety and to provide the basis for evidence-based health informatics. Well-trained health informatics specialists are required to guarantee that health IT evaluation studies are conducted in accordance with robust standards. Also, policy makers and managers need to appreciate how good evidence is obtained by scientific process and used as an essential justification for policy decisions. In a consensus-based approach with over 80 experts in health IT evaluation, recommendations for the structure, scope and content of health IT evaluation courses on the master or postgraduate level have been developed, supported by a structured analysis of available courses and of available literature. The recommendations comprise 15 mandatory topics and 15 optional topics for a health IT evaluation course.
The number of students enrolled in online courses is increasing steadily. Distance education offers many advantages, but also has inherent challenges. Successful distance education needs a thoughtfully designed instructional strategy where students are supported to actively create knowledge. We present the design and evaluation of three online-based courses in health informatics. The courses were based on a collaborative instructional strategy. The evaluation comprised workload analysis, student evaluation, student interviews and student reflections. Students expressed high satisfaction with online learning, despite a high workload, and high perceived learning outcomes. Using the Community of Inquiry framework as reference, we found very high levels of teaching presence, social presence and cognitive presence. In summary, we found that the chosen instructional strategy supported student-centered, collaborative learning. We conclude by presenting lessons learned for online-based instructional design.
Background: In recent years, the interest in user experience (UX) evaluation methods for assessing technology solutions, especially in health systems for children with special needs like cognitive disabilities, has increased.
Objective: Conduct a systematic mapping study to provide an overview in the field of UX evaluations in rehabilitation video games for children.
Methods: The definition of research questions, the search for primary studies and the extraction of those studies by inclusion and exclusion criteria lead to the mapping of primary papers according to a classification scheme.
Results: Main findings from this study include the detection of the target population of the selected studies, the recognition of two different ways of evaluating UX: (i) user evaluation and (ii) system evaluation, and UX measurements and devices used.
Conclusions: This systematic mapping specifies the research gaps identified for future research works in the area.
Introduction: The National Competence Based Catalogue of Learning Objectives for Undergraduate Medical Education (NKLM) describes medical skills and attitudes without being ordered by subjects or organs. Thus, the NKLM enables systematic curriculum mapping and supports curricular transparency. In this paper we describe where learning objectives related to Medical Informatics (MI) in Hannover coincide with other subjects and where they are taught exclusively in MI.
Methods: An instance of the web-based MERLIN-database was used for the mapping process.
Results: In total 52 learning objectives overlapping with 38 other subjects could be allocated to MI. No overlap exists for six learning objectives describing explicitly topics of information technology or data management for scientific research. Most of the overlap was found for learning objectives relating to documentation and aspects of data privacy.
Discussion: The identification of numerous shared learning objectives with other subjects does not mean that other subjects teach the same content as MI. Identifying common learning objectives rather opens up the possibility for teaching cooperations which could lead to an important exchange and hopefully an improvement in medical education.
Conclusion: Mapping of a whole medical curriculum offers the opportunity to identify common ground between MI and other medical subjects. Furthermore, in regard to MI, the interaction with other medical subjects can strengthen its role in medical education.
A Global Solar Ultraviolet Index (UVI) value of 2 is generally linked to the health message ‘You can safely stay outside!’ To examine whether this is sound advice for all skin types and even for prolonged periods spent outside we used erythemal irradiance data of all 136 days during the study period from 2014 till 2016 with such a UVI measured by the German Federal Office for Radiation Protection (BfS) in Munich, Germany. A comparison between the ambient erythemal doses calculated for various time intervals and minimal erythemal doses (MEDs) of the Caucasian skin types I–IV led us to a critical reappraisal of the above health message. Specifically, the message might be misleading if people with a fair complexion want to spend several hours outside, because without any protective measures the doses received can be sufficient to induce erythema. We thus recommend an amendment of the health message related to a safe level of the UVI and, moreover, generally tailoring UVI-related health messages to different skin types. Currently, these messages do not seem to be strictly evidence based, which might be one reason for the unexpected result of our analysis.
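The comparison of ambient erythemal doses with MEDs comes down to simple arithmetic, sketched below. The sketch assumes the standard definition of the UVI as 40 m²/W times the erythemally weighted irradiance, and a constant UVI over the whole interval. The MED values are illustrative placeholders, not figures from the study, and all function names are hypothetical.

```python
# The UVI is defined as 40 m^2/W times the erythemally weighted irradiance (W/m^2).
UVI_SCALE_M2_PER_W = 40.0

# ASSUMPTION: illustrative minimal erythemal doses (J/m^2) per skin type.
# These are placeholder values for demonstration, not the study's figures.
ASSUMED_MED_J_PER_M2 = {"I": 200.0, "II": 250.0, "III": 350.0, "IV": 450.0}


def erythemal_irradiance(uvi: float) -> float:
    """Convert a UVI value to erythemally weighted irradiance in W/m^2."""
    return uvi / UVI_SCALE_M2_PER_W


def ambient_dose(uvi: float, hours: float) -> float:
    """Erythemal dose in J/m^2 for a constant UVI over the given duration."""
    return erythemal_irradiance(uvi) * hours * 3600.0


def hours_to_one_med(uvi: float, med_j_per_m2: float) -> float:
    """Hours of unprotected exposure at a constant UVI needed to reach one MED."""
    return med_j_per_m2 / erythemal_irradiance(uvi) / 3600.0


if __name__ == "__main__":
    for skin_type, med in ASSUMED_MED_J_PER_M2.items():
        hours = hours_to_one_med(2.0, med)
        print(f"Skin type {skin_type}: about {hours:.1f} h at UVI 2 to reach 1 MED")
```

Under these assumed values, the time needed to reach one MED at a UVI of 2 is on the order of an hour or more, which is consistent with the abstract's concern about stays of several hours for people with a fair complexion.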
Background: Routine data analyses are becoming increasingly important for health policy decision making. However, such databases often vary in data quality, completeness and accessibility. The aim of this study is to describe the quality of a large outpatient healthcare database and the process of data extraction, and to give a brief overview of the data structure, focusing on provider type and disease severity, using the treatment of depressive disorders as an example.
Method: The quality of the database is described, and diagnosis rates of depression in outpatient care (ICD-10 diagnoses F32/F33) in relation to provider type (i.e. general or somatic physicians vs. physicians specialized in mental health vs. psychotherapists) were calculated using Cramér's V as a measure of effect size.
Results: The database consisted of 2,383,672 cases from 2015. Most depressive patients were diagnosed and treated by general or somatic physicians. A clear relationship between the severity of depression and provider-type is shown. In contrast to psychotherapists or physicians specialized in mental-health, general or somatic physicians diagnose a higher rate of unspecified depressive episodes.
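Cramér's V, the effect size named in the Method section, can be computed from a contingency table as sketched below. The sketch is a plain-Python illustration of the textbook formula, V = sqrt(chi² / (n · (min(rows, cols) − 1))), with made-up counts. It is not the study's analysis code, and the function name is hypothetical.

```python
import math


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts.

    Assumes every row total and column total is non-zero and that the
    table has at least two rows and two columns.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))


if __name__ == "__main__":
    # HYPOTHETICAL counts: rows = provider types, columns = severity codes.
    example = [[120, 60, 20], [40, 70, 40], [10, 30, 60]]
    print(f"Cramér's V = {cramers_v(example):.3f}")
```

V ranges from 0 (no association) to 1 (perfect association), which makes it comparable across tables of different sizes.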
Phenotyping, or the identification of patient cohorts, is a recurring challenge in medical informatics. While there are open source tools such as i2b2 that address this problem by providing user-friendly querying interfaces, these platforms lack the semantic expressiveness to model complex phenotyping algorithms. The Arden Syntax provides procedural programming language constructs designed specifically for medical decision support and knowledge transfer. In this work, we investigate how language constructs of the Arden Syntax can be used for generic phenotyping. We implemented a prototypical tool to integrate i2b2 with an open source Arden execution environment. To demonstrate the applicability of our approach, we used the tool together with an Arden-based phenotyping algorithm to derive statistics about ICU-acquired hypernatremia. Finally, we discuss how the combination of i2b2's user-friendly cohort pre-selection and Arden's procedural expressiveness could benefit phenotyping.
Information retrieval is a major challenge in medical informatics. Various research projects have worked on this task in recent years on an institutional level by developing tools to integrate and retrieve information. However, when it comes down to querying such data across institutions, the challenge persists due to the high heterogeneity of data and differences in software systems. The German Biobank Node (GBN) project faced this challenge when trying to interconnect four biobanks to enable distributed queries for biospecimens. All biobanks had already established integrated data repositories, and some of them were already part of research networks. Instead of developing another software platform, GBN decided to form a bridge between these. This paper describes and discusses a core component from the GBN project, the OmniQuery library, which was implemented to enable on-the-fly query translation between heterogeneous research infrastructures.
Efficient and powerful information systems are essential for performing medical research projects successfully. Translational medicine in particular poses specific challenges to the corresponding IT infrastructure. The RESIST study is a translational research project in oncology in which xenografts inform patients' second-line treatment. DBFORM, an in-house developed system, was used as the EDC system and was enhanced with project-specific features. We demonstrate how the CIPROS checklist has the potential to optimize the related requirements engineering process. The CIPROS checklist consists of 72 items organized within 12 aspects/topics and was developed to assess such patient registry software systems. In this paper we use the CIPROS checklist (1) to elucidate the project's requirements and (2) to assess the system's features. The application of the CIPROS checklist to fix the RESIST project requirements and system features was successful. The interplay between (1) and (2) helped to accelerate the requirements engineering process and to set up a system suitable for performing the translational research project successfully.
Introduction: Diagnostic diversity has been in the focus of several studies of health services research. As the fraction of people with statutory health insurance changes with age and gender it is assumed that diagnostic diversity may be influenced by these parameters.
Methods: We analyze the fractions of patients in outpatient treatment in Schleswig-Holstein by ICD-10 chapter for quarter 2/2016, with respect to the age and gender/sex of the patient. In a first approach we analyzed which diagnosis chapters are most relevant depending on age and gender. To detect diagnostic diversity, we finally applied Shannon's entropy measure. Due to multimorbidity we used different standardizations.
Results: Shannon entropy strongly increases for women after the age of 15, reaching a limit level at the age of 50 years. Between 15 and 70 years we get higher values for women, after 75 years for men.
Discussion: This article describes a straightforward, pragmatic approach to diagnostic diversity using Shannon's entropy. From a methodological point of view, the use of Shannon's entropy as a measure of diversity should attract more attention from researchers in health services research.
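As an illustration of the measure, the sketch below computes Shannon's entropy, H = −Σ p_i log p_i, over the shares of diagnoses per ICD-10 chapter in one age/sex group. Normalizing by the total number of diagnoses is only one possible standardization and is an assumption here, as the abstract does not specify which standardizations were used. The counts and function name are hypothetical.

```python
import math


def shannon_entropy(counts, base=2.0):
    """Shannon entropy of a distribution given as non-negative counts.

    Counts are normalized to shares by their total (an ASSUMED
    standardization); zero counts contribute nothing to the sum.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one count must be positive")
    entropy = 0.0
    for count in counts:
        if count > 0:
            share = count / total
            entropy -= share * math.log(share, base)
    return entropy


if __name__ == "__main__":
    # HYPOTHETICAL diagnosis counts per ICD-10 chapter for two groups.
    concentrated = [900, 50, 30, 20]
    spread_out = [250, 250, 250, 250]
    print(f"concentrated: {shannon_entropy(concentrated):.3f} bits")
    print(f"spread out:   {shannon_entropy(spread_out):.3f} bits")
```

Entropy is lowest when diagnoses concentrate in a single chapter and highest when they are spread evenly, which is what makes it usable as a diversity measure.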
Background: Benchmarking and guidance of outpatient physicians in Germany are almost always based on one-year data. This also holds true for morbidity related groups (MRG), a classification system applied in northern Germany since 2017. A study of the Markov properties of prescription-based grouping algorithms is reported here.
Results: There is a strongly connected graph for almost all components, and the resulting Markov chain has a unique stationary solution.
Conclusions: Target values based on the status quo of prescription behavior can provide stable guidelines for outpatient physicians. Every set of partitions converging like MRG should be considered for controlling measures.
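The notion of a stationary solution can be made concrete with a small sketch. It uses power iteration on a row-stochastic transition matrix, which is an assumed technique for illustration, not necessarily the one used in the study. Power iteration converges when the chain is irreducible (strongly connected) and aperiodic. The transition matrix and function name are hypothetical.

```python
def stationary_distribution(transition, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution pi with pi = pi * P.

    `transition` is a row-stochastic matrix (list of rows summing to 1).
    Convergence is assumed to require an irreducible, aperiodic chain.
    """
    n = len(transition)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: new_pi[j] = sum_i pi[i] * P[i][j].
        new_pi = [sum(pi[i] * transition[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new_pi, pi)) < tol:
            return new_pi
        pi = new_pi
    raise RuntimeError("power iteration did not converge")


if __name__ == "__main__":
    # HYPOTHETICAL two-group chain: yearly transition probabilities
    # between two classification groups.
    P = [[0.9, 0.1],
         [0.5, 0.5]]
    print(stationary_distribution(P))
```

For this example the balance condition pi_0 · 0.1 = pi_1 · 0.5 gives pi = (5/6, 1/6), independent of the starting distribution.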
The efficient use of routine data for biomedical research presupposes an IT infrastructure designed for health care facilities. The objective of this study was to analyse which IT infrastructure is used in hospitals and in general practitioners' (GP) practices in the Oldenburg-Bremen region and to examine how well this supports research projects. To this end, IT managers and GPs were interviewed. The usage of hospital information systems (HIS) and data warehouse systems (DWS) in hospitals is of major importance for the study. Over 90 % use DWS for administration, 42 % for clinical research. None of the hospitals had implemented consent for the use of routine data for research. Only a third of the GPs have participated in studies. The office-based EHR systems used by GPs offer virtually no support for research projects. The study results demonstrate that technical and organisational measures are required for the further usage of routine data in the region.
Objective: OpenClinica Input Completion (OIC) was developed to increase the efficiency of entering drugs in eCRFs in OpenClinica®. The aim of the study was to evaluate its impact on efficiency and data quality as well as its usability.
Methods: 20 participants were asked to input 15 drugs with the new tool and by hand.
Results: The mean input time decreased from 16:12 min to 3:59 min. Of 300 manually entered medication data sets, 31 (10%) had one or more errors, versus 10 of 300 (3.3%) data sets entered with OIC.
Conclusion: OIC was able to increase efficiency and data quality. We conclude that new additions to the graphical user interface in electronic case report form (eCRF) systems should be validated before usage in research projects.
There is a need among researchers for the easy discoverability of biobank samples. Currently, there is no uniform way of finding samples and negotiating access. Instead, researchers have to communicate with each biobank separately. We present the architecture for the BBMRI-CS IT platform, whose goal is to facilitate sample location and access. We chose a decentralized approach, which allows for strong data protection and provides the high flexibility needed in the highly heterogeneous landscape of European biobanks. This is the first implementation of a decentralized search in the biobank field. With the addition of a Negotiator component, it also allows for easy communication and a follow-through of the lengthy approval process for accessing samples.
Extraction of structured data from textual reports is an important subtask for building medical data warehouses for research and care. Many medical and most radiology reports are written in a telegraphic style with a concatenation of noun phrases describing the presence or absence of findings. Therefore a lexico-syntactical approach is promising, where key terms and their relations are recognized and mapped on a predefined standard terminology (ontology). We propose a two-phase algorithm for terminology matching: In the first pass, a local terminology for recognition is derived as close as possible to the terms used in the radiology reports. In the second pass, the local terminology is mapped to a standard terminology. In this paper, we report on an algorithm for the first step of semi-automatic generation of the local terminology and evaluate the algorithm with radiology reports of chest X-ray examinations from Würzburg university hospital. With an effort of about 20 hours work of a radiologist as domain expert and 10 hours for meetings, a local terminology with about 250 attributes and various value patterns was built. In an evaluation with 100 randomly chosen reports it achieved an F1-Score of about 95% for information extraction.
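For readers unfamiliar with the reported metric, the F1-score is the harmonic mean of precision and recall. The sketch below shows the calculation from counts of true positives, false positives and false negatives. The counts are hypothetical and do not come from the evaluation described above.

```python
def precision_recall_f1(true_pos: int, false_pos: int, false_neg: int):
    """Return (precision, recall, F1) for extraction results.

    true_pos:  extracted items that are correct
    false_pos: extracted items that are wrong
    false_neg: correct items that were missed
    """
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # HYPOTHETICAL counts, chosen only to illustrate the formula.
    p, r, f1 = precision_recall_f1(true_pos=95, false_pos=5, false_neg=5)
    print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Because the harmonic mean is dominated by the smaller of the two values, a high F1-score requires both few spurious extractions and few missed findings.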
Health IT adoption research is rooted in Rogers' Diffusion of Innovation theory, which is based on longitudinal analyses. However, many studies in this field use cross-sectional designs. The aim of this study therefore was to design and implement a system to (i) consolidate survey data sets originating from different years (ii) integrate additional secondary data and (iii) query and statistically analyse these longitudinal data. Our system design comprises a 5-tier-architecture that embraces tiers for data capture, data representation, logics, presentation and integration. In order to historicize data properly and to separate data storage from data analytics a data vault schema was implemented. This approach allows the flexible integration of heterogeneous data sets and the selection of comparable items. Data analysis is prepared by compiling data in data marts and performed by R and related tools. IT Report Healthcare data from 2011, 2013 and 2017 could be loaded, analysed and combined with secondary longitudinal data.
In recent years, clinical data warehouses (CDW) storing routine patient data have become more and more popular to support scientific work in the medical domain. Although CDW systems provide interfaces to import new data, these interfaces have to be used by processing tools that are often not included in the systems themselves. In order to establish an extraction-transformation-load (ETL) workflow, already existing components have to be taken or new components have to be developed to perform the load part of the ETL. We present a customizable importer for the two CDW systems PaDaWaN and I2B2, which is able to import the most common import formats (plain text, CSV and XML files). In order to be run, the importer only needs a configuration file with the user credentials for the target CDW and a list of XML import configuration files, which determine how already exported data is intended to be imported. The importer is provided as a Java program, which has no further software requirements.
Due to the increasing use of electronic data capture systems for clinical research, the interest in saving resources by automatically generating and reusing case report forms in clinical studies is growing. OpenClinica, an open-source electronic data capture system, enables the reuse of metadata in its own Excel import template, hampering the reuse of metadata defined in other standard formats. One of these standard formats is the Operational Data Model for metadata, administrative and clinical data in clinical studies. This work suggests a mapping from the Operational Data Model to OpenClinica and describes the implementation of a converter to automatically generate OpenClinica-conformant case report forms based upon metadata in the Operational Data Model.
Cross-institutional biobank networks hold the promise of supporting medicine by enabling the exchange of associated samples for research purposes. Various initiatives, such as BBMRI-ERIC and German Biobank Node (GBN), aim to interconnect biobanks for enabling the compilation of joint biomaterial collections. However, building software platforms to facilitate such collaboration is challenging due to the heterogeneity of existing biobank IT infrastructures and the necessary efforts for installing and maintaining additional software components. As a remedy, this paper presents the concept of a hybrid network for interconnecting already existing software components commonly found in biobanks and a proof-of-concept implementation of two prototypes involving four biobanks of the German Biobank Node. Here we demonstrate the successful bridging of two IT systems found in many German biobanks – Samply and i2b2.