Pengwei Hu, Eryu Xia, Shochun Li, Xin Du, Changsheng Ma, Jianzeng Dong, Keith C.C. Chan
1480 - 1481
The low proportion and rapid evolution of major adverse cardiac events (MACE) make them challenging to predict with machine learning models. In this paper, we propose a method to predict MACE from large-scale imbalanced EMR data by using a network-based one-class classifier. The classifier uses only the reliably known MACE samples to establish the hyperspherical model. Experiments show that our model outperforms the state-of-the-art models.
Drug combination therapy can improve drug efficacy, reduce drug dosage, and overcome drug resistance. Many studies have focused on predicting synergistic drug combinations. However, existing methods fail to fully consider the heterogeneous characteristics of drugs, which makes it difficult to identify effective drug combinations. Therefore, we propose a new integrated prediction model based on deep representations that integrates information from multiple domains to accurately and effectively predict drug combinations.
Yanqun Huang, Ni Wang, Honglei Liu, Hui Zhang, Xiaolu Fei, Lan Wei, Hui Chen
1484 - 1485
A comprehensive scheme of patient similarity based on different types of patient features and the corresponding similarity measurements was proposed. Patient similarity was used in building a predictive model where training samples similar to the index patient were selected instead of randomly selected samples. The predictive models using the proposed patient similarity measurement outperformed those using Euclidean distance based similarity and those not using patient similarity.
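The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' scheme: the per-feature measures (exact match for categorical values, range-scaled closeness for numeric values), the unweighted average, and every name and value in the example are assumptions made for the sketch.

```python
from typing import Dict, List, Union

Value = Union[float, str]
Patient = Dict[str, Value]


def feature_similarity(a: Value, b: Value, value_range: float = 1.0) -> float:
    """Similarity in [0, 1] for one feature (assumed measures, see lead-in)."""
    if isinstance(a, str) or isinstance(b, str):
        # Categorical feature: 1 for an exact match, 0 otherwise.
        return 1.0 if a == b else 0.0
    if value_range <= 0:
        return 1.0
    # Numeric feature: closeness scaled by the feature's expected range.
    return max(0.0, 1.0 - abs(a - b) / value_range)


def patient_similarity(p: Patient, q: Patient, ranges: Dict[str, float]) -> float:
    """Unweighted mean of the per-feature similarities over the features of p."""
    scores = [feature_similarity(p[k], q[k], ranges.get(k, 1.0)) for k in p]
    return sum(scores) / len(scores)


def select_similar(index: Patient, pool: List[Patient],
                   ranges: Dict[str, float], k: int) -> List[Patient]:
    """Return the k training patients most similar to the index patient."""
    ranked = sorted(pool, key=lambda q: patient_similarity(index, q, ranges),
                    reverse=True)
    return ranked[:k]


# Hypothetical patients, invented purely for illustration.
index_patient = {"age": 60, "sex": "F", "diagnosis": "I21"}
pool = [
    {"age": 62, "sex": "F", "diagnosis": "I21"},
    {"age": 30, "sex": "M", "diagnosis": "J18"},
    {"age": 58, "sex": "M", "diagnosis": "I21"},
]
ranges = {"age": 50.0}  # assumed plausible span of ages in the cohort
training_set = select_similar(index_patient, pool, ranges, k=2)
```

The predictive model would then be trained on `training_set` rather than on a random sample of the pool.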
Terminology binding has been adopted in the Hong Kong Hospital Authority to link terminology components to clinical information models. This linkage facilitates structured data capture and streamlines clinical workflow. With information models and data representation standards in place, data interoperability and data integration can be maintained for seamless patient care delivery.
Vojtech Huser, Xiaochun Li, Zuoyi Zhang, Sungjae Jung, Rae Woong Park, Juan Banda, Hanieh Razzaghi, Ajit Londhe, Karthik Natarajan
1488 - 1489
Large healthcare datasets of Electronic Health Record data have become indispensable in clinical research. Data quality in such datasets has recently become a focus of many distributed research networks. Although data quality is specific to a given research question, many existing data quality platforms show that general data quality assessment at the dataset level (given a spectrum of research questions) is possible and highly requested by researchers. We present a comparison of 12 datasets and an extension of the Achilles Heel data quality software tool with new rules and data characterization measures.
Maria I. Restrepo, Mary C. McGrath, Indra Neil Sarkar, Elizabeth S. Chen
1490 - 1491
Statistical analysis of Medical Subject Headings (MeSH) descriptors to improve biomedical literature search is an active research area. Existing tools have limited interactive visualizations that are accessible to researchers investigating how their hypotheses compare to trends in the research literature. We present a web application that computes and provides an interactive visualization of basic frequencies and co-occurrence statistics of MeSH descriptors associated with a PubMed query.
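The basic frequency and co-occurrence statistics mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: each article is represented as a plain list of descriptor strings, the example lists are invented, and no PubMed retrieval is shown. It is not the web application's implementation.

```python
from collections import Counter
from itertools import combinations


def mesh_statistics(articles):
    """Count descriptor frequencies and pairwise co-occurrences.

    `articles` is an iterable of descriptor lists, one list per article.
    Each descriptor is counted at most once per article.
    """
    freq = Counter()
    cooc = Counter()
    for descriptors in articles:
        unique = sorted(set(descriptors))
        freq.update(unique)
        # Sorted input makes every pair appear in one canonical order.
        cooc.update(combinations(unique, 2))
    return freq, cooc


# Hypothetical descriptor lists standing in for the results of a query.
articles = [
    ["Humans", "Diabetes Mellitus", "Insulin"],
    ["Humans", "Insulin"],
    ["Humans", "Diabetes Mellitus"],
]
freq, cooc = mesh_statistics(articles)
```

Because pairs are stored in sorted order, a lookup such as `cooc[("Diabetes Mellitus", "Humans")]` needs its two descriptors in alphabetical order.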
By applying the Bayesian network method to the clinical registry database J-DREAMS (Japan Diabetes compREhensive database project based on an Advanced electronic Medical record System), we have developed a reference model that summarizes the exploration of the patient group’s state and facilitates a bird’s-eye view. This visualization method would help registry researchers to screen the registry database.
We performed a cohort study to quantify the association between rheumatoid arthritis (RA) and the risk of acute myocardial infarction (AMI). ICD-9 codes were used to identify AMI and RA patients, and a Cox proportional hazards model adjusted for confounding factors was used to quantify the risk. The adjusted hazard ratio (aHR) of AMI for RA patients was 1.05 (95% CI 1.01–1.09). We found that RA was associated with an increased risk of AMI.
Eluizio H. Saraiva Barretto, Diogo F. da Costa Patrao, Márcia Ito
1496 - 1497
TUSS is a Brazilian health procedure standard used by supplementary health providers. Currently, there is no available mapping between TUSS and other standards. In this paper, we analyze the performance of two term-weighting algorithms when classifying TUSS procedure descriptions. TF-IDF classified 99% of chapters, 89% of groups, and 33% of subgroups; Doc2Vec classified 65%, 43%, and 33%, respectively, showing that these algorithms can support the creation of an accurate mapping between those procedure standards.
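How TF-IDF weighting can drive the classification of a procedure description is sketched below. The abstract does not state which classifier was used, so the nearest-neighbour rule with cosine similarity is an assumption, as are the smoothed IDF formula, the whitespace tokenizer, and the example descriptions and labels, which are invented and are not TUSS entries.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf_vector(text, idf):
    """L2-normalised TF-IDF vector as a dict; terms unseen in the corpus are dropped."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(u, v):
    # Both vectors are already normalised, so the dot product is the cosine.
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def classify(text, labelled, idf):
    """Return the label of the most similar labelled description.

    If no query term is known, all similarities are 0 and the first
    labelled entry wins, so callers may want to handle that case.
    """
    query = tfidf_vector(text, idf)
    best = max(labelled, key=lambda item: cosine(query, tfidf_vector(item[0], idf)))
    return best[1]


# Hypothetical descriptions and labels, for illustration only.
labelled = [
    ("chest x-ray two views", "imaging"),
    ("abdominal ultrasound", "imaging"),
    ("complete blood count", "laboratory"),
    ("blood glucose test", "laboratory"),
]
idf = fit_idf([description for description, _ in labelled])
label = classify("blood count", labelled, idf)
```

In a hierarchy such as chapter, group, and subgroup, the same routine could be applied once per level with that level's labels.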
We aimed to develop rhabdomyolysis (RB) phenotyping algorithms using machine learning techniques and to create subphenotyping algorithms to identify RB patients who lack RB diagnosis. Two pattern algorithms, one with a focus on improving predictive value and one focused on improving sensitivity, were finally created and had a high area under the curve value of 0.846. Although we were unable to create subphenotyping algorithms, an attempt to detect unknown RB patients is important for epidemiological studies.
Amin Jalali, Paul Johannesson, Erik Perjons, Ylva Askfors, Abdolazim Rezaei Kalladj, Tero Shemeikka, Anikó Vég
1500 - 1501
Janusmed is a clinical decision support system, developed by the Stockholm County Council, that supports physicians in identifying drug-drug interactions. To determine how Janusmed is used in and affects clinical practice, an evaluation study is currently being carried out that analyzes multiple data sources through descriptive statistics. The study focuses on how Janusmed affects the behavior of physicians, in particular, to what extent physicians reconsider their prescription decisions based on warnings from Janusmed.
Guoqian Jiang, Yue Yu, Paul R. Kingsbury, Nilay Shah
1502 - 1503
The objective of the study is to augment safety and effectiveness evaluation of medical devices through building a reusable unique device identifier (UDI) interoperability solution. We propose a framework for building a UDI research database for medical device evaluation using the OHDSI common data model (CDM). As a pilot study, we design, develop and evaluate a UDI vocabulary, which would enable tackling challenges of data islands and standardization for medical device evaluation.
The study was done to validate the real-time efficacy of a customised algorithm in detecting diabetic retinopathy (DR) among diabetic patients examined at the vitreo-retinal outpatient department (VR OPD) of a tertiary care hospital. The algorithm showed a sensitivity of 79% and a specificity of 57%, which makes it an acceptable method for diagnosing diabetic retinopathy and avoiding unnecessary referrals.
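For reference, sensitivity and specificity follow directly from the confusion-matrix counts, as the short sketch below shows. The counts are hypothetical, chosen only so that the result matches the reported percentages. They are not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts: 100 patients with DR and 100 without.
sens, spec = sensitivity_specificity(tp=79, fn=21, tn=57, fp=43)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With these illustrative counts, 43 of 100 patients without DR would still be flagged, which is what a specificity of 57% means in practice.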
Sun Jung Lee, Sung Hye Yu, Yejin Kim, Jun Hyuk Hong, Choung-Soo Kim, Seong Il Seo, Chang Wook Jeong, Seok-Soo Byun, Byung Ha Chung, Ji Youl Lee, In Young Choi
1506 - 1507
In this study, we built a multi-center integrated database platform for localized prostate cancer and developed a biochemical recurrence (BCR) prediction system with a Gradient Boosted Regression model using the Korean Prostate Cancer Registry (KPCR) database. This platform will facilitate clinical management of patients with prostate cancer, and it will also help develop appropriate treatment of prostate cancer.
Gaetan Kamdje-Wabo, Tobias Gradinger, Matthias Löbe, Robert Lodahl, Susanne Andrea Seuchter, Ulrich Sax, Thomas Ganslandt
1508 - 1509
The Demonstrator study aims to analyse comorbidities and rare diseases among patients from German university hospitals within the German Medical Informatics Initiative. This work aimed to design and determine the feasibility of a model to assess the quality of the claims data used in the study. Several data quality issues were identified, affecting a small number of cases at one of the participating sites. As a next step, an extension to all participating sites is planned.
Suranga N. Kasthurirathne, Gregory Dexter, Shaun J. Grannis
1510 - 1511
We leverage Generative Adversarial Networks (GANs) to produce synthetic free-text medical data with low re-identification risk, and apply these data to replicate machine learning solutions. We trained GAN models to generate free-text cancer pathology reports. Decision models trained using synthetic datasets reported performance metrics that were statistically similar to those of models trained using original test data. Our results further the use of GANs to generate synthetic data for collaborative research and re-use of machine learning models.
Faiza Khan Khattak, Serena Jeblee, Noah Crampton, Muhammad Mamdani, Frank Rudzicz
1512 - 1513
We present AutoScribe, a system for automatically extracting pertinent medical information from dialogues between clinicians and patients. AutoScribe parses the dialogue and extracts entities such as medications and symptoms, using context to predict which entities are relevant, and automatically generates a patient note and primary diagnosis.
Currently, the Common Data Model (CDM) for primary use (as distinct from models designed for secondary use) poorly supports clinical decision-making and medical process analyses. We designed a CDM featuring a search flow that identifies facts after defining the clinical process, and we make the data definition language of the CDM employed freely available as open source.
Ann-Kristin Kock-Schoppenhauer, Philipp Bruland, Dennis Kadioglu, Dominik Brammen, Hannes Ulrich, Kerstin Kulbe, Petra Duhm-Harbeck, Josef Ingenerf
1516 - 1517
Scientific challenges based on benchmark data enable the comparison and evaluation of different algorithms and take place regularly in scientific disciplines like medical image processing, text mining or genetics. The idea of a challenge is rarely applied within the eHealth community. Mappathon is a metadata mapping challenge that asks for methods to find corresponding data elements within similar datasets and to correlate data elements among each other.
Laboratory test results have potential for secondary use. Because each healthcare facility has its own laboratory test codes, test code mapping is required to support laboratory technicians. Automatic code mapping can reduce the burden of manual mapping during data preparation. The authors developed a semi-automatic mapping support system that uses the newest test results generated in the electronic health record.
In this study, we developed an ontology for accessing online health information related to pregnancy. Social media data and the categories in the literature on pregnancy information were used to collect terms for identifying classes and the class hierarchy. The developed ontology included 241 classes and 788 synonyms, with six superclasses. This ontology can be used to provide appropriate information based on a needs assessment.
Raphael Lenain, Martin G. Seneviratne, Selen Bozkurt, Douglas W. Blayney, James D. Brooks, Tina Hernandez-Boussard
1522 - 1523
Clinical and pathological stage are defining parameters in oncology, which direct a patient’s treatment options and prognosis. Pathology reports contain a wealth of staging information that is not stored in structured form in most electronic health records (EHRs). Therefore, we evaluated three supervised machine learning methods (Support Vector Machine, Decision Trees, Gradient Boosting) to classify free-text pathology reports for prostate cancer into T, N and M stage groups.
Named entity recognition in electronic medical records is of great significance to the construction of medical knowledge maps. This paper proposes a model of bidirectional Long Short-Term Memory with a conditional random field layer (BiLSTM-CRF). In simultaneously identifying 5 types of clinical entities from the CCKS2018 Chinese EHR corpus, the BiLSTM-CRF model achieved better performance than the baseline CRF model (F-score of 84.23% vs 82.49%).
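The F-score used for such comparisons can be computed at the entity level as sketched below. The abstract does not state the matching criterion, so exact matching on (start, end, type) is an assumption, and the spans and type labels in the example are invented for illustration.

```python
def entity_f_score(gold, predicted):
    """Entity-level precision, recall, and F1 with exact (start, end, type) matching."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: (start offset, end offset, entity type).
gold = [(0, 2, "symptom"), (5, 8, "drug"), (10, 12, "anatomy")]
predicted = [(0, 2, "symptom"), (5, 8, "anatomy"), (10, 12, "anatomy")]
precision, recall, f1 = entity_f_score(gold, predicted)
```

In this example the second prediction has the right span but the wrong type, so it counts as both a false positive and a missed gold entity.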
Retrospective analysis of fall incident reports can uncover hidden information, identify potential risk factors, and improve healthcare quality. This study explores potential fall incident clusters using word embeddings and hierarchical clustering. Fall incident reports from 7 local hospitals in Hong Kong were grouped into 5 potential clusters with significantly different fall severity, gender, reporting department, and keywords. This study demonstrates the feasibility of using text clustering methods to mine real-world fall incident reports.
Matthias Löbe, Oya Beyan, Sebastian Stäubert, Frank Meineke, Danny Ammon, Alfred Winter, Stefan Decker, Markus Löffler, Toralf Kirsten
1528 - 1529
Secondary use of electronic health record (EHR) data requires a detailed description of metadata, especially when data collection and data re-use are organizationally and technically far apart. This paper describes the concept of the SMITH consortium that includes conventions, processes, and tools for describing and managing metadata using common standards for semantic interoperability. It deals in particular with the chain of processing steps of data from existing information systems and provides an overview of the planned use of metadata, medical terminologies, and semantic services in the consortium.