Standardization is essential for information sharing among different health care institutions. Our objective was to identify the essential oral health attributes to include in an electronic health record for primary care. This action research study used a Definer Group, which organized the selected attributes as a mind map with four main pillars: Data Collection, Diagnosis, Care Plan and Evaluation. This research applied the practice of knowledge leveling, favoring the interaction of dental specialties and the identification of attributes.
K. Bretonnel Cohen, Lawrence E. Hunter, Peter S. Pressman
1433 - 1434
“P-hacking” is the repeated analysis of data until a statistically significant result is achieved. We show that p-hacking can also occur during data generation, sometimes unintentionally. We use the type-token ratio to demonstrate that differences in the definitions of “type” and “token” can produce significantly different results. Since these terms are rarely defined in the biomedical literature, the result is an inability to meaningfully interpret the body of literature that makes use of this measure.
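The effect described above can be illustrated with a minimal sketch. The two tokenization choices (case folding, counting punctuation marks as tokens) and the sample sentence are assumptions made for illustration; they are not the definitions examined by the authors.

```python
import re

def type_token_ratio(text, lowercase, keep_punctuation):
    """Return (number of types) / (number of tokens) under one definition of each."""
    # Assumption: a token is a run of word characters, optionally plus
    # each punctuation mark counted as its own token.
    pattern = r"\w+|[^\w\s]" if keep_punctuation else r"\w+"
    tokens = re.findall(pattern, text)
    if lowercase:
        # Assumption: a type is a distinct token string after case folding.
        tokens = [t.lower() for t in tokens]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

text = "The patient denies pain. The pain is absent, the patient says."

# Definition A: case-sensitive types, punctuation ignored.
ttr_a = type_token_ratio(text, lowercase=False, keep_punctuation=False)
# Definition B: case-folded types, punctuation marks counted as tokens.
ttr_b = type_token_ratio(text, lowercase=True, keep_punctuation=True)

print(f"Definition A: {ttr_a:.3f}")
print(f"Definition B: {ttr_b:.3f}")
```

The same text yields two different ratios, so a reported type-token ratio is only interpretable when both definitions are stated.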
Tobias Bronsch, Ruwen Böhm, Claudia Bulin, Björn Bergh, Björn Schreiweis
1435 - 1436
Integrating data from various source systems to gain knowledge and meaningful data about patients for care and research is challenging. This work demonstrates how medication knowledge data from the database of the Federal Union of German Associations of Pharmacists (ABDA) can be used for storing and annotating medicinal products in an openEHR medication archetype.
N. Cassim, M. Mapundu, V. Olago, J.A. George, D.K. Glencross
1437 - 1438
Prostate cancer (PCa) data is of public health importance in South Africa. Biopsy data is recorded as semi-structured narrative text that is not easily analysed. We report a pilot study that applied predictive analytics and text mining techniques to extract prognostic information that guides patient management. In particular, Gleason scores (GS) reported in a number of formats were extracted successfully. We found that predominantly older men were diagnosed with PCa with a high-risk GS (8-10). Where cell differentiation was reported, 64% of biopsies reported poor differentiation. The approaches demonstrated in our study should be extended to a larger dataset to assess whether they have the potential to scale up to the national level.
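As a rough illustration of extracting a score written in several formats, the sketch below uses regular expressions. The patterns, the example phrasings and the function names are hypothetical; the abstract does not state which formats occur in the reports or which extraction technique was used. Only the high-risk range (8-10) is taken from the text.

```python
import re

# Hypothetical pattern for "primary + secondary" grades, e.g. "Gleason score 4+3=7".
SUM_PATTERN = re.compile(r"gleason[^0-9]{0,20}(\d)\s*\+\s*(\d)", re.IGNORECASE)
# Hypothetical pattern for a bare total, e.g. "Gleason 7 (3+4)" or "GS: 8".
TOTAL_PATTERN = re.compile(
    r"(?:gleason(?:\s+score)?|\bGS)\s*[:=]?\s*(\d{1,2})\b", re.IGNORECASE
)

def extract_gleason(text):
    """Return the Gleason score found in the text, or None if none is found."""
    match = SUM_PATTERN.search(text)
    if match:
        return int(match.group(1)) + int(match.group(2))
    match = TOTAL_PATTERN.search(text)
    if match:
        score = int(match.group(1))
        # Totals outside 2-10 are treated as extraction errors.
        return score if 2 <= score <= 10 else None
    return None

def is_high_risk(score):
    """High-risk range (8-10) as stated in the abstract."""
    return score is not None and 8 <= score <= 10

for report in ["Gleason score 4+3=7", "Gleason 7 (3+4)", "GS: 8", "No malignancy seen"]:
    score = extract_gleason(report)
    print(report, "->", score, "high risk" if is_high_risk(score) else "")
```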
José Castaño, Hee Park, Pilar Ávila, David Pérez, Hernán Berinsky, Laura Gambarte, Carlos Otero, Daniel Luna
1439 - 1440
Clinical terms are noisy descriptions typed in Spanish by healthcare professionals in the electronic health record (EHR) system. We describe an evaluation of a terminology search engine that extends SNOMED CT, together with an approach that uses historical data of clinical terms. We show how to measure precision and recall using historical search data, and we show how the performance of the search engine can be improved significantly using the technology available in the search engine.
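For reference, precision and recall for a single query can be computed as in the sketch below. How the set of relevant terms is derived from historical search data is not described in the abstract; here it is simply assumed to be given, and all identifiers are hypothetical.

```python
def precision_recall(returned, relevant):
    """Compute precision and recall for one query.

    returned: terms returned by the search engine for the query.
    relevant: terms judged relevant for the query (assumed here to be
              derived from historical data; the derivation is not specified).
    """
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical term identifiers for one query.
returned_terms = ["t1", "t2", "t3", "t4"]
relevant_terms = ["t1", "t3", "t5"]

precision, recall = precision_recall(returned_terms, relevant_terms)
print(f"precision={precision:.2f} recall={recall:.2f}")
```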
L. Chiudinelli, M. Gabetta, G. Centorrino, N. Viani, C. Tasca, A. Zambelli, M. Bucalo, A. Ghirardi, N. Barbarini, E. Sfreddo, C. Tondini, R. Bellazzi, L. Sacchi
1441 - 1442
Unstructured clinical notes contain a huge amount of information. We investigated the possibility of harvesting such information through an NLP-based approach. A manually curated ontology is the only resource required to handle all the steps of the process leading from clinical narrative to a structured data warehouse (i2b2). We have tested our approach at the Papa Giovanni XXIII hospital in Bergamo (Italy) on pathology reports collected since 2008.
Wona Choi, Soo Jeong Ko, Hyuck Jun Jung, Tong Min Kim, Inyoung Choi
1443 - 1444
We constructed and expanded a Common Data Model (CDM) based on hospital EHR data to enable analysis and comparison of Adverse Drug Reactions (ADRs) together with external organizations that have different data structures. This is significant because it makes joint research, analysis, and comparison possible among institutions that have constructed the same type of CDM, and provides the basis for conducting the same research simultaneously on various data sources.
Clinical information in electronic health records (EHRs) is mostly unstructured. With the ever-increasing amount of information in patients’ EHRs, manual extraction of clinical information for data reuse can be tedious and time-consuming without dedicated tools. In this paper, we present SmartCRF, a prototype to visualize, search and ease the extraction and structuration of information from EHRs stored in an i2b2 data warehouse.
Autism spectrum disorder (ASD) is a brain development disorder that hinders the natural development of a person’s communication and social interaction abilities. In this paper, we have applied various supervised classification techniques to detect the presence of child autism. Our findings show that the Sequential Minimal Optimization (SMO) classifier performs best at detecting ASD cases, with the highest accuracy and the lowest execution time and error rate. We also identify the most dominant features in detecting child autism.
Panpan Deng, Yujing Ji, Liu Shen, Junlian Li, Huiling Ren, Qing Qian, Haixia Sun
1449 - 1450
Terminology facilitates consistent use and semantic integration of heterogeneous, multimodal data within and across domains. This paper presents TBench (Terminology Workbench) for multilingual terminology editing and development within a distributed environment. TBench is a web-service Java tool with two main functionalities: knowledge construction (i.e., an extended model based on ISO 25964, batch reuse, and construction of multilingual concept hierarchies and relationships) and collaborative control, which together support custom extensions, reuse, multilingual alignment, integration and refactoring.
Adherence to medications is a key performance indicator and behavioral outcome in healthcare. Electronic healthcare databases represent rich data sources for estimating adherence in both research and practice. To build a solid evidence base for adherence management across clinical settings, it is necessary to standardize adherence estimation and facilitate its appropriate use. We present the recent developments and opportunities offered by AdhereR, an R package for visualisation of medication histories and computation of adherence.
Nhan V. Do, Jaime C. Ramos, Nathanael R. Fillmore, Robert L. Grossman, Michael Fitzsimons, Danne C. Elbers, Frank Meng, Brett R. Johnson, Samuel Ajjarapu, Corri L. DeDomenico, Karen E. Pierce-Murray, Robert B. Hall, Andrew F. Do, Kelly Gaynor, Peter L. Elkin, Mary T. Brophy
1453 - 1453
We completed a pilot study to guide the development of the VA Research Precision Oncology Data Commons infrastructure as a collaboration platform with the greater research community. Our results using a small subset of patients from the VA’s Precision Oncology Program demonstrate the feasibility of our data sharing platform to build predictive models for lung cancer survival using machine learning, as well as highlight the potential of target genome sequencing data.
Arthur Domingues, A. Sousa Neto, A. Freitas, S.D. Silva, A. Félix, F.M. Mendes Neto
1454 - 1455
Technologies for health have been receiving considerable attention with the popularization of devices for internet access. The Internet can be seen as a repository of knowledge due to its large amount of available information; however, amid this vast amount of content, some information is scientifically inaccurate or incomplete. This work presents a semantic integration service that provides information on diabetes from medical databases to eHealth applications.
David Dorr, Cosmin A. Bejan, Christie Pizzimenti, Sumeet Singh, Matt Storer, Ana Quinones
1456 - 1457
Social and behavioral factors influence health but are infrequently recorded in electronic health records (EHRs). Here, we demonstrate that psychosocial vital signs can be extracted from EHR data. We processed structured and unstructured EHR data using expert-driven queries and Natural Language Processing (NLP), validating results through structured annotation. We found that these vital signs are present in EHRs: 681 structured entries were identified for psychosocial concepts, while NLP yielded a nearly 90-fold increase in identified patients.
Esther E. Schmidt, Corinna Eichelser, Bernd Ahlborn, Dietmar Keune, Esmeralda Castaños-Vélez, David Juárez, Martin Lablans
1458 - 1459
Standardised, automated quality reports were generated at three pilot locations of the decentralized translational research network DKTK with separated local data warehouses (LDW), for assessing syntactic conformity against common data element definitions deposited in a central metadata repository (MDR). Deviations in the LDW were categorised, and locally corrected. Comparisons of reports from two time points confirm a major improvement in data quality in terms of syntactic conformity, an essential prerequisite for network-wide data integration.
Jose P. Garci, Jr., Sarah A. Collins, Kenrick D. Cato, Suzanne Bakken, Haomiao Jia, Min J. Kang, Christopher Knaplund, Kumiko O. Schnock, Patricia C. Dykes
1462 - 1463
We assessed the feasibility of using REDCap as a factorial design survey (FDS) platform. REDCap lacks randomization and automation functionality, requiring the development of a workaround. A template survey was created containing all vignettes, copied for each survey instance and edited to hide unwanted content. REDCap configuration required three hours for forty-two surveys. The utilized “copy-and-hide” workaround was successful, providing quasi-automation and reasonable labor-time. Additional strategies are planned using REDCap’s Data Dictionary and other survey software.
Joël Gardes, Christophe Maldivi, Denis Boisset, Timothée Aubourg, Nicolas Vuillerme, Jacques Demongeot
1464 - 1465
In 5P medicine (Personalized, Preventive, Participative, Predictive and Pluri-expert), the general trend is to shift the barycenter of information from hospital-centered systems to patient-centered ones through the patient’s personal medical records. Today, the use of artificial intelligence to support this transition shows real limitations in operational practice, both in patient care and in the daily work of health professionals, because of the medico-legal imperatives induced by the promises of ‘5P medicine’. In this paper, we propose to fill this gap by introducing an original artificial intelligence platform, named Maxwell, which follows an unsupervised learning approach in line with the medico-legal imperatives of ‘5P medicine’. We describe the functional characteristics of the platform and illustrate them with two examples of clustering in genomics and magnetic resonance imaging.
Theresa L. Walunas, Anika S. Ghosh, Jennifer A. Pacheco, Kathryn L. Jackson, Anh H. Chung, Daniel L. Erickson, Karen Mancera-Cuevas, Rosalind Ramsey-Goldman, Abel N. Kho
1466 - 1467
We developed a computable phenotype for systemic lupus erythematosus (SLE) based on the Systemic Lupus International Collaborative Clinics clinical classification criteria set for SLE. We evaluated the phenotype over registry and EHR data for the same patient population to determine concordance of criteria detected in both datasets and to assess which types of structured data detected individual classification criteria. We identified a concordance of 68% between registry and EHR data relying solely on structured data.
Nur Hafieza Ismail, Ninghao Liu, Mengnan Du, Zhe He, Xia Hu
1468 - 1469
The trauma of cancer often leaves survivors with PTSD. Tweets posted on Twitter usually reflect the users’ psychological state, which is convenient for data collection. However, Twitter also contains a mix of noisy and genuine tweets. The process of manually identifying genuine tweets is expensive and time-consuming. Thus, we propose a knowledge transfer technique to filter out unrelated tweets. Our experiments show that our model outperforms the baselines.
Bibo Hao, Shouyu Yan, Eryu Xia, Shilei Zhang, Jing Mei
1470 - 1471
Clinical trials are essential for developing new treatments and evaluating their effectiveness and safety, yet more than half of all clinical trials experience delays, which incur considerable cost. In this paper, we present a cost-effective framework to reduce the time and monetary cost of recruiting and screening eligible clinical trial participants. By leveraging patients’ observed conditions and the cost of medical examinations, the proposed framework uses collaborative filtering techniques to predict the cost of the medical examinations still to be done and then ranks patients and medical examinations. Preliminary experimental results indicate that the framework could reduce the cost spent on medical examinations by three quarters or more and accelerate recruitment in the screening stage.
The FAIR principles require the reporting of rich metadata. However, when researchers reuse data from external data owners for secondary analyses, the FAIR principles require a different implementation than when researchers describe their own data. In this paper, we specify how FAIR metadata can be implemented for secondary data analyses and suggest relevant metadata.
Zhe He, Laura A. Barrett, Rubina Rizvi, Seyedeh Neelufar Payrovnaziri, Rui Zhang
1474 - 1475
Dietary supplements (DSs) have gained increased popularity for weight loss due to their availability without prescription, relatively low price, and ease of use. Consumers with limited health literacy may not adequately know the benefits and risks associated with DSs. In this project, we found a knowledge gap between the benefits of major DSs reported by adults with obesity in the National Health and Nutrition Examination Survey 2003–2014 and those reported in existing DS knowledge databases.
Automated extraction of patient trial eligibility for clinical research studies can increase enrollment while reducing time and monetary cost. We have developed a modular trial eligibility pipeline including patient-batched processing and an internal webservice backed by a uimaFIT pipeline as part of a multi-phase approach to include note-batched processing, the ability to query trials matching patients or patients matching trials, and an external alignment engine to connect patients to trials.
Sara Herrero Jaén, Marta Fernández Batalla, Alexandra González Aguña, Adriana Cercas Duque, José Ma Santamaría García, Sergio Martínez Botija, Niurka Vialart Vidal, Sylvia Claudine Ramírez Sánchez, Daniel Flavio Condor Camara
1478 - 1479
The concept of health has evolved throughout history. A person’s health level is determined by that individual’s perception of it. It is a dynamic process, so variations can be seen from one moment to another. Knowing the health of the patients under one’s care therefore facilitates decision-making in care. To determine people’s health level, we present a technological tool that calculates it from the Health Variables and Nursing Outcomes Classification (NOC) labels.