Using the study of pulse condition in Traditional Chinese Medicine (TCM) as a case example, this paper discusses the characteristics of pulse information obtained by Chinese medicine practitioners, how to correctly understand the relationship between the human body and Chinese medicine in the information age, how to deal with pulse data, and how to study TCM pulse condition. Furthermore, we point out that applying modern big data processing technology to TCM pulse data offers new opportunities.
Analyzing human pupillary behavior is a non-invasive method for evaluating neurological activity. This method contributes to the medical field because changes in pupillary behavior can be correlated with several health conditions such as Parkinson's disease, Alzheimer's disease, autism and diabetes. Analyzing human pupillary behavior is simple and low-cost, and may be used as a complementary diagnostic aid. Therefore, this work aims to develop an automated system to evaluate human pupillary behavior. The solution consists of a portable recording device (a pupillometer) integrated with recording and evaluation software based on computer vision. The system is able to stimulate, record, measure and extract relevant features of human pupillary behavior. The results show that the proposed system is fast and accurate, and can be used as an assessment tool for real and extensive clinical practice and research.
Opioid dependence and overdose are on the rise. One indicator is the increasing trend in prescription buprenorphine use among patients on chronic pain medication. In addition to the New York State Department of Health's prescription drug monitoring programs and training programs for providers and first responders to detect and treat a narcotic overdose, further examination of the population may provide important information for multidisciplinary interventions to address this epidemic. This paper uses an observational database with a Natural Language Processing (NLP) based Not Only Structured Query Language (NoSQL) architecture to examine Electronic Health Record (EHR) data at a regional level to study the trends of prescription opioid dependence. We aim to help prioritize interventions in vulnerable population subgroups. This study provides a report of the demographic patterns of opioid-dependent patients in Western New York using High Throughput Phenotyping NLP of EHR data.
Infusion-related reactions (IRRs) are typical adverse events for breast cancer patients. Detecting IRRs and visualizing their occurrence in relation to drug treatment could help clinicians improve patient safety and help researchers model IRRs and analyze their risk factors. We developed and evaluated a phenotyping algorithm to detect IRRs in breast cancer patients. We also designed a visualization prototype to render IRR patients' medications, lab tests and vital signs over time. Compared against 42 randomly selected doses manually labeled by a domain expert, the algorithm achieved a sensitivity, positive predictive value, specificity, and negative predictive value of 69%, 60%, 79%, and 85%, respectively. Using the algorithm, an incidence of 6.4% of patients and 1.8% of doses for docetaxel, 8.7% and 3.2% for doxorubicin, 10.4% and 1.2% for paclitaxel, and 16.1% and 1.1% for trastuzumab was identified retrospectively. The estimated incidences are consistent with related studies.
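The four evaluation measures reported above all derive from a 2x2 confusion matrix. The sketch below shows the arithmetic. The abstract does not report the underlying counts, so the counts used here (9 true positives, 6 false positives, 23 true negatives, 4 false negatives) are an assumption: they were chosen only because they sum to 42 doses and round to the reported percentages. The function name `diagnostic_metrics` is likewise hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return the four standard measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # detected IRRs among all true IRRs
        "ppv": tp / (tp + fp),          # true IRRs among all flagged doses
        "specificity": tn / (tn + fp),  # correctly cleared doses among all non-IRR doses
        "npv": tn / (tn + fn),          # non-IRR doses among all cleared doses
    }


# Hypothetical counts (not from the paper) that total 42 labeled doses.
metrics = diagnostic_metrics(tp=9, fp=6, tn=23, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

With only 42 labeled doses, each percentage rests on a small denominator, so a single relabeled dose would shift a measure by several points.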
Online patient question and answering (Q&A) systems, available as websites or mobile applications, attract an increasing number of users in China. Patients post their questions and registered doctors provide the corresponding answers. A large number of questions with answers from doctors have accumulated. Instead of awaiting a response from a doctor, a newly posted question could be answered quickly by finding a semantically equivalent question in the Q&A archive. In this study, we investigated a novel deep learning based method to retrieve similar patient questions in Chinese. An unsupervised learning algorithm using a deep neural network was run on the corpus to generate word embeddings. The word embeddings were then used as the input to a supervised learning algorithm using a purpose-built deep neural network, i.e. the supervised neural attention model (SNA), to predict the similarity between two questions. The experimental results showed that our SNA method achieved P@1 = 77% and P@5 = 84%, outperforming all other compared methods.
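The P@1 and P@5 figures above can be read with the following sketch. The abstract does not define P@k, so this assumes it means the fraction of new questions for which at least one semantically equivalent archived question appears among the top k retrieved results (a reading consistent with P@5 exceeding P@1). The function names, question identifiers, and rankings are all hypothetical.

```python
def hit_at_k(ranked_ids, relevant_ids, k):
    """True if any of the top-k retrieved questions is a relevant one."""
    return any(question_id in relevant_ids for question_id in ranked_ids[:k])


def mean_hit_at_k(rankings, relevant, k):
    """Average hit_at_k over all query questions."""
    hits = [hit_at_k(ranked, rel, k) for ranked, rel in zip(rankings, relevant)]
    return sum(hits) / len(hits)


# Hypothetical retrieval output for three new questions (best match first).
rankings = [["q7", "q2", "q9"], ["q4", "q1", "q3"], ["q5", "q6", "q8"]]
# Hypothetical gold labels: archived questions judged semantically equivalent.
relevant = [{"q7"}, {"q3"}, {"q0"}]

p_at_1 = mean_hit_at_k(rankings, relevant, k=1)
p_at_3 = mean_hit_at_k(rankings, relevant, k=3)
print(p_at_1, p_at_3)
```

Under this definition the score can only stay the same or grow as k increases, because a larger cut-off never removes a hit.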
Automated identification provides an efficient way to categorize patient safety incidents. Previous studies have focused on identifying single incident types relating to a specific patient safety problem, e.g., clinical handover. In reality, there are multiple types of incidents reflecting the breadth of patient safety problems and a single report may describe multiple problems, i.e., it can be assigned multiple type labels. This study evaluated the ability of multi-label classification methods to identify multiple incident types in single reports. Three multi-label methods were evaluated: binary relevance, classifier chains and ensemble of classifier chains. We found that an ensemble of classifier chains was the most effective method using binary Support Vector Machines with radial basis function kernel and bag-of-words feature extraction, performing equally well on balanced and stratified datasets (F-score: 73.7% vs. 74.7%). Classifiers were able to identify six common incident types: falls, medications, pressure injury, aggression, documentation problems and others.
The adverse events of dietary supplements should be subject to scrutiny due to their growing clinical application and consumption among U.S. adults. An effective method for mining and grouping the adverse events of dietary supplements is to evaluate product labeling for the rapidly increasing number of new products available in the market. In this study, the adverse events information was extracted from the product labels stored in the Dietary Supplement Label Database (DSLD) and analyzed by topic modeling techniques, specifically Latent Dirichlet Allocation (LDA). Among the 50 topics generated by LDA, eight topics were manually evaluated, with topic relatedness ranging from 58.8% to 100% on the product level, and 57.1% to 100% on the ingredient level. Five out of these eight topics were coherent groupings of the dietary supplements based on their adverse events. The results demonstrated that LDA is able to group supplements with similar adverse events based on the dietary supplement labels. Such information can potentially be used by consumers to use dietary supplements more safely.
People with diabetes experience elevated blood glucose (BG) levels at the time of an infection. We propose to utilize patient-gathered information in an Electronic Disease Surveillance Monitoring Network (EDMON), which may support the identification of a cluster of infected people with elevated BG levels on a spatiotemporal basis. The system incorporates data gathered from diabetes apps, continuous glucose monitoring (CGM) devices, and other appropriate physiological indicators from people with type 1 diabetes. This paper presents a novel approach to modeling an individual's BG dynamics, together with a mechanism to track and detect deviations toward elevated BG readings. The models were developed and validated using data self-recorded with Dexcom CGM devices during non-infection periods by two individuals with type 1 diabetes over a 1-month period. The models were also tested using simulated datasets, which resemble the individual's BG evolution during infections. The models accurately simulated the individual's normal BG fluctuations and further detected statistically significant BG elevations.
Uncovering clinical research trends allows us to understand the direction of healthcare services and is essential for longer-term healthcare planning. The Hospital Authority (HA) Convention is a mainstream annual healthcare conference that gathers up-to-date Hong Kong medical research. We propose to use state-of-the-art medical document mining and topic modelling methods to uncover latent themes and structures in the publications. We collected 742 articles from the HA Convention from 2013 to 2016 and selected 56 publications from the category of “Clinical Safety and Quality Service” for further analysis. Applying natural language processing and Latent Dirichlet Allocation (LDA) methods, we identified 7 potential topics, namely: surgical operation, hospital discharge, medical error, nursing procedure, service performance assessment, patient and staff engagement, and admission algorithm and standardisation. This exploratory study demonstrates that key themes exist in the annual HA Convention, and we observe potential changes in healthcare services focus over the years in the selected category.
Search techniques in clinical text need to make fine-grained semantic distinctions, since medical terms may be negated, about someone other than the patient, or at some time other than the present. While natural language processing (NLP) approaches address these fine-grained distinctions, a task like patient cohort identification from electronic health records (EHRs) simultaneously requires a much more coarse-grained combination of evidence from the text and structured data of each patient's health records. We thus introduce aligned-layer language models, a novel approach to information retrieval (IR) that incorporates the output of other NLP systems. We show that this framework is able to represent standard IR queries, formulate previously impossible multi-layered queries, and customize the desired degree of linguistic granularity.
Radiation therapy allows precision targeting of certain groups of lymph nodes and is a treatment for metastatic head and neck squamous cell carcinoma. In current practice, there is approximately a 15% probability that physicians inadvertently treat healthy tissue or leave cancerous lymph nodes untreated. The aim of this work is to improve the accuracy of medical decision-making by extending existing predictive models to capture the probabilities of finding cancerous lymph nodes at each of the six image-based surgical neck levels using a patient's genetic profile, primary tumor site and tumor size. Our model was trained with publicly available data extracted from The Cancer Genome Atlas (TCGA) and validated against the TCGA dataset both with and without genetic information. Results show that genetic profile data improves model accuracy. These findings suggest that our predictive model may improve the accuracy of clinical decision-making, especially for patients with more advanced metastasis. However, more data is needed to establish the significance of the proposed effects, as well as to improve the accuracy of the overall model.
In clinical practice, many patients may have unknown or missing values for some predictors, so the developed risk models cannot be applied directly to these patients. In this paper, we propose an incremental learning approach for applying a developed risk model to new patients with unknown predictor values, which imputes a patient's unknown values based on his/her k-nearest neighbors (k-NN) from the incremental population. We perform a real-world case study by developing a risk prediction model of stroke for patients with Type 2 diabetes mellitus from EHR data, and incrementally applying the risk model to a sequence of new patients. The experimental results show that our risk prediction model of stroke has good prediction performance, and that the k-NN based incremental learning approach for data imputation can gradually increase the prediction performance as the model is applied to new patients.
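The imputation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the distance measure, the aggregation rule, or the value of k, so Euclidean distance over the observed predictors, a plain neighbor mean, and k=2 are assumptions, as are the function name `knn_impute` and the toy predictor values.

```python
import math


def knn_impute(patient, population, k=3):
    """Fill None entries in `patient` with the mean of its k nearest neighbours.

    Distance is Euclidean over the predictors that are observed for `patient`;
    every record in `population` is assumed to be complete.
    """
    observed = [i for i, value in enumerate(patient) if value is not None]

    def distance(other):
        return math.sqrt(sum((patient[i] - other[i]) ** 2 for i in observed))

    neighbours = sorted(population, key=distance)[:k]
    return [
        value if value is not None
        else sum(n[i] for n in neighbours) / len(neighbours)
        for i, value in enumerate(patient)
    ]


# Hypothetical complete records: [age, HbA1c, systolic blood pressure].
population = [
    [60.0, 7.1, 140.0],
    [62.0, 7.4, 150.0],
    [45.0, 6.0, 120.0],
    [70.0, 8.0, 160.0],
]

new_patient = [61.0, 7.2, None]           # blood pressure is unknown
completed = knn_impute(new_patient, population, k=2)
population.append(completed)              # incremental step: grow the population
print(completed)
```

Appending each completed record is what makes the population incremental: later patients can draw neighbors from earlier ones. In practice the predictors would usually be standardized first so that no single scale dominates the distance.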
Semantic relations have been studied for decades without yet reaching consensus on the set of these relations. However, biomedical language processing and ontologies rely on these relations, so it is important to be able to evaluate their suitability. In this paper we examine the role of inter-annotator agreement in choosing between competing proposals regarding the set of such relations. The experiments consisted of labeling the semantic relations between the two elements of noun-noun compounds (e.g. cell migration). Two judges annotated a dataset of terms from the biomedical domain using two competing sets of relations, and we analyzed the inter-annotator agreement. With no training and little documentation, agreement on this task was fairly high and disagreements were consistent. The results support the utility of the relation-based approach to semantic representation.
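Agreement between two judges, as discussed above, is often summarized with a chance-corrected statistic. The sketch below computes Cohen's kappa as one such option. The abstract does not say which agreement measure was used, so the choice of kappa is an assumption, and the relation labels and annotations are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items.

    Raises ZeroDivisionError when expected agreement is 1, i.e. when both
    annotators used one identical label for every item.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if each judge labeled at random with their own label rates.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical relation labels assigned to six noun-noun compounds.
judge_1 = ["agent", "theme", "theme", "location", "agent", "theme"]
judge_2 = ["agent", "theme", "location", "location", "agent", "agent"]

kappa = cohens_kappa(judge_1, judge_2)
print(round(kappa, 2))
```

Computing the statistic once per relation set gives a like-for-like comparison of how reliably each set can be applied.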
The progressive digitization of medical records has resulted in the accumulation of large amounts of data. Electronic medical data include structured numerical data and unstructured text data. Although text-based medical record processing has been researched, few studies have contributed to medical practice. The analysis of unstructured text data can improve medical processes. Hence, this study presents a clustering approach for detecting typical patient conditions from the text-based medical records of a clinical pathway. In this approach, feature words are first classified with the Louvain method, and the sentences in each cluster are then merged to generate a “sentence graph” of the cluster. An analysis of real text-based medical records indicates that sentence graphs can represent the medical treatment and the patient's condition in a medical process. This method could help standardize text-based medical records and recognize characteristic medical processes for improving medical treatment.
Maximizing the effectiveness of prescriptions and minimizing adverse effects of drugs is a key component of the health care of patients. In the practice of traditional Chinese medicine (TCM), it is important to provide clinicians with a reference for dosing of prescribed drugs. We optimized the traditional Cheng-Church biclustering algorithm (CC) and used the optimized algorithm to analyze TCM prescription dose data. Based on an analysis of 212 prescriptions related to TCM treatment of kidney diseases, the study generated 87 prescription dose quantum matrices, in which each sub-matrix represents the reference values of the doses of drugs in different recipes. The optimized CC algorithm can effectively eliminate the interference of zeros in the original dose matrix of TCM prescriptions and avoid zeros appearing in the output sub-matrices. This makes it possible to effectively analyze the reference values of drugs in different prescriptions related to kidney diseases, and thus to provide a valuable reference for clinicians to use drugs rationally.
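The Cheng-Church algorithm searches for sub-matrices with a low mean squared residue, a score that is zero when every entry equals its row mean plus its column mean minus the overall mean. The sketch below computes that standard score only. It does not reproduce the optimized variant, which the abstract does not describe. The function name and the toy dose values are hypothetical.

```python
def mean_squared_residue(submatrix):
    """Cheng-Church coherence score of a sub-matrix (0 means perfectly additive)."""
    n_rows, n_cols = len(submatrix), len(submatrix[0])
    row_means = [sum(row) / n_cols for row in submatrix]
    col_means = [sum(row[j] for row in submatrix) / n_rows for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i, row in enumerate(submatrix):
        for j, value in enumerate(row):
            residue = value - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical doses: rows are prescriptions, columns are drugs.
coherent = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
# Same matrix with one drug absent from the first prescription (dose 0).
with_zero = [[1.0, 2.0, 0.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]

score_coherent = mean_squared_residue(coherent)
score_with_zero = mean_squared_residue(with_zero)
print(score_coherent, score_with_zero)
```

In this toy case a single zero entry raises the score of an otherwise perfectly coherent sub-matrix, which illustrates why zeros in a sparse dose matrix can interfere with bicluster discovery.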
Estimation of semantic similarity and relatedness between biomedical concepts has utility for many informatics applications. Automated methods fall into two categories: methods based on distributional statistics drawn from text corpora, and methods using the structure of existing knowledge resources. Methods in the former category disregard taxonomic structure, while those in the latter fail to consider semantically relevant empirical information. In this paper, we present a method that retrofits distributional context vector representations of biomedical concepts using structural information from the UMLS Metathesaurus, such that the similarity between vector representations of linked concepts is augmented. We evaluated it on the UMNSRS benchmark. Our results demonstrate that retrofitting of concept vector representations leads to better correlation with human raters for both similarity and relatedness, surpassing the best results reported to date. They also demonstrate a clear improvement in performance on this reference standard for retrofitted vector representations, as compared to those without retrofitting.
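The retrofitting idea described above can be sketched as follows. This is a minimal illustration of a common iterative averaging update from the retrofitting literature, not the authors' implementation: the abstract does not give the update rule. The function names, the weight `alpha`, the uniform neighbor weight `beta`, the toy 2-dimensional vectors, and the link list standing in for UMLS Metathesaurus relations are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def retrofit(vectors, links, iterations=10, alpha=1.0):
    """Pull each linked concept toward its neighbours while staying near its original vector."""
    updated = {concept: list(vec) for concept, vec in vectors.items()}
    for _ in range(iterations):
        for concept, neighbours in links.items():
            nbrs = [n for n in neighbours if n in updated]
            if concept not in updated or not nbrs:
                continue
            beta = 1.0 / len(nbrs)  # uniform neighbour weight (assumed)
            updated[concept] = [
                (alpha * vectors[concept][d] + beta * sum(updated[n][d] for n in nbrs))
                / (alpha + beta * len(nbrs))
                for d in range(len(vectors[concept]))
            ]
    return updated


# Hypothetical distributional vectors and one hypothetical knowledge-base link.
vectors = {
    "heart attack": [1.0, 0.0],
    "myocardial infarction": [0.0, 1.0],
    "aspirin": [1.0, 1.0],
}
links = {
    "heart attack": ["myocardial infarction"],
    "myocardial infarction": ["heart attack"],
}

retrofitted = retrofit(vectors, links)
before = cosine(vectors["heart attack"], vectors["myocardial infarction"])
after = cosine(retrofitted["heart attack"], retrofitted["myocardial infarction"])
print(before, after)
```

Concepts without links, such as the toy "aspirin" entry, keep their original vectors, so the knowledge resource only adjusts the parts of the space it actually covers.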
Extracting and understanding information, themes and relationships from large collections of documents is an important task for biomedical researchers. Latent Dirichlet Allocation is an unsupervised topic modeling technique using the bag-of-words assumption that has been applied extensively to unveil hidden thematic information within large sets of documents. In this paper, we added MeSH descriptors to the bag-of-words assumption to generate ‘hybrid topics’, which are mixed vectors of words and descriptors. We evaluated this approach on the quality and interpretability of topics in both a general corpus and a specialized corpus. Our results demonstrated that the coherence of ‘hybrid topics’ is higher than that of regular bag-of-words topics in the specialized corpus. We also found that the proportion of topics that are not associated with MeSH descriptors is higher in the specialized corpus than in the general corpus.
In this paper, we propose a systematic approach based on gene co-expression network analysis to identify potential biomarkers for Hepatocellular Carcinoma (HCC). Using differential gene expression analysis, we first selected candidate genes closely related to HCC from the whole genome on a large scale. By identifying the relationships between each pair of genes, we built the gene co-expression network using Cytoscape software. The global network was then clustered into several sub-modules by the Markov Cluster Algorithm (MCL). GO analysis was carried out on the identified gene modules to further explore the genes clearly associated with the dysfunctions of HCC. As a result, we identified Hexokinase 2 (HK2) and Krüppel-like Factor 4 (KLF4) as potential candidate biomarkers that provide insights into the mechanism of the development of HCC. Finally, we evaluated the disease classification results via an SVM-based machine learning method to verify the accuracy of the classification.
The rapid growth of digital health and welfare services demands new competences for health and social care, information technology, and business professionals. This study aims to describe the competences that students have before their studies and those they expect to gain from the study module “Developing Digital Health and Welfare Services” in multiprofessional groups during their bachelor studies. This study reports open-ended questions about students' knowledge concerning digital health prior to the study units. The results, analyzed by QSR NVivo 10 for Windows, show that students are keen to learn about developing digital health and welfare services, and they see that multiprofessional work requires a communicative environment and respect for every profession. Students also believe that they have competences to bring to the multiprofessional group. A successful multidisciplinary development of digital health and welfare services requires changes and cooperation in education between various professions.
Inpatient management of insulin-dependent diabetes (IDD) is a complex task that requires clinicians to cognitively process information across distinct domains in different locations of the electronic medical record (EMR). Current data displays are not optimized to support insulin management by end users. We sought to develop a set of user-centered displays of capillary glucose data and insulin dose to improve inpatient management of IDD. Our proposed conceptual data display prototype is designed to simplify the presentation and visualization of key information needed for treatment decisions. The goal is also to enhance clinicians' ability to identify opportunities to optimize insulin dosing and to decrease end users' cognitive load and error rates.
The implementation of an Electronic Health Record has many benefits, but when it is not available, patient continuity of care can be affected. If there is no support, or a failure to guarantee the continuity of services, contingency plans have to be implemented to overcome the information disruption. End users are in direct contact with the information system, and are responsible for documenting patient clinical information. Focusing on them, we propose the design, development, and validation of a survey to evaluate the beliefs, knowledge, and perceptions of end users about the Electronic Health Record contingency plan. Preliminary results showed that even when there were fewer downtime periods, end users perceived that they did not have adequate training or information about how to handle a downtime event.
Despite the continuous technical advancements around health information standards, a critical component of their widespread adoption involves political agreement among a diverse set of stakeholders. Countries that have addressed this issue have used diverse strategies. In this vision paper we present the path that Chile is taking to establish a national program to implement health information standards and achieve interoperability. The Chilean government established an inter-agency program to define the current interoperability situation, existing gaps, barriers, and facilitators for interoperable health information systems. In response to the identified issues, the government decided to fund a consortium of Chilean universities to create the National Center for Health Information Systems. This consortium should encourage interaction among all health care stakeholders, both public and private, to advance the selection of national standards and define certification procedures for software and human resources in health information technologies.
Colorectal cancer screening access within a rural and remote health care environment represents a complex systems problem. Existing modeling approaches are inadequate in their representation of health system complexity. A combined Collaborative Information Behavior (CIB) and Continuity of Care framework was developed to model the health care processes involved in screening access over time. This framework highlighted necessary information behavior supports and system gaps in screening access, supporting development of targeted informatics solutions to improve screening access and cancer outcomes.