
Ebook: Global Healthcare Transformation in the Era of Artificial Intelligence and Informatics

Technologies such as artificial intelligence, data science, and informatics have become ever more important in the provision of healthcare in recent decades, inspiring health professionals and informaticians to improve delivery and outcomes for the benefit of patients.
This publication presents the proceedings of ICIMTH 2025, the 23rd International Conference on Informatics, Management, and Technology in Healthcare, held from 4 to 6 July 2025 in Athens, Greece. The annual ICIMTH conference is a scientific event for those working in biomedical and health informatics on all continents. The conferences cover the field from a wide perspective, with participants presenting their research and application outcomes on informatics from cells to populations. Several technologies, such as imaging, sensors, and biomedical equipment, are covered, together with management and organizational aspects, including legal and social issues. A significant number of papers are still related to public health issues, but submissions in artificial intelligence studies and applications in healthcare continue to increase, and this is reflected in the theme of the conference and the title of the proceedings. The conference received 167 submissions from 28 countries. After a thorough review process involving 94 reviewers, 111 full papers and 14 short communication papers were selected, all of which are included in these proceedings, representing an overall acceptance rate of 74%.
Providing a global overview of biomedical and health informatics, the proceedings will be of interest to all those working in the field.
This volume contains the accepted papers of ICIMTH (International Conference on Informatics, Management, and Technology in Healthcare) for the year 2025 and presents to the Biomedical and Health Informatics community the scientific outcomes of the ICIMTH 2025 Conference, held in Athens, Greece, from 4 to 6 July 2025.
The ICIMTH 2025 Conference is the 23rd annual conference in this series of scientific events, which gathers scientists from all continents working in the field of biomedical and health informatics. The conference is held as a live event, and virtual sessions by means of teleconferencing have deliberately not been offered, to encourage all presenters to be at the venue.
The field of biomedical and health informatics is studied from a very broad perspective at this conference, with participants presenting their research and application outcomes of informatics from cells to populations, including several technologies such as imaging, sensors, biomedical equipment and management and organisational aspects, including legal and social issues. Essentially, artificial intelligence, data science, informatics, and technology inspire health professionals and informaticians to improve healthcare for the benefit of patients. As expected, a significant number of papers submitted to the conference are still related to public health issues. This year, however, we again saw an increased influx of submissions in artificial intelligence studies and applications in healthcare, and we wanted this to be reflected in the theme of the conference and in the title of the proceedings.
The organisers would like to thank all participants in this conference. Of the 167 submissions, 111 full papers and 14 short communication papers were accepted for publication. Since last year we have not opened a call for posters. An invited panel will provide an overview of historical perspectives on the evolution of artificial intelligence in healthcare. The panellists are Professor Reinhold Haux, the former president of IAHSI and IMIA; Professor George Mihalas, the former president of EFMI; Professor Casimir Kulikowski of Rutgers University; and Professor Arie Hasman, the SPC chair. The keynote speaker, Professor Theodoros N. Arvanitis from the University of Birmingham, UK, will provide a state-of-the-art presentation on digital health.
It should be noted that the proceedings, entitled Global Healthcare Transformation in the Era of Artificial Intelligence and Informatics, will be published as an open access publication, with the advantages of indexing and citation in the biggest scientific literature databases, such as PubMed/Medline and Scopus, enabled by the submission of the volume by IOS Press for indexing consideration as part of the series Studies in Health Technology and Informatics (HTI). The expected date of publication is the date of the conference.
The Editors would like to thank the members of the Scientific Programme Committee, the Organising Committee, and all reviewers for their professional, thorough, and objective refereeing of the scientific work, which contributed to the achievement of a high-quality publication for a successful scientific event.
Athens, 14.05.2025
The Editors,
John Mantas, Arie Hasman, Parisis Gallos, Emmanouil Zoulias, and Konstantinos Karitis
Artificial Intelligence is being adopted by older people at home in the form of personal home networks. Occupational therapy services do not have explicit guidance to support their practice, and older people may be in a similar information vacuum as their needs change and the complexity of their personal networks increases. This doctoral research study used a literature-informed survey to surface insights from both cohorts. The findings highlight gaps and potential areas of common ground from which to build a new understanding and co-create the means to support safe deployments at home.
This study evaluated GPT-based conversational agents in tasks related to healthcare provider stigma. The main finding was that GPT-4o models, using Role-Playing (RP) and Chain of Thought (CoT) techniques, outperformed other models in tasks such as defining healthcare provider stigma, identifying types of stigma, and explaining its consequences. The Personalized GPT model showed lower performance, particularly in areas related to treatment access, adherence, and stigma risk factors. These results suggest that advanced prompting techniques significantly enhance the agent’s ability to deliver complex and nuanced information about healthcare provider stigma. The study supports the potential of GPT-based agents as scalable educational tools for reducing stigma, especially in resource-limited settings.
We present the application of the COSTAR method as a tool to design LLM prompts that can effectively support the dialogue of neurologists with a person with multiple sclerosis (pwMS).
The introduction of the Licence-Master-Doctorate (LMD) system in Africa, particularly in Burkina Faso, has created challenges for medical training due to a lack of infrastructure and personnel. This study develops a custom GPT model for medical dialogue simulation, overcoming the limitations of a previous MLP model. A structured dataset was designed, grouping diseases into modules and defining patient profiles. An open-source GPT-2 model was modified and trained, with user interaction facilitated through an API. After initial training, the model showed promising results, indicating stable learning. This model offers better context management and customization suited for medical dialogues. Future improvements include expanding to other pathologies and optimizing performance for effective integration into medical training in Africa.
Accurately identifying patient signs and symptoms from clinical notes is essential for effective diagnosis, treatment planning, and medical research. In this study, we evaluated the performance of the Meta Llama model in extracting signs and symptoms (S&S) related to the genitourinary system, along with their corresponding ICD-10 codes, from urological clinical notes in the MTSamples dataset. The dataset was manually annotated to provide a reference against which the output of the large language model (LLM) was compared. We utilized Llama 3.3-70B and performed prompt engineering. The findings suggest that the best performance was achieved when the prompt included a predefined list of definitions of the corresponding ICD-10 codes and restricted the model from making assumptions. Under these conditions, Llama 3.3-70B achieved an average recall of 0.96, precision of 0.89, and F1-score of 0.92 for S&S extraction, as well as an average recall of 0.93, precision of 0.85, and F1-score of 0.89 for ICD-10 code generation.
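The recall, precision, and F1 figures quoted above follow the usual set-based definitions. The sketch below is only an illustration of those definitions, not the authors' evaluation code; the symptom terms and the function names `precision_recall_f1` and `f1_from` are hypothetical.

```python
def precision_recall_f1(predicted: set, reference: set) -> tuple:
    """Compare extracted items with a manually annotated reference set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def f1_from(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotation and model output for a single note.
reference = {"dysuria", "hematuria", "urinary frequency", "flank pain"}
predicted = {"dysuria", "hematuria", "flank pain", "fever"}
print(precision_recall_f1(predicted, reference))

# Consistency check against the averages reported in the abstract.
print(round(f1_from(0.89, 0.96), 2), round(f1_from(0.85, 0.93), 2))
```

The last line gives 0.92 and 0.89, matching the reported F1-scores. This is a plausibility check only, since an average of per-note F1 values need not equal the harmonic mean of the averaged precision and recall.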
Tuberculosis (TB) remains a significant public health concern in correctional facilities due to overcrowding, poor ventilation, and inmate mobility. This cross-sectional study assessed the roles of 298 Prison Health Volunteers (PHVs) in TB surveillance at three Central Correctional Institutions for Young Offenders in Thailand. Although most PHVs demonstrated low knowledge and moderate attitudes, they performed well in key health education and case detection tasks. Higher knowledge and more favorable attitudes were both significantly associated with better TB control practices (p = 0.027 and p = 0.005, respectively). These findings suggest that experiential and contextual factors may compensate for knowledge gaps. Despite ongoing challenges, including limited training and paper-based systems, PHVs remain pivotal in TB control. Strengthening their capacity through digital tools and structured support is essential. The Royal “Pan Suk” Project represents a promising, community-led model aligned with His Majesty King Maha Vajiralongkorn’s vision of promoting prison health equity and advancing TB elimination.
Accurate prognostic biomarkers are essential for evaluating survival risks in cancer patients. However, despite the wide use of biomarkers like prostate-specific antigen (PSA) and other clinical factors, achieving high predictive accuracy remains a challenge in prostate cancer prognosis. This study aimed to predict 24-month mortality in metastatic castration-resistant prostate cancer (mCRPC) patients by analyzing a comprehensive set of 41 clinical and demographic features. A cohort of 703 patients was assessed, and machine learning models, including XGBoost, SVM, and Random Forest, were compared. Of these, the Random Forest model demonstrated the highest performance, achieving an accuracy of 0.67 and an AUC of 0.68, effectively distinguishing between patients with less than 24 months of survival and more than 24 months of survival. Predictors identified in our analysis included PSA, albumin, and lactate dehydrogenase (LDH). These findings suggest that clinical factors can be effectively utilized in machine learning models to predict mortality outcomes in cancer patients.
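For readers unfamiliar with the two metrics quoted above, the following sketch computes accuracy at a fixed threshold and the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The labels and scores are made up for illustration and have no connection to the study cohort; the function names are hypothetical.

```python
def roc_auc(labels: list, scores: list) -> float:
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels: list, scores: list, threshold: float = 0.5) -> float:
    """Share of cases whose thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Illustrative data: 1 = died within 24 months, 0 = survived longer.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
print(roc_auc(labels, scores), accuracy(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which helps put a reported value of 0.68 in context.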
The use of large language models (LLMs) in healthcare poses challenges related to data security, accuracy, bias, and usability, which can hinder their effectiveness in enhancing patients’ ability to locate trusted health information. We conducted a pilot study of a local LLM named “SAM,” built upon the Llama 7B architecture. Participants engaged with SAM by submitting health-related questions across five health domains. Following their interactions, participants completed the System Usability Scale (SUS) to evaluate the usability of the model. Of the ten participants, eight (80%) were female, and two (20%) were male. The highest-rated theme was ease of learning, with participants strongly agreeing that most people would quickly learn to use the chatbot (Mean = 4.7, SD = 0.46). Developing a local LLM for patient health information must tackle healthcare barriers. By enhancing data security, personalizing responses, and increasing user familiarity, SAM can improve patient engagement and outcomes.
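The abstract reports a single SUS item rather than overall scores. For context, the standard way of turning one participant's ten SUS responses (each 1 to 5) into a 0 to 100 score can be sketched as follows. The response vectors are hypothetical, and the function name `sus_score` is an assumption for this example.

```python
def sus_score(responses: list) -> float:
    """Standard SUS scoring for one participant's ten 1-5 responses."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS expects ten responses on a 1-5 scale")
    total = 0
    for position, response in enumerate(responses, start=1):
        if position % 2 == 1:
            total += response - 1   # odd items are positively worded
        else:
            total += 5 - response   # even items are negatively worded
    return total * 2.5              # rescale the 0-40 sum to 0-100


# Hypothetical participant: strong agreement with positive items.
print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))
```

The ease-of-learning statement cited in the abstract corresponds to item 7 of the questionnaire, one of the positively worded (odd-numbered) items.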
Clinical decision-making often involves uncertainty due to vague guidelines and incomplete patient data. HL7 Arden Syntax addresses this challenge by incorporating fuzzy logic, enabling nuanced reasoning beyond mere Boolean values. Through practical examples, this paper explores key fuzzy connectives—and, or, not, at least, and at most—and their application in medical logic. We show that Medexter’s ArdenSuite implements these connectives properly and illustrate how fuzzy logic enhances the expressiveness of Arden Syntax, making it a valuable tool for clinical decision support.
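A minimal sketch of how such connectives can behave on truth values between 0 and 1 is given below, in Python rather than Arden Syntax. It assumes the common min/max/complement operators. The definitions of `at_least` and `at_most` (k-th largest truth value and its complement) are one plausible generalisation chosen for illustration; the abstract does not spell out the definitions used by Arden Syntax or ArdenSuite, so none of this should be read as their specification.

```python
def fuzzy_and(*values: float) -> float:
    """Conjunction as the minimum truth value."""
    return min(values)


def fuzzy_or(*values: float) -> float:
    """Disjunction as the maximum truth value."""
    return max(values)


def fuzzy_not(value: float) -> float:
    """Negation as the complement."""
    return 1.0 - value


def at_least(k: int, values: list) -> float:
    """Degree to which at least k statements hold: the k-th largest truth value (assumed definition)."""
    if k <= 0:
        return 1.0
    if k > len(values):
        return 0.0
    return sorted(values, reverse=True)[k - 1]


def at_most(k: int, values: list) -> float:
    """Degree to which at most k statements hold: not (at least k + 1)."""
    return fuzzy_not(at_least(k + 1, values))


# Hypothetical truth values for three clinical criteria, e.g. "fever is high".
criteria = [0.8, 0.3, 0.6]
print(fuzzy_and(*criteria), fuzzy_or(*criteria), at_least(2, criteria), at_most(1, criteria))
```

With truth values restricted to 0 and 1, these functions reduce to the ordinary Boolean operators, which is the property that lets fuzzy and crisp logic coexist in one rule.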
This paper explores the potential of Large Language Models (LLMs) to improve patient safety incident (PSI) reporting in Finland. Through semi-structured interviews with doctors and authorities, key requirements and perspectives on AI integration were gathered. A Proof-of-Concept (PoC) study evaluated the feasibility of using a commercial LLM (GPT-4o) to generate structured PSI reports from unstructured clinical text in patient records. Interview results highlighted the need for integrated and automated reporting systems, with AI seen as a tool to reduce the documentation burden and improve data analysis. The PoC demonstrated the technological capability of the LLM to generate coherent and relevant reports but also revealed challenges in completeness and in distinguishing incident causality. The findings suggest promising avenues for leveraging LLMs in PSI reporting, warranting further research and development for national implementation.
This paper presents the results of a scoping review that examines the potential of Artificial Intelligence (AI) in the early diagnosis of Cognitive Decline (CD), which is regarded as a key issue in elderly health. The review encompasses peer-reviewed publications from 2020 to 2025, including scientific journals and conference proceedings. Over 70% of the studies rely on magnetic resonance imaging (MRI) as the input to the AI models, with a high diagnostic accuracy of 98%. Integration of relevant clinical data and electroencephalograms (EEG) with deep learning methods enhances diagnostic accuracy in clinical settings. Recent studies have also explored the use of natural language processing models for detecting CD at its early stages, with an accuracy of 75%, showing high potential for use in appropriate pre-clinical environments.
With the use of artificial intelligence (AI) for image analysis of Magnetic Resonance Imaging (MRI), the lack of training data has become an issue. Realistic synthetic MRI images can serve as a solution, and generative models have been proposed for this purpose. This study investigates the most recent advances in synthetic brain MRI image generation with AI-based generative models. A search was conducted for relevant studies published within the last three years, followed by a narrative review of the identified articles. Popular models from the search results are discussed in this study, including Generative Adversarial Networks (GANs), diffusion models, Variational Autoencoders (VAEs), and transformers.
This study evaluates the performance of ChatGPT-4, a Large Language Model (LLM), in automatically extracting U scores from free-text thyroid ultrasound reports collected from University Hospitals Birmingham (UHB), UK, between 2014 and 2024. The LLM was provided with guidelines on the U classification system and extracted U scores independently from 14,248 de-identified reports, without access to human-assigned scores. The LLM-extracted scores were compared to initial clinician-assigned and refined U scores provided by expert reviewers. The LLM achieved 97.7% agreement with refined human U scores, successfully identifying the highest U score in 98.1% of reports with multiple nodules. Most discrepancies (2.5%) were linked to ambiguous descriptions, multi-nodule reports, and cases with human-documented uncertainty. While the results demonstrate the potential for LLMs to improve reporting consistency and reduce manual workload, ethical and governance challenges such as transparency, privacy, and bias must be addressed before routine clinical deployment. Embedding LLMs into reporting workflows, such as Online Analytical Processing (OLAP) tools, could further enhance reporting quality and consistency.
The exponential growth of biomedical literature necessitates automated approaches for extracting biological entities, such as genes, to support research. This study systematically compares rule-based, Named Entity Recognition (NER)-based, and transformer-based models for extracting 161 Oncomine™ genes from 100 randomly selected cancer-related abstracts. The transformer-based BioBERT model achieved the highest recall (1.00) and F1-score (0.98), followed by GPT-4o, which, despite its effectiveness, required substantial computational resources. NER-based SciSpaCy models exhibited varying performance, while rule-based string-matching demonstrated high precision but lower recall. These findings highlight the trade-offs between accuracy and computational efficiency, emphasizing the potential for hybrid approaches in large-scale text mining applications.
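The rule-based baseline mentioned above can be pictured as exact matching of gene symbols against the text. The sketch below is an assumption about what such a matcher might look like, not the study's implementation. The five gene symbols are a small illustrative list rather than the 161-gene panel, the sentences are invented, and the function names are hypothetical.

```python
import re


def build_gene_pattern(gene_symbols: list) -> re.Pattern:
    """Compile a case-sensitive pattern that matches whole gene symbols only."""
    ordered = sorted(gene_symbols, key=len, reverse=True)  # prefer longer symbols
    alternation = "|".join(re.escape(symbol) for symbol in ordered)
    # Lookarounds stop a symbol from matching inside a longer alphanumeric token.
    return re.compile(rf"(?<![A-Za-z0-9])(?:{alternation})(?![A-Za-z0-9])")


def extract_genes(text: str, pattern: re.Pattern) -> set:
    """Return the distinct gene symbols found in the text."""
    return set(pattern.findall(text))


pattern = build_gene_pattern(["EGFR", "KRAS", "BRAF", "ALK", "MET"])

# Exact symbols are found; "Metformin" and "walking" do not trigger MET or ALK.
print(extract_genes("EGFR-mutant and KRAS tumours; ALK was rare. Metformin and walking.", pattern))

# A spelled-out gene name is missed, which is how recall is lost.
print(extract_genes("Epidermal growth factor receptor overexpression.", pattern))
```

Strict matching of this kind rarely produces false positives but cannot see synonyms or full names, which is consistent with the high-precision, lower-recall profile reported for the rule-based approach.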
Generative AI (Gen AI) is catalyzing a paradigm shift in healthcare, revolutionizing leadership, innovation, and patient-centered care. By leveraging advanced AI models—including Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Transformer-based architectures such as GPT—healthcare organizations are unlocking unprecedented capabilities in diagnostics, clinical decision support, and operational efficiency. With the global Gen AI healthcare market poised for exponential growth, this paper examines its disruptive potential in automating workflows, enhancing medical imaging, accelerating drug discovery, and optimizing personalized treatment strategies. However, widespread adoption presents critical challenges, including regulatory compliance, ethical AI governance, and algorithmic bias. As healthcare leaders navigate this evolving landscape, strategic integration of Gen AI will be imperative in driving innovation, reducing costs, and delivering superior patient outcomes. This conceptual paper provides a forward-looking perspective on how Gen AI is reshaping healthcare leadership and setting new benchmarks for AI-driven transformation.
The use of ChatGPT has steadily increased over the years following its launch. This study evaluated ChatGPT-4o’s diagnostic accuracy in dermatological education case studies, comparing Free Answer and Multiple Choice formats, and assessing the impact of input data. Results showed a better performance in the Free Answer format, as opposed to Multiple Choice. Furthermore, adding patient data to the input images did not improve accuracy. These findings suggest that while ChatGPT-4o can serve as a second-opinion tool in dermatological education, it can do so by supporting, not replacing, the critical thinking process the students should perform when exercising diagnostic capabilities. Proper regulation is needed to ensure ethical and effective implementation of Generative AI in medical education.
We describe a process for selecting informational material for people living with dementia that is most appropriate for their stage of dementia. This material provides information on the disease, on ways of living with the disease, and on activities that may delay the progression of the disease, with a view to maintaining an individual’s independent living at home. Informational material is suggested according to both an individual’s overall clinical disease progression and progression along seven cognitive domain axes that contribute to the overall progression, for example ‘working memory’ performance or ‘attention’, with scores for these axes being derived from clinical judgement, from fixed and wearable sensors, or from performance when interacting with devices. Material is rated according to relevance to overall disease progression and relevance to each of the seven axes. For example, the Alzheimer Society of Ireland ‘Living Day To Day’ document is most relevant to those diagnosed with early dementia onwards, but may also have useful information for those who do not have a dementia diagnosis but are experiencing problems with their working memory or attention.
The use of synthetic data for model training in healthcare has grown over the last 15 years. However, there are several risks associated with this approach, such as privacy issues, where patients’ data may remain identifiable; data quality, where the complexity of real-world scenarios is not captured; and ethical concerns, such as malicious misuse. This paper highlights the challenges of using synthetic data by outlining these risks, to contribute to the development of more reliable and fair solutions.
This study evaluated the dual capability of GPT-4o to both summarize and translate English discharge notes into Spanish, addressing critical language barriers faced by U.S. Latinos. Given that over 40 million U.S. residents are primarily Spanish speakers and that limited English proficiency is associated with adverse clinical outcomes, effective communication is essential. A dataset of 66 discharge summaries from the MTSamples database was used. GPT-4o was deployed via the OpenAI API with a temperature setting of 0.1 and a prompt designed to simulate a medical translator using auto chain-of-thought reasoning. Each generated Spanish summary was assessed by a bilingual physician using a five-point Likert scale across four dimensions: Completeness, Correctness, Conciseness, and Writing Quality. In addition, automated metrics including cosine similarity and compression ratio were computed. Results indicated that the majority of summaries scored high on Completeness, Correctness, and Writing Quality, with over 80% of responses rating these dimensions as excellent, although Conciseness was more moderate. Quantitative analysis revealed cosine similarity values ranging from 0.44 to 0.90 (median 0.66) and compression ratios varying from 0.3 to 1.8, with no significant differences observed across medical specialties. These findings demonstrate that, under controlled conditions, GPT-4o can generate clinically accurate and linguistically fluent Spanish summaries of discharge notes, offering a promising complementary tool for overcoming language barriers in healthcare. Further refinement, particularly in enhancing summarization conciseness, is warranted to optimize patient communication.
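The two automated metrics named above can be illustrated with a short sketch. The abstract does not say how the texts were turned into vectors (a cross-language comparison would normally need multilingual embeddings) or how the compression ratio was oriented, so this example makes two loud assumptions: cosine similarity is computed on simple word-count vectors of same-language strings, and the compression ratio is summary length divided by source length. The sentences and function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase word tokens; \\w also covers accented Spanish letters."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between two word-count vectors (assumed representation)."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def compression_ratio(source: str, summary: str) -> float:
    """Summary length over source length, in tokens (assumed orientation)."""
    return len(tokenize(summary)) / len(tokenize(source))


source = "Patient discharged home in stable condition"
summary = "Patient discharged in stable condition"
print(cosine_similarity(source, summary), compression_ratio(source, summary))
```

Under this orientation, a ratio above 1 means the generated text is longer than its source, which would fit the moderate Conciseness ratings if the study used the same convention.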
With the integration of Artificial Intelligence (AI) and Machine Learning (ML) in medical devices, unprecedented opportunities for automation, precision, and efficiency in healthcare have arisen. However, these advancements also introduce significant challenges, including data integrity, algorithmic transparency, adversarial robustness, and regulatory compliance. Traditional assurance methods fail to capture the dynamic and evolving nature of AI-driven medical systems. To address these challenges, we discuss a wide range of structured assurance case patterns tailored to AI-enabled medical devices. We explore potential risks that ML-based systems will face and design assurance cases to build trustworthy intelligent systems.
The increased dependence on patient safety studies using the MAUDE database underscores the critical need to define and standardize the methods for extracting and analyzing event reports. The lack of reproducible methods leads to an inconsistent understanding of reported events and diminishes their effectiveness in informing clinicians. Thus, an ETL pipeline combined with LLMs was proposed to standardize the identification and interpretation of the reports. Using endoscopic clip reports as an example, the ETL-LLM method demonstrates the effectiveness of extracting and analyzing categorical and narrative reports, helping uncover insights related to patient complications, surgical procedures, and device uses. This innovative and transparent method of examining MAUDE underscores its potential to inform clinicians promptly and encourages more research on patient safety through open-access databases.
According to researchers drawing on the ideas of Jürgen Habermas, Canadian patients and Danish General Practitioners are both experiencing the ‘colonisation’ of their ‘lifeworlds’, though in different ways. Their suggested remedy is to ensure that the clinical encounter, freed of strategic rationality, prioritises Habermasian ‘communicative action’ aimed at mutual understanding. However, Blau argues that such communicative action can, and should be, inextricably interwoven with means-end rationality, rejecting Habermas’ caricature of the latter. In agreement, but taking an operational perspective, we argue that decision support based on Multi-Criteria Decision Analysis can help produce the ‘communicative means-end rationality’ essential in a public health service based on role-respecting sincerity and autonomy.
Background:
Sleep disturbances are a major health issue that has prompted the scientific community to apply machine learning techniques to predict their underlying determinants.
Objective:
The main purpose of this paper is to test the accuracy of machine learning algorithms in the interpretation of sleep problems.
Methods:
A public dataset was used, and multiple feature selection techniques were applied to identify the most influential predictors of sleep disturbances. Explainable AI was used to further interpret how each predictor affects individual predictions.
Results:
Model performance results show that AdaBoost outperformed the other models (71.27% accuracy) and that sleep quality was the dominant predictor (SHAP value 0.01586), indicating the strongest influence on the model’s predictions.
Conclusion:
The incorporation of explainable AI methods (e.g., SHAP) enhances the clinical and public health value of these models, enabling healthcare providers to target specific interventions and potentially improve patients’ sleep health outcomes.
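SHAP attributions are based on Shapley values, which can be computed exactly for a very small model by averaging each feature's marginal contribution over all feature orderings. The sketch below illustrates that idea only. The toy risk function, its coefficients, and the feature names are hypothetical and are unrelated to the study's AdaBoost model or dataset, and the SHAP library itself uses faster approximations.

```python
from itertools import permutations


def shapley_values(predict, instance: dict, baseline: dict) -> dict:
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for ordering in orderings:
        current = dict(baseline)            # start from the reference input
        previous = predict(current)
        for name in ordering:
            current[name] = instance[name]  # switch one feature to its actual value
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(features: dict) -> float:
    """Hypothetical risk model with an interaction between two features."""
    return (0.10
            + 0.05 * features["poor_sleep_quality"]
            + 0.02 * features["stress"]
            + 0.01 * features["age_over_60"]
            + 0.03 * features["poor_sleep_quality"] * features["stress"])


instance = {"poor_sleep_quality": 1, "stress": 1, "age_over_60": 1}
baseline = {"poor_sleep_quality": 0, "stress": 0, "age_over_60": 0}
print(shapley_values(toy_risk, instance, baseline))
```

The attributions always sum to the difference between the prediction for the instance and the prediction for the baseline, which is what makes per-patient explanations of this kind additive and comparable across predictors.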