

Federated learning (FL) has the potential to transform healthcare by enabling collaborative data analysis while keeping data decentralized. Monitoring data quality is crucial for successful FL in healthcare, because undetected issues can compromise model reliability and fairness. This project develops and evaluates a cross-silo FL system for healthcare data using the Flower framework, focusing on monitoring performance metrics and data quality issues such as label imbalance and feature drift. Using a harmonized synthetic dataset from the LETHE project, the FL system was tested on five clients with varying data distributions, including one with a heavily imbalanced dataset. Accuracy, loss, and the Matthews correlation coefficient (MCC) were tracked in real time using Prometheus and visualized in Grafana. The baseline model outperformed the federated model on all three metrics (accuracy: 0.806 vs. 0.754, loss: 0.416 vs. 0.545, MCC: 0.445 vs. 0.349), although the federated model remained competitive. The customized FedAvg, which takes label distribution and dataset size into account, substantially improved the global model’s MCC (0.349 vs. 0.111). Feature drift detection using the Kolmogorov-Smirnov test was successfully integrated into the monitoring system with a visual alert. The effectiveness of the customized FedAvg approach across diverse data distributions was not thoroughly assessed and remains a topic for future research. In addition, the system could be extended to non-harmonized datasets, advanced privacy techniques could be integrated, and the impact of different types of feature drift on model performance could be investigated. This project demonstrates the importance of monitoring systems for ensuring reliable and fair FL models in healthcare applications.
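
To illustrate the kind of check behind the feature drift alert, the following is a minimal sketch of a two-sample Kolmogorov-Smirnov test on a single numeric feature. It is not the project's implementation: the function names `ks_statistic` and `drift_detected`, the significance level of 0.05, and the use of the large-sample critical value are all assumptions made for illustration. In practice a library routine such as `scipy.stats.ks_2samp` would likely be used instead of this pure-Python version.

```python
import math


def ks_statistic(reference, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    n, m = len(ref), len(cur)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        # Step to the next smallest unprocessed value across both samples.
        x = min(ref[i], cur[j])
        while i < n and ref[i] <= x:
            i += 1
        while j < m and cur[j] <= x:
            j += 1
        # i / n and j / m are the empirical CDFs of each sample at x.
        d = max(d, abs(i / n - j / m))
    return d


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the asymptotic critical value.

    Hypothetical helper: alpha and the large-sample approximation
    c(alpha) * sqrt((n + m) / (n * m)) are illustrative choices.
    """
    n, m = len(reference), len(current)
    d = ks_statistic(reference, current)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return d > critical, d, critical
```

A monitoring loop could call `drift_detected` with a stored reference sample of a feature and the values a client currently holds, then export the returned statistic as a gauge so that a dashboard alert fires when the flag is true.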