

Evaluating data quality is essential when combining multi-site observational clinical data for analysis. We collaborated with five research networks, representing diverse data approaches and workflows, to generalize an established data quality checking and report generation tool so that it could be implemented more easily by other research consortia. The resulting approach reduced the need for technical expertise at user sites by leveraging the REDCap data collection software to store details about a research group, its data model, and its expectations about variables (e.g., plausible numeric range, valid format and codes, date logic). The application then used the REDCap API to retrieve those details and assess a dataset’s conformance to the data model, logical consistency, and completeness. Users could download reports summarizing the dataset’s contents and quality. The generalized Harmonist Data Toolkit was built using the freely available REDCap and R/Shiny platforms, with code available on GitHub. All five collaborating consortia found the Toolkit beneficial for detecting inconsistencies and producing informative data reports and visualizations. The Harmonist Data Toolkit fills a need for data quality and report generation solutions in consortia without local programming expertise.
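
To illustrate the general pattern described above (not the Toolkit’s actual implementation), the following minimal R sketch shows how variable expectations stored in a REDCap project might be retrieved via the REDCap API and applied as a plausible-range check. The API URL, token handling, and the check_range helper are hypothetical placeholders; the example assumes the httr and jsonlite packages and uses only documented REDCap metadata export fields (field_name, text_validation_min, text_validation_max).

```r
# Sketch: pull a REDCap data dictionary via the API and flag out-of-range values.
# The endpoint, token source, and helper function are illustrative assumptions.
library(httr)
library(jsonlite)

redcap_url   <- "https://redcap.example.org/api/"   # hypothetical API endpoint
redcap_token <- Sys.getenv("REDCAP_API_TOKEN")      # project-specific token

# Request the project metadata (data dictionary) in JSON format
resp <- POST(redcap_url,
             body = list(token   = redcap_token,
                         content = "metadata",
                         format  = "json"),
             encode = "form")
stop_for_status(resp)
dictionary <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))

# Example check: return row indices whose value falls outside the plausible
# numeric range declared in the dictionary for a given field.
check_range <- function(data, dict, field) {
  rule <- dict[dict$field_name == field, ]
  lo   <- as.numeric(rule$text_validation_min)
  hi   <- as.numeric(rule$text_validation_max)
  vals <- as.numeric(data[[field]])
  which(!is.na(vals) & (vals < lo | vals > hi))
}
```

In this sketch, the results of check_range (and analogous checks for valid codes, formats, and date logic) could be summarized into the kind of downloadable data quality report the abstract describes.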