Why Harmonize

The breadth of data collected by health studies around the world provides invaluable opportunities to advance scientific knowledge. The quest for larger sample sizes, the need for valid cross-study comparisons, and the need to make optimal use of existing data have led to increased interest in co-analyzing data across studies. However, heterogeneity between studies in design, recruitment methods and selection criteria, data collection time frame, and measures collected limits our capacity to easily compare or integrate data.

Generating inferentially equivalent (harmonized) content across studies is essential to supporting such comparison or integration. Harmonization involves achieving or improving the comparability of similar measures collected by separate studies or databases for different individuals. Investigators can foster prospective harmonization (i.e., implementing standard procedures across studies prior to data collection). This makes data integration relatively straightforward, since compatible protocols and data collection tools are employed across studies. However, it is not always relevant or possible to implement common protocols, and investigators are increasingly opting for retrospective harmonization to support the integration of data collected by pre-existing studies. Because the datasets have already been collected, retrospective harmonization also makes fuller use of existing research data.