A Latent Trait-based Measure as a Data Harmonization and Missing Data Solution Applied to the Environmental Influences on Child Health Outcomes Cohort.

BACKGROUND

Collaborative research consortia provide an efficient way to increase sample size, enabling evaluation of subgroup heterogeneity and rare outcomes. In addition to the missing data challenges faced by all cohort studies, such as nonresponse and attrition, collaborative studies have missing data arising from differences in study design and measurement across the contributing studies.

METHODS

We extend ROSETTA, a latent variable method that creates common measures across datasets collecting the same latent constructs with only partial overlap in measures, to define a common measure of socioeconomic status (SES) across cohorts with varying indicators in the Environmental influences on Child Health Outcomes Cohort, a consortium of pregnancy and pediatric cohorts.

RESULTS

Starting with 52 indicators of prenatal SES from 39,372 participants across 53 cohorts, ROSETTA created three factors representing key domains of SES: income and education, insurance and poverty, and unemployment. At least one factor score was available for 34,528 participants; two factors were available for more participants than any single indicator. Factors fit the data well, had content validity, and were correlated with alternative measures of SES (for income & education factor, r = 0.40-0.89). Higher SES as measured by the factor scores was associated with lower odds of prenatal smoking (for income & education factor, OR = 0.42; 95% CI 0.38, 0.45). Missing data were reduced compared to most methods, except for multiple imputation.
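
The coverage result above (a factor score being available for more participants than any single indicator) follows from partial overlap in measures across cohorts. The sketch below illustrates only that bookkeeping on a hypothetical toy dataset. It is not the ROSETTA method and fits no latent variable model; the indicator names, cohort labels, and values are invented for illustration, and "scorable" here simply assumes that a participant with at least one observed indicator in a domain could receive a score.

```python
# Hypothetical toy data: three cohorts, each collecting a different subset
# of SES indicators. None marks an indicator that the cohort did not collect
# or that the participant did not answer. All names and values are invented.
INDICATORS = ["income", "education", "insurance"]

participants = [
    # Cohort A collects income and education.
    {"cohort": "A", "income": 3, "education": 2, "insurance": None},
    {"cohort": "A", "income": None, "education": 4, "insurance": None},
    # Cohort B collects education and insurance.
    {"cohort": "B", "income": None, "education": 1, "insurance": 1},
    {"cohort": "B", "income": None, "education": None, "insurance": 0},
    # Cohort C collects income only.
    {"cohort": "C", "income": 5, "education": None, "insurance": None},
    {"cohort": "C", "income": None, "education": None, "insurance": None},
]


def indicator_coverage(rows, indicators):
    """Count participants with an observed value for each single indicator."""
    return {name: sum(r[name] is not None for r in rows) for name in indicators}


def any_indicator_coverage(rows, indicators):
    """Count participants with at least one observed indicator."""
    return sum(any(r[name] is not None for name in indicators) for r in rows)


coverage = indicator_coverage(participants, INDICATORS)
scorable = any_indicator_coverage(participants, INDICATORS)

print("Per-indicator coverage:", coverage)
print("Participants with at least one indicator:", scorable)
```

In this toy pooled dataset no single indicator covers more than half of the participants, yet all but one participant has at least one observed indicator, which is the kind of gain a common latent measure can exploit.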

CONCLUSIONS

ROSETTA aids in pooled analysis of individual participant data by creating measures on a common scale and maximizing data in the presence of missing and mismatched measures.

Investigators
Abbreviation
Epidemiology
Publication Date
2025-01-30
Pubmed ID
39884749
Medium
Print-Electronic
Full Title
A Latent Trait-based Measure as a Data Harmonization and Missing Data Solution Applied to the Environmental Influences on Child Health Outcomes Cohort.
Authors
Knapp EA, Kress AM, Ghidey R, Gorham TJ, Galdo B, Petrill SA, Aris IM, Bastain TM, Camargo CA, Coccia MA, Cragoe N, Dabelea D, Dunlop AL, Gebretsadik T, Hartert T, Hipwell AE, Johnson CC, Karagas MR, LeWinn KZ, Maldonado LE, McEvoy CT, Mirzakhani H, O'Connor TG, O'Shea TM, Wang Z, Wright RJ, Ziegler K, Zhu Y, Bartlett CW, Lau B,