Abstracts of the 2019 Annual Conference of the International Society for Environmental Epidemiology, August 25-28 2019, Utrecht, the Netherlands

The CHEAR Data Repository: Facilitating children’s environmental health and exposome research through data harmonization, pooling and accessibility

J, Stingone1; P, Pinheiro2; J, Meola3; J, McCusker2; S, Bengoa3; P, Kovatch3; D, McGuinness2; S, Teitelbaum3

Environmental Epidemiology: October 2019 - Volume 3 - Issue - p 382
doi: 10.1097/01.EE9.0000610256.39316.c4

TPS 691: Methods of measurement, design and data analysis, Exhibition Hall, Ground floor, August 28, 2019, 3:00 PM - 4:30 PM

Funded by the U.S. National Institute of Environmental Health Sciences, the Children’s Health Exposure Analysis Resource (CHEAR) provides scientific investigators with access to laboratory and statistical analyses aimed at incorporating and expanding environmental exposures within their research. To benefit the broader research community, the CHEAR Data Center has created a public data repository that houses deidentified data from studies accepted into the CHEAR program. To date, 26 studies have submitted data covering > 41,000 specimens, > 3,000 mothers and children, and 139 environmental chemicals. The goal of this repository is to promote the secondary analysis of pooled CHEAR studies by providing data in a manner that is findable, accessible, interoperable and reusable (FAIR). The repository was constructed by coupling the open-source Human-aware Data Acquisition Framework with semantic annotation templates that transform CHEAR datasets into machine-readable knowledge graphs. These tools facilitate the ingestion, semantic mapping, harmonization and accessibility of data (epidemiologic, clinical and biomarker) and metadata across the multiple studies within the CHEAR program. We demonstrate how users of the public repository can simultaneously search, view, and download data from multiple CHEAR studies. The repository can be searched by factors including health outcomes, biological markers of exposure and common covariates. Because data have been harmonized to a common vocabulary (the CHEAR ontology), downloaded datasets automatically contain CHEAR-wide harmonized codes and labels for variables that are present in multiple studies. By selecting common data elements, users can create customized datasets with accompanying codebooks in a format that is easily imported into statistical analysis software. For maximal FAIR impact, we have promoted the CHEAR data, tools and methods through Google Dataset Search, GitHub, and BioPortal.
The repository will encourage secondary analysis of pooled CHEAR studies, facilitating investigations that leverage larger sample sizes and greater exposure variability.
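
To illustrate the general idea of pooling studies on harmonized variables, the following is a minimal sketch and not the repository's actual implementation. All names here are hypothetical: the study identifiers, the local column names, the `HYP:` codes and their labels are invented for illustration and are not CHEAR ontology terms, and the sketch does not use the Human-aware Data Acquisition Framework. It assumes each study arrives as a list of records with study-specific column names plus a mapping from those names to shared codes.

```python
# Hypothetical harmonized codes and labels (NOT actual CHEAR ontology terms).
HARMONIZED_LABELS = {
    "HYP:child_age": "Child age (years)",
    "HYP:bmi": "Body mass index (kg/m^2)",
    "HYP:lead": "Blood lead (ug/dL)",
}

# Hypothetical per-study mappings from local column names to harmonized codes.
STUDY_MAPPINGS = {
    "study_A": {"age_yrs": "HYP:child_age", "bmi": "HYP:bmi", "pb": "HYP:lead"},
    "study_B": {"AGE": "HYP:child_age", "lead_ugdl": "HYP:lead"},
}


def harmonize(study_id, rows):
    """Rename each study's local columns to the shared harmonized codes."""
    mapping = STUDY_MAPPINGS[study_id]
    harmonized = []
    for row in rows:
        record = {"study": study_id}
        for column, value in row.items():
            if column in mapping:  # unmapped columns are dropped
                record[mapping[column]] = value
        harmonized.append(record)
    return harmonized


def pool(datasets, min_studies=2):
    """Pool studies on variables mapped in at least `min_studies` studies.

    Returns the pooled records and a codebook (code -> label).
    """
    # Count how many of the supplied studies map a column to each code.
    counts = {}
    for study_id in datasets:
        for code in set(STUDY_MAPPINGS[study_id].values()):
            counts[code] = counts.get(code, 0) + 1
    common = sorted(code for code, n in counts.items() if n >= min_studies)

    pooled = []
    for study_id, rows in datasets.items():
        for record in harmonize(study_id, rows):
            # Keep only the common variables; missing values become None.
            pooled.append(
                {"study": study_id, **{code: record.get(code) for code in common}}
            )
    codebook = {code: HARMONIZED_LABELS[code] for code in common}
    return pooled, codebook


# Invented example records, one participant per study.
example = {
    "study_A": [{"age_yrs": 7, "bmi": 16.2, "pb": 1.1}],
    "study_B": [{"AGE": 9, "lead_ugdl": 0.8}],
}
pooled_rows, codebook = pool(example)
```

In this sketch, body mass index is mapped in only one of the two studies, so it is left out of both the pooled records and the codebook, while age and lead are retained under their shared codes.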

Copyright © 2019 The Authors. Published by Wolters Kluwer Health, Inc. on behalf of Environmental Epidemiology. All rights reserved.