Sickle cell disease (SCD) is a monogenic disorder, yet the severity and specific organ involvement vary between patients. This clinical variation is influenced by environmental and genetic determinants. This whole genome sequencing (WGS) study aims to define the latter. Specifically, the study seeks to discover genetic variants that predict outcome in sickle cell disease to better inform treatment decisions and discover new therapies according to the principles of genetic medicine. To that end, the Sickle Genome Project, a WGS strategy, was undertaken to define genomic variation and modifiers of SCD.
This was a collaborative project between St. Jude Children's Research Hospital (St. Jude), St. Jude Affiliate Hospitals, and Baylor College of Medicine Texas Children's Hospital Hematology Center (TCHC). Participants were identified from the Sickle Cell Clinical Research and Intervention Program (SCCRIP) led by Jane Hankins, MD, MS, and Jeremie Estepp, MD, and from TCHC protocols led by Vivien Sheehan, MD, PhD (Pediatr Blood Cancer 2018;65:e27228).
Results from this study were presented at the 2018 ASH Annual Meeting by Evadnie Rampersaud, PhD, Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, Tenn. Regarding this ambitious project, Rampersaud stated, “Ours is one of the first whole genome sequencing studies in sickle cell populations, a normally understudied group.”
Objectives & Methods
Objectives of this study include validating known genotype-phenotype associations for SCD, discovering new genetic modifiers, developing new bioinformatic strategies to analyze genomic data relevant to SCD, and generating a facile Cloud-based platform for data sharing. The last objective is particularly important since most genomic studies of SCD fail to achieve statistical significance due to small sample size.
Analysis of the WGS data was performed by aligning paired-end 150 base-pair reads to the GRCh38 human reference using the Burrows-Wheeler Aligner; sequencing achieved 30x average coverage. In all study participants, genetic variants were called using the GATK best-practices workflow. To detect α-thalassemia deletions, local de novo assembly of WGS data and coverage depth analysis were used. Association of clinical traits (fetal hemoglobin [HbF], red blood cell traits) with specific genetic variants was assessed using mixed model regression analysis. Time-to-event analyses for albuminuria and vaso-occlusive pain were performed using Cox proportional hazards models.
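To make the coverage figures above concrete, here is a minimal Python sketch of the two ideas involved: average coverage as sequenced bases divided by genome size, and a depth-ratio estimate of copy number for a suspected deletion. It is an illustration only, not the study's pipeline. The function names, the roughly 3.1 Gb genome size, the read count, and the depth values are all assumptions chosen for the example.

```python
from statistics import median


def mean_coverage(n_reads, read_length_bp, genome_size_bp):
    """Average depth = total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def estimate_copy_number(region_depths, flank_depths, ploidy=2):
    """Estimate copies of a region from per-base read depth.

    flank_depths is assumed to come from nearby sequence with the normal
    two copies; both inputs are hypothetical lists of per-base depths.
    """
    baseline = median(flank_depths)
    if baseline <= 0:
        raise ValueError("flanking depth must be positive")
    # Depth scales with copy number, so the ratio to the flanks times the
    # expected ploidy gives the estimated number of copies.
    return round(ploidy * median(region_depths) / baseline)


# Assumed figures: ~3.1 Gb genome, 620 million reads of 150 bp each.
print(mean_coverage(620_000_000, 150, 3_100_000_000))  # 30.0
# Depth in the region is about half the flanking depth: one copy remains.
print(estimate_copy_number([15, 16, 14, 15], [30, 31, 29, 30]))  # 1
```

Medians are used rather than means so that a few bases with unusually high or low depth do not dominate the estimate. A real analysis would also need to account for GC content and mappability, which this sketch ignores.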
In this study, 51.1 percent of the patients included were male. At the St. Jude Center, there were 503 study participants (252 female and 251 male) with a mean age of 6.77 years, while at the Baylor College of Medicine TCHC, there were 368 patients (194 male and 174 female) with a mean age of 6.84 years. The patient populations in both cohorts (St. Jude and Baylor) had similar ethnicity and hematological characteristics.
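The cohort figures above can be cross-checked with a few lines of Python. The counts are taken from the text; the variable names are only illustrative.

```python
# Participant counts by site and sex, as reported in the article.
cohorts = {
    "St. Jude": {"female": 252, "male": 251},
    "Baylor TCHC": {"female": 174, "male": 194},
}

site_totals = {site: sum(counts.values()) for site, counts in cohorts.items()}
total = sum(site_totals.values())
total_male = sum(counts["male"] for counts in cohorts.values())
percent_male = round(100 * total_male / total, 1)

print(site_totals)   # {'St. Jude': 503, 'Baylor TCHC': 368}
print(total)         # 871
print(percent_male)  # 51.1
```

The site totals and the overall male percentage agree with the reported values, giving a combined cohort of 871 participants.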
Most known genetic modifiers of SCD were validated. These include associations between red cell HbF levels and common variants in MYB and BCL11A genes. In general, patients with elevated HbF levels experience fewer SCD-related problems. Researchers also confirmed associations between serum bilirubin and UGT1A1 variants that cause Gilbert's disease in the general population and predict increased risk for gallstones in patients with SCD. Variants in the APOL1 gene are common in people of African descent and predispose to kidney disease. The study showed that pediatric SCD patients with APOL1 risk alleles are more likely to develop albuminuria (an early sign of kidney damage) at a younger age (Abstract 118912). Thus, the APOL1 status in SCD may predict earlier progression to chronic kidney disease and identify patients who will benefit from early medical intervention with anti-sickling and/or renoprotective treatments.
“A key advantage of combining WGS with the SCCRIP longitudinal cohort studies is the potential to use germline genetic data collected at baseline (when participants were enrolled) to predict trends in disease progression over time,” Rampersaud commented. “This is important because, even though sickle cell disease is caused by a single mutation, the patients often experience variable disease manifestations according to genetic influences such as the potential for accelerated kidney disease with APOL1 risk variants and reduced rate of organ disease with MYB and BCL11A variants that predict increased HbF levels.”
Both α- and β-thalassemia are common genetic disorders that frequently coexist with SCD and can potentially modify the course of the disease. Somewhat ironically, α- and β-thalassemia are frequently caused by genomic deletions that are difficult to identify by WGS and require specialized testing. Researchers developed bioinformatic algorithms to identify thalassemia mutations from WGS data and verified that these mutations track with red cell traits, such as red cell size and hemoglobin content, in the study cohort. Interestingly, of 38 patients who were classified by clinical parameters as having the HbSβ0 genotype, 18 (47%) actually had genotype HbSS. Thus, WGS of SCD patients facilitates accurate genetic diagnosis of concomitant thalassemias.
The study also identified a candidate locus for predisposition to pain events. In addition, researchers are developing methods to ascertain blood types more accurately via WGS for SCD patients who require transfusions. This could eventually improve blood donor selection, reduce the formation of anti-red cell antibodies, and thereby improve transfusion outcomes.
When asked about the implications for the most commonly noted single nucleotide polymorphisms identified in their WGS study, Rampersaud replied, “We have validated most of the known common genetic associations that impact SCD, including HbF with BCL11A and MYB, hyperbilirubinemia with UGT1A1, kidney damage with APOL1, and thalassemias with certain red cell traits.
“These variants are ‘the low hanging fruit’ and likely represent the tip of the iceberg; identifying the less common variants or those that have relatively small effects on organ damage in SCD and understanding how different variants act in combination require more patient data via inter-institutional collaborations,” Rampersaud noted. “Most SCD genetic association studies have only a few hundred subjects. However, in contrast, good genetic epidemiology studies for major diseases have tens or even hundreds of thousands of subjects. We hope that our cloud-based sharing platform developed in collaboration with the Department of Computational Biology and Department of Information Services (hosted in St. Jude Cloud at https://www.stjude.cloud/) will foster collaborations. The TOPMed program at NIH has similar goals and we will deposit our data there as well.”
When asked how the study's findings might guide precision medicine-based therapies for sickle cell disease, Rampersaud stated, “Identification of genes that put SCD patients at increased risk for organ damage or drug metabolism could impact treatment decision. Precision-based medicine for SCD is already happening at St. Jude and elsewhere. For example, codeine is a useful drug for pain, but can be dangerous for some patients because they have genetic variants that either speed up or slow down its metabolism, which can result in dangerously high or ineffectively low blood levels.
“Mary Relling's group in the Department of Pharmacology at St. Jude started applying genetic testing for these variants several years ago and we are now using this information clinically to guide our prescribing of codeine,” she further explained.
“Improved blood typing by DNA analysis may reduce the rate of red cell alloimmunization by allowing clinicians and blood banks to better match blood donors with patients who require blood transfusions. Patients who have APOL1 risk alleles that predispose to kidney disease may benefit from earlier treatment with drugs that protect the kidney,” she observed.
When noting the study's strengths, Rampersaud stated, “The key to successful genetic association studies is high-quality genomic information and accurate, consistent clinical phenotyping. Our study appears to have both, as evidenced by our ability to validate known associations. At the end of the day, data quality is the key.”
Regarding study limitations, she noted, “Our study size is too small to detect most genetic signals and we do not yet have the benefit of time that will allow us to track genetic variants with the kinetics of disease progression.”
In discussing future directions for this important research, Rampersaud commented, “We will expand the cohort and continue following these patients longitudinally. As these children age into adulthood, organ damage occurs, so our ability to detect these changes as the patients age, along with the genetic data, provide a powerful tool to understand the prognostic value of genetic markers in sickle cell disease. This truly can allow us to detect genetic risk factors that may be used in the future as screening tools for better management.” As noted earlier, data sharing and multi-center collaboration are essential.
“This work represents a large, multidisciplinary effort with input from pediatric hematologists, computational biologists, software engineers, web designers, and geneticists,” Rampersaud explained. “All of us at St. Jude are dedicated to improving the lives of SCD patients through evidence-based clinical care and research. In this regard, the patients and their families are major stakeholders and we thank them for participating in the research.”
Richard Simoneaux is a contributing writer.