Can a Web-Based Recruitment Tool for Genomic Analysis be Valid?

Talan, Jamie

doi: 10.1097/01.NT.0000405140.28789.76


Two novel genetic loci were associated with Parkinson disease (PD), and 20 other regions were confirmed, in a genome-wide association study (GWAS) published in the June 23 issue of Public Library of Science Genetics. But the new genetic associations aside, the most intriguing aspect of the study was the web-based methodology used to recruit participants, a research model that, various investigators told Neurology Today, offers both opportunities and limitations.

To recruit subjects for the PD analysis, investigators teamed up with scientists from 23andMe, a for-profit company that charges individuals a fee for reports on their genetic risks for disease, to identify patients and collect web-based questionnaires and mailed-in saliva kits for genome-wide association screening.

The company collaborated with PD organizations, including the Parkinson's Institute and the Michael J. Fox Foundation. The organizations reached out to their members through an e-mail campaign and at conferences, asking them to take part in the study. The scientists enrolled 3,426 self-reported PD patients. This Internet-based recruitment drew participants from across North America and took only 18 months.

With genomic and phenotypic information in hand, company statisticians analyzed the data and compared them with data from 29,624 controls in the 23andMe database. People self-reporting movement disorders other than PD were excluded from the study. The scientists hoped that the model would succeed in identifying new markers associated with PD.

“This is the largest case-control GWAS study of Parkinson's based on a single dataset,” said J. William Langston, MD, chief executive officer, scientific director and founder of the Parkinson's Institute in Sunnyvale, CA. “It is a fascinating model,” said Dr. Langston, an investigator in the study. “This is the wave of the future, whether we think it's the right thing or not. The numbers of potential subjects they can get is phenomenal.”

Scientists not involved with the study said that the idea behind the research model is compelling, but there are major hurdles to overcome.

Margaret Sutherland, PhD, director of the neurodegeneration program at the NINDS, invited the scientists to present their data at a NINDS workshop on genome-wide association screening and neurodegenerative disorders last April. Referring to the web-based model of collecting data, Dr. Sutherland said: “It feeds into how people interact and get information these days. Looking at unique ways to engage people to join research studies is a good idea. It could be advantageous to science.”

One caveat is that the study relies on medical information obtained directly from the individual, said Matthew Farrer, PhD, a professor of medical genetics and Canada Excellence Research Chair in Neurogenetics at the University of British Columbia. PD is best diagnosed by a neurologist with specific training in movement disorders who can evaluate the patient over an extended period, he said.

Early in the disease the signs and symptoms are subtle, progression is insidious, and misdiagnosis is common, even among physicians, he noted. In the current study, only 84 percent of the subjects provided detailed information about their disease progression.

The study may suffer from an epidemiological bias, Dr. Farrer said. The study members represent a unique community of like-minded participants who may have a lot more in common with one another than with others in the general population. Every time scientists identify an association they have to ask whether it is due to the polymorphism or to some other characteristic shared by study subjects.

Dr. Farrer said a complementary GWAS approach might focus on detailed clinical characterization of patients from ethnically homogeneous populations, where less sample genotyping may provide novel results — and where access to the web is still uncommon.

Still, he added: “23andMe's approach is clever. I applaud it. Getting participation through the web is a wonderful idea.”

Dr. Farrer said he suspects that it may be easier to get people involved in research without institutional ownership, where university institutional review boards (IRBs), consents, and intellectual property issues can become an impediment to legitimate scientific research.

“This type of study takes away the legal paperwork,” he pointed out. However, he added, “there is a risk of marginalizing neurologists who are critical in diagnosing a true case of Parkinson's and in managing the disease. A model must be devised that involves their intellectual contribution.”

“It might not be possible to ask people if they have Parkinson disease — or any other disease — and always get an accurate answer,” said 23andMe statistician and co-author Nicholas Eriksson, PhD. But he said that they conducted in-house calculations of the study methodology to see what would happen if some of the controls were really cases and vice versa.
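The kind of sensitivity check Dr. Eriksson describes can be illustrated with a minimal sketch. It is not the company's actual calculation, which the article does not detail. The allele frequencies and mislabeling rates below are hypothetical values chosen only for illustration, and the function names are invented for this example. The sketch mixes true cases and true controls according to assumed mislabeling rates and shows how the observed odds ratio shifts.

```python
def odds(p: float) -> float:
    """Convert a frequency (0 < p < 1) into odds."""
    return p / (1.0 - p)


def observed_odds_ratio(
    freq_cases: float,
    freq_controls: float,
    false_case_rate: float,
    false_control_rate: float,
) -> float:
    """Expected allelic odds ratio when some labels are wrong.

    freq_cases / freq_controls: risk-allele frequency in true cases
        and true controls (hypothetical inputs).
    false_case_rate: fraction of self-reported cases who do not
        really have the disease.
    false_control_rate: fraction of controls who really have it.
    """
    # Each labeled group is a mixture of true cases and true controls.
    mixed_cases = (1 - false_case_rate) * freq_cases + false_case_rate * freq_controls
    mixed_controls = (1 - false_control_rate) * freq_controls + false_control_rate * freq_cases
    return odds(mixed_cases) / odds(mixed_controls)


# Hypothetical example: risk allele at 40% in true cases, 36% in true controls.
true_or = observed_odds_ratio(0.40, 0.36, 0.0, 0.0)
# Assume 10% of self-reported cases and 1% of controls are mislabeled.
noisy_or = observed_odds_ratio(0.40, 0.36, 0.10, 0.01)

print(f"odds ratio with perfect labels: {true_or:.3f}")
print(f"odds ratio with mislabeling:    {noisy_or:.3f}")
```

Under these assumptions, mislabeling pulls the observed odds ratio toward 1 rather than creating a spurious signal, which is why a sufficiently large sample can still detect a true association despite noisy self-reports.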

The two novel associations were replicated in a meta-analysis of studies that represented 12,000 cases and 21,000 controls carried out by the International Parkinson's Disease Genomics Consortium (IPDGC). At the same time as the 23andMe study was being conducted, the IPDGC was performing its own meta-analysis of Parkinson's disease datasets, most of which were from European countries. Prior to publication, the IPDGC scientists reached out to 23andMe, and both groups agreed to exchange associations for the purpose of replication. As a result of this exchange, the IPDGC scientists were able to confirm both of the novel associations identified by 23andMe, and 23andMe scientists were able to replicate five of the seven new associations identified by the IPDGC, said Dr. Eriksson.

Dr. Langston said that the epidemiological tools in place in the 23andMe platform — questionnaires that gather information on environmental exposures, family histories, health and disease — must be validated if they are to be used for research purposes. He agreed that the potential error from the use of self-reported data is a downside to the research. But, he added: “With these kinds of numbers you can tolerate a lot of noise.”



Dr. Langston is now trying to validate the environmental data with another study funded by the Michael J. Fox Foundation for Parkinson's Research. Dr. Langston and his colleagues will study eight environmental exposures using both web-based questions and in-person exams and assess whether the data match up.

23andMe is collaborating with the IPDGC and other groups in order to conduct another meta-analysis of known loci and will attempt to validate these new genetic findings. This larger study seeks to combine data from all of the major Parkinson's disease genetics groups around the world, and will include data from both the IPDGC and 23andMe. This meta-analysis is currently in progress, and the results have yet to be published. Dr. Sutherland said that the study should allow scientists to see whether “there is lasting power to this approach.”

Scientists said the company's extensive database, which includes 1,000 phenotypes, could be useful for other research studies as well. “The genotyping is a one-time cost and should be used in subsequent association studies,” said Dr. Farrer. Examples include analyses of onset age, rates of progression, motor and non-motor features (such as cognitive decline and dementia), and responses to medications. The challenge for GWAS is enabling a standardized and longitudinal phenotypic assessment.

The study was funded by a grant from Google co-founder Sergey Brin, who is married to 23andMe co-founder Anne Wojcicki. The investigators now have a total of 5,000 self-reported PD patients, and the grant will allow as many as 10,000 to enroll in the study.

GWAS can only provide hints about where to look for disease genes. “It is a long road from a GWAS locus to identifying the underlying gene and specific variant,” said Dr. Farrer. “Without it, we don't understand the mechanism underlying the association or the biology of disease. This is a prerequisite to ‘translate’ genetic discovery, to improve diagnosis and treatment.”

Using this new methodology to recruit large groups of patients and controls, the scientists identified two novel single nucleotide polymorphisms (SNPs): rs6812193 near SCARB2 and rs11868035 near SREBF1/RAI1. The associations were replicated in an independent meta-analysis. They also replicated previously discovered genetic associations, including LRRK2, GBA, SNCA, MAPT, GAK, and the HLA region.

The analysis of the 23andMe data also led the scientists to conclude that genes play a bigger role than had been previously thought. In Parkinson disease, genetics has always taken a backseat to environment, and that is still the case. Pathogenic mutations in known genes explain only about 6 to 7 percent of the variability. The 23andMe study suggests the heritability of the disease is about 0.27, which means that roughly one-fourth of the variation in risk for Parkinson disease can be attributed to genetics. This concurs with the latest estimates from the longitudinal evaluation of monozygotic twin pairs (Wirdefeldt K et al., 2011).

But Matthew Farrer, PhD, a professor of medical genetics and Canada Excellence Research Chair in Neurogenetics at the University of British Columbia, added that it is important to interpret the results given the methodology. “Heritability will vary from population to population, as does the frequency of genetic susceptibility variants.”

He added that the quality of diagnosis, population structure and age of the study recruits may all have a profound influence on heritability estimates and the discoveries that can be made. The complexity and uncertainty of the diagnosis is also a problem in identifying genetic associations, and this is particularly an issue with an age-associated neurodegenerative disorder believed to have an environmental trigger, he explained.

Heritability is usually defined as the proportion of total phenotypic variation that is due to additive genetic factors. The other major component of variance is generally ascribed to environmental influences. For almost every trait that GWAS have attempted to map, associated SNPs appear to explain only a small proportion of the genetic variation in the sample from that population.
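In symbols, this definition can be sketched as follows. The notation is conventional shorthand rather than anything taken from the study: V_A stands for additive genetic variance, V_E for environmental variance, and V_P for total phenotypic variance. The sketch makes the simplifying assumption that everything not additive-genetic is lumped into V_E.

```latex
\[
  h^2 = \frac{V_A}{V_P}, \qquad V_P \approx V_A + V_E
\]
\[
  h^2 \approx 0.27 \;\Longrightarrow\; V_A \approx 0.27\,V_P, \quad V_E \approx 0.73\,V_P
\]
```

Read this way, the reported estimate of about 0.27 attributes roughly one-fourth of the variation in risk to additive genetic factors and leaves the remaining three-fourths to everything else.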

Eriksson N, Macpherson JM, Naughton B, et al. 2010. Web-based, participant-driven studies yield novel genetic associations for common traits. PLoS Genet 2011; E-pub 2011 Jun 23.

International Parkinson's Disease Genomics Consortium (IPDGC), Wellcome Trust Case Control Consortium 2 (WTCCC2) 2011. A two-stage meta-analysis identifies several new loci for Parkinson's disease. PLoS Genet 2011; E-pub 2011 Jun 30.

Wirdefeldt K, Gatz M, Pedersen NL, et al. Heritability of Parkinson disease in Swedish twins: A longitudinal study. Neurobiol Aging 2011; E-pub 2011 Apr 9.

©2011 American Academy of Neurology