ARTICLE IN BRIEF
From basic research on the mouse cortex to the search for early signs of seizures, investigators are using crowdsourcing techniques to elicit more data and answers to basic and clinical neurology questions. Here, they discuss the challenges and opportunities of next-generation research techniques.
Increasingly, researchers are using crowdsourcing techniques to gather data from a broader array of sources, drawing on everything from smart watches that track respiration to contests for developing better algorithms to predict seizures.
The investigators say that crowdsourcing for data enables easier access to huge datasets, faster development, and larger outreach to other researchers. But it's not as simple as launching an app and putting the data online.
Data collection is tricky, and everyone does it a little differently — even on the same subject, several investigators told Neurology Today. Holding contests to find algorithmic answers to neurologic problems might seem novel now, but what happens when interest dies out? And focusing on specific goal-oriented research requires a dramatic change in how the government and academic institutions reward progress.
Problems can emerge when people input and extract the data in inconsistent ways, for example. “There's data in and there's data out, and there are issues on both ends,” said Joshua T. Vogelstein, PhD, an assistant professor of biomedical engineering at Johns Hopkins University and its Institute for Computational Medicine. Dr. Vogelstein is a co-founder, with computer scientist Randal Burns, of the Open Connectome Project, which stores large-scale neurologic data in the cloud, to “allow scientists to generate and test theories of brain function and dysfunction.”
The project, which includes a 10-terabyte image dataset from a mouse cortex, is entirely online and free to access, as is the code everyone is writing to do the analysis. The cloud-based data enables users to view and analyze the images with special image-processing tools to identify neurons and synapses, and then help annotate them.
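The workflow described above — pulling a small piece of a huge cloud-hosted image volume and marking structures of interest in it — can be illustrated with a minimal sketch. This is not the Open Connectome Project's actual API; the `cutout` and `annotate_bright_voxels` functions are hypothetical stand-ins, and the "volume" here is random data rather than real microscopy.

```python
import numpy as np

def cutout(volume, z, y, x):
    """Extract a subvolume from a larger image volume,
    mimicking a cloud cutout service (hypothetical interface)."""
    return volume[z[0]:z[1], y[0]:y[1], x[0]:x[1]]

def annotate_bright_voxels(subvol, threshold):
    """Toy 'annotation' step: flag voxels above an intensity
    threshold as candidate locations to review and label."""
    return np.argwhere(subvol > threshold)

# Stand-in for a slab of electron-microscopy data.
rng = np.random.default_rng(0)
volume = rng.integers(0, 255, size=(16, 64, 64), dtype=np.uint8)

# Fetch a small region and flag candidate voxels within it.
sub = cutout(volume, (0, 4), (0, 32), (0, 32))
marks = annotate_bright_voxels(sub, 250)
print(sub.shape, len(marks))
```

In a real pipeline the cutout call would be an HTTP request against the hosted data store, and the annotations would be written back to a shared channel so other users can see and refine them.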
In creating the project, Dr. Vogelstein said the developers learned that they needed to create a template, so that all the data would fit the same standards. Now when someone sends data for the project, Dr. Vogelstein's team writes a customized script so the data can be ingested into the database.
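The template-plus-ingestion-script pattern described above can be sketched as follows. The field names, types, and table layout here are invented for illustration (the article does not describe the project's actual schema); the point is that a contributor's file is coerced to one shared template before anything reaches the database.

```python
import csv
import io
import sqlite3

# Hypothetical shared template: every contributed record must
# supply these fields, coercible to these types.
TEMPLATE = {"subject_id": str, "region": str, "voxel_um": float}

def conform(row):
    """Coerce one contributed record to the shared template,
    raising KeyError/ValueError if a field is missing or malformed."""
    return {field: cast(row[field]) for field, cast in TEMPLATE.items()}

def ingest(csv_text, conn):
    """A customized loader: parse one contributor's CSV and insert
    conformed records into the project database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scans "
        "(subject_id TEXT, region TEXT, voxel_um REAL)"
    )
    rows = [conform(r) for r in csv.DictReader(io.StringIO(csv_text))]
    conn.executemany(
        "INSERT INTO scans VALUES (:subject_id, :region, :voxel_um)", rows
    )
    return len(rows)

conn = sqlite3.connect(":memory:")
sample = "subject_id,region,voxel_um\nM001,cortex,0.004\nM002,cortex,0.004\n"
n = ingest(sample, conn)
print(n)  # 2
```

Each new contributor's data arrives in a slightly different shape, so in practice the `conform` step is what gets customized per submission while the template and database stay fixed.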
“Doctors should never have to learn how to code, like I shouldn't have to learn how to do surgery,” said Dr. Vogelstein. “But we have to work together to find an appropriate middle ground.”
TAKING THE RESEARCH TO PATIENTS
David S. Liebeskind, MD, FAAN, FAHA, FANA, professor of neurology and director of the Neurovascular Imaging Research Core at the University of California, Los Angeles department of neurology, views crowdsourcing as a vehicle for changing the focus of research from the traditional clinical approach.
“The focus is not just on the hospital, but going to where the patients are, going to the individual level,” he said. “The focus is on longitudinal outcomes and not as much on the acute inpatient admission to an academic medical center where the individual has a specific complaint.”
With so many people wearing activity monitors and carrying smart phones, a lot of the data are worthwhile and helpful at an individual level, he said. But he acknowledged that with open sharing and access to data, there are concerns about the validity and quality of the data, as well as protecting the confidentiality of information from participants. It's not enough to pull in the data; a framework or context is needed to interpret them, he said.
In a 2016 paper in Frontiers in Neuroscience, Dr. Liebeskind proposed “A Million Brains Initiative,” aimed at collecting imaging data on the brain and vessels to advance stroke research and vascular substrates of dementia. The project requires that individuals upload their brain imaging data to a secure cloud, which could then be developed into a searchable and scalable platform.
“Despite such variability in the type of data available and other limitations, the data hierarchy logically starts with imaging and can be enriched with almost endless types and amounts of other clinical and biological data,” he wrote. “Crowdsourcing allows an individual to contribute to aggregated data on a population, while preserving their right to specific information about their own brain health.”
But he said many patients may be unaware of their right to access and obtain their medical images.
COMPETITIONS FOR RESEARCH IDEAS
Benjamin H. Brinkmann, PhD, assistant professor of neurology and biomedical engineering at the Mayo Clinic in Rochester, MN, already had the data, but wanted a better way to use it. Dr. Brinkmann, along with Brian Litt, MD, director of the Center for Neuroengineering and Therapeutics at the University of Pennsylvania, hosted an online competition to develop computer algorithms to detect, predict, and ultimately prevent epileptic seizures.
In the contest, hosted on the online platform Kaggle, more than 500 teams worked with shared datasets from a collaborative project with the startup NeuroVista and from research epilepsy recordings taken at the Mayo Clinic. The contest made the recordings available on the International Epilepsy Electrophysiology Portal, www.ieeg.org, a National Institute of Neurological Disorders and Stroke-funded data-sharing platform for collaborative neuroscience research hosted by the University of Pennsylvania. The results of the contest were reported at last year's annual meeting of the American Epilepsy Society.
About $40 million had been spent over 15 years to find a program to predict seizures, with the best results reaching 65 percent, said Dr. Litt. The best result from the crowdsourcing contest reached 84 percent — in three months. The top prize was $15,000.
“There were pros and cons [to this approach], but overall it was a big success, and it helped us explore so many algorithms and figure out what features we needed to pull out of the data — it helped us see past the noise,” Dr. Brinkmann said.
The challenge with contests, said Dr. Vogelstein of Johns Hopkins, is that the situation is usually so specific that the algorithm often can't be used in other, similar scenarios and datasets. But Dr. Litt said contests require participants to design the work, specifically the data and the framework, so that it can be used much more widely.
Dr. Brinkmann said one of the winning entries in the algorithm contest for seizure detection was from a man in Israel who had his own company that predicts colon cancer. Although the contest stipulated that the winning algorithm would be released publicly, the man decided to patent his answer instead, and he was disqualified.
The team is adapting the winning answer as part of their current grant project, and, Dr. Vogelstein said, the project will require “a fair bit of rewriting.”
Although the results may need fine-tuning, Dr. Litt said the crowdsourcing contests drastically changed his thoughts on research objectives and academic success. Where the incentive has been to obtain large grants or to be named first or last author on a publication, the focus should instead be on finding actual solutions and sharing the information, he said.
“If you want to find a way to track epileptic seizures, you want people to solve the problem, not focus on the progress of one individual,” said Dr. Litt, who is helping to build an “Open Data Ecosystem for Neuroscience,” a project that is enlisting multiple centers to share data, collaborate, and crowdsource on the best methods for surgery to treat epileptic seizures.
“All the data should be posted and be completely transparent, so other people can validate it,” he said. “People could use the data, and credit you, like a publication, and you would be promoted and given funding based on your record of sharing, collaborating, and how many people use and quote your data.”
“Our goal is nothing short of changing the fabric of science, and changing the way research is funded,” Dr. Litt said.