That Research Subject's Confidentiality? How Facial Recognition Software Could Compromise It
By Mark Moran
November 21, 2019
Article In Brief
A group of researchers describes an experiment showing that facial recognition software has the potential to re-identify subjects who participate in research protocols involving MRI scans.
Artificial intelligence (AI) and biomedical data sharing are twenty-first century advances that separately, and in different ways, are advancing patient care. Together, though, they also create the potential for a breach of research subject confidentiality and privacy.
So suggests an experiment—described in correspondence to the New England Journal of Medicine (NEJM) on October 24—that showed how facial recognition software could successfully identify subjects enrolled in research protocols from their MRI images.
Christopher Schwarz, PhD, assistant professor of radiology at the Mayo Clinic, who conducted the experiment with colleagues, said that, to anyone's knowledge, such a breach has never happened in a real-world setting. But he and other neuroimaging experts who spoke to Neurology Today agreed that the technology, which could be weaponized for non-medical, possibly malicious, purposes, is always improving.
“Someone with access to de-identified MRIs from a research study and a belief or suspicion that a specific individual's images may be contained in that study, could potentially identify that individual,” Dr. Schwarz told Neurology Today.
“They could do this by creating a reconstruction of the face from each research MRI and training commercial face recognition software to recognize each individual MRI-based face,” he said. “Finally, they could provide a photo of the individual of interest to the software and ask it which of the faces in the study is a match. Thus, the photograph of an identified individual would be matched to all research data collected by that study. In other words, face matching could potentially establish the link between publicly available photographs and a person's private medical information.”
It's not totally science fiction. In the experiment described in NEJM, Dr. Schwarz and colleagues recruited 84 volunteers between the ages of 34 and 89 and photographed each participant's face from five slightly varying angles. Each participant had undergone MRI of the head within the previous three months in association with their participation in the Mayo Clinic Study of Aging or in other studies conducted at the Mayo Clinic Alzheimer's Disease Research Center.
From each MRI scan, they used an automated system to reconstruct a three-dimensional computer model of the participant's face and create two-dimensional photograph-like images. They then applied commercially available facial recognition software to match the 84 possible MRI-constructed faces with the actual photographs of the study participants. For each photograph, the software returned a ranked list of the 50 closest matches from the set of MRI-derived faces, with a confidence score for each.
The facial recognition software was ominously accurate: It chose the correct MRI scan as the most likely match for 70 of the 84 participant photographs (83 percent), and the correct MRI scan was among the top five choices for 80 of 84 participants (95 percent).
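The ranked-matching step the researchers describe can be illustrated with a minimal sketch: score a photograph against every MRI-derived face and return the closest matches in order. This is not the commercial software used in the study; the cosine-similarity scoring, the toy feature vectors, the participant IDs, and the `rank_matches` helper are all illustrative assumptions.

```python
# Hypothetical sketch of ranked face matching. Real systems compare
# learned face embeddings; here each "face" is a plain feature vector.
import math

def cosine_similarity(a, b):
    """Similarity of two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def rank_matches(photo_vec, mri_faces, top_k=5):
    """Return up to top_k MRI-derived faces, best match first."""
    scored = [(pid, cosine_similarity(photo_vec, vec))
              for pid, vec in mri_faces.items()]
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:top_k]

# Toy data: MRI-derived face vectors keyed by (hypothetical) participant ID.
mri_faces = {
    "P01": [0.9, 0.1, 0.0],
    "P02": [0.1, 0.8, 0.3],
    "P03": [0.0, 0.2, 0.9],
}

# A photograph whose true identity is P02.
photo = [0.2, 0.9, 0.2]

ranked = rank_matches(photo, mri_faces)
print(ranked[0][0])  # the most likely match: P02 in this toy example
```

In the study's terms, "top-1 accuracy" is how often the correct MRI-derived face is first in this ranked list (70 of 84 photographs), and "top-5 accuracy" is how often it appears anywhere in the first five entries (80 of 84).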
In the NEJM correspondence, Dr. Schwarz and colleagues noted that the current standard of removing only metadata in medical images may be insufficient to prevent reidentification of participants in research.
“Existing software for the removal or blurring of faces in medical images is rarely used, because these methods can reduce the quality of gray matter volume and cortical thickness measurements and may still not fully prevent reidentification,” they wrote. “Further research is needed to develop improved deidentification methods for medical imaging that contains facial features.”
So far, no data breach using this technology has been reported. “Research studies vary in how widely they share research data, but typically they grant access only to individuals who legally agree not to attempt to identify participants,” Dr. Schwarz said. “The overwhelming majority of researchers with access to the data will respect this.”
But a theoretical problem left unaddressed is likely to become a real one. “Individuals who have participated in a brain MRI study may be at risk if someone with access to that MRI data has a motivation to find them specifically and believes that they were one of its participants,” he said.
Why It Could Happen
Experts who commented on the report for Neurology Today agreed. “It's really unclear from a practical point of view how likely this is, but AI methods are getting better all the time,” said Michael Weiner, MD, professor of radiology and biomedical imaging at the University of California, San Francisco. “3D reconstruction software that can construct a face from an MRI image has been available for years. What's new are the AI programs that can potentially link a face from an MRI with a photo on the internet.”
Anyone with a “footprint” in the digital world, including photographs on social media, is potentially vulnerable, he said.
“Publicly available data sets and sharing policies mean that MRI scans often come with other subject information—demographics, blood-based markers, genetics,” said Christian Habeck, PhD, associate professor of neuroimaging at Columbia University Medical Center. “So, if a face that was reconstructed on the basis of an MRI scan could be matched to the face of a person with known identity, this would be very concerning indeed.”
Dr. Habeck said he believes for now the threat remains somewhat hypothetical. “The study in NEJM enjoyed the best possible matching conditions: It was a relatively small set of high-quality photographic reference images with confirmed identity links that the reconstructed facial image had to be compared to with a matching score. For a rogue actor to apply this to the internet in general, the hurdles would be considerably higher—an infinite set of images for comparison with questionable identity links and possibly poor photographic quality.”
But Dr. Habeck emphasized that the NEJM experiment has demonstrated proof of concept, and there is no room for complacency.
Dr. Schwarz told Neurology Today that he and colleagues at Mayo are working to develop technology that could more efficiently disguise MRI images without losing image quality. In the meantime, researchers and research institutions should be aware of the potential risk and communicate it to potential study subjects.
“Research participants should always be told that there is a risk of loss of privacy, and research study protocols will need to be modified to inform people of this new potential risk,” Dr. Weiner said. “There are many safeguards in place to protect clinical and research data, but safeguards can always be overridden. Computer systems can be hacked, and personnel can make mistakes that allow data to be identified. The availability of genetic data allows people to be identified, and we leave our genetic material on eating and drinking utensils every day. We are in an era where our privacy is not as secure as it once was.”
Dr. Schwarz had no competing interests.