With Patrick Cahan, PhD, Associate Professor of Biomedical Engineering at Johns Hopkins Medicine
By Sarah DiGiulio
Cancer laboratory research relies on many types of tumor cell models. Now, Johns Hopkins Medicine researchers report they've developed a new computer-based tool, CancerCellNet, that compares the RNA sequences of research tumor models with data from a cancer genome atlas to determine how closely the two sets match. And, in the journal Genome Medicine, they've published a report that uses the new tool to document the cancer models with the greatest transcriptional fidelity to natural tumors (2021; doi.org/10.1186/s13073-021-00888-w).
The investigators said the report adds to evidence that cancer cell lines grown in the laboratory are less similar to their human source because of the complex differences between a human cell's natural environment and a laboratory growth environment. The tool revealed that human cancer cells grown in culture dishes are the least genetically similar to their human sources. Other research models, such as genetically engineered mice and 3D balls of human tissue (tumoroids), tend to be more similar. On average, genetically engineered mice and tumoroids have RNA sequences most closely aligned with the genome atlas baseline data in four out of every five tumor types they tested, including breast, lung, and ovarian cancers, according to the new report.
“It is already appreciated that there are wide gaps between various cancer models and native tumors. I think that our work helps to give this understanding a quantitative framework,” Patrick Cahan, PhD, Associate Professor of Biomedical Engineering at Johns Hopkins Medicine, told Oncology Times. “Our hope is that it will lead to 1) selection of the most appropriate models for a given cancer research study, and 2) a generation of better models, which will ultimately lead to a better understanding of cancer and better treatments.”
Some of the key findings from the report, according to Cahan, include the following. Patient-derived xenografts have the greatest potential to be native-like in their expression profile. Across all the models analyzed, however, genetically engineered mouse models and tumoroids were on average more similar to native tumors than either patient-derived xenografts or the more commonly used cancer cell lines.
“We identified tumor types that have good models, and those that are poorly served by current models. For example, none of the models we analyzed faithfully replicated the gene expression state of esophageal carcinoma,” Cahan stated. In several tumor types, genetically engineered mouse models tended to reflect mixtures of subtypes rather than conforming strongly to single subtypes. In addition, many cancer cell lines were not classified as the tumor type given in their annotated labels.
Cahan shared his thoughts about the new tool and his hope that other cancer researchers will use it. The tool has been made available as a free, downloadable package.
1. What led you and your colleagues to develop CancerCellNet now?
“This is something that we have been working towards since 2014. However, there are several things that have made it especially timely.
“First and most importantly, it has become easier to generate new cancer models recently, and thus more models are emerging. Technologies that have enabled this include tumoroids, and genome engineering techniques, such as CRISPR/Cas9. Patient-derived xenografts continue to be popular, and in a sense, each patient-derived xenograft can be considered its own unique model.
“So, with this explosion of new model technologies, we need an accompanying metric or method to understand their relation to native tumors.”
2. How exactly does CancerCellNet work?
“CancerCellNet is a computational tool that takes as input expression data of a cancer model and outputs the probability that that model is indistinguishable in its gene expression state from each of 22 solid tumor types and 36 subtypes. CancerCellNet is a particular type of machine learning algorithm called a Random Forest. While this class of method pre-dates the current deep learning trend, it is still remarkably powerful, resistant to overfitting, and does not require as much training data as many more recent methods.”
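To make the idea concrete, the following is a minimal sketch of how a Random Forest can turn an expression profile into per-tumor-type probabilities. It is not the CancerCellNet implementation: the four-gene synthetic dataset, the use of single-split trees, and every function name here are assumptions made purely for illustration.

```python
import random
from collections import Counter

TUMOR_TYPES = ["breast", "lung", "ovarian"]  # labels only; all data below is synthetic

def make_toy_profiles(per_type=10, seed=1):
    """Synthetic 'expression' rows: gene j is high in tumor type j, gene 3 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for j, tumor in enumerate(TUMOR_TYPES):
        for _ in range(per_type):
            row = [(8.0 if g == j else 2.0) + rng.uniform(-1, 1) for g in range(3)]
            row.append(rng.uniform(0, 10))
            X.append(row)
            y.append(tumor)
    return X, y

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_stump(X, y, feature_ids):
    """One-split tree: pick the (gene, threshold) with the lowest weighted Gini."""
    best = None
    n = len(y)
    for f in feature_ids:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t, Counter(left), Counter(right))
    if best is None:  # every candidate gene was constant in this sample
        return (None, None, Counter(y), Counter(y))
    return best[1:]

def fit_forest(X, y, n_trees=200, seed=0):
    """Train each tree on a bootstrap sample and a random subset of genes."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    k = max(1, round(n_features ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sample rows with replacement
        feats = rng.sample(range(n_features), k)     # random gene subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict_proba(forest, sample, classes):
    """Average the class fractions of the leaf that the sample lands in, per tree."""
    totals = dict.fromkeys(classes, 0.0)
    for f, t, left, right in forest:
        leaf = left if f is None or sample[f] <= t else right
        size = sum(leaf.values())
        for c in classes:
            totals[c] += leaf[c] / size
    return {c: totals[c] / len(forest) for c in classes}

X, y = make_toy_profiles()
forest = fit_forest(X, y)
model_profile = [8.3, 1.9, 2.2, 5.0]  # hypothetical cancer model, high in gene 0
probs = predict_proba(forest, model_profile, TUMOR_TYPES)
for tumor, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{tumor}: {p:.2f}")
```

The output is one probability per tumor type, which mirrors the kind of readout Cahan describes. A real analysis would use thousands of genes, deeper trees, and an established library implementation rather than this toy version.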
3. What are the implications of this work? How should researchers use this tool?
“The major implication is that this easy-to-use tool can be applied to any newly minted model to better understand how it compares both to native tumors and to existing models. It can be used to help researchers choose among potential models so that they can select the ones most appropriate for their specific question and goal.
“We are working to extend this to the single-cell level of resolution, as native tumors and even isogenic tumor models are heterogeneous. Additionally, transcriptional state is only one window into the molecular state of the tumor. We need to extend this to other crucial contributors to tumor behavior, including genome and epigenome.
“Right now, CancerCellNet only gives a few readouts of similarity. To make it more useful to researchers who are choosing models, we need to provide finer granularity in terms of how the models diverge from native tumors, and how this might impact their behavior. Finally, we are very actively pushing to use predictive analytics to improve the fidelity of models to native tumors.”