Identification of key genes and their functions in palbociclib-resistant breast carcinoma by using bioinformatics analysis

Background: Palbociclib resistance is a significant problem in breast carcinoma, and its underlying molecular mechanisms remain poorly understood. This study aims to elucidate the molecular mechanisms of palbociclib resistance and to identify the key genes and pathways mediating progesterone resistance in breast cancer (BC). Methods: Gene dataset GSE117743 was downloaded from the Gene Expression Omnibus (GEO) database, which included 3 palbociclib-resistant and 3 palbociclib-sensitive BC cell lines. Then, we calculated the differentially expressed genes (DEGs) by using R software. Gene ontology and Enriched pathway analysis of genes we identified were analyzed by using the Database for Database of Annotation Visualization and Integrated Discovery (DAVID) and R software. The protein-protein interaction network was performed according to Metascape, String, and Cytoscape software. Results: In total, 447 DEGs were selected, which consisted of 67 upregulated and 380 downregulated genes. According to gene ontology annotation, DEGs were associated with cytoplasm, signal transduction, and protein binding. The research of the Kyoto Encyclopedia of Genes and Genomes (KEGG) demonstrated that genes enriched in certain tumor pathways, including IL-17 signaling pathways and Herpes simplex infection signaling pathways. Also, certain hub genes were highlighted after constructed and analyzed the protein-protein interaction network, including α-2A adrenergic receptor, cytochrome P450 subfamily IIR polypeptide, Cystathionine β-synthase, nucleotide-binding oligomerization domain-containing, erythropoietin-producing hepatocellular receptor A2 and adrenomedullin, which may be related with BC prognosis. A total of 4 of 6 hub genes had a significant relationship with the overall survival (P<0.05). Conclusions: Using microarray and bioinformatics analyses, we identified DEGs and determined a comprehensive gene network of progesterone resistance. We offered several possible mechanisms of progesterone resistance and identified therapeutic and prognostic targets of palbociclib resistance in BC.

By 2018, 1,762,450 new tumor patients are been made definite diagnosed and 606,880 among them died in the United States. Of them, breast carcinoma makes up 30% (268,600) of all new female tumor cases and 15% (41,760) of all tumor-related deaths [1] . Although we have made great progress in prediagnosis and treatment, medicine resistance and molecular heterogeneity have influenced the active treatment of drugs. Every year, the increasing prevalence of breast carcinoma in the United States and the relevant treatment of malignant breast carcinoma bring huge pressure to the national economy and health care industry [2] .
According to recent treatment views, like radiotherapy and adjuvant chemotherapy, have promoted the overall survival of patients, particularly in early-stage breast tumor of women who were diagnosed timely [3] . However, among these treatments, side effects like psychological effects, gastrointestinal discomfort, adverse physical, and psychological effects on sexuality often exist which influences the quality of life of patients. Besides, they can also take the possibility of mortality like bone loss, increased risk for secondary cancers and cardiac toxicity [4,5] . One of the significant advantages for new, molecularly targeted therapies is to lower the possibility of morbidity related to tumor also with mortality [6] . Approximately 70% of patients with breast carcinoma express the estrogen receptor (ER) and endocrine therapies are better for the treatment of ER + breast carcinoma [7] . Recently, CDK4/6 inhibitors like palbociclib, ribociclib, and abemaciclib have led to a better disease-free survival compared with the antiestrogen alone in patients with advanced ER + breast cancer (BC), and FDA has thus approved these drugs [8,9] . Even though plenty of patients with breast carcinoma treated with antiestrogens and CDK4/6 inhibitors gain benefits from this treatment plan, it is also important to investigate the resistance-associated gene. In the current study, microarray data for GSE117743 facilitated the investigation of differentially expressed genes (DEGs) in palbociclibsensitive and palbociclib-resistant BC. The functions of the DEGs were assessed by using gene ontology (GO) annotation, Kyoto Encyclopedia of Genes and Genomes (KEGG), protein-protein interaction (PPI) network, and the relationship between palbociclibresistant genes and overall survival of patients with breast carcinoma. In conclusion, we conducted this research to discover the key genes of medicine resistance and to find potential new cancer therapy targets to reduce palbociclib resistance.

Microarray data
The Gene Expression Omnibus (GEO, www.ncbi.nlm.nih.gov/ gds) database is a public functional genomics dataset that helps users download statistics and achieve gene expression introduction. In our research, Gene expression profile data (GSE117743) was obtained from GEO. Three palbociclib-resistant BC cells and 3 palbociclib-sensitive cell lines were included. The array data were acquired from the Affymetrix Human Genome U133A 2.0 Array [GPL571; transcript (gene) version].

DEG analysis
R software was utilized to compare 2 groups of tissues to identify different genes under identical experimental factors: different expression genes in resistant and sensitive cell lines. Besides, we used |log2FC| ≥ 1 and P < 0.05 as a cut-off criterion and an obvious statistical difference would be considered if the statistics met our standards [10] . In every experiment, the null hypothesis of the rank way was that each gene was assorted randomly; The gene would have topper ranking if the P-value was smaller. More importantly, the pheatmap package was used to draw the different genes.

Functional and pathway enrichment analysis
The KEGG pathway analysis and GO functional analysis of the DEGs we identified were conducted by taken advantage of the Database of Annotation Visualization and Integrated Discovery (DAVID) and R software. GO analysis was divided into the cellular component, biological process (BP), and molecular function, and a P-value of <0.05 was considered to indicate a statistical difference. In terms of the KEGG pathway analysis, they were identified according to the P-value of <0.05. DAVID includes many functional annotation programs to analyze plenty of biological information of genes and data that existed in DAVID was vital for the achievement of high-throughput gene functional analysis [11] . We also used R software to analyze datasets.

PPI network analysis
PPI network analysis was conducted by utilizing the Search Tool for the Retrieval of Interacting Genes (STRING). STRING supplies messages related to proven and predicted interactions among huge numbers of proteins. DEGs which were selected would then input into STRING. We would consider genes to be meaningful which had a score of > 0.9 [12] . The PPI network was built using Cytoscape and Metascape. Besides, degrees > 10 was set as the cut-off criterion. Molecular Complex Detection (MCODE) was then used to analyze the submodules of the PPI network, and the criteria we set was the number of nodes > 6 and MCODE score > 5.

Analysis of the hub genes and their relationship with breast carcinoma prognosis
The Kaplan-Meier Plotter (http://kmplot.com/analysis/) is a nonparametric statistic that is utilized to evaluate the overall survival statistics from recorded data. Also, every hub gene which was selected would be put into the online tool to evaluate the overall survival and progression-free survival of patients with BC for the Kaplan-Meier curve. This tool was built using gene expression and survival data of patients with BC, which were downloaded from GEO and The Cancer Genome Atlas (Affymetrix HG-U133A, HG-U133A 2.0 and HG-U133 plus 2.0 microarrays) [13] . As every gene mapped to a different Affy ID, each Affy ID was entered into the online tool to obtain the survival curves.

Identification of the genes between palbociclib-sensitive and palbociclib-resistant BC cells
The "t test" option in R software was used to research the gene expression profiles from the GSE117743 public microarray dataset. It highlighted the DEGs between GSM3308126, GSM3308127, GSM3308128 (palbociclib-sensitive), GSM3308129, GSM3308130, and GSM3308131 (acquired palbociclib-resistant) BC cells. According to the cut-off criteria (P < 0.05 and |log2FC| ≥ 2), 447 DEGs were identified, which were consisted of 380 downregulated and 67 upregulated genes. The top 20 upregulated and downregulated DEGs are shown in Figure 1.

Construction of a PPI network
The 447 DEGs were entered in the String website to obtain their PPI data. Then, if the combined score was ≥ 0.9, the PPIs were chosen to construct a PPI network (Fig. 4A). In the PPI network, 6 node proteins, including ADRA2A (α-2A adrenergic receptor), CYP2R1 (cytochrome P450 subfamily IIR polypeptide 1), CBS (Cystathionine β-synthase), NOD2 (nucleotide-binding oligomerization domain containing 2), EphA2 (erythropoietin-producing hepatocellular receptor A2), and ADM (adrenomedullin) had a high association with other node proteins ( > 7), which demonstrated their increased hub degrees (Fig. 4B). The hub genes may, therefore, serve a crucial role in palbociclib resistance within BC.

Relationship between hub genes and BC prognosis
The Kaplan-Meier Plotter was used to research the prognosis of patients with BC associated with genes we identified. Then, 4 hub genes were found to have a significant relationship with BC prognosis (Fig. 5). Kaplan-Meier survival analysis indicated that overexpression of the hub genes ADRA2A (Fig. 5B), CYP2R1 (Fig. 5C), and EphA2 (Fig. 5D) were related with poorer overall survival in patients with BC. However, overexpression of ADM (Fig. 5A) seemed was associated with better overall survival and the expression level of NOD2 (Fig. 5E) and CBS (Fig. 5F) may have no significant relationship with the overall survival of patients with BC. This suggested that the selected genes may be potential targets for palbociclib resistance in BC.

Discussion
In this present research, gene expression data of GSE117743 were searched from the GEO database. Three palbociclib-resistant and 3 palbociclib-sensitive BC cell lines were identified for analysis. A total of 380 downregulated and 67 upregulated DEGs were identified among palbociclib-resistant and palbociclib-sensitive BC. Function annotation indicated that these DEGs were primarily related to cell membrane components, regulationassociated terms, regulator-associated and binding-associated components. KEGG pathway analysis of DEGs revealed involvement in "IL-17 signaling pathways," "Herpes simplex infection," "Influenza A," "Hepatitis C," and "NOD-like receptor signaling pathway." According to previous studies, IL-17 signaling pathways promoted resistance to paclitaxel in breast tumors through activation of the ERK1/2 pathway [14] . Also, in  terms of the "NOD-like receptor signaling pathway," a previous study reported that it may be related to a higher risk of certain cancer types like colorectal cancer, gastric cancer, lymphoma, and as well as BC [15] . Therefore, the above KEGG pathways mentioned in this study may assist us to find new diagnostic and therapeutic ways in palbociclib-resistant BC. Also, PPI network analysis indicated that 6 hub genes may be used as new targets in palbociclib-resistant BC which had higher degrees of interaction. The first gene ADRA2A (α-2A adrenergic receptor), encodes a protein receptor of 450 amino acid residues [16] and locates in chromosome 10q25.2. Almost all tissues and organs in our body express it. It is also involved in some functions in the platelet aggregation, cardiovascular system, central nervous system, lipolysis, blood pressure, insulin secretion, and neurotransmitter release [17] . Besides, patients with pancreatic cancer who have a lower expression of this receptor-related with higher insulin secretion [18] . it also regulates adrenergic suppression of insulin secretion and associated with poor prognosis of breast carcinoma: advanced TNM stages, larger tumor size, positive lymph node involvement, higher Her2, and nottingham prognostic index positive status by learning from previous studies [19,20] .
CBS (Cystathionine β-synthase), the second gene, encodes a protein of 551 amino acids which is almost expressed in the liver, brain, kidney, and pancreas. Overexpression of CBS in cancer tissues or cell lines has been found in ovarian [21] , colon [22,23] , prostate [24] , and BC [25] , in contrast with normal tissues or nontransformed cell lines. On the one hand, CBS promotes cancer progression and growth, on the other hand, it can also initiate cancer formation. More than that, lower expression of CBS takes down antioxidant capacity and promotes the sensitivity of tumor cells to chemotherapeutic drugs. The cytoprotective influence of CBS is also related to the regulation of p53 apoptosis-related and NF-κB signaling pathways [26] . With the identification of the function of CBS in the tumor, it will become more significant to take advantage of CBS as a predictive and prognostic biomarker. Also, the third gene CYP2R1 (cytochrome P450 subfamily IIR polypeptide 1), is a key enzyme involved in vitamin D 25 hydroxylation [27] . It encodes microsomal vitamin D 25-hydroxylase and catalyzes the first hydroxylation reaction in the biosynthesis of vitamin D [28] . According to further studies, it was noticed by a protective effect of this single nucleotide polymorphisms against more advanced disease, but we need more researches to find the mechanism in palbociclib resistance cell lines [29] .
EphA2 (erythropoietin-producing hepatocellular receptor A2), the fourth gene, plays a role in critical BPs, like cancer progression, metastasis, and angiogenesis. Many further studies researching EphA2 targets were applied to the treatment of tumors. According to recent researches, we identified the 64CU-DOTA-1C1 antibody in colorectal cancer, EA2, B233, 3F2-WT antibody in BC and EA5 antibody in ovarian cancer [30][31][32] . In clinical researches, EphA2 was identified to be overexpressed in many malignant carcinomas and might be used as a prognostic marker of poor prognosis in patients with tumors, except for lung cancer [33] . In terms of researches on drug resistance, one study believed that the EphA2-to-YAP pathway promoted gastric cancer growth and therapy resistance [34] , and another study found that EphA2 inhibition combined with eicosapentaenoic acid has great efforts in preclinical models of triple-negative breast tumor by disrupting cellular cholesterol efflux [35] .
Another huge gene, NOD2 (nucleotide-binding oligomerization domain containing 2), is a member of evolutionarily conserved Nod-like receptors family which share a tripartite structure of a terminal sensor domain (leucine-rich repeats, LRRs), a central nucleotide-binding oligomerization domain (NOD) [36] . NOD2 has already been certified to have a relationship with a high risk of breast tumor [37][38][39] , reduces estrogeninduced proliferative responses and promotes apoptosis in BC [40] . The previous analysis indicated that overexpression of NOD2 disrupts immune-related pathways, especially those signaling through NF-κB which are activated via TNF signaling pathways and antigen presentation signals (HLA class I histocompatibility proteins) [41] . Also, another study reported that lower expression of NOD2 promotes both cellular adhesion/migration and proliferation via MAPK signaling pathways and PI3K/Akt/ mTOR signaling pathways [42] .
Furthermore, the relationship between the mentioned hub genes and the overall survival curves for patients with BC revealed that overexpression levels of the hub genes ADRA2A, CYP2R1, and EphA2 were related to worse prognosis in patients with BC. Besides, overexpression of ADM seemed was associated with better overall survival. However, the expression level of NOD2 and CBS may have no significant relationship with overall survival according to the above database.
These findings approved that these hub genes may play a vital role in palbociclib resistance to BC. However, the molecular reasons for palbociclib resistance, and whether these huge genes can be used as targets for the treatment of palbociclib-resistant BC remain unknown. Since all our data were achieved from the GEO database by bioinformatics tools, as well as the limited number of relevant BC cell lines and lack of multiple database analysis, more data analysis, and clinical experiments should be further performed to certify its feasibility.

Conclusions
Our study concluded certain mechanisms of palbociclib resistance in breast carcinoma. Bioinformatics methods were utilized to identify the DEGs in palbociclib-resistant BC cells. Also, 6 hub genes were identified as potential targets for the treatment of palbociclib resistance. These conclusions from the present research may offer a significant understanding of the mechanisms of palbociclib resistance in BC. However, these findings may be considered for future investigation.

Ethical approval
None.

Sources of funding
None.

Author contribution
G.G. and X.S. contributed equally to this work.