CeRNA network identified hsa-miR-17-5p, hsa-miR-106a-5p and hsa-miR-2355-5p as potential diagnostic biomarkers for tuberculosis : Medicine

Journal Logo

Research Article: Observational Study

CeRNA network identified hsa-miR-17-5p, hsa-miR-106a-5p and hsa-miR-2355-5p as potential diagnostic biomarkers for tuberculosis

Song, Jie PhDa; Sun, Jiaguan MDa; Wang, Yuqing MDb; Ding, Yuehe MDb; Zhang, Shengrong MDb; Ma, Xiuzhen MDb; Chang, Fengxia MDb; Fan, Bingdong MDb; Liu, Hongjuan MDb; Bao, Chenglan MDb; Meng, Weimin PhDb,*

Author Information
Medicine 102(11):p e33117, March 17, 2023. | DOI: 10.1097/MD.0000000000033117


1. Introduction

Tuberculosis (TB) is the most common chronic communicable disease caused mainly by Mycobacterium tuberculosis (Mtb).[1] It is the ninth leading cause of death worldwide, even ranking above HIV/AIDS.[2,3] The World Health Organization estimated that approximately 1 quarter of the total population were already infected with Mtb, and about 10.0 (8.9–11.0) million people were diagnosed with TB in 2019 ( It imposes a huge health hazard and tremendous socio-economic burden.[4]

Early accurate diagnosis is crucial to TB control.[5] While the diagnostic methods in routine clinical practice have some inherent limitations. For example, direct sputum smear staining is simple and rapid but has suboptimal sensitivity and specificity.[6] Conventional sputum culture is the gold standard, whereas it generally takes an average of 4 to 5 weeks and leads to delays in diagnosis and treatment.[7] Imaging inspection (such as plain chest radiograph and computed tomography) is difficult to discriminate the Mtb infection from other lung diseases.[8] Tuberculin skin test is a recommended diagnostic approach with merits of low-cost and convenience, however, it has low specificity and could cross-react with bacille Calmette-Guerin vaccination.[9] Xpert MTB/RIF assay is an emerging and available automatic molecular diagnosis method, while the high-cost restricts its widespread use.[10] Therefore, it is urgent to establish sensitive and efficient methods for the diagnosis of Mtb infection. Notably, relevant studies in non-coding RNAs have provided new avenues.[11] Several pilot studies illustrated that Mtb infection was associated with dysregulated expression of microRNA (miRNA), and the expression level of miRNAs in peripheral blood could discriminate TB patients from healthy individuals.[12] For example, Wagh et al[13] found that miR-155 was significantly reduced in untreated TB patients compared to healthy controls, while returned back in cured TB patients. Similarly, Ndzi et al[14] reported that hsa-miR-29a-3p was significantly upregulated in active TB patients than healthy controls, and hsa-miR-29a-3p showed a good diagnostic performance (sensitivity: 80%, specificity: 71.43%).

Long non-coding RNAs (lncRNAs), circular RNAs (circRNAs) and miRNAs are non-coding RNAs, which have been considered as important regulators in host-Mtb complex interactions.[15] Non-coding RNAs can interact with mRNA and affect mRNA’s function at the posttranscriptional level by working as miRNA sponges, that is the competitive endogenous RNA (ceRNA) hypothesis. In this study, we recruited 5 newly-diagnosed pulmonary tuberculosis (PTB) patients for whole transcriptome sequencing and functional annotation and pathway enrichment analysis. Finally, ceRNA networks were established to identify potential diagnostic biomarkers.

2. Materials and methods

2.1. Patients recruitment and peripheral blood samples collection

Five young females with PTB were recruited from The 4th People’s Hospital of Qinghai Province, all volunteers were diagnosed according to sputum Ziehl-Neelsen acid-fast staining and Lowenstein-Jensen slant culture. All the participants in this study were free from known basic diseases, such as diabetes, hypertension, malignant diseases, or immunodeficiency diseases. The present study received approval from the Institutional Review Board of The 4th People’s Hospital of Qinghai Province (2019-ZJ-7084), and all volunteers provided written informed consent before enrollment.

Peripheral blood samples were collected using PAXgene Blood RNA tubes (BD, USA) before anti-TB therapy, and stored at −80 °C. The second follow-up was conducted after a 6-month drug regimen. Blood samples were collected again and sent to a commercial company for whole transcriptome sequencing.

2.2. The preparation of whole-transcriptome sequencing

The preparation of peripheral blood mononuclear cells was performed as Zeng et al[16] described. We adhered to the manufacturer’s protocol strictly to extract total RNA using a Trizol reagent kit (Invitrogen, Carlsbad, CA). The integrity of the RNA was assessed by RNase-free agarose gel electrophoresis. After the extraction of total RNA, rRNAs were removed to reserve mRNAs and non-coding RNAs. The workflow comprised the major steps as follows: the fragmentation of mRNAs and non-coding RNAs, reverse transcription and second-strand cDNA synthesis, the purification cDNA fragments, end-repaired, poly(A) added, ligated to Illumina sequencing adapters, the digestion of the second-strand cDNA, RNA size selection and library construction. Finally, the library was paired-end sequencing on the Illumina HiSeqTM4000 platform (Illumina, Guangzhou, Guangdong Province, China) by Gene Denovo Biotechnology Co. (Guangzhou, Guangdong Province, China).

2.3. Filtering of clean reads and quantification of transcripts abundance

The obtained reads from the sequencing machines frequently include adapters and low-quality bases. Thus, fastp (version 0.18.0) and strict criteria were applied to filter out sequencing adapters and low-quality bases for the acquisition of high-quality clean reads. The exclusion criteria for raw reads are as follows: raw reads with adapter sequences; raw reads with over 10% unknown nucleotides; eliminate low-quality reads that the ratio of low-quality bases (Q-value ≤ 20) exceeds over 50%.

According to the reference-based approach, the relative abundance of transcripts was quantified using the software StringTie. The expression abundance of each transcription region was assessed by fragment per kilobase of transcript per million mapped reads (FPKM) values. The FPKM is an effective normalization method to reduce the influence of gene length and sequencing level on the calculation of gene expression.[17] Therefore, the calculated transcripts expressions are able to evaluate the differences in transcripts expression among samples.

2.4. Screening differentially expressed RNAs and enrichment analysis

In the present study, fold change (FC) was calculated by dividing post-treatment gene expression levels by pretreatment levels. The selection criteria for differentially expressed RNAs (DE RNAs) were |log2 (FC)| > log2 (1.5) and P value < .05. To predict the putative biological functions and pathways of DE-mRNAs, gene ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were conducted with clusterProfiler. GO comprises 3 sub-ontologies: biological process (BP), cellular component (CC), and molecular function (MF). All relevant GO terms enriched in DE RNAs were identified in the GO enrichment analysis by comparison with the background genome. The KEGG pathway enrichment analysis was used for determining the remarkably enriched signal transduction and metabolic pathways. The cutoff value for statistical significance was false discovery rate <0.05.

2.5. Construction of ceRNA network

CeRNA regulatory networks were constructed with DE RNAs. The primary principles are as follows: the Spearman Rank correlation coefficient was calculated to estimate the expression correlation of lncRNA-miRNA pairs and mRNA-miRNA pairs. Pairs with Spearman Rank correlation coefficient cutoff threshold <−0.7 were picked out to be co-expressed negatively lncRNA–miRNA pairs and mRNA-miRNA pairs. The Pearson correlation coefficient was calculated to estimate the expression correlation of lncRNA-mRNA pairs. Pairs with Pearson correlation coefficient cutoff threshold >0.9 were picked out to be co-expressed positively lncRNA-mRNA pairs. The hypergeometric cumulative distribution function test was administered to verify the statistical significance of the common miRNA sponges shared by 2 genes (P value < .05). The visualization of ceRNA network was actualized with Cytoscape software (v3.6.0).

2.6. Statistical analysis

Numerical data were recorded in the format of mean ± standard error of the mean (SEM). The DE RNAs was selected with |log2(FC)| > log2(1.5) and P value < .05. The statistical differences in transcripts expression levels were determined by paired 2-tailed Student t test, and P value < .05 was deemed to represent a statistically significant difference. The correlation of ceRNA pairs was tested by the hypergeometric cumulative distribution function test, and was considered significant when the P value < .05.

3. Results

3.1. Demographic and clinical characteristics of volunteers

In the present work, we recruited 5 young female PTB patients. The volunteers ranged from 15 to 21 years of age (mean age:17.4 ± 3.3 years). Prior to the diagnosis, volunteers reported several classic TB symptoms, such as chronic cough (5/5), fever (2/5), chest pain (2/5), and night sweats (1/5). And all volunteers were diagnosed by sputum Ziehl-Neelsen acid-fast staining and Lowenstein-Jensen slant culture. After a 6-month drug regimen, symptoms disappeared in all volunteers and they were diagnosed as cured by doctors.

3.2. Quality control and alignment analysis of sequencing data

After whole-transcriptome sequencing, massive amounts of sequencing raw data were available from 5 volunteers before and after medical treatment. For further bioinformatics analysis, the low-quality reads were trimmed and adapters were clipped. Characteristics of the data cleaning procedures are shown in the Table 1. The result illustrated that the proportion of clean reads in each sample was 99.8% approximately. Besides, base composition and base quality were analyzed to assess the sequence quality. Statistic data showed that the GC ratio was 45% approximately and the distribution of base composition was well-proportioned, indicating that the sequencing data were suitable for subsequent enrichment analyses. In addition, alignment analysis was performed to identify the transcripts sequence by aligning clean data to the human reference genome. It turned out to be that mapped reads accounted for nearly 98% of total reads (Table 1). Therefore, the RNA-seq data had a good quality and excellent reliability.

Table 1 - Statistical result of the RNA-seq data quality test.
Samples Sequencing data Alignment data
Raw data Clean data Low quality GC ration Total mapped Multiple mapped Unique mapped
Smc-p 78,202,214 78,041,868 (99.79%) 144,230 (0.18%) 5,102,431,434 (43.81%) 76,294,913 (97.98%) 2,865,076 (3.68%) 73,429,837 (94.30%)
Mjj-p 86,173,936 85,960,690 (99.75%) 15,722 (0.02%) 6,212,006,769 (48.53%) 83,860,461 (97.99%) 11,619,648 (13.58%) 72,240,813 (84.41%)
Zxlm-p 86,339,374 86,133,366 (99.76%) 194,114 (0.24%) 5,696,115,089 (44.38%) 83,685,762 (97.52%) 3,892,366 (4.54%) 79,793,396 (92.99%)
Zm-p 82,294,224 82,084,468 (99.75%) 194,114 (0.24%) 5,304,255,495 (43.23%) 80,203,982 (97.88%) 3,542,801 (4.32%) 76,661,181 (93.55%)
Pblm-p 78,165,402 77,982,338 (99.77%) 144,098 (0.15%) 5,332,263,056 (45.89%) 75,887,574 (97.59%) 6,641,793 (8.54%) 69,245,781 (89.05%)
Smc-a 96,752,556 96,592,442 (99.83%) 144,098 (0.15%) 6,242,564,497 (43.27%) 94,533,421 (98.03%) 2,739,431 (2.84%) 91,793,990 (95.19%)
Mjj-a 80,540,630 80,297,112 (99.70%) 223,814 (0.28%) 5,282,197,932 (44.14%) 77,987,877 (97.33%) 3,870,415 (4.83%) 74,117,462 (92.50%)
Zxlm-a 76,292,856 76,167,644 (99.84%) 113,028 (0.15%) 5,104,981,656 (44.94%) 70,203,886 (92.39%) 4,105,584 (5.40%) 74,309,470 (97.80%)
Zm-a 125,737,878 125,422,658 (99.75%) 289,058 (0.23%) 7,814,573,344 (41.68%) 122,700,645 (97.93%) 3,373,038 (2.69%) 119,327,607 (95.23%)
Pblm-a 94,156,934 93,967,856 (99.80%) 171,820 (0.18%) 6,063,448,973 (43.23%) 91,584,309 (97.64%) 3,822,183 (4.07%) 87,762,126 (93.57%)
Initial diagnosis: Smc-p, Mjj-p, Zxlm-p, Zm-p, Pblm-p.
Six-month therapy: Smc-a, Mjj-a, Zxlm-a, Zm-a, Pblm-a.

3.3. Identification of DE RNAs

To compare the intensity distribution from sequencing data after normalization, log10FPKM values were computed, and the violin plot of mRNA expression showed that the distribution of log10FPKM were similar in all study subjects (Fig. 1). Furthermore, principal component analysis was employed to speculate the relationship between subjects. The unsupervised hierarchical clustering analysis displayed distinct mRNA expression profiling and separated subjects into 2 groups on the basis of gene expression levels (Fig. 2). The Volcano plot was drawn to depict the different expressions of mRNAs (Fig. 3). Totally, 1469 DE-mRNAs (1137 upregulated and 332 downregulated), 996 DE-lncRNAs (529 upregulated and 467 downregulated), 468 DE-circRNAs (245 upregulated and 223 downregulated), and 86 DE-miRNAs (40 upregulated and 46 downregulated) were screened (Tables S1–S4, Supplemental Digital Content,;;;, which summarizes all sequencing data generated in this study).

Figure 1.:
Violin plot of mRNA expression in all samples. The x-axis represents patient samples and the y-axis represents Log10(FPKM). FPKM = fragment per kilobase of transcript per million mapped reads.
Figure 2:
. Principal component analysis plot of mRNA expression in all samples. The samples were labeled with different colors and shapes. The closer the dots are to each other on the graph, the more similar the samples are in gene expression trend.
Figure 3.:
Volcano plot of DE-mRNAs. The horizontal axis represents the Log2FC, the vertical axis represents the −Log10 P. The red (up-regulated) and blue (down-regulated) dots represent the difference in gene expression (|log2(FC)| > log2 (1.5), P < .05), and the black dots represent no difference. DE = differentially expressed, FC = fold change.

3.4. Functional and pathway enrichment analysis

To gain insight into the putative biological functions of DE-mRNAs, the GO functional analysis was carried out (Table S5, Supplemental Digital Content,, and further classified into 3 main subcategories such as, BP terms (662), CC terms (683), and MF terms (681). As shown in Figure 4, the top 3 enriched BP terms were cellular process (624/662, 94.26%), metabolic process (513/662, 77.49%), and single-organism process (505/662, 76.28%). The top 3 enriched CC terms were cell (651/683, 95.31%), cell part (650/683, 95.17%), and organelle (569/683, 83.31%). The top 3 enriched MF terms were binding (629/681, 92.36%), catalytic activity (223/681, 32.75%) and nucleic acid binding transcription factor activity (76/681, 11.16%). Additionally, directed acyclic graphs were used to display GO terms in detail (Fig. 4). Directed acyclic graphs can intuitively reveal the hierarchical relationship of the significantly enriched terms. A directed acyclic graph is constructed by ordering GO terms from general functional terms to more detailed functions.[18] The significantly enriched BP terms were nucleic acid metabolic process (GO:0090304), RNA metabolic process (GO:0016070), transcription and regulation of RNA metabolic process (GO:0051252). The noteworthy enriched CC terms were intracellular part (GO:0044424), intracellular organelle (GO:0043229) and nucleus (GO:0005634). The obviously enriched MF terms were nucleic acid binding (GO:0003676), DNA binding (GO:0003677) and metal ion binding (GO:0046872).

Figure 4.:
The GO enrichment analysis of DE-mRNAs. The significantly enriched biological processes, cellular component, and molecular function were listed on the x-axis, and the y-axis showed mRNA numbers of each term. Directed Acyclic Graph is the graphical display of GO enrichment results. DE = differentially expressed, GO = gene ontology.

To determine the pivotal signaling pathways that DE-RNAs participated in, a KEGG pathway analysis was performed (Table S6, Supplemental Digital Content, The pathway enrichment analysis was visually compiled in Figure 5. The result indicated that HTLV-l infection (P = .000508) was the most enriched pathway, followed by T cell receptor signaling pathway (P = .002813), glycosaminoglycan biosynthesis – heparan sulfate/heparin (P = .005069) and hippo signaling pathway (P = .006061). Moreover, Reactome pathway enrichment analysis was used to explore biological reaction pathways (Fig. 6). It turned out that the top 3 signaling pathways were latent infection of Mycobacterium TB (R-HSA-1222352), Mtb iron assimilation by chelation (R-HSA-1222449) and lactoferrin scavenges iron ions (R-HSA-1222491), respectively (Table S7, Supplemental Digital Content,

Figure 5.:
KEGG pathway enrichment analysis of DE-mRNAs. X-axis represents rich factor. Y-axis represents KEGG pathway terms. The size of the circle denotes the number of genes enriched in the pathway whereas different colors indicate their different q values. KEGG = Encyclopedia of Genes and Genomes.
Figure 6.:
Circos plot of Reactome pathway enrichment. The outermost circle: the top 20 enriched Reactome pathways, outside of the circle is a coordinate scale for the number of genes. Second circle: the number of Reactome pathways in the background genes and the Q-value. Third circle: the number of differential genes in this Reactome pathway. Fourth circle: the Rich Factor value of each Reactome pathway.

3.5. Construction of ceRNA regulatory network and identification of diagnostic biomarkers

All DE-mRNAs were mapped to Disease Ontology database for describing gene function and disease (Table S8, Supplemental Digital Content, The most striking result to emerge from the Disease Ontology analysis was that IL-32, B3GAT1, CAMP, PRF1 and CD247 were enriched in TB (DOID 399), KMT2A and PRF1 were enriched in chronic granulomatous disease (DOID 3265). According to the ceRNA theory, ceRNAs competitively bind the same miRNAs by miRNA response elements, and relieve the miRNA-mediated post-transcriptional repression. Ultimately, by constructing lncRNA-miRNA-gene pairs with lncRNA as a decoy, miRNA as the center, KMT2A and PRF1 as the target, a 14 lncRNAs-5miRNAs-1mRNA network involved in chronic granulomatous disease was built (Fig. 7A). Similarly, an 11 lncRNAs/1 circRNA-2miRNAs-2mRNAs network associated with TB was also established (Fig. 7B). It can be clearly seen that non-coding RNAs competitively bind the same miRNAs, indirectly upregulating the expression of IL-32, B3GAT1 and KMT2A to influence TB progression. In ceRNA regulatory network, 1 miRNA could target multiple mRNAs, and reciprocally, individual mRNA may be regulated by multiple miRNAs, but only hub nodes with high node degrees are generally considered as key roles in the biological process. In light of the high connectivity degrees of hsa-miR-106a-5p, hsa-miR-17-5p and hsa-miR-2355-5p, they are hub nodes and might be promising novel biomarkers of TB.

Figure 7.:
The ceRNA network of chronic granulomatous disease (A) and TB (B). The circle represents the nodes (DE-RNAs), node size indicates interaction number, node color indicates expression level. The green nodes, blue nodes, gray nodes and red nodes indicate down-regulated miRNAs, upregulated circRNAs, upregulated lncRNA and upregulated mRNAs, respectively. DE = differentially expressed, TB = tuberculosis.

4. Discussion

Tuberculosis is a prevalent infectious disease, and the precise mechanisms of TB pathogenesis and drug resistance has been a concerning problem. Stringent response in Mycobacteria has been deeply researched, and it is reported that stringent response helps Mtb establish long-term infection and survive in hostile conditions by modulating important biological processes, such as biofilm formation, antibiotic resistance, and virulence.[19] The mechanisms of drug resistance in Mtb are complicated, and post-translational modification is one of the potential causes. One third of the Mtb proteome undergoes a post-translational modification to regulate various aspects of cell physiology, even affect mycobacterial metabolism and virulence by phosphorylation (Ser/Thr/Tyr and His/Asp), acetylation, glycosylation, and pupylation.[20] In addition, the literature reported that HupB, a nucleoid-associated protein, could contribute to the acquisition of Mtb drug resistance to rifampicin and isoniazid by modulating the levels of surface lipids including phthiocerol dimycocerosates and the permeability of cell walls.[21] Despite the extensive research on TB pathogenesis and drug resistance, the specific mechanism still needs to be further explored. Therefore, we analyze the regulatory non-coding RNAs in the pathological process of TB, and identify potential diagnostic biomarkers.

In this bioinformatics study, aberrantly expressed lncRNAs, circRNAs, miRNAs and mRNAs were screened, after comprehensively analyzing the whole transcriptome expression profiles at the time of Mtb infection and clearance. Functional enrichment analysis elaborated that the DE-mRNAs were not merely linked with transcription and regulation of RNA metabolic process, but executed the molecular function of nucleic acid binding and metal binding in nucleus or intracellular organelle. Enrichment pathway analysis implied that the DE-mRNAs participated in pathogen infection, T cell receptor signaling pathway, glycosaminoglycan biosynthesis-heparan sulfate/heparin, and Hippo signaling pathway in host-pathogen interactions. CeRNA regulatory network implied that hsa-miR-17-5p, hsa-miR-106a-5p, and hsa-miR-2355-5p potentiated to be novel TB biomarkers. This comprehensive bioinformatic analysis not only deepens the understanding of the underlying pathogenesis mechanism but sheds novel light on potential molecular targets for clinical diagnosis and therapeutic strategies against TB.

The KEGG enrichment pathway analysis showed that HTLV-l infection was closely related to TB pathogenesis. Similar to Mtb, HTLV-1 is also an intracellular parasitic pathogen, and can infect innate immune cells through cell-to-cell contact.[22] In addition, HTLV-1 infection and Mtb infection share common immunological characteristics, that is the proliferation of lymphocytes, especially activated mononuclear cells. The mononuclear cells exert a prominent role in the innate immune response, and they are crucial to control Mtb infection.[23] During the course of Mtb invasion, activated mononuclear cells proliferated and exhibited a protective effect through secreting pro-inflammatory cytokines, forming phagolysosomes and triggering cellular immune reactions.[24] However, Mascarenhas et al[25] reported that spontaneous proliferation of peripheral blood mononuclear cells is an important immunological feature of HTLV-1-infected individuals. Thus, it is speculated that the HTLV-1 infection pathway was enriched in the present study perhaps due to the increased number of mononuclear cells in the process of bacterial infection. Moreover, recent study further reported the association between the occurrence of TB and HTLV-1 infection. According to Bastos et al[26] study, 10% of TB patients were also infected with HTLV-1, and HTLV-1 infection could increase the susceptibility to TB. A literature review pointed out that patients previously infected with HTLV-1 are prone to TB due to changes in the innate immune response or chronic immunosuppression, and summarized the relevant immunologic events, including impairment of cytokines (TNF-α, IL-1β, and IL-17) production, overexpression of TGF-β and dysfunction of HTLV-1-infected lymphocytes.[27]

Another striking KEGG enrichment pathway was T cell receptor signaling pathway. T cell receptor-mediated signaling is an interactive pathway for cytokine production in TB, primarily because Mtb is one of the well-known intracellular pathogens to thrive inside macrophages, and T cell effector mechanism is essential to limit bacterial infection and promote bacterial clearance rather than antibodies.[28] The variation of T-cell signal transduction may be resulted from the impaired immune function of T cells, because the dysfunction of circulating T lymphocytes and mononuclear cells is a common feature in TB patients, especially at chronic infection stage.[29] Furthermore, Mahon et al[30] reported that the Mtb cell wall glycolipids directly inhibited CD4+ T cells activation by interference with proximal T cell receptor signaling. The follow-up study gave a more detailed explanation that Mtb mannose capped lipoarabinomannan inserted in T lymphocytes membranes to disrupt the T cell receptor signaling pathway, eventually hindered the activation of CD4+ T cells.[31] CD4+ T cells were known to facilitate adaptive immunity against Mtb by producing IFN-γ and interleukin (IL-17 and IL-22).[32,33] Altogether, the present enrichment pathway analysis is well in agreement with the above studies on immune responses against TB.

The third KEGG enrichment pathway was glycosaminoglycan biosynthesis-heparan sulfate/heparin. Heparin is one of the highly sulfated glycosaminoglycans, and it is synthesized and secreted from activated mast cells in inflammatory processes.[34] The mycobacterial heparin-binding hemagglutinin adhesin is known to be a virulence factor for adhesion, aggregation, and dissemination in Mtb infection.[35] Nevertheless, the adherence of mycobacteria to alveolar epithelial cells can be specifically inhibited by heparin and other sulfated carbohydrates.[36] Thereafter, Abreu et al[37] found that heparin reduced iron availability of intracellular bacilli to limit bacterial replication through iron-chelation mechanisms, and exhibited a protective immunomodulatory role in Mtb infection process. The detailed explanation of the mechanisms was that heparin both significantly decreased the expression of a critical iron regulatory protein – hepcidin, and elevated the expression of an iron exporter protein – ferroportin. Relevant study elaborated that the anti-TB function of heparin was achieved by iron chelation. It has been reported that the chelation of intracellular iron may inhibit Mtb replication in macrophages and restore host defense ability.[38] This is in accordance with the present result of Reactome pathway enrichment analysis (R-HSA-1222449: Mtb iron assimilation by chelation).

The last enrichment pathway is Hippo signaling pathway. The core components of the canonical Hippo pathway, mammalian sterile 20-like 1 and 2 kinases are vital for T-cell maturation, migration, homing, and differentiation.[39] Growing evidence confirmed that the Hippo signaling pathway in lymphocytes had the function of regulating the immune system and maintaining immune homeostasis.[40] Additional research showed that the Hippo signaling pathway could modulate bactericidal activity and immune responses in mammals.[41] For example, Mtb-triggered Hippo signaling regulates the expression of C-X-C motif chemokine ligand 1 (CXCL1) and CXCL2 to alter host immune responses.[42] In macrophages, chemokines CXCL1 and CXCL2 participate in Caspase1-mediated inflammasome activation and subsequent IL-1β production.[43] It has been repeatedly reported that Hippo signaling pathway will be activated in Mtb invasion. The hippo signaling pathway is initiated in patients with multi-drug resistant TB.[44] Host-mycobacterial sophisticated interactions are orchestrated by Hippo signaling pathway.[39,45]

The regulatory relationship of abnormally expressed RNAs has been also explored. In the ceRNA regulatory network about chronic granulomatous disease, multitudinous lncRNAs indirectly suppress the down-regulation of KMT2A by sponging hsa-miR-106a-5p and hsa-miR-17-5p. Whereas, current studies found that KMT2A was mainly associated with acute lymphoblastic leukemia and acute myeloid leukemia.[46,47] The reason for this has been uncovered, lymphoma relative pathways, such as HTLV-l infection, were abnormally activated because the number of lymphocytes, especially activated mononuclear cells, were rapidly increased during Mtb infection.[24] Extensive research has shown that miR-17-5p acts an immunomodulatory regulator of autophagy in mycobacteria-infected macrophages. Hsa-miR-17-5p targets Mcl-1 and STAT3 to regulate autophagy in Mtb-infected macrophages.[48] Besides, miR-17-5p plays a role in the anti-TB effect of macrophages by directly targeting the autophagy-related gene ATG7 3’ – untranslated region and preventing the fusion of autophagosomes with lysosomes to inhibit autophagy.[49] The consensus finding indicates the high reliability and feasibility of miR-17-5p as a novel biomarker.

In the ceRNA regulatory network about TB, hsa_circ_0004790 and lncRNAs sponges hsa-miR-2355-5p to regulate the expression of B3GAT1 and IL-32. B3GAT1 is a member of the glucuronyltransferase family and involved in glycosaminoglycan metabolism/heparin.[50] As previously mentioned, heparin reduces iron levels in human macrophages to inhibit intracellular Mtb bacterial replication.[37] IL-32 is a multifaceted cytokine associated with anti-inflammatory functions.[51] In comparison with cellular necrosis, IL-32 can induce caspase-3-dependent apoptosis and cathepsin-mediated pyroptosis, and this is an important effector mechanism for macrophages to kill intracellular Mtb.[52] Accumulating evidence has confirmed the anti-TB effect of IL-32.[53,54] More importantly, it has been reported that IL-32 can be as a protective marker against active TB.[55] Taken together, hsa-miR-2355-5p indirectly modulates the iron chelation and immune response status in Mtb -infected macrophages through altering the expression levels of B3GAT1 and IL-32.

In summary, the methods for discriminating TB in current clinical practice are limited because they are time-consuming to perform or strenuous to trace these pathogenic bacteria. This fact drives us to explore effective biomarkers for the diagnosis of TB. Adequate use of transcriptome methods and gene expression techniques helps to promote research in TB diagnostic methods, monitoring treatment response and development vaccines, more specifically, the utilization of transcriptome profiles to recognize biomarkers.[56] Identification of novel biomarkers is significant to improve the accuracy of TB diagnosis. Apart from improving diagnostic efficiency, the role of TB biomarkers is crucial in 3 areas: predict durable treatment success in active TB patients; indicate reactivation risk and predict treatment success in latent infection patients; evaluate vaccine efficacy.[57] The bioinformatics study identified differentially expressed genes and miRNAs in PTB patients. Functional annotation and enrichment of DE-mRNAs revealed TB pathological mechanism was strongly related to the biological processes and immune response affected by KMT2A, B3GAT1 and IL-32. Although the longitudinal study eliminated the bias from the genetic background and confirmed that 3 miRNAs could serve as diagnostic markers, it has several limitations. Firstly, the extrapolation of the result calls for cautious interpretation because the sample size is relatively small. In short, this finding needs to be verified in further studies with a large and representative sample. Secondly, the current work just detected DE-RNAs at the transcriptional level but did not validate the expression of target molecules at the protein level. In future research, more attention will be paid to the target molecular functional analysis, mechanisms of TB pathogenesis and diagnostic value of biomarkers.

5. Conclusion

In conclusion, a panel of dysregulated RNAs in TB development and progression were identified. Function and pathway enrichment analysis revealed that the TB pathogenesis was associated with pathogen infection, T cell receptor signaling pathway, glycosaminoglycan biosynthesis-heparan sulfate/heparin and Hippo signaling pathway. What’s more, the key finding to emerge from this study is that hsa-miR-106a-5p, hsa-miR-17-5p, and hsa-miR-2355-5p might be promising candidate biomarkers to diagnose TB. Nevertheless, more detailed and exhaustive studies are required to verify this finding.


We thank Gene Denovo Biotechnology Co. (Guangzhou, China) for whole transcriptome sequencing.

Author contributions

Conceptualization: Jie Song, Jiaguan Sun, Weimin Meng.

Data curation: Jie Song, Bingdong Fan, Hongjuan Liu, Weimin Meng.

Formal analysis: Bingdong Fan, Hongjuan Liu.

Investigation: Yuqing Wang, Chenglan Bao.

Methodology: Yuqing Wang.

Resources: Yuehe Ding, Shengrong Zhang.

Software: Yuehe Ding, Shengrong Zhang.

Supervision: Yuehe Ding, Shengrong Zhang, Fengxia Chang, Weimin Meng.

Validation: Shengrong Zhang, Xiuzhen Ma, Fengxia Chang.

Writing – original draft: Jiaguan Sun, Xiuzhen Ma, Fengxia Chang.

Writing – review & editing: Jie Song, Jiaguan Sun.


biological process
cellular component
competitive endogenous RNA
circular RNA
C-X-C motif chemokine ligand 1
differentially expressed RNAs
fold change
fragment per kilobase of transcript per million mapped reads
gene ontology
Kyoto Encyclopedia of Genes and Genomes
long non-coding RNA
molecular function
Mycobacterium tuberculosis
pulmonary tuberculosis


[1]. Floyd K, Glaziou P, Zumla A, et al. The global tuberculosis epidemic and progress in care, prevention, and research: an overview in year 3 of the End TB era. Lancet Respir Med. 2018;6:299–314.
[2]. Kishk SM, McLean KJ, Sood S, et al. Design and synthesis of imidazole and triazole pyrazoles as mycobacterium tuberculosis CYP121A1 inhibitors. ChemistryOpen. 2019;8:995–1011.
[3]. Wang YXJ, Chung MJ, Skrahin A, et al. Radiological signs associated with pulmonary multi-drug resistant tuberculosis: an analysis of published evidences. Quant Imag Med Surg. 2018;8:161–73.
[4]. WHO. Global Tuberculosis Report 2020. Available at: [access date March 7, 2023].
[5]. Yi Z, Gao K, Li R, et al. Dysregulated circRNAs in plasma from active tuberculosis patients. J Cell Mol Med. 2018;22:4076–84.
[6]. Ren N, JinLi J, Chen Y, et al. Identification of new diagnostic biomarkers for Mycobacterium tuberculosis and the potential application in the serodiagnosis of human tuberculosis. Microb Biotechnol. 2018;11:893–904.
[7]. Huang ZK, Yao F-Y, Xu J-Q, et al. Microarray expression profile of circular RNAs in peripheral blood mononuclear cells from active tuberculosis patients. Cell Physiol Biochem. 2018;45:1230–40.
[8]. Liu H, Lu G, Wang W, et al. A panel of CircRNAs in the serum serves as biomarkers for mycobacterium tuberculosis infection. Front Microbiol. 2020;11:1215.
[9]. Lu P, Chen X, Zhu L-M, et al. Interferon-gamma release assays for the diagnosis of tuberculosis: a systematic review and meta-analysis. Lung. 2016;194:447–58.
[10]. Lanzas F, Ioerger TR, Shah H, et al. First evaluation of GenoType MTBDRplus 2.0 performed directly on respiratory specimens in Central America. J Clin Microbiol. 2016;54:2498–502.
[11]. Wang Z, Xu H, Wei Z, et al. The role of non-coding RNA on macrophage modification in tuberculosis infection. Microb Pathog. 2020;149:104592.
[12]. Correia CN, Nalpas NC, McLoughlin KE, et al. Circulating microRNAs as potential biomarkers of infectious disease. Front Immunol. 2017;8:118.
[13]. Wagh V, Urhekar A, Modi D. Levels of microRNA miR-16 and miR-155 are altered in serum of patients with tuberculosis and associate with responses to therapy. Tuberculosis (Edinb). 2017;102:24–30.
[14]. Ndzi EN, Nkenfou CN, Mekue LM, et al. MicroRNA hsa-miR-29a-3p is a plasma biomarker for the differential diagnosis and monitoring of tuberculosis. Tuberculosis (Edinb). 2019;114:69–76.
[15]. Munteanu MC, Huang C, Liang Y, et al. Long non-coding RNA FENDRR regulates IFNgamma-induced M1 phenotype in macrophages. Sci Rep. 2020;10:13672.
[16]. Zeng JC, Lin D-Z, Yi L-L, et al. BTLA exhibits immune memory for alphabeta T cells in patients with active pulmonary tuberculosis. Am J Transl Res. 2014;6:494–506.
[17]. Xu LN, Wang Y-Q, Wang Z-Y, et al. Transcriptome differences between Cry1Ab resistant and susceptible strains of Asian corn borer. BMC Genomics. 2015;16:173.
[18]. Jain A, Kihara D. Phylo-PFP: improved automated protein function prediction using phylogenetic distance of distantly related sequences. Bioinformatics. 2019;35:753–9.
[19]. Gupta KR, Arora G, Mattoo A, et al. Stringent response in mycobacteria: from biology to therapeutic potential. Pathogens. 2021;10:1417.
[20]. Arora G, Bothra A, Prosser G, et al. Role of post-translational modifications in the acquisition of drug resistance in Mycobacterium tuberculosis. FEBS J. 2021;288:3375–93.
[21]. Singh N, Sharma N, Singh P, et al. HupB, a nucleoid-associated protein, is critical for survival of Mycobacterium tuberculosis under host-mediated stresses and for enhanced tolerance to key first-line antibiotics. Front Microbiol. 2022;13:937970.
[22]. Eusebio-Ponce E, Anguita E, Paulino-Ramirez R, et al. HTLV-1 infection: an emerging risk. Pathogenesis, epidemiology, diagnosis and associated diseases. Rev Esp Quimioter. 2019;32:485–96.
[23]. Flynn JL. Immunology of tuberculosis and implications in vaccine development. Tuberculosis (Edinb). 2004;84:93–101.
[24]. Zhuang ZG, Zhang J-A, Luo H-L, et al. The circular RNA of peripheral blood mononuclear cells: Hsa_circ_0005836 as a new diagnostic biomarker and therapeutic target of active pulmonary tuberculosis. Mol Immunol. 2017;90:264–72.
[25]. Mascarenhas RE, Brodskyn C, Barbosa G, et al. Peripheral blood mononuclear cells from individuals infected with human T-cell lymphotropic virus type 1 have a reduced capacity to respond to recall antigens. Clin Vaccine Immunol. 2006;13:547–52.
[26]. Bastos MdL, Santos SB, Souza A, et al. Influence of HTLV-1 on the clinical, microbiologic and immunologic presentation of tuberculosis. BMC Infect Dis. 2012;12:199.
[27]. Keikha M, Karbalaei M. Overview on coinfection of HTLV-1 and tuberculosis: mini-review. J Clin Tuberc Other Mycobact Dis. 2021;23:100224.
[28]. Sharma B. T cell receptor mediated signalling: an interactive pathway for cytokine production in tuberculosis. EC Pulmonol Respir Med. 2019;8:340–52.
[29]. Delgado JC, Tsai EY, Thim S, et al. Antigen-specific and persistent tuberculin anergy in a cohort of pulmonary tuberculosis patients from rural Cambodia. Proc Natl Acad Sci USA. 2002;99:7576–81.
[30]. Mahon RN, Rojas RE, Fulton SA, et al. Mycobacterium tuberculosis cell wall glycolipids directly inhibit CD4+ T-cell activation by interfering with proximal T-cell-receptor signaling. Infect Immun. 2009;77:4574–83.
[31]. Mahon RN, Sande OJ, Rojas RE, et al. Mycobacterium tuberculosis ManLAM inhibits T-cell-receptor signaling by interference with ZAP-70, Lck and LAT phosphorylation. Cell Immunol. 2012;275:98–105.
[32]. Tateosian NL, Pellegrini JM, Amiano NO, et al. IL17A augments autophagy in Mycobacterium tuberculosis-infected monocytes from patients with active tuberculosis in association with the severity of the disease. Autophagy. 2017;13:1191–204.
[33]. Kupz A, Zedler U, Stäber M, et al. ESAT-6-dependent cytosolic pattern recognition drives noncognate tuberculosis control in vivo. J Clin Invest. 2016;126:2109–22.
[34]. Garcia-Mayoral MF, Moussaoui M, de la Torre BG, et al. NMR structural determinants of eosinophil cationic protein binding to membrane and heparin mimetics. Biophys J. 2010;98:2702–11.
[35]. Pethe K, Alonso S, Biet F, et al. The heparin-binding haemagglutinin of M. tuberculosis is required for extrapulmonary dissemination. Nature. 2001;412:190–4.
[36]. Menozzi FD, Rouse JH, Alavi M, et al. Identification of a heparin-binding hemagglutinin present in mycobacteria. J Exp Med. 1996;184:993–1001.
[37]. Abreu R, Essler L, Loy A, et al. Heparin inhibits intracellular Mycobacterium tuberculosis bacterial replication by reducing iron levels in human macrophages. Sci Rep. 2018;8:7296.
[38]. Cronje L, Edmondson N, Eisenach KD, et al. Iron and iron chelating agents modulate Mycobacterium tuberculosis growth and monocyte-macrophage viability and effector functions. FEMS Immunol Med Microbiol. 2005;45:103–12.
[39]. Lin Y, Zhang Y, Yu H, et al. Identification of unique key genes and miRNAs in latent tuberculosis infection by network analysis. Mol Immunol. 2019;112:103–14.
[40]. Nehme NT, Schmid JP, Debeurme F, et al. MST1 mutations in autosomal recessive primary immunodeficiency characterized by defective naive T-cell survival. Blood. 2012;119:3458–68.
[41]. Hong L, Li X, Zhou D, et al. Role of Hippo signaling in regulating immunity. Cell Mol Immunol. 2018;15:1003–9.
[42]. Boro M, Singh V, Balaji KN. Mycobacterium tuberculosis-triggered Hippo pathway orchestrates CXCL1/2 expression to modulate host immune responses. Sci Rep. 2016;6:37695.
[43]. Boro M, Balaji KN. CXCL1 and CXCL2 regulate NLRP3 inflammasome activation via G-protein-coupled receptor CXCR2. J Immunol. 2017;199:1660–71.
[44]. Yan H, Xu R, Zhang X, et al. Identifying differentially expressed long non-coding RNAs in PBMCs in response to the infection of multidrug-resistant tuberculosis. Infect Drug Resist. 2018;11:945–59.
[45]. Chakrabarty S, Kumar A, Raviprasad K, et al. Host and MTB genome encoded miRNA markers for diagnosis of tuberculosis. Tuberculosis (Edinb).2019;116:37–43.
[46]. Basilico S, Wang X, Kennedy A, et al. Dissecting the early steps of MLL induced leukaemogenic transformation using a mouse model of AML. Nat Commun. 2020;11:1407.
[47]. Ferrari A, Ghelli Luserna Di Rora A, Domizio C, et al. Rearrangements of ATP5L-KMT2A in acute lymphoblastic leukaemia. Br J Haematol. 2021;192:e139–44.
[48]. Kumar R, Sahu SK, Kumar M, et al. MicroRNA 17-5p regulates autophagy in Mycobacterium tuberculosis-infected macrophages by targeting Mcl-1 and STAT3. Cell Microbiol. 2016;18:679–91.
[49]. Dan-tong Hong FZ, Shu-e W, Hong-xia W, et al. miR-17-5p targeting autophagy related protein ATG7 regulates macrophages against mycobacterium tuberculosis infection. China Biotechnol. 2019;39:1–8.
[50]. Vantaku V, Amara CS, Piyarathna DWB, et al. DNA methylation patterns in bladder tumors of African American patients point to distinct alterations in xenobiotic metabolism. Carcinogenesis. 2019;40:1332–40.
[51]. Aass KR, Kastnes MH, Standal T. Molecular interactions and functions of IL-32. J Leukoc Biol. 2021;109:143–59.
[52]. Bai X, Dinarello CA, Chan ED. The role of interleukin-32 against tuberculosis. Cytokine. 2015;76:585–7.
[53]. Bai X, Kim S-H, Azam T, et al. IL-32 is a host protective cytokine against Mycobacterium tuberculosis in differentiated THP-1 human macrophages. J Immunol. 2010;184:3830–40.
[54]. Bai X, Shang S, Henao-Tamayo M, et al. Human IL-32 expression protects mice against a hypervirulent strain of Mycobacterium tuberculosis. Proc Natl Acad Sci USA. 2015;112:5111–6.
[55]. Montoya D, Inkeles MS, Liu PT, et al. IL-32 is a molecular marker of a host defense network in human tuberculosis. Sci Transl Med. 2014;6:250ra114.
[56]. van Rensburg IC, Loxton AG. Transcriptomics: the key to biomarker discovery during tuberculosis? Biomark Med. 2015;9:483–95.
[57]. Wallis RS, Pai M, Menzies D, et al. Biomarkers and diagnostics for tuberculosis: progress, needs, and translation into practice. Lancet. 2010;375:1920–37.

competitive endogenous RNA; diagnostic biomarkers; regulatory network; tuberculosis

Supplemental Digital Content

Copyright © 2023 the Author(s). Published by Wolters Kluwer Health, Inc.