INTRODUCTION
Repetitive elements are ubiquitous in metazoan genomes and make up the bulk of the genome.[1,2] A repetitive element is a segment of DNA that recurs multiple times within the genome.[3,4] Various types of repetitive elements are present in the genomes of organisms, but they can be divided into two major classes: dispersed repeats and tandem repeats.[5,6] Tandem repeats comprise satellite DNA, minisatellites, and microsatellite DNA, whereas dispersed repeats contain transposable elements.[7] Based on the orientation of the repeating unit, further types of repetitive elements can be distinguished: direct repeats, inverted repeats (palindromic sequences), mirror repeats, and everted repeats. These types are shown in Figure 1. Direct repeats are sequence motifs that recur in the same order on the same strand; inverted repeats are motifs that recur in the opposite orientation on the opposite strand; everted repeats are motifs that recur in the same orientation but on the complementary strand; and mirror repeats are motifs that recur on the same strand in reverse orientation and therefore have a center of symmetry.[8] These repeats can form unusual structures, such as left-handed Z-DNA, triple-stranded H-DNA, G-quadruplexes, and cruciforms, and are associated with various diseases such as cancer.[9,10] It has been reported that poly (dG-dT/dC-dT) forms triplex H-DNA, poly (dT-dA) forms cruciforms, and poly (dC-dA/dT-dG) forms Z-DNA in the eukaryotic genome.[11] An intermolecular triplex DNA is formed between a homopurine–homopyrimidine site in dsDNA and a triplex-forming oligonucleotide.[12] Many previous reports found these repeating units near gene promoter regions and recombination sites.[13,14] Shin et al. observed that in humans, 76% of H-DNA and cruciform-forming sequences are located in either untranslated regions or introns, only 3% of non-B-DNA structure-forming sequences are found in protein-coding regions, and the remaining 20% of unusual DNA structure-forming sequences are present in either the 5’ or 3’ flanking region of genes.[15,16] Several studies have demonstrated that these non-B-DNA structures regulate gene expression, replication, translation, and recombination.[17] Various disorders such as myotonic dystrophy, Fragile X syndrome, X-linked spinal and bulbar muscular atrophy, and Friedreich’s ataxia have been attributed to repeat elements.[18–20] These repeats are associated with the regulatory networks of epigenesis, transcription, telomere maintenance, replication, and evolution.[21,22] In mammalian cells, H-DNA is intrinsically mutagenic and can cause large-scale mutations such as rearrangements and deletions, which helps in understanding the molecular basis of autosomal dominant polycystic kidney disease.[23]
Figure 1: Types of repeat elements in genome. (a) Direct repeats, (b) Inverted repeats, (c) Everted repeats, (d) Mirror repeats
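The four orientation-based repeat types can be illustrated with simple string operations. The following is a minimal Python sketch and not part of the original analysis: the function name repeat_partner is hypothetical, all sequences are written 5'->3' on one strand, and reading an everted repeat as the plain complement of the motif is one interpretation of the definition given above.

```python
# Complement table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def repeat_partner(motif, kind):
    """Return the second copy expected for a repeat of the given kind.

    Both the motif and the returned partner are written left to right
    on the same strand. The "everted" case (complement without
    reversal) is an assumption based on the textual definition.
    """
    motif = motif.upper()
    if kind == "direct":
        return motif                               # same order, same strand
    if kind == "mirror":
        return motif[::-1]                         # reversed, same strand
    if kind == "inverted":
        return motif.translate(COMPLEMENT)[::-1]   # reverse complement
    if kind == "everted":
        return motif.translate(COMPLEMENT)         # complement only
    raise ValueError(f"unknown repeat kind: {kind}")


motif = "GATTC"
for kind in ("direct", "mirror", "inverted", "everted"):
    print(kind, motif, repeat_partner(motif, kind))
```

A motif followed directly by its mirror partner reads the same in both directions on one strand, which is the center of symmetry that distinguishes mirror repeats from the other three types.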
The nematode Caenorhabditis elegans has been used as a model organism for the last 50 years in fields such as genetics, neurobiology, and developmental biology.[24,25] The genome of C. elegans is well annotated, owing in part to its small size (100 Mb).[26] Repetitive elements, including inverted repeats, tandem repeats, and transposons, make up 17% of the C. elegans genome.[27] Our key interest here is to identify mirror repeats within the ced-9 gene of C. elegans, for which we used a simple approach. ced-9 is an antiapoptotic gene that protects cells from programmed cell death.[28] If ced-9 function is lost through mutation, cells that normally survive instead die.[29] Our next aim is to characterize the molecular and biophysical properties of the identified mirror repeats.
METHODS
First, the full gene sequence of ced-9 of C. elegans was retrieved from the National Center for Biotechnology Information in FASTA format. Mirror repeats were identified using the simple approach applied in similar studies, as shown in Figure 2.[30–32] The identified mirror repeats were then searched for in the Caenorhabditis vulgaris, Xenopus tropicalis, and Drosophila melanogaster genomes using the Megablast tool. The non-B-DNA Motif Search Tool (nBMST) was also used to identify mirror repeats within the ced-9 gene of C. elegans (https://nonb-abcc.ncifcrf.gov/apps/nBMST/default/).[33]
Figure 2: Pictorial view of methodology of identification of mirror repeats
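To make the notion of a mirror repeat concrete, the following is a minimal Python sketch that scans a sequence directly for perfect mirror repeats. It is not the alignment-based workflow of Figure 2 but a swapped-in illustration of the definition: the function name find_mirror_repeats is hypothetical, the 7 bp minimum is taken from the smallest repeat reported in this study, and treating the single central base of an odd-length repeat as a "single spacer" is an assumption. Imperfect repeats and longer spacers are not handled.

```python
def find_mirror_repeats(seq, min_len=7):
    """Find maximal perfect mirror repeats by expanding around each center.

    Returns tuples (start, end, sequence, center_bases) with 1-based,
    inclusive coordinates. center_bases is 1 when a single base sits on
    the center of symmetry and 0 when the two arms touch directly.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    # 2n - 1 possible centers: on each base and between adjacent bases.
    for center in range(2 * n - 1):
        left = center // 2
        right = left + center % 2
        # Grow outward while the two flanking bases are identical.
        while left >= 0 and right < n and seq[left] == seq[right]:
            left -= 1
            right += 1
        start, end = left + 1, right          # half-open interval [start, end)
        length = end - start
        if length >= min_len:
            hits.append((start + 1, end, seq[start:end], length % 2))
    return hits


print(find_mirror_repeats("CCAGTCCTGAGT"))
```

For the toy input above, the only repeat of at least 7 bp is AGTCCTGA at positions 3 to 10, whose two arms (AGTC and CTGA) meet without a central base.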
RESULTS
The ced-9 gene is located on chromosome III and is 2351 bp long. It has 4 exons; their positions and lengths are given in Table 1, and the maximum number of hits was obtained at an E value of 100. Exon 4, the largest, has 16 mirror repeats, whereas exon 3, the smallest, has 4 mirror repeats.
Table 1: Total number of hits analyzed within the exons of ced-9 gene at E-value 100
The ced-9 gene was split into regions of 500 bp to obtain the maximum number of mirror repeats. We observed 53 mirror repeats (MRs) of different lengths and types within the complete ced-9 gene of C. elegans. The largest mirror repeat is 55 bp long and the smallest is 7 bp long. The identified mirror repeats of different lengths are shown in Table 2. Small MRs were more frequent than large ones: of the 53 MRs, 45 (84.91%) are 7–12 bp long, 3 (5.66%) are 13–18 bp long, 3 (5.66%) are 19–24 bp long, and 2 (3.77%) are ≥25 bp long.
Table 2: Representation of length of mirror repeats in different parts of ced-9 gene
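The size classes and their shares of the 53 mirror repeats can be recomputed with a few lines of Python. This is a minimal sketch: the function names and bin labels are illustrative, the counts are those stated in the text, and percentages are rounded to two decimal places.

```python
def size_class(length):
    """Assign a mirror-repeat length (bp) to one of the four size classes."""
    if length < 7:
        raise ValueError("repeats shorter than 7 bp were not reported")
    if length <= 12:
        return "7-12"
    if length <= 18:
        return "13-18"
    if length <= 24:
        return "19-24"
    return ">=25"


def percentages(counts):
    """Convert per-class counts to percentages of the total."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Counts per size class as reported for the ced-9 gene.
counts = {"7-12": 45, "13-18": 3, "19-24": 3, ">=25": 2}
shares = percentages(counts)
for label, value in shares.items():
    print(f"{label} bp: {counts[label]} repeats, {value}%")
```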
The identified mirror repeats were classified into perfect and imperfect mirror repeats, with or without spacers (single spacer, double spacer, or multispacer), based on the center of symmetry and spacer elements.[34] Among the 53 mirror repeats, we observed 39 perfect MRs with a single spacer, 8 perfect MRs without a spacer, 1 imperfect MR with a multispacer, 4 imperfect MRs with a single spacer, and 1 imperfect MR without any spacer. We identified 5 homopyrimidine-rich sequences (Hpy) and 3 homopurine-rich sequences (Hpu) within the whole ced-9 gene, as shown in Table 3. Figure 3 depicts the locations of the identified MRs within the ced-9 gene of C. elegans: the exons are shown in different colors and the remaining part is intronic sequence. Of the identified MRs, 30 lie within exons and 23 within intronic sequence. We then used the non-B-DNA tool to identify mirror repeats within the ced-9 gene of C. elegans. It detected only one mirror repeat, which is shown in Table 4. This mirror repeat is a perfect mirror with a single spacer, a size of 24 bp, and a spacer element of 12 bp.
Table 3: Representation of length, position, and types of mirror repeats in different parts of ced-9 gene
Figure 3: Pictorial view of position of identified mirror repeats within ced-9 gene of Caenorhabditis elegans
Table 4: Representation of mirror repeats within the ced-9 gene of C. elegans identified using the non-B-DNA Motif Search Tool
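Homopurine-rich (Hpu) and homopyrimidine-rich (Hpy) sequences such as those listed in Table 3 can be flagged by their base composition. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the 0.9 threshold for calling a sequence "rich" is an arbitrary illustrative value, not a criterion taken from this study.

```python
PURINES = set("AG")


def purine_fraction(seq):
    """Fraction of A/G among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T")
    return sum(b in PURINES for b in bases) / len(bases)


def composition_class(seq, threshold=0.9):
    """Label a sequence as Hpu, Hpy, or mixed.

    threshold is an assumed cut-off chosen only for illustration.
    """
    fraction = purine_fraction(seq)
    if fraction >= threshold:
        return "Hpu"
    if 1 - fraction >= threshold:
        return "Hpy"
    return "mixed"


for example in ("AGGAAGGAGA", "CTTCCTCTTC", "ACGTACGT"):
    print(example, composition_class(example))
```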
The identified mirror repeats were further searched for within the C. elegans genome and the genomes of other organisms, as shown in Table 5. Here, a + sign indicates the presence of a mirror repeat within a genome and a – sign indicates its absence. Smaller mirror repeats were rarely found in the C. elegans genome or in the other organisms’ genomes. Similarly, very large mirror repeats were not detected within the genomes examined. However, mirror repeats of 11–14 bp were frequently found in the C. elegans genome and in the other organisms’ genomes. Notably, a few mirror sequences identified within the ced-9 gene of C. elegans were not detected in the C. elegans genome itself. This indicates that the Megablast tool needs to be improved so that it handles short input sequences more reliably.
Table 5: Distribution of identified mirror repeats of ced-9 gene among different organisms’ genomes
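Because the shortest mirror repeats are only 7 bp long, an exact string search offers a simple cross-check that does not rely on alignment seeding. The following is a minimal Python sketch, not part of the original workflow: the function name locate_motif is hypothetical, the genome is assumed to be available as a plain string of A, C, G, and T, and reported positions are 1-based starts on the forward strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _positions(text, pattern):
    """Return 1-based start positions of all (overlapping) exact matches."""
    hits = []
    i = text.find(pattern)
    while i != -1:
        hits.append(i + 1)
        i = text.find(pattern, i + 1)
    return hits


def locate_motif(genome, motif, both_strands=True):
    """Find exact occurrences of a motif on the forward and reverse strands.

    Reverse-strand hits are found by searching for the reverse complement
    and are reported in forward-strand coordinates.
    """
    genome, motif = genome.upper(), motif.upper()
    forward = _positions(genome, motif)
    reverse = []
    if both_strands:
        rc = motif.translate(COMPLEMENT)[::-1]
        if rc != motif:                     # avoid double-counting
            reverse = _positions(genome, rc)
    return forward, reverse


toy_genome = "AAGATTCGCTTAGAAGATTCGCTTAG"
print(locate_motif(toy_genome, "GATTCGCTTAG"))
```

In the toy sequence the motif occurs twice on the forward strand and not at all on the reverse strand; with a real genome the same call would be made once per chromosome sequence.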
DISCUSSION
In the current study, a rapid approach was used to identify mirror repeats within the ced-9 gene of C. elegans. The non-B-DNA Motif Search Tool (nBMST) was used for the same purpose. We identified 53 mirror repeats using our strategy, whereas the nBMST tool detected only one. For this gene, the BLAST-based approach was therefore more sensitive than nBMST. Previous studies identified mirror repeats in different operons of Escherichia coli strain K-12 substrain MG1655 and in the HIV-1 and HIV-2 genomes.[6,34] Yadav et al. studied flowering genes and photosynthetic genes of Arabidopsis thaliana for mirror repeats.[35,36] Lang identified different types of mirror repeats within the gag gene of the HIV-1 (HXB2) genome.[37] Mirror repeats have been reported within the Engrailed Homeobox-1 gene of Xenopus tropicalis and within the vegf-aa and pdgf-aa genes of Danio rerio.[31,32,38] We also observed 31 mirror repeats within the egl-1 apoptotic gene of C. elegans.[39] Kavita et al. studied the Maleless and Intersex genes of D. melanogaster for the identification of mirror repeats.[30,40] Here, we observed some homopurine and homopyrimidine sequences within the ced-9 gene of C. elegans that can adopt non-B-DNA structures such as H-DNA, cruciforms, and hairpins. These non-B-DNA structures have the potential to stall replication and transcription and also play a role in genetic instability and mutation.[41,42] Nevertheless, the exact function of the identified mirror repeats in the apoptosis process of C. elegans remains unknown.
CONCLUSIONS
The distribution of mirror repeats within the genome of C. elegans and the genomes of other organisms indicates that mirror repeats are an essential part of the genome and might play an important role in evolutionary processes.
Limitation of the study
Using the Megablast tool, we were not able to detect the smaller identified mirror repeats within the same genome.
Ethical statement
Not applicable to our research.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
REFERENCES
1. Lower SE, Dion-Côté AM, Clark AG, Barbash DA. Repetitive DNA sequences. Genes 2019;10:896.
2. De Bustos A, Cuadrado A, Jouve N. Sequencing of long stretches of repetitive DNA. Sci Rep 2016;6:36665.
3. Pathak D, Ali S. Repetitive DNA: A tool to explore animal genomes/transcriptomes. In: Functional Genomics. London: IntechOpen; 2012. p. 155–80.
4. Mehrotra S, Goyal V. Repetitive sequences in plant nuclear DNA: Types, distribution, evolution and function. Genomics Proteomics Bioinformatics 2014;12:164–71.
5. Richard GF, Kerrest A, Dujon B. Comparative genomics and molecular dynamics of DNA repeats in eukaryotes. Microbiol Mol Biol Rev 2008;72:686–727.
6. Yadav S, Yadav U, Sharma DC. In silico approach for the identification of mirror repeats in selected operon genes of Escherichia coli strain K-12 substrain MG1655. Biomed Biotechnol Res J 2022;6:93–7.
7. Zattera ML, Gazolla CB, Soares AA, Gazoni T, Pollet N, Recco-Pimentel SM, et al. Evolutionary dynamics of the repetitive DNA in the karyotypes of Pipa carvalhoi and Xenopus tropicalis (Anura, Pipidae). Front Genet 2020;11:637.
8. Gurusaran M, Ravella D, Sekar K. RepEx: Repeat extractor for biological sequences. Genomics 2013;102:403–8.
9. Poggi L, Richard GF. Alternative DNA structures in vivo: Molecular evidence and remaining questions. Microbiol Mol Biol Rev 2021;85:e00110–20.
10. Zhao J, Bacolla A, Wang G, Vasquez KM. Non-B DNA structure-induced genetic instability and evolution. Cell Mol Life Sci 2010;67:43–62.
11. Gross DS, Garrard WT. The ubiquitous potential Z-forming sequence of eucaryotes, (dT-dG)n.(dC-dA)n, is not detectable in the genomes of eubacteria, archaebacteria, or mitochondria. Mol Cell Biol 1986;6:3010–3.
12. Guerrini L, Alvarez-Puebla RA. Structural recognition of triple-stranded DNA by surface-enhanced Raman spectroscopy. Nanomaterials (Basel) 2021;11:326.
13. Elmore MH, Gibbons JG, Rokas A. Assessing the genome-wide effect of promoter region tandem repeat natural variation on gene expression. G3 (Bethesda) 2012;2:1643–9.
14. Harvey VC, Acio CR, Bredehoft AK, Zhu L, Hallinger DR, Quinlivan-Repasi V, et al. Repetitive sequence variations in the promoter region of the adhesin-encoding gene sabA of Helicobacter pylori affect transcription. J Bacteriol 2014;196:3421–9.
15. Shin SI, Ham S, Park J, Seo SH, Lim CH, Jeon H, et al. Z-DNA-forming sites identified by ChIP-Seq are associated with actively transcribed regions in the human genome. DNA Res 2016;23:477–86.
16. Schroth GP, Ho PS. Occurrence of potential cruciform and H-DNA forming sequences in genomic DNA. Nucleic Acids Res 1995;23:1977–83.
17. Bacolla A, Cooper DN, Vasquez KM, Tainer JA. Non-B DNA Structure and Mutations Causing Human Genetic Disease. ELS 2018:1–15.
18. Bacolla A, Wells RD. Non-B DNA conformations as determinants of mutagenesis and human disease. Mol Carcinog 2009;48:273–85.
19. Nelson LD, Bender C, Mannsperger H, Buergy D, Kambakamba P, Mudduluru G, et al. Triplex DNA-binding proteins are associated with clinical outcomes revealed by proteomic measurements in patients with colorectal cancer. Mol Cancer 2012;11:38.
20. Paulson H. Repeat expansion diseases. Handb Clin Neurol 2018;147:105–23.
21. Sharma S. Non-B DNA secondary structures and their resolution by RecQ helicases. J Nucleic Acids 2011;2011:724215.
22. Grechko VV. Repeated DNA sequences as an engine of biological diversification. Mol Biol (Mosk) 2011;45:765–92.
23. Wang G, Vasquez KM. Naturally occurring H-DNA-forming sequences are mutagenic in mammalian cells. Proc Natl Acad Sci U S A 2004;101:13448–53.
24. Leung MC, Williams PL, Benedetto A, Au C, Helmcke KJ, Aschner M, et al. Caenorhabditis elegans: An emerging model in biomedical and environmental toxicology. Toxicol Sci 2008;106:5–28.
25. Meneely PM, Dahlberg CL, Rose JK. Working with worms: Caenorhabditis elegans as a model organism. Curr Protoc Essent Lab Tech 2019;19:e35.
26. Corsi AK, Wightman B, Chalfie M. A transparent window into biology: A primer on Caenorhabditis elegans. WormBook 2015;200:1–31.
27. Leyva-Díaz E, Stefanakis N, Carrera I, Glenwinkel L, Wang G, Driscoll M, et al. Silencing of repetitive DNA is controlled by a member of an unusual Caenorhabditis elegans gene family. Genetics 2017;207:529–45.
28. Malin JZ, Shaham S. Cell death in C. elegans development. Curr Top Dev Biol 2015;114:1–42.
29. Conradt B, Wu YC, Xue D. Programmed cell death during Caenorhabditis elegans development. Genetics 2016;203:1533–62.
30. Kavita S, Namrata D, Deepti Y, Vikash B. Identification of mirror repeats within the Maleless (MLE) gene of Drosophila melanogaster Meigen. Indian Journal of Entomology 2023. doi:10.55446/IJE.2022.98.
31. Yadav D, Dhankhar M, Saini K, Bhardwaj V. A novel approach for identification of mirror repeats within the Engrailed Homeobox-1 gene of Xenopus tropicalis. Biomed Biotechnol Res J 2022;6:532.
32. Dangi N, Saini K. Identification of mirror repeats within the pdgf-aa gene of Danio rerio. Journal of Pharmaceutical Negative Results 2023;14:2231–40.
33. Cer RZ, Donohue DE, Mudunuri US, Temiz NA, Loss MA, Starner NJ, et al. Non-B DB v2.0: A database of predicted non-B DNA-forming motifs and its associated tools. Nucleic Acids Res 2013;41:D94–100.
34. Yadav S, Yadav U, Sharma DC. In-Silico Evaluation of 'Mirror Repeats' in HIV Genome. Int J Life Sci 2021;1:81–7.
35. Yadav U, Yadav S, Sharma CS. Characterization of flowering genes of Arabidopsis thaliana for mirror repeats. Biointerface Res Appl Chem 2021;12:2852–61.
36. Yadav U, Yadav S, Sharma D. In silico analysis of structural photosynthetic genes of Arabidopsis thaliana for unique mirror repeats. Indian J Sci Technol 2022;15:127–35.
37. Lang DM. Imperfect DNA mirror repeats in the gag gene of HIV-1 (HXB2) identify key functional domains and coincide with protein structural elements in each of the mature proteins. Virol J 2007;4:113.
38. Dangi N, Saini K, Bhardwaj V. Study on Mirror Repeats in The Vegf-Aa Gene of Danio rerio. NeuroQuantology 2022;20:13–8.
39. Dhankhar M, Yadav D, Bhardwaj V. Identification of mirror repeats within the egl-1 gene of Caenorhabditis elegans. NeuroQuantology 2022;20:1196–203.
40. Dangi N, Bhardwaj V. Mirror repeats in the intersex gene of Drosophila melanogaster Meigen. Indian J Entomol 2023;85:40–5.
41. Brazda V, Fojta M, Bowater RP. Structures and stability of simple DNA repeats from bacteria. Biochem J 2020;477:325–39.
42. Guiblet WM, Cremona MA, Harris RS, Chen D, Eckert KA, Chiaromonte F, et al. Non-B DNA: A major contributor to small- and large-scale variation in nucleotide substitution frequencies across the genome. Nucleic Acids Res 2021;49:1497–516.