“In silico analysis of human TLR3 missense single nucleotide polymorphisms and their potential association with cancer”

Introduction

Cervical cancer ranks as the fourth most prevalent malignancy affecting women worldwide, arising from the epithelial tissue of the cervix [1]. It is one of the most successfully treated cancers when detected early. Human papillomaviruses (HPV), extremely prevalent viruses spread through sexual contact, are responsible for approximately 99% of cervical cancer cases [2]. Most cases of cervical cancer can be prevented with the HPV vaccine and secondary prevention techniques (screening for and treating precancerous lesions). Most HPV infections are eliminated by the immune system; however, persistent high-risk types can cause cervical cell changes that may lead to cancer [3]. It is currently unclear how the host immune system defends against HPV infection and what the key predictors of HPV infection outcome are.

Toll-like receptors (TLRs) are pattern-recognition receptors present in the cytoplasm and cell membrane that can detect pathogen-associated molecular patterns (PAMPs). TLRs have been found to play a crucial role in cervical cancer caused by high-risk HPV (HR-HPV) [4]. TLRs, which are required for both innate and acquired immunity, play an important part in the genesis and progression of many malignant tumors, as well as in the immune system's defense against infectious illnesses [5]. TLRs in Homo sapiens can be categorized into ten subclasses based on their primary sequence, TLR1 through TLR10 [6]. Among these, TLR1, TLR2, TLR4, TLR5, and TLR6 are present in the plasma membrane and are required for discriminating between bacterial and mycobacterial populations. TLR3 and TLR7, both found in endolysosomes, are essential for detecting dsRNA and ssRNA viruses, respectively [7]. TLR8 encodes a protein that has been associated with anogenital warts and superficial basal cell carcinoma. TLR9, which is present in the endolysosome, recognizes the cytosine-phosphate-guanine oligodeoxynucleotides (CpG-ODN) of bacteria and viruses.
Expression profiles differ between TLRs. The TLR3 gene encodes a type I transmembrane receptor that is required for pathogen identification and activation of innate defenses. When a virus is detected, TLR3 is activated, enhancing the production of type I interferons and signaling other cells to strengthen their antiviral defenses [8,9]. Genetic polymorphisms in TLR3 have been associated with altered susceptibility to HPV infection and cervical cancer [10].

Single-nucleotide polymorphisms (SNPs) are the most widespread form of genetic variation in humans [11]. Each SNP represents a change in a single nucleotide, the basic unit of DNA. SNPs are commonly classified by where they are found: in coding, non-coding, or intergenic regions. SNPs in non-coding regions may change mRNA structure, disease susceptibility, and cancer risk. Non-coding SNPs, unlike synonymous and non-synonymous substitutions in coding regions, can affect gene expression levels. By definition, synonymous substitutions do not modify a protein's amino acids but may affect its function in other ways, whereas non-synonymous substitutions are further classified as missense and nonsense variants [12,13]. A missense variant is a SNP in which a single nucleotide change in the DNA sequence leads to the substitution of one amino acid for another in the resulting protein. This alteration can affect the structure and function of the protein, potentially changing its stability, activity, or interaction with other molecules. Depending on the location and nature of the amino acid change, a missense variant may have benign, deleterious, or even disease-causing effects.

SNPs in TLR genes are an important predictor of early cancer susceptibility, notably in cervical cancer. Identifying functional non-synonymous SNPs (nsSNPs) among the many SNPs linked to cervical cancer requires expensive and difficult experimental approaches.
Because a large amount of SNP data is available for the human TLR3 gene, it is critical to identify and investigate the detrimental SNPs associated with it. To prioritize the detrimental nsSNPs in the dbSNP database, the current study applied a range of computational (dry lab) methods for disease-related mutations to the TLR3 SNPs, including PROVEAN, Mutation Assessor, PANTHER, SNAP, PhD-SNP, SNPs&GO, I-Mutant, CUPSAT, and others. For the first time, the structural effects of these mutations, their interaction networks, and the oncogenic nature of the TLR3 gene in relation to cervical cancer were investigated.

Materials and methods

The schematic representation of the in silico analysis performed in this study is shown in Fig. 1.

Fig. 1 The methodology used for the analysis of the TLR3 missense SNPs.

Data collection

The TLR3 protein sequence was obtained from UniProtKB with accession ID O15455 [14]. The nsSNPs of the human TLR3 gene were acquired from NCBI. The protein structure was retrieved from the Protein Data Bank (PDB) under PDB ID: 5GS0 [15]. The accession numbers of the SNPs and the direct access links to the software/tools used in the manuscript are provided in the Data availability section (Table 1).

Table 1 The accession numbers of the SNPs and direct access links to the software/tools mentioned in the manuscript.

Identification of deleterious nsSNPs through sequence-based tools

Seven in silico (dry lab) tools that predict the functional effect of nsSNPs were used to identify deleterious variants.
The effect of each substitution was predicted by Mutation Assessor [16], PANTHER [17], Meta-SNP [18], PredictSNP [19], SNAP [20], PhD-SNP [21], and SNPs&GO [22].

Detection of damaging nsSNPs through structure-based tools

I-MUTANT [23], CUPSAT (both thermal and denaturant methods) [24], MUpro [25], and Align-GVGD [26] were used to predict the impact of the substitutions on protein stability.

Conservation analysis and effect of nsSNPs on protein stability and solvent accessibility of TLR3

The conserved regions of TLR3 were predicted with the ConSurf server [27]. ConSurf is a bioinformatics tool that estimates the evolutionary conservation of amino acid or nucleic acid positions in a protein, DNA, or RNA molecule from the phylogenetic relationships between homologous sequences. The evolutionary conservation profile of the most deleterious nsSNPs was used to gain insight into their structural and functional effect on the protein. In addition, the multiple sequence alignment tools CLUSTAL-OMEGA and T-Coffee were applied to the TLR3 sequences of Homo sapiens, Mus musculus, Bos taurus, and Boselaphus tragocamelus, with UniProt IDs O15455, Q99MB1, Q5TJ59, and Q0PV50, respectively (Figure S1 & S2, Supplementary Information).

The NetSurfP 2.0 program was used to determine how the most detrimental SNPs affected protein stability and solvent accessibility. The NetSurfP 2.0 web server used absolute surface accessibility (ASA) and relative surface accessibility (RSA) to characterize the effect of each damaging nsSNP on whether the affected amino acid was exposed or buried.
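The relationship between the two accessibility measures mentioned above can be illustrated with a minimal sketch. RSA is the ASA of a residue divided by the maximum accessible area for that residue type. The function names, the example numbers, and the 25% exposed/buried cut-off are assumptions made for illustration (the cut-off is a common convention, not a value stated in this study), and the maximum ASA must be supplied from whichever reference scale is in use.

```python
def relative_surface_accessibility(asa, max_asa):
    """Return RSA = ASA / maximum ASA for the residue type (both in square angstroms)."""
    if max_asa <= 0:
        raise ValueError("max_asa must be positive")
    return asa / max_asa


def burial_class(rsa, threshold=0.25):
    """Label a residue 'exposed' if its RSA exceeds the threshold, else 'buried'.

    The 0.25 default is an assumed, commonly used cut-off.
    """
    return "exposed" if rsa > threshold else "buried"


# Hypothetical values, not taken from the study's Table 4.
example_rsa = relative_surface_accessibility(asa=20.0, max_asa=200.0)
example_label = burial_class(example_rsa)
```

With these hypothetical inputs the residue is labeled as buried; a different cut-off or reference scale can change the label for residues with intermediate RSA.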
Using the NetSurfP 2.0 server, properties such as protein stability and solvent accessibility were taken into consideration when assessing the potential impact of the most detrimental nsSNPs on the functional behavior of the TLR3 protein [28].

Assessing the oncogenic nature of screened mutations and association of the damaging SNPs with cancer

The oncogenic nature of the screened mutations was analyzed with CScape [29] and CScape-somatic [30]. In addition, FATHMM-MKL [31] (Predict the Functional Consequences of Non-Coding and Coding Single Nucleotide Variants) and FATHMM-XF [32] (Enhanced Accuracy in Predicting the Functional Consequences of Non-Coding and Coding Single Nucleotide Variants) were used to assess the oncogenic nature of the screened mutations and the association of the damaging SNPs with cancer.

Analysis of the structural assessment of the most deleterious nsSNPs of TLR3

The deleterious mutants retained after the structure-based analysis were visualized and analyzed with the HOPE (Have Your Protein Explained) server [33]. This web-based tool shows how an amino acid substitution affects the physical and chemical characteristics of the TLR3 protein, such as hydrophobicity, charge and size differences between the wild-type and mutant forms, spatial structure, and protein activity.

Protein-Protein interaction analysis & network analysis of TLR3

To gain insight into protein-protein interactions, the STRING database was used to retrieve the TLR3 interaction network and the functions associated with its interaction partners. Any change to a protein's structure or function can alter the way it interacts with other molecules [34,35].

The network was subsequently transferred to Cytoscape (version 3.10) for analysis.
The network was analyzed with six methods from the cytoHubba plugin of Cytoscape: three local ranking techniques, namely degree, maximum neighborhood component (MNC), and maximal clique centrality (MCC), and three global ranking algorithms, namely betweenness, radiality, and closeness centrality. The local methods evaluate hub proteins according to their immediate neighborhood, whereas the global methods rank hub proteins according to their connectivity within the entire network [36].

Analysis of gene expression profile of TLR3 gene in association with cancer

The GEPIA2 database was used to perform a box plot analysis of the TLR3 gene to identify differential gene expression associated with cancer types. The box plots show the expression of the TLR3 gene in CESC (cervical squamous cell carcinoma and endocervical adenocarcinoma) and UCEC (uterine corpus endometrial carcinoma) [37].

Determining the impact of the most detrimental nsSNPs and TLR3 gene expression on the proximate genes

The muTarget tool was used to examine the impact of the nsSNPs on neighboring genes. The server offers two separate analyses, "Genotype" and "Target", and both were run for UCEC and CESC [38].

Three-dimensional structural assessment analysis of TLR3 protein

The AlphaFold 3 server [39] was used to perform the structural assessment of the TLR3 protein. Ramachandran plots were then generated with the Ramplot server.

Molecular dynamics simulation and trajectory analysis

The three-dimensional structure of TLR3 (5GS0) was subjected to molecular dynamics simulations using the GROMACS software package [40,41]. The simulation protocol followed previously established methods, as detailed in earlier publications [42]. Protein topology parameters were derived using the CHARMM36 force field [43]. A cubic simulation box was constructed with the gmx editconf utility.
Initially, the system underwent vacuum energy minimization using the steepest descent algorithm for 1500 steps. Solvation was achieved using the simple point-charge water model through the gmx solvate command, followed by system neutralization using gmx genion. To eliminate steric clashes and optimize geometry, an energy minimization step was performed. This was followed by a two-phase equilibration process. The first phase involved 100 ps of NVT equilibration to stabilize the temperature at 300 K. Subsequently, 100 ps of NPT equilibration were carried out to regulate the system's pressure and density. After successful equilibration, a 100 ns production molecular dynamics run was executed [42].

Analysis of the simulation trajectories was performed using standard GROMACS utilities. The RMSD and RMSF for both wild-type and mutant proteins were computed using the gmx rms and gmx rmsf tools, respectively. Additionally, the radius of gyration (Rg) and solvent accessible surface area [44] were evaluated using gmx gyrate and gmx sasa. Secondary structure elements were assessed using the do_dssp module [44,45].

Statistical analysis

To investigate potential relationships among the computationally predicted effects of high-risk nsSNPs in the TLR3 gene, a correlation analysis was conducted using a panel of bioinformatics scores, including SNAP (functional impact), I-Mutant ΔΔG (protein stability), ConSurf (evolutionary conservation), and CScape/FATHMM (oncogenic potential). Spearman's rank correlation coefficient was calculated to assess non-parametric associations between these variables [46]. The correlation matrix was visualized as a heatmap to illustrate the strength and direction of the correlations [47]. Statistical analyses were performed using Python libraries (Pandas, SciPy, Seaborn), with a significance threshold of p < 0.05 [47,48,49].

Results

The results of the study and the tools used to find deleterious nsSNPs in the TLR3 gene are summarized in Fig. 2.

Fig. 2 Systematic representation of the results of the in silico study.

A total of 7843 SNPs of the TLR3 gene sequence were collected from dbSNP of NCBI. Of these, 150 SNPs were selected on the basis of their deleterious role in different viral diseases. Of these 150 SNPs, 71 were missense variants, while the others were located in the 3′ and 5′ UTRs or were synonymous, nonsense, or frameshift mutations (Fig. 3).

Fig. 3 Distribution of TLR3 SNPs in dbSNP.

Screening of nsSNPs

In the current study, TLR3 SNPs with a damaging impact were selected. A total of 71 nsSNPs were retrieved from the 150 dbSNP entries. To determine whether any of these SNPs were detrimental, they were all further assessed using various sequence-based (Table 2) and structure-based methods (Table 3).

Table 2 Missense nsSNPs shown to be detrimental by sequence-based methods.

Table 3 Set of nsSNPs indicated by structure-based methods to be detrimental.

To determine the most detrimental nsSNPs among the selected SNPs, a variety of in silico tools and algorithms were employed, including PANTHER, Mutation Assessor, SNAP, PhD-SNP, Meta-SNP, PredictSNP, and SNPs&GO. The scores from these tools indicated whether the phenotypic impact of each amino acid alteration on protein activity was benign or detrimental. Meta-SNP is a binary classifier that uses a random forest-based technique to distinguish non-synonymous SNPs that are linked to disease from those that are polymorphic. The output of its four component predictors was provided to Meta-SNP as an eight-element feature vector made up of two sets of four items each. The first group consisted of the raw output scores from the PANTHER, PhD-SNP, SIFT, and SNAP variant predictions.
If one of the input methods did not generate a prediction, the method-defined default threshold for differentiating neutral from non-neutral variants was used as input to Meta-SNP. This method identified 23 of the 71 SNPs as disease-causing mutations.

Mutation Assessor employs a multiple sequence alignment (MSA), partitioned to represent functional specificity, to identify the functional importance of a missense variant. A functional impact score is created by adding a conservation score and a specificity score. 'Neutral' or 'low' variants are expected to have no impact on protein function, whereas 'medium' or 'high' variants are expected to lead to functional modifications. Mutation Assessor generates its own MSA from UniProt protein sequences, which is then divided based on the boundaries of the UniProt and Pfam domains to create aligned sets of families and subfamilies. Of the 71 mutations, 3 were classified as high impact and 26 as medium impact.

With the PANTHER (Protein ANalysis Through Evolutionary Relationships) Classification System, which divides results into two categories, "probably damaging" and "probably benign," 43 mutations were predicted to be probably damaging, that is, deleterious. PredictSNP is a consensus classifier that combines the six best-performing prediction algorithms to provide a more reliable and accurate alternative to the predictions of the individual integrated tools. PredictSNP identified 25 of the 71 mutations as disease-causing. SNAP and PhD-SNP, two further programs used to assess the detrimental nature of the SNPs, identified 31 and 22 mutations, respectively, as disease-causing.
SNPs&GO is an accurate approach for determining, starting from a protein sequence, whether a mutation is related to a disease.

We further filtered the most deleterious nsSNPs with structure-based methods, using the structure with PDB ID: 5GS0 and the bioinformatics tools I-MUTANT, CUPSAT, MUpro, and Align-GVGD. These tools were used to gain a more accurate understanding of the structure and of how the nsSNPs affect protein stability based on the free energy change (ΔΔG). I-MUTANT, CUPSAT, MUpro, and Align-GVGD predicted 5, 3, 3, and 7 mutations, respectively, to be deleterious on structural grounds. From these, four mutations were selected: N284I, C37R, Q538P, and L360P.

Conservation analysis of TLR3

In the ConSurf analysis, the mutations C37R and L360P had a score of 9, indicating that they lie in a highly conserved region, and were buried inside the core region. N284I also had a score of 9, indicating a highly conserved region, but was structurally exposed. Q538P had a score of 8, making it less conserved than the other three mutations, and was exposed.

The score reflects whether the amino acid lies in a variable or a conserved region. Positions with a score of 9, the most conserved, are color coded dark pink, and those with a score of 1, the least conserved, are color coded blue (Fig. 4). A deleterious mutation that falls in a highly conserved region is likely to be detrimental.

Fig. 4 TLR3 chain A conservation analysis results predicted by the ConSurf server.

Table 4 Analysis of the high-risk nsSNPs in TLR3's evolutionary conservation profile using the NetSurfP 2.0 database.

The NetSurfP-2.0 server was used to predict secondary structure, solvent accessibility, and surface accessibility. According to NetSurfP, the four SNPs N284I, C37R, Q538P, and L360P were buried inside the backbone of the core protein.
The most detrimental nsSNPs had RSA scores ranging from 0.063 to 0.49 and ASA scores ranging from 11.55 to 105.98 (Table 4) (Figure S3, Supplementary Information).

Oncogenic analysis of TLR3

CScape-somatic distinguishes cancer driver mutations, which arise early in tumor growth, from passenger mutations, which accumulate at later stages. Predictions are expressed as probability estimates, or p-scores, in the range (0, 1). Values above 0.5 are predicted to be cancer drivers, while values below 0.5 are predicted to be passenger variants. According to the CScape-somatic coding scores in Table 5, the mutations N284I, Q538P, and L360P had values indicating low-confidence passengers, whereas C37R had a value denoting a low-confidence driver. Cancer drivers occur fairly early in the development of the tumor, whereas passenger variants accumulate at later stages after a tumor starts to grow and usually correspond to low or no oncogenicity. The CScape output, in turn, reports the mutations on chromosome 4 with their positions, reference bases, mutant bases, and associated coding scores. The mutations N284I, C37R, and L360P were classified as oncogenic, while Q538P received no prediction for its coding score (Table 5).

In addition, FATHMM-MKL and FATHMM-XF analyses were performed, and all four mutations (N284I, C37R, Q538P, and L360P) showed significant pathogenic potential and are most likely oncogenic. According to FATHMM-MKL, the p-scores for both coding and non-coding regions are very high (all > 0.98), indicating a high likelihood of functional disruption. In particular, there is a considerable chance that N284I (coding: 0.99256, non-coding: 0.99253) and L360P (coding: 0.99103, non-coding: 0.99325) contribute to oncogenic transformation via structural and regulatory changes.
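The 0.5 decision rule for p-scores described above can be sketched as follows. This is a minimal illustration and not part of either tool: the function name, the "borderline" label for a score of exactly 0.5, and the example scores are hypothetical, and the confidence tiers reported by the tools are not modeled.

```python
def classify_p_score(p_score, cutoff=0.5):
    """Apply the cut-off rule: above the cut-off is a driver, below it a passenger."""
    if not 0.0 <= p_score <= 1.0:
        raise ValueError("p-score must lie between 0 and 1")
    if p_score > cutoff:
        return "driver"
    if p_score < cutoff:
        return "passenger"
    return "borderline"  # the rule as stated does not cover p == cutoff


# Hypothetical scores for illustration only (not the values in Table 5).
example_scores = {"variant_A": 0.62, "variant_B": 0.31}
example_calls = {name: classify_p_score(p) for name, p in example_scores.items()}
```

Here variant_A is called a driver and variant_B a passenger; scores close to the cut-off are the low-confidence cases and deserve the most caution.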
C37R (coding: 0.98728, non-coding: 0.99346) also shows strongly detrimental effects, and its elevated non-coding score raises the concern that it may play a role in the dysregulated gene expression linked to cancer (Table S1 & S2, Supplementary Information).

Table 5 Coding score of the filtered mutations for their oncogenic property through CScape and CScape-somatic.

Structural effects of deleterious SNPs

HOPE is a server for automatic mutant analysis that sheds light on the structural implications of a mutation. The HOPE server allowed structural visualization of the three most detrimental nsSNPs. The structural information for the mutations N284I, C37R, and L360P, including their structural modification, amino acid characteristics, and domain, is provided in Table 6.

Fig. 5 Visualization of wild-type (green) and altered (red) amino acid residues for all mutations using the HOPE server: (a) N284I, (b) C37R, (c) L360P.

For N284I, the altered residue is situated in a region critical to the protein's activity and in close proximity to another region thought to be involved in binding. The mutation may interfere with how these domains interact, which could affect how signals are transmitted between them (Fig. 5a). The C37R mutation lies in a region of the protein that is crucial for its activity and is in close proximity to another region thought to be essential in binding. The mutation may interfere with how these domains interact, which could affect how signals are transmitted between them (Fig. 5b). For L360P, the altered residue is situated in a region critical to the protein's function and in close proximity to another region thought to be essential in binding. The mutation may interfere with how these domains interact, which could affect how signals are transmitted between them (Fig. 5c).

Table 6 Structural impact of TLR3 mutations retrieved from the HOPE server.

Protein-Protein interaction analysis & network analysis of TLR3

STRING is a database of observed and predicted protein-protein interactions. The interactions are derived from computational prediction, cross-species knowledge transfer, and other (primary) databases; they include both direct (physical) and indirect (functional) associations. The STRING database was used to obtain the TLR3 interaction network, with 'TLR3' as the input name and 'Homo sapiens' as the organism, and only interactions with a confidence score of 0.9 or higher were retained. In Fig. 6, the results are displayed as nodes and edges that depict how the proteins interact with one another. In Fig. 7, the network analysis reveals that TRAF6, TRAF3, and MYD88 are strongly connected with TLR3 based on MCC, MNC, closeness, radiality, and degree, whereas, based on betweenness, TLR3 is additionally connected with RIPK1.

Fig. 6 Protein-protein interaction of TLR3 gene.

Fig. 7 Network analysis of genes obtained from STRING using the cytoHubba plugin of Cytoscape: (a) Betweenness, (b) Closeness, (c) Degree, (d) MCC, (e) MNC, and (f) Radiality.

Gene expression profile analysis of TLR3

Box plot analysis with the GEPIA2 database was used to examine the differential expression of the TLR3 gene across cancer types. GEPIA2 offers differential signature score analysis based on box plots. The results indicated that TLR3 is overexpressed in both cancer datasets, CESC (cervical squamous cell carcinoma and endocervical adenocarcinoma) and UCEC (uterine corpus endometrial carcinoma) (Fig. 8), suggesting that upregulation of the TLR3 gene is associated with the development of these cancers.

Survival analysis of patients with CESC and UCEC was performed using the GEPIA online database.
Patients were divided into two groups, a high-expression group and a low-expression group, based on the median level of TLR3 gene expression. The findings indicated that patients with CESC had a long survival period with both overexpression (roughly 200 months) and underexpression of the TLR3 gene, whereas in UCEC, patient survival was longer with overexpression of the TLR3 gene (roughly 140 months) and shorter with underexpression.

Fig. 8 (A) TLR3 gene expression analysis; total % survival rate of patients for both (B) cervical squamous cell carcinoma (CESC) and (C) uterine corpus endometrial carcinoma (UCEC).

Effect of most destructive nsSNPs and expression of TLR3 gene with their neighboring genes

The muTarget tool produces an expression plot and helps to determine how a mutation affects gene expression. It uses TCGA data to link somatic mutations with gene expression changes in cancer. Correlations can be analyzed in two ways: the 'Genotype' run finds changes in gene expression that are related to a specific mutation, and the 'Target' run finds mutations that alter the expression of target genes. The 'Genotype' run for uterine cancer showed that the expression of the genes CMC2, CXCL13, CXCL19, LAG3, SRD5A2, and others had changed (Fig. 9).

Fig. 9 Genotype analysis of TLR3 gene.

For cervical cancer, the database returned no results: the number of patients was too low (fewer than 10) to draw any conclusions, and no TLR3-associated change in the expression of neighboring genes was found. In the subsequent 'Target' run, the TLR3 gene was used as input for both uterine and cervical cancer. According to our findings, the CHD6 gene exhibited higher expression than RIMS1, MORC4, FMO5, TIAM1, and other genes in cervical cancer (Fig. 10A).
For uterine cancer, in contrast, expression can be influenced by TP53, GABPB1, ZDHHC7, PTEN, MTPAP, and other genes. Among these, PTEN exhibited the greatest change in gene expression in uterine cancer (Fig. 10B). According to the muTarget findings, mutations in any interacting gene partner in the gene-gene network affect the neighboring genes, which may result in cancer formation.

Fig. 10 Target analysis of TLR3 gene: (A) cervical cancer and (B) uterine cancer.

Three-dimensional structure assessment of TLR3

The three-dimensional structure of TLR3 without mutations (Fig. 11) and with the mutations N284I, C37R, and L360P (Figs. 12, 13, and 14) was assessed with the AlphaFold 3 server. The predicted template modeling (pTM) score and the interface predicted template modeling (ipTM) score are both derived from a measure known as the template modeling (TM) score. The pTM score assesses the accuracy of the complete structure; a pTM score exceeding 0.5 indicates that the predicted fold of the complex may closely resemble the actual structure. The ipTM score evaluates the accuracy of the predicted relative positions of the subunits within the complex. Values above 0.8 indicate robust, high-quality predictions, whereas values below 0.6 suggest a probable failure in prediction. ipTM values from 0.6 to 0.8 represent an ambiguous zone in which predictions may be accurate or inaccurate. The 2D and 3D Ramachandran plots were generated for the structures without mutations (Fig. 11b and c) and with mutations (Figs. 12b and c, 13b and c, and 14b and c). The structural assessment of TLR3 without mutations gave a pTM score of 0.92 (Fig. 11a), while TLR3 with the mutations N284I, C37R, and L360P gave pTM scores of 0.91, 0.92, and 0.92, respectively (Figs. 12a, 13a, and 14a), all indicating robust and high-quality structural predictions (Figure S4, S5, S6 & S7, Supplementary Information).

Fig. 11 (a) Structural assessment of TLR3 without mutations, (b) 2-D Ramachandran plot analysis (green, blue, and red dots/triangles represent torsion angles in favored, allowed, and disallowed regions, respectively; dots represent residues other than glycine and triangles represent glycine), and (c) 3-D Ramachandran plot (bars represent the frequency of torsion angles).

Fig. 12 (a) Structural assessment of TLR3 with the N284I mutation, (b) 2-D Ramachandran plot analysis (green, blue, and red dots/triangles represent torsion angles in favored, allowed, and disallowed regions, respectively; dots represent residues other than glycine and triangles represent glycine), and (c) 3-D Ramachandran plot (bars represent the frequency of torsion angles).

Fig. 13 (a) Structural assessment of TLR3 with the C37R mutation, (b) 2-D Ramachandran plot analysis (green, blue, and red dots/triangles represent torsion angles in favored, allowed, and disallowed regions, respectively; dots represent residues other than glycine and triangles represent glycine), and (c) 3-D Ramachandran plot (bars represent the frequency of torsion angles).

Fig. 14 (a) Structural assessment of TLR3 with the L360P mutation, (b) 2-D Ramachandran plot analysis (green, blue, and red dots/triangles represent torsion angles in favored, allowed, and disallowed regions, respectively; dots represent residues other than glycine and triangles represent glycine), and (c) 3-D Ramachandran plot (bars represent the frequency of torsion angles).

Molecular dynamics simulation and trajectory analysis

The Root Mean Square Deviation (RMSD) plot provides an overview of the structural stability of the protein throughout the simulation period. In the RMSD vs. time plot (Fig. 15a), the wild-type 5GS0 (blue) shows higher and more variable RMSD values than the mutants, suggesting greater structural deviation and more extensive flexibility during the simulation. In contrast, the N284I mutant (orange) exhibits the lowest and most consistent RMSD, indicating a noticeably more stable conformation. The C37R (green) and L360P (red) mutants fall in between, with moderate RMSD variations. According to this pattern, the wild-type structure is more susceptible to dynamic shifts under the simulated conditions, whereas the N284I mutation may confer increased conformational stiffness or structural integrity.

The Root Mean Square Fluctuation (RMSF) plot reveals localized flexibility within the protein structure by displaying the time-averaged fluctuation of each residue. The RMSF versus residue plot (Fig. 15b) shows per-residue flexibility. The general profile of all variants is similar, with higher fluctuation in the N- and C-terminal regions, which is to be expected given that they are generally disordered. Although the RMSF trends are generally conserved across all proteins, N284I and C37R show somewhat greater fluctuations in the core region (about residues 300–400). These results suggest localized instability or conformational flexibility in areas that may be important in functional interactions.

The Radius of Gyration (Fig. 15c), which measures structural compactness, shows that the wild-type 5GS0 exhibits greater fluctuations and tighter packing, indicating sporadic transitions between compact and slightly relaxed states. N284I stands out once more, with consistently higher Rg values that suggest a more expanded structure overall. Compared to 5GS0, C37R and L360P similarly show higher Rg values, although they fluctuate more than N284I. This implies that the mutants, especially N284I, retain a stable but less compact shape, which may affect how they function biologically.

The Solvent Accessible Surface Area (SASA) plot provides insight into how much of the protein's surface is accessible to the solvent, reflecting changes in protein folding and hydrophobic core exposure. The SASA plot (Fig. 15d) showed comparatively greater surface exposure for 5GS0 and C37R (blue and green) than for L360P and N284I. Interestingly, N284I exhibits lower SASA despite its higher Rg; this could be a sign of stronger surface contacts or buried residues that restrict solvent access. L360P exhibits a comparable pattern.

Fig. 15 Analysis of the protein and its mutants in four plots: (a) RMSD, (b) RMSF, (c) radius of gyration, and (d) SASA over 100 ns (100,000 ps) [5GS0 (blue), N284I (orange), C37R (green), and L360P (red)].

Statistical analysis

To evaluate the relationships between the computational prediction scores associated with high-risk nsSNPs in the TLR3 gene, we conducted a Spearman correlation analysis. The results are visualized in the correlation heatmap (Figure S10, Supplementary Information). A strong positive correlation was observed between the SNAP score and the CScape score (r = 0.96), suggesting that mutations predicted to be functionally damaging by SNAP are also likely to be predicted as oncogenic by CScape.
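The kind of pairwise Spearman analysis reported here can be reproduced along the following lines. This is a dependency-free sketch under stated assumptions, not the study's script: the function names and the score values are hypothetical placeholders, and only the coefficient is computed, not the p-value. Spearman's coefficient is obtained as the Pearson correlation of the rank-transformed data, with tied values sharing their mean rank.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


def correlation_matrix(scores):
    """Pairwise Spearman coefficients for a dict of named score columns."""
    names = list(scores)
    return {a: {b: spearman(scores[a], scores[b]) for b in names} for a in names}


# Hypothetical placeholder scores for four variants (NOT the study's data).
example_scores = {
    "SNAP": [10, 40, 55, 80],
    "I-Mutant_ddG": [-0.5, -1.2, -1.0, -2.3],
    "ConSurf": [9, 9, 8, 9],
}
example_matrix = correlation_matrix(example_scores)
```

In practice, `scipy.stats.spearmanr` returns both the coefficient and its p-value, and the resulting matrix can be passed to `seaborn.heatmap` for visualization. With only a handful of variants, coefficients of this kind are sensitive to a single rank swap.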
Similarly, the I-Mutant ΔΔG values showed a strong inverse correlation with both SNAP (r = −0.87) and CScape scores (r = −0.97), indicating that SNPs predicted to decrease protein stability are also associated with higher functional and oncogenic impact. ConSurf scores, which reflect evolutionary conservation, showed weak correlations with the other metrics (SNAP: r = 0.17, I-Mutant: r = −0.10, CScape: r = −0.10), suggesting that conservation alone may not be sufficient to predict functional or oncogenic significance in this context.

Discussion

In this study, we explored the role of TLR3 in cervical cancer using a computational approach. After screening with a set of prediction tools, we found that 3 of the 71 missense mutations were highly detrimental. These mutations, namely N284I, C37R, and L360P, were analyzed further with tools such as the HOPE server, STRING, GEPIA2, and muTarget to gain more understanding.

Previous studies have demonstrated that TLR3 is aberrantly expressed in several cancers, including cervical cancer. For instance, Hasimu et al. (2011)50 reported elevated TLR3 expression in cervical intraepithelial neoplasia and cervical squamous cell carcinoma tissues compared to normal cervical tissues, suggesting a role in cervical carcinogenesis. Similarly, studies have shown that TLR3 expression correlates with apoptosis, proliferation, and angiogenesis in hepatocellular carcinoma, and serves as a prognostic biomarker in renal clear cell carcinoma. These findings underscore the significance of TLR3 in tumor biology across various cancer types51.

First, after retrieving the missense SNPs from NCBI, we submitted the mutations to sequence-based tools, namely Mutation Assessor, PANTHER, Meta-SNP, PredictSNP, SNAP, PhD-SNP, and SNPs&GO. These tools helped us filter out the most deleterious mutations. 
To obtain more precise predictions of damaging mutations, we also applied structure-based methods, namely I-Mutant, CUPSAT, MUpro, and Align-GVGD. Through this, we shortlisted four highly detrimental mutations: N284I, C37R, Q538P, and L360P. To assess the oncogenic potential of these mutations, we used CScape, CScape-somatic, and FATHMM, which indicated that N284I, C37R, and L360P were the more damaging mutations that could be related to cancer. N284I has been reported to be detrimental in diseases other than cancer. Because of its association with impaired TLR3 signaling in vitro, N284I has been linked to viral infections in humans52. Previous studies have also implicated N284I in impaired TLR3 signaling and increased susceptibility to viral infections53, suggesting a pathogenic role beyond cancer.

In addition, the C37R mutation of TLR3 is predicted here for the first time, through computational (dry-lab) analysis, to be highly deleterious and potentially associated with cervical cancer. Although C37R has not, to our knowledge, been reported previously in a cancer context, our analysis predicted that it significantly affects protein structure and stability, and we propose it as a novel candidate cancer-related TLR3 mutation. However, further experimental validation is needed to confirm its impact on cancer and other diseases.

Another mutation, L360P, was predicted to be oncogenic in our analysis. Previous studies have shown that L360P is an HSE-causing (herpes simplex encephalitis) TLR3 mutation, alongside P554S, E746X, G743D + R811I, and R867Q54. It has also been reported that heterozygosity for the HSE-causing L360P or G743D + R811I allele causes autosomal dominant (AD) TLR3 deficiency in fibroblasts due to negative dominance and haploinsufficiency, respectively54. 
Therefore, our study expands on these findings by suggesting that N284I, C37R, and L360P may also contribute to oncogenic processes in cervical cancer, thereby connecting TLR3 immune dysfunction to tumor biology.

To explore the functional implications of the identified high-impact nsSNPs (N284I, C37R, Q538P, L360P), we extended our analysis to include protein interaction networks, gene expression profiles, and mutation-expression correlations. STRING-based network analysis revealed strong interactions between TLR3 and key immune signaling proteins such as TRAF3, TRAF6, and MYD88, suggesting that these mutations could affect downstream immune pathways, consistent with earlier findings that describe their involvement in TLR3-mediated NF-κB signaling55.

Gene expression profiling using GEPIA2 revealed that TLR3 expression is upregulated in both cervical squamous cell carcinoma (CESC) and uterine corpus endometrial carcinoma (UCEC). Interestingly, survival analysis showed improved outcomes in UCEC patients with high TLR3 expression, whereas results for CESC were inconclusive, reflecting the context-dependent dual role of TLRs in tumor promotion and suppression. Furthermore, the muTarget analysis provided insight into mutation-associated expression shifts in neighboring genes. CHD6 showed the highest expression in cervical cancer, while TP53 and PTEN exhibited the greatest expression changes in uterine cancer. These findings point to potential network-level interactions, in which TLR3 mutations may indirectly influence the regulation of cancer-critical genes, in line with systems biology studies.

To support the structural analysis, AlphaFold was used to generate a separate structural model for each point mutation (C37R, N284I, and L360P) rather than combining all mutations into a single structure. This approach was adopted to better understand the individual structural consequences of each mutation without the confounding effects of multiple simultaneous substitutions. 
By modeling each mutant independently, we were able to observe localized conformational changes specific to each variant, allowing a clearer interpretation of their potential impact on protein stability and function. This refinement improves the accuracy of the structural comparison with the wild-type (PDB ID: 5GS0) and enhances the clarity of our structural analysis. Sankari et al. (2024)56 and Kamal et al. (2025)57 also used AlphaFold to generate accurate 3D models, enabling detailed visualization and functional analysis of mutant proteins. Our approach is in line with the study by Jumper et al. (2021)58, which validates AlphaFold’s accuracy for modeling high-resolution protein structures and guiding mutational impact studies.

In our study, the molecular dynamics simulations revealed notable structural differences between the TLR3 wild-type (5GS0) and its mutants (N284I, C37R, and L360P), as assessed through solvent accessible surface area (SASA), radius of gyration (Rg), RMSD, and RMSF. Among these, N284I exhibited the lowest RMSD and the most stable conformation, indicating enhanced structural integrity. This stability was further supported by RMSF analysis, which showed preserved flexibility across the mutants, though some increased fluctuations were noted around residues 300–400 in N284I and C37R. Furthermore, Rg analysis demonstrated that N284I maintained a more stable but expanded structure, potentially facilitating better ligand accommodation. SASA data revealed that N284I and L360P had reduced solvent exposure, suggesting tighter internal packing, a characteristic often linked to structural stabilisation in similar systems. In recent studies, molecular dynamics (MD) simulations have been employed to explore the structural effects of point mutations in TLR3 and TLR4. 
Mahita et al. (2018)59,60 investigated the TLR3 wild-type (WT) homodimer and the TLR3 A795P homodimer to understand the impact of key mutations on the protein’s structural behavior. Their study highlighted subtle conformational changes induced by phosphorylation and point mutations in the TIR domains of TLR3, providing deeper insights into the molecular alterations that occur in these regions. Similarly, Prakasam et al. (2023)61 used MD simulations to examine structural changes in TLR4 induced by point mutations, further advancing our understanding of how such mutations influence TLR proteins and their functions.

In addition, correlation analysis reinforced the findings. A strong positive correlation between SNAP and CScape scores (r = 0.96) indicated that mutations predicted to be functionally damaging are also likely to be oncogenic. Additionally, I-Mutant ΔΔG scores were inversely correlated with both SNAP (r = −0.87) and CScape (r = −0.97) values, highlighting the role of structural destabilization in disease progression. In contrast, ConSurf scores had weak correlations with the other metrics, suggesting that while conservation indicates evolutionary importance, it does not alone predict pathogenicity.

This study concentrates on the most detrimental nsSNPs linked to cervical cancer; however, the impact of these SNPs could also be examined in relation to other serious disorders. Although the predictions in this study were derived from several well-established tools and algorithms, comprehensive experimental research and large-scale population studies, in conjunction with clinical trials, are required before the principal findings can be used for clinical purposes. 
Moreover, these computational predictions are currently being validated through ongoing wet-lab experiments in our laboratory, which will provide experimental support for the in silico findings related to TLR3 SNPs.

Conclusion

This study predicts that TLR3 is related to cervical cancer, based on the retrieved missense SNPs and a range of in silico techniques. The nsSNPs of the TLR3 gene were analyzed because the gene has been linked to a number of complex illnesses. Four highly damaging TLR3 nsSNPs, namely N284I (rs5743316), C37R (rs752889035), Q538P (rs760275329), and L360P (rs768091235), were identified among the 150 nsSNPs reported so far in the dbSNP database. Based on a variety of evaluations, N284I, C37R, and L360P were identified as the three most detrimental mutations. Although in silico methods cannot completely replace experimental, and frequently conclusive, testing, the current work should be useful for future research efforts that target TLR3 to treat cervical cancer.

Data availability

Data is provided within the manuscript.

References

Bray, F. et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 74, 229–263 (2024).
Stelzle, D. et al. Estimates of the global burden of cervical cancer associated with HIV. Lancet Glob. Health 9, e161–e169 (2021).
Lei, J. et al. HPV vaccination and the risk of invasive cervical cancer. N. Engl. J. Med. 383, 1340–1348 (2020).
Andersen, J. M., Al-Khairy, D. & Ingalls, R. R. Innate immunity at the mucosal surface: role of toll-like receptor 3 and toll-like receptor 9 in cervical epithelial cell responses to microbial pathogens. Biol. Reprod. 74, 824–831 (2006).
Mukherjee, S. et al. 
Toll-like receptor-guided therapeutic intervention of human cancers: molecular and immunological perspectives. Frontiers in Immunology 14 (2023). https://doi.org/10.3389/fimmu.2023.1244345
Behzadi, P. et al. The dual role of toll-like receptors in COVID-19: balancing protective immunity and immunopathogenesis. International Journal of Biological Macromolecules 284 (2025). https://doi.org/10.1016/j.ijbiomac.2024.137836
Jin, M. S. & Lee, J. O. Structures of the Toll-like receptor family and its ligand complexes. Immunity 29, 182–191 (2008). https://doi.org/10.1016/j.immuni.2008.07.007
Ernst, P. B., Takaishi, H. & Crowe, S. E. Helicobacter pylori infection as a model for gastrointestinal immunity and chronic inflammatory diseases. Dig. Dis. 19 (2001). www.karger.com/journals/ddi
Yu, L. & Chen, S. Toll-like receptors expressed in tumor cells: targets for therapy. Cancer Immunology, Immunotherapy 57, 1271–1278 (2008). https://doi.org/10.1007/s00262-008-0459-8
Agarwal, M., Kumar, M., Pathak, R., Bala, K. & Kumar, A. Chapter Six - Exploring TLR signaling pathways as promising targets in cervical cancer: the road less traveled. In Targeting Signaling Pathways in Solid Tumors - Part A (eds Mukherjee, S. & Chatterjee, K.) vol. 385, 227–261 (Academic, 2024).
Behzadi, P. et al. The interleukin-1 (IL-1) superfamily cytokines and their single nucleotide polymorphisms (SNPs). Journal of Immunology Research 2022 (2022). https://doi.org/10.1155/2022/2054431
Aitken, N., Smith, S., Schwarz, C. & Morin, P. A. Single nucleotide polymorphism (SNP) discovery in mammals: a targeted-gene approach. Mol. Ecol. 13, 1423–1431 (2004).
Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409 (2001).
Bateman, A. et al. 
UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res. 49, D480–D489 (2021).
Yoshida, T. et al. A covalent small molecule inhibitor of glutamate-oxaloacetate transaminase 1 impairs pancreatic cancer growth. Biochem. Biophys. Res. Commun. 522, 633–638 (2020).
Reva, B., Antipin, Y. & Sander, C. Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Res. 39(17), e118 (2011).
Thomas, P. D. et al. A library of protein families and subfamilies indexed by function. Genome Res. 13, 2129–2141 (2003).
Capriotti, E., Altman, R. B. & Bromberg, Y. Collective judgment predicts disease-associated single nucleotide variants. BMC Genomics 14 (Suppl 3), S2 (2013).
Bendl, J. et al. PredictSNP: robust and accurate consensus classifier for prediction of disease-related mutations. PLoS Comput. Biol. 10, e1003440 (2014).
Bromberg, Y. & Rost, B. SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res. 35, 3823–3835 (2007).
Capriotti, E., Calabrese, R. & Casadio, R. Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics 22, 2729–2734 (2006).
Capriotti, E. et al. WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation. BMC Genomics 14 (Suppl 3) (2013).
Capriotti, E., Fariselli, P. & Casadio, R. I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Res. 33 (2005).
Parthiban, V., Gromiha, M. M. & Schomburg, D. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res. 34 (2006).
Cheng, J., Randall, A. & Baldi, P. 
Prediction of protein stability changes for single-site mutations using support vector machines. Proteins: Struct. Function Genet. 62, 1125–1132 (2006).
Tavtigian, S. V. et al. Comprehensive statistical study of 452 BRCA1 missense substitutions with classification of eight recurrent substitutions as neutral. J. Med. Genet. 43, 295–305 (2006).
Ashkenazy, H. et al. ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules. Nucleic Acids Res. 44, W344–W350 (2016).
Klausen, M. S. et al. NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning. Proteins: Struct. Function Bioinf. 87, 520–527 (2019).
Rogers, M. F., Shihab, H. A., Gaunt, T. R. & Campbell, C. CScape: a tool for predicting oncogenic single-point mutations in the cancer genome. Sci. Rep. 7 (2017).
Rogers, M. F., Gaunt, T. R. & Campbell, C. CScape-somatic: distinguishing driver and passenger point mutations in the cancer genome. Bioinformatics 36, 3637–3644 (2020).
Shihab, H. A. et al. An integrative approach to predicting the functional effects of non-coding and coding sequence variation. Bioinformatics 31, 1536–1543 (2015).
Rogers, M. F. et al. FATHMM-XF: accurate prediction of pathogenic point mutations via extended features. Bioinformatics 34, 511–513 (2018).
Venselaar, H., te Beek, T. A. H., Kuipers, R. K. P., Hekkelman, M. L. & Vriend, G. Protein structure analysis of mutations causing inheritable diseases. An e-Science approach with life scientist friendly interfaces. BMC Bioinformatics 11, 548 (2010).
Takeda, K. & Akira, S. Toll-like receptors. Curr. Protoc. Immunol. 14.12.1–14.12.10 (2015).
Dong, X. et al. A simplified herbal formula improves cardiac function and reduces inflammation in mice through the TLR-mediated NF-κB signaling pathway. 
Front. Pharmacol. 13, 865614 (2022).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
Tang, Z., Kang, B., Li, C., Chen, T. & Zhang, Z. GEPIA2: an enhanced web server for large-scale expression profiling and interactive analysis. Nucleic Acids Res. 47, W556–W560 (2019).
Menyhárt, O. et al. Guidelines for the selection of functional assays to evaluate the hallmarks of cancer. Biochimica et Biophysica Acta - Reviews on Cancer 1866, 300–319 (2016). https://doi.org/10.1016/j.bbcan.2016.10.002
Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).
Abraham, M. J. et al. GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1–2, 19–25 (2015).
Scientiflow | Scientific Workflow Automation Tool. https://scientiflow.com/
Saxena, S. et al. Structural and functional analysis of disease-associated mutations in GOT1 gene: an in silico study. Comput. Biol. Med. 136, 104695 (2021).
Huang, J. & Mackerell, A. D. CHARMM36 all-atom additive protein force field: validation based on comparison to NMR data. J. Comput. Chem. 34, 2135–2145 (2013).
Eisenhaber, F., Lijnzaad, P., Argos, P., Sander, C. & Scharf, M. The double cubic lattice method: efficient approaches to numerical integration of surface area and volume and to dot surface contouring of molecular assemblies. J. Comput. Chem. 16 (1995).
Kabsch, W. & Sander, C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features.
Spearman, C. The proof and measurement of association between two things. Am. J. Psychology 100(3-4), 441–471 (1904).
Waskom, M. 
Seaborn: statistical data visualization. J. Open Source Softw. 6, 3021 (2021).
Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).
McKinney, W. Data structures for statistical computing in Python. (2010).
Hasimu, A., Ge, L., Li, Q. Z., Zhang, R. P. & Guo, X. Chinese Anti-Cancer Association (CACA). www.cjcsysu.com
Liao, G., Lv, J., Ji, A., Meng, S. & Chen, C. TLR3 serves as a prognostic biomarker and associates with immune infiltration in the renal clear cell carcinoma microenvironment. J. Oncol. (2021).
Brown, R. A. & Razonable, R. R. A real-time PCR assay for the simultaneous detection of functional N284I and L412F polymorphisms in the human toll-like receptor 3 gene. J. Mol. Diagn. 12, 493–497 (2010).
Zhang, S. Y. et al. TLR3 deficiency in patients with herpes simplex encephalitis. Science 317, 1522–1527 (2007).
Lim, H. K. et al. Severe influenza pneumonitis in children with inherited TLR3 deficiency. J. Exp. Med. 216, 2038–2056 (2019).
Takeuchi, O. & Akira, S. Pattern recognition receptors and inflammation. Cell 140, 805–820 (2010). https://doi.org/10.1016/j.cell.2010.01.022
Siva Sankari, G. et al. Computational analysis of sodium-dependent phosphate transporter SLC20A1/PiT1 gene identifies missense variations C573F, and T58A as high-risk deleterious SNPs. J. Biomol. Struct. Dyn. 42, 4072–4086 (2024).
Kamal, E., Kaddam, L. A., Ahmed, M. & Alabdulkarim, A. Integrating artificial intelligence and bioinformatics methods to identify disruptive STAT1 variants impacting protein stability and function. Genes (Basel) 16(3), 303 (2025).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Mahita, J. 
& Sowdhamini, R. Probing subtle conformational changes induced by phosphorylation and point mutations in the TIR domains of TLR2 and TLR3. Proteins 86, 524–535 (2018).
Mahita, J. & Sowdhamini, R. Investigating the effect of key mutations on the conformational dynamics of toll-like receptor dimers through molecular dynamics simulations and protein structure networks. Proteins 86, 475–490 (2018).
Prakasam, P., Abdul Salam, A. A. & Basheer Ahamed, S. I. The pathogenic effect of SNPs on structure and function of human TLR4 using a computational approach. J. Biomol. Struct. Dyn. 41, 12387–12400 (2023).

Acknowledgements

The authors would like to thank Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India, for providing infrastructure and support.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Authors and Affiliations

Therapeutics and Molecular Diagnostic Lab, J-3 Block, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India: Mohini Agarwal, Manish Kumar & Kumud Bala
Amity Institute of Pharmacy, Amity University, Noida, India: Sarthak Dahiya
National Institute of Biologicals, A-32, Sector-62, Noida, Uttar Pradesh, India: Anoop Kumar
Rajiv Gandhi Cancer Institute and Research Centre, Delhi, India: Rupal Tripathi

Contributions

M.A.- Conceptualization, Methodology, Formal analysis, Validation, Data curation, Investigation, Writing - 
Original Draft, Software. M.K.- Conceptualization, Methodology, Formal analysis, Validation, Data curation, Investigation, Writing - Original Draft, Software. S.D.- Software, Methodology, Data curation, Writing - Original Draft. A.K.- Writing - Review & Editing, Investigation, Validation, Visualization. R.T.- Writing - Review & Editing, Supervision, Validation, Investigation, Visualization. K.B.- Conceptualization, Supervision, Data curation, Formal analysis, Investigation, Resources, Validation, Visualization, Writing - Review & Editing, Project administration. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Kumud Bala.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.