Computational studies on structural aspects of pyrazolo[3,4-b]pyridine derivatives as TRKA inhibitors in cancer therapy

Wait 5 sec.

IntroductionTRKs represent a significant family of transmembrane protein kinases that comprise three main subtypes: TRKA, TRKB, and TRKC1. These receptors belong to the receptor tyrosine kinase (RTK) family and are characterized by a complex structure that includes an extracellular domain for binding endogenous ligands, an intracellular domain that transmits signals, and a transmembrane domain that anchors the receptor in the cell membrane2. TRKA primarily interacts with nerve growth factor (NGF), while TRKB is predominantly activated by brain-derived neurotrophic factor (BDNF), and TRKC is activated by neurotrophin-3 (NT-3)3. Upon activation, the intramembrane kinase domain undergoes phosphorylation, which triggers downstream signaling pathways such as Ras/Erk, PLC-γ, and PI3K/Akt. These pathways are crucial for regulating cell proliferation, differentiation, and survival2. The genes encoding these receptors neurotrophic receptor tyrosine kinase 1 (NTRK1), neurotrophic receptor tyrosine kinase 2 (NTRK2), and neurotrophic receptor tyrosine kinase 3 (NTRK3) are responsible for the synthesis of TRKA, TRKB, and TRKC, respectively. Structural alterations in TRK proteins, particularly those associated with neurotrophic tyrosine receptor kinase (NTRK) fusions, can lead to the development of various cancers, including breast cancer, colorectal cancer4, and non-small cell lung cancer (NSCLC)5. These alterations typically result in the continuous activation of the intracellular kinase domain6, contributing to tumorigenesis and disease progression. In recent years, TRKs have emerged as effective therapeutic targets for cancer treatment, and the inhibition of these kinases using small molecules has been recognized as a promising strategy to suppress cancer cell growth. Several potent TRK inhibitors have been identified, including larotrectinib7,8 and entrectinib9, both of which received FDA approval as the first TRK inhibitors for the treatment of solid tumors harboring NTRK gene fusions. Despite the efficacy of these therapies, many patients face significant challenges due to resistance to these agents. Resistance mechanisms can arise from additional genetic mutations, alternative splicing events, or the activation of bypass pathways that circumvent the effects of the inhibitors. Consequently, the identification and development of novel inhibitors capable of targeting TRK mutants resistant to larotrectinib and entrectinib have become pressing challenges in the field10. Researchers are actively exploring combination therapies and next-generation TRK inhibitors to overcome resistance and improve patient outcomes. This ongoing investigation highlights the dynamic nature of cancer treatment, where adaptability and innovation are crucial for addressing challenges in targeted therapy. This effort necessitates the implementation of efficient and systematic methods for screening chemical databases against molecules with known biological activities, paving the way for innovative therapeutic interventions in oncology. CADD encompasses various methodologies11, including structure-based drug design, pharmacophore development, molecular docking, and MD simulations. These approaches, combined with assessments of chemical properties related to ADMET, enable the identification of promising drug candidates. Pharmacophore mapping identifies essential properties required for a molecule’s biological activity, highlighting critical features necessary for interaction with specific targets such as receptors or enzymes. By characterizing these features, pharmacophore models guide the design and optimization of new compounds with improved efficacy and selectivity. This modeling provides a valuable framework for virtual screening studies, which search compound libraries for molecules that align with the pharmacophoric model. Such screening is vital for reducing costs, minimizing energy expenditure, and saving time while enabling the construction of predictive models, pinpointing crucial active sites within proteins, and assessing ligand stability. The overarching objective of this study is to identify novel compounds that specifically target cancer cells. To align with this goal, we investigated the antitumor efficacy of 37 pyrazolo[3,4-b]pyridine derivatives that function as TRK inhibitors12. We performed pharmacophore mapping to identify the essential features required for the biological activity of these derivatives. A virtual screening study was conducted using the ZINC database to identify potential TRK inhibitors, adhering to the RO5 to select compounds with favorable drug-like properties. Moreover, molecular docking studies were carried out to assess the binding interactions of the top-scoring ligands against the TRK target. In addition, ADMET predictions were performed on these lead compounds. The results indicate that the top-scored ligands and ZINC hits have promising profiles, suggesting that they may be more effective than current TRK inhibitors and warrant further investigation as potential therapeutic agents for cancer treatment.Materials and methodsData set selection and ligand preparationThis study aimed to develop a pharmacophore model based on the fundamental chemical characteristics of 37 selective TRKA inhibitors with anticancer activity, which are identified in the literature12.The IC50 values for these ligands, which ranged from 26 nM to less than 100,000 nM as indicated in Table 1, were converted to pIC50 values. The 2D structures of the inhibitors were generated using ChemDraw Ultra 12.013. Following this, the ligands were imported into Maestro, where they underwent preparation using the LigPrep Wizard14. The structures were then optimized employing the OPLS 2005 force field, ensuring that the ionization state of each ligand remained in a neutral form. Additionally, tautomers and a minimum of 10 conformations for each ligand were generated14.Table 1 Chemical structures of ligands in the dataset and their TRKA anticancer activity (pIC50).Full size tableDevelopment of pharmacophoreThe objective of pharmacophore mapping is to identify potential 3D pharmacophores for a set of active ligands. For this analysis, the PHASE15 module of Schrodinger Maestro was employed. This module generated two unique pharmacophoric features based on structural criteria: hydrogen bond donor (HBD) as D and ring aromaticity (R), both of which are essential for biological activity16. In the pyrazolo[3,4-b]pyridine derivatives, the donor atom is nitrogen, and the pyrazole ring includes a ring feature, along with a hydrogen bond acceptor (HBA) represented by A and another R feature represented by the nitrogen and methylbenzene group. This module generated two unique pharmacophoric features based on structural criteria: D and R, both of which are essential for biological activity16. In the pyrazolo[3,4-b]pyridine derivatives, the donor atom is nitrogen, and the pyrazole ring includes an ring feature, along with a acceptor and another R feature represented by the nitrogen and methylbenzene group. Each feature on this pharmacophore map corresponds to a distinct region that is specifically associated with the active ligands, as illustrated in Fig. 1.Fig. 1The ADRR-1 pharmacophore hypothesis was developed using 37 ligands that selectively inhibit TRKA, illustrating HBD in blue, HBA in pink, and aromatic rings in orange.Full size imagePharmacophore hypotheses creationNineteen hypotheses were generated using the PHASE module, which aids in elucidating the binding interactions of active ligands with the receptor, featuring a box size of 1 and a minimum intersite distance of 2 Å. A pharmacophore site is a property that has been identified and mapped to a specific location within a conformation. This approach establishes a correlation between structural similarities and chemical features for the 37 ligands under investigation. Four distinct features with various variants were defined for the development of the pharmacophore hypothesis.Pharmacophore model validationValidation of the pharmacophore model is a critical assessment tool for evaluating the key structural characteristics responsible for inducing site-specific bioactivity. To verify the model’s capability to identify known ligands from a decoy set, it was tested against a decoy set17 containing 557 ligands, which included 21 recognized active ligands. Several statistical metrics, including the enrichment factor (EF), boltzmann enhanced discrimination of receiver operating characteristic (BED-ROC), ROC, and area under (AUC) the ROC curve, were employed to assess the model’s performance in differentiating between active and inactive ligands.Zinc database based virtual screeningA virtual screening assessment was conducted using the ZINC database, with the pharmacophore model ADRR-1 employed to identify potential drug candidates. In a preliminary virtual screening study, 453 tropomyosin receptor kinase inhibitors were evaluated via the ZINC database18. These ZINC ligands were filtered using the Ro5 to eliminate unsuitable candidates before progressing to high-throughput virtual screening (HTVS) and extra precision (XP) modes against TRKA inhibitors utilizing Schrodinger’s GLIDE module.Protein preparationThe 3D structure of TRKA (PDB code: 6KZD, resolution of 1.71 Å)19 is a single-chain protein composed of 312 amino acids. Key amino acids located at the active site of TRKA include Phe698, Met620, Glu546, Glu618, Gly547, Val552, Ala570, Leu544, Leu686, Phe617, Ala548, Met700, Gly699, Arg683, Gly545, Gly623, Tyr619, Val601, Lys572, Asp697, Arg702, Ser701, Glu588, Gly696, Asp624, Lys627, Asn684, Tyr709, Asp703, Leu592, Phe549, Asn626, and Lys621. This structure was obtained from the Protein Data Bank (PDB) and subsequently processed using the protein preparation wizard in Maestro (Schrödinger, 2017-2, LLC, New York). During this procedure, all water molecules were removed, hydrogen atoms were added, and energy minimization was performed using the OPLS_2005 force field 20.Validation of docking protocolValidation of the molecular docking protocol is essential to ensure the reliability of the docking results. The accuracy of the molecular docking procedure is evaluated using the root mean square deviation (RMSD). To calculate the RMSD, the co-ligand of 6KZD (DZ6) was extracted from the receptor and subjected to XP Glide docking using the corresponding receptor grid. The RMSD is determined by superimposing the best-docked pose of the co-ligand onto its original crystallographic bound conformation. For docking solutions to be deemed reliable, the RMSD value must be ≤ 2.0 Å21,22. This criterion helps researchers confirm that the docking performed is both accurate and dependable, thereby enabling reliable predictions of molecular interactions.Molecular dockingMolecular docking was performed using the XP mode of Glide23 to screen the top-scored ligands and ZINC hits as potential ligands for this study. The Glide algorithm systematically investigates various orientations, positions, and conformations of the ligands within the binding pocket of the enzyme. The final evaluation of binding affinity was established through the Glide score (G Score), which quantifies binding energy and offers insights into the potential interactions between the ligands and the target enzyme. This approach provides a comprehensive understanding of ligand-receptor interactions, thereby facilitating the identification of promising candidates for further pharmacological development.MD simulationMD simulations were performed on the top-ranked docked ligand interacting with the TRKA enzyme to examine the stability and conformational dynamics of the protein–ligand complex. The protein–ligand complexes were solvated using the TIP3P water model within a cubic box, employing periodic boundary conditions to effectively simulate the biological environment. To ensure physiological relevance, the system was neutralized with an adequate concentration of ions at 0.15 mol/L, maintaining a balanced net charge. Energy minimization of the system was carried out using the OPLS 2005 force field to optimize the initial conformation prior to the simulations. The dynamics of the system were simulated using the NPT ensemble (constant number of particles, pressure, and temperature), with the temperature set at 300.0 K and the pressure maintained at 1.013 bar throughout the 100 ns simulation period. Detailed analyses were conducted using metrics such as RMSD, root mean square fluctuation (RMSF), radius of gyration (rGyr), intramolecular hydrogen bonds (intraHB), molecular surface area (MolSA), and polar surface area (PSA) utilizing the Desmond trajectory analysis tools24. This comprehensive analysis offers insights into the stability and flexibility of the protein–ligand system, elucidating potential mechanisms underlying ligand binding and efficacy, and ultimately guiding future drug design initiatives.Drug likeness and in silico pharmacokinetics ADMET predictionIn the field of drug discovery, the concept of drug-likeness serves as a critical criterion for assessing the potential of ligands to be developed into effective therapeutic agents. Drug-likeness encompasses a range of physicochemical properties and biological activities that significantly impact a ligand’s pharmacokinetics, safety profile, and therapeutic efficacy. Key parameters often considered in the evaluation of drug-likeness include molecular weight (MW), lipophilicity, commonly measured by logP, the number of HBD and acceptors, as well as the identification of structural alerts that indicate toxicity. Lipinski’s Ro5 is a widely recognized guideline that outlines these parameters, suggesting that ligands that adhere to these criteria are more likely to exhibit favorable oral bioavailability. In addition to Ro5, several other guidelines and criteria are valuable for assessing drug-likeness. The Veber rule evaluates the number of HBD and rotatable bonds, positing that ligands with no more than 10 HBD and 10 rotatable bonds are more likely to demonstrate good bioavailability. Egan’s rule emphasizes the MolSA to volume ratio, indicating that ligands with an appropriate ratio generally have better accessibility and absorption. Ghose’s rule examines the physicochemical ranges of ligands, assisting researchers in identifying those that meet desirable limits for biological activity. Finally, Pfizer’s rule highlights the significance of physicochemical properties such as solubility and stability under biological conditions. In this study, we employed the SwissADME25 and pkCSM26 servers to assess the drug-likeness of the ten ligands exhibiting the best biological activities.Results and discussionPharmacophore hypothesis dataset and validation of the pharmacophore modelA total of 19 hypotheses were generated in this model, each demonstrating interactions with the identified receptor molecules. Each hypothesis is assigned a specific value, and the ranking of the different hypotheses is based on a variety of scores27. ADRR-1 was selected as the top hypothesis, determined by the survival score and the other scores presented in Table 2.Table 2 Creation of pharmacophore hypotheses with differing scores and rankings.Full size tableThe robustness of the ADRR-1 pharmacophore model was demonstrated through its alignment with ligand L1, the active ligand in the training set, which exhibited an anti-cancer pIC50 of 7.585 nM (Fig. 2A). The features represented by this hypothesis include one acceptor, one donor, and two aromatic rings. This indicates that the ADRR-1 hypothesis serves as a valuable and reliable pharmacophore model for identifying TRKA inhibitors. The angles and distances between various sites of ADRR-1 are provided in Fig. 2B and C, as well as illustrated in Tables 3 and 4, respectively. For each ligand, a single aligned conformer was superimposed on the hypothesis based on the RMSE of feature atom coordinates in comparison to those of the corresponding reference feature. Subsequently, fitness scores for all ligands were evaluated using the top-scoring pharmacophore model. A higher fitness score correlates with a more accurate prediction of the ligand’s activity. The fit function not only assesses whether the feature is mapped but also includes a distance term that measures the separation between the feature on the molecule and the centroid of the hypothesis feature. Figures 2A demonstrate that ADRR-1 aligns with the most active ligand L1.Fig. 2 (A) The ADRR-1 pharmacophore hypothesis demonstrating strong alignment with the active ligand L1. (B) distances and (C) angles between the pharmacophori points of the ADRR-1 model.Full size imageTable 3 Angles among the pharmacophoric sites of ADRR-1.Full size tableTable 4 Intersite distances among the pharmacophoric sites of ADRR-1.Full size tableUpon establishing several criteria, including the % screen plot and the ROC plot, the validation process for the hypothesis can proceed28. The % screen plot is instrumental in evaluating the number of active ligands retrieved from the entire set of ligands. By employing the ROC plot at various cut-off thresholds, it becomes possible to analyze the relationship between sensitivity and specificity. Optimal discrimination occurs when there is no overlap between the two distributions, resulting in a test that has both 100% precision and 100% sensitivity, as indicated by the ROC curve that traverses the upper left corner. The accuracy of the method is determined by how closely the curve approaches this corner29. Both the ROC plot and the % screen plot exhibit high levels of accuracy, as demonstrated in Fig. 3A and B.Fig. 3(A) The % screen plot evaluates the number of active ligands retrieved from the entire set of ligands. (B) The ROC plot with various cut-offs shows the ability to analyze the relationship between sensitivity and specificity.Full size imageMolecular docking of ligands with TRKA receptorMolecular docking is a powerful computational technique used to predict the interactions between potential drug compounds and biological targets. In this study, molecular docking procedures were implemented to analyze compounds that specifically target the TRKA protein. A critical aspect of docking analysis is the validation of the docking methodology to ensure the accuracy and reliability of the results obtained. Initially, the docking procedure was validated by re-docking the co-crystallized ligand DZ6. This step is essential as it allows for the assessment of the docking algorithm’s capability to reproduce known ligand–protein interactions. The re-docking results demonstrated a robust correlation between the re-docked ligand’s coordinates and the original co-crystallized coordinates, with a RMSD value of 0.86 Å. This low RMSD indicates a high level of precision in the docking methodology, affirming its reliability for subsequent analyses (Fig. 4).Fig. 4Superimposition of the native ligand (green) and the docked ligand (red) with an RMSD value of 0.86 Å.Full size imageFollowing the successful validation, interaction studies were conducted between the TRKA protein and potent ligands, as well as ZINC hits. Ten ligands L1, L4, L5, L8, L10, L11, L12, L15, L25, and L27 had the highest binding scores with key amino acids, including Phe698, Met620, Glu546, Glu618, Gly547, Val552, Ala570, Leu544, Leu686, Phe617, Ala548, Met700, Gly699, Arg683, Gly545, Gly623, Tyr619, Val601, Lys572, Asp697, Arg702, Ser701, Glu588, Gly696, Asp624, Lys627, Asn684, Tyr709, Asp703, Leu592, Phe549, Asn626, and Lys621 in TRKA, respectively. As shown in Table 5, all these ligands, along with the ZINC hits, exhibit strong and moderate hydrogen bonds, highlighting the significance of these interactions. This underscores the critical role of hydrogen bonding in the stability and efficacy of ligand binding to the TRKA protein.Table 5 Docking scores and interactions of top-scored ligands, top ZINC hits, and Entrectinib.Full size tableAmong the ligands analyzed, L5 exhibits the most favorable docking score of -14.169 kcal/mol, suggesting a strong affinity for key amino acid residues of the TRKA protein, including Ala570, Asp624, Asp697, Asp703, Glu546, Glu618, Gly545, Gly623, Gly696, Leu544, Leu686, Lys572, Lys627, Met620, Phe617, Phe698, Ser701, Val552, Val601, Val704, and Tyr619. Figures 5 present, respectively, the 2D and 3D interactions between L5 and the TRKA protein. Five conventional hydrogen bonds, categorized as strong and moderate, are formed between the carbonyl oxygen and nitrogen groups of the pyridine ring, as well as the nitrogen-methyl and fluorine atoms in the (3-((5-(2,5-difluorobenzyl)-1H-pyrazolo[3,4-b]pyridin-3-yl)amino)phenyl)(4-methylpiperazin-1-yl)methanone moiety and the residues Glu546, Met620, Lys627, and Lys572. Additionally, three carbon-hydrogen bonds, also categorized as strong and moderate, are formed between the C=N of the pyrazolo[3,4-b]pyridine moiety and residue Gly623, the CH of the pyridine and residue Glu618, and the CH2 of the pyrazolo[3,4-b]pyridine moiety with residue Asp703.Ligand L5 also establishes hydrophobic interactions with the receptor via pi-alkyl interactions with Val552, Leu544, Ala570, and Leu686; alkyl-alkyl interactions with Val704; pi-pi T-shaped interactions with Phe698; and pi-pi stacked interactions with Phe617 of the target protein. This diverse interaction profile indicates that L5 likely forms a more stable complex with the target protein compared to Entrectinib, which has a docking score of −10.081 kcal/mol.Fig. 52D and 3D binding interactions of L5 with the targeted TRKA. The colors indicate different types of interactions: Van der Waals interactions are represented in light green, Conventional Hydrogen Bonds in green, Carbon Hydrogen Bonds in light gray, Pi-Pi Stacked interactions in pink, Pi-Pi T-shaped interactions in magenta, Alkyl interactions in light gray, and Pi-Alkyl interactions in light pink.Full size imageFurthermore, ligands L11 and L4 demonstrate strong binding affinities, with docking scores of −13.870 kcal/mol and −13.581 kcal/mol, respectively. These ligands establish significant strong hydrogen bonds with important residues such as Phe698, Glu618, and Met620, in addition to robust hydrophobic contacts with residues like Val552 and Leu686. The range of interactions observed in these ligands suggests they may offer comparable or enhanced binding characteristics relative to Entrectinib. L1 achieves a docking score of −13.441 kcal/mol, exhibiting an interaction profile similar to that of L4, with strong and moderate hydrogen bonding involving Phe698 and Glu546. Although its binding potential is slightly lower than that of L5, L4, and L11, it still demonstrates favorable interactions that may contribute to effective binding. In contrast, ligands L15 and L25 show lower binding affinities, with scores of −12.784 kcal/mol and −12.672 kcal/mol, respectively. These ligands are characterized by fewer hydrogen bonds and limited hydrophobic interactions, which may hinder their overall binding efficacy to the target protein compared to both the higher-performing ligands and Entrectinib. L27, with a docking score of −13.597 kcal/mol, presents an interaction profile that is competitive with other high-scoring ligands but does not exceed the efficacy of L5, L4, L11, and L1. The presence of multiple hydrogen bonds and diverse interaction types among these ligands further emphasizes their potential as superior candidates compared to Entrectinib. The key amino acids involved in the interaction with Entrectinib, as shown in Fig. 6, include Asp624, Gly699, Leu686, Ala570, Leu544, Phe617, Phe698, Met700, Arg683, Tyr709, Ser706, Asp708, Ser701, Val704, Lys627, Gly545, Val552, Lys572, Asp697, Val601, Glu618, Gly696, Tyr619, Met620, Lys621, Gly623. Entrectinib exhibits significant hydrogen bonding interactions with key amino acids, including Asp624 and Gly699. These interactions are crucial for stabilizing the compound’s binding to the target protein. Additionally, Entrectinib demonstrates several hydrophobic interactions with amino acids such as Leu686, Ala570, Leu544, Phe617, and Phe698. These hydrophobic interactions contribute to the overall binding affinity by promoting close contact and minimizing exposure to the aqueous environment. Moreover, a range of van der Waals interactions is observed between Entrectinib and various amino acids, including Met700, Arg683, Tyr709, Ser706, Asp708, Ser701, Val704, Lys627, Gly545, Val552, Lys572, Asp697, Val601, Glu618, Gly696, Tyr619, Met620, Lys621, and Gly623. These van der Waals forces play a significant role in the specificity and stability of the ligand-receptor complex, further enhancing the efficacy of Entrectinib.Fig. 62D and 3D binding interactions of Entrectinib with the targeted TRKA. Entrectinib exhibits hydrogen bonding, hydrophobic interactions, and van der Waals interactions.Full size imageAmong the ZINC ligands, four ligands, including ZINC000013331109, ZINC000047843689, ZINC000100115906, and ZINC000095803748, exhibited the best docking scores. These ligands interact with a range of amino acids, including Glu618, Met620, Leu686, Ala570, Phe617, Val552, Phe698, Lys572, Leu544, Gly623, Tyr619, and Asp624, with the highest docking score attributed to ZINC000013331109. As illustrated in Fig. 7, the pyrazole ring of ZINC000013331109 forms three hydrogen bonds with amino acids such as Glu618 and Met620. Additionally, the pyrazolo-pyridine demonstrates multiple Pi-alkyl interactions with amino acids such as Leu686, Ala570, Phe617, Val552, and Phe698. As shown in Table 5, other hydrophobic interactions, similar to those observed in the top-scored ligands and Entrectinib, are also present in ZINC000013331109. Notably, the interactions and docking score of this ZINC compound surpass those of Entrectinib, indicating the potential efficacy of this ZINC compound.Fig. 72D and 3D binding interactions of ZINC000013331109 with the targeted TRKA. This compound exhibits hydrogen bonds and multiple pi-alkyl interactions, along with hydrophobic interactions.Full size imageMD simulationThe RMSD plots presented here analyze the stability and structural changes of protein–ligand complexes over MD simulations lasting 100 ns. The horizontal axis represents simulation time in ns, while the vertical axis measures RMSD in Å, indicating the structural deviations of both the protein and ligands from their initial conformations. As shown in Fig. 8a for ligand L5, the RMSD curve for the protein (Cα) initially exhibits significant fluctuations, reflecting the structural changes in the protein upon ligand binding. These fluctuations may arise from alterations in the interactions between the ligand and the protein. After approximately 20 ns, the RMSD curve stabilizes around 4.5 Å, indicating that the protein reaches a relatively stable conformation following its initial adjustments. This stability suggests strong and persistent interactions between ligand L5 and the protein. The RMSD curve for ligand L5 shows lower initial values, demonstrating that it quickly achieves a stable state. After 60 ns, the ligand’s RMSD reaches approximately 2.5 Å, suggesting that, while it may experience minor changes, it remains comparatively more stable than the protein throughout the simulation. This stability indicates that ligand L5 is well-positioned within its binding site, thus maintaining effective interactions with the protein. In contrast, Fig. 8b shows that the RMSD for ZINC000013331109 remains relatively stable for the protein between 0 and 20 ns. However, it gradually increases from around 2 Å to approximately 4 Å. This noticeable increase indicates that the protein undergoes structural changes. The RMSD of ligand ZINC000013331109 starts at about 1.5 Å and gradually rises to around 2.5 Å. This smaller variation suggests that ligand ZINC000013331109 maintains greater stability compared to the protein, possibly indicating that it is well-positioned within the protein binding site, thereby establishing effective interactions. Furthermore, the disparity in RMSD changes between the protein and the ligand underscores the stability of ligand ZINC000013331109 during the protein’s deformation. The absence of significant fluctuations in the RMSD of the ligand emphasizes the robustness of the interactions between ligand ZINC000013331109 and the protein.Fig. 8RMSD values for (a) L5-6KZD and (b) ZINC000013331109-6KZD complexes.Full size imageThe RMSF plot provides critical insights into the flexibility and stability of specific residues within a protein during MD simulations. The x-axis represents the residue number, while the y-axis indicates the RMSF values in Å, reflecting the average fluctuations of each residue over the simulation period. Figure 9a depicts a comprehensive examination of ligand L5's interaction with the residues of the 6KZD binding site, indicating varying degrees of flexibility among different residues. Higher RMSF values, particularly those exceeding 3 Å, suggest greater mobility, while lower values, around 0.8–1.6 Å, indicate more stable regions. The green bars in the plot highlight residues with low RMSF values, such as Arg542, Leu544, Gly545, Glu546, Gly550, Val552, Ala570, Lys572, Val601, Phe617, Tyr619, Met620, Lys621, His622, Gly623, Asp624, Lys627, Leu686, Asp697, Phe698, Ser701, Arg702, Asp703, Val704, and Tyr705. These residues signify regions within the protein that exhibit varying degrees of stability and flexibility, which may play crucial roles in the protein’s structural integrity and functional dynamics. In contrast, as shown in Fig. 9b, ZINC000013331109 reveals that Leu544 exhibits significant fluctuations, indicating increased mobility in this residue. This heightened mobility may reflect structural instability or a dynamic region that is critical for the protein’s function. Additionally, similar to ligand L5, Phe617 and Leu686 are associated with strong intermolecular interactions or stabilizing structural motifs, which suggest reduced fluctuations and a more rigid conformation.Fig. 9RMSF plot of the (a) L5-6KZD and (b) ZINC000013331109-6KZD complexes.Full size imageThe bar chart presented illustrates the protein–ligand interactions within the L5-6KZD complex during MD simulations. As shown in Fig. 10a, this chart categorizes these interactions into four distinct types: hydrogen bonds, hydrophobic interactions, ionic interactions, and water bridges, each represented by a different color for enhanced visual clarity30,31. Notably, the residue Met620 exhibits a high interaction fraction, indicating its significant role in forming hydrogen bonds with ligand L5. This strong interaction underscores Met620 as a critical residue for maintaining the structural integrity and binding affinity of the complex. The presence of hydrophobic interactions is indicated by the purple bars, with residues such as Ala570, Phe617, and Phe698 making substantial contributions. These non-polar interactions are essential for stabilizing the protein–ligand complex by reducing the exposure of hydrophobic regions to the aqueous environment. Additionally, water bridge interactions are depicted in blue, although they occur less frequently than hydrogen bonds and hydrophobic contacts. Residues such as Glu546 and Tyr619 are involved in these water bridge interactions, which can enhance stability by facilitating additional contacts through solvent-mediated interactions. It is noteworthy, as shown in Fig. 10b, that Met620 in ZINC000013331109, similar to L5, makes a significant contribution to hydrogen bonding. This suggests that this amino acid play a critical role in stabilizing the protein structure. Additionally, multiple hydrophobic interactions between ZINC000013331109 and the amino acids Val552, Ala570, Leu686, and Phe698 are observed, occurring with greater prevalence than those associated with the ligand L5.Fig. 10The interactions of (a) L5-6KZD and (b) ZINC000013331109-6KZD during the MD simulation.Full size imageThe examination of ligand entrapment within specific amino acid residues offers important insights into the binding mechanism. As shown in Fig. 11a, Met620 and Phe698 were consistently engaged with ligand L5 throughout the entire simulation duration. Ligand L5 also exhibited strong interactions with residues such as Glu546, Ala570, Asp624, and Leu686, maintaining close proximity to these residues for a substantial portion of the trajectory. In contrast, other amino acid residues showed lower levels of interaction, suggesting that their contacts with the ligand occurred less frequently. Similarly, for ZINC000013331109, as illustrated in Fig. 11b, the amino acids Glu618 and Met620 were consistently engaged with the ligand throughout the entire duration of the simulation, highlighting the critical role of these residues in stabilizing the binding interaction.Fig. 11The observed interaction fractions throughout the simulated trajectory for (a) L5-6KZD and (b) ZINC000013331109-6KZD.Full size imageThe analysis of ligand–protein interactions for more than 30% of the simulation duration highlights the significant interactions between the ligand and various amino acid residues within the protein. Each residue plays a critical role in stabilizing the ligand. As shown in Fig. 12a, Met620 maintained a high interaction level of 90% throughout the simulation, showcasing its pivotal role in binding. Similarly, Phe698, Leu686, and Ala570, which exhibited notable hydrophobic interactions, significantly contribute to the overall stability of the ligand–protein complex. In the case of ZINC000013331109, as shown in Fig. 12b, charge interactions were observed for 99% of the simulation time. Additionally, the amino acid Met620 exhibited two hydrophobic interactions with the imidazole ring, further highlighting the significant role of this amino acid.Fig. 12Schematic representation of detailed ligand atom interactions with the protein residues for (a) L5-6KZD and (b) ZINC000013331109-6KZD.Full size imageThe torsional degree of freedom pertains to the rotation around a single bond within a molecule, enabling it to achieve an optimal conformation for interaction with the protein. Torsional degree values for the L5-6KZD complex were calculated, revealing variations in specific bonds throughout the trajectory simulation. As shown in Fig. 13a, three C–N bonds and three C–C bonds displayed changes in their torsional conformations within ligand L5. Specifically, the bonds between atoms 5 and 8, 8 and 10, and 15 and 16 exhibited a single conformation. Conversely, the bonds between atoms 14 and 15, 23 and 24, and 22 and 23 revealed two distinct conformations. This finding suggests that kaempferol adopts a specific conformation within the binding pocket of 6KZD. Figure 13b presents the potential energy differences linked to the torsional rotations. The differences observed for bonds 22 and 23, along with 23 and 24, were measured at 0.17 units. For the bonds 14 and 15, 15 and 16, 5 and 8, and 8 and 10, the differences were 8.00, 10.00, 21.33, and 7.76 units, respectively. In the case of ZINC000013331109, a C–C bond and a C–N bond are observed (Fig. 13c). The bonds between atoms 4 and 5, as well as those between atoms 12 and 13, exhibit a three-dimensional conformation, with potential energy differences associated with the torsional rotations of 0.85 and 11 units. As shown in Fig. 13d, several distinct conformers are clearly represented through the torsional angle plots. The first conformer is observed within the torsional angle range of 0° to 90°, exhibiting a high probability density in the corresponding bar plot. This indicates that this conformation remains relatively stable throughout the simulation. The second conformer emerges around the 180° angle, also displaying significant density in the bar plot, suggesting that it represents a balanced state within the ligand’s structure. The third conformer is situated close to the 270–360° range, and similarly, it is evidenced in the bar plot as a distinct stable conformation. Overall, these three conformers illustrate the diversity of the ligand’s conformational space and underscore the impact of torsional flexibility on its properties. Understanding this diversity is crucial for elucidating ligand behavior in biological interactions and aids in the drug design process.Fig. 13(a) and (c) show the 2D schematics of ligand L5 and ZINC000013331109, with different color-coded rotatable bonds. (b) and (d) present the torsional analysis of the conformations of ligand L5 and ZINC000013331109.Full size imageThe analysis of the structural and dynamic properties of molecules is crucial for a thorough understanding of their behavior under varying conditions. Essential tools in this area include the RMSD, which assists in evaluating stability and structural alignment. The rGyr provides insights into the mass distribution within the molecule, reflecting its shape and compactness. The SASA measures the portion of the molecule’s surface that is available for interaction with solvents, which is important for understanding intermolecular interactions. Additionally, intraHB interactions are fundamental for stabilizing the 3D structures of molecules. The MolSA and PSA characterize the surface properties of the molecule, which have significant implications for its biological and chemical interactions. Together, these parameters constitute vital metrics for analyzing and predicting molecular behavior in MD simulations. As shown in Fig. 14a, b, the ligands L5 and ZINC000013331109 exhibit significant differences in their structural properties. Regarding RMSD, ligand L5 ranges from 0.2 to 2.4 Å, while ZINC000013331109 shows more stable values between 0.1 and 0.6 Å, indicating less structural variation. For the rGyr, L5 has values between 4.0 and 5.6 Å, whereas ZINC000013331109 is more compact, ranging from 2.65 to 2.85 Å. The MolSA for L5 is considerably larger, ranging from 350 to 430 Å2, while ZINC000013331109 is much smaller, with a range between 175 and 182 Å2. This suggests that L5 may engage in more extensive interactions with its environment. In terms of SASA, L5 spans from 50 to 230 Å2, compared to ZINC000013331109, which ranges from 5 to 45 Å2, further indicating that L5 has higher accessibility to solvent. Finally, the PSA for L5 ranges from 70 to 160 Å2, whereas ZINC000013331109 fluctuates between 127 and 140 Å2, suggesting greater stability in the PSA of ZINC000013331109 compared to the minimum values of L5.Fig. 14Presents the ligand properties of (a) the L5-6KZD and (b) ZINC000013331109-6KZD during the 100 ns MD simulation.Full size imageDrug-likeness predictionThe top scored ligands and ZINC hits evaluated for drug-likeness are detailed in Table 6. The top scored ligands have a higher MW, ranging from 433.45 to 467.44 g/mol. This higher molecular weight may limit their mobility within biological tissues. For instance, L12 has a molecular weight of 462.49 g/mol. In contrast, ZINC hits exhibit lower molecular weights, ranging from 166.16 to 198.17 g/mol, which may enhance their absorption and penetration into biological tissues. A representative example is ZINC000047843689, which has a molecular weight of 193.16 g/mol. Furthermore, Entrectinib is the heaviest compound in this comparison, with a molecular weight of 560.64 g/mol, implying greater structural complexity and an increased likelihood of side effects. Top scored ligands generally possess higher HBA and donor values compared to ZINC hits. This characteristic may facilitate more effective biological interactions at the molecular level. For example, L15 has HBA 6 and HBD 4, indicating its potential for multifaceted interactions. Conversely, ZINC hits typically have lower HBA and HBD values, which may aid in their easier entry into biological systems. For instance, ZINC000013331109 has HBA 4 and HBD 2, while ZINC000047843689 has HBA 5 and HBD 3. Entrectinib, with HBA 6 and HBD 3, demonstrates suitable interaction capabilities with biological targets. When evaluating the topological polar surface area (TPSA), top scored ligands usually exhibit higher TPSA values, which can significantly affect their permeability and absorption capabilities. TPSA values for top scored ligands range from 73.91 to 99.93; for example, L12 has a TPSA of 99.93. In contrast, ZINC hits typically have lower TPSA values, which may facilitate enhanced permeability. For instance, ZINC000047843689 has a TPSA of 99.10. Regarding lipophilicity, as measured by logP (MLogP), top scored ligands generally show higher values. For example, L6 has a MLogP of 4.03, whereas ZINC000013331109 has an MLogP of 1.66, indicating a reduced lipophilic character that may impair its efficacy. Top scored ligands tend to have a higher number of rotatable bonds (nRot), suggesting greater spatial complexity, which may lead to increased biological activities. For instance, L15 has an nRot of 7. In contrast, ZINC hits usually possess fewer rotatable bonds, indicating simpler structures that may lead to less complexity in biological interactions. For example, ZINC000095803748 has an nRot of 0. Entrectinib, on the other hand, has an nRot of 8, reflecting its significant structural complexity and variability. All top scored ligands and ZINC hits, except for ZINC000095803748, fulfill the Lipinski and Ghose criteria, indicating their strong potential for pharmaceutical applications. Additionally, Entrectinib scores negatively on the Ghose evaluation, suggesting the possibility of side effects. Neither the top scored ligands nor the ZINC hits exhibit pan-assay interference compounds (PAINS) or Brenk alerts, indicating a high structural purity and a lower risk of toxicity. Furthermore, although Entrectinib is also devoid of PAINS and Brenk alerts, its higher molecular weight may lead to a greater likelihood of adverse effects. This finding highlights that ZINC hits may present more attractive options for drug development due to their lower molecular weight and simpler structures in comparison to the top scored ligands.Table 6 Physicochemical, pharmacokinetic, and medicinal chemistry properties of top-scored ligands, top ZINC hits, and Entrectinib.Full size tableIn silico ADMET predictionAs shown in Table 7, the comparative analysis of the ADMET properties for various ligands, including top scored ligands, ZINC hits, and Entrectinib, reveals significant insights regarding their pharmacological potential and limitations. The majority of the top scored ligands exhibit high absorption rates, with examples like L4 94.995% and L8 95.451% indicating strong potential for systemic effects. An absorption percentage above 85% is generally considered favorable for therapeutic applications. ZINC hits are also promising; for instance, ZINC000013331109 demonstrates high absorption at 94.425%. However, ZINC000047843689 shows significantly low absorption at 36.805%, which may limit its effectiveness. Entrectinib has an absorption rate of 87.259%, which is adequate, suggesting its potential utility in therapeutic contexts. Regarding distribution, most top scored ligands feature negative values distribution (VD), indicating limited tissue accumulation. Notably, L10 and L12 exhibit positive VD values of 0.067 and 0.246, suggesting better distribution capabilities. In contrast, the ZINC hits mostly possess negative VD values, indicating restricted distribution in systemic circulation that reduces their therapeutic viability. When examining blood–brain barrier (BBB) permeability, all ligands display negative log BB values, indicating limited capacity to cross the BBB. This characteristic could hinder their effectiveness in targeting central nervous system conditions. Entrectinib exhibits a log BB of −1.295, reflecting comparable limitations in central nervous system (CNS) targeting. In terms of CNS permeability, similar to BBB permeability, top scored ligands demonstrate negative log PS values, reflecting low permeability into the CNS, which could signify limited efficacy for potential neuropharmacological applications. The toxicity profile of the top scored ligands is notable, as they are all AMES-positive, indicating a higher likelihood of toxic effects. The presence of toxicity is a significant consideration in lead optimization and drug development. Conversely, none of the ZINC hits indicated toxicity in the AMES test, suggesting a better safety profile for these candidates. However, Entrectinib also presents a positive toxicity profile, necessitating careful evaluation during clinical application. Concerning hepatotoxicity, all top scored ligands are flagged for potential hepatotoxic effects, suggesting possible liver-related side effects during drug administration. In contrast, the ZINC hits demonstrate no hepatotoxicity risk, indicating they might be safer alternatives for further investigation. Finally, in terms of total clearance, Entrectinib exhibits a total clearance rate of 0.829 ml.min−1 kg−1, which is adequate but lower than L15 1.057, suggesting a more efficient clearance mechanism for L15. A higher clearance rate is typically desirable as it indicates efficient elimination from the body.Table 7 Predicted ADMET properties of top-scored ligands, top ZINC hits, and Entrectinib.Full size tableConclusionTRKs are integral to nerve development, and their dysregulation resulting from genetic alterations, such as NTRK fusions, has been associated with various forms of cancer. Recently, TRKs have emerged as promising therapeutic targets, with small molecule inhibitors like larotrectinib and entrectinib receiving FDA approval for clinical use. However, the issue of resistance to these therapies poses significant challenges, underscoring the urgent need for the identification and development of novel inhibitors that can effectively target TRK mutants resistant to existing treatments. In this study, CADD methodologies were employed to identify potential drug candidates. The generated pharmacophore hypothesis, based on a series of pyrazolo[3,4-b]pyridine derivatives, identified key characteristics necessary for biological activity. Molecular docking studies were conducted on 37 ligands, yielding docking scores ranging from −12.672 to −14.169 kcal/mol. The compound with the best fit was L5, which established five conventional hydrogen bonds with the carbonyl oxygen and nitrogen groups of the pyridine ring, along with nitrogen-methyl and fluorine atoms of the moiety interacting with residues Glu546, Met620, Lys627, and Lys572. MD simulations, which included various analyses such as RMSD, RMSF, rGy, intraHB, MolSA, and PSA, further confirmed the stability of the TRKA-L5 complex, emphasizing the strong interactions between L5 and the target protein. Additionally, virtual screening from the ZINC database identified the hit compound ZINC000013331109, which exhibited a docking score of −10.775 kcal/mol and demonstrated binding affinity through three hydrogen bonds with amino acids such as Glu618 and Met620. The pyrazolo-pyridine also displayed multiple pi-alkyl interactions with amino acids including Leu686, Ala570, Phe617, Val552, and Phe698. All screened ZINC hits complied with the Lipinski, Ghose, Veber, Egan, and Brenk rules. Notably, ZINC000013331109 did not display hepatotoxicity, in contrast to the top-scoring ligands and entrectinib, which have been linked to liver toxicity. This study underscores the potential of ZINC000013331109 as a promising candidate for TRKA inhibition, potentially offering advantages over existing therapies by reducing hepatotoxicity. Further research is warranted to assess the efficacy and safety of this compound in clinical settings.