Epitope mapping strategies for immunogenicity mitigation in streptokinase therapeutics: an in-silico study

Introduction

Fibrinolytic drugs are at the forefront of the treatment of blood clotting disorders: they catalyze the breakdown of fibrin clots into soluble peptides, enabling their phagocytic elimination. Among these agents, streptokinase (SK) is a cost-effective, FDA-approved therapeutic option for ST-segment elevation myocardial infarction, arterial thrombosis or embolism, deep vein thrombosis, pulmonary embolism, and arteriovenous cannula occlusion [1].

Comprising 414 amino acids, SK is a protein sourced from various strains of β-hemolytic streptococci, with the medicinal variant obtained from group C streptococci. Notably, SK does not possess catalytic activity by itself; instead, it binds to plasminogen, and the resulting interactions form an active enzyme complex [2,3]. As such, it is best described as a plasminogen activator rather than a direct enzyme.

While SK can play a crucial role in the elimination of blood clots, its administration comes with inherent challenges. Being a bacterial protein, SK triggers a significant immune response that results in the production of neutralizing antibodies. These immune reactions can manifest as mild allergic responses or progress to the development of antibodies. Consequently, repeated administration of SK results in diminished therapeutic efficacy, restricting its effectiveness in dissolving clots and increasing the risk of serious allergic responses [4,5,6]. The immunogenicity of SK is a major limitation for its widespread and long-term clinical use.

Several engineering approaches have been employed to optimize streptokinase. PEGylation has been explored as a method to prolong its half-life and mitigate its immunogenic response. However, these alterations diminished its ability to activate plasminogen [7], which led to the development of PEGylated streptokinase mutants and PEGylated liposomal SK to compensate for the problem [8,9].
Furthermore, investigations have indicated that neutralizing activity against native streptokinase significantly exceeds that elicited against a mutant streptokinase variant lacking the C-terminal 42 amino acids [10]. A recent article demonstrated that in silico tools have a substantial ability to optimize streptokinase: computational methods predicted that streptokinase could be targeted effectively by fusing it with a fibrin-binding peptide [11]. These findings indicate that the design of SK should go beyond simple modifications and focus on reducing its interaction with the immune system.

Antibodies neutralize proteins by binding to specific regions known as epitopes. Identifying these B-cell epitopes is therefore of great value in advancing and refining medical therapeutics [12]. The evolution of in-silico studies has facilitated numerous investigations focusing on the prediction of both linear and conformational B-cell epitopes, leveraging diverse algorithms [13,14,15]. This approach involves locating the epitope residues and subsequently mutating them to eliminate the identified epitopes, which shows promise in lowering the immunogenicity of therapeutics based on foreign proteins [16]. However, most previous studies focus on proteins other than streptokinase. The detection of B-cell epitopes in SK remains unreported in the literature, a significant research gap that underscores the need for computational methods to gain insight into SK antigenicity.

In the current investigation, our primary objective was to define the B-cell epitopes within SK and subsequently employ targeted point mutations to eliminate them. This strategic approach aims to mitigate the immunogenicity of the enzyme.
We first modeled the SK structure, then employed multiple servers with diverse algorithms to identify the B-cell epitopes present within SK, performed the mutagenesis, and rebuilt the protein structure.

Methods

Visual workflow

To provide a comprehensive overview of our in-silico methodology, we begin by presenting the key steps as a visual workflow (Fig. 1). This workflow diagram highlights the structured approach used to identify and modify the antigenic regions of streptokinase, with an emphasis on the connections between the analysis steps and the computational tools employed in the course of this study. By presenting a clear pathway of our analyses, we aim to convey the rationale, the methodology, and the underlying concepts that have guided our research.

Fig. 1 Visual workflow of the immunogenicity mitigation strategy for streptokinase. This diagram illustrates the in-silico methodology employed in this study, beginning with the retrieval of the streptokinase sequence from UniProtKB and progressing through the subsequent computational tools and analyses: antigenicity prediction; 3D structure prediction with refinement; docking of streptokinase to plasminogen; linear and conformational B-cell epitope prediction; hot-spot residue determination; candidate hotspot alteration and evaluation; selection of effective point mutations; and mutein 3D structure modeling and validation.
The diagram was generated with the assistance of Napkin (https://napkin.ai), an AI-powered (artificial intelligence-powered) note-taking tool, and shows the interconnectedness of the different analyses and the flow of our methodology.

Structural analysis

Sequence retrieval

The streptokinase sequence with the accession number P00779 was obtained from the UniProtKB database [17] (http://www.uniprot.org). The sequence was acquired in FASTA format to facilitate subsequent investigations and analyses. UniProtKB was chosen for its extensive, curated, and reliable data, making it suitable as a starting point for our study.

Antigenicity prediction

The antigenic probability of streptokinase was predicted using VaxiJen [18], an online tool accessible at http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html. The server applies an alignment-free approach; the threshold was set to 0.4, the default for bacterial antigens in VaxiJen, which balances sensitivity and specificity in identifying potential antigenic regions.

3D structure prediction

The PDB file containing the 3D structure of streptokinase (PDB ID: 1BML) was obtained from the RCSB PDB database [19] (https://www.rcsb.org), and the relevant sequence (chain C) was extracted using UCSF Chimera software. The sequences retrieved from UniProt and RCSB PDB were compared using Clustal Omega [20], available at https://www.ebi.ac.uk/Tools/msa/clustalo, to identify gaps and differences between the two sequences.

To address these gaps, the SWISS-MODEL homology modeling server [21] (available at https://swissmodel.expasy.org/) was utilized. This tool was employed to construct the missing residues and reconcile mismatches by integrating the sequence from UniProtKB.
Importantly, while SWISS-MODEL successfully modeled the internal missing residues, it did not cover the terminal residues, necessitating the generation of a more complete 3D structure model.

For this purpose, the I-TASSER [22] (accessible at https://zhanggroup.org/I-TASSER/) and AlphaFold 3 (available at https://alphafoldserver.com/) servers were employed [23].

Model evaluation and validation

The secondary structures of each of the ten models were visualized using UCSF Chimera software [24], and the positions forming beta-sheets, helices, or loops were compared with data sourced from UniProtKB. To assess the quality of the 3D structures, several metrics were employed.

The ERRAT software [25], available at the SAVES online server (https://saves.mbi.ucla.edu), was utilized to evaluate the structural integrity of the models. Furthermore, Qualitative Model Energy Analysis (QMEAN) scores were computed using the QMEANDisCo tool [26], accessible from the SWISS-MODEL server at https://swissmodel.expasy.org/qmean.

In addition to these evaluations, MolProbity [27] (https://www.molprobity.biochem.duke.edu) was employed to identify Ramachandran-favored residues within the predicted structures. These metrics were combined because each assesses a different aspect of a protein model: its structural folding, its energy stability, and the presence of sterically acceptable conformations. This analysis helped ascertain the conformational quality and overall reliability of the predicted structures.

Docking analysis

For the docking analysis involving the candidate model and plasminogen, the catalytic domain of plasminogen was extracted from chain A of the 1BML structure using Chimera software. This extracted domain was then used to represent plasminogen in the subsequent docking procedure.

The docking between the SK model and the catalytic domain of plasminogen was carried out using ZDOCK 3.0.2 [28], accessible at https://www.zdock.umassmed.edu.
This tool was selected because it provides a robust approach to evaluating protein-protein interactions based on shape complementarity and energetic factors. It facilitated the exploration and prediction of potential binding interactions between the SK model and the catalytic domain of plasminogen, and these criteria guided the selection of the best binding pose.

Molecular dynamics simulation

The selected 3D model of streptokinase served as the input for molecular dynamics (MD) simulations. GROMACS 2024, executed on a high-performance computing system equipped with a Pascal-architecture GPU, was employed for the simulation process. The Amber ff99SB*-ILDN force field was used to represent molecular interactions and dynamics. Energy minimization of the protein/complex was performed using the steepest descent algorithm, with a maximum of 50,000 cycles and a force tolerance of 1000 kJ mol⁻¹ nm⁻¹. The protein/complex was positioned within a cubic box, ensuring a 1 nm solvent space on all sides. Solvation was accomplished using the SPC216 water model to mimic the surrounding solvent environment. Additionally, to simulate physiological conditions, 0.15 M NaCl was added to the system. The system was equilibrated under NVT (constant number of particles, volume, and temperature) and NPT (constant number of particles, pressure, and temperature) ensembles. The temperature was maintained at 300 K and the pressure at 1 bar using the V-rescale temperature and Parrinello-Rahman pressure coupling algorithms, respectively. The production MD simulation, conducted at a constant temperature of 300 K and pressure of 1 bar, employed the leapfrog algorithm for a duration of 200 nanoseconds. This allowed the exploration of molecular motions and interactions within the system over an extended simulation period [29].
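As a rough illustration of how the protocol above maps onto run parameters, the following Python sketch collects the stated settings and derives the step count for the 200 ns production run. The 2 fs time step is an assumption (the text does not state it); the keys follow common GROMACS `.mdp` option names, while the dictionary layout and the helper functions are hypothetical and are not the authors' input files.

```python
# Sketch only: settings quoted in the text, plus an ASSUMED 2 fs time step.

ASSUMED_DT_PS = 0.002  # assumption: the time step is not given in the text

minimization = {
    "integrator": "steep",  # steepest descent
    "nsteps": 50_000,       # maximum number of minimization cycles
    "emtol": 1000.0,        # force tolerance quoted in the text
}

production = {
    "integrator": "md",     # leap-frog integrator
    "dt": ASSUMED_DT_PS,    # ps (assumed)
    "tcoupl": "V-rescale",
    "ref_t": 300,           # K
    "pcoupl": "Parrinello-Rahman",
    "ref_p": 1.0,           # bar
}


def steps_for(duration_ns: float, dt_ps: float) -> int:
    """Number of integration steps needed to cover duration_ns."""
    return round(duration_ns * 1000.0 / dt_ps)


def render_mdp(params: dict) -> str:
    """Render a parameter dictionary as 'key = value' lines."""
    return "\n".join(f"{key} = {value}" for key, value in params.items())


production["nsteps"] = steps_for(200, ASSUMED_DT_PS)
print(render_mdp(production))
```

With a different time step, only `ASSUMED_DT_PS` needs to change; the step count follows from the 200 ns duration.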
The parameters were selected as described here to ensure that the system reached equilibrium at the intended temperature and pressure before the production run.

Immunoinformatic analysis

Linear B-cell epitope prediction

Multiple computational tools employing diverse algorithms were used to predict linear (continuous) B-cell epitopes.

BepiPred-3.0 (available at https://services.healthtech.dtu.dk/services/BepiPred-3.0/) [30] uses sequence-based epitope prediction enhanced with language model embeddings to predict epitope locations; the default threshold of 0.1512 was employed. ABCpred (available at http://crdd.osdd.net/raghava/abcpred/) [31], which uses both feed-forward and recurrent neural network approaches and has a reported accuracy of 65.93%, was run with the default threshold of 0.51. BEPITOPE [32], which leverages physicochemical properties, was also employed for the prediction of linear B-cell epitopes. The SVMTriP server (available at https://sysbio.unl.edu/SVMTriP/prediction.php) [33] predicted epitopes based on tri-peptide similarity and propensity scores using a Support Vector Machine (SVM). BepiPred-2.0 (available at https://services.healthtech.dtu.dk/services/BepiPred-2.0/) [34] was developed using a Random Forest algorithm trained on epitope and non-epitope amino acids derived from crystal structures. Residues scoring above the default threshold (0.5) were predicted to be part of an epitope.
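To make the thresholding step concrete, the sketch below turns a list of per-residue scores into contiguous epitope segments. It is a minimal illustration rather than any server's actual implementation: the function name, the `min_length` parameter, and the toy scores are hypothetical, and only the 0.5 cutoff is taken from the BepiPred-2.0 default quoted above.

```python
from typing import List, Tuple


def epitope_segments(scores: List[float], threshold: float,
                     min_length: int = 1) -> List[Tuple[int, int]]:
    """Return 1-based (start, end) runs where every score exceeds threshold."""
    segments = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = position  # a new run begins here
        elif start is not None:
            # The run ended at the previous position; keep it if long enough.
            if position - start >= min_length:
                segments.append((start, position - 1))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start + 1 >= min_length:
        segments.append((start, len(scores)))
    return segments


# Hypothetical per-residue scores, not actual server output.
toy_scores = [0.2, 0.6, 0.7, 0.55, 0.3, 0.9, 0.1, 0.52, 0.58]
print(epitope_segments(toy_scores, threshold=0.5, min_length=2))
```

Raising `min_length` discards short isolated runs, which is one simple way to impose a lower size limit on predicted epitopes.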
The default thresholds were selected for all servers to balance sensitivity and specificity in predicting linear epitopes.

Conformational B-cell epitope prediction

The modeled SK 3D structure was analyzed with several structure-based tools to predict epitopic residues.

DiscoTope 2.0 (available at https://services.healthtech.dtu.dk/services/DiscoTope-2.0) [35] predicts epitopes by combining an amino acid epitope propensity score with relative surface accessibility (estimated from contact numbers). A score threshold of −3.7 was employed for epitope prediction. ElliPro (available at https://tools.iedb.org/ellipro/) [36], which uses Thornton’s method along with a residue clustering algorithm, was run with a minimum score of 0.5 and a maximum distance of 6 Å. The EPSVR predictor (available at http://sysbio.unl.edu/EPSVR/) [37] employs Support Vector Regression over several parameters, including residue epitope propensity, conservation score, side chain energy score, contact number, surface planarity score, and secondary structure composition, to predict epitopes within the protein structure.

Hot spot residue determination

To decrease the antigenic properties of SK, several properties of the predicted conformational B-cell epitope residues were evaluated to select hot spot residues as candidates for mutation. Residues that are conserved, involved in interactions, or score low on the propensity scales were excluded from consideration as hot spot residues.

Conserved residue determination

Conserved residues were identified using the ConSurf tool (available at https://consurf.tau.ac.il/consurf_index.php) [38], which employs consensus and relative entropy approaches.
Evolutionary rates were estimated based on phylogenetic relationships among homologous sequences. Additionally, this tool categorized residues as surface-exposed or buried. Conserved exposed residues are more likely to be functional, while conserved buried residues tend to be structural.

A BLAST search against the UniRef90 database yielded 45 homologs, from which 7 unique sequences were extracted. The resulting sequences were aligned using Clustal Omega to determine the residues shared among all sequences.

Furthermore, the SK sequence and the 7 homologous sequences were submitted to the MEME tool within the MEME Suite [39] web server (https://meme-suite.org/meme/tools/meme) to identify novel, ungapped motifs present in the sequences. These motifs represent approximate conserved sequence patterns associated with distinct protein functions.

Protein-protein interaction analysis

The 3D structure 1BML, which contains identical SK-plasminogen complexes, was processed using UCSF Chimera to isolate a single pair from the structure. The output was then submitted to the PDBsum [40] web server (http://www.ebi.ac.uk/thornton-srv/databases/pdbsum) to obtain an interaction diagram for the complex. Additionally, the previously obtained ZDOCK output was also submitted to PDBsum to enrich the dataset of streptokinase-plasminogen interactions.

Propensity scale assessment

Tools available in the IEDB database (http://tools.iedb.org/bcell) were used to obtain surface accessibility, flexibility, and hydrophilicity scores for residues. These predictions are based on propensity scales, each of which assigns one value to each of the 20 amino acids according to its relative propensity to exhibit the specified property.

The residues selected as hot spots were those that scored highly in the epitope prediction algorithms and on the surface accessibility, flexibility, and hydrophilicity scales, while being neither conserved, buried, nor involved in protein-protein interactions.
These specific properties were chosen to identify residues that are more likely to be involved in the immune response and are therefore more appropriate targets for mutagenesis.

Candidate hotspots alteration

The candidate hotspots were cross-referenced against the linear epitopes to find the associated epitopes. The antigenicity scores of these candidate epitopic regions were evaluated using the VaxiJen web server to identify high-scoring epitopes. Hotspots within the selected regions were substituted with a defined set of amino acids less likely to induce antigenicity. This set included Ala, Val, Leu, Ile, Met, Phe, Pro, Trp, Cys, Tyr, Asp, Glu, Lys, His, and Arg. For each replacement, the corresponding ΔΔG (change in free energy of stability) and IEDB antigenicity score were computed through the EASE-MM tool [41] (accessible at https://sparks-lab.org/server/ease-mm/) and the IEDB server, respectively. The entire SK sequence containing each single point mutation was analyzed by VaxiJen to assess the new antigenicity score.

EASE-MM, a machine learning method, predicted the value of the stability change (ΔΔGu = ΔGu(mutated) - ΔGu(wild-type)). A negative ΔΔGu signifies a destabilizing mutation.

Candidate alterations displaying a positive ΔΔGu, alongside a decrease in both VaxiJen and IEDB scores, were considered potential stabilizing mutations. This strategy was used to explore combinations of modifications that benefit both overall protein stability and antigenicity.

Selection of effective point mutations

Subsequently, the VaxiJen scores for each epitopic region undergoing point mutation(s) were individually calculated. Additionally, the VaxiJen score for the entire sequence was computed multiple times, with each calculation applying one permutation of the point mutation candidates.
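The bookkeeping behind this step can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' scripts: the function names and the six-residue toy sequence are hypothetical, and the mutant sequences would still have to be submitted to VaxiJen, IEDB, and EASE-MM separately. Because the order in which substitutions are applied does not change the resulting sequence, the sketch enumerates unordered combinations rather than ordered permutations.

```python
from itertools import combinations
from typing import Iterable, Iterator, Tuple


def apply_mutation(sequence: str, mutation: str) -> str:
    """Apply a substitution written as wild type, 1-based position, new residue (e.g. 'E3M')."""
    wild_type, position, new_residue = mutation[0], int(mutation[1:-1]), mutation[-1]
    if sequence[position - 1] != wild_type:
        raise ValueError(f"{mutation}: found {sequence[position - 1]} at position {position}")
    return sequence[:position - 1] + new_residue + sequence[position:]


def mutant_sequences(sequence: str,
                     mutations: Iterable[str]) -> Iterator[Tuple[Tuple[str, ...], str]]:
    """Yield every non-empty combination of mutations with the resulting sequence."""
    mutations = list(mutations)
    for size in range(1, len(mutations) + 1):
        for subset in combinations(mutations, size):
            mutant = sequence
            for mutation in subset:
                mutant = apply_mutation(mutant, mutation)
            yield subset, mutant


def is_candidate(ddg_u: float, vaxijen_new: float, vaxijen_wt: float,
                 iedb_new: float, iedb_wt: float) -> bool:
    """Selection rule from the text: positive stability change and lower antigenicity scores."""
    return ddg_u > 0 and vaxijen_new < vaxijen_wt and iedb_new < iedb_wt


# Hypothetical toy sequence and mutations, used only to show the mechanics.
toy_sequence = "MKEDSA"
variants = list(mutant_sequences(toy_sequence, ["E3M", "D4M", "S5W"]))
for subset, mutant in variants:
    print("+".join(subset), mutant)
```

The wild-type check in `apply_mutation` guards against numbering mismatches between sequence sources, which matter whenever a mutation list is transferred between a database sequence and a structure file.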
This process facilitated the identification of the most effective point mutations.

Furthermore, a propensity scale assessment was conducted for the alternative amino acids and compared with that of the original (wild-type) residues. This approach allowed the selection of the best substitution by combining local and global changes in antigenicity and by evaluating how the substituted residue affects the overall protein properties, particularly hydrophobicity, flexibility, and surface exposure.

Modeling of the Mutein 3D structure

The mutein’s structure was predicted using several approaches, and the following models were constructed:

Model SKC: built in UCSF Chimera by replacing the intended residues within the original structure.
Model SKA: created by assembling unchanged segments of the original structure with the alternate residues, with the help of the AIDA [42] web server (available at https://aida.godziklab.org).
Models SKI1-5: generated with I-TASSER.
Models Alpha1-5: generated with the AlphaFold 3 server.

All models were refined using ModRefiner (https://zhanggroup.org/ModRefiner/) for improved physical realism and accuracy. To select the most accurate model, the models were compared on Z-score, Verify-3D score, and Ramachandran plot, assessed through ProSA-web [42] (at https://prosa.services.came.sbg.ac.at/prosa.php), the SAVES platform, and the MolProbity web server, respectively. Combining diverse modeling and refinement techniques was necessary to ensure the reliability of the final model.

Mutein 3D structure validation

The chosen mutein structure served as input for the molecular dynamics (MD) simulation. The resulting output was used to construct a mutein-plasminogen complex through ZDOCK.
The interactions within this newly formed complex were compared against the previously prepared interaction dataset to assess the compatibility of the mutein’s structure with plasminogen. This step validated the structure and assessed the interactions of the engineered protein with its physiological target.

Results

Structural analysis

VaxiJen antigenicity score

The VaxiJen antigenicity score obtained for the UniProt sequence P00779 was 0.4911, above the 0.4 threshold, indicating that SK is a probable antigen.

3D structure prediction

A comparison of the UniProt and RCSB PDB sequences highlighted discrepancies in the SK structure. These comprised five missing residue segments spanning positions 1 to 11, 46 to 70, 175 to 181, 252 to 262, and 373 to 414, alongside five single-residue mismatches at positions 11, 210, 244, 253, and 303 within the 1BML structure. SWISS-MODEL’s reconstruction did not completely address these gaps.

Subsequently, five models were generated through I-TASSER and five through AlphaFold 3. These models were assessed using the ERRAT, QMEANDisCo, and MolProbity tools (Supplementary Table S1). A scoring system was devised to evaluate the accuracy of the secondary structure in these models by comparing helices and beta sheets with the information available in UniProt; the resulting accuracy is stated as a percentage in Table S1. Additionally, the root mean square deviation (RMSD) plot derived from the MD simulation confirmed the structural stability of the native SK model throughout the simulation period.

Immuno-informatics analysis

B-cell epitope prediction

Linear B-cell epitopes were predicted with a lower size limit set at nine amino acids [43].
Predictions were conducted using the ABCpred and SVMTriP servers, which employ different sequence-based prediction algorithms. ABCpred, using a neural network approach, identified epitopes of varying lengths (10 to 22 amino acids), and SVMTriP, using SVM-based tri-peptide similarity and propensity scores, predicted multiple epitopes that aligned with the ABCpred predictions. Overlapping regions of high-scoring epitopes from both servers across different lengths were selected for further analysis. Combining results from tools built on different methodologies yielded more reliable epitopes. The specific epitopes chosen for further consideration are listed in Table 1.

Conformational B-cell epitopes were predicted using the DiscoTope 2.0, ElliPro, and EPSVR servers. ElliPro identified 53 of the 414 residues as potential B-cell epitopes. Similarly, DiscoTope 2.0 and EPSVR identified 51 and 53 residues, respectively. The epitopes retrieved from these servers are depicted in Fig. 2. Consensus conformational epitopes were determined by selecting residues predicted by at least two conformational epitope predictors. This consensus comprised 49 residues: E53, P168, N170-D174, P212, G213, I254-N261, L277-D285, A383-D387, and Y397-K414.

Fig. 2 Map of epitopes obtained from different servers. The X-axis represents the streptokinase amino acid number, and the Y-axis indicates the servers used to analyze the linear and conformational epitopes. Black represents linear epitopes and blue represents conformational epitopes.
The consensus of conformational epitopes is drawn in green.

A library of conformational B-cell epitopes was created based on the epitope prediction outcomes. Residues were included in the library if they met essential criteria for B-cell epitopes: non-conservation in evolution, absence from interaction interfaces, and high levels of surface accessibility, hydrophilicity, and flexibility.

Each of the conformational prediction algorithms relies on different aspects of the protein structure: ElliPro clusters residues using a score and a distance cutoff, DiscoTope 2.0 combines propensity scores with relative surface accessibility, and EPSVR applies support vector regression to a variety of structural parameters. Restricting further analysis to the consensus epitopes predicted by at least two different tools (Fig. 3) reduces the likelihood of false positives.

Fig. 3 Predicted conformational B-cell epitopes on the 3D structure of streptokinase. This figure depicts the 3D structure of the SK protein, with the backbone shown as a grey surface with a ribbon representation. The areas highlighted in hot pink represent the predicted conformational B-cell epitopes, which have the potential for antibody binding. These epitopes were identified using multiple in silico prediction tools, including the DiscoTope 2.0, ElliPro, and EPSVR servers.
The figure shows the spatial distribution of these epitopes across the protein structure and their complex three-dimensional arrangement. This image was generated using PyMOL version 3.0.0.

Hot spot determination

Residues with evolutionary conservation and those involved in interface interactions were eliminated from further analysis. The conserved conformational epitopes encompassed residues K257, G259, K278, K279, Y284, Y384, D385, K386, Y397, Y399, L400, R401, and T405 (Fig. 4). Additionally, the interface epitopes consisted of N170, P171, P283, Y284, and D285, according to the docking analysis (see Docking analysis in Methods). As shown in Table S2, which outlines the propensity scales of the candidate conformational epitopes, the analysis focused on residues exhibiting high surface accessibility, hydrophilicity, and flexibility, which comprised E53, N170-D174, P212, N255, S258, K279-D285, D385, K386, D387, and I407-N410. The thresholds for surface accessibility, flexibility, and hydrophilicity were set at 1, 1.01, and 2.061, respectively. Because IEDB could not calculate scores for terminal residues, these residues were excluded from the analysis. Residues P212, G280, and P408 were also excluded from the conformational epitope library.

Fig. 4 Evolutionarily conserved residues. The X-axis represents the streptokinase amino acid number, and the Y-axis indicates the servers used to analyze conservation.
The results from ConSurf and the mutual results of Clustal Omega and MEME are considered as consensus, including residues 74, 76, 79–84, 87, 100, 103, 106, 107, 110–112, 117, 119, 124, 125, 133, 137, 139–141, 155, 158, 164, 166, 194, 196, 199, 203, 204, 206, 207, 216, 219, 221, 224–227, 233, 240, 242, 243, 246, 248, 249, 257, 259, 266–268, 270, 272, 274, 278, 279, 284, 290, 298–301, 308, 313, 322, 323, 325, 327, 328, 332, 334, 336, 338–341, 345, 348, 349, 351–354, 363, 366–369, 371, 372, 379, 381, 384–386, 391–394, 397, 399–401, 405.

Linear epitopes encompassing the remaining candidate hot spots were identified at positions 43–78, 172–198, 246–270, 279–295, 373–394, and 404–414. These regions were subjected to VaxiJen analysis to evaluate their antigenicity scores. Notably, the epitopes at positions 279–295 and 404–414, along with their corresponding hot spots (E281, K282, I407, D409, N410), were excluded from the library (Table 1). Consequently, residues E53, D172, D173, D174, N255, S258, and D387 were identified as candidate hot spots within the epitope library. Figure 5 illustrates the position of these epitopes on the surface of the enzyme.

Table 1 Antigenicity score of the linear epitopes by VaxiJen.

Fig. 5 Hotspot residues for immunogenicity reduction on the 3D structure of streptokinase. This figure displays the SK protein structure (gray) with the conformational B-cell epitopes highlighted in hot pink. The key residues identified as “hot spots” for mutagenesis, based on their structural characteristics, low conservation, and high antigenicity scores, are highlighted in cyan. These residues represent potential targets for reducing the protein’s immunogenic potential through targeted point mutations without compromising its structural stability or its binding to its natural substrate (plasminogen).
The selection of these specific residues followed directly from the computational tools and analyses described in our methodology. This image was created using PyMOL version 3.0.0.

Residue modification

Each identified candidate residue was substituted with an appropriate alternative amino acid. The selection of alternative amino acids focused on those inducing a positive ΔΔGu while concurrently reducing the VaxiJen and IEDB antigenicity scores. Table 2 lists the scores of the alternate residues for the candidate hot spots.

Table 2 The scores of alternate residues for the candidate hot spots. Alternate amino acids with improved scores are highlighted in green. Specifically, alternates exhibiting a positive ΔΔGu value, a VaxiJen score lower than 0.4911, and a diminished IEDB score are considered potential stabilizing mutations.

After modifying the epitopic regions and candidate mutein sequences, the VaxiJen antigenicity score was assessed to ascertain reductions in immunogenicity. These evaluations are presented in Tables 3 and 4. Notably, the modifications E53M, D174M, and S258W demonstrated substantial reductions in predicted immunogenicity when employed together. Consequently, these modifications were chosen for the reconstruction of the 3D structure. Table 5 compares the propensity scales for residues 53, 174, and 258 before and after alteration.

Table 3 VaxiJen antigenicity score of the wild-type (WT) and modified (M) epitopic regions.

Table 4 VaxiJen antigenicity score of the wild-type SK sequence and SK with altered residues.

Table 5 IEDB propensity scales of the residues 53, 174, and 258, before and after the alteration.
As seen in the table, the scores are reduced in the mutein residues (M) in comparison to the wild type (WT).

Mutein 3D structure analysis

The quality of the predicted mutein structures was compared, and the Alpha1 model was selected as the most suitable mutein model. Refer to Table S3 for an overview of this selection process.

The constructed mutein-Plg complex was found to exhibit acceptable alignment with the SK-Plg complex. This comparison supports the integrity and compatibility of the mutein-plasminogen complex.

Moreover, the conformational stability of the mutein was evaluated by analyzing the time-dependent behavior of the C-alpha root mean square deviation (RMSD) derived from the MD simulations. Figure 6 displays the RMSD plot representing the stability of the mutein structure over time.

Fig. 6 C-alpha RMSD analysis of wild-type streptokinase and mutein during 200 ns molecular dynamics (MD) simulations. The RMSD values for both the wild-type (black) and mutein (red) forms of streptokinase were monitored over a 200 ns simulation. Both proteins achieved stable conformations, with RMSD values remaining predominantly below 3 Å, indicating strong structural stability. Minor fluctuations are observed, particularly in the mutein, but overall the structural integrity of the proteins is preserved throughout the simulation period. The graph was plotted using QtGrace software version 0.27-1.

Discussion

Reducing the immunogenicity of therapeutic proteins remains a central challenge in protein engineering, particularly for bacterial proteins such as streptokinase (SK), which often trigger strong immune responses in humans [1,44].
In this study, we employed a rational, structure-based immunogenicity mitigation pipeline using epitope mapping, molecular modeling, and molecular dynamics (MD) simulation to identify and validate mutation strategies that reduce B-cell antigenicity of SK without disrupting protein structure or function.

We initiated this process by predicting B-cell epitopes using a combination of sequence-based and structure-based tools. The predicted linear and conformational epitopes were cross-validated using multiple algorithms to enhance reliability. These predictions were then mapped onto the structural model of SK, which we generated using AlphaFold 3. This full-length model resolved previously unstructured or missing segments present in crystallographic data, enabling a more complete evaluation of surface-accessible antigenic regions. Mapping of the predicted epitopes onto the AlphaFold 3 model revealed that the majority of immunogenic hotspots were located on surface-exposed regions, consistent with known properties of B-cell epitopes.

Structural analysis has indicated that antigenic regions notably protrude from the protein surface, which is considered a critical epitope characteristic45,46. Continuous epitopes may occur at terminal protein segments, exposing hydrophilic and mobile regions to antibodies47. Conformational epitopes comprise multiple residues from different segments, typically surface loops, that are brought together in the folded protein. They often protrude from the protein surface and involve surface-exposed residues, loops, and turns. Epitopes are characterized by lower conservation, high surface accessibility, flexibility, hydrophilicity, and fewer hydrophobic residues45. They consist of amino acids that interact with antibodies, mainly large hydrophilic or charged amino acids (E, D, R), potentially aromatic residues (F, H, Y), or polar uncharged residues (N, S, T), all of which contribute to antigenic properties.
Proline (P) and glycine (G) act as helix breakers and are often excluded from epitope hotspots48,49,50. The replacement of hotspots with hydrophobic residues aims to reduce immune-related adverse reactions. Alanine is commonly used for substitution because it preserves protein conformation and lacks side-chain interactions with antibodies51. Substitution with other hydrophobic residues such as valine, leucine, or methionine may also diminish antigenic scores45,48.

To ensure that the mutations do not disrupt protein functionality, evolutionarily conserved residues and residues located at protein–protein interfaces were excluded from further consideration. Specifically, conserved conformational epitopes included residues such as K257, G259, K278, among others (Fig. 4), while docking analysis identified interface residues like N170 and Y284. Candidate residues for substitution were prioritized based on surface exposure, hydrophilicity, and flexibility (Table S2), using strict cutoffs (surface accessibility > 1, flexibility > 1.01, hydrophilicity > 2.061) to focus only on truly accessible and immunogenic residues. Furthermore, residues located in regions with inherently low antigenicity (as predicted by VaxiJen) were excluded from the mutational analysis. Specifically, low-antigenicity regions were identified at amino acid positions 279–295 and 404–414. The following residues were omitted as a result: E281, K282, R284, T285, D286, A287, L289, D290, Y291, S292, D409, G410, N411, A412, and S413. Excluding these residues ensured that only strongly antigenic regions were targeted for immunogenicity mitigation, further refining the specificity of the mutational design. Importantly, none of the proposed mutations overlapped with known functional or active residues, which is expected to preserve the plasminogen-activating function of SK.

We selected specific hotspot residues for substitution based on their contribution to antigenicity, surface accessibility, and side-chain chemistry.
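The residue filter described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' actual workflow: the `ResidueScores` record, the function names, and the example score values are hypothetical. Only the three cutoffs and the two low-antigenicity ranges are taken from the text.

```python
from dataclasses import dataclass

# Cutoffs quoted in the text (a residue must exceed all three).
SURFACE_CUTOFF = 1.0
FLEXIBILITY_CUTOFF = 1.01
HYDROPHILICITY_CUTOFF = 2.061

# Low-antigenicity regions quoted in the text (1-based, inclusive).
LOW_ANTIGENICITY_REGIONS = [(279, 295), (404, 414)]


@dataclass(frozen=True)
class ResidueScores:
    """Hypothetical per-residue record; field names are assumptions."""
    position: int
    amino_acid: str
    surface_accessibility: float
    flexibility: float
    hydrophilicity: float
    conserved: bool = False      # flagged by a conservation analysis
    at_interface: bool = False   # flagged by docking / interface analysis


def in_low_antigenicity_region(position: int) -> bool:
    """True if the position lies in one of the excluded regions."""
    return any(start <= position <= end
               for start, end in LOW_ANTIGENICITY_REGIONS)


def is_candidate(res: ResidueScores) -> bool:
    """Apply the exclusion rules first, then the three score cutoffs."""
    if res.conserved or res.at_interface:
        return False
    if in_low_antigenicity_region(res.position):
        return False
    return (res.surface_accessibility > SURFACE_CUTOFF
            and res.flexibility > FLEXIBILITY_CUTOFF
            and res.hydrophilicity > HYDROPHILICITY_CUTOFF)


def select_candidates(residues):
    """Return the residues that survive every filter, in input order."""
    return [r for r in residues if is_candidate(r)]


if __name__ == "__main__":
    # Score values below are invented for illustration only.
    example = [
        ResidueScores(170, "N", 1.4, 1.05, 3.1, at_interface=True),
        ResidueScores(174, "D", 1.5, 1.05, 3.0),
        ResidueScores(257, "K", 1.6, 1.06, 3.2, conserved=True),
        ResidueScores(290, "D", 1.5, 1.05, 3.0),
    ]
    print([f"{r.amino_acid}{r.position}" for r in select_candidates(example)])
```

With the invented scores above, only D174 passes: N170 is dropped as an interface residue, K257 as conserved, and D290 because it falls inside the 279–295 region. Applying the exclusions before the score cutoffs keeps the functional constraints independent of how the thresholds are tuned.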
The choice of substituted residues was guided by the principle of reducing polar side chains that promote antibody binding and introducing small hydrophobic residues to minimize immune recognition. Alanine was prioritized due to its minimal steric and electronic impact. Where alanine substitution was less suitable due to local structural constraints, other amino acids were selected to preserve the hydrophobic core or maintain overall folding stability. Our findings showed that substituting polar charged residues (E53 and D174) with a hydrophobic residue (M), and a polar uncharged residue (S258) with another hydrophobic residue (W), reduced the predicted antigenicity.

While this study focused primarily on B-cell epitopes due to their predominant role in SK immunogenicity, T-cell epitopes may also contribute to cellular immune responses. The mutations designed to reduce B-cell epitope antigenicity (E53M, D174M, S258W) likely affect overlapping T-cell epitopes, given the shared surface-exposed nature of these regions. Future studies could explore T-cell epitope profiles to further validate the immunogenicity reduction of the mutein, including predictive assessments of toxicity, antigenicity, and allergenicity, as well as molecular docking between T-cell epitopes and MHC molecules.

To validate the structural stability of wild-type and mutein SK, molecular dynamics (MD) simulations were conducted using models generated by AlphaFold 3, a deep learning-based protein structure prediction tool. For each construct, five models were generated with predicted TM-scores above 0.62, and the best models were selected based on QMEAN, ERRAT, and MolProbity assessments. Compared to earlier models predicted by I-TASSER and SWISS-MODEL, the AlphaFold models demonstrated superior stereochemical quality and structural consistency.
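For reference, the C-alpha RMSD used to judge stability in these simulations is the square root of the mean squared displacement of the C-alpha atoms from a reference structure. The sketch below is a minimal pure-Python illustration under the assumption that each frame has already been superimposed on the reference (MD analysis tools normally perform this least-squares fit first). The function names and the toy coordinates are hypothetical and are not taken from the study's trajectories.

```python
import math


def ca_rmsd(reference, frame):
    """RMSD in the input units between two equally ordered coordinate lists.

    Each argument is a sequence of (x, y, z) tuples, one per C-alpha atom.
    Assumes the frame is already aligned to the reference.
    """
    if len(reference) != len(frame) or len(reference) == 0:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for ref_atom, frame_atom in zip(reference, frame):
        squared_sum += sum((a - b) ** 2 for a, b in zip(ref_atom, frame_atom))
    return math.sqrt(squared_sum / len(reference))


def fraction_below(rmsd_series, threshold=3.0):
    """Fraction of frames whose RMSD is below the threshold (e.g. 3 Å)."""
    if not rmsd_series:
        raise ValueError("rmsd_series must not be empty")
    return sum(1 for value in rmsd_series if value < threshold) / len(rmsd_series)


if __name__ == "__main__":
    # Toy three-atom example; coordinates in Å are invented.
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    frame = [(0.0, 0.5, 0.0), (3.8, 0.0, 0.5), (7.6, -0.5, 0.0)]
    print(f"RMSD = {ca_rmsd(reference, frame):.2f} Å")
```

A helper such as `fraction_below` gives one way to quantify a statement like "predominantly below 3 Å" for a series of per-frame RMSD values.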
MD simulations over 200 ns confirmed the stability of both wild-type and mutein proteins, with RMSD values remaining predominantly below 3 Å. These findings support the structural feasibility of the proposed mutations and reinforce confidence in the mutein’s therapeutic potential.

Similar simulation-based evaluations have been used effectively to predict pharmacological behavior and molecular stability. For example, Khan et al. applied MD simulations to explore binding stability of novel isoflavones against cancer targets, revealing the importance of structural dynamics in therapeutic optimization52. Rahbar et al. demonstrated the power of in silico design and MD simulations in engineering structurally robust uricase variants through strategic disulfide bond introduction, offering a rational and predictive approach to enhancing enzyme stability without compromising catalytic efficiency53. Ullah et al. demonstrated that MD analysis enhances the reliability of docking predictions in anti-inflammatory drug discovery54. Although our study focuses on immunogenicity mitigation rather than target binding, the principle remains consistent: MD enables assessment of structural feasibility for engineered therapeutic proteins. Moreover, Aftab et al. illustrated how integrating spatial and temporal molecular descriptors through deep learning and ODEs can improve biological property prediction, a strategy conceptually aligned with our focus on dynamic structural validation in silico55.

Our computational framework integrates epitope prediction, structural modeling, mutation design, and dynamic validation, providing a holistic view of the antigenic and structural landscape of therapeutic proteins. Importantly, by ensuring that mutated residues do not interfere with functional domains, we maintain the biological activity of SK.
Although in silico methods provide valuable insights, the predicted reduction in immunogenicity must ultimately be validated through in vitro immunoassays and in vivo studies. Nonetheless, this study demonstrates the feasibility of rational immunogenicity mitigation using modern computational tools, and the principles described here are applicable to other biologics facing similar immunological barriers.

Conclusion

Our study utilized computational epitope mapping and strategic modification to mitigate the immunogenicity of Streptokinase (SK). Through the identification and substitution of key B-cell epitope hotspot residues with hydrophobic amino acids, our findings suggest a potential reduction in immune reactivity while preserving catalytic functionality. However, these results are predictive and require experimental validation. The necessity for creating a modified SK with lower immunogenicity is emphasized due to its crucial role in therapy, highlighting the need for further research. Nonetheless, our study proposes a promising approach to optimize biological drugs by adjusting their immunogenic profile. It also underscores the significance of integrating computational methodologies into drug development pipelines, offering insights into potential strategies for enhancing the safety and effectiveness of biopharmaceuticals.

Data availability

All data generated or analysed during this study are included in this paper, its supplementary material, and in publicly available repositories including the UniProt Knowledge Base at https://www.uniprot.org and the Protein Data Bank at https://www.rcsb.org.

References

1. Edwards, Z. & Nagalli, S. Streptokinase. In StatPearls (StatPearls Publishing, Treasure Island (FL), 2023).
2. Chaudhari, A. M. et al. CRISPR-Cas9 mediated knockout of SagD gene for overexpression of streptokinase in Streptococcus equisimilis. Microorganisms 10 (3), 635 (2022).
3. Raee, M. J. et al. Cloning, purification and enzymatic assay of streptokinase gene from Streptococcus pyogenes in Escherichia coli. Minerva Biotechnol. Biomol. Res. 29, 8–13 (2017).
4. Igor, A. et al. Efficiency of targeted delivery of streptokinase based on fibrin-specific liposomes in the in vivo experiment. Drug Deliv. Transl. Res. 13 (3), 811–821 (2023).
5. Sawhney, P., Katare, K. & Sahni, G. PEGylation of truncated streptokinase leads to formulation of a useful drug with ameliorated attributes. PLoS One 11 (5), e0155831 (2016).
6. Mican, J. et al. Structural biology and protein engineering of thrombolytics. Comput. Struct. Biotechnol. J. 17, 917–938 (2019).
7. Adivitiya & Khasa, Y. P. The evolution of recombinant thrombolytics: current status and future directions. Bioengineered 8 (4), 331–358 (2017).
8. Sawhney, P. et al. Site-specific thiol-mediated PEGylation of streptokinase leads to improved properties with clinical potential. Curr. Pharm. Des. 22 (38), 5868–5878 (2016).
9. Jin, S-E., Kim, I-S. & Kim, C-K. Comparative effects of PEG-containing liposomal formulations on in vivo pharmacokinetics of streptokinase. Arch. Pharm. Res. 38 (10), 1822–1829 (2015).
10. Torrèns, I. et al. A mutant streptokinase lacking the C-terminal 42 amino acids is less immunogenic. Immunol. Lett. 70 (3), 213–218 (1999).
11. Hajizade, M. S. et al. Targeted drug delivery to the thrombus by fusing streptokinase with a fibrin-binding peptide (CREKA): an in silico study. Ther. Deliv. 15 (6), 399–411 (2024).
12. Zeng, Y. et al. Identifying B-cell epitopes using AlphaFold2 predicted structures and pretrained language model. Bioinformatics 39 (4), btad187 (2023).
13. Collatz, M. et al. EpiDope: a deep neural network for linear B-cell epitope prediction. Bioinformatics 37 (4), 448–455 (2021).
14. Tung, C. H. et al. NIgPred: class-specific antibody prediction for linear B-cell epitopes based on heterogeneous features and machine-learning approaches. Viruses 13 (8), 1531 (2021).
15. Sun, P. et al. Advances in in-silico B-cell epitope prediction. Curr. Top. Med. Chem. 19 (2), 105–115 (2019).
16. Keshtvarz, M. et al. Engineering of cytolethal distending toxin B by its reducing immunogenicity and maintaining stability as a new drug candidate for tumor therapy; an in silico study. Toxins 13 (11), 785 (2021).
17. Boutet, E. et al. UniProtKB/Swiss-Prot. Methods Mol. Biol. 406, 89–112 (2007).
18. Zaharieva, N. et al. VaxiJen dataset of bacterial immunogens: an update. Curr. Comput. Aided Drug Des. 15 (5), 398–400 (2019).
19. Burley, S. K. et al. RCSB Protein Data Bank: tools for visualizing and understanding biological macromolecules in 3D. Protein Sci. 31 (12), e4482 (2022).
20. Sievers, F. & Higgins, D. G. The Clustal Omega multiple alignment package. Methods Mol. Biol. 2231, 3–16 (2021).
21. Waterhouse, A. et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 46 (W1), W296–W303 (2018).
22. Zhou, X. et al. I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction. Nat. Protoc. 17 (10), 2326–2353 (2022).
23. Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630 (8016), 493–500 (2024).
24. Pettersen, E. F. et al. UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Sci. 30 (1), 70–82 (2021).
25. Colovos, C. & Yeates, T. O. Verification of protein structures: patterns of nonbonded atomic interactions. Protein Sci. 2 (9), 1511–1519 (1993).
26. Studer, G. et al. QMEANDisCo-distance constraints applied on model quality estimation. Bioinformatics 36 (6), 1765–1771 (2020).
27. Williams, C. J. et al. MolProbity: more and better reference data for improved all-atom structure validation. Protein Sci. 27 (1), 293–315 (2018).
28. Pierce, B. G. et al. ZDOCK server: interactive docking prediction of protein-protein complexes and symmetric multimers. Bioinformatics 30 (12), 1771–1773 (2014).
29. Kabiri, M. et al. A repurposing pipeline to candidate-suitable inhibitors of tyrosinase: computational and bioassay studies. Chem. Biodivers. 21 (12), e202401035 (2024).
30. Clifford, J. N. et al. BepiPred-3.0: improved B-cell epitope prediction using protein language models. Protein Sci. 31 (12), e4497 (2022).
31. Saha, S. & Raghava, G. P. Prediction of continuous B-cell epitopes in an antigen using recurrent neural network. Proteins 65 (1), 40–48 (2006).
32. Odorico, M. & Pellequer, J. L. BEPITOPE: predicting the location of continuous epitopes and patterns in proteins. J. Mol. Recognit. 16 (1), 20–22 (2003).
33. Yao, B. et al. SVMTriP: a method to predict B-cell linear antigenic epitopes. Methods Mol. Biol. 2131, 299–307 (2020).
34. Jespersen, M. C. et al. BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes. Nucleic Acids Res. 45 (W1), W24–W29 (2017).
35. Kringelum, J. V. et al. Reliable B cell epitope predictions: impacts of method development and improved benchmarking. PLoS Comput. Biol. 8 (12), e1002829 (2012).
36. Duquesnoy, R. J. & Marrari, M. Usefulness of the ElliPro epitope predictor program in defining the repertoire of HLA-ABC eplets. Hum. Immunol. 78 (7–8), 481–488 (2017).
37. Liang, S. et al. EPSVR and EPMeta: prediction of antigenic epitopes using support vector regression and multiple server results. BMC Bioinform. 11, 381 (2010).
38. Ashkenazy, H. et al. ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules. Nucleic Acids Res. 44 (W1), W344–W350 (2016).
39. Bailey, T. L. et al. The MEME Suite. Nucleic Acids Res. 43 (W1), W39–W49 (2015).
40. Laskowski, R. A. et al. PDBsum: structural summaries of PDB entries. Protein Sci. 27 (1), 129–134 (2018).
41. Folkman, L. et al. EASE-MM: sequence-based prediction of mutation-induced stability changes with feature-based multiple models. J. Mol. Biol. 428 (6), 1394–1405 (2016).
42. Xu, D. et al. AIDA: ab initio domain assembly for automated multi-domain protein structure prediction and domain-domain interaction prediction. Bioinformatics 31 (13), 2098–2105 (2015).
43. Berglund, L. et al. The epitope space of the human proteome. Protein Sci. 17 (4), 606–613 (2008).
44. Baharifar, H. et al. Preparation of PEG-grafted chitosan/streptokinase nanoparticles to improve biological half-life and reduce immunogenicity of the enzyme. Int. J. Biol. Macromol. 143, 181–189 (2020).
45. Yari, M. et al. Decreasing the immunogenicity of Erwinia chrysanthemi asparaginase via protein engineering: computational approach. Mol. Biol. Rep. 46 (5), 4751–4761 (2019).
46. Nelapati, A. K. et al. In-silico epitope identification and design of uricase mutein with reduced immunogenicity. Process Biochem. 92, 288–302 (2020).
47. Van Regenmortel, M. H. What is a B-cell epitope? Methods Mol. Biol. 524, 3–20 (2009).
48. Ramya, L. N. & Pulicherla, K. K. Studies on deimmunization of antileukaemic L-asparaginase to have reduced clinical immunogenicity: an in silico approach. Pathol. Oncol. Res. 21 (4), 909–920 (2015).
49. Nagata, S. & Pastan, I. Removal of B cell epitopes as a practical approach for reducing the immunogenicity of foreign protein-based therapeutics. Adv. Drug Deliv. Rev. 61 (11), 977–985 (2009).
50. Onda, M. et al. Characterization of the B cell epitopes associated with a truncated form of Pseudomonas exotoxin (PE38) used to make immunotoxins for the treatment of cancer patients. J. Immunol. 177 (12), 8822–8834 (2006).
51. Onda, M. et al. An immunotoxin with greatly reduced immunogenicity by identification and removal of B cell epitopes. Proc. Natl. Acad. Sci. U.S.A. 105 (32), 11311–11316 (2008).
52. Hameed, A. R., Fakhri Ali, S. & Almanaa, N. Exploring the hub genes and potential drugs involved in Fanconi anemia using microarray datasets and bioinformatics analysis. J. Biomol. Struct. Dyn. 43 (7), 3297–3310 (2025).
53. Rahbar, M. R. et al. Stability and functional consequences of disulfide bond engineering in Aspergillus flavus uricase. Sci. Rep. 15 (1), 18419 (2025).
54. Ahmad, S. et al. Identification of potential drug molecules against fibroblast growth factor receptor 3 (FGFR3) by multi-stage computational-biophysics correlate. J. Biomol. Struct. Dyn. 43 (3), 1240–1248 (2025).
55. Aftab, N. et al. An optimized deep learning approach for blood-brain barrier permeability prediction with ODE integration. Inform. Med. Unlocked 48, 101526 (2024).

Author information

Authors and Affiliations

Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
Mohammad Soroosh Hajizade & Mohammad Javad Raee

Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
Mohammad Reza Rahbar

College of Graduate Studies, Upstate Medical University, State University of New York, New York, U.S.
Maryam Kabiri

Research Center for Traditional Medicine and History of Medicine, Department of Persian Medicine, School of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran
Mahdie Hajimonfarednejad

Contributions

Mohammad Soroosh Hajizade contributed to the conceptualization, data curation, formal analysis, and writing of the original draft. Mohammad Reza Rahbar was responsible for methodology. Maryam Kabiri handled the software, visualization, and contributed to writing, reviewing, and editing. Mahdie Hajimonfarednejad performed the validation. Mohammad Javad Raee oversaw project administration.

Corresponding author

Correspondence to Mohammad Javad Raee.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.