Heterogeneity in an adeno-associated virus transfection-based production process limits the production efficiency

Wait 5 sec.

IntroductionSince the emergence of gene therapy products, recombinant adeno-associated virus (rAAV) has become one of the leading viral vectors. This can be attributed to their unique properties of non-pathogenicity, low immunogenicity, episomal transgene expression, long-term persistency and broad tissue tropism1,2,3,4,5. Presently, there are six FDA-approved rAAV-based gene therapies, and over 200 more are under development6,7. By 2025, FDA predicts to approve 10 to 20 cell and gene therapies per year3. Despite the prevalence of rAAV-based gene therapy, its administration has been limited, partly due to high treatment costs2,8. The high prices can be attributed to high dosage requirements2, typically in the scale of 1012 to 1015 vector genomes per systemic dose6. This requirement, combined with low manufacturing productivities and high production costs8, results in costs per dose in excess of 1 million dollars9. The increased adoption of rAAVs places a burden on current manufacturing platforms, and achieving higher productivity is necessary to satisfy the growing clinical and commercial needs.The most common platform for rAAV manufacture is transient expression in human embryonic kidney 293 (HEK293) cells after triple-plasmid transient transfection10. Compared to other methods, transient expression after plasmid transfection is more versatile and forgoes helper viruses, eliminating the risks associated with their use. However, this method faces difficulties during scale-up and is costly due to high plasmid consumption and lack of consistency7,11,12. Even with a surplus of plasmids, only 30–80% of cells are successfully transfected3. Additionally, plasmid imbalances arising from this method may contribute to inefficient packaging and inconsistent empty-to-full capsid ratios2. Hence, titers are generally lower compared to other methods11, reaching up to 1014 vector genomes per liter13,14, requiring several litres of culture for one single dose. Extensive efforts have been made on process development for efficient transfection and transient expression3,15. However, the gaps in our biological understanding of rAAV production need to be addressed to overcome the limitations that are hindering the process at a cellular level.Assembly of rAAVs rely heavily on the host cell machinery, a rational approach to closing the gaps in our knowledge concerning alterations of the host-cell metabolism during AAV production is through the characterization of the host cell transcriptome. Indeed, previous studies have performed bulk RNA sequencing on rAAV production using either HEK293 cells or Sf9 insect cells and identified characteristic features of the producer cells. Virgolini et al. studied the transcriptional changes induced by rAAV production using the Sf9 baculovirus expression vector system. They found an enrichment of transcripts involved in the cell cycle, cell growth, protein folding and amino acid metabolism16. Wang et al. compared the transcriptomes of viral-producing and non-producing cells from two different HEK293 cell lines. They found that rAAV production led to an upregulation of innate immune response signalling pathways and stress response pathways of the host cells, as well as late-phase downregulation of fatty acid metabolism and neutral amino acid transport15. Chung et al. performed a kinetic study of the transcriptome after transfection at high cell density and at manufacturing scale and found an increased antiviral and inflammatory response, suggesting that cells respond to rAAV production as if it was an infection17. Lu et al. studied the cellular response of HEK293 cells towards transient transfection using both transcriptomics and proteomics and found an upregulation in immune response, unfolded protein response and p53 signaling18.Fundamentally, bulk RNA sequencing (RNA seq) studies the average gene expression in a cell population, which conceals the intrinsic cell-to-cell variability that persists even within a homogeneous population19,20,21. Single-cell RNA sequencing (scRNA seq) allows this heterogeneity to be revealed by examining the gene expression at a higher resolution in individual cells22,23. As the impact of cellular heterogeneity becomes more recognized and as advancements improve the accuracy and cost-effectiveness of this technology24, more studies have adopted scRNA seq for exploring the gene expression of cells in various contexts. Most of these studies are within immunology and oncology, revealing tumor heterogeneity and distinguishing immune cell types in clonally distinct cells20. However, one study has applied scRNA seq to examine rAAV production using the insect-cell baculovirus expression vector system in Sf9 insect cells25.  Virgolini et al. found a progressive increase in heterogeneity associated with the expression of viral and viral vector genes, as well as alterations in protein folding and translation, metabolic processes and stress response due to infection.This study presents the first application of scRNA seq to adherent and suspension cultures of HEK293 cells during transient transfection-based rAAV production, which uncovered cell-to-cell variability and potentially its underlying causes. Distinct groups were identified among the cells producing rAAVs and the trajectory of their development was examined. Using this approach, a significant limitation in production was identified and strategies for supporting rAAV production are proposed.Materials and methodsCell cultureHEK293T/17 ATCC-CRL-11,268 cells (see Ladd et al.26 were grown in GIBCO FreeStyle HEK293 Expression medium (Thermo Fisher Scientific, Waltham, USA) as either adherent cultures in Corning CellBIND static tissue culture T-25 flasks (Corning Inc., New York, USA) or suspension cultures in 125 mL shake flasks. All cultures were maintained in a Minitron incubator shaker set to 37 °C and 5% CO2.rAAV productionPrior to transfection, the adherent and suspension cultures reached cell densities of 2 × 106 cells mL− 1. Equal concentrations of the transfection agent PEIpro® (Polyplus, France) diluted in the cell culture medium and a DNA mixture containing a 2:1:1 mass ratio of pHelper, pRepCap (rAAV9), and green fluorescent protein (GFP) plasmids (Cobra Biologics, Charles River, UK) were combined. The PEIpro-DNA mixture was incubated for 7 min and then added to the cultures. The adherent and suspension cultures were maintained for up to 48 h post transfection (hpt). Samples were collected at 0, 12, 24 and 48 hpt and cryopreserved in 7.5% dimethylsulfoxide (DMSO).AnalyticsCell density and viabilityCell density and viability of the samples were determined with the NORMA XS automated cell counter and cell viability analyzer (IPRASENSE, Clapiers, France), which is based on holographic imaging.GFP expressionSamples taken from the cultures were centrifuged at 180 g for 5 min and resuspended in phosphate-buffered saline (PBS). The resuspended cells were analyzed using the Guava easyCyte benchtop flow cytometer (Cytek Biosciences, Fremont, USA). The forward and side scattering were used to gate single cells in the InCyte software Version 4.5 (Cytek Biosciences, Fremont, USA). GFP fluorescence was measured with a 488 nm excitation laser and a combination of a 512/18 nm and a 575/25 nm detector.rAAV genome titersrAAV genome titer quantification was performed using the AAVpro Titration Kit (for Real Time PCR) Ver.2 (cat#6233) (Takara Bio, Kusatsu, Japan). The samples were prepared following the associated protocol and using the provided kit components. The adherent samples were first washed with Ca2+- and Mg2+-free PBS, then incubated with TrypLE (Thermo Scientific, Waltham, USA) at 37 °C for 5 min. Supernatant samples were obtained by centrifuging at 1000 g for 5 min. Lysate samples were prepared through 3x freeze thaw cycles in a lysis buffer at -80 °C and centrifuged at 1000 g for 5 min. The lysis buffer containing 150 mM NaCl, 50 mM Tris-HCl and 2 mM MgCl2 was adjusted to a pH of 8.5 using NaOH. The lysate and supernatant samples were separately treated with DNAse for 1 h at 37 °C to digest free DNA and subsequently heated to 95 °C for 30 min to inactivate the DNAse. The lysate and supernatant samples were treated with lysis buffer for 20 min to break down the rAAV9 capsids. Finally, the samples were diluted 50-fold to use as a template for real-time qPCR. TB Green Premix Ex Taq II and the AAV2 ITR Forward and Reverse Titer Primers were combined with either the lysis template, supernatant template or the positive control provided in the kit prior to executing real-time qPCR analysis using the Bio-Rad CFX96 qPCR System (Bio-Rad, Hercules, USA). The program included: an initial denaturation at 95 °C for 10 min, a 2-step PCR comprising 35 cycles at 95 °C for 5 s and at 60 °C for 30 s for FAM fluorescence detection and a melt curve analysis. The standard curve was generated through a serial dilution of the positive control to facilitate quantification.Intracellular rAAV assembled capsid distributionThe distribution of rAAV particles containing correctly assembled capsids in the suspension sample at 48 hpt was investigated using the biotinylated affinity ligand CaptureSelect Biotin Anti-AAV9 conjugate (Thermo Fisher Scientific, Waltham, USA) which specifically binds to assembled capsids. At 48 hpt, cells were first washed with PBS, then fixated using 4% paraformaldehyde in PBS for 15 min. The fixated cells were washed with PBS and centrifuged at 400 g for 5 min three times to remove the fixative. The fixated cells were subsequently permeabilized with 0.1% Triton-X in PBST (PBS with 0.1% Tween 20) for 7 min and centrifuged at 400 g for 5 min. The permeabilized cells were incubated at room temperature with 0.1 ng/µL of the biotin anti-AAV9 conjugate for 1 h. This was followed by incubation at room temperature with 0.5 ng/µL Streptavidin, Alexa Fluor 647 Dye (Thermo Fisher Scientific, Waltham, USA) for 1 h. After incubation, the cells were washed with PBST, centrifuged at 400 g for 5 min and then resuspended in PBST. The samples were analyzed with the Guava easyCyte benchtop flow cytometer (Cytek Biosciences, Fremont, USA) using a 642 nm excitation laser and a 661/15 nm detector. The gating was performed in the InCyte software Version 4.5 (Cytek Biosciences, Fremont, USA) using the forward and side scattering. Untransfected cells were used as a negative control.Single-cell RNA sequencingSample preparation for sequencingThe samples for scRNA seq were thawed in a water bath at 37 °C prior to multiplexing with the Single cell Multiplexing Kit (Cat. No. 633781) (BD Biosciences, Franklin Lakes, USA) according to the manufacturer’s instructions (Doc ID: 210970). The isolation of single-cells, subsequent capture of the RNA, and then cDNA synthesis was performed with the BD Rhapsody Express (BD Biosciences, Franklin Lakes, USA) according to the manufacturer’s protocol (Doc ID: 210967) using one cartridge. A target of 5 000 cells per sample was used, which resulted in a total of 20 000 cells per cartridge. The Whole Transcriptome Analysis (WTA) Amplification Kit (Cat num: 633801) (BD Biosciences, Franklin Lakes, USA) was used to prepare the libraries from the cDNA containing BD Rhapsody beads according to the manufacturer’s protocol (Doc ID: 23-21712-00). The library quality and fragment length were checked with capillary gel electrophoresis on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, USA). The High Sensitivity DNA kit (Cat. No. 5067 − 4626) (Agilent Technologies, Santa Clara, USA) was used according to the manufacturer’s protocol. Paired-end mRNA sequencing and adapter demultiplexing using the Illumina NovaSeq 6000 platform (Illumina, Inc., USA) was performed by a third party (Novogene Co., LTD, China).Reference transcriptome for alignmentWhole plasmid sequencing of the plasmids was performed by a third party (Eurofins, Germany) through real-time long-read sequencing using Oxford Nanopore technology. The sequences for the human adenovirus 5 (Ad5) genome (AC_000008.1) and the simian virus 40 large T antigen (SV40 LT) antigen gene (AY531219.1) were acquired from the National Center to Biotechnology Information (NCBI)27. The plasmid sequences, Ad5 genome and SV40 LT gene were converted into transcripts using the program created by Pertea et al. called GFFRead28. This program used the obtained FASTA and general feature formats (GFF) files to convert the genes into transcripts. The transcripts for the plasmid sequences, Ad5 genome and SV40 LT gene combined with the human reference transcriptome from the GRCh38.p14 assembly (GCF_000001405.40) in NCBI were used as references for the alignment.Sequence alignmentSequence alignment to the references was performed in the cloud-based bioinformatics platform Seven Bridges (Seven Bridges Genomics Inc., USA). From this platform, the BD Rhapsody Sequence Analysis Pipeline version 2.0 utilizing the STAR alignment algorithm29. The output FASTQ files obtained from the sequencing and the reads are filtered by quality, demultiplexed and aligned to the references. Unique molecular identifiers (UMI) were quantified per gene per cell. Finally, UMI correction and cell label filtering were performed to obtain the file containing the molecule counts per cell, the binary alignment map (BAM) and the sequencing metrics.Data analysisProcessing and clusteringThe output file from the alignment was processed using the package Seurat version 5.1.030, in R version 4.3.1. Multiplets, undetermined cells, low quality cells (> 15% mitochondrial features and < 3000 gene count) were removed from the dataset. The dataset was split into two subsets of either adherent or suspension cells. The counts within these subsets were log normalized and features were linearly scaled using the functions NormalizeData() and ScaleData() with default parameters. PCA dimensionality reduction was performed with 50 principal components. An equivalent number of dimensions was used to construct the nearest-neighbor graph with the function findNeighbors(). The clustering was performed with the function findClusters(). The resolution for the clustering was determined using the framework by Patterson-Cross et al. called chooseR31, which chooses the most robust parameter through bootstrapped iterative clustering. A significance analysis of the number of clusters was performed using the package scSHC version 0.1.032, confirming that the number of principal components and the resolution were optimal for the subsets.Module scoresThe module scores for the plasmid gene expression were calculated using the AddModuleScore() function in Seurat version 5.1.030 with default parameters.Trajectory inferenceTrajectory inference was performed on the adherent and suspension subsets separately with Slingshot version 2.10.0 to follow the progression of cells between different cellular states over an underlying temporal variable or pseudotime33. Determination of the starting cluster was guided by the plasmid expression. A module score was calculated for the combined expression of all three plasmid genes, and the cluster with the lowest expression was defined as the starting cluster. Using default parameters in Slingshot, lineages were determined for the remaining clusters. Each lineage branching from the starting cluster represents cells transitioning into different states. From these lineages, principal curves were fitted through the data. Each principal curve is a smooth non-linear representation of the lineage. The cell-level weights and pseudotimes were individually extracted for the lineages for further analysis.Trajectory-based differential expression and overrepresentation analysisTrajectory-based differential expression analysis was individually performed on the lineages using the packages tradeSeq version 1.16.034 and SingleCellExperiment version 1.24.035. A negative binomial general additive model (NB-GAM) was used to model gene expression of each gene as smooth nonlinear functions of pseudotime. The optimal number of knots, at which the smooth nonlinear functions are joined together, was determined to be 6 using the function evaluateK(). The cell-level weights and pseudotimes were used as input values, while keeping all other parameters default. To determine the genes that are differentially expressed between the starting and end points of the lineage, the function startVsEndTest() was individually used on the gene NB-GAMs for each lineage. Genes with a log2 fold change > 2 and a p-value < 0.05 were clustered based on similarities in expression pattern over 100 points along the trajectory using the function clusterExpressionPatterns(). The function RSEC() from the package clusterExperiment version 2.22.036 was then used to consolidate the clusters into a final consensus cluster. The merging cut-off was set to 0.9 while keeping all other parameters default. The final consensus clusters were ordered based on gene expression over pseudotime. Overrepresentation analysis on each of the final consensus clusters was performed using the packages clusterProfiler version 4.10.137 and enrichR version 3.238 to determine the Gene Ontology (GO) biological processes associated with each final consensus cluster. An adjusted p-value threshold of < 0.0001 was applied to show only the most significant terms.Comparison of single-cell and bulk plasmid gene expressionSample preparation for bulk RNA sequencingA second experiment was conducted to produce samples for bulk RNA seq using the same methods described in Sect. 2.1, 2.2 and 2.3. Four replicates were prepared and sampled at 0, 12, 24 and 48 hpt. Samples were cryopreserved in 7.5% DMSO prior to sequencing. Library preparation and paired-end mRNA sequencing was performed on the cryopreserved samples by an external party (Novogene Co., LTD, China) using the Illumina NovaSeq 6000 platform. The external party performed quality control and data filtering by removing reads containing adapters, reads with > 10% indeterminable (N) bases, and reads with low quality (Q score ≤ 5) bases comprising > 50% of the total bases.Alignment and quality controlQuality control of the acquired bulk RNA sequencing data was performed prior to data processing using a tool created by Andrews et al. called FastQC39 combined with the tool made by Ewels et al. called MultiQC40. Alignment and quantification was performed using a program developed by Bray et al. called Kallisto41. The reference index used for pseudoalignment and quantification was created in Kallisto with the same reference transcripts and transcriptomes used in 2.5.2. The output file of Kallisto containing the estimated counts for each transcript was used for data analysis. Quality control of the pseudoalignment was performed using MultiQC40.Data processing and analysisThe package DESeq242 was employed to normalize the raw counts of the bulk RNA seq experiment using a negative binomial model. For comparison with the scRNA seq experiment, the raw counts of the plasmid genes from the bulk RNA seq experiment were normalized to their respective values at 48 hpt. These values were then compared to pseudo bulked raw counts from single-cell samples. Pseudo bulking was achieved using the function AggregateExpression from the package Seurat version 5.1.030. The raw counts from all the cells were summed for each gene and the summed counts were normalized to their respective values at 48 hpt.ResultsTransient rAAV production through transient triple-plasmid transfectionTwo HEK293 cell clones, growing either adherent or in suspension, were transfected with three plasmids for the production of rAAV9. Analysis of the cell lysates for rAAV titers revealed broadly similar values for the adherent and suspension cultures, around 2000 vg/cell, shown in Fig. 1. However, in the supernatants, the suspension cultures produced slightly more with 108 vg/mL compared to the 107 vg/mL of the adherent cells. There were slight variations among replicates, particularly for R4 and R5 in comparison with the other replicates.Fig. 1(a) Cell specific titer in the lysate and (b) volumetric titer in the supernatant measured through qPCR for adherent (ADH) and suspension (SUP) samples at 48 hpt. R1 to R5 corresponds to five replicates of the same experiment. One replicate (R1) was used for the single-cell RNA sequencing experiment and four replicates (R2-R5) were used for the bulk RNA sequencing experiment.Full size imageSingle-cell and bulk data quality control and processingNext, the transcriptomes of the cells were determined, both for single cells (R1) and bulk (R2-R5). The quality filtering, alignment, UMI correction and cell label filtering for the scRNA seq samples resulted in an average of 3.8 × 103 reads per cell for all four samples, adherent growth and suspension growth at 12 and 48 hpt, as shown in Table S1. The resulting number of cells after filtering is shown in Table S1. The bulk RNA seq data had an average of 2.6 × 107 normalized counts per sample after quality filtering, alignment and normalization, given in Table S2.Correlation of single-cell plasmid gene expression to bulkTo confirm the reproducibility of the plasmid gene expression observed in the single-cell RNA seq experiment, the single-cell raw counts were pseudo bulked then compared to the bulk RNA seq counts. The raw counts from each plasmid gene from 12 hpt of both experiments were normalized to the corresponding gene expression at 48 hpt, shown in Figure S1. The relationship of expression of all of the plasmid genes between the bulk and scRNA seq experiments is shown in Fig. 2. Pearson correlation coefficients (r) were calculated for the three plasmids separately. The genes of the pRepCap plasmid had a high correlation between the bulk and scRNA seq experiments. The genes of the pHelper and pGOI plasmids had, however, a less convincing correlation. This was mostly driven by the few genes in the pRepCap plasmids and by a few outliers in the pGOI plasmid, particularly for the adherent samples. An overall r value of > 0.7 was found for all plasmids for both adherent and suspension samples, demonstrating a good overall correlation of plasmid expression between the two experiments.Fig. 2Correlation between bulk and single-cell normalized counts of all plasmid genes obtained during RNA sequencing expressed in the adherent (a) and suspension (b) cells at 12 hpt. The genes are grouped by plasmid of origin: pGOI (blue), pHelper (red), pRepCap (green). The Pearson correlation coefficients (r) are displayed for each group as well as the combined plasmid expression (black).Full size imageHeterogeneity observed in the plasmid gene expressionIndividual clustering of the two subsets (adherent and suspension) resulted in four distinct clusters in both cases, seen in Figure S2. Cluster 3 for both subsets was found to have high mitochondrial gene expression, potentially indicating dead or dying cells, and was therefore excluded from the rest of the analysis. Using the plasmid genes as a signature, we calculated a module score and classified cells with scores at or above the 95th percentile as having high plasmid expression, the cutoffs of which are shown in Figure S3. The highest plasmid gene expression was observed in cluster 1 for both subsets, shown in Fig. 3. Clusters 0 and 3 had comparably low plasmid gene expression for both subsets and the plasmid gene expression of cluster 2 differed between the subsets. Some cells had no expression of any plasmid genes (2.3% of adherent and 1.4% of the suspension subsets) and were mainly found in cluster 0, as shown in Figure S4. Other cells expressed all plasmid genes (48% of adherent and 63.9% of the suspension subsets) and were ubiquitous in all clusters, also exhibited by Figure S4.Thus, it was concluded that cluster 1 contained cells with high plasmid gene expression and cluster 2 contained cells with low plasmid gene expression. Cluster 0 had low or no plasmid gene expression, likely comprising of “progenitor cells” to the other clusters.Fig. 3Violin plot of adherent (a) and suspension (b) subsets of the single-cell RNA sequencing data showing the module score for plasmid gene expression of each cluster. UMAP plot highlighting the plasmid gene expression score in the adherent (c) and suspension (d) subsets.Full size imageA large percentage of the cells (46%) were missing gene expression of at least one plasmid, as shown in Fig. 4. The percentage of cells missing plasmid gene expression decreased over time and was generally lower in samples taken at 48 hpt, especially for the suspension subset. This indicated that not all cells are expressing genes to a detectable level from all plasmids at 12 hpt. A marginal increase over time was also observed for cells with high plasmid gene expression for both subsets. The adherent and suspension samples had comparable levels of high plasmid gene expression, 5% and 8%, respectively.Fig. 4Plasmid gene expression of the cells from the single-cell RNA sequencing experiment. Plasmid expression is attributed to a cell if at least 1 transcript of any gene from that plasmid exists in that cell. Cells expressing all plasmid genes (olive green) and cells with high plasmid gene expression (navy blue) are also shown. To the left, the plasmid gene expression of all samples combined is displayed.Full size imageTo further investigate this observed heterogeneity in plasmid gene expression, intracellular rAAV capsids were stained using a nanobody specific to assembled rAAV9 capsids. This revealed that only a small percentage of cells (2.5%) from the 48 hpt suspension subset had intracellular assembled capsids to any large extent, shown in Fig. 5.Fig. 5Flow cytometry histogram with suspension cells at 48 hpt stained for intracellular assembled rAAV capsids and filtered from debris and multiplets. The negative control comprises stained untransfected cells. The positive gate was set based on the negative control.Full size imageDifferent cellular trajectories determine the plasmid gene expressionTrajectory inference was performed on the subsets to reveal the progression of cells toward different states. Using cluster 0 as the starting cluster, three lineages (α, β, γ) were identified, Fig. 6. The distance traveled from the beginning of the lineage is denoted as pseudotime. These lineages show that the progenitor cells, cluster 0, could either progress into cells with high plasmid gene expression, cluster 1, following the α lineage or into cells with low plasmid gene expression, cluster 2, following the β lineage. The γ lineage leading to cluster 3 was excluded from the analysis.Fig. 6UMAP plot with the three lineages (α, β, γ) for the adherent (a) and suspension (b) subsets of the single-cell RNA sequencing data generated from the trajectory inference. The cells are colored by cluster: 0 (red), 1 (green), 2 (blue), 3 (purple).Full size imageFig. 7Heatmap plot of the relative expression levels (log2 normalized z-score) of differentially expressed genes associated with the trajectory of suspension lineages towards low (a) or high (b) plasmid gene expression. The genes are grouped based on expression pattern over pseudotime. The group labelled − 1 shows genes that were not successfully grouped.Full size imageDifferential expression and overrepresentation analysis on the lineages were performed to identify factors that determined cellular fate. For each of the lineages, genes were grouped based on their expression over pseudotime, as shown in Figs. 7 and S5 for suspension and adherent cells. The number of groups identified varied between 9 and 16 for the lineages of the adherent and suspension subsets. Some of the genes could not be merged to any of the groups and thus formed their own grouping of unmatched genes (denoted − 1). Notably, some groups had cycles of gene expression over pseudotime. Significant GO terms were determined for the gene groups and ordered based on their expression over pseudotime, shown in Figs. 8 and S6. The full list of significant GO terms can be found in Tables S3 – S6.Fig. 8Results from the overrepresentation analysis with Gene Ontology biological processes of groups of differentially expressed genes associated with the suspension lineages towards low (a) or high (b) plasmid gene expression. The genes are grouped based on expression pattern over pseudotime.Full size imageDiscussionTransient transfection for the production of rAAVs suffers from many limitations13, some of which may be caused by cellular heterogeneity43. Heterogeneity is intrinsic to most cellular systems and can exist at different levels, from genomic to proteomic, and the extent of heterogeneity between these levels may vary20,21. A recent study investigated the cell-to-cell variability in a limited population of non-transfected adherent HEK293FT cells using scRNA seq and found only low levels of biological variance, which were primarily due to that the cells were at different stages in the cell cycle44. However, after viral infection and propagation, high heterogeneity of gene expression has been observed45,46. Single-cell transcriptomics could be a powerful tool for revealing the extent and cause of heterogeneity during transient transfection-based rAAV production43. Using single-cell transcriptomics, we were able to identify subpopulations of cells exhibiting high or low rAAV plasmid gene expression and determine the characteristic features for these subpopulations.Samples taken at 12 and 48 hpt were used for scRNA seq to reflect the early and later stages of transfection, as viral replication begins at 12 hpt and the majority of virions are produced after 24 hpt47,48. The normalized expressions of plasmid genes in pseudo bulked single-cell samples and bulk samples generally followed similar dynamics. However, genes of the pGOI plasmid tended to be higher at 12 hpt in single-cell samples than in bulk. Correlation analysis between single-cell and bulk plasmid gene expression resulted in a good correlation for the genes of the pHelper plasmid in both adherent and suspension cell samples. Only three genes were available for comparison for the gene of the pRepCap plasmid, indicating that the observed high negative correlation may lack substantial significance. On the other hand, a low r value for the genes of the pGOI plasmid was obtained, in particular for the samples of adherent cells. A more detailed analysis of the data revealed a significant enhancement of the correlation of genes from the pGOI plasmid for both adherent and suspension cell samples through the omittance of the GOI ORI encoding sequence. In general, a high correlation was found between bulk and single-cell plasmid gene expression. Importantly, the mRNA expression may not reflect the corresponding protein expression for every gene in a cell49,50. However, it can give indications of changes in cellular response to different stimuli as well as changes in expression level of individual proteins over time.Clustering of the scRNA seq data was found to be greatly influenced by the plasmid gene expression, indicating a separation of subpopulations of cells with high or low plasmid gene expression. Both the adherent and suspension subsets acquired a positively skewed distribution of the plasmid gene expression module score, indicating that high plasmid gene expression is further limited to a small subpopulation of cells.AAV production necessitates the successful transfection of all three plasmids, as it would otherwise lead to low titers or incomplete viral packaging12. However, the efficiency of transfection typically suffers from high variability and lack of scalability51. scRNA seq revealed differences in plasmid gene expression between individual cells originating from the same sample and showed that a large proportion of cells lacked expression of at least one plasmid. This proportion was predominantly influenced by phenotype and marginally influenced by time post-transfection, with suspension cells and later time points exhibiting a larger proportion containing all three plasmids.A low proportion of cells containing intracellular assembled capsids was observed in the bulk sample, potentially linked to the low percentage of cells expressing high plasmid gene expression. Dash et al. had similar findings when staining for intracellular assembled capsids, with only 5–10% of transiently-transfected HEK293SF cells expressing assembled rAAV capsids, even with high transfection efficiencies and vector yields52. These findings, combined with the fact that only a small fraction of the suspension cells (8%) and adherent cells (5%) at 48 hpt had high expression of plasmid genes (Fig. 4), indicates that not all the cells were successfully producing rAAVs. This shows that the heterogeneity in plasmid gene expression and the low fraction of cells producing rAAVs are bottlenecks that need to be addressed to improve titers.The clear separation of cells with different plasmid gene expressions during clustering indicated that distinct progression towards different states of plasmid gene expression could be discerned from trajectory inference. Three lineages were identified, all originating from cluster 0 which contained the largest number of cells and displayed low plasmid gene expression. One lineage, lineage β, led to the low-to-moderate plasmid gene expression in cluster 2. Another lineage, lineage α, transitioned into high plasmid gene expression. Genes that were differentially expressed along the lineages were identified and grouped into similar expression patterns, and significant GO terms associated with the groups of genes were determined.The adherent lineage β leading to low plasmid gene expression started with terms associated with the G1 phase of the cell cycle53,54,55,56. Other significant GO terms include canonical Wnt signaling pathway and the regulation of the β-catenin import into the nucleus. The canonical Wnt signaling pathway is responsible for G1- to S-phase progression57. This behavior suggests the progression of the cells from G1 to S to G2/M phase. GO terms associated with the regulation of the immune response and response to incorrect protein folding were upregulated at the end of the lineage.In the lineage α of adherent cells which leads to high plasmid gene expression, GO terms relating to the M and G1/S phases of the cell cycle were cyclically expressed in the lineage, and underwent two cycles of upregulation and downregulation. Positive regulation of transmembrane receptor protein serine/threonine kinase signaling pathway, which is highly associated with the M phase58, was upregulated in the beginning of the lineage, followed by downregulation. Lineage α displayed a high upregulation of ion transmembrane transport at the end, specifically for potassium, which has been linked to G1/S transition59. These findings potentially show the progression of cells from M- to G1- early on and then finally to the S-phase. Similar to the lineage β of adherent cells, GO terms related to the immune response were also upregulated at the end of the lineage.The lineage β of suspension cells leading to low plasmid gene expression showed some cyclical upregulation and downregulation, but most groups were either upregulated at the beginning or the end of the lineage. M- and G1-phase related GO terms such as mitosis, generation of precursor metabolites and cell size regulation were cyclically expressed, and both were downregulated at the end of the lineage. The negative regulation of the cell cycle observed at the end of the lineage could potentially indicate an arrest in the M-phase.Reminiscent of the lineage β of adherent cells, incorrect protein folding response was also observed for the suspension lineage β. The lineages leading to low plasmid gene expression were the only ones displaying incorrect protein folding, suggesting that this could be a potential reason for the low plasmid gene levels. Other omics studies also confirm the association of protein folding with rAAV production15,16,17,18,25. The outcome herein shows that protein folding could be optimized further to prevent limitations caused by incorrectly folded protein.In the lineage α of suspension cells leading to high plasmid gene expression, GO terms related to the M-phase such as mitosis and chromosome segregation were upregulated at early pseudotime values for lineage α. In line with this, an upregulation of GO terms related to the G1-phase such as cell growth and metabolism was observed. The gamma aminobutyric acid signaling pathway upregulated at the end of the lineage has been shown to lead to an accumulation of cells in the S-phase60. These results indicate the transition of cells from M- to G1- to S-phase for the lineage α of suspension cells, a behavior which is also seen in the lineage α of adherent cells.Generally, an earlier immune response was activated for the β lineages leading to low plasmid gene expression. The α lineages leading to high plasmid gene expression also triggered the immune response, although at a much later time point and nearing the end of the lineage. This outcome indicates that the regulation or the delay of the immune response may be a valid strategy for enhancing rAAV productivity.The link between wild type AAV expansion in vivo and the cell cycle has been previously established. AAVs have been found to modulate the host cell cycle to facilitate viral genome replication, which mainly occurs in the S/G2 phase. Indeed, our data demonstrate that the cluster of cells exhibiting high plasmid expression also showed significant upregulation of the S-phase gene signature whereas the G2M-phase signature showed a less pronounced correlation, as shown in Figures S7 and S861. Franzoso et al. showed that AAV2 gene expression occurs exclusively in cells in the S/G2 phases due to the cell cycle dependent expression of the Rep genes62. Rep 78 has been shown to activate caspase-3, which induces apoptosis during the G1 and early S phases of the cell cycle63, as well as to decrease the Cdc25A activity and accumulate pRb, which induces a complete arrest of the cells in the S phase64,65. Additionally, many other genes involved in rAAV production induce entry into the S phase or freeze the cells in the S phase by preventing mitosis, apoptosis such as the E1 and E4 genes66,67. The host cell DNA replication machinery required by DNA viruses for replication is only accessible during the S phase. To make the environment favorable for viral replication, AAVs must induce S-phase entry of infected cells, which they achieve through the virally-encoded proteins that induce S phase entry68.The influence of the cell cycle on rAAV productivity was partly shown by Barnes et al., who showed that modifying the SKA2 and ITPRIP gene expressions, which lead to modulatory effects on the cell cycle increased rAAV genome titers69. Another study by Tworig et al. who performed bulk RNA seq during rAAV production found the upregulation of proliferation inhibitors and downregulation of proliferation enhancers among high AAV producer cell lines, and proposed modulation of the cell cycle as a strategy to improve AAV titers70. Additionally, a study performed by Budge on plasmid DNA delivery during transfection of both CHO and HEK293 cells highlighted that plasmid DNA dispersal and nuclear import are facilitated by cell division, especially for PEI-mediated transfection, and that cell cycle synchronization to maximize cell division enhances transfection efficiency71. The findings in the present study show a strong link between high plasmid gene expression and the cell cycle. Cell cycle synchronization could be used as a strategy for improving plasmid gene expression and thus rAAV production.ConclusionThis study reports the first application of single-cell transcriptomics on transient transfection-based rAAV production with adherent and suspension HEK293T cells. In our effort to characterize the process, we uncovered the substantial heterogeneity in HEK293 cells during production and identified subgroups of cells with high and low rAAV plasmid gene expression.Major differences in plasmid gene expression were observed between the cells, with nearly half of the cells missing expression of genes from at least one plasmid, suggesting that successful plasmid gene expression is a significant bottleneck for the production. Additionally, we determined that a low proportion of cells produce intracellular assembled capsids. Key factors that distinguished cells with high and low plasmid gene expression were identified, with cell cycle being the primary contributor. Reducing the heterogeneity through strategies such as cell cycle synchronization could potentially improve the productivity of rAAV production based on transient transfection of HEK293 cells with producer plasmids. Further study in this direction is ongoing. This study contributes to our biological understanding of the production process as well as emphasizes the need for further omics studies.Data availabilityThe datasets analysed during the current study are available in the Sequencing Read Archive (SRA) with the accession number http://www.ncbi.nlm.nih.gov/bioproject/1236707.AbbreviationsrAAV:recombinant adeno associated virusRNA seq:RNA sequencingscRNA seq:single-cell RNA sequencingPEI:PolyethylenimineGFP:Green fluorescent proteinHPT:Hours post transfectionDMSO:DimethylsulfoxidePBS:Phosphate-buffered salinePBST:Phosphate-buffered saline with 0.1% Tween 20SV40LT:Simian virus 40 large T antigenUMI:Unique molecular identifierADH:AdherentSUP:SuspensionReferencesRobbins, P. D., Tahara, H. & Ghivizzani, S. C. Viral vectors for gene therapy. Trends Biotechnol. 16 (1), 35–40 (1998).Article  CAS  PubMed  Google Scholar Wang, D., Tai, P. W. & Gao, G. Adeno-associated virus vector as a platform for gene therapy delivery. Nat. Rev. Drug Discovery. 18 (5), 358–378 (2019).Article  CAS  PubMed  Google Scholar Sha, S. et al. Cellular pathways of Recombinant adeno-associated virus production for gene therapy. Biotechnol. Adv. 49, 107764 (2021).Article  CAS  PubMed  Google Scholar Li, C. & Samulski, R. J. Engineering adeno-associated virus vectors for gene therapy. Nat. Rev. Genet. 21 (4), 255–272 (2020).Article  CAS  PubMed  Google Scholar Daya, S. & Berns, K. I. Gene therapy using adeno-associated virus vectors. Clin. Microbiol. Rev. 21 (4), 583–593 (2008).Article  CAS  PubMed  PubMed Central  Google Scholar Destro, F. et al. The state of technological advancement to address challenges in the manufacture of rAAV gene therapies. Biotechnol. Adv. :108433. (2024).Wang, J-H., Gessler, D. J., Zhan, W., Gallagher, T. L. & Gao, G. Adeno-associated virus as a delivery vector for gene therapy of human diseases. Signal. Transduct. Target. Therapy. 9 (1), 78 (2024).Article  CAS  Google Scholar Smith, J., Grieger, J. & Samulski, R. J. Overcoming bottlenecks in AAV manufacturing for gene therapy. Cell. Gene Therapy Insights. 4 (8), 815–825 (2018).Article  Google Scholar Kliegman, M. et al. A roadmap for affordable genetic medicines. Nature 1, 8 (2024).Google Scholar Wang, Y. et al. Decoding cellular mechanism of recombinant adeno-associated virus (rAAV) and engineering host-cell factories toward intensified viral vector manufacturing. Biotechnol. Adv. 108322. (2024).Naso, M. F., Tomkowicz, B., Perry, W. L. III & Strohl, W. R. Adeno-associated virus (AAV) as a vector for gene therapy. BioDrugs 31 (4), 317–334 (2017).Article  CAS  PubMed  PubMed Central  Google Scholar Srivastava, A., Mallela, K. M., Deorkar, N. & Brophy, G. Manufacturing challenges and rational formulation development for AAV viral vectors. J. Pharm. Sci. 110 (7), 2609–2624 (2021).Article  CAS  PubMed  Google Scholar Fu, Q., Polanco, A., Lee, Y. S. & Yoon, S. Critical challenges and advances in recombinant adeno-associated virus (rAAV) biomanufacturing. Biotechnol. Bioeng. 120 (9), 2601–2621 (2023).Article  CAS  PubMed  Google Scholar Zhao, H. et al. Creation of a high-yield AAV vector production platform in suspension cells using a design-of-experiment approach. Mol. Therapy Methods Clin. Dev. 18, 312–320 (2020).Article  CAS  Google Scholar Wang, Y., Fu, Q., Lee, Y. S., Sha, S. & Yoon, S. Transcriptomic features reveal molecular signatures associated with recombinant adeno-associated virus production in HEK293 cells. Biotechnol. Prog. 39 (4), e3346 (2023).Article  CAS  PubMed  Google Scholar Virgolini, N. et al. Transcriptome analysis of Sf9 insect cells during production of recombinant Adeno-associated virus. Biotechnol. J. 18 (2), 2200466 (2023).Article  CAS  Google Scholar Chung, C-H. et al. Production of rAAV by plasmid transfection induces antiviral and inflammatory responses in suspension HEK293 cells. Mol. Therapy Methods Clin. Dev. 28, 272–283 (2023).Article  CAS  Google Scholar Lu, M., Lee, Z. & Hu, W. S. Multi-omics kinetic analysis of Recombinant adeno‐associated virus production by plasmid transfection of HEK293 cells. Biotechnol. Prog. 40 (2), e3428 (2024).Article  CAS  PubMed  Google Scholar Hwang, B., Lee, J. H. & Bang, D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp. Mol. Med. 50 (8), 1–14 (2018).Article  CAS  PubMed  PubMed Central  Google Scholar Goldman, S. L. et al. The impact of heterogeneity on single-cell sequencing. Front. Genet. 10, 8 (2019).Article  CAS  PubMed  PubMed Central  Google Scholar Altschuler, S. J. & Wu, L. F. Cellular heterogeneity: do differences make a difference? Cell 141 (4), 559–563 (2010).Article  CAS  PubMed  PubMed Central  Google Scholar Kolodziejczyk, A. A., Kim, J. K., Svensson, V., Marioni, J. C. & Teichmann, S. A. The technology and biology of single-cell RNA sequencing. Mol. Cell. 58 (4), 610–620 (2015).Article  CAS  PubMed  Google Scholar Choi, Y. H. & Kim, J. K. Dissecting cellular heterogeneity using single-cell RNA sequencing. Mol. Cells. 42 (3), 189–199 (2019).CAS  PubMed  PubMed Central  Google Scholar Jovic, D. et al. Single-cell RNA sequencing technologies and applications: A brief overview. Clin. Translational Med. 12 (3), e694 (2022).Article  CAS  Google Scholar Virgolini, N. et al. Impact of dual-baculovirus infection on the Sf9 insect cell transcriptome during rAAV production using single‐cell RNA‐seq. Biotechnol. Bioeng. 120 (9), 2588–2600 (2023).Article  CAS  PubMed  Google Scholar Ladd, B., Bowes, K., Lundgren, M., Gräslund, T. & Chotteau, V. Proof-of-concept of continuous transfection for adeno-associated virus production in microcarrier-based culture. Processes 10 (3), 515 (2022).Article  Google Scholar Sayers, E. W. et al. Database resources of the National center for biotechnology information. Nucleic Acids Res. 50 (D1), D20–D6 (2022).Article  CAS  PubMed  Google Scholar Pertea, G. & Pertea, M. GFF utilities: GffRead and gffcompare. F1000Research ;9. (2020).Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29 (1), 15–21 (2013).Article  CAS  PubMed  Google Scholar Hao, Y. et al. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat. Biotechnol. 42 (2), 293–304 (2024).Article  CAS  PubMed  Google Scholar Patterson-Cross, R. B., Levine, A. J. & Menon, V. Selecting single cell clustering parameter values using subsampling-based robustness metrics. BMC Bioinform. 22, 1–13 (2021).Article  Google Scholar Grabski, I. N., Street, K. & Irizarry, R. A. Significance analysis for clustering with single-cell RNA-sequencing data. Nat. Methods. 20 (8), 1196–1202 (2023).Article  CAS  PubMed  PubMed Central  Google Scholar Street, K. et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genom. 19, 1–16 (2018).Article  Google Scholar Van den Berge, K. et al. Trajectory-based differential expression analysis for single-cell sequencing data. Nat. Commun. 11 (1), 1201 (2020).Article  ADS  PubMed  PubMed Central  Google Scholar Amezquita, R. A. et al. Orchestrating single-cell analysis with bioconductor. Nat. Methods. 17 (2), 137–145 (2020).Article  CAS  PubMed  Google Scholar Risso, D. et al. ClusterExperiment and RSEC: A bioconductor package and framework for clustering of single-cell and other large gene expression datasets. PLoS Comput. Biol. 14 (9), e1006378 (2018).Article  PubMed  PubMed Central  Google Scholar Wu, T. et al. ClusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innov. ;2(3). (2021).Kuleshov, M. V. et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 44 (W1), W90–W7 (2016).Article  CAS  PubMed  PubMed Central  Google Scholar Andrews, S. FastQC: a quality control tool for high throughput sequence data. [ (2017).Ewels, P., Magnusson, M., Lundin, S. & Käller, M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32 (19), 3047–3048 (2016).Article  CAS  PubMed  PubMed Central  Google Scholar Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34 (5), 525–527 (2016).Article  CAS  PubMed  Google Scholar Love, M. I., Huber, W. & Anders, S. Moderated Estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 1–21 (2014).Article  Google Scholar Gurazada, S. G. R. et al. HEK-omics: the promise of omics to optimize HEK293 for Recombinant adeno-associated virus (rAAV) gene therapy manufacturing. Biotechnol. Adv. :108506. (2024).Borsi, G. et al. Single-cell RNA sequencing reveals homogeneous transcriptome patterns and low variance in a suspension CHO-K1 and an adherent HEK293FT cell line in culture conditions. J. Biotechnol. 364, 13–22 (2023).Article  CAS  PubMed  Google Scholar Jones, J. E., Le Sage, V. & Lakdawala, S. S. Viral and host heterogeneity and their effects on the viral life cycle. Nat. Rev. Microbiol. 19 (4), 272–282 (2021).Article  CAS  PubMed  Google Scholar Swaminath, S. & Russell, A. B. The use of single-cell RNA-seq to study heterogeneity at varying levels of virus–host interactions. PLoS Pathog. 20 (1), e1011898 (2024).Article  CAS  PubMed  PubMed Central  Google Scholar Nguyen, T. N. et al. Mechanistic model for production of Recombinant adeno-associated virus via triple transfection of HEK293 cells. Mol. Therapy Methods Clin. Dev. 21, 642–655 (2021).Article  CAS  Google Scholar Chahal, P. S., Schulze, E., Tran, R., Montes, J. & Kamen, A. A. Production of adeno-associated virus (AAV) serotypes by transient transfection of HEK293 cell suspension cultures for gene delivery. J. Virol. Methods. 196, 163–173 (2014).Article  CAS  PubMed  Google Scholar Maier, T., Güell, M. & Serrano, L. Correlation of mRNA and protein in complex biological samples. FEBS Lett. 583 (24), 3966–3973 (2009).Article  CAS  PubMed  Google Scholar Edfors, F. et al. Gene-specific correlation of RNA and protein levels in human cells and tissues. Mol. Syst. Biol. 12 (10), 883 (2016).Article  PubMed  PubMed Central  Google Scholar van der Loo, J. C. M. & Wright, J. F. Progress and challenges in viral vector manufacturing. Hum. Mol. Genet. 25 (R1), R42–R52 (2015).PubMed  PubMed Central  Google Scholar Dash, S., Sharon, D. M., Mullick, A. & Kamen, A. A. Only a small fraction of cells produce assembled capsids during transfection-based manufacturing of adeno‐associated virus vectors. Biotechnol. Bioeng. 119 (6), 1685–1690 (2022).Article  CAS  PubMed  PubMed Central  Google Scholar Calder, A. et al. Lengthened G1 phase indicates differentiation status in human embryonic stem cells. Stem. Cells. Dev. 22 (2), 279–295 (2013).Article  CAS  PubMed  Google Scholar Ter Huurne, M. & Stunnenberg, H. G. G1-phase progression in pluripotent stem cells. Cell. Mol. Life Sci. 78 (10), 4507–4519 (2021).Article  CAS  PubMed  PubMed Central  Google Scholar Soufi, A. & Dalton, S. Cycling through developmental decisions: how cell cycle dynamics control pluripotency, differentiation and reprogramming. Development 143 (23), 4301–4311 (2016).Article  CAS  PubMed  PubMed Central  Google Scholar Howell, A. S. & Lew, D. J. Morphogenesis and the cell cycle. Genetics 190 (1), 51–77 (2012).Article  CAS  PubMed  PubMed Central  Google Scholar Niehrs, C. & Acebron, S. P. Mitotic and mitogenic Wnt signalling. EMBO J. 31 (12), 2705–2713 (2012).Article  CAS  PubMed  PubMed Central  Google Scholar Yang, S-C., Huang, C-H., Chen, N-J., Chou, C-K. & Lin, C-H. Functional implication of human serine/threonine kinase, hAIK, in cell cycle progression. J. Biomed. Sci. 7 (6), 484–493 (2000).Article  CAS  PubMed  Google Scholar Rosendo-Pineda, M. J., Moreno, C. M. & Vaca, L. Role of ion channels during cell division. Cell. Calcium. 91, 102258 (2020).Article  CAS  PubMed  Google Scholar Wang, D. D., Kriegstein, A. R. & Ben-Ari, Y. GABA regulates stem cell proliferation before nervous system formation. Epilepsy Currents. 8 (5), 137–139 (2008).Article  PubMed  PubMed Central  Google Scholar Winocour, E., Callaham, M. & Huberman, E. Perturbation of the cell cycle by adeno-associated virus. Virology 167 (2), 393–399 (1988).CAS  PubMed  Google Scholar Franzoso, F. D. et al. Cell cycle-dependent expression of adeno-associated virus 2 (AAV2) rep in coinfections with herpes simplex virus 1 (HSV-1) gives rise to a mosaic of cells replicating either AAV2 or HSV-1. J. Virol. 91 (15). https://doi.org/10.1128/jvi (2017). 00357 – 17.Schmidt, M., Afione, S. & Kotin, R. M. Adeno-associated virus type 2 Rep78 induces apoptosis through caspase activation independently of p53. J. Virol. 74 (20), 9441–9450 (2000).Article  CAS  PubMed  PubMed Central  Google Scholar Berthet, C., Raj, K., Saudan, P. & Beard, P. How adeno-associated virus Rep78 protein arrests cells completely in S phase. Proc. Natl. Acad. Sci. 102 (38), 13634–13639 (2005).Article  ADS  CAS  PubMed  PubMed Central  Google Scholar Saudan, P., Vlach, J. & Beard, P. Inhibition of S-phase progression by adeno‐associated virus Rep78 protein is mediated by hypophosphorylated pRb. EMBO J. (2000).Nash, K., Chen, W., Salganik, M. & Muzyczka, N. Identification of cellular proteins that interact with the adeno-associated virus rep protein. J. Virol. 83 (1), 454–469 (2009).Article  CAS  PubMed  Google Scholar Ben-Israel, H. & Kleinberger, T. Adenovirus and cell cycle control. Front. Biosci. 7, d1369–d95 (2002).Article  CAS  PubMed  Google Scholar Bagga, S. & Bouchard, M. J. Cell cycle regulation during viral infection. Cell. Cycle Control: Mech. Protocols :165–227. (2014).Barnes, C. R. et al. Genome-wide activation screens to increase adeno-associated virus production. Mol. Therapy-Nucleic Acids. 26, 94–103 (2021).Article  CAS  Google Scholar Tworig, J. et al. Transcriptomics-informed Pharmacology identifies epigenetic and cell cycle regulators that enhance AAV production. Mol. Therapy Methods Clin. Dev. 32(4). (2024).Budge, J. D. Mapping cellular processes that determine delivery of plasmid DNA to the nucleus: application in Chinese hamster ovary and human embryonic kidney cells to enhance protein production. Front. Bioeng. Biotechnol. 13, 1466671 (2025).Article  PubMed  PubMed Central  Google Scholar Download referencesFundingOpen access funding provided by Royal Institute of Technology.Author informationAuthor notesBrian Ladd and Sofia Tunmats contributed equally to this work.Authors and AffiliationsAdBIOPRO Competence Centre for Advanced Bioproduction, Stockholm, SwedenBrian Ladd, Sofia Tunmats, Torbjörn Gräslund, Johan Rockberg & Véronique ChotteauDepartment of Industrial Biotechnology, KTH Royal Institute of Technology, Stockholm, SwedenBrian Ladd, Sofia Tunmats & Véronique ChotteauCell Culture and Fermentation Sciences, Biopharmaceuticals R&D, AstraZeneca, Cambridge, UKOlalekan DaramolaDepartment of Protein Science, KTH Royal Institute of Technology, Stockholm, SwedenTorbjörn Gräslund & Johan RockbergAuthorsBrian LaddView author publicationsSearch author on:PubMed Google ScholarSofia TunmatsView author publicationsSearch author on:PubMed Google ScholarTorbjörn GräslundView author publicationsSearch author on:PubMed Google ScholarOlalekan DaramolaView author publicationsSearch author on:PubMed Google ScholarJohan RockbergView author publicationsSearch author on:PubMed Google ScholarVéronique ChotteauView author publicationsSearch author on:PubMed Google ScholarContributionsConceptualization, B.L., V.C. Methodology, B.L., S.T. Software, B.L., S.T.Data curation, B.L., S.T.Investigation, B.L., S.T.Formal analysis, B.L., S.T.Supervision, V.C., T.G.Funding acquisition, V.C., T.G.Visualization, B.L., S.T.Project administration, V.C., T.G.Resources, V.C., T.G., J.R.Writing—original draft preparation, S.T., B.L.Writing—review and editing, S.T., B.L., V.C., T.G., O.D.Corresponding authorCorrespondence to Véronique Chotteau.Ethics declarationsCompeting interestsThe authors declare no competing interests.Additional informationPublisher’s noteSpringer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.Supplementary InformationBelow is the link to the electronic supplementary material.Supplementary Material 1Rights and permissionsOpen Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.Reprints and permissionsAbout this article