scRepli-RamDA-seq: a multi-omics technology enabling the analysis of gene expression dynamics during S-phase

Wait 5 sec.

IntroductionA number of single-cell sequencing methods have been developed that simultaneously profile DNA and RNA from the same cell1,2,3,4,5,6,7,8,9,10,11. Despite the advancements in methodology, improvement in the resolution and sensitivity of both DNA sequencing (DNA-seq) and RNA sequencing (RNA-seq) components has been limited. Most methods typically use the DNA-seq component to detect DNA copy-number variation (CNV) at a relatively low resolution of 0.25–1.0 Mb, with few utilizing DNA-seq data for detecting DNA sequence variants. This limitation is influenced by several factors, including the low genome coverage, possibly due to sample loss during the experiment, biases introduced during whole genome amplification (WGA), and/or insufficient sequencing depth12. Similarly, the RNA-seq component also has room for improvement. For instance, most current methods primarily capture polyadenylated [poly(A)] RNAs but not non-poly(A) RNAs, although the latter play important roles in various contexts. Moreover, current methods could detect roughly  1. f The ability to detect known non-poly(A) genes was comparable across the three methods tested. Each row represents a non-poly(A) gene. Each column represents a sample derived from the indicated method. Gene expression levels [log10(TPM + 1)] are shown using the color scale in the figure. Source data are provided as a Source Data file.Full size imageWe subjected the isolated DNA on magnetic beads to WGA and next-generation sequencing (NGS) library preparation as in scRepli-seq13. In principle, scRepli-seq utilizes the degenerate oligonucleotide priming-PCR (DOP-PCR) technology, which provides relatively uniform WGA and good coverage13,17. The DNA-seq (scRepli-seq) component was used to assess DNA copy numbers genome-wide to karyotype G1-phase cells and analyze the replication state of S-phase cells. In parallel, the isolated RNA in the supernatant was subjected to scRamDA-seq14. This is achieved by using the not-so-random primers and the strand displacement amplification technique to amplify cDNA directly from the RNA template, which enables the detection of near full-length RNA, including non-poly(A) RNA, and results in higher detection rates of transcripts compared to other scRNA-seq methods14.We established two slightly different versions of the scRR-seq protocol, scRR1 and scRR3, using initial cell lysis buffer volumes of 1 µl and 3 µl, respectively. These two versions were developed in parallel, as each offers distinct advantages depending on the experimental context. scRR1 is more cost-effective and well-suited for large-scale experiments where sample preparation is straightforward and a certain degree of failure is acceptable, such as when single cells are sorted by flow cytometry. In contrast, scRR3 is more appropriate for manually isolated samples that are difficult to collect, have larger carryover volumes, or are otherwise precious and require a high success rate (Supplementary Note 1, Supplementary Table 1, and Supplementary Fig. 2a).After cell lysis, DNA on beads was processed similarly in both scRR1 and scRR3 based on the scRepli-seq protocol. In contrast, RNA in the lysis buffer was handled differently. For scRR3, we followed the protocol described in the kit (GenNext®RamDA-seq™ Single Cell Kit) with several modifications for the scRamDA-seq reaction. For scRR1, we used a smaller volume (1/3× of scRR3) for the scRamDA-seq reaction, which allowed us to skip the cDNA purification step and shorten the scRamDA-seq procedure (Supplementary Fig. 2a; see “Methods”).scRR-seq generates high-resolution DNA replication timing dataTo benchmark scRR-seq, we performed scRR1 and scRR3 using a near-diploid human cell line, hTERT-RPE1. We used fluorescence-activated cell sorting (FACS) to sort G1 and mid-S single cells into 0.2-ml tubes (Supplementary Fig. 2b) and processed them using our scRR-seq protocol. After filtering out cells that had failed the quality control (QC) test for library preparation (either DNA or RNA; see “Methods”), cells were subjected to NGS and downstream analysis. In addition, low-quality data were further excluded following the NGS analysis (see “Methods”; Supplementary Data 1).We first analyzed the DNA-seq data from scRR-seq (scRR-seq-DNA hereafter) and compared them with our original scRepli-seq data obtained individually13. To construct replication timing (RT) profiles from scRR-seq-DNA, similar numbers of NGS reads per sample [~4 million (M) reads] were generated and analyzed using our existing analysis pipeline13,18. The number of reads mapped to the human reference genome before and after filtering was similar for all three methods tested (scRR1, scRR3, and scRepli-seq; Supplementary Fig. 3a). By analyzing scRR-seq-DNA from G1 cells, we could observe the known and expected CNV on the q-arm of chromosome 10 in hTERT-RPE119 (Supplementary Fig. 3b).Because the WGA method for scRepli-seq provides relatively uniform amplification and sufficient coverage, we can assign replicated or unreplicated states to all genomic bins of S-phase cells after correcting mappability using control G1 cells13. We generated single-cell RT (scRT) profiles from scRR-seq-DNA for each S-phase cell at 40 or 80-kb resolution. In our hands, 40 kb is the highest resolution based on the optimal reads-per-bin analysis (Supplementary Fig. 3c). The scRT profiles derived from scRR-seq-DNA looked similar between 40 and 80-kb resolution data and were comparable to individually obtained scRepli-seq data (Fig. 1b and Supplementary Fig. 3d–f).scRR-seq generates high-quality RNA-seq data comparable to scRamDA-seqNext, RNA-seq data from scRR-seq (scRR-seq-RNA hereafter) were analyzed and compared with scRamDA-seq data generated individually. Again, we obtained similar numbers of NGS reads per sample (~4 M reads) and analyzed them using the ramdaq analysis pipeline20. The number of reads mapped to the reference human genome was relatively similar for all methods (Supplementary Fig. 4a). We also confirmed that DNA/RNA cross-contamination is minimal in scRR-seq-RNA as in scRamDA-seq (Supplementary Fig. 4b). Correlation between External RNA Controls Consortium (ERCC) spike-in counts and their expected concentration were high (R > 0.8) and consistent in all samples (Supplementary Fig. 4c). The whole transcriptome-level comparison revealed a high correlation between scRR-seq-RNA and scRamDA-seq data (R > 0.8, Supplementary Fig. 4d). While both principal component analysis (PCA) and t-Distributed Stochastic Neighbor Embedding analyses revealed some degrees of differences across the three different protocols tested (Supplementary Fig. 4e, f), we observed minimal batch effects in our experiments (Supplementary Fig. 4e, f; see scRR3 batches 1 and 2), consistent with our previous report on scRamDA-seq14. Nonetheless, scRR-seq protocols showed high coverage across the entire transcript length, including long transcripts, similar to scRamDA-seq (Fig. 1c, d).Our scRR-seq-RNA detected ~80% of genes and transcripts that were detected by the original scRamDA-seq (Fig. 1e and Supplementary Fig. 4g). scRR-seq-RNA also detected well-known non-poly(A) RNAs as efficiently as scRamDA-seq (Fig. 1f). The remaining ~20% not detected by scRR-seq-RNA were those that show low expression levels by scRamDA-seq and frequently corresponded to long non-coding (lnc)RNAs (Supplementary Fig. 5a–c). Using published RNA localization data21, we found that these undetected RNAs tend to be nuclear-localized (Supplementary Fig. 5d, e). This is consistent with the fact that, in scRR-seq-RNA, RNA was collected from the supernatant, which mainly corresponds to the cytoplasmic fraction. We further confirmed that a subset of RNAs, enriched for lncRNAs, remained bound to the beads (Supplementary Fig. 6). These results suggest that RNA abundance and subcellular localization influence detection sensitivity in scRR-seq-RNA. This could potentially explain the decrease in the number of genes detected compared to scRamDA-seq. Overall, however, scRR-seq-RNA data were comparable to scRamDA-seq data obtained individually.scRR-seq outperforms G&T-seqWe next asked how scRR-seq compared with G&T-seq4, a representative single-cell DNA/RNA-seq method that integrates PicoPlex for WGA and DNA analysis and Smart-seq2 for RNA profiling. To this end, we generated scRR-seq (scRR1 and scRR3) data from 8-cell stage mouse embryos and compared them with publicly available 8-cell embryo G&T-seq data (Supplementary Data 1). First, we focused on the RNA-seq component. We adjusted the NGS reads of our scRR-seq-RNA to ~3 M reads/sample to match the G&T-seq RNA-seq data (G&T-seq-RNA hereafter). Then, we evaluated these scRNA-seq data by comparison with bulk RNA-seq data from preimplantation embryos22. Using all genes for the analysis, 8-cell scRR-seq-RNA samples clustered more closely with bulk RNA-seq data of 8-cell embryos than 8-cell G&T-seq-RNA samples (Fig. 2a and Supplementary Fig. 7a).Fig. 2: Comparison of scRR-seq and G&T-seq using mouse embryos.a PCA of all genes. Bulk RNA-seq data of mouse preimplantation embryos22 were compared with scRR-seq and G&T-seq data4. b Detection rates of transcripts and genes expressed in 8-cell embryos across different expression level thresholds for each method. The values at the bottom show the mean number of transcripts or genes with SDs at TPM > 1. G&T-seq data were divided into two groups based on initial read counts. G&T-seq(H) includes seven samples with read counts comparable to or higher than those from scRR-seq-DNA, while G&T-seq(L) includes seven samples with lower initial read counts than scRR-seq-DNA. The line and color-shaded areas represent means and SDs, respectively. c Number of overlapping genes expressed in 8-cell embryos among the four methods shown (TPM > 1). d Comparison of scRT profiles of 8-cell embryos derived from different methods. The top panel shows whole-S scRT profiles of 8-cell embryos obtained via scRepli-seq16. A 100-Mb region of chromosome 2 (chr2) is shown. Green and red boxes indicate the subsets of cells used for comparison with scRR-seq-DNA and G&T-seq-DNA4, respectively. Yellow and blue represent unreplicated and replicated bins, respectively; white represents bins with no data. Source data are provided as a Source Data file.Full size imageWhile the overall mapping rates were higher for G&T-seq-RNA (Supplementary Fig. 7b), scRR-seq-RNA, particularly scRR3, detected ~10–20% more transcripts and genes with transcripts per million (TPM) > 1 (Fig. 2b), suggesting that scRR-seq-RNA can capture more alternative splicing or isoforms than G&T-seq-RNA. This is probably because scRR-seq-RNA theoretically captures all total RNA, including unspliced forms, while G&T-seq-RNA captures only poly(A) RNAs (Supplementary Fig. 7c). Expressed genes (TPM > 1) in scRR-seq-RNA overlapped more with bulk RNA-seq data than G&T-seq-RNA (Fig. 2c). In addition, the expression levels of well-known non-poly(A) transcripts and histone genes were also higher in scRR-seq-RNA than G&T-seq-RNA and bulk RNA-seq data (Supplementary Fig. 7d). When analyzing embryos, we found that scRR1 showed a bias towards the 3′ end of transcripts (Supplementary Fig. 7e) and detected fewer transcripts than scRR3 (Fig. 2b), which was not the case with cultured single cells collected by FACS (Supplementary Note 1). Nonetheless, scRR-seq overall showed superior sensitivity over G&T-seq for scRNA-seq analysis.Next, we compared scRR-seq-DNA with DNA-seq data from G&T-seq (G&T-seq-DNA hereafter) for their DNA detection capacity and sensitivity. Again, we adjusted the NGS reads of our scRR-seq-DNA to match those of the G&T-seq-DNA (~ 4 M reads/sample). We found that scRR-seq-DNA exhibited a ~ 50% mapping rate to the genome, while G&T-seq-DNA showed only ~10% (Supplementary Fig. 7f; after removing duplicates). As a result, the genome coverage of scRR-seq-DNA was ~3.2 times higher than that of G&T-seq-DNA (5.1% ± 0.6% and 1.6% ± 0.6%, respectively (mean ± SD); Supplementary Data 1). The lower WGA efficiency observed in G&T-seq may result from DNA loss during the DNA/RNA separation procedure, which involves more steps than scRR-seq.We also observed significantly higher median-absolute-deviation (MAD) scores for G&T-seq-DNA (Supplementary Fig. 7g). The MAD scores are routinely used to evaluate the uniformity of DNA-seq data13, where high MAD scores suggest a high level of noise in the data distribution. These observations make scRR-seq a better option for scDNA-seq and scRT analysis.We obtained G1-like and G2/M-like cells from 8-cell embryos (Supplementary Fig. 8a), which were used to correct the mappability of S-phase cells (Supplementary Fig. 8b). We successfully generated scRT profiles of 8-cell embryos from scRR-seq-DNA at 40-kb resolution, which was challenging for G&T-seq (Fig. 2d), possibly due to its low coverage (Supplementary Fig. 7f and Supplementary Data 1). Taken together, scRR-seq showed enhanced resolution and sensitivity compared to G&T-seq for both DNA-seq and RNA-seq, especially the former.scRR-seq captures gene expression dynamics and markers during S-phase progressionAs scRT data allows one to tell the S-phase timepoint of each cell (Supplementary Fig. 9), we reasoned that scRR-seq data should reveal gene expression dynamics during S-phase progression at an unprecedented temporal resolution and may help identify S-phase progression markers, which have never been reported before. Therefore, we generated scRR-seq data from hTERT-RPE1 cells and CBMS1 mESCs throughout the S-phase. First, we used scRR-seq-DNA to generate a whole-S RT profile, which showed that we successfully captured the entire S-phase of hTERT-RPE1 cells (Fig. 3a) and CBMS1 mESCs (Fig. 3b and Supplementary Fig. 10). The whole-S RT profile of CBMS1 mESCs closely resembled the profile obtained previously by scRepli-seq13 (Supplementary Fig. 11a). Then, we used our hTERT-RPE1 and CBMS1 mESC scRR-seq-RNA data to find S-phase progression marker candidates. We sorted cells based on their percentage replication scores and searched for genes that showed significant gene expression changes during S-phase progression. Using the generalized additive model (GAM) fitting, we identified 52 significant dynamic genes (FDR  10 in more than 50% of cells in scRR-seq-RNA data] from scRR-seq-RNA data (see “Methods”, Supplementary Fig. 13d and Supplementary Data 3). Monoallelically-expressed genes included Peg3, Peg10, Snrpn, and Rian, which are known mouse imprinted genes28,29. Other known imprinted genes, such as Igf2r and Slc38a4, showed biallelic expression (Supplementary Fig. 13e and Supplementary Data 3).Interestingly, monoallelic genes and genes with allelic expression imbalance showed a positive correlation with asynchronous RT in CBMS1 mESCs. That is, CBA-allele specific monoallelic genes or genes with biased expression toward the CBA allele exhibited earlier RT on the CBA loci, and vice versa was true for the MSM loci (Fig. 4f). For instance, Peg3 and Snrpn were primarily expressed from the MSM loci, and consistently, scRT profiles showed earlier replication of the MSM loci for these genes (Supplementary Fig. 13e, f). Likewise, Rian expression was biased toward the CBA allele, and the CBA Rian locus exhibited earlier RT than the MSM counterpart (Supplementary Fig. 13e, f). Overall, these results demonstrate the feasibility of haplotype-specific analysis by scRR-seq, and our data unequivocally demonstrated the positive correlation between transcription and RT within the same single cell.scRR-seq reveals a limited link between DNA copy number and gene expression levelEarly and late RT correlates with active and inactive transcription, respectively30. However, it remains unclear whether the increase in DNA copy number during S-phase plays any role in this relationship. In prokaryotes, transcription levels tend to increase as DNA is replicated (i.e., upon copy number increase), whereas in eukaryotic cells, transcription levels remain relatively static during S-phase31. However, these observations were made in organisms with small genome sizes or for a few representative genes by imaging or population-based data32,33,34, which were low in resolution and might not reflect reality. To revisit this problem, we leveraged scRR-seq to assess the relationship between DNA replication state and gene expression genome-wide at the single-cell level. The human haploid HAP1 cell line was used to avoid any issues arising from differences between homologous chromosomes.First, we evaluated the relationship between gene expression and RT. To do so, we used scRT profiles and calculated an average mid-S RT profile of HAP1 cells at 40-kb resolution, which resembled the population-based RT profiles from the BrdU-IP experiment (Fig. 5a). We categorized RT domains into three classes based on the average mid-S RT profile. The range from minimum to maximum was equally divided into nine fractions. RT domains with average mid-S RT values higher than the 7th fraction (>0.213), in between the 3rd and 7th fractions, and less than the 3rd fraction ( 0 in more than 50% of all cells were selected as informative genes, which were then separated into two groups based on their DNA copy number. About 58% of spliced and 48% of unspliced RNAs showed higher median expression in the 2-copy gene group, while 36% and 40%, respectively, were higher in the 1-copy group. However, most of these differences were not statistically significant (Fig. 5d and Supplementary Data 4). Only 8.9% and 6.9% of all spliced and unspliced RNAs, respectively, showed significant differences in gene expression levels between the two groups. Consistent with previous studies, our results suggest that changes in DNA copy number during S-phase have a limited impact on gene expression. It is more likely that gene expression is associated with RT and its related chromatin features than copy number changes during S-phase in mammalian cells.scRR-seq can detect CNVs induced by DNA replication stressTo assess the CNV detection sensitivity of scRR-seq, we induced chromosome breaks in a diploid female human fibroblast cell line IMR-90 by exposing them to low-dose aphidicolin, a well-known DNA polymerase alpha inhibitor commonly used to induce DNA replication stress. After 72 h of 0.4 µM aphidicolin treatment, G1 cells were collected and subjected to scRR-seq. As expected, we detected various CNV patterns in aphidicolin-treated but not untreated control cells. Large chromosomal abnormalities (>20 Mb)35 were reliably detected with a resolution of 50 to 500 kb using scRR-seq-DNA, while smaller abnormalities exhibited variability depending on the genomic bin size setting (Fig. 6a). Our data also demonstrates the potential of scRR-seq to detect CNVs smaller than 1000 kb (Supplementary Fig. 15a; at 50 and 100-kb bins).Fig. 6: CNVs in IMR-90 cells treated with low-dose aphidicolin.a Whole genome CNV plots of human IMR-90 cells treated with 0.4 µM aphidicolin for 72 h, compared to untreated control cells shown at the top of each panel. CNVs were analyzed at four bin-size settings: 500 kb, 200 kb, 100 kb, and 50 kb. b Regions 1, 2, 3, and 4 indicated in (a) are representative CNV-affected regions in IMR-90 cells treated with aphidicolin. Each row shows zoomed-in CNV profiles based on 50-kb CNV analysis (CNV panel) and corresponding gene expression levels (gene expression panel) for individual cells. Violin plots display the expression levels of genes [log10(TPM + 1)] located within each region, with the number of analyzed genes indicated at the top. Only genes with TPM > 1 in more than 50% of cells are shown. Black dots represent the median gene expression of genes in each region per cell, while red lines indicate the median gene expression across all cells for the same genes.Full size imageNext, we examined the relationship between CNVs and gene expression. Because aphidicolin inhibits DNA replication and cell-cycle progression, aphidicolin-treated cells naturally exhibited different expression profiles compared to untreated cells (Supplementary Fig. 15b). Therefore, we instead compared gene expression among aphidicolin-treated cells with and without chromosomal abnormalities. We found that while some chromosomal regions with CNVs showed gene expression alteration, others did not (Fig. 6b). For example, a cell with a large 3-copy CNV on chromosome 2 and a cell with a large 1-copy CNV on chromosome 19 showed median gene expression levels comparable to other cells (Fig. 6b, regions 1 and 4). In contrast, cells with a 1-copy CNV on chromosome 3 exhibited reduced median gene expression, whereas cells with a 3-copy CNV in the same region showed increased gene expression (Fig. 6b, see regions 2 and 3). This suggests a complex nature of gene regulation in cells with chromosomal abnormalities and implies that relying solely on RNA-seq is insufficient to determine chromosomal abnormalities. In summary, scRR-seq allows one to not only detect CNVs but also assess the impact of CNV on gene expression in the same cell.DiscussionIn this study, we established a single-cell multi-omics technology that enables high-resolution DNA replication profiling and full-length total RNA sequencing from the same single cell by combining our homemade single-cell technologies, scRepli-seq and scRamDA-seq. scRR-seq can generate DNA-seq and RNA-seq data that are highly comparable to those generated individually by the original methods (Fig. 1) and outperforms other single-cell DNA/RNA-seq methods such as G&T-seq (Fig. 2). scRR-seq also allows haplotype-specific analysis genome-wide at the single-cell level (Fig. 4). In addition, the integration of genome-wide replication and transcription data obtained from scRR-seq led to a much better understanding of the cell-cycle dynamics of gene expression and the identification of S-phase progression markers in human cells and mESCs (Fig. 3). Furthermore, scRR-seq allowed us to explore the precise relationship between DNA replication and transcription genome-wide in individual human cells (Fig. 5). In conclusion, scRR-seq is a powerful methodology that can provide an unprecedented, comprehensive view of DNA replication and gene expression within the same single cell.scRR-seq-DNA can generate a single-cell genome-wide DNA copy number profile. From our benchmarking, scRR-seq-DNA was nearly identical to scRepli-seq performed individually, generating high-resolution DNA-seq data and scRT profiles from single S-phase cells (up to 40 kb in resolution). Since RT correlates well with A/B compartments30,36,37, we can leverage this correlation to predict the A/B compartment profiles, which is challenging with the current single-cell Hi-C resolution. In addition, our scRR-seq-RNA achieves approximately 70–80% efficiency in detecting transcripts and gene expression compared to scRamDA-seq, which we consider sufficient to obtain valuable transcriptional information from a single cell. Thus, the combination of high-resolution DNA-seq and RNA-seq data from the same cell offers an unprecedented opportunity to explore cellular processes involving transcription, DNA replication, and 3D genome organization.Although the scDNA-seq component of scRR-seq showed a strong performance comparable to scRepli-seq, the scRNA-seq component is by no means perfect. For instance, despite its ability to detect the majority of RNA molecules, capturing certain lncRNAs and weakly expressed RNAs appears to be a challenge for scRR-seq-RNA, which is at least partly due to the physical loss of nuclear RNA during the DNA/RNA separation step. Nevertheless, scRR-seq-RNA still outperforms other available scDNA/RNA-seq methods in capturing lowly-expressed genes, including non-poly(A) genes, and the number of detected genes (Fig. 2b and Supplementary Fig. 7d). Thus, it is well-suited for generating global expression profiles and identifying cell types.Using scRR-seq, we could accurately determine the S-phase timepoint of each cell based on the percentage genome replication score and generate snapshot gene expression profiles of individual cells, meaning that we can obtain a moment-by-moment view of gene expression and monitor its dynamic changes during S-phase progression. We used our whole-S scRepli-seq profile derived from scRR-seq to evaluate existing cell-cycle markers commonly used in scRNA-seq analysis. Our results revealed the limitations of these markers (Fig. 3). In addition, we have identified S-phase progression markers in both hTERT-RPE1 and mESCs by scRR-seq. These markers should enable the determination of the S-phase status of any given cell. For instance, when a cell has the same or similar DNA copy number calls among all genomic bins throughout the genome (either 1-copy or 2-copy), the hidden Markov model (HMM) analysis used in our pipeline fails to binarize scRepli-seq data, resulting in difficulty distinguishing whether the cell is in the very early-S or very late-S phase. This issue should be solved by incorporating S-phase progression marker information.Since scRR-seq enables the simultaneous generation of genome-wide DNA replication and gene expression data at the single-cell level, we revisited the effect of DNA replication state on gene expression, which tends to increase as DNA is replicated in prokaryotes31. We applied scRR-seq to human haploid HAP1 cells and analyzed mid-S replicating genes in mid-S cells because these genes are in a mixture of replicated and unreplicated states in the population (Fig. 5). Consistent with previous studies in human cells that examined subsets of genes33,34, the majority of genes we analyzed showed similar gene expression levels despite variations in DNA replication status (Fig. 5d), indicating the limited impact of DNA copy number on gene expression during S-phase in mammalian cells. These results reveal a robust nature of transcriptional homeostasis during S-phase progression in mammalian cells.scRR-seq also enables simultaneous CNV detection and gene expression analysis in non-S-phase cells (Fig. 6). We found that CNVs and gene expression levels are not always positively correlated (Fig. 6b, regions 1 and 4). Since the regulation of gene expression associated with CNVs is complex and not fully understood38, scRR-seq offers a unique opportunity to explore this relationship at the single-cell level. In particular, scRR-seq could be highly useful in cancer research, where gene expression and CNVs vary between cells and influence tumor progression and therapeutic response39. Furthermore, we recently demonstrated that scRepli-seq can identify CNVs in mouse embryos at the single-cell level, regardless of whether cells are in G1 or S-phase16. This suggests that scRR-seq could be extended to explore gene expression and CNV associations in S-phase cells, which would be particularly valuable for analyzing rare or limited samples, for instance, single cells derived from tissues in vivo or patient samples.Regarding the overall cost and time efficiency, we believe scRR-seq is comparable to most existing protocols. It typically requires approximately 2–4 days each for DNA-seq and RNA-seq library preparation, which is similar to standard NGS workflows. The scRR-seq protocol is simple and easy, as it utilizes the commercially available RamDA-seq kit for RNA-seq. This simplifies the workflow and reduces variability associated with manual protocols. In terms of cost, our optimized scRR-seq protocol lowers reagent expenses to about $60 per sample for the combined DNA and RNA-seq library preparations (see Supplementary Data 5). In comparison, most other methods use Smart-seq-based technology as the RNA-seq component, which either requires technically challenging manual protocols or commercial kits that are substantially more expensive (which cost ~$13 and ~$80/sample for RNA-seq alone, respectively40). Although scRR-seq may not provide a major advantage in processing time or total cost, its simplified workflow makes it more user-friendly and easier to adopt in standard laboratory settings.While scRR-seq is not compatible with high-throughput processing of a large number of cells, it outperforms such methods regarding data resolution and, therefore, is best suited for analyzing cells of interest in depth. With its simplicity, versatility, and broad applicability, scRR-seq should contribute to addressing various biological questions, including those related to DNA replication, genome instability, and transcriptional regulation.MethodsCell cultureExperiments involving genetically modified organisms were conducted in accordance with the Regulations on the Use of Living Modified Organisms and the regulations of Mie University, and were approved by the Recombinant DNA Experiment Safety Committee of Mie University. Female CBMS1 mESCs41 were grown in 2i/LIF medium as described42. HAP1 cells (Horizon, C859) were cultured in Iscove’s Modified Dulbecco’s Medium (SIGMA, # I3390-500ML) containing 10% FBS and 1× penicillin/streptomycin. hTERT-RPE1 cells (Clontech, C4001-1) and IMR-90 cells (JCRB Cell Bank, JCRB9054) were cultured in MEMα with L-Glutamine and Phenol Red (Wako, #135-15175) containing 10% FBS and 1× penicillin/streptomycin.AnimalsAll animal experiments conformed to the Guide for the Care and Use of Laboratory Animals and were approved by the Institutional Committee of Laboratory Animal Experimentation of the RIKEN Center for Biosystems Dynamics Research (protocol number: A2015-06-9). Mouse-related experiments were performed as previously described16. C57BL/6 mice, aged 8–12 weeks, were used to produce oocytes and sperm. The temperature, humidity, and light cycle of mouse cages were maintained at 20–24 °C, 45–65% and 12 h–12 h dark–light, respectively. Mouse embryos were generated from in vitro fertilization, cultured, and single cells were collected as previously described16. The sex of embryos was not considered in the study design. Single cells from two mouse 8-cell embryos were collected and subjected to scRR-seq.Aphidicolin treatment in IMR-90 cellsIMR-90 cells (JCRB Cell Bank, JCRB9054) were incubated for 72 h in MEMα with L-Glutamine and Phenol Red (Wako, #135-15175) containing 10% FBS, 1× penicillin/streptomycin, and 0.4 µM of aphidicolin43 before collection.Sample preparation for scRR-seqAll procedures were carried out at 4 °C where possible. For hTERT-RPE1, CBMS1 mESCs, HAP1, and IMR-90, single cells were sorted into 0.2-ml tubes by a FACS using a Sony SH800 cell sorter (single-cell mode). Cells were stained with 30 µg/ml Hoechst 33342 in 10% FBS/PBS and incubated for 1 h at 37 °C in a temperature-controlled chamber before sorting (Supplementary Fig. 2b). For 8-cell embryos, blastomeres were manually dissociated and single cells were collected into 0.2-ml tubes using a mouth pipette. Subsequently, 1 or 3 µl of complete RamDA-seq lysis buffer (TOYOBO, #RMD-101) containing 2 mg/ml or 0.67 mg/ml Dynabeads™ MyOne™ Carboxylic Acid (Invitrogen, #65011), respectively, was added to each tube. Protocols using 1 µl and 3 µl of complete RamDA-seq lysis buffer were designated as scRR1 and scRR3, respectively. Samples were spun down at 10,000 × g for 1 min at 4 °C, vortexed at 2000 rpm for 1 min at room temperature using ThermoMixer C (Eppendorf, # 5382000023), and spun down again at 1000 × g for 5 min at 4 °C. Tubes were then placed on a magnetic stand (MagnaStand v3.2, FastGene, #FG-SSMAG3.2) for 5 min. The supernatant, containing total RNA, was transferred to a new tube and either processed immediately for scRamDA-seq14 or stored at −80 °C. To the bead-containing fraction, 6 µl of Single Cell Lysis & Fragmentation Buffer (SIGMA, #WGA4-50RXN) containing Proteinase K (SIGMA, # P4850-1ML) was added, followed by centrifugation at 10,000 × g for 1 min at 4 °C. This genomic DNA fraction was either processed immediately for scRepli-seq13,18 or stored at −30 °C.Sample preparation for scRR-seq-RNAcDNA was synthesized from RNA by using the GenNext®RamDA-seq™ Single Cell Kit (TOYOBO, #RMD-101) according to the manufacturer’s instructions, with the following modifications. If previously isolated RNA had been stored, it was first thawed at 4 °C. The RNA was then centrifuged at 10,000 × g for 1 min at 4 °C, heated at 70 °C for 90 s, and held at 4 °C. For genomic DNA digestion, we added 1 µl (for scRR1) or 3 µl (for scRR3) of Genomic DNA Digestion Solution containing ERCC RNA Spike-In [1× RT-RamDA buffer, 1× gDNA remover, 1/500,000 ERCC RNA Spike-In Mix (Thermofisher, 4456740) or 1/25,000 SIRV-Set 4 (Lexogen, 141)]. The mixture was spun down at 10,000 × g for 1 min at 4 °C, vortexed at 2000 rpm for 1 min at 4 °C using ThermoMixer C (Eppendorf), and centrifuged again at 10,000 × g for 1 min at 4 °C. The sample was then incubated at 30 °C for 5 min and held at 4 °C. Next, we added 1 µl (for scRR1) or 3 µl (for scRR3) of RT-RamDA (RDA) solution [1× RT-RamDA buffer, 1× RT-RamDA Enzyme Mix, 1st not-so-random (NSR) primer for mouse or human]. The sample was again spun down at 10,000 × g for 1 min at 4 °C, vortexed at 2000 rpm for 1 min at 4 °C using ThermoMixer C, and centrifuged at 10,000 × g for 1 min at 4 °C. Reverse transcription was then performed in a thermocycler under the following conditions: 25 °C for 10 min, 30 °C for 10 min, 37 °C for 30 min, 50 °C for 5 min, 98 °C for 5 min, and held at 4 °C. Subsequently, we added 2 µl (scRR1) or 6 µl (scRR3) of Second-strand synthesis solution [1× 2nd strand synthesis buffer, 1× 2nd strand synthesis enzyme, 2nd NSR primer for mouse or human]. The samples were spun down at 10,000 × g for 1 min at 4 °C and vortexed at 2000 rpm for 1 min at 4 °C using ThermoMixer C. For second-strand synthesis (for scRR3), samples were spun down at 10,000 × g for 1 min at 4 °C and incubated in a thermocycler at 16 °C for 60 min, 70 °C for 10 min, and held at 4 °C. For scRR1, samples were spun down at 10,000 × g for 1 min at 4 °C and incubated in a thermocycler at 16 °C for 60 min, 80 °C for 15 min, and held at 4 °C. All samples were finally centrifuged at 10,000 × g for 1 min at 4 °C and either processed immediately for library preparation or stored at −80 °C.Library preparation for scRR-seq-RNAscRamDA-seq NGS libraries were prepared using the Nextera® XT Library Prep Kit 24 Samples (Illumina, #15032350, #15032352) and Nextera® XT Index Kit v2 Set A 96 indexes-samples (Illumina, #15052163) according to the manufacturer’s instructions with the following modifications.For RNA samples from the scRR3 protocol (total volume: 15 µl), we added 27 µl of 1/4× Agencourt AMPure XP beads (Beckman Coulter, A63881), centrifuged at 1000 × g for 1 min, and vortexed at 2000 rpm for 2 min at room temperature using ThermoMixer C. Samples were incubated at 25 °C for 5 min in ThermoMixer C, spun down again at 1000 × g for 1 min, and placed on a magnetic stand for 2–5 min. The supernatant was discarded, and beads were washed twice with 150 µl of 80% ethanol while on the magnetic stand. After drying the beads, we added 3.75 µl of 0.67× Tagmentation DNA buffer, vortexed, and incubated at 25 °C for 5 min in ThermoMixer C. After spinning down at 1000 × g for 1 min, the samples were placed on the magnetic stand for 2–5 min, and the supernatant was transferred to a new tube. Next, we added 1.25 µl of Amplicon Tagment Mix, vortexed at 2000 rpm for 2 min at 4 °C in ThermoMixer C, incubated the samples at 55 °C for 5 min, and held at 10 °C (with 75 °C lid heating function) in a thermocycler. Then, 1.25 µl of Neutralize Tagment Buffer was added, vortexed at 2000 rpm for 2 min at 25 °C in ThermoMixer C, and incubated at 25 °C for 5 min. After centrifugation, we added 3.75 µl of Nextera PCR Master Mix, vortexed at 2000 rpm for 2 min at 4 °C in ThermoMixer C. Next, we added 1.25 µl each of i5 and i7 index adapter and vortexed at 2000 rpm for 2 min at 4 °C in ThermoMixer C. After spinning down, the samples were incubated in a thermocycler at 72 °C for 3 min, 95 °C for 30 s, 14 cycles of [95 °C for 10 s, 55 °C for 30 s, 72 °C for 30 s], 72 °C for 5 min, and held at 4 °C (with 105 °C lid heating function). We then cleaned up the library by adding 15 µl of AMPure XP beads to the solution, vortexed at 2000 rpm for 2 min at 25 °C in ThermoMixer C, and incubated the samples at 25 °C for 5 min. After spinning down at 1000 × g for 1 min, tubes were placed on a magnetic stand for 5 min, the supernatant was discarded, and the beads were washed twice with 150 µl of 80% ethanol while placed on the magnetic stand. After drying, we added 24 µl of TE (pH 8.0) to the beads and vortexed the samples. Samples were incubated at 25 °C for 2 min in ThermoMixer C, then spun down at 1000 × g for 1 min before being placed on the magnetic stand for 2–5 min. The supernatant was transferred to a new tube, and the library concentration was quantified by Qubit (Thermo Fisher Scientific). Sample DNA sizes were quantified by Agilent TapeStation with High Sensitivity D1000 ScreenTape. Optimal libraries showed sample concentrations of >1 ng/µl and fragment sizes ranging from 100–600 bp.For RNA samples from the scRR1 protocol (total volume: 5 µl), we took 1 µl of sample solution and added 2.2 µl of Tagmentation DNA buffer, spun down at 1000 × g for 1 min at 4 °C, and vortexed them at 2000 rpm for 2 min at 4 °C in ThermoMixer C. After another spin-down, 1 µl of Amplicon Tagment Mix was added and vortexed again at 2000 rpm for 2 min at 4 °C in ThermoMixer C. We incubated the samples at 55 °C for 10 min and held them at 10 °C (with 75 °C lid heating function) in a thermocycler. Next, we added 1 µl of Neutralize Tagment Buffer, vortexed at 2000 rpm for 2 min at 25 °C in ThermoMixer C, and incubated the samples at 25 °C for 5 min. After spinning down, we added 3 µl of Nextera PCR Master Mix, vortexed at 2000 rpm for 2 min at 4 °C in ThermoMixer C. Next, we added 1 µl each of i5 and i7 index adapters and vortexed the samples at 2000 rpm for 2 min at 4 °C in ThermoMixer C. After spinning down, samples were incubated in thermocycler at 72 °C for 3 min, 95 °C for 30 s, 16 cycles of [95 °C for 10 s, 55 °C for 30 s, 72 °C for 30 s], 72 °C for 5 min, and held at 4 °C (with 105 °C lid heating function). We then cleaned up the library by adding 12 µl of AMPure XP to the solution, vortexed the samples at 2000 rpm for 2 min at 25 °C in ThermoMixer C, and incubated them at 25 °C for 5 min. We spun down the samples at 1000 × g for 1 min before placing the tubes on a magnetic stand for 5 min. The supernatant was discarded, and we washed the beads with 100 µl of 80% ethanol twice while the samples were placed on a magnetic stand. After drying the beads, we added 24 µl of TE (pH 8.0) and vortexed the samples. The samples were incubated at 25 °C for 2 min in ThermoMixer C and then spun down at 1000 × g for 1 min before being placed on a magnetic stand for 2–5 min. The supernatant was collected and processed as described above.Sample and library preparation for scRamDA-seqAfter single cells were sorted into 0.2-ml tubes, we added 3 µl of complete RamDA-seq lysis buffer (TOYOBO, #RMD-101). All subsequent steps followed the same procedures as described for the scRR3 protocol in the “Sample preparation for scRR-seq-RNA” and “Library preparation for scRR-seq-RNA” sections.Sample and library preparation for scRR-seq-DNASample preparation was performed as described in the scRepli-seq protocol13,18. Briefly, 6 µl of genomic DNA (gDNA) solution was subjected to WGA using the SeqPlex kit (Sigma, SEQXE) in a 30 µl reaction volume, following the manufacturer’s instructions. The amplified gDNA was purified and size-selected using Agencourt AMPure XP SPRI beads (1.7× reaction volume), and the SEQXE adapter sequence was removed by the primer removal enzyme Eco57I (Sigma, SEQXE). The gDNA was further purified with Agencourt AMPure XP SPRI beads (1.8× reaction volume) and eluted in 15 µl of 1/10× Elution Buffer (Qiagen). The DNA fragment size peak should be within 200–600 bp, which was confirmed by Agilent TapeStation with High Sensitivity D1000 ScreenTape. We could easily distinguish single cells from zero or two cells by quantification of DNA with MultiNA, which provided reassurance that the gDNA was indeed derived from single cells. Then NGS libraries were constructed using the NGS LTP Library Preparation Kit (KAPA, KK8232) with a SeqCap adapter kit A/B (Roche, 07141530001/07141548001) or NEXTflex DNA barcodes (Bio Scientific, NOVA), or the NEBNext® Ultra™ II DNA Library Prep Kit for Illumina® (New England BioLabs, #E7645) with either NEBNext® Multiplex Oligos for Illumina® (Dual Index Primers Set 1, Set 2) (New England BioLabs, #E7600S, #E7780S) or NEBNext® Multiplex Oligos for Illumina® (Index Primers Set 1, Set 2, Set 3, Set 4) (New England BioLabs, #E7355L, E7500L, E7710L, E7730L). Finally, the samples were subjected to NGS on an Illumina HiSeq X Ten or Novaseq X Plus system (150-bp length, paired-end reads).Sample preparation for population-based RT profiling by BrdU-IP Repli-seq (HAP1)We followed the BrdU-IP protocol as described13. HAP1 cells were incubated in medium containing 10 mM BrdU (Sigma) for 2 h before cell collection. After trypsinization, the single-cell suspension was fixed in 75% ethanol. For FACS, we stained the fixed cells with propidium iodide (Nacalai, 29037) and used a Sony SH800 cell sorter (ultra-purity mode) to sort early- and late-S-phase cell populations (at least 10,000 cells per fraction). We used a Bioruptor BRII (Sonic Bio) for genomic DNA sonication (high-output mode), with ON/OFF pulse times of 30 s/30 s for 12 min in ice-cold water. BrdU-incorporated DNA was immunoprecipitated using anti-BrdU antibody (dilution to 12.5 µg/ml by PBS; BD Biosciences Pharmingen, 555627). After BrdU-IP, immunoprecipitated DNA samples were subject to WGA with a SeqPlex kit (Sigma, SEQXE). NGS libraries were constructed from early- and late-replicating DNA after WGA using the NEBNext® Ultra™ II DNA Library Prep Kit for Illumina® (New England BioLabs, #E7645), following the manufacturer’s instructions, and were subjected to NGS on an HiSeq X Ten system or Novaseq X Plus system (150-bp length, paired-end reads).Computation associated with the transcriptome profiling of single cells (scRNA-seq)All RNA-seq, scRNA-seq, and scRR-seq-RNA datasets were processed using a complete nextflow pipeline, ramdaq v.1.9.220. The following tools were used; Nextflow v.23.04.01, FastQC v.0.11.8, Fastq-mcf v.1.04.807, Hisat v.2.2.0, Samtools v.1.10, Bam2Wig v.3.0.1, Bamtools v.2.5.1, read_distribution v.3.0.1, inter_experiment v.3.0.1, inner_distance v.3.0.1, junction_annotation v.3.0.1, featureCounts v.2.0.1, RSEM v.1.3.1, edgeR v.3.26.5, and MultiQC v.1.9. Human GENCODE v38 and mouse GENCODE vM25 were used as reference genomes. Publicly available bulk RNA-seq data of mouse preimplantation embryos22 (short-read raw sequencing data; SRA accession: SRP225196), G&T-seq data of 8-cell mouse embryos4 (ENA accession: PRJEB9051), and scRamDA-seq of mESCs14 (GEO accession: GSE98664) were used for the analyses.Computation associated with the RT profiling of single cells (scRepli-seq)For non-haplotype/allele-specific analyses, we followed our established pipeline for scRepli-seq RT analysis as described13,18. We mapped scRR-seq-DNA data to the hg38 and mm10 reference genomes for human and mouse cells, respectively, using bwa (v.0.7.17-r1188). Genome coverage of unique mapped reads per sample was computed using genomeCoverageBed from bedtools.We then analyzed the whole genome scRR-seq-DNA data for each single cell in 40-kb non-overlapping bins (or 80-kb bins where indicated) using AneuFinder44 (v.1.2.1). For binarization, different options were applied to each cell depending on their FACS sorting gates (2-HMM option for G1 FACS gate: most.frequent.state = “1-somy”; 2-HMM option for S-phase FACS gates: most.frequent.state = “2-somy”).To calculate the percentage replication scores, we excluded abnormal chromosomes and the X chromosome from each cell line to avoid bias (chromosomes 10 and X were removed for hTERT-RPE1 analysis; chromosomes 8 and X were removed for CBMS mESC analysis; chromosomes 1, 15, and X were removed for HAP1 analysis). We also filtered out ambiguous regions based on ENCODE Blacklist version 245. Replicated bins (bins with 2-copy state) were then counted and divided by the total number of analyzed bins per cell. scRT profiles were sorted according to their percentage replication scores and visualized using the IGV browser.For haplotype/allele-specific analyses, we performed haplotype-resolved scRepli-seq as described18,25, using 400-kb non-overlapping bins. We used our in-house phased hTERT-RPE1 genome based on SNP information aligned to hg19. For mouse cells, we used our in-house allele-specific CBMS1 genome based on SNP information aligned to mm10.All average mid-S RT profiles were generated from cells with 40–70% replication scores and computed using our established pipeline13. Publicly available non-haplotype and haplotype-specific BrdU-IP profiles and scRT profiles of hTERT-RPE1 and CBMS1 mESCs13,25, as well as scRT profiles of 8-cell embryos16, were also used. Published datasets aligned to the mm9 reference genome were converted to mm10 using UCSC LiftOver.Quality controls of scRR-seq samplesFor scRR-seq-DNA, we first calculated the mapping ratio by dividing the number of reads with MAPQ > 10 by the total number of reads. Cells with a mapping ratio below Q1−1.5 × IQR were excluded. We also calculated the MAD score using the log2 ratio of read counts to the genome-wide median across non-overlapping 200-kb windows. We filtered out cells with MAD scores >0.3 for G1 cells, and 0.8 for mid-S phase cells. After sorting the cells based on their percentage replication scores, we calculated Manhattan distances between samples. Cells with a Manhattan distance greater than Q3 + 1.5 × IQR were excluded. Cells with percentage replication scores below 10% or above 90% were also excluded from downstream analysis.For scRR-seq-RNA, we first quantified the number of uniquely mapped reads to the genome (using hisat) in each cell and filtered out cells with values below Q1−1.5 × IQR. In addition, we examined uniquely mapped reads to ribosomal RNA (rRNA) and mitochondrial RNA (mtRNA) and filtered out those with values above Q3 + 1.5 × IQR for either. Cells that failed to pass the QC criteria for either scRR-seq-DNA or scRR-seq-RNA were excluded from all subsequent analysis.Subcellular localization of RNAsCytoplasmic/Nuclear Localization was assessed using the Relative Concentration Index (CN-RCI) obtained from the LncATLAS database (https://lncatlas.crg.eu/). Only genes with available CN-RCI data were included in the comparison.Comparison of bulk RNA-seq data of mouse preimplantation embryos with scRR-seq-RNA and G&T-seqAll genes were used for PCA across the bulk RNA-seq, scRR-seq-RNA, and G&T-seq-RNA datasets. Gene-level expression values were log-transformed as (log10(TPM + 1)), and PCA was performed using the prcomp function in R (scale. = TRUE).Cell-cycle phase assignment by SeuratWe analyzed CBMS1 mESC scRR-seq-RNA data using the Seurat (v5.0.1) package in R. Gene-level expression data (RSEM) were normalized using the “NormalizeData” function (normalization.method = “LogNormalized”, scale.factor = 10,000). The data were then scaled by the “ScaleData” function, with all genes used as features. Next, we utilized the “CellCycleScoring” function (using default S-phase and G2/M-phase genes for s.features and g2m.features, respectively) to predict the cell-cycle phase of CBMS1 mESCs. Subsequently, we ran “RunPCA” using the default S-phase and G2/M-phase marker genes in Seurat as features. PCA plots were color-coded based on cell-cycle phases assigned by CellCycleScoring as described above or by actual percentage replication scores, as indicated.Alternatively, we ran the “DiffusionMap” function from the destiny package (v.3.12.0) using the same default S-phase and G2/M-phase marker genes as above. Diffusion map plots were similarly color-coded based on the actual percentage replication scores as indicated. We conducted similar analysis for hTERT-RPE1.Identification of S-phase progression markersFor CBMS1 mESCs, we used only cells that passed QC for both scRR-seq-DNA and scRR-seq-RNA. We sorted cells according to their percentage replication scores and then searched for genes that showed significant gene expression changes during S-phase progression using an analysis method adapted from Hayashi et al.14. We used log-transformed gene-level expression data (derived from RSEM) as a function of the percentage replication score for this analysis. We fitted a GAM to gene-level expression as a smooth function of percentage replication score for each gene using the mgcv R package (version 1.8–16) with the parameter “family = Gaussian(link = identity).” The p-values of the smooth term were assessed using one-sided F-test provided by mgcv, and resulting p-values were adjusted for multiple testing across all genes using the Benjamini–Hochberg procedure to obtain the false discovery rate (FDR). The Akaike information criterion (AIC) was calculated for GAM and an intercept model. Genes with an FDR