Abstract

Starships form a recently discovered superfamily of giant transposons in Pezizomycotina fungi, implicated in mediating horizontal transfer of diverse cargo genes between fungal genomes. Their elusive nature has long obscured their significance, and their impact on genome evolution remains poorly understood. Here, we reveal a surprising abundance and diversity of Starships in the phytopathogenic fungus Verticillium dahliae. Remarkably, Starships dominate the plastic genomic compartments involved in host colonization, carry multiple virulence-associated genes, and exhibit genetic and epigenetic characteristics associated with adaptive genome evolution. Phylogenetic analyses suggest extensive horizontal transfer of Starships between Verticillium species and, strikingly, from distantly related Fusarium fungi. Finally, homology searches and phylogenetic analyses suggest that a Starship contributed to de novo virulence gene formation. Our findings illuminate the profound influence of Starship dynamics on fungal genome evolution and the development of virulence.

Introduction

Transposable elements (TEs, transposons) are ubiquitous mobile genetic elements in all life forms. Originally, they were seen as selfish elements carrying only the genetic information for their own proliferation, but they are now appreciated to shape genome structure and function and to drive evolutionary innovations [1]. Whereas most TE superfamilies have simple structures and encode few proteins [2], giant TEs are tens to hundreds of kilobases (kb) in size and carry tens to hundreds of cargo genes [3,4,5].

Starships are giant TEs (15–700 kb) that were recently discovered in Pezizomycotina fungi, the largest subdivision of Ascomycota, and typically contain a tyrosine recombinase (YR) “captain” gene as the first gene, required for transposition, while cargo genes are variable and functionally diverse [5,6,7,8].
How Starships impact global genome evolution, and the range and extent to which Starships have shaped genomes over time, remain enigmatic. Starship detection is technically challenging owing to the large diversity in cargo that can include abundant repeats [7], and relies on detecting the presence/absence of YR-containing inserts in orthologous sites among highly contiguous genome assemblies [5]. While 143 Starships were identified in this manner in a systematic search of 2899 fungal genomes, 10,628 “orphan” captain-like YR genes remained, suggesting many overlooked Starships [6]. Starships occupy up to 2.4% of the genome of the human pathogen Aspergillus fumigatus, show extensive presence/absence variation, and contain many cargo genes that are differentially expressed upon infection [9]. Moreover, in the plant pathogen Macrophomina phaseolina, 30% of chromosomal translocations, inversions, and putative chromosomal fusions occur near Starship insertions [7]. Interestingly, Starships can transfer horizontally between closely-related fungi of the same order, and can transfer important traits such as pathogenicity [8,10,11,12,13,14,15,16].

Pathogens and their hosts typically engage in molecular arms races, with the pathogen exploiting secreted virulence factors (effectors) to mediate host colonization, while hosts employ immune receptors for pathogen interception [17,18,19]. To avoid recognition, pathogen effector catalogs are highly dynamic and variable, mediated by a “two-speed genome” organization in which virulence genes co-localize in highly plastic genomic regions that are enriched in repetitive elements and particular epigenetic features [20,21,22,23,24,25,26].
Accordingly, the fungus Verticillium dahliae that causes vascular wilt disease in hundreds of hosts [27,28,29] contains plastic ‘adaptive genomic regions’ (AGRs) [30] that are enriched in virulence genes [31,32,33,34,35,36], transcriptionally active TEs [30,37,38], and structural variations [32,37,38,39], associated with a unique chromatin profile and physical interactions in the nucleus [30,40,41,42]. Besides V. dahliae, the Pezizomycotina Verticillium genus contains nine additional plant-associated species [28]. Thus far, two Starships have been identified in a single strain of V. dahliae [6]. Here, we queried 56 highly contiguous Verticillium genome assemblies for Starships to analyze their association with the evolution of AGRs and virulence on plant hosts.

Results

A wealth of Starships occurs in the Verticillium genus

To identify Starships across the Verticillium genus, we collected 56 high-quality Verticillium genome assemblies, comprising 36 V. dahliae strains and 20 strains of the nine remaining species (Fig. 1a and Supplementary Data 1, 2) and queried these genomes for Starships using “Starfish” [6]. We uncovered 54 Starships that belong to 24 haplotypes of 14 naves of seven families (Fig. 1a, b and Supplementary Data 3). Between one and three Starships were detected in 33 of the 56 strains belonging to seven of the ten Verticillium species (Fig. 1a, b). Thus, most Verticillium genomes contain a Starship, and several genomes even contain multiple. Moreover, these Starships range from 17 to 625 kilobases (kb) (Fig. 1c), and the larger ones typically contain multiple YR genes of different naves (Fig. 1c, d), suggesting that they represent nested Starship insertions. Thus, the number of Verticillium Starships is likely underestimated.

Fig. 1: Diverse Starships populate the Verticillium genus. a The tree in black shows the phylogeny of the 56 strains used in this study across the Verticillium genus, divided into the Flavexudans (FE) and Flavnonexudans (FNE) clades [28], based on whole-genome sequence alignments. Circle colors indicate the ten Verticillium spp., while the label comprises the species abbreviation followed by the strain name. The tree in gray shows only the V. dahliae strains at increased resolution. Scale bars indicate phylogenetic distances expressed as nucleotide substitutions per site. b Repertoires of Starship haplotypes (hap.) per strain. Starships were classified according to previous studies [5,6]. Columns indicate Starship haplotypes defined by k-mer similarity and named according to captain navis and family (Supplementary Fig. S1b), whereas heatmap colors show haplotype member counts. “Id” refers to identical Starships (coverage and nucleotide sequence identity >98%) within a haplotype. c Size of the different Verticillium Starships. Points indicate individual Starships listed in Supplementary Data 3. Gray crossbars and error bars indicate the median and 95% confidence interval range of Starship lengths for each navis. d Captain/captain-like tyrosine recombinase (YR) navis repertoires in Starships with multiple YR genes. e Captain/captain-like YR gene classification per strain, where “Starship (captain)” indicates YR genes located as the first gene at the 5’-terminus of a Starship for which both borders could reliably be identified with the Starfish tool [6], whereas “Starship (non-captain)” refers to YR genes located at other sites in a Starship, suggesting nested Starships with unidentified boundaries. Furthermore, “other Starship region” indicates YR gene presence in regions that could not reliably be identified as Starship with Starfish, but that are syntenic to reference Starships.
Finally, “unaffiliated” refers to YR genes that cannot reliably be affiliated with a Starship region.

Verticillium genomes exhibit extensive large-scale genomic rearrangements that may have affected the integrity of previously inserted Starships [32,37,43]. However, Starfish cannot identify fragmented Starships, rearranged Starship insertion sites, or Starship insertions into lineage-specific regions [6]. To identify such Starships, we queried for reference captain and captain-like YRs previously identified in Pezizomycotina genomes [6], revealing 2–18 homologs in each strain, amounting to 556 YR genes of 38 naves and seven families (Supplementary Fig. S1 and Supplementary Data 4). Only 21% of them belong to Starships identified by Starfish (Fig. 1e), while the remaining 79% could point to unidentified Starships. Additionally, to detect potential Starships and Starship-like regions that may have been overlooked due to genomic rearrangements and insertion into lineage-specific regions, we identified regions syntenic to Starships as “Starship regions”. Hereafter, “Starships” refer to those identified using Starfish, while “Starship regions” collectively refer to Starships plus syntenic regions. Intriguingly, such Starship regions occur in all strains (Supplementary Data 5), and half of the captain-like YR genes that did not occur in Starships appeared in such Starship regions (Fig. 1e). Thus, we reveal abundant Starships and their remnants in the Verticillium genus.

Starships are hotspots of large-scale genomic rearrangements

To detail how genomic rearrangements affected Starships, we compared telomere-to-telomere genome assemblies of V. dahliae strains JR2 and VdLs17 that differ by dozens of large-scale genomic rearrangements [32,37,44]. In strain JR2, we detected three large Starships of 0.50–0.54 Mb each, belonging to haplotypes Ar1h1, Ar4h1, and Se1h1, plus additional Starship regions, collectively accounting for 5.3% (1.92 Mb) of the genome (36.15 Mb) (Fig. 2a). Although no Starships were detected in VdLs17 by Starfish, Starship regions account for 2.7% (0.96 Mb) of the genome (35.97 Mb) (Fig. 2a). Intriguingly, 60% of the inter-chromosomal rearrangement breakpoints between these strains occurred in Starship regions (Fig. 2a and Supplementary Fig. S2). Accordingly, Starship regions were mainly detected in AGRs that are enriched in such rearrangements [32,37] (Fig. 2a, b). In strain JR2, 92% (1.77 Mb) of the Starship regions colocalized with AGRs (Supplementary Data 6). Moreover, 53% of the total AGR complement belongs to Starships. In VdLs17, 94% (0.90 Mb) of the Starship regions colocalized with 22% of the AGR complement (Supplementary Data 6).

Fig. 2: Starships are hotspots of genomic rearrangements. a Circular plots showing the locations of Starship regions, adaptive genomic regions (AGRs), and genomic rearrangements between V. dahliae strains JR2 (upper eight chromosomes) and VdLs17 (lower eight chromosomes). Tracks are filled with colors representing either core, AGR, or centromeric regions, overlaid with bold lines and arrows representing Starship regions and Starships colored by haplotype. Overlapping arrows in a single region indicate that the Starship orientation is not determined due to the presence of captains at the 5’ end of both strands. Regular triangles point to the captain and captain-like tyrosine recombinase (YR) genes, colored by family and annotated with navis identification (ID). The overlaps among these elements and rearrangement breakpoints are shown in Supplementary Fig. S2 at a higher resolution. Colored bands at the inner edge of tracks represent genetic elements grouped into protein coding sequence (CDS) and transposable element (TE). Ribbons connect syntenic regions (>80% nucleotide sequence identity over 10 kb). b Total length of genomic compartments in V. dahliae strains JR2 and VdLs17.
c Violin plots depicting the sequence alignment coverage determined by comparing JR2 genomic compartments against 35 V. dahliae genomes (Supplementary Fig. S3). Points indicate the coverage, and the color represents genome-wide average nucleotide identity (ANI) for each genome alignment (n = 35), while blue crossbars indicate the median values. Different letter labels indicate significant differences (two-sided Dunn’s test, adjusted p […] 70% identity and >80% coverage.

Alien index (AI) values were calculated by the formula AI = log((best e-value for ingroup) + e-200) - log((best e-value for outgroup) + e-200) [61] using hits with >50% coverage against Verticillium gene/TE queries by the blastn search. For queries with hits in only one of the two groups, the e-value of the group without hits was set to 1, whereas AI values were not determined for queries without hits in either group. The 332 non-Verticillium genomes of the order Glomerellales were used as ingroup, while the 9641 genomes of the remaining 74 orders of Pezizomycotina were used as outgroup (Supplementary Data 9).

Phylogenetic analyses

Verticillium phylogeny was inferred with REALPHY version 1.13 [113]. Briefly, genome sequences were fragmented into overlapping 50 bp subsequences and mapped to the JR2 genome with Bowtie version 2.2.5 [114]. Based on single-nucleotide polymorphisms in the aligned regions, the maximum likelihood phylogenetic tree was built with PhyML version 3.3.20220408 [115].

Starship phylogeny was inferred by k-mer comparisons with mashtree version 1.4.6 with default k-mer size of 21, accuracy option mindepth 0, and bootstrapping for 1,000 iterations [116].

For the phylogenetic analyses of Av2, nucleotide sequences of Av2 orthologs were extracted from the Pezizomycotina genomes at the coordinates listed in Supplementary Data 16 using SeqKit version 2.3.0 subseq [117].
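As an illustration of the alien index (AI) calculation described earlier in this section, the following minimal Python sketch applies the stated formula and no-hit rules to a single query. It is not the authors' code: the function name `alien_index`, the use of `None` to denote a query without hits, the natural logarithm, and the reading of the "e-200" term as 1e-200 are all assumptions made for illustration.

```python
import math
from typing import Optional

# Assumption: the "e-200" term in the formula is read as 1e-200,
# a small constant that keeps log() defined when the best e-value is 0.
PSEUDOCOUNT = 1e-200

# Stated in the text: the e-value of a group without hits is set to 1.
NO_HIT_EVALUE = 1.0


def alien_index(best_ingroup_evalue: Optional[float],
                best_outgroup_evalue: Optional[float]) -> Optional[float]:
    """Return the AI for one query, or None when it is not determined.

    Each argument is the best (lowest) blastn e-value in that group,
    or None when the query has no hit there (hypothetical convention).
    """
    # No hits in either group: the AI is not determined.
    if best_ingroup_evalue is None and best_outgroup_evalue is None:
        return None

    e_in = NO_HIT_EVALUE if best_ingroup_evalue is None else best_ingroup_evalue
    e_out = NO_HIT_EVALUE if best_outgroup_evalue is None else best_outgroup_evalue

    # Assumption: "log" is taken as the natural logarithm.
    return math.log(e_in + PSEUDOCOUNT) - math.log(e_out + PSEUDOCOUNT)
```

Under this formulation, a positive AI means that the best outgroup hit is stronger (has a lower e-value) than the best ingroup hit, and a negative AI means the opposite.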
The nucleotide sequences of Av2 were aligned by MAFFT versions 7.511 or 7.526 with the L-INS-i method that iteratively refines local alignments [118], followed by removal of ambiguously aligned sites with trimAl version 1.4.rev15 with option strict [119]. After multiple sequence alignment, the maximum likelihood phylogeny was inferred by IQ-TREE version 2.0.3 [120] with the best substitution model TIM2e + G4 suggested by ModelFinder [121] and the ultrafast bootstrap approximation for 1000 iterations [122].

For the phylogenetic analysis of NLPs, putative Verticillium proteins that contain an NPP1 domain [64] were identified based on the homology to reference sequences in the Pfam database [123] (Pfam accession PF05630.16) using HMMER version 3.3.2 [96]. Putative secretion signals were identified with SignalP 6.0 [124] or SignalP 3.0 [125]. The deduced NLP amino acid sequences were aligned and trimmed as described for Av2. The phylogeny of NLPs was also inferred as described for Av2, but with the best substitution model VT + F + I + G4. NLP orthologs were numbered by the monophyletic relationship with NLP1 to NLP9 of V. dahliae strain VdLs17 [39,63] in the GenBank RefSeq database under the accession numbers listed in Supplementary Data 15.

Targeted deletion of NLPs from the Verticillium dahliae genome

To generate NLP6 and NLP3 deletion constructs, flanking sequences of each coding sequence were amplified from genomic DNA of V. dahliae strains VdLs17 and JR2, respectively, using primers listed in Supplementary Data 17. The amplified products were cloned into pRF-HU2 as described previously [126], and subsequent Agrobacterium tumefaciens-mediated transformation of V. dahliae was performed as described previously [127]. Transformants were selected on PDA supplemented with cefotaxime (Duchefa, Haarlem, The Netherlands) at 200 μg/ml and hygromycin (Duchefa) at 50 μg/ml, and homologous recombination was PCR-verified.
For genetic complementation, the coding sequence of NLP6 was cloned into the pFBT005 vector as previously described [128], after which the NLP6 deletion mutants were transformed using the A. tumefaciens-mediated transformation method described above.

Pathogenicity assays were performed on ten-day-old tomato (MoneyMaker) seedlings using root-dip inoculation as previously described [129]. Disease symptoms were scored up to 14 dpi, pictures were taken, and ImageJ was used to determine canopy areas, while fungal colonization was determined with real-time PCR. To this end, stem sections were taken at the height of the first internode, flash-frozen in liquid nitrogen, and ground to powder, after which genomic DNA was isolated. Real-time PCR was performed with a quantitative PCR core kit for SYBR Green I (Eurogentec, Seraing, Belgium) on an ABI7300 PCR machine (Applied Biosystems, Foster City, CA, U.S.A.). The V. dahliae internal transcribed spacer (ITS) levels were used relative to tomato ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) levels to quantify fungal colonization of tomato plants [36].

Statistics

Data were analyzed with R version 4.4.2 [130] and the R package “tidyverse” version 2.0.0 [131]. The normality and homoscedasticity of data were tested by the Shapiro–Wilk test and Bartlett’s test, respectively, with the R base package “stats” version 4.3.1 [130]. The multiple comparisons of data with non-normal distributions and unequal variances (p