Highly stable bacteriophages PIN1 and PIN2 have hallmarks of flagellotropic phages but infect immotile bacteria

Research Highlight. Open access. Published: 16 July 2025

Afif Jati1, Yan Li1, Andre Mu2, Tze Y. Thung1, Huanchang Chen1,3,4, Joel Steele5, Han Lee5, Ralf Schittenhelm5, Tieli Zhou3,4, Francesca L. Short1, Rhys A. Dunstan1 & … Trevor Lithgow1

npj Viruses volume 3, Article number: 56 (2025)

Subjects: Computational biology and bioinformatics; Microbiology

Bacteriophages (phages) are viruses that kill bacteria, with potential as antibacterial agents in industrial settings, agriculture, and human health. Here, we identified two phages, PIN1 and PIN2, that can kill clinical isolates of the human pathogen Klebsiella pneumoniae. The phages are highly stable; PIN2 in particular resisted multiple freeze-thaw cycles over 12 months without loss of activity. PIN1 and PIN2 are related to flagellotropic phages, an idiosyncratic group of viruses that bind to bacterial flagellae, but K. pneumoniae is an immotile pathogen that does not have flagellae. Genetic mosaicism is observed, wherein the long, flexible tail fiber of the flagellotropic phages has been substituted by a more compact tail fiber that binds the Klebsiella host through cell-surface capsular polysaccharide and lipopolysaccharide. PIN1 and PIN2 belong to the Yonseivirus group of phages, with initial analyses across the group suggesting further recent diversification in the tail-fiber cassette in the Yonseivirus genomes.

Introduction

Bacteriophages (phages) are viruses that infect bacteria. Commonly studied phages belong to the class Caudoviricetes and have the characteristics of a protein capsid housing a double-stranded DNA genome joined to a complex tail structure1,2.
The capsid needs to withstand a turgor pressure of around 50 atm exerted by the DNA contents as well as any environmental stresses, and, structurally, there are distinct types of reinforcement mediated by capsid proteins3,4,5 to maintain phage stability. There are four major capsid protein types, all formed around a common, general HK97 fold: (i) the true HK97-type capsids have covalently cross-linked major capsid proteins6, (ii) P22-type capsids have a specific interaction domain inserted within the HK97 fold of their capsid protein7, (iii) BPP1-type capsids have a major capsid protein of selectively permuted structural elements as compared to HK978, and (iv) the Chi/YSD1-type capsids which have an intricate chainmail formed from auxiliary proteins linking the HK97 fold of the capsid proteins9,10. As phages begin to be deployed in clinical settings for phage therapy of antimicrobial-resistant infections, the prediction and measurement of phage stability is increasingly important as it equates to shelf-life for these new therapeutics11,12,13.The other major architectural feature of Caudoviricetes is the phage tail. The function of the tail is to selectively engage a receptor on the surface of the host bacterium to initiate the infection process, and then permit and guide passage of the phage genome into the host’s cytoplasm2,14. Mathematical modeling of bacterial encounters with a phage has suggested that accidental collisions drive the infection process1,15,16,17,18 and, outside of laboratory settings, phage concentrations would be so low as to make these encounters rate-limiting for host infection. One mechanism by which some phages can improve the encounter rate with prospective host bacteria is to bind to large molecular structures on the bacterial cell surface: in effect, these structures increase the capture radius for a phage binding event. 
One such structure is the bacterial flagellum; these extraordinary molecular machines are 10–15 μm in length19,20, spin at 6,000-60,000 r.p.m.21, and their gyrating motions mean that they create a volume around a bacterial cell to which some phages can bind. Thus, these flagellotropic phages have an improved chance of encountering their presumptive host. Typically, multiple flagella are assembled into the envelope of a bacterial cell and may be peritrichous, i.e., randomly distributed, such as in Salmonella enterica22 or carefully localized, such as the polar flagellae found in bacteria such as Vibrio cholerae23.As a fundamental surface appendage for motility, flagellae are found throughout the bacterial kingdom24,25, and flagellotropic phages are known to infect and kill diverse species of bacteria26,27. Given the considerable mechanical stress that they must endure after receptor binding - these phages are spun at velocities that require withstanding extraordinary shear forces28 - we previously sought to understand the structural features that contribute to the stability of a flagellotropic virion9. The paradigm for these phages is Chi, which infects Salmonella enterica29,30, and the closely related phage YSD1 that differs from Chi by three non-structural genes, so the virions of Chi and YSD1 are nearly identical31. YSD1 virions have a single long tail fiber enabling them to bind to the flagella on Salmonella cells9,31,32,33. Immunogold labeling in electron micrographs revealed that the long tail fiber is formed by the protein designated as YSD1_29. Pre-treatment of YSD1 virions with the antibodies that bind YSD1_29 prevents Salmonella infection31. 
The structures of bacteriophage Chi10 and YSD19 were solved by cryo-electron microscopy, and these represent the structure of a group of closely related phages previously designated as Chi-like34,35.

Serendipitously, we discovered within the Chi-like flagellotropic phages a group that infects Klebsiella pneumoniae, a bacterium that does not have flagellae. In a survey of wastewater, two phages were discovered that formed plaques on an antimicrobial-resistant clinical isolate K. pneumoniae AJ218. Genome sequencing revealed that the two phages, PIN1 and PIN2, are highly similar to each other in sequence and possess structural proteins highly similar to those of flagellotropic phages, yet with distinct tail-associated proteins. A functional genomic screen suggested capsular polysaccharide is the primary receptor for PIN1 and PIN2, a feature that extends 340 nm from the cell surface36. With shelf-life being a major consideration in the usefulness of phages in biotechnology and therapy, we suggest that the sequence-based and structure-based findings on the highly stable phages PIN1 and PIN2 provide a starting point to develop machine-learning tools to predict phage stability from genome sequence data, irrespective of whether or not the phage is flagellotropic.

Results

Isolation of Klebsiella phages forming pin-prick plaques

Klebsiella pneumoniae AJ218 was isolated from a human urinary tract infection at the microbiological laboratory of the Alfred Hospital and found to be resistant to antiseptics such as chlorhexidine and to several antibiotics37. Using K. pneumoniae AJ218 as an isolation host, samples of wastewater from two independent sites yielded phages referred to hereafter as PIN1 and PIN2 (see “Methods” section). Both of the phages formed numerous, uniformly tiny plaques with a ‘pin-prick’ morphology (Fig. 1a).
The infection parameters of each phage were defined by the latent period (the time interval in which the first round of phage particles is assembled, the host is lysed, and the new virions are released) and the burst size (the total number of progeny virions that are released in this event). One-step growth curves on this strain revealed minor differences between the phages, such as a slightly longer latent period for PIN1 than for PIN2 (Fig. 1b).

Fig. 1: Isolation and characterization of phage PIN1 and phage PIN2. a Representative plaque morphologies after infection of K. pneumoniae AJ218 are shown for phages PIN1 and PIN2. The plaques were too small to measure accurately, but the zoomed view shows that there is little range in the plaque sizes, which, by best estimates, are approximately 500 μm in diameter. b One-step growth curves of PIN1 and PIN2 were obtained by co-incubation with the AJ218 host strain for 10 min at 37 °C for phage adsorption, after which the mixture was subjected to centrifugation to remove free phage particles. The resuspended cell-phage pellets were incubated at 37 °C and sampled at 10-min intervals for 90 min. PIN1 had a latent period of 40 min and a burst size of 114 phage particles/cell; PIN2 had a latent period of 30 min and a burst size of 107 phage particles/cell. Data points are the means of three biologically independent samples, and the error bars are the standard error of the mean. c Representative negative-stain TEM micrographs of one high-titer cesium chloride purified sample of phages PIN1 or PIN2. Scale bar: 100 nm. d High-titer stocks of PIN1 and PIN2 phages were adjusted to 10⁹ pfu/mL and stored at the indicated temperature for 12 months. At monthly intervals, samples were removed and incubated with K. pneumoniae AJ218 and plated to count plaques. The sample stored at −20 °C was subject to monthly freeze-thawing throughout the experiment in order to retrieve the aliquots to be tested.
Data plotted are the averages of three biological replicates. e Genome comparison and putative functional annotation of each PIN1 and PIN2 open-reading frame. The color-coding designates general functions, and the gray-scale shading between PIN1 and PIN2 highlights the sequence identity (%) between the corresponding regions in each genome. Functional annotations for the virion proteins detected by mass spectrometry are presented for PIN1 (Supplementary Table 1) and PIN2 (Supplementary Table 2). The distinct tail-associated proteins Pin1_34 and Pin2_34 are labeled.

Phage virions were purified on cesium chloride gradients and visualized by transmission electron microscopy. The virions of PIN1 and PIN2 were barely distinguishable, with PIN1 having an icosahedral capsid of 58 ± 2.3 nm in diameter and PIN2 being 61 ± 2.1 nm in diameter. Both phages had long, flexible tail-tubes of 214 ± 4.3 nm (PIN1) and 220 ± 7.5 nm (PIN2) in length based on measurements from n = 15 individual phages (Fig. 1c). Neither phage had the long fibrous protein characteristic of flagellotropic phages extending from the terminal end of the tail-tube, instead having more globular features decorating the tail-tip. To address the stability of the phages, replicate stocks were prepared at 10⁹ pfu/mL and stored at either room temperature (20–22 °C), refrigerated (4 °C), or frozen at −20 °C. Three biological replicates were tested for each of the phages at each of the time-points (Supplementary Fig. 1a). Each month, samples were removed from the stock and tested for virulence on the host K. pneumoniae AJ218. Equivalent measurements were made for two unrelated Klebsiella phages, BMac38 and RAD13. Both PIN1 and PIN2 showed high stability with no loss of infectivity across twelve months of storage at room temperature or refrigerated at 4 °C. Even the repeated freeze-thaw cycles for the −20 °C sample assessments did not impact the stability of PIN2 over the course of 12 months (Fig. 1d).
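As a rough illustration of how monthly plaque counts of this kind can be turned into a stability readout, the sketch below converts a plaque count into a titer and then into a log10 change relative to the starting stock. The function names, the dilution factor, the plated volume, and the example plaque count are hypothetical and are not taken from the study; only the 10⁹ pfu/mL starting titer comes from the text.

```python
import math


def titer_pfu_per_ml(plaques: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Standard plaque-assay arithmetic: plaques / (dilution factor x volume plated)."""
    if plaques < 0 or dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("counts must be non-negative; dilution and volume must be positive")
    return plaques / (dilution_factor * plated_volume_ml)


def log10_change(titer_now: float, titer_start: float) -> float:
    """Negative values mean loss of infectivity; -1.0 is a ten-fold loss."""
    return math.log10(titer_now / titer_start)


start_titer = 1e9  # starting stock titer stated in the text (pfu/mL)

# Hypothetical reading: 95 plaques from 0.1 mL of a 1e-6 dilution.
month_titer = titer_pfu_per_ml(plaques=95, dilution_factor=1e-6, plated_volume_ml=0.1)
print(f"titer: {month_titer:.2e} pfu/mL, log10 change: {log10_change(month_titer, start_titer):+.2f}")
```

On this scale, a stable stock stays close to zero from month to month, while a stock that loses infectivity drifts negative.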
Small differences in structural proteins have been linked to virion stability in other phages39,40, thus we sought to compare the sequences of PIN1 and PIN2.

Sequencing showed the PIN1 and PIN2 genomes to be 59,383 and 58,999 base pairs in length, respectively, and analysis of open-reading frames in the genomes of PIN1 and PIN2 predicted that they each encode 89 proteins (Supplementary Table 1, Supplementary Table 2). Analysis of the phage virions by mass spectrometry confirmed the identity of the structural proteins that are present in each of the phage virions, and these are compared in Table 1. These data, together with the comparison of the phage genome sequences (Fig. 1e), revealed that although PIN1 and PIN2 are highly related, the two phages are not identical. The overall average nucleotide sequence identity is 82%, but this varies considerably for several of the open-reading frames. An obvious example is the sequence encoding the putative tail-fiber protein in phage PIN1 (Pin1_34), which has no sequence similarity to the corresponding protein (Pin2_34) in phage PIN2. For some near-identical proteins, such as the major capsid proteins (Pin1_0021 and Pin2_0019), there is small sequence variation that may prove significant. Pin1_0021 and Pin2_0019 show 96% sequence identity, which is to say that they differ in 14 amino acid positions. In 8 of these differences, the residues in Pin2_0019 are identical to the residue found at the corresponding position of the major capsid protein of the flagellotropic phage YSD1 (Supplementary Fig. 1b).

Table 1: Mass spectrometric analysis of PIN1 and PIN2 virions

Both PIN1 and PIN2 are predicted to be lytic phages via BacPhlip41, with probability scores of 92% and 93% respectively. The prediction is consistent with experimental observations of lytic activity from the presence of clear plaques on the lawns of Klebsiella AJ218 and one-step growth curves that show clear signs of lysis (Fig. 1a, b).

PIN1 and PIN2 belong to the Yonseivirus group

Whole genome sequence analysis using ViPTree enabled a comparison to other related phages and showed that PIN1 and PIN2 belong to the genus Yonseivirus (Fig. 2a). Based on the viral proteomic phylogenetic tree, PIN1 was found to be closely related to previously isolated phage S9a from Spain42, whereas PIN2 was observed in a different cluster (Supplementary Fig. 2). A relationship was also indicated between phages PIN1 and PIN2 and the Chi-like virus group of phages represented by YSD1 and Chi that infect Salmonella Typhi as well as JS26 that infects Serratia sp. ATCC 3900643 (Fig. 2a). These observations were unexpected given that YSD1, Chi, and JS26 are flagellotropic phages, yet Klebsiella pneumoniae has no flagellae.

Fig. 2: Phage PIN1 and phage PIN2 are relatives of flagellotropic phages. a A genome-based tree of the Casjensviridae phage family, whose members infect hosts across 25 bacterial genera, was generated using ViPTree. The tree was reconstructed using 151 phage genomes that belonged to the Casjensviridae family and were collected from the NCBI genome database (downloaded as of 1st June 2024) and also included the PIN1 and PIN2 genomes. The branch lengths represent genomic similarity based on normalized tBLASTx scores plotted on a logarithmic scale (scale bar denotes 0.1). The bacterial hosts for each phage are colored in the tree nodes. PIN1 and PIN2 are indicated, as are flagellotropic phages YSD1 and JS26. Full annotation of all phage names is presented in Supplementary Fig. 2. b Viral Clusters (VC) as determined by the Markov Clustering Algorithm of all viral sequences are projected onto a Cytoscape Organic Layout network sample space. Nodes represent viral genomes and are colored by their VC grouping, while an edge (i.e., connecting line) indicates an association with the respective VC groups. Phages Chi and YSD1 from VC1, and PIN1 and PIN2 from VC8 are specifically annotated.
Those VCs containing phage with a YSD1_29 homolog as the putative tail fiber are highlighted in Supplementary Fig. 3.

To address the sequence relationships with an independent approach, we analyzed the genome sequence data set to classify Viral Clusters (VC). By virtue of the calculated relationship, a VC corresponds to a viral population sharing 90% sequence identity over approximately 75% of the aligned genome sequence44. For the phage genomes presented in Fig. 2a, nucleotide sequence data were subjected to an all-against-all BLAST-based analysis according to the published method. All singleton sequences - those not meeting threshold criteria for a close relationship to any phage - were removed from the data. This previously published graph-based clustering method, which classifies the genetic relationships across broad sets of phages, identified fourteen discrete clusters (Fig. 2b). VC1 contained a total of 39 phages including YSD1 and Chi (Fig. 2b). A diagnostic feature of this group is the presence of a putative tail-fiber protein, homologous to YSD1_29, that ensnares the bacterial flagellum, thereby serving a receptor-binding function for docking to the host9,31. Sequence analysis demanding at least 90% sequence similarity across at least 90% of the open-reading frames identified such tail-fiber protein sequences in 37 of the 39 phages in VC1, and also in all 5 of the phages in VC7 (Supplementary Fig. 3, Supplementary Table 3). In VC1, phage 35 and phage 37 have deletions within their genome sequences truncating the putative YSD1_29 homolog, such that only 337 amino acid residues are encoded in phage 35, and only 480 amino acid residues are encoded in phage 37 (Supplementary Fig. 3). Of note, one Klebsiella phage (vB_KaeS_Phraden) was identified by this stringent assessment for a YSD1_29 tail fiber and belongs to VC1 (Fig. 2a, b).
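The Viral Cluster thresholds described earlier in this section (about 90% identity over about 75% of the aligned genome, with singletons discarded) can be illustrated with a deliberately simplified sketch. It groups genomes by connected components of a thresholded similarity graph, which is a swapped-in stand-in and not the Markov Clustering Algorithm named in the Fig. 2 legend. The input format, the function name, and the toy genome names and values are all assumptions made for illustration.

```python
from collections import defaultdict


def cluster_genomes(pairs, min_identity=90.0, min_coverage=75.0):
    """Group genomes linked by pairwise hits that pass both thresholds.

    pairs: iterable of (genome_a, genome_b, percent_identity, percent_coverage).
    Genomes with no qualifying hit never enter the graph, so singletons are dropped.
    """
    graph = defaultdict(set)
    for a, b, identity, coverage in pairs:
        if a != b and identity >= min_identity and coverage >= min_coverage:
            graph[a].add(b)
            graph[b].add(a)

    clusters, seen = [], set()
    for node in sorted(graph):
        if node in seen:
            continue
        # Depth-first walk collects everything reachable from this genome.
        stack, members = [node], set()
        while stack:
            current = stack.pop()
            if current in members:
                continue
            members.add(current)
            stack.extend(graph[current] - members)
        seen |= members
        clusters.append(sorted(members))
    return clusters


# Hypothetical pairwise hits; phage_F fails the identity threshold and is dropped.
toy_hits = [
    ("phage_A", "phage_B", 95.0, 80.0),
    ("phage_B", "phage_C", 92.0, 78.0),
    ("phage_D", "phage_E", 91.0, 76.0),
    ("phage_F", "phage_A", 60.0, 90.0),
]
print(cluster_genomes(toy_hits))
```

Connected components merge any genomes joined by a chain of qualifying hits, so this stand-in will tend to produce coarser groups than a flow-based method such as MCL on real data.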
The isolation host for phage Phraden was Klebsiella aerogenes, a species of Klebsiella unique in that it expresses flagellae due to the flagellar assembly operon having been acquired from the members of the genus Serratia45. Conversely, phages PIN1 and PIN2 were found to cluster in VC8, together with other phages that infect Klebsiella pneumoniae and other immotile species of Klebsiella (Fig. 2b).

A genome mosaic in receptor binding

There is a high degree of sequence-based similarity in the structural proteins of PIN1 and PIN2 phages when aligned with phage YSD1 (Fig. 3a). The structure of the YSD1 virion has been solved by single-particle cryo-electron microscopy9, and this information adds confidence to the annotations for each of the proteins in the three phage sequences. The major capsid protein sequence is nearly identical between the three phages, and there is a relatively high degree of similarity in the other proteins of the capsid and the major tail-tube protein and tape-measure protein (Fig. 3a). Despite this clear relatedness, there is also a mosaicism in the sequence alignments.

Fig. 3: A genetic mosaic provides a distinct tail-spike and tail fiber in the Yonseivirus. a Genome sequence comparison of PIN1, PIN2 and YSD1 open-reading frames. The color-coding designates the structural proteins according to predicted function and location in the capsid, portal-connector, and tail structures of the respective phages. For YSD1, the functional annotations are detailed as validated by the structural information9. The gray-scale shading between PIN1 and PIN2 highlights the sequence identity (%) between the corresponding regions in each genome. The virion proteins, as detected by mass spectrometry, are presented comparatively for PIN1 and PIN2 (Table 1). b AlphaFold was employed to predict the structures of the tail fiber, Pin1_32, and Pin2_30.
The ribbon diagrams are colored according to the standard scale for confidence, with dark blue representing high-confidence predictions and yellow and red less confident predictions. An inset shows the structural homology assessed with FoldSeek, where the Pin1_32 structure (blue) was used as the query and the PDB100 database was used as the reference. The best match was J-protein from Lambda phage (“J-lambda”, colored yellow), with an RMSD of 5.69 Å. Further detail of the comparative structural analysis is shown in Supplementary Fig. 4. c Genome alignments across the Yonseivirus. The color-coding designates proteins according to predicted function, and the gray-scale shading highlights the sequence identity (%) between the corresponding regions in each genome. Thus, the white gaps represent no sequence identity.

Electron microscopy of YSD1 virions revealed a long, flexible fiber emanating from the tail-spike, and immunogold labeling of this long, flexible fiber revealed it to be YSD1_2931. Small-angle X-ray scattering (SAXS) of the purified YSD1_29 protein and cryo-electron microscopy of the identical homolog BH2P in situ on virions of Chi have revealed a compact N-terminal structure and a C-terminal region that is too flexible to be analyzed10,31. There is no sequence similarity to YSD1_29 in the corresponding open-reading frames in PIN1 and PIN2. Despite showing no sequence similarity, the positional arrangement would suggest that Pin1_32 and Pin2_30 are the corresponding receptor-binding proteins for phages PIN1 and PIN2, respectively (Fig. 3a). The absence of any sequence relatedness is reminiscent of the characteristic chimerism seen in the evolution of new phage properties46.

When AlphaFold was applied to predict structures for Pin1_32 and Pin2_30 (see “Methods” section, Supplementary Fig. 4), high-confidence predictions were evident throughout the compact, globular protein structure (Fig. 3b).
These predicted structures for Pin1_32 and Pin2_30 are nearly identical, consistent with the high sequence similarity observed between the proteins. The structures of Pin1_32 and Pin2_30 are similar to a group of phage proteins that function in attachment to the host, the best characterized being tip attachment protein J from λ phage (J-Lambda), based on structural homology search using FoldSeek47 and sequence-based protein domain search using InterProScan48 (Supplementary Fig. 4). A structural alignment of Pin1_32 and J-Lambda shows considerable structural similarities, despite a rotational difference shifting the C-terminal domains apart (Fig. 3b). An equivalent overlay for Pin2_30 and J-Lambda is shown in Supplementary Fig. 4.

Just as the YSD1_29 related receptor-binding proteins are conserved across the Chi-like viruses in VC1, the putative PIN1 and PIN2 receptor-binding proteins are likewise conserved across the genus Yonseivirus of phages targeting Klebsiella (Fig. 3c). We note also that there is a complex pattern of conservation and divergence for the tail-spike protein Pin1_34, which has less than 20% sequence identity to the corresponding protein in phage S9a, with neither of these proteins showing any sequence similarity to the corresponding proteins in the other phages (Fig. 3c). Consistent with this, the tail-spike proteins of phage PIN2 (Pin2_34) and of phage vB_Kpn_GBH054 each have no sequence similarity to the corresponding protein in any other Yonseivirus phage (Fig. 3c). Nonetheless, the sequence-based and structural commonalities between the putative receptor-binding proteins Pin1_32 and Pin2_30 suggest that the two phages use a common receptor to bind to the Klebsiella cell surface.
We sought to identify this receptor.

The cell surface receptor for phages PIN1 and PIN2

To determine the receptors for phage PIN1 and PIN2, a transposon-directed insertion site sequencing (TraDIS) library was constructed in K. pneumoniae AJ218 (Fig. 4a). The library was subjected to co-incubation with PIN1 or PIN2, and after 5 h of growth in the presence of phage, surviving mutants from the library were isolated and sequenced. Both TraDIS experiments showed a collapse in the number of unique insertion sites following phage treatment, consistent with positive selection for rare mutants in which phage adsorption or replication was compromised. A total of 83 genes were identified, which, when disrupted by transposon insertion, conferred survival in the presence of phage PIN1. A total of 42 genes were essential for infection by phage PIN2, of which 40 were also identified in the phage PIN1 screen (Fig. 4b, Supplementary Fig. 5). PIN1 infection-related genes, in addition to being more numerous than for PIN2, were also generally associated with higher fold-change increases, suggesting that PIN1 treatment may have placed a stronger selection on the initial mutant library than PIN2 for unknown reasons. Genes identified as important for PIN1 or PIN2 infection (Supplementary Table 4, Supplementary Table 5) were categorized into functional groups, and for both PIN1 and PIN2, these groups included genes encoding proteins involved in biosynthesis of the surface-exposed K-antigen and the O-antigen, which could serve as cell surface receptors for phage binding. In addition, several genes previously identified as required for full capsule expression, e.g., rcs and mla components49, were identified as required for infection by PIN1 and PIN2.

Fig. 4: Identification of O-antigen (LPS) and K-antigen (capsule) as receptors for phage PIN1 and phage PIN2. a Schematic depiction of the TraDIS screen of K. pneumoniae AJ218.
A control culture (uninfected) was processed alongside cultures that were infected with either phage PIN1 or phage PIN2 at a multiplicity of infection (MOI) of 10 phage per bacterial cell. After selection in the presence of phage for 5 h at 37 °C, the bacterial cells were collected and processed for DNA sequencing (see “Methods” section). b Summary of genes identified in the two TraDIS screens, with 40 genes being identified as common to the infection process by phages PIN1 and PIN2. c Volcano plot representing the transposon insertions in each gene of the genome as a log2 fold increase, for genes in which more insertions were detected in the +PIN1 culture. Functional relationships for these genes are color-coded with genes for O-antigen biosynthesis (light green), CPS biosynthesis (red), ECA biosynthesis (blue), and other genes (gray). On this plot, the q-value represents the false discovery rate with 5% as the cutoff. Equivalent data for the phage PIN2 TraDIS screen are shown in Supplementary Fig. 5. d The components of the capsular polysaccharide synthesis and secretion machinery in K. pneumoniae are encoded by genes (represented by orange arrows) that cluster together on the bacterial chromosome as the cps locus. The three graphs depict the normalized data of transposon insertions at the cps locus as seen in the control culture (red), +PIN1 culture (green), and +PIN2 culture (blue). e Spot assays of diluted PIN1 and PIN2 phage preparations onto top agar layers containing K. pneumoniae AJ218 or an isogenic mutant in which the wzc gene has been deleted (AJ218 Δwzc). The PIN1 and PIN2 stocks used to prepare the serially diluted samples were ∼10¹⁰ PFU/mL. Three independent biological replicates were performed for both PIN1 and PIN2 spot tests.

Extending from the cell surface, the O-antigen is an oligosaccharide of defined composition that is linked to the core lipopolysaccharide.
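The selection logic summarized for the TraDIS screen (a log2 fold increase in insertions after phage treatment, filtered at a 5% false discovery rate) can be sketched in a few lines. This is only an illustration under stated assumptions: the pseudocount, the requirement of any positive fold change, the function names, and the toy read counts and placeholder gene names are hypothetical, and the sketch is not the analysis pipeline used in the study. The gene-set arithmetic at the end uses the counts reported in the text.

```python
import math


def log2_fold_change(treated_reads: float, control_reads: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of insertion reads, with an assumed pseudocount to avoid division by zero."""
    return math.log2((treated_reads + pseudocount) / (control_reads + pseudocount))


def enriched_genes(stats, q_cutoff=0.05, min_log2fc=0.0):
    """stats maps gene -> (control_reads, treated_reads, q_value).

    Returns genes with more insertions after phage treatment that also pass the FDR cutoff.
    """
    return {
        gene
        for gene, (control, treated, q_value) in stats.items()
        if q_value < q_cutoff and log2_fold_change(treated, control) > min_log2fc
    }


# Hypothetical counts: only wzc passes both filters.
toy_stats = {
    "wzc": (10, 1000, 0.001),
    "gene_x": (500, 480, 0.9),
    "gene_y": (20, 200, 0.2),
}
print(enriched_genes(toy_stats))

# Overlap arithmetic from the reported gene counts.
pin1_total, pin2_total, shared = 83, 42, 40
pin1_only = pin1_total - shared            # genes found only in the PIN1 screen
pin2_only = pin2_total - shared            # genes found only in the PIN2 screen
union = pin1_total + pin2_total - shared   # genes found in either screen
print(pin1_only, pin2_only, union)
```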
The same basic oligosaccharide can also be found in the periplasm as the Enterobacterial Common Antigen (ECA), and genes common to O-antigen biosynthesis were identified in the screen (Fig. 4c). Further support that the O-antigen is the relevant structure comes from the observation that lipid A biosynthetic mutants were also recovered as being phage resistant. Klebsiella pneumoniae has an intricate multi-component machinery for the synthesis and secretion of polysaccharides that correspond to the K-antigen, forming a protective capsule that extends further outward than the O-antigen50,51,52. Most of the components of the capsule synthesis and secretion machinery are encoded by genes that cluster together on the bacterial chromosome at the cps locus53. Inspection of the reads from the TraDIS screen around the cps locus confirmed that insertions in almost all of the genes in the cps locus conferred phage resistance (Fig. 4d). In order to independently address how important capsular polysaccharide is for phage binding, a bacterial mutant was engineered in which the wzc gene had been deleted. This mutant form of Klebsiella AJ218 was resistant to phage infection (Fig. 4e).

Discussion

This study used Klebsiella pneumoniae AJ218 as an isolation host, a clinical isolate that has been characterized in terms of fimbrial expression54, porin expression55, acquisition of metal ions56, and by atomic force microscopy to address its extensive extracellular capsular polysaccharide36. In a functional genomics screen, infection of K. pneumoniae AJ218 by phages PIN1 and PIN2 was shown to depend on the presence of the K-antigen (i.e., the capsular polysaccharide) and O-antigen (i.e., the lipopolysaccharide) displayed on the cell surface. These surface features, therefore, stand as receptors that the phage binds in order to interact productively with the outer membrane.
Only after this interaction with its receptors do the phages initiate the conformational switches to release their DNA from the capsid to be translocated into the cytoplasm of the bacterial host cell, for the replication and transcription processes that will deliver new phage progeny. The lag times and burst sizes of each of the two phages are similar, though not identical. The pin-prick plaque morphology is identical for both phages, and the genome sequences revealed two highly similar complements of proteins encoded by PIN1 and PIN2, and in a group of related phages isolated from other countries of the world.Phages PIN1, PIN2, and the other species of Yonseivirus all infect Klebsiella hosts, yet showed conservation with flagellotropic phages across most of the structural proteins of the capsid and the tail-tube. As is the case in attempts to disentangle evolutionary trajectories for other phages57, there is insufficient sequence information to hypothesize a trajectory for the evolution of these phages. Were the ancestors flagellotropic, but a lineage then evolved by acquiring distinct proteins at the baseplate and tail-tip to bind Klebsiella K- and O-antigens? Or was it the YSD1_29 type tail fiber that evolved more recently to thereby provide for selection of a new set of flagellotropic phages?Genetic mosaicism describes scenarios such as the PIN1/PIN2 phage comparison to phage YSD1, wherein the genomes share tracts of high sequence conservation punctuated by regions of dissimilar sequence58,59. By and large, the extraordinary mosaicism in phage groups is anticipated to be mediated through recombination events that can either occur during co-infection of a host by two distinct phages46,60, or occur between extant prophages and new, incoming phages61,62. 
While specific examples have been detailed, there is an appreciation that such recombination would provide a means to change receptor-binding activities in the tail structure of phage virions, adapting the phages to new host-binding capabilities2,63,64,65. Receptor binding depends on specific interactions, with two-step receptor binding seemingly contributing to the quality control that ensures selectivity in host binding66,67. Based on spatial consideration of how extensive the capsular polysaccharide is relative to the lipopolysaccharide, phages PIN1 and PIN2 would first encounter the K-antigen receptor and thereafter the O-antigen receptor. Like their flagellotropic relatives, phages PIN1 and PIN2 would then benefit from the increased capture radius (promoting random encounters) provided through their receptor’s extension from the cell surface1,15,18,36.The emerging biotechnological and clinical needs for phages will require virions that have maximal stability for shelf-life68,69,70,71,72. Many phages are quite unstable and degrade at a rate of ten-fold loss in less than a month72,73. This presents a need for investigations into the features that provide for phage stability. Flagellotropic phage experience both the torque that actuates the movement of the phage relative to the flagellum and a substantial drag force through the surrounding fluid, which is the resistance to this movement28. Single-particle cryo-EM structures of the flagellotropic phages YSD1 and Chi have shown structural elements, referred to as chainmail armor, that contribute to the stability to withstand these forces9,10. Studies on other phages have revealed four types of “chainmail” structure in the capsid proteins that can stabilize phage virions3,4,5,6,7,8,9,10. The structural elements that provide these chainmail links are contributed by loops of additional sequence, and such sequences are attractive features for recognition by machine-learning and other AI tools74,75,76,77. 
Such a tool might be optimized to be a predictor of phage stability, providing an initial screening step of genome sequence data for phages preadapted to have useful shelf-life properties.

In the case of the YSD1 capsid chainmail, the structure is stabilized by hook structures in the cement protein that link into neighboring major capsid protein subunits, as well as loops that together form the molecular chainmail9. With similar sequence features in the capsid found also in phages PIN1 and PIN2, their capsids may contribute toward the measured stability, which was seen to be high for both PIN1 and PIN2. While the phages have highly similar capsids, even a small number of sequence variants may contribute to the stability differences observed. The PIN1 phage showed only moderate stability through repeated freeze-thaw cycles, while PIN2 phages were unaffected by this storage regime. The strength of connections between the capsid and the tail has also been highlighted as a feature for stability of other phages39,40, but we observed that the head-tail connector protein is identical in PIN1 and PIN2.

In addition to their intrinsic structural properties, phage shelf-life can be boosted by additives to promote virion stability, and, in the pharmaceutical industry, most proteins and protein-based biologicals are shipped and sold in powder form68,69,70,71,72,78. Phages can be prepared in dry powder forms using spray drying techniques in which sugars or amino acids can be stabilizing additives. Other examples include the use of freeze drying (lyophilization), emulsions such as for ointments, the formation of polymeric microparticles or nanoparticles, and liposomes in the process of suspending phage preparations. 
Irrespective of the formulations employed to fit with regulatory approvals, starting with intrinsically stable phages and having a means to determine the intrinsic stability of a phage would contribute to better products.

Phages are present on Earth in astronomical numbers and are found in diverse environmental settings from deserts to deep-sea trenches79,80. One secret of their success is the pervasive mosaicism built into their genomes, reflecting ongoing and rapid evolution to interact with new host species and counter newly encountered anti-phage defenses81,82,83. In terms of diversification, the phage tail-tip is a multi-component molecular machine that engages with the host bacterium and subsequently triggers release of the genomic DNA; engagement with the host-cell surface will result in conformational changes that ultimately unlock the portal to release the DNA, in a process that appears to have multiple quality control check-points2,59. That the molecular sequence of events can be so well coordinated despite the process of recombination-driven evolution delivering new components into the tail-tip is a remarkable feat of biology that we are only beginning to understand.

Methods

Bacterial strains, phage isolation, and infection of Klebsiella

The Klebsiella clinical isolates from the Alfred Hospital collection have been characterized previously37,84. Wastewater samples collected from Koo Wee Rup and Lang Lang (Victoria, Australia) were centrifuged at 10,000 × g for 10 min and filtered through a 0.45 µm cutoff filter. The filtered water samples (45 mL) were subsequently mixed with 5 mL of 10× concentrated Luria-Bertani (LB) media and 1 mL of a Klebsiella AJ218 overnight culture and grown for a further 16 h at 37 °C. Cellular debris was removed by centrifugation at 10,000 × g for 10 min, and the resultant clarified supernatant was passed through a 0.45 µm filter. 
Phage infectivity via spot and plaque assays was subsequently measured according to a published procedure85. The water sample from Lang Lang yielded phage PIN1, and the water sample from Koo Wee Rup yielded phage PIN2.

Phage amplification, purification, and phenotyping

Phages were amplified and purified on discontinuous caesium chloride (CsCl) gradients (2 mL of 5.6 M, 1.5 mL of 4 M, and 1.5 mL of 3.6 M CsCl in SM buffer) in a Beckman SW41 centrifuge tube. Gradients were centrifuged at 22,000 rpm for 2 h at 4 °C, and the samples harvested as previously described85. For one-step growth curve experiments, mid-log-phase cultures were adjusted to an optical density at 600 nm (OD600) of 0.5, the cells harvested by centrifugation, and suspended in 0.1 volume of SM buffer. Phage lysate was subsequently added at a multiplicity of infection (MOI) of 0.01 and was allowed to adsorb for 10 min at 37 °C. Following centrifugation at 12,000 × g for 4 min, the pellet was washed twice with SM buffer, resuspended in 30 mL of fresh LB broth, and incubated at 37 °C. Samples were collected at 10-min intervals for 120 min and titrated to determine PFU per milliliter as previously described86. Plaque morphology was visualized after plaque assays via liquid infections and top agar overlays38. Infection plates were subsequently imaged with a Phenobooth (Singer Instruments) using the default camera settings.

To assess phage virion morphology by electron microscopy, CsCl-purified phage samples were prepared as described previously85. Purified high-titer phage preparations (4 μL) were added to freshly glow-discharged CF200-Cu Carbon Support Film 200 Mesh Copper grids (ProSciTech) for 30 s. The sample was blotted from the grid using Whatman filter paper, and samples were subsequently stained with 4 μL of Nano W Methylamine Tungstate (Nanoprobes) for 30 s and blotted again. Grids were imaged using a Tecnai Spirit G2 transmission electron microscope (Tecnai). 
Size measurements of bacteriophage PIN1 and PIN2 were performed using the “DigitalMicrograph Software (DM3)” within the Gatan Microscopy Suite. Fifteen individual phages were measured for head diameter, tail length, and tail width. The measurements were then used to calculate mean and standard deviation (SD) values, providing a statistical assessment of phage dimensions.

Phage genomic DNA extraction, sequencing, and annotation

Phage genomic DNA was isolated from 1.8 mL of phage working stock lysates (~10¹⁰ pfu/mL) using a method of phenol-chloroform extraction as detailed previously31,38. The purity and concentration of the extracted DNA were measured using Nanodrop and Qubit double-stranded DNA BR assay kit (Thermo Scientific), and whole genome sequencing was performed by Victorian Clinical Genetics Services, using the Illumina NovaSeq 6000 platform with a paired-end run of 2 × 150 bp as previously described87. Genomic sequencing data were processed and analyzed via the Galaxy web platform using the public server usegalaxy.org.au88. Initial quality control of raw reads was performed with FASTQC (v.0.12.1)89. Reads were subsequently trimmed using fastp (v.0.24.0)90 and subsampled using the seqtk toolkit91. De novo genome assembly was conducted using Shovill (v.1.1.0)92.

All phage genomes were annotated using pharokka (v.1.7.5)93 in which PHANOTATE94 was used for the gene prediction. Virulence gene screening, antimicrobial resistance gene screening, and functional annotation of predicted genes were carried out using MMseqs295 against VFDB96 (virulence factors database), CARD97 (Comprehensive Antibiotic Resistance Database), and PHROGS98 (phage proteins database), respectively. All sequence alignments and visualizations were performed with Clinker99. To assign functional predictions for each PIN1 and PIN2 protein, sequence homology searches were conducted using HHpred100 and blastp against the Uniprot101 database with default settings. 
The genomic sequencing data were deposited on the NCBI GenBank database as Klebsiella phage PIN2 (PQ803401), Klebsiella phage PIN1 (PQ803402), and Klebsiella phage RAD13 (PQ803403). The viral proteomic tree was reconstructed using the ViPTree webserver102, incorporating a total of 153 phage genomes, including PIN1 and PIN2. The remaining 151 phage genomes were retrieved from the INPHARED database103 (accessed in June 2024) and filtered to include only members of the Casjensviridae family. The resulting phylogenetic tree was annotated and visualized using iTOL v7104. Multiple sequence alignment of the major capsid proteins YSD1_17, PIN1_20, and PIN2_18 was performed using ESPript 3.0.105, with YSD1_17 (PDB: 6XGP) secondary structure elements as a reference. The alignment was visualized with strict identity (red boxes) and conservation (similarity score >0.7, red text) highlighted.

A BLAST database (makeblastdb) was created with the taxonomy limited to “all viral sequences” and used to execute an all-against-all pairwise comparison. BLAST (v2.14.1) hits with E-values ≤0.001 were retained for further analysis. A strict threshold of 90% sequence similarity and 75% coverage was used to further filter sequences. A graph-based clustering algorithm, Markov Clustering Algorithm (MCL; version 14-137) with an inflation value of 6.0, was used to determine Viral Clusters as defined previously44. Singleton viral sequences were removed, and the remaining clusters were visualized using Cytoscape (version 3.10.2) with an Organic Layout. BLASTP was used to identify the presence (defined as >90% sequence similarity and >90% coverage) of the YSD1_29 distal tail protein across the set of viral sequences in this study. 
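The hit-filtering criteria described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the six-column tab-separated layout (as would be produced by a custom BLAST tabular format such as `-outfmt "6 qseqid sseqid pident length qlen evalue"`), the use of percent identity as a stand-in for "sequence similarity", and the definition of coverage as alignment length divided by query length are all assumptions made for illustration, as are the names `Hit`, `parse_hits`, and `passes`.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str
    subject: str
    pident: float  # percent identity (assumed proxy for "sequence similarity")
    length: int    # alignment length
    qlen: int      # full query length
    evalue: float


def parse_hits(text):
    """Parse tab-separated lines: qseqid, sseqid, pident, length, qlen, evalue."""
    hits = []
    for line in text.strip().splitlines():
        q, s, pident, length, qlen, evalue = line.split("\t")
        hits.append(Hit(q, s, float(pident), int(length), int(qlen), float(evalue)))
    return hits


def passes(hit, min_ident=90.0, min_cov=75.0, max_evalue=1e-3):
    """Apply the E-value, similarity, and coverage thresholds to one hit.

    Coverage is computed as alignment length / query length (an assumption;
    the text does not define how coverage was calculated).
    """
    coverage = 100.0 * hit.length / hit.qlen
    return (
        hit.evalue <= max_evalue
        and hit.pident >= min_ident
        and coverage >= min_cov
    )
```

Under these assumptions, the clustering filter corresponds to the default arguments, and the stricter YSD1_29 search corresponds to calling `passes(hit, min_ident=90.0, min_cov=90.0)`. Note that the sketch uses inclusive comparisons, whereas the YSD1_29 criterion is stated with strict inequalities.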
First, the phage sequences were all annotated using Pharokka (version 1.3.0), and the subsequent protein files were used to create a blastp (makeblastdb) database in which to query the reference YSD1_29 sequence.

Mass spectrometry

Analysis of virion protein composition by mass spectrometry was as previously described85. Briefly, CsCl-purified phage samples were solubilized in sodium dodecyl sulfate (SDS) lysis buffer (4% SDS, 100 mM HEPES pH 8.5) and sonicated to assist protein extraction, and, after further purification, the samples were proteolytically digested with trypsin (Promega) and peptides then purified prior to LC-MS/MS analysis. To obtain peptide sequence information, the raw files were searched with Byonic v3.0.0 (ProteinMetrics) against the K. pneumoniae AJ218 GenBank file NZ_LR130541.1 that was appended with the phage protein sequences. Only proteins falling within a false discovery rate (FDR) of 1% based on a decoy database were considered for further analysis.

AlphaFold predictions and functional prediction of phage proteins

AlphaFold predictions were made using the ColabFold tool, version 1.5.2-patch106, as updated 12/06/23, which relies on the MMseqs2 and AlphaFold2 systems in conjunction with Google Colaboratory to predict protein folding. Each query sequence was run as a monomer using the amino acid sequence. Settings were unchanged, other than using the pdb100 template mode to detect potential templates already present. Predicted proteins were visualized using the ChimeraX software, version 1.5107. The functional prediction of the phage proteins PIN1_32 and PIN2_30 was performed using structural and sequence-based approaches. Structural predictions from AlphaFold were analyzed with FoldSeek47 by uploading the PDB files to find functional similarities. Additionally, protein domains were predicted using InterProScan5 by submitting the amino acid sequences to the web server (https://www.ebi.ac.uk/interpro/).

Construction of K. pneumoniae AJ218 transposon mutant library

The library of K. pneumoniae AJ218 transposon mutants was constructed using a modified protocol from a previous study49. A Tn5-based transposon with a chloramphenicol resistance marker (cat gene) was delivered via the plasmid pDS1028. This plasmid was first introduced into an auxotrophic E. coli JKE201 strain, which required diaminopimelic acid (DAP) for growth, and was chosen as the donor strain. For conjugation, overnight cultures of the donor strain were grown on LB agar supplemented with 150 μM 2,6-DAP and 34 μg/mL chloramphenicol at 37 °C. The recipient strain, wild-type K. pneumoniae AJ218, was cultured on LB agar at 30 °C to minimize capsule overproduction. Colonies of both strains were harvested and resuspended with 1 mL of 0.9% sterile saline per plate for conjugation.

The resuspended donor and recipient cells were mixed at a 2:1 OD600 ratio in 0.9% sterile saline. Aliquots (50 μL) of the mix were spotted onto LB agar plates containing 150 μM DAP. The plates were incubated at 37 °C for 1 h to facilitate mating. After incubation, conjugation spots were resuspended in 1 mL of 25% glycerol in 0.5× PBS, collected as the transconjugant mix, and flash-frozen for the selection step. Conjugation efficiency was estimated by plating the transconjugant mix on LB agar with 34 μg/mL chloramphenicol and on non-selective LB agar to determine total viable counts.

Transconjugants were selected by plating aliquots of the thawed transconjugant mix onto selective media containing 34 μg/mL chloramphenicol, prepared in 245 mm square bioassay dishes. Approximately 14,000 mutant colonies were seeded per plate and incubated overnight at 37 °C. Following the bioassay, ~504,000 mutant colonies were harvested, resuspended in 25% glycerol with 0.5× PBS, aliquoted into cryovials, and flash-frozen for subsequent experiments. 
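The plate-count arithmetic behind the conjugation step can be sketched in a few lines of Python. This is an illustrative calculation only: the function names are hypothetical, conjugation efficiency is assumed to be the ratio of chloramphenicol-resistant CFU to total viable CFU (the text does not give the formula), and the number of bioassay dishes is inferred from the reported colony totals rather than stated in the text.

```python
import math


def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Back-calculate CFU/mL from a colony count on one plate."""
    return colonies * dilution_factor / plated_volume_ml


def conjugation_efficiency(selective_cfu_per_ml, total_cfu_per_ml):
    """Fraction of viable cells that grow on selective medium (assumed definition)."""
    return selective_cfu_per_ml / total_cfu_per_ml


def plates_needed(total_colonies, colonies_per_plate):
    """Number of plates required to reach a target library size."""
    return math.ceil(total_colonies / colonies_per_plate)


# Reported figures: ~14,000 colonies per dish and ~504,000 colonies harvested.
dishes = plates_needed(504_000, 14_000)
```

With the reported figures, `dishes` works out to 36, which is the number of bioassay dishes implied by the colony counts under this reading.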
Library quality control was performed by randomly selecting 100 colonies for colony PCR to verify transposon integration and confirm the loss of the pDS1028 plasmid. The final K. pneumoniae AJ218 transposon mutant library stock was estimated to have a size of 2.25 × 10¹¹ CFU/mL, based on total viable counts.

TraDIS to identify essential host genes for phage infection

A high-density transposon mutant library of Klebsiella AJ218 containing an average of ~144,000 unique insertion sites was used to define mutants surviving treatment with phages PIN1 or PIN2. Further sequencing analysis of the K. pneumoniae AJ218 transposon mutant library revealed a high density of insertion sites, with an average of one insertion occurring approximately every 38 bp across the genome. Three cultures were grown in LB medium, each inoculated with 10⁹ bacterial cells from the transposon library stock and 10¹⁰ viral particles (MOI 10). Cultures were grown for a further 5 h, centrifuged (6000 × g, 10 min, at 4 °C), and the cell pellets washed in 10 mM Tris. Genomic DNA was isolated from each cell pellet by phenol-chloroform extraction using 15 mL phase lock tubes (Qiagen). Two micrograms of each gDNA preparation were used to prepare transposon-specific sequencing libraries using primer FS108 for specific amplification of transposon junctions as described previously49,108. DNA libraries were sequenced using the Illumina MiSeq platform with primer FS107 as described previously108,109.

TraDIS analysis to identify receptors was performed as described previously38,110. Reads from transposon-gDNA junctions were mapped to the Klebsiella AJ218 genome (GenBank accession no. NZ_LR130541.1) using the BioTraDIS pipeline with the parameters “-v smalt_r -1 -t TAAGAGACAG -mm 1” and assigned to genomic features, with reads mapping to the 3′ 10% of the gene ignored. 
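As a consistency check on the figures reported for the phage challenge, the short Python sketch below relates the inoculum sizes to the stated MOI and the number of unique insertion sites to the mean insertion spacing. It reads the inoculum as 10⁹ bacterial cells and 10¹⁰ viral particles, which is the reading consistent with the stated MOI of 10. The function names are hypothetical, and the genome length it returns is only the value implied by the two reported numbers, not a measured length for NZ_LR130541.1.

```python
def multiplicity_of_infection(phage_particles, bacterial_cells):
    """MOI is the ratio of phage particles to bacterial cells."""
    return phage_particles / bacterial_cells


def implied_genome_length_bp(unique_insertion_sites, mean_spacing_bp):
    """Genome length implied by a site count and a mean spacing (illustrative)."""
    return unique_insertion_sites * mean_spacing_bp


moi = multiplicity_of_infection(1e10, 1e9)
genome_bp = implied_genome_length_bp(144_000, 38)
```

Here `moi` evaluates to 10.0, matching the text, and `genome_bp` to 5,472,000 bp, so the reported site count and spacing together imply a genome of roughly 5.5 Mb.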
Comparisons between phage-treated and control samples were performed using the “TraDIS_comparison_positive_selection.R” script (https://github.com/francesca-short/tradis_scripts), which is based on the comparison script from the BioTraDIS toolkit, but in addition, reports the insertion index ratio between condition and control samples. Filtering based on gene-wise transposon mutant diversity (insertion index ratio) was necessary because, for many of the genes with increased read counts post-phage challenge, these reads mapped to just a single insertion site. These cases were presumed to result from rare secondary mutations unrelated to the transposon insertion, as suggested previously111. Genes required for phage infection were defined as those with a Log2 fold change (FC) of ≥2, a q-value of