Utilizing virus genomic surveillance to predict vaccine effectiveness

by Jiye Kwon, Ke Li, Joshua L. Warren, Sameer Pandya, Anne M. Hahn, Yale SARS-CoV-2 Genomic Surveillance Initiative, Virginia E. Pitzer, Daniel M. Weinberger, Nathan D. Grubaugh

Background
Since the development of the first vaccines targeting the original SARS-CoV-2 virus sequence in 2020, mRNA-based vaccines have been updated three times: to target Omicron BA.4/BA.5 in 2022, the XBB lineage in 2023, and the KP.2 variant in 2024. While genomic surveillance has advanced our understanding of pathogen diversity, gaps remain in using genomic information to evaluate vaccine effectiveness (VE) against emerging variants. This study aims to characterize the relationship between VE and sequence-based genetic distance in order to establish a framework for predicting near real-time changes in the level of vaccine protection from virus surveillance data.

Methods
We analyzed 10,156 whole genome sequences of SARS-CoV-2 cases from Connecticut, USA, collected between April 2021 and July 2024. We first assessed how genetic distance, specifically the number of amino acid substitutions in the spike gene between COVID-19 case sequences and the mRNA vaccine formulation sequence(s), correlates with vaccine protection levels. Incorporating data from over 1 million test-negative controls, we developed a Bayesian time-varying model with autoregressive terms to assess VE at a weekly level. The analysis was adjusted for ZIP-code-level income, age, sex, and prior vaccine doses received. We then employed a random effects meta-regression to explore the relationship between VE and amino acid distance over time. Finally, we used the meta-regression model to estimate potential vaccine protection against emerging variants.

Findings
We found that spike gene amino acid distance was negatively correlated with VE over time. Stepwise increases in amino acid distance aligned with sharp VE declines during variant emergence, while accumulation of within-variant changes was also associated with gradual VE decline. Each 10-amino-acid increase in spike gene distance corresponds to a predicted 15.4% (95% credible interval (CrI): –2.0%, 34.6%) reduction in VE. For the 2023/24 updated vaccine, spike distance rose from 12.25 to 30.23, predicting a 43.4% (95% CrI: –5.7%, 90.1%) drop in VE using sequence information alone.

Conclusion
Our framework quantifies how the emergence of new variants is expected to affect VE for SARS-CoV-2. By quantifying the relationship between amino acid substitutions and time-varying VE, we leverage intrinsic pathogen features, such as spike amino acid distance, to inform future vaccine updates using genomic sequences. As genomic surveillance data become more widely available across pathogens, this framework can serve as a near real-time surveillance tool to infer population-level protection and offers valuable insights for vaccine update decisions.
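
The two quantities the abstract relates, spike amino acid distance and VE from a test-negative design, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy sequences, and the choice to skip gaps and ambiguous residues are assumptions made for illustration, and the VE function is a crude, unadjusted 1 minus odds ratio rather than the Bayesian time-varying model described in the Methods.

```python
from typing import Iterable

# Assumption: gap and ambiguous characters are skipped rather than counted
# as substitutions. The abstract does not say how these are handled.
SKIPPED = {"-", "X"}


def spike_aa_distance(case_seq: str, vaccine_seq: str) -> int:
    """Count amino acid substitutions between two pre-aligned spike sequences."""
    if len(case_seq) != len(vaccine_seq):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(case_seq.upper(), vaccine_seq.upper())
        if a != b and a not in SKIPPED and b not in SKIPPED
    )


def mean_distance(case_seqs: Iterable[str], vaccine_seq: str) -> float:
    """Average distance of a batch of case sequences (e.g. one week) to the vaccine sequence."""
    distances = [spike_aa_distance(s, vaccine_seq) for s in case_seqs]
    if not distances:
        raise ValueError("no sequences supplied")
    return sum(distances) / len(distances)


def crude_ve(vacc_cases: int, unvacc_cases: int,
             vacc_controls: int, unvacc_controls: int) -> float:
    """Unadjusted test-negative VE: 1 minus the odds ratio of vaccination in cases vs. controls."""
    odds_ratio = (vacc_cases * unvacc_controls) / (unvacc_cases * vacc_controls)
    return 1.0 - odds_ratio


if __name__ == "__main__":
    # Toy 10-residue fragments, not real spike sequences.
    vaccine = "MFVFLVLLPL"
    week_of_cases = ["MFVFLVKLPL", "MFVFLVLLPL"]
    print(mean_distance(week_of_cases, vaccine))
    # Toy counts, not study data.
    print(crude_ve(vacc_cases=20, unvacc_cases=80,
                   vacc_controls=50, unvacc_controls=50))
```

In practice the sequences would first need to be aligned to the vaccine reference, and a weekly VE estimate would be adjusted for the covariates named in the Methods (ZIP-code-level income, age, sex, prior doses) before being related to distance.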