Federated analysis of incubation period distributions using individual-level observed data and heterogeneous summary statistics

The incubation period, the interval between pathogen exposure and symptom onset, is a critical epidemiological parameter for follow-up policy and outbreak response, yet individual-level exposure data remain scarce, especially early in outbreaks. For most priority pathogens, only summary statistics are available because sharing individual-level data can be sensitive. Here we introduce a Bayesian hierarchical framework that jointly models individual-level observations and published summary statistics in a single federated analysis. Simulation studies demonstrate that the method accurately recovers incubation period distributions across a range of data availability scenarios, generally outperforming approaches that use published summary statistics alone. Applying the framework to 18 pathogens, including 10 priority pathogens classified by the World Health Organization as having outbreak potential, we find substantial between-study heterogeneity in incubation period estimates, including by outbreak country for SARS-CoV-1, variant of concern for COVID-19, and exposure setting for typhoid fever. These estimates, together with the curated dataset and modelling framework in our associated R package ddsynth, provide a reproducible foundation for improved incubation period estimation and synthesis across pathogens of epidemic concern. Our framework enables robust and rapid estimation of incubation periods during new outbreaks.
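
To illustrate what jointly using individual-level observations and published summary statistics can look like, the following is a minimal sketch under strong simplifying assumptions. It is not the model described above and does not use the ddsynth API. It assumes a lognormal incubation period, assumes each published study reports a median together with a standard error on the log scale, ignores between-study heterogeneity, and replaces Bayesian inference with a maximum-likelihood grid search. All function names and data values are hypothetical.

```python
import math


def lognormal_logpdf(t, mu, sigma):
    """Log density of a lognormal(mu, sigma) incubation time t > 0."""
    z = (math.log(t) - mu) / sigma
    return -math.log(t * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z


def normal_logpdf(x, mean, sd):
    """Log density of a normal(mean, sd) distribution at x."""
    z = (x - mean) / sd
    return -math.log(sd * math.sqrt(2 * math.pi)) - 0.5 * z * z


def joint_log_likelihood(mu, sigma, times, summaries):
    """Combine both data sources in one log-likelihood.

    times:     individual-level incubation periods in days.
    summaries: (reported_median, se_of_log_median) pairs; the assumed
               summary format treats log(median) as normal around mu.
    """
    ll = sum(lognormal_logpdf(t, mu, sigma) for t in times)
    for median, se_log in summaries:
        ll += normal_logpdf(math.log(median), mu, se_log)
    return ll


def fit_grid(times, summaries, mu_grid, sigma_grid):
    """Return the (mu, sigma) grid point with the highest joint log-likelihood."""
    best = max(
        (joint_log_likelihood(mu, s, times, summaries), mu, s)
        for mu in mu_grid
        for s in sigma_grid
    )
    return best[1], best[2]


# Hypothetical inputs, for illustration only.
times = [3.0, 4.5, 5.0, 6.0, 8.0]   # individual-level observations (days)
summaries = [(4.0, 0.1)]            # one published median with log-scale SE

mu_grid = [0.5 + 0.01 * i for i in range(201)]     # log-median candidates
sigma_grid = [0.1 + 0.01 * i for i in range(141)]  # log-scale SD candidates

mu_ind, sigma_ind = fit_grid(times, [], mu_grid, sigma_grid)
mu_joint, sigma_joint = fit_grid(times, summaries, mu_grid, sigma_grid)

print("median, individual data only:", math.exp(mu_ind))
print("median, joint fit:", math.exp(mu_joint))
```

In this sketch the published median pulls the joint estimate away from the individual-only estimate in proportion to its stated precision. A hierarchical model would additionally let each study have its own parameters drawn from a shared distribution, which is what allows between-study heterogeneity to be estimated.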