Quantifying epidemiological dynamics is crucial for understanding and forecasting the spread of an epidemic. The coalescent and the birth-death model are used interchangeably to infer epidemiological parameters from the genealogical relationships of the pathogen population under study, which in turn are inferred from pathogen genetic sequencing data. To compare the performance of these widely applied models, we performed a simulation study. We simulated phylogenetic trees under the constant-rate birth-death model and under the coalescent model with a deterministic exponentially growing infected population. For each tree, we re-estimated the epidemiological parameters using both a birth-death-based and a coalescent-based method, each implemented as an MCMC procedure in BEAST v2.0. In our analyses that estimate the growth rate of an epidemic from simulated birth-death trees, point estimates such as the maximum a posteriori/maximum likelihood estimates differ little between the two methods. The estimates of uncertainty, however, differ substantially. The birth-death model had higher coverage than the coalescent model, i.e. its highest posterior density (HPD) interval contained the true value more often (error rates of 2–13% for the birth-death model vs. 31–75% for the coalescent). The coverage of the coalescent decreases with decreasing basic reproductive ratio and with increasing sampling probability of infected individuals. We hypothesize that the biases in the coalescent are due to its assumption of deterministic rather than stochastic population size changes. Both methods performed reasonably well when analyzing trees simulated under the coalescent. The methods can also identify other key epidemiological parameters as long as one of the parameters is fixed to its true value.
In summary, when genetic data are used to estimate epidemic dynamics, our results suggest that the birth-death method is less sensitive to population fluctuations in early outbreaks than the coalescent method that assumes a deterministic exponentially growing infected population.
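To illustrate the kind of early-outbreak fluctuation referred to above, the following is a minimal Python sketch; it is not the simulation code used in this study. It assumes a linear birth-death process started from one infected individual, with a hypothetical per-capita transmission rate `lam` and becoming-noninfectious rate `delta`, and compares stochastic outbreak sizes with the deterministic curve exp((lam - delta) * t). All function names and rate values are illustrative assumptions.

```python
import math
import random


def simulate_birth_death(lam, delta, t_end, rng, n0=1):
    """Gillespie simulation of a linear birth-death process.

    Returns the number of infected individuals at time t_end
    (0 if the outbreak went extinct before t_end).
    """
    n, t = n0, 0.0
    while n > 0:
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(n * (lam + delta))
        if t > t_end:
            break
        # The event is a transmission with probability lam / (lam + delta),
        # otherwise an individual becomes noninfectious.
        if rng.random() < lam / (lam + delta):
            n += 1
        else:
            n -= 1
    return n


def deterministic_size(lam, delta, t_end, n0=1):
    """Deterministic exponential growth at rate r = lam - delta."""
    return n0 * math.exp((lam - delta) * t_end)


# Hypothetical rates chosen only for illustration.
lam, delta, t_end = 1.5, 1.0, 5.0
rng = random.Random(1)
sizes = [simulate_birth_death(lam, delta, t_end, rng) for _ in range(2000)]

extinct = sum(s == 0 for s in sizes) / len(sizes)
mean_size = sum(sizes) / len(sizes)
print(f"deterministic size at t_end: {deterministic_size(lam, delta, t_end):.2f}")
print(f"mean stochastic size:        {mean_size:.2f}")
print(f"fraction of extinct runs:    {extinct:.2f}")
print(f"largest stochastic outbreak: {max(sizes)}")
```

Under these assumptions the average over many stochastic runs follows the deterministic curve, but individual outbreaks scatter widely around it and many die out early, which is the variability a deterministic population-size assumption leaves out.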
The control or prediction of an epidemic outbreak requires the quantification of the parameters of transmission and recovery. These parameters can be inferred from phylogenetic relationships among the pathogen strains isolated from infected individuals. The coalescent and the birth-death process are two mathematical models commonly used in such inferences. No benchmark of the performance of these models currently exists. We aimed to objectively compare two specific models, namely the constant-rate birth-death model and the coalescent with a deterministic exponentially growing infected population. Using simulated datasets, we compare the coverage, accuracy, and precision with which they recover the true epidemic growth rate. We find that the constant-rate birth-death process can account for early stochasticity and is thus capable of recovering epidemic growth rates more successfully. Provided one of the parameters is known, e.g. the sampling proportion of infected individuals, the basic reproductive ratio can also be estimated reliably. We conclude that a birth-death-based method is generally more reliable than a deterministic coalescent-based method for epidemiological parameter inference from phylogenies representing epidemic outbreaks. Care should be taken if sampling is not constant through time or across individuals; such scenarios require so-called birth-death skyline models or multi-type birth-death models.
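Coverage, as used above, is the fraction of simulated datasets whose HPD interval contains the true parameter value. The following Python sketch shows one way this could be computed from posterior samples; it is an illustration under stated assumptions, not the procedure implemented in BEAST. The helper names `hpd_interval` and `coverage` are hypothetical, and the posteriors are synthetic toy draws, not results from this study.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing a fraction `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def coverage(posteriors, true_value, mass=0.95):
    """Fraction of posteriors whose HPD interval contains true_value."""
    hits = 0
    for samples in posteriors:
        lo, hi = hpd_interval(samples, mass)
        if lo <= true_value <= hi:
            hits += 1
    return hits / len(posteriors)


# Synthetic toy example: each replicate has a noisy point estimate of a
# hypothetical growth rate; the posterior width is either matched to that
# noise or too narrow (overconfident).
rng = random.Random(0)
true_r, noise_sd = 0.5, 0.1
calibrated, overconfident = [], []
for _ in range(300):
    center = rng.gauss(true_r, noise_sd)
    calibrated.append([rng.gauss(center, noise_sd) for _ in range(200)])
    overconfident.append([rng.gauss(center, 0.3 * noise_sd) for _ in range(200)])

cov_calibrated = coverage(calibrated, true_r)
cov_overconfident = coverage(overconfident, true_r)
print(f"coverage, matched width: {cov_calibrated:.2f}")
print(f"coverage, narrow width:  {cov_overconfident:.2f}")
```

In this toy setting both sets of posteriors share the same point estimates, so accuracy is identical, yet the overly narrow intervals miss the true value far more often. This is the sense in which two methods can give similar point estimates and still differ strongly in coverage.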