Phylogenetics plays a crucial role in the interpretation of genomic data 1. Phylogenetic analyses of SARS-CoV-2 genomes have allowed the detailed study of the virus’s origins 2, of its international 3, 4 and local 4–9 spread, and of the emergence 10 and reproductive success 11 of new variants, among many applications. These analyses have been enabled by the unparalleled volumes of genome sequence data generated and employed to study and help contain the pandemic 12. However, preferred model-based phylogenetic approaches including maximum likelihood and Bayesian methods, mostly based on Felsenstein’s ‘pruning’ algorithm 13, 14, cannot scale to the size of the datasets from the current pandemic 4, 15, hampering our understanding of the virus’s evolution and transmission 16. We present new approaches, based on reworking Felsenstein’s algorithm, for likelihood-based phylogenetic analysis of epidemiological genomic datasets at unprecedented scales. We exploit near-certainty regarding ancestral genomes, and the similarities between closely related and densely sampled genomes, to greatly reduce computational demands for memory and time. Combined with new methods for searching amongst candidate evolutionary trees, this results in our MAPLE (‘MAximum Parsimonious Likelihood Estimation’) software giving better results than popular approaches such as FastTree 2 17, IQ-TREE 2 18, RAxML-NG 19 and UShER 15. Our approach therefore allows complex and accurate probabilistic phylogenetic analyses of millions of microbial genomes, extending the reach of genomic epidemiology. Future epidemiological datasets are likely to be even larger than those currently associated with COVID-19, and other disciplines such as metagenomics and biodiversity science are also generating huge numbers of genome sequences 20–22. Our methods will permit continued use of preferred likelihood-based phylogenetic analyses.
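
For readers unfamiliar with the baseline being reworked, the following is a minimal sketch of the classic Felsenstein pruning recursion, not of MAPLE itself. It assumes a JC69 substitution model, a uniform root distribution, and a toy nested-list tree representation; the function names (`jc69_prob`, `partial_likelihoods`, `log_likelihood`) are hypothetical and chosen only for illustration. It shows that every site at every node needs a full vector of four partial likelihoods, which is the per-site, per-node cost that becomes prohibitive at pandemic scale.

```python
import math

BASES = "ACGT"

def jc69_prob(same, t):
    """JC69 transition probability over a branch of length t
    (expected substitutions per site). Assumed model, for illustration."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def partial_likelihoods(node, site):
    """Return the four partial likelihoods [A, C, G, T] at `node` for one site.

    A leaf is a string (its sequence); an internal node is a list of
    (child, branch_length) pairs. This layout is a toy assumption.
    """
    if isinstance(node, str):
        # Leaf: the observed base has likelihood 1, all others 0.
        return [1.0 if node[site] == b else 0.0 for b in BASES]
    result = [1.0] * 4
    for child, t in node:
        child_l = partial_likelihoods(child, site)
        for i in range(4):
            # Sum over the child's possible states, weighted by the
            # probability of changing from state i to state j along the branch.
            result[i] *= sum(jc69_prob(i == j, t) * child_l[j] for j in range(4))
    return result

def log_likelihood(tree, n_sites):
    """Log-likelihood of the alignment, treating sites as independent."""
    total = 0.0
    for site in range(n_sites):
        root_l = partial_likelihoods(tree, site)
        total += math.log(sum(0.25 * x for x in root_l))  # uniform root prior
    return total

# Three short, nearly identical sequences on a small rooted tree.
tree = [([("ACGT", 0.1), ("ACGA", 0.1)], 0.05), ("ACGT", 0.2)]
print(log_likelihood(tree, 4))
```

In this toy tree three of the four sites are identical across all sequences, yet the recursion still stores and multiplies a full four-entry vector for each of them. That redundancy in closely related, densely sampled genomes is the kind of structure the abstract says the new approach exploits.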