An expectation maximization algorithm is implemented to resolve the indexing ambiguity which arises when merging data from many crystals in protein crystallography, especially in cases where partial reflections are recorded in serial femtosecond crystallography (SFX) at XFELs.
Crystallographic auto-indexing algorithms provide crystal orientations and unit-cell parameters and assign Miller indices based on the geometric relations between the Bragg peaks observed in diffraction patterns. However, if the Bravais symmetry is higher than the space-group symmetry, there will be multiple indexing options that are geometrically equivalent, and hence many ways to merge diffraction intensities from protein nanocrystals. Structure factor magnitudes from full reflections are required to resolve this ambiguity but only partial reflections are available from each XFEL shot, which must be merged to obtain full reflections from these ‘stills’. To resolve this chicken-and-egg problem, an expectation maximization algorithm is described that iteratively constructs a model from the intensities recorded in the diffraction patterns as the indexing ambiguity is being resolved. The reconstructed model is then used to guide the resolution of the indexing ambiguity as feedback for the next iteration. Using both simulated and experimental data collected at an X-ray laser for photosystem I in the P6 3 space group (which supports a merohedral twinning indexing ambiguity), the method is validated.