There's a shift happening in crystal structure generation that doesn't get talked about enough. Diffusion models — which dominated the field from CDVAE through MatterGen — are being quietly superseded by flow matching approaches, and the gap is widening.
The clearest data point: CrystalFlow (Nature Communications, 2025) runs roughly an order of magnitude faster than DiffCSP while matching or exceeding its generation quality. PXRDGen's flow-based module converges in 50 steps versus 1000 for its diffusion equivalent, a 20-fold reduction in step count, and with better match rates. These aren't marginal improvements.
The underlying reason is structural. Diffusion models learn to reverse a stochastic noising process, and sampling from them typically takes many small denoising steps to keep the discretization error under control. Flow matching instead learns a deterministic vector field that transports samples directly from the prior to the data distribution, and the resulting ODE can be integrated in far fewer steps. For crystal generation specifically, where you're navigating a space of lattice parameters, fractional coordinates, and atom types simultaneously, that efficiency difference compounds fast.
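To make the "fewer ODE steps" point concrete, here is a minimal sketch of flow-matching sampling on a one-dimensional toy problem. Everything in it is an assumption chosen for illustration: the Gaussian prior and target, the constants MU and SIGMA, and the function names. The velocity field is written in closed form so the example runs on its own; in an actual crystal generator it would be a trained network acting on lattice, coordinates, and atom types.

```python
import random
import statistics

# Toy 1-D setup (illustrative assumptions, not taken from any crystal model):
# prior x0 ~ N(0, 1), data x1 ~ N(MU, SIGMA^2), straight path x_t = (1 - t) x0 + t x1.
MU, SIGMA = 2.0, 0.5


def velocity(x, t):
    """Closed-form marginal velocity E[x1 - x0 | x_t = x] for this Gaussian toy.

    A real generator would replace this with a learned neural vector field.
    """
    var_t = (1.0 - t) ** 2 + (t * SIGMA) ** 2   # Var(x_t)
    cov = t * SIGMA ** 2 - (1.0 - t)            # Cov(x1 - x0, x_t)
    return MU + cov / var_t * (x - t * MU)


def sample(n_samples, n_steps, seed=0):
    """Draw prior samples and transport them to t = 1 with fixed-step Euler."""
    rng = random.Random(seed)
    h = 1.0 / n_steps
    out = []
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)                 # sample from the prior
        for k in range(n_steps):
            x += h * velocity(x, k * h)         # one deterministic ODE step
        out.append(x)
    return out


if __name__ == "__main__":
    for steps in (5, 50):
        xs = sample(5000, steps)
        print(steps, round(statistics.fmean(xs), 3), round(statistics.pstdev(xs), 3))
```

In this toy the sample mean lands near MU at either step count, and the spread is reproduced more closely with 50 Euler steps than with 5. The sampler is a plain deterministic loop with no noise injected along the way, which is the property the paragraph above is pointing at.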
The symmetry problem is getting solved too. One of the persistent frustrations with early diffusion-based generators was their tendency to produce low-symmetry P1 structures regardless of what chemistry you fed them. Experiments with supercells and the broader literature both point to the same culprit: GNN locality bias means the model can't see long-range periodicity. The newer flow-based approaches are attacking this more directly. SPFlow explicitly models asymmetric units and Wyckoff positions using a joint equivariant flow, generating symmetry-preserving structures from the ground up rather than hoping post-hoc symmetrization catches everything. SymmCD takes a similar approach from the diffusion side.
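For readers less familiar with asymmetric units, the sketch below shows the textbook idea these methods build on: store only the symmetry-distinct sites, then expand them with the space group's operations. It is not SPFlow's or SymmCD's code. The operation table for space group P4 (No. 75) is hard-coded, and the names P4_OPS and expand_site are hypothetical helpers introduced here for illustration.

```python
# Symmetry operations of space group P4 (No. 75), written as
# (rotation matrix, translation) pairs acting on fractional coordinates.
P4_OPS = [
    (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (0.0, 0.0, 0.0)),    # x, y, z
    (((0, -1, 0), (1, 0, 0), (0, 0, 1)), (0.0, 0.0, 0.0)),   # -y, x, z
    (((-1, 0, 0), (0, -1, 0), (0, 0, 1)), (0.0, 0.0, 0.0)),  # -x, -y, z
    (((0, 1, 0), (-1, 0, 0), (0, 0, 1)), (0.0, 0.0, 0.0)),   # y, -x, z
]


def expand_site(pos, ops=P4_OPS, decimals=6):
    """Apply every operation to one asymmetric-unit site and keep unique images.

    Coordinates are wrapped into [0, 1) and rounded so that images which
    coincide (as on special Wyckoff positions) are counted only once.
    """
    images = []
    for rot, trans in ops:
        image = tuple(
            round((sum(r * c for r, c in zip(row, pos)) + t) % 1.0, decimals) % 1.0
            for row, t in zip(rot, trans)
        )
        if image not in images:
            images.append(image)
    return images


if __name__ == "__main__":
    print(len(expand_site((0.1, 0.2, 0.3))))  # general position
    print(len(expand_site((0.0, 0.5, 0.3))))  # site on a 2-fold axis
    print(len(expand_site((0.0, 0.0, 0.3))))  # site on the 4-fold axis
```

A site in a general position expands to four atoms, while sites sitting on a rotation axis expand to fewer because their images coincide. A generator that works in this representation only has to place the distinct sites, and the expansion step guarantees that the output carries the chosen space group.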
What's interesting about
The honest summary: if you're using DiffCSP or early CDVAE-derived models as your baseline in 2026, you're probably leaving significant performance on the table. FlowMM, CrystalFlow, and their descendants are worth benchmarking seriously. And for complex multi-atom systems like NdFeB — the hard case — the combination of flow matching with explicit symmetry constraints seems like the most promising direction, though nobody has fully cracked it yet.
The atom classification problem (assigning element identities to learned structural sites) remains stubbornly difficult regardless of generation approach. That's probably where the next generation of work will focus.
Why the field is quietly moving away from diffusion models, and what it means for generating complex crystal structures.