MatterGen employs a diffusion-based approach for crystal structure generation, utilizing classifier-free guidance to steer the generation process. The core of our modifications centers on the PropertyGuidedPredictorCorrector class in magnetic_guidance.py, which implements a new guidance mechanism.
The implementation uses specialized predictors and correctors from MatterGen’s existing codebase, one for each structural component:

Atomic positions (pos)
Reuses MatterGen’s continuous variable sampler.
Implements ancestral sampling: each update combines the learned mean with noise scaled by the noise schedule.
Applies Langevin dynamics in the correction step, governed by a step size.

Lattice (cell)
Leverages MatterGen’s specialized lattice predictor.
Uses a modified ancestral sampling on the lattice matrix that preserves lattice constraints and enforces crystal symmetry.

Atomic species (atomic_numbers)
Utilizes MatterGen’s discrete diffusion implementation.
Implements D3PM (Discrete Denoising Diffusion Probabilistic Models), with a transition matrix defined in terms of the corruption vector at each timestep.
Uses a direct prediction mode for better categorical sampling.
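To make the predictor-corrector idea for continuous variables concrete, here is a minimal, dependency-free sketch. It is not MatterGen's code: the function names, the toy mean and score functions, and the schedule values are all hypothetical, chosen only to show an ancestral predictor step followed by a Langevin corrector step.

```python
import math
import random


def ancestral_predictor_step(x, mean_fn, sigma_t, rng):
    """Ancestral step: learned mean plus Gaussian noise scaled by the schedule."""
    mu = mean_fn(x)
    return [m + sigma_t * rng.gauss(0.0, 1.0) for m in mu]


def langevin_corrector_step(x, score_fn, step_size, rng):
    """Langevin step: drift along the score plus sqrt(2 * step_size) noise."""
    score = score_fn(x)
    noise_scale = math.sqrt(2.0 * step_size)
    return [
        xi + step_size * si + noise_scale * rng.gauss(0.0, 1.0)
        for xi, si in zip(x, score)
    ]


# Hypothetical stand-ins for the learned network (assumptions, not MatterGen).
def toy_mean(x):
    return [0.9 * xi for xi in x]  # pretend the model pulls samples toward 0


def toy_score(x):
    return [-xi for xi in x]  # score of a standard normal distribution


rng = random.Random(0)
x = [5.0, -5.0, 3.0]
for sigma_t in [0.5, 0.4, 0.3, 0.2, 0.1]:  # hypothetical noise schedule
    x = ancestral_predictor_step(x, toy_mean, sigma_t, rng)
    x = langevin_corrector_step(x, toy_score, 0.01, rng)
```

The predictor moves the sample one step down the noise schedule, and the corrector then nudges it toward higher-probability regions at that noise level.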
The guidance mechanism combines these samplers with our reward function. It modifies the original MatterGen score function using our reward function, weighted by a guidance scale, at each diffusion timestep.
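A guided score of this kind is commonly written in the form below. This is a sketch under assumptions rather than the exact expression used in magnetic_guidance.py: the symbols are chosen here for illustration, with $s_\theta$ standing for the original MatterGen score function, $\lambda$ for the guidance scale, $R$ for the reward function, and $t$ for the diffusion timestep.

```latex
\tilde{s}(x_t, t) = s_\theta(x_t, t) + \lambda \, \nabla_{x_t} R(x_t, t)
```

Setting $\lambda = 0$ recovers the unguided model, while larger values trade sample fidelity for reward.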
Our reward in this approach leaned on CHGNet to calculate the magnetic density of the structure at each timestep. Because this density depends on several quantities that change constantly during the diffusion process (individual magnetic moments, unit cell volume, and moment alignment), we did not see any meaningful move towards higher densities in the resulting structures. The reward simply doesn't carry enough information to tell the model that maximizing moments and minimizing volume is the way to earn high rewards.
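A small sketch shows why this signal is hard to follow. It assumes "magnetic density" means the net (signed) moment per unit cell volume, which is our reading rather than a confirmed definition, and it takes the per-site moments as plain numbers instead of calling CHGNet. The function name and the toy values (loosely resembling a two-atom iron cell) are hypothetical.

```python
# Approximate conversion from mu_B per cubic angstrom to tesla (mu_0 * mu_B / Å^3).
MU_B_PER_A3_TO_TESLA = 11.654


def magnetic_density(site_moments, volume_a3):
    """Net moment per unit volume in mu_B / Å^3 (assumed definition)."""
    if volume_a3 <= 0:
        raise ValueError("cell volume must be positive")
    return sum(site_moments) / volume_a3


# Same moment magnitudes and volume, different alignment (illustrative numbers).
aligned = magnetic_density([2.2, 2.2], 23.5)
antialigned = magnetic_density([2.2, -2.2], 23.5)

# Same moments, larger cell: the density drops even though nothing else changed.
expanded = magnetic_density([2.2, 2.2], 47.0)

aligned_tesla = aligned * MU_B_PER_A3_TO_TESLA
```

Three independent changes (moment size, alignment, volume) all collapse into one scalar, so a single density value gives the sampler little indication of which of them to change.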
So the logical next step was to simplify the reward drastically, just to see whether we could meaningfully influence the denoising process.
The categorical nature of atomic number selection meant that guidance on this component could be more direct and effective than continuous variables.
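One way to picture direct guidance on a categorical variable is to reweight the model's per-element probabilities by an exponentiated reward. The sketch below illustrates that general idea only; it is not necessarily how magnetic_guidance.py applies guidance, and the function name, element list, and per-element reward values are hypothetical.

```python
import math


def guided_categorical_probs(logits, rewards, guidance_scale):
    """Softmax over logits shifted by guidance_scale * reward (numerically stable)."""
    shifted = [l + guidance_scale * r for l, r in zip(logits, rewards)]
    peak = max(shifted)
    weights = [math.exp(s - peak) for s in shifted]
    total = sum(weights)
    return [w / total for w in weights]


elements = ["O", "Fe", "Gd"]
logits = [0.0, 0.0, 0.0]   # the model has no preference in this toy case
rewards = [0.0, 2.2, 7.0]  # hypothetical per-element reward values

unguided = guided_categorical_probs(logits, rewards, guidance_scale=0.0)
guided = guided_categorical_probs(logits, rewards, guidance_scale=0.5)
```

With a guidance scale of zero the distribution is unchanged, and increasing the scale shifts probability mass toward the higher-reward elements without needing a gradient through a continuous variable.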
We simplified our approach to focus solely on atomic rewards, implementing a new MomentMaximizationReward class with progressive scaling.
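The reward is computed from the total magnetic moment, the number of atoms, and a set of threshold bonus functions $B_i$, each of which applies a multiplier $k_i$ once its argument $m$ exceeds a threshold $t_i$:

$$B_i(m) = \begin{cases} k_i & \text{if } m > t_i \\ 1 & \text{otherwise} \end{cases}$$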
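The sketch below shows one way a threshold-bonus reward of this shape could be put together. Several things here are assumptions: that the bonuses act on the per-atom moment, that they multiply the base reward (suggested by the value 1 in the "otherwise" branch), and the specific thresholds and multipliers. The function names are hypothetical and are not the MomentMaximizationReward implementation.

```python
def threshold_bonus(m, threshold, multiplier):
    """B_i(m): return the multiplier k_i if m exceeds t_i, else 1."""
    return multiplier if m > threshold else 1.0


def moment_reward(total_moment, n_atoms, bonuses):
    """Per-atom moment scaled by every threshold bonus it has unlocked."""
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    per_atom = total_moment / n_atoms
    reward = per_atom
    for threshold, multiplier in bonuses:
        reward *= threshold_bonus(per_atom, threshold, multiplier)
    return reward


# Hypothetical (threshold, multiplier) pairs, stepped up progressively.
BONUSES = [(1.0, 2.0), (3.0, 4.0)]

low = moment_reward(1.0, 2, BONUSES)     # 0.5 per atom: no bonus applies
high = moment_reward(28.10, 5, BONUSES)  # 5.62 per atom: both bonuses apply
```

Because each bonus is a step function, the reward jumps sharply once a structure crosses a threshold, which gives the sampler a much coarser but clearer signal than a smooth density value.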
Simplicity was the goal here. Given we don't have the cash to train a new MatterGen from scratch, the aim was to see what we could build on top of this base model and evaluate how the generation process might be influenced.
As luck would have it, this seems to have worked. We were able to influence MatterGen towards generating structures with cumulative moments 20-40 times larger than the unaltered base model.
The reward values shown are the final ones we settled on. Each iteration stepped up the reward from the previous one, and we stopped once we no longer saw meaningful increases in average moments.
All-time best total moment: 42.08 μB (Ba₁Gd₆Te₁₃)
Second best: 41.00 μB (Eu₆Hg₂P₈)
Multiple structures exceeding 25 μB
Highest per-atom moment: 5.62 μB/atom (Gd₄N₁)
Composition | Magnetic Moment (μB) |
---|---|
Ba₁Gd₆Te₁₃ | 42.08 |
Eu₆Hg₂P₈ | 41.00 |
Fe₆O₃F₉ | 26.63 |
Eu₄Ga₂Ge₄ | 27.33 |
Gd₄N₁ | 28.10 |
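The per-atom figure can be checked against the table by dividing each total moment by the number of atoms in the formula. The short sketch below does that, assuming the subscripts count the atoms that the listed total moment refers to; formulas are written with ASCII digits for parsing, and the helper name is ours.

```python
import re


def atom_count(formula):
    """Sum the integer subscripts in a formula such as 'Ba1Gd6Te13'."""
    return sum(int(n) for _, n in re.findall(r"([A-Z][a-z]?)(\d+)", formula))


# Total moments in mu_B, copied from the table above.
total_moments = {
    "Ba1Gd6Te13": 42.08,
    "Eu6Hg2P8": 41.00,
    "Fe6O3F9": 26.63,
    "Eu4Ga2Ge4": 27.33,
    "Gd4N1": 28.10,
}

per_atom = {f: m / atom_count(f) for f, m in total_moments.items()}
best = max(per_atom, key=per_atom.get)
```

Ranking by per-atom moment reorders the list: the structure with the largest total moment has 20 atoms, so its per-atom value is much lower than that of the five-atom Gd₄N₁ cell.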
Despite some promising results, the magnetic densities of these structures are lower than what we see in iron-based fridge magnets, and they sit well within MatterGen's training data distribution. As much as it would be nice to simply crank up the moment and shrink the unit cell volume for these structures, NdFeB magnets rely on exchange coupling to align magnetic moments in parallel, and this behavior gives rise to their extreme magnetic strength and high coercivity.
Maybe we need to think through some more physics-based reward landscapes that incentivize similar coupling behaviors while keeping the troublesome rare earths out of the equation. More to come.
I wanted to formalize in writing the idea that I keep coming back to for end-to-end material discovery. The hardest part of this project has been actually optimizing towards materials that have some p