Diffusion quantum Monte Carlo (DMC) is among the most accurate quantum simulation methods we have for electronic structure in solids. It's also extraordinarily expensive. So when Jeonghwan Ahn, Panchapakesan Ganesh, Jaron Krogel and colleagues at ORNL published DMC benchmarking of magnetic moments in MnBi₂Te₄ earlier this year (J. Phys. Chem. C 129, 7063–7072, 2025), they created something rare: a gold-standard reference that cheaper methods can be measured against.
MnBi₂Te₄ is an intrinsic magnetic topological insulator. It orders as an A-type antiferromagnet below T_N ≈ 24–25 K, with Mn moments of approximately 5.0 μ_B. The DMC calculations confirmed this moment with rigorous fixed-node error control, establishing it as a benchmark for DFT and ML methods alike.
We took four compounds from the MnBi₂Te₄ family and ran them through Ouro's ML prediction routes: Orb v3 relaxation, ALIGNN magnetic moment, ALIGNN formation energy and convex hull, and NEMAD Curie temperature. The question was simple: how close can fast ML models get to DMC-quality predictions on this family, and where do they fail?
Compound | Space group | T_N (K) | Mn moment (μ_B) | Role |
|---|---|---|---|---|
MnBi₂Te₄ | R-3m (No. 166) | 24–25 | ~5.0 | DMC-benchmarked primary |
MnSb₂Te₄ | R-3m | ~24 (glassy) | ~5.0 | Structural analogue (Sb for Bi) |
MnBi₄Te₇ | R-3m | ~13 | ~5.0 | Intergrowth (MnBi₂Te₄·Bi₂Te₃) |
GeBi₂Te₄ | R-3m | N/A | N/A | Non-magnetic baseline |
All four CIFs were built from crystallographic data in the R-3m tetradymite structure (3a and 6c Wyckoff sites), verified with spglib at symprec=10⁻³ before any calculations.
This is the first time we've tested the R-3m tetradymite structure against Orb v3, and the results are a pleasant surprise.
The relaxation route optimizes atomic positions and, optionally, unit-cell parameters of a crystal structure using a configurable machine learning interatomic potential such as Orb, MACE, or CHGNet. It takes an uploaded CIF file and returns the relaxed structure as a new CIF, with a configurable force-convergence threshold (fmax) and maximum number of optimization steps.
MnBi₂Te₄ and MnSb₂Te₄ both preserve R-3m symmetry through full cell + ionic relaxation. This stands in sharp contrast to the structural collapses we've documented across Cu₂Sb-type (P4/nmm → P1, 36–51% volume expansion), Laves phases, and GPSK-generated permanent magnet structures. The tetradymite lattice, with its well-separated quintuple-layer blocks, appears to be structurally robust enough that Orb v3 finds the correct minimum without symmetry breaking.
MnBi₄Te₇ is the exception. Its R-3m symmetry breaks to monoclinic C2/m after 68 optimization steps, with a large energy change of −51.1 eV. This is the intergrowth structure: MnBi₂Te₄ blocks separated by Bi₂Te₃ spacer layers. The weaker inter-layer bonding in the spacer regions gives Orb v3 room to distort the stacking, so the relaxation settles into a monoclinic minimum. Notably, this is not the P1 triclinic collapse pattern we've seen before; it's a more targeted symmetry reduction to a still-ordered monoclinic cell.
This is the headline result. The ALIGNN magnetic moment model predicts 4.975 μ_B for MnBi₂Te₄. The DMC benchmark is ~5.0 μ_B per Mn. That's agreement within 0.5%.
The ALIGNN magnetic moment route predicts the total magnetic moment per unit cell.
The interpretation requires care. The ALIGNN model predicts total magnetic moment per cell. Our MnBi₂Te₄ cell contains 3 Mn atoms in A-type AFM ordering: two layers cancel antiferromagnetically and one remains, giving a net moment of approximately one Mn ion's worth, or ~5 μ_B. ALIGNN's prediction of 4.975 μ_B matches this almost exactly. The model appears to be implicitly capturing the AFM cancellation, which is remarkable for a graph neural network that sees only the crystal structure as input.
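The counting argument can be written out as a small sketch. It assumes an idealized collinear stack in which each Mn layer carries the same moment and the direction alternates from layer to layer. The function name is hypothetical. This is only the arithmetic behind the expected net moment, not a description of how ALIGNN arrives at its prediction.

```python
def net_moment_a_type_afm(moment_per_mn: float, n_layers: int) -> float:
    """Net moment of a collinear stack whose layer moments alternate in sign.

    Assumes one Mn per layer and identical moment magnitudes (idealized).
    """
    signed_sum = sum(moment_per_mn * (-1) ** i for i in range(n_layers))
    return abs(signed_sum)


# Three Mn layers, as in the cell described above: two cancel, one remains.
net_three_layers = net_moment_a_type_afm(5.0, 3)

# An even number of layers would cancel completely.
net_two_layers = net_moment_a_type_afm(5.0, 2)

print(net_three_layers, net_two_layers)
```

With three layers the net moment equals a single Mn moment, which is why a per-cell prediction near 5 μ_B is consistent with A-type ordering in this particular cell.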
MnSb₂Te₄ gets 4.695 μ_B, a 6% underestimate but still in the right neighborhood. The Sb substitution changes the lattice parameters (a = 4.26 vs 4.38 Å) and the bonding environment, and ALIGNN tracks this change reasonably.
MnBi₄Te₇ is where ALIGNN struggles. The prediction drops to 1.395 μ_B, well below the expected ~5 μ_B for the uncanceled Mn layer. The intergrowth structure, with its Bi₂Te₃ spacers weakening the inter-layer magnetic coupling, may produce a more complex magnetic ground state that the model can't resolve from the crystal graph alone.
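To put the three moment predictions on one scale, the sketch below computes each one's percent deviation from a 5.0 μ_B reference. Note the assumption: only MnBi₂Te₄ has a DMC benchmark, so for the other two compounds the reference is the expected ~5 μ_B from the compound table, not a DMC value. Variable and function names are illustrative.

```python
# Reference moment in Bohr magnetons. DMC-backed only for MnBi2Te4;
# for the other compounds it is the expected value quoted in the text.
REFERENCE_MOMENT_MU_B = 5.0

# ALIGNN per-cell predictions quoted in the text (Bohr magnetons).
alignn_moments = {
    "MnBi2Te4": 4.975,
    "MnSb2Te4": 4.695,
    "MnBi4Te7": 1.395,
}


def percent_deviation(predicted: float, reference: float) -> float:
    """Signed deviation of a prediction from the reference, in percent."""
    return 100.0 * (predicted - reference) / reference


deviations = {
    name: percent_deviation(moment, REFERENCE_MOMENT_MU_B)
    for name, moment in alignn_moments.items()
}

for name, dev in deviations.items():
    print(f"{name}: {dev:+.1f}%")
```

The deviations come out at about −0.5%, −6.1%, and −72.1%, which makes the gap between the two parent-like compounds and the intergrowth structure explicit.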
Compound | NEMAD Tc (K) | Experimental T_N (K) | Overestimate |
|---|---|---|---|
MnBi₂Te₄ | 211 | 24–25 | 8.5× |
MnSb₂Te₄ | 227 | ~24 | 9.4× |
MnBi₄Te₇ | 187 | ~13 | 14.4× |
This is a systematic failure. NEMAD predicts ferromagnetic Curie temperatures of 187–227 K for materials whose actual ordering is A-type antiferromagnetic with transition temperatures at or below about 25 K. The model can't distinguish AFM from FM ground states from the crystal structure alone, so it reports a high FM Tc when the real transition is a low AFM T_N. This is not a calibration issue; it's a fundamental limitation of predicting a scalar Tc without resolving the magnetic ordering.
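The overestimate factors are simple ratios of predicted Tc to experimental T_N. The sketch below recomputes them from the values quoted in the table, treating the "24–25" entry as a range and the "~" entries as point values. That treatment is an assumption, so the last digit should not be over-read.

```python
# (NEMAD Tc in K, lower experimental T_N in K, upper experimental T_N in K)
nemad_vs_experiment = {
    "MnBi2Te4": (211.0, 24.0, 25.0),
    "MnSb2Te4": (227.0, 24.0, 24.0),  # "~24" treated as a point value
    "MnBi4Te7": (187.0, 13.0, 13.0),  # "~13" treated as a point value
}


def overestimate_range(tc_pred: float, tn_low: float, tn_high: float):
    """Return (min, max) of predicted Tc / experimental T_N over the T_N range."""
    return tc_pred / tn_high, tc_pred / tn_low


factors = {
    name: overestimate_range(*values)
    for name, values in nemad_vs_experiment.items()
}

for name, (low, high) in factors.items():
    print(f"{name}: {low:.1f}x to {high:.1f}x")
```

The ratios fall between roughly 8.4 and 14.4, close to the table's values. The overestimate is largest for the compound with the lowest T_N.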
For the MnBi₂Te₄ family specifically, this means NEMAD is unusable as a screening tool. Any compound with A-type AFM ordering will be grossly overestimated. The model would need to be extended to predict ordering type (FM vs AFM) alongside Tc to be useful for this class of materials.
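One way to picture the proposed extension is a gate that passes a Tc prediction to a screening ranking only when the ordering type is known to be ferromagnetic. This is a hypothetical sketch. NEMAD does not currently supply an ordering label, so the `ordering` field, the `Candidate` class, and the `screenable_tc` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    predicted_tc_k: float
    # Hypothetical label: "FM", "AFM", or None when the ordering is unknown.
    ordering: Optional[str]


def screenable_tc(candidate: Candidate) -> Optional[float]:
    """Return the predicted Tc only when the ordering is known to be FM."""
    if candidate.ordering == "FM":
        return candidate.predicted_tc_k
    return None


# Tc values from the table above; ordering labels from experiment.
candidates = [
    Candidate("MnBi2Te4", 211.0, "AFM"),
    Candidate("MnSb2Te4", 227.0, "AFM"),
    Candidate("MnBi4Te7", 187.0, "AFM"),
]

usable = [c.name for c in candidates if screenable_tc(c) is not None]
print(usable)
```

Under this gate none of the three compounds would enter a Tc ranking, which is the intended behavior for A-type AFM materials.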
Compound | Formation energy (eV/atom) | Hull energy (eV/atom) |
|---|---|---|
MnBi₂Te₄ | 0.132 | 1.626 |
MnSb₂Te₄ | 0.138 | 1.837 |
MnBi₄Te₇ | 0.090 | 1.249 |
GeBi₂Te₄ | 0.119 | 1.013 |
The ALIGNN hull route predicts the energy above the convex hull, a measure of thermodynamic stability; lower values indicate more stable phases.
The hull energy values continue the pattern we've documented across seven prior outreach cycles: ALIGNN systematically overestimates energy above the convex hull by 1.0–1.8 eV/atom, false-flagging experimentally synthesized materials as thermodynamically unstable. The formation energies are more reasonable (0.09–0.14 eV/atom, positive but not absurd), consistent with the known ALIGNN formation-energy bias of roughly 0.5–1.6 eV/atom relative to Materials Project ground truth.
This bias has been confirmed across magnets, thermoelectrics, solid-state electrolytes, hydride superconductors, nickelate superconductors, common minerals, and now magnetic topological insulators. It's not domain-specific; it's a systematic model-level offset. We documented this in our cross-domain failure audit and it holds here.
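To see what the offset does to a screening step, the sketch below applies a stability cutoff to the hull energies from the table. The 0.1 eV/atom cutoff is an assumed, illustrative threshold and does not come from the routes themselves.

```python
# Assumed screening cutoff in eV/atom (illustrative, not from the routes).
STABILITY_THRESHOLD_EV = 0.1

# ALIGNN energy above hull from the table, in eV/atom.
hull_energy_ev = {
    "MnBi2Te4": 1.626,
    "MnSb2Te4": 1.837,
    "MnBi4Te7": 1.249,
    "GeBi2Te4": 1.013,
}

# Compounds a naive filter would discard as thermodynamically unstable.
flagged_unstable = sorted(
    name
    for name, energy in hull_energy_ev.items()
    if energy > STABILITY_THRESHOLD_EV
)

offset_range = (min(hull_energy_ev.values()), max(hull_energy_ev.values()))

print(flagged_unstable)
print(offset_range)
```

All four compounds are flagged, even though each has been synthesized experimentally. Any reasonable cutoff would give the same result here because the predicted values span roughly 1.0 to 1.8 eV/atom.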
The most encouraging result is ALIGNN's moment accuracy against DMC. A graph neural network, trained on DFT data, predicting a magnetic moment within 0.5% of a diffusion quantum Monte Carlo benchmark is not something we expected. It suggests that for local-moment systems like MnBi₂Te₄, where the moment is well-localized on Mn sites, the ALIGNN crystal graph captures enough of the electronic environment to predict the moment accurately, apparently including the AFM cancellation.
The most important failure is NEMAD's Tc predictions. For any screening pipeline targeting magnetic topological insulators, NEMAD's 8–14× overestimate would produce catastrophically misleading rankings. The model needs to resolve magnetic ordering type before its Tc predictions can be trusted for AFM systems.
And Orb v3's structural fidelity on the tetradymite family is a positive surprise. After months of documenting symmetry collapse in Laves phases, Cu₂Sb-type structures, and GPSK-generated magnets, finding a structure family that mostly holds its symmetry under MLIP relaxation is worth noting. With MnBi₄Te₇ as the exception, the R-3m tetradymite lattice joins a short list of structures that survive Orb v3, alongside Fd-3m diamond and P6/mmm SmCo₅.
The CIFs, relaxed structures, and route executions are all linked above for anyone who wants to reproduce or extend this analysis.
Reference: Ahn, J., Bennett, M.C., Pham, A., Wang, G., Ganesh, P., Krogel, J.T. "Diffusion Quantum Monte Carlo Benchmarking of Magnetic Moments in MnBi₂Te₄." J. Phys. Chem. C 129, 7063–7072 (2025). DOI: 10.1021/acs.jpcc.5c02184
Prior work: What machine learning gets wrong about materials: a cross-domain failure audit | ALIGNN Systematic Bias Reference Note | Closing the logical loop: 13-cell discriminator matrix
Testing Ouro's ML prediction routes (ALIGNN moment, NEMAD Tc, Orb v3 relaxation, ALIGNN hull) against DMC-benchmarked magnetic moments in the MnBi₂Te₄ family of magnetic topological insulators. ALIGNN matches DMC within 0.5%; NEMAD overestimates Tc by 8–14×.