

This post shares progress on calculating magnetocrystalline anisotropy energy (MAE) using density functional theory (DFT). The author hoped to use machine learning, but data limits make that unlikely for now, so DFT remains the focus. They emphasize how sensitive MAE is to convergence and accurate electronic structure, a common concern in the field.

Two calculation methods are explored: the force-theorem and total energy difference. The force-theorem aims for a balance between speed and accuracy but isn’t fully working yet; issues include needing a specific spin setup and changes in the Fermi level when magnetization directions change. The total energy difference method is simpler and more reliable but far more computationally demanding, requiring several full SCF runs with spin-orbit coupling.

Key parameters like k-point spacing, smearing, basis type, and ks_solver influence results and performance. The post notes GPU acceleration and the practical trade-offs, and promises more metrics and a public API later.

DFT approach to MAE calculation

So I've finally made some progress on MAE prediction/calculation. As you can tell from the title, we're having to rely on DFT. It's a little unfortunate because I was really hoping to be able to use machine learning to accelerate some or all of the process but based on what's out there and the data available to train a model like that, I think we're a long ways away.

Even in trying to calculate MAE with DFT, I found out just how sensitive the value is to convergence and proper representation of electronic structure. You'll find this sentiment everywhere, and I can only reiterate it.

I'm making this post to share a little bit about what I've learned. The work isn't over, but I do finally have something working so I'm going to take a moment to celebrate that before I get back to it.

There are two approaches to MAE calculation I've been looking at: the force-theorem, and the total energy difference.

Let's look at each one.

Force-theorem

This is the approach I first set out to implement. Ideally it should be a nice happy medium between cost savings (time and compute) and accuracy. The process goes roughly like so (a sketch of the run sequence follows the list):

  • Do a single SCF run with nspin 2

  • Use this self-consistent electron density to initialize an NSCF run for each magnetization direction you want to look at

  • Find the band energy for each NSCF run, compare across directions to identify the easy and hard axes, and take the difference; this is your energy difference, and the usual MAE calculation follows
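
Here's a minimal sketch of that sequence in Python. The INPUT keys (calculation, nspin, lspinorb, noncolin, init_chg, out_chg) are real ABACUS parameters, but the directory layout, the bare abacus invocation, and the helper names are my own assumptions; STRU files, including the per-direction magnetization, are elided.

```python
# Sketch of the force-theorem run sequence. INPUT keys are ABACUS's,
# but the layout and runner are assumptions -- adapt to your setup.
import pathlib
import subprocess

DIRECTIONS = ["001", "100", "110"]  # magnetization axes to compare

SCF_INPUT = """calculation  scf
basis_type   pw
nspin        2          # see the UPDATE below on initializing from nspin 2
out_chg      1          # write the charge density for the NSCF runs
"""

NSCF_INPUT = """calculation  nscf
basis_type   pw
nspin        4          # noncollinear + SOC for each direction
lspinorb     1
noncolin     1
init_chg     file       # start from the SCF charge density
"""

def run_abacus(workdir: pathlib.Path) -> None:
    subprocess.run(["abacus"], cwd=workdir, check=True)

scf = pathlib.Path("scf")
scf.mkdir(exist_ok=True)
(scf / "INPUT").write_text(SCF_INPUT)
run_abacus(scf)

for axis in DIRECTIONS:
    d = pathlib.Path(f"nscf_{axis}")
    d.mkdir(exist_ok=True)
    (d / "INPUT").write_text(NSCF_INPUT)
    # copying the charge density over is the sneaky part -- see the
    # UPDATE at the end of the post for which file to copy
    run_abacus(d)
```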

Ideally, you spend most of your compute on that first SCF run. What I found at this stage (and it should be noted that I hadn't fully gotten this approach to work yet; see the UPDATE at the end) is that the SCF run actually needs to be nspin 4 (though without SOC) to properly initialize the NSCF runs. This could be a limitation of the DFT software I'm using (ABACUS). I will probably give VASP a go and see if I can't get it working there.

If you use only nspin 2, I found that the MAE was always ~0.

When I get this approach working, I'll be sure to write up another post, but at this point I've burned a ton of time trying to get it to work with no luck. I think I'm close, but we'll see. Comparing the energies is not as straightforward as it sounds: when you change the magnetization direction for each NSCF run, the Fermi level changes, which has led me to compare the wrong energies and get massive MAE values.
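
For reference, the quantity compared across directions here is the sum of occupied band energies (standard magnetic force theorem bookkeeping); the catch is that "occupied" must be decided consistently across runs, which is exactly where a shifting Fermi level bites:

```latex
% force-theorem band-energy comparison (magnetic force theorem)
E_{\mathrm{band}}(\hat{n}) = \sum_{i \in \mathrm{occ}} \varepsilon_i(\hat{n}),
\qquad
\mathrm{MAE} = E_{\mathrm{band}}(\hat{n}_{\mathrm{hard}}) - E_{\mathrm{band}}(\hat{n}_{\mathrm{easy}})
```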

Total energy difference

This approach is a lot simpler, but more computationally demanding. As I mentioned before, it requires a full SCF run for each direction, nspin 4 with SOC. This is about the most expensive possible setup in DFT - it's no wonder there is not a lot of data out there for this kind of calculation.

But the approach is quite elegant (see the sketch after the list):

  • SCF run with nspin 4 (lspinorb 1, noncolin 1) for each magnetization direction

  • Compare total energies; find the easy and hard axes, and the difference is your MAE
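
As a sketch, the comparison at the end really is just a few lines; the energies below are made-up placeholders, not real results:

```python
# Pick out easy/hard axes from per-direction total energies (eV).
# Numbers are illustrative placeholders only.
energies = {
    "001": -10423.81712,
    "100": -10423.81645,
    "110": -10423.81659,
}
easy = min(energies, key=energies.get)   # lowest total energy = easy axis
hard = max(energies, key=energies.get)   # highest total energy = hard axis
mae_mev = (energies[hard] - energies[easy]) * 1e3
print(f"easy={easy} hard={hard} MAE={mae_mev:.3f} meV")
```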

Much simpler, but multiple SCF runs can really add up in terms of compute time and cost. This is the only approach I've gotten to work so far. I'm in the process of pulling together some metrics, so I'll have more concrete numbers soon, but I'd estimate roughly a 4x time cost (4 SCF runs vs. 1; NSCF runs are basically negligible compared to SCF). Going from 5 minutes to 20 minutes is not nothing.

Parameters

Let's take a look at some of the parameters we can work with that affect the outcome of our calculations.

First and foremost is kspacing, which sets the k-point mesh density. Everywhere I've looked mentions that you need a really dense mesh to get a proper MAE. Sad, because this param also has the greatest effect on runtime. I'm trying to find the "good enough" level so that we can run fast but still have a general idea of the MAE. You can always go back and run a finer mesh to ensure convergence, but for screening I just want order-of-magnitude accuracy. The good news is that the PW (plane wave) basis has a solid k-point parallelization implementation for GPU acceleration, which is likely why we get such good GPU utilization.
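
As an illustration of that screening idea, here's a tiny helper I might use to pick the coarsest acceptable mesh; the helper name, the tolerance, and the numbers are all made up for the example:

```python
# Hypothetical screening helper: given MAE values (meV) from runs at
# decreasing kspacing, pick the coarsest mesh that agrees with the
# next-finer one to within a tolerance.
def coarsest_converged(results: dict[float, float], tol_mev: float = 0.05) -> float:
    ks = sorted(results, reverse=True)   # larger kspacing = coarser mesh
    for coarse, fine in zip(ks, ks[1:]):
        if abs(results[coarse] - results[fine]) < tol_mev:
            return coarse
    return ks[-1]                        # fall back to the finest mesh

print(coarsest_converged({0.14: 0.92, 0.12: 0.61, 0.10: 0.58, 0.08: 0.57}))
# -> 0.12 here: the 0.12 and 0.10 results agree to within 0.05 meV
```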

Next up is smearing, via smearing_sigma. I haven't gotten a good feel for this param yet. I know it needs to be fairly low so that you don't smooth out the anisotropy effects, which are tiny to begin with (often well under a meV per atom).

Next is basis_type. I haven't played around with this a ton, given that ABACUS' plane wave approach seems to have a much better GPU implementation. That's important for my use case, so I'm going with it. The docs say this about the LCAO basis:

Unlike PW basis, only the grid integration module (module_gint) and the diagonalization of the Hamiltonian matrix (source_hsolver) have been implemented with GPU acceleration under LCAO basis.

Related to basis_type is ks_solver. This I have experimented a good deal with. I've mainly looked at:

  • bpcg: The BPCG method, which is a block-parallel Conjugate Gradient (CG) method, typically exhibits higher acceleration in a GPU environment.

  • dav: The Davidson algorithm.

I've found bpcg to be faster, but when running larger systems I was getting some strange CUDA errors. It could have been out-of-memory, but running on Modal I don't really have the access to check whether that's the case. It looked something like:

```bash
[gpu-health] [WARN] GPU-3988280c-3ab3-c0bb-814f-3f7ce05e02a8: XID: NVRM: Xid (PCI:0000:80:00): 31, pid=19769, name=exe, Ch 0000000a, intr 00000000. MMU Fault: ENGINE GRAPHICS GPC3 GPCCLIENT_T1_6 faulted @ 0x2b38_5f200000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_WRITE
```

Changing to ks_solver dav would usually fix this crash. Note too that your k-points come into play here: if you have a large system with a lot of k-points, you may run into trouble using bpcg. A sketch of the fallback I've been using follows.
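
Here's a minimal sketch of that fallback, assuming an INPUT file in the working directory and an abacus binary on PATH (both assumptions about your setup):

```python
# Try bpcg first (faster on GPU); rerun with dav if the job crashes.
import pathlib
import re
import subprocess

def set_solver(inp: pathlib.Path, solver: str) -> None:
    """Rewrite (or append) the ks_solver line in an ABACUS INPUT file."""
    text = inp.read_text()
    if re.search(r"^ks_solver\b", text, flags=re.M):
        text = re.sub(r"^ks_solver\s+\S+", f"ks_solver  {solver}", text, flags=re.M)
    else:
        text += f"\nks_solver  {solver}\n"
    inp.write_text(text)

def run_with_fallback(workdir: pathlib.Path) -> None:
    for solver in ("bpcg", "dav"):
        set_solver(workdir / "INPUT", solver)
        try:
            subprocess.run(["abacus"], cwd=workdir, check=True)
            return
        except subprocess.CalledProcessError:
            print(f"{solver} failed in {workdir}; trying the next solver")
    raise RuntimeError(f"all solvers failed in {workdir}")
```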

UPDATE

Force-theorem is working! I'm able to get good results much faster with an initial nspin 2 SCF run, then noncollinear + SOC NSCF runs for each direction. The problem was really sneaky: when copying the charge density over from the SCF run to the NSCF run, DO NOT copy the ABACUS-CHARGE-DENSITY.restart file; instead, copy just the chgs*.cube file. There wasn't any good info on this anywhere; I only figured it out by trial and error. The NSCF runs will even complain that they couldn't read the restart file, but this setup seems to be what's required to get the initial state we need. If I copy the restart file, the MAE is almost always ~0. Now we're getting good values!
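
In snippet form (the paths here are my assumptions about a typical layout; only the file choice is the actual finding):

```python
import pathlib
import shutil

scf_out = pathlib.Path("scf/OUT.ABACUS")   # SCF output directory (assumed path)
nscf_dir = pathlib.Path("nscf_001")        # one NSCF direction (assumed path)

# DO copy the cube charge-density file(s)...
for cube in scf_out.glob("chgs*.cube"):
    shutil.copy(cube, nscf_dir)

# ...but do NOT copy scf_out / "ABACUS-CHARGE-DENSITY.restart";
# with the restart file in place, my MAE came out ~0 every time.
```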

Summary

I'm still working on MAE calculations with DFT, but I'm happy to at least have something consistently working with the total energy difference approach. It will only get better from here. I'll let y'all know when there is a public API to test your own materials with.
