Open research towards the discovery of room-temperature superconductors.
Using what we learned when trying to use the MLFF's latent space for Tc prediction, there's a way we can simplify things for the prediction model and give it a better chance of picking up on the signal that determines superconductivity.
I'll summarize what we learned: predicting Tc from the ground state alone may be theoretically impossible (more on this later). We actually had fine performance doing this, but what was most likely happening was that we were learning a model that recognizes material classes, not superconductivity. If you know a superconductor from a specific chemical family (cuprate, hydride, etc.), you can make rough predictions that other materials of that family will have a similar Tc. And because the dataset we were training on was almost entirely superconductors, we also weren't learning when a material might not have a Tc at all.
I think of Wolfram's idea of computational irreducibility. We have (from the Wikipedia page):
Many physical systems are complex enough that they cannot be effectively measured. Even simpler programs contain a great diversity of behavior. Therefore no model can predict, using only initial conditions, exactly what will occur in a given physical system before an experiment is conducted.
This is exactly the kind of setup we're working with. We have some ground state material (initial conditions) and we're trying to predict some phase shift, bypassing all of the complexity that happens to the material as it heats up to Tc. You can't just bypass these computations. Temperature increases affect a material in different (complex) ways, including phase shifts, lattice and structural changes, etc. Not accounting for these, even though we know superconductivity is affected by them, is a recipe for failure.
Learning from this, we can redesign our model such that we don't skip over these computations.
Instead of a regression model like last time, this time we'll build a classifier. Instead of input material to output Tc, we take our time and ramp up the material in temperature through MD simulations. At every step or increase in temperature, we classify if the material in this state is superconducting, or not. Simple binary classification.
The dataset is easy to generate now that we're set up this way. For each superconducting material in the 3DSC dataset, we run an MD temperature-ramping simulation, tracking atomic trajectories, Orb latent space output, measured and expected temperature, and finally whether or not it should be superconducting, given by: label = 1 if T ≤ Tc, else label = 0.
Note that at exactly T = Tc, there's technically a phase transition point where the function isn't well-defined, but for our purposes we'll assume superconductivity.
This approach allows us to generate plenty of data to train on. Depending on the MD parameters (time step, measurement frequency, etc.) we can make thousands of data points from just a single superconducting material.
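The labeling scheme above can be sketched in a few lines. This is a minimal sketch of the per-step labels only; the actual MD (atomic trajectories, Orb latent output) is run separately, and the ramp parameters here are illustrative, not the ones used in the experiments.

```python
def label(T, Tc):
    """Binary superconductivity label: 1 if T <= Tc, else 0.
    At exactly T == Tc we assume the superconducting state,
    matching the convention described above."""
    return 1 if T <= Tc else 0

def ramp_labels(Tc, T_start=0.0, T_end=200.0, step=1.0):
    """One (temperature, label) pair per ramp step for a material
    with known critical temperature Tc."""
    pairs = []
    T = T_start
    while T <= T_end:
        pairs.append((T, label(T, Tc)))
        T += step
    return pairs

# A material with Tc = 37.7 K ramped from 0 K to 200 K in 1 K steps
# yields 201 labeled configurations from a single material.
data = ramp_labels(37.7)
```

This is where the "thousands of data points per material" comes from: each temperature step contributes one labeled configuration.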
The challenge so far has been getting a nicely balanced dataset and eval, since it's easy to ramp temperatures well above Tc and generate a lot of negative (0 class) samples. Then again, balance may not be the most important thing, since the imbalance is true to the real world, and even more so given that we haven't yet added any non-superconducting materials to train on. Those will be important for properly learning to discriminate materials that have no transition at all.
Currently still generating more training data. Because we run MD on every material in the dataset, we're looking at (at least) a few days of GPU time to run through all ~6000 materials.
From this point on, I'll be sharing early results from models trained on just 1564 materials.
Given that fundamentally this is a binary classification task, we have a well understood set of metrics we can study to learn more about the performance of our model.
Looking at just the prediction label metrics, the results are shockingly good...
Val set size: 3465 configurations
Validation Accuracy: 0.905
Validation ROC AUC: 0.964
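For reference, here's what those two metrics actually measure, computed from scratch on toy data (in practice scikit-learn's `accuracy_score` / `roc_auc_score` do this; the numbers below are illustrative, not our validation results):

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of thresholded predictions matching the true labels."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)

def roc_auc(y_true, y_prob):
    """ROC AUC == probability that a random positive sample scores
    above a random negative one (ties count half)."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 0, 0, 0]
y_prob = [0.9, 0.4, 0.6, 0.2, 0.1]
accuracy(y_true, y_prob)  # 0.6
roc_auc(y_true, y_prob)   # 5/6 ≈ 0.833
```

The ROC AUC is the more informative of the two here, since it's threshold-free and less sensitive to the class imbalance discussed above.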
Though the validation set is used for early stopping, the model hasn't trained on any of the materials evaluated here. That said, I should note that results could again be artificially inflated by material class and chemical family learning, rather than superconductivity mechanism learning. Considering we want to use this model to discover new families of superconductors, I'll work to develop a better way to segment materials so that we can evaluate fairly.
Because of the dataset creation process, this matrix doesn't really tell us much. Again, depending on the simulation parameters, the weighting of each class can produce misleading results. Though the weighting isn't that bad here: 1000 superconducting configurations to 2000 non-superconducting is okay.
One of the downsides to this approach is that we are now dependent on running MD simulations until we find Tc. This is computationally more intensive than a single pass through a model, but we find that it's still good enough for high-throughput discovery, especially when using Orb or other MLIP/MLFF models to handle the MD calculations.
And maybe we'll find that simply setting temperature directly and stabilizing for a few steps is good enough to approximate a proper temperature ramping from 0 K.
For now we're just testing superconductors, so at the start of the simulation we find predictions near 1 (the classifier gives probabilities, and we can set the threshold at 0.5 to decide 1 or 0). As the temperature increases, we notice a critical point where predictions suddenly fall much closer to 0. See the following example:
When I first saw a plot like this I was really excited. Seeing the model predict a phase shift so clearly near the expected Tc was exactly what we were hoping for. The example above is about as clear-cut as it gets: though the transition is not instantaneous (shown by the two points between the true and estimated Tc), it's a pretty good approximation, and the probabilities are confident in their classification.
Let's continue to look at some more predictions and see what we can learn. In the next example, we'll see a material for which the model could not find a Tc:
What's interesting here is that early on, P(superconducting) shrinks rapidly, but that doesn't last long, and it ends up in an unstable (and uncertain) state of predicted superconductivity. In reality, this material loses superconductivity at just 1.2 K. For now I'll assume this is a training-set size issue, but it's noted here for reference.
This example also shows why the confusion matrix is pretty useless. This material was completely wrong, so every point beyond 1.2 K w/ P(superconductivity) > 0.5 is an error, and we can manufacture more of them just by ramping to higher temperatures (in this experiment we stop at 200 K because we know all materials in the dataset have a Tc below that).
In the final example, let's showcase some higher-Tc predictions and the model's ability to classify materials when the kinetic and potential energy of the system is greater.
Here, we look at a material with a Tc of 37.7 K.
Although rougher than our first example, the predictions are still decently close to the true value. Note that the predicted Tc is not determined by the first temperature with a probability below 0.5, but by a smoothing over the 5 nearest points that meet the threshold. This is why the predicted Tc can be higher than the first temperature where the prediction drops below 0.5.
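One plausible reading of that smoothing rule (the exact scheme is my assumption here, and `predict_tc` is an illustrative name): smooth P(superconducting) with a 5-point moving average, then take the highest temperature whose smoothed probability still clears the threshold as the predicted Tc.

```python
def predict_tc(temps, probs, window=5, threshold=0.5):
    """Predicted Tc = highest temperature whose 5-point moving-average
    probability is still >= threshold; None if no point qualifies."""
    half = window // 2
    smoothed = []
    for i in range(len(probs)):
        lo, hi = max(0, i - half), min(len(probs), i + half + 1)
        smoothed.append(sum(probs[lo:hi]) / (hi - lo))
    above = [T for T, p in zip(temps, smoothed) if p >= threshold]
    return max(above) if above else None

# A single raw dip below 0.5 at 15 K is smoothed away by its
# neighbors, so the predicted Tc lands at 15 K, not 10 K.
predict_tc([0, 5, 10, 15, 20, 25, 30],
           [0.95, 0.9, 0.7, 0.45, 0.6, 0.2, 0.1])  # 15
```

This shows the effect described above: a noisy single-point dip below the threshold doesn't terminate the superconducting region on its own.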
Currently running! Will update.
Update: 12 hours later, still running.
It's apparently possible that a material can transition between superconducting and normal states multiple times as temperature changes.
The underlying physics typically involves competing orders in the material - for example, the interplay between magnetic ordering and superconductivity. As temperature changes, different ordering mechanisms can dominate, leading to these multiple transitions.
This behavior is interesting because it challenges our traditional understanding of superconductivity as simply occurring below a critical temperature. It shows how quantum effects and competing interactions can create more complex phase diagrams than initially expected.
It's possible that our model has learned this behavior. Not because the phenomenon is in the training data (in fact it's not, which could be problematic), but because it really is picking up a signal on the drivers of the superconducting state.
The estimated Tc here is incorrect: the code currently takes the first Tc we find, not the maximum Tc, because we hadn't anticipated this behavior. This also calls into question all the prior calculations, since I've been early-stopping the MD so that we don't waste compute once we've already found Tc (judged by the first zone of non-superconductivity, expecting that to hold for all greater T).
The probabilities here are not very crisp, but the bands of SC are pretty clear:
0 to 2 K, SC
2 to 11 K, no-SC
11 to 25 K, SC
25 and greater, no-SC
So instead of the predicted 1.6 K Tc, it's closer to 25. While still underestimating, that's a lot better.
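A fix for the early-stopping bug would be to scan the whole ramp for contiguous superconducting bands instead of stopping at the first dropout, and report the top of the last band. A minimal sketch (the function name and toy data are illustrative):

```python
def sc_bands(temps, probs, threshold=0.5):
    """Contiguous (T_start, T_end) bands where P(SC) >= threshold,
    scanning the full ramp instead of stopping at the first drop."""
    bands, start, prev = [], None, None
    for T, p in zip(temps, probs):
        if p >= threshold and start is None:
            start = T                      # band opens
        elif p < threshold and start is not None:
            bands.append((start, prev))    # band closes at previous T
            start = None
        prev = T
    if start is not None:                  # band still open at ramp end
        bands.append((start, prev))
    return bands

# Re-entrant toy example mirroring the bands above:
# SC up to 2 K, normal to 11 K, SC again to 25 K, normal after.
bands = sc_bands([0, 1, 2, 5, 11, 20, 25, 26],
                 [0.9, 0.8, 0.6, 0.3, 0.7, 0.8, 0.6, 0.2])
tc = bands[-1][1] if bands else None       # 25, not the first band's 2
```

Reporting `bands[-1][1]` (or the maximum band edge) instead of the first dropout temperature would have given ~25 K here rather than 1.6 K.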
Other materials with this behavior found so far:
Ba2Cu1Hg1O4.058-MP-mp-6562-synth_doped (high Tc)
Bi2Ca1Cu1.988Ni0.012Sr2O8-MP-mp-1218930-synth_doped (high Tc)
As1Ce1F0.2Fe1O0.8-MP-mp-605060-synth_doped
Something we looked at closely last time was how well the model could predict critical temperatures for materials that had a Tc greater than what had been seen in the training data. If we're to discover a room-temperature superconductor, this is likely to be a fundamental requirement.
This time around, we may have a better shot. We've set up the model so that it has a better chance of learning the driving factors that activate or disable the superconducting state. It's temperature independent this time. Before, the regression model predicted Tc directly, so it was directly tied to temperature. Now, there is no real concept of temperature except for how it may be encoded in the latent space.
This test will give us a sense of if we're learning material class features like before or if we're learning the driving factors. That said, it's far more nuanced than that. There are multiple paths to superconductivity, thus different mechanisms available to learn.
In this experiment, we filter our data to Tc <= 80 K and attempt to predict the materials we just filtered out (Tc > 80 K).
Unfortunately, no predictions came out greater than 80 K. Most of these materials were of the BaCaCu and BiCaCu families. Perhaps because of the temperature cutoff, we did not learn this type of superconductor (or the model does not learn superconducting drivers at all).
Changing up the approach
Seeing these results has me thinking that the model may be using temperature and material encoded together to make predictions instead of looking for superconductivity drivers.
So what if we could remove the part that encoded temperature?
Looking at the model's feature importance, we find one feature dominating all the rest: feature number 256. Removing it before training and prediction has some really interesting results.
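The ablation itself is just a column drop. A minimal sketch, where only the index 256 comes from the feature-importance run; the latent dimensionality (512 here) and helper name are arbitrary placeholders:

```python
TEMP_FEATURE_IDX = 256  # the dominant (suspected temperature-encoding) feature

def drop_feature(X, idx=TEMP_FEATURE_IDX):
    """Remove one latent dimension from every feature vector, applied
    identically before training and before prediction."""
    return [row[:idx] + row[idx + 1:] for row in X]

X = [list(range(512))]       # one illustrative 512-dim latent vector
X_ablated = drop_feature(X)  # 511 dims, index 256 removed
```

The important detail is that the same drop is applied at train and inference time, so the classifier never sees the dimension that dominated importance.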
Now, the model has no problem predicting temperatures outside of the training range! So does it still perform?
I think we're on to something with this approach. I'm definitely feeling the burn of weak hardware right now. Even GPU-accelerated, simulations have been running for a couple of days non-stop, and some MD runs to predict Tc take hours while others take minutes (the number of atoms being the key factor in runtime). Not ideal for high-throughput discovery on consumer hardware, but easy to solve with a couple of A100s. Either way, this is significantly more approachable than the traditional need for supercomputers.
Some next steps for improvement are as follows:
Finish dataset generation of all ~6000 materials (3 GPU days)
Add more non-superconducting materials to the dataset
Finish model evaluation (14 GPU days ??)
Proper model tuning, hyper-parameter optimization