In this post I'll share some of the work I've been doing on a Curie temperature prediction model. I finally found a decent dataset to work with. More on that here:
Sharing some notes as I read this paper. I uploaded it here for reference. I came across it while looking for a Curie temperature dataset, and it's the best I've found so far.
The authors graciously made a repo available with their compiled dataset. Like many datasets compiled from literature, it contains only chemical formula and Curie temperature, so the first part of this project was matching chemical formulas to crystal structures.
So far, all I've done is standardize the formulas (reduce them to integer stoichiometry, as Materials Project expects), then search the database for matches. A more robust approach might borrow from what the authors of 3DSC did to match formulas from the SuperCon dataset. Additionally, I've only searched Materials Project, so leveraging ICSD is an easy next step.
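pymatgen's `Composition` handles this properly; just to make the step concrete, here's a stdlib-only sketch of the reduction (a naive parser that ignores parentheses, hydrates, and repeated elements):

```python
import re
from fractions import Fraction
from math import gcd, lcm

def standardize(formula: str) -> str:
    """Convert a possibly fractional formula (e.g. Fe0.5Co0.5) to the
    smallest integer stoichiometry (FeCo). Naive: no parentheses support."""
    tokens = re.findall(r"([A-Z][a-z]?)([0-9.]*)", formula)
    amounts = {el: Fraction(amt) if amt else Fraction(1) for el, amt in tokens if el}
    # Scale by the LCM of denominators so every subscript becomes an integer,
    # then divide by the GCD to reach the smallest integer formula.
    scale = lcm(*(a.denominator for a in amounts.values()))
    ints = {el: int(a * scale) for el, a in amounts.items()}
    g = gcd(*ints.values())
    return "".join(f"{el}{n // g if n // g > 1 else ''}" for el, n in ints.items())
```

So `standardize("Fe0.5Co0.5")` gives `"FeCo"`, the integer form the database search expects.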
From the original ~35,000 rows, there were ~13,000 unique chemical formulas; many rows were duplicate formulas with differing Curie temperatures. There was no further information on how to properly deduplicate or which Curie temperature to go with, so while grouping by chemical formula I kept summary statistics: count, mean, min, max, and standard deviation.
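In pandas this is a single `groupby().agg()`; the same idea in plain Python (the stat names here are my own):

```python
from collections import defaultdict
from statistics import mean, pstdev

def dedupe(rows):
    """rows: (formula, curie_temp) pairs -> per-formula summary statistics."""
    groups = defaultdict(list)
    for formula, tc in rows:
        groups[formula].append(tc)
    return {
        f: {
            "count": len(tcs),
            "mean": mean(tcs),
            "min": min(tcs),
            "max": max(tcs),
            # population std dev; 0 for singletons rather than an error
            "std": pstdev(tcs) if len(tcs) > 1 else 0.0,
        }
        for f, tcs in groups.items()
    }
```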
In general, the standard deviations are low, within ~10 degrees. The exceptions are worth exploring further to understand why. It's very likely those temperatures represent different crystal structures with the same stoichiometry, and capturing that nuance would be very useful for this use case. Perhaps we could go back to the literature where they were first published and check for any information on crystal structure.
Looking at the 13,000 unique chemical families, we take the average Curie temperature for each and plot a histogram of those temperatures.
```
Number of materials: 12977
Mean Tc: 315.4 K
Median Tc: 256.0 K
Min Tc: 0.0 K
Max Tc: 1434.1 K
Materials with Tc > 298 K: 5682 (43.8%)
```
This distribution matches what I've seen for the expected share of Curie temperatures below room temperature. I'd seen figures that 50-60% of materials have Tc below room temperature, therefore disqualifying them as candidates for useful permanent magnets. Good to see we have a similar distribution in our dataset.
You can find the compiled dataset here:
This is a first draft of a compiled Curie temperature dataset mapping crystal structure (from Materials Project) to Curie temperature. Builds on the work of https://github.com/Songyosk/CurieML. Dataset includes ~6,800 unique materials representing 3,284 unique chemical families.
Searching the 13,000 chemical formulas in the Materials Project database found ~6,800 different crystal structures representing 3,284 unique chemical families. This is because a single stoichiometry can have multiple matches in the database; for example, ZrVFe matches mp-1215241 and mp-1215261. Looking at this example more closely, there does not appear to be a significant difference between the two structures. In cases like this, we can use minimum energy above hull to choose the more likely structure.
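The tie-break is just a min over energy above hull (the entry closest to the convex hull is the most likely observed structure). With placeholder hull energies, illustrative only and not the real values for these two entries:

```python
# Illustrative only: placeholder energy_above_hull values (eV/atom),
# not the real Materials Project data for these entries.
candidates = [
    {"material_id": "mp-1215241", "energy_above_hull": 0.02},
    {"material_id": "mp-1215261", "energy_above_hull": 0.05},
]

# Pick the most thermodynamically stable match for the duplicate formula.
best = min(candidates, key=lambda c: c["energy_above_hull"])
```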
Even though we were able to find 6,800 matches in MP, only ~3,000 unique stoichiometries are represented, with no clear answer as to which structure is correct. It's also very possible the actual structure is not among the matches. Some of this we will never know, so we need to be okay with "good enough".
Following some of the techniques from 3DSC, we could:

- Instead of only looking for exact matches, allow matching chemical formulas that differ by a constant factor. For example, CuLa2O4 could be matched with Cu2La4O8 (with a relative factor of 1/2).
- Rank matches by: a) energy above hull (Ehull), lower values preferred; b) total weighted relative difference (Δtotrel), lower values preferred.
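The constant-factor check is cheap to sketch: parse both formulas and see whether every element's amount differs by the same ratio (naive parser, no parentheses):

```python
import re
from fractions import Fraction

def parse(formula):
    """Naive formula parser: 'Cu2La4O8' -> {'Cu': 2, 'La': 4, 'O': 8}."""
    return {el: Fraction(n) if n else Fraction(1)
            for el, n in re.findall(r"([A-Z][a-z]?)([0-9.]*)", formula)}

def relative_factor(f1, f2):
    """Return the constant factor relating two formulas, or None if they
    are not the same stoichiometry (e.g. CuLa2O4 vs Cu2La4O8 -> 1/2)."""
    a, b = parse(f1), parse(f2)
    if set(a) != set(b):
        return None
    ratios = {a[el] / b[el] for el in a}
    return ratios.pop() if len(ratios) == 1 else None
```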
As any data scientist knows, your time is better spent on the dataset; nothing more is needed beyond XGBoost/CatBoost and some cross-validation. Funny enough, the more papers I read on property prediction, the more I find people 'independently' discovering that random forests/decision trees do the best. Same findings in the DS world.
I'll be going back to continue to improve the dataset but I wanted to have something end-to-end before refining.
This experiment takes the same approach as my earlier work on critical temperature prediction for superconductors: we take the latent vector of a GNN (graph neural network) MLIP (machine learning interatomic potential) model and use it as our feature vector.
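Concretely, the fixed-length feature comes from pooling the model's per-atom embeddings into one crystal-level descriptor; assuming a simple mean-pool readout (the actual readout in CHGNet/Orb may differ), the idea is:

```python
def crystal_feature(atom_embeddings):
    """Mean-pool per-atom GNN embeddings (n_atoms x d) into a single
    length-d crystal-level feature vector, independent of cell size."""
    n = len(atom_embeddings)
    d = len(atom_embeddings[0])
    return [sum(atom[j] for atom in atom_embeddings) / n for j in range(d)]
```

Because the pool averages over atoms, the same stoichiometry in a 1x and 2x supercell yields the same feature vector.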
I've been finding that an MLIP matters here, compared to a plain MLFF, because of its magnetic understanding of the material. We see those results replicated here, where CHGNet outperforms Orb despite Orb generally performing better on MD and on benchmarks.
Interestingly, the CHGNet feature vector is only 128-dimensional where Orb's is 256. Additionally, CHGNet is ~500K parameters where Orb is ~25M...
I'll be evaluating some other models with magnetic understanding soon too, likely pushing performance even further. For example, SevenNet:
New MLIP model on the leaderboards! Currently #2 with an F1 score of 0.884. Congrats to the team. They provide a few pre-trained models as well as an ASE calculator for MD. Great stuff.
Decent results so far. I know there is still a lot we can do to improve the dataset, and training on multiple structures with the same Tc doesn't feel right, so I'm not putting much weight on these initial results. There is also plenty of feature engineering to do: beyond the latent vector, we could include the mean and sum magnetic moment. This was found to be the single most predictive feature by Jung et al., the original authors of this dataset.
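Tacking those on is trivial once we have per-site magnetic moments; a hypothetical helper (the point is just concatenating scalars onto the latent vector):

```python
def augment_features(latent, site_magmoms):
    """Append mean and sum magnetic moment to the GNN latent vector.
    (Mean magnetic moment was the top feature in Jung et al.)"""
    mean_m = sum(site_magmoms) / len(site_magmoms)
    return list(latent) + [mean_m, sum(site_magmoms)]
```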
With an R-squared value of 0.89, we can look at expected vs. predicted temperatures for a test set of ~1,200 materials.
Not bad. There's a lot of variance below 400 K, but if we're primarily using this model as a material screening filter, that range matters less. Despite a relatively high R², I think there is a lot of room to improve.
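For reference, R² here is just one minus the residual sum of squares over the total sum of squares:

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A perfect model scores 1.0; always predicting the mean Tc scores 0.0, so 0.89 on the held-out set is well above the trivial baseline.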
Cross-validation also told a somewhat different story. In 5-fold cross-validation, the metrics were as follows:

```
Best R² score: 0.713
Mean MAE: 111.523
Mean RMSE: 165.627
```
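scikit-learn's `KFold` does the splitting; mechanically (without the shuffling you'd normally add), it's just:

```python
def kfold_indices(n, k=5):
    """Yield (train, test) index lists; each sample lands in exactly one
    test fold, so every material is scored out-of-sample once."""
    # Spread the remainder across the first n % k folds
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    idx = list(range(n))
    start = 0
    for size in sizes:
        test = idx[start:start + size]
        train = idx[:start] + idx[start + size:]
        yield train, test
        start += size
```

One caveat for this dataset: with multiple structures sharing a formula, a grouped split (all structures of a formula in the same fold) would give a more honest estimate than plain K-fold.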