I wanted to formalize in writing the idea that I keep coming back to for end-to-end material discovery. The hardest part of this project has been actually optimizing towards materials that have some property in a somewhat differentiable way.
If we could optimize (maximize) the superconducting critical temperature, T_c, we could find a room-temperature superconductor easily.
If we could optimize a magnetic property of interest, we could discover better magnets quickly.
This is a simplification, but the idea holds. There is a set of properties that need to be balanced in this process, and it should be possible to include those in the optimization too.
So far it's been very hard to direct material exploration with property values in mind, and especially hard to discover out-of-distribution property values.
MatterGen has not been very fruitful on that front yet.
Let's see the general structure.
A framework for discovering new materials with targeted properties using machine learning:
Foundation Model Training: Train a Machine Learning Interatomic Potential (MLIP) model on a diverse set of crystal structures to learn a meaningful latent space representation.
Property Prediction Layer: Build multiple specialized models that predict various material properties based on the latent representation vectors.
Interpretability Analysis: Apply SHAP analysis and other interpretability techniques to identify which latent space features drive specific material properties.
Decoder Development: Create a decoder model that can transform points in the latent space back into valid crystal structures.
Latent Space Optimization: Fine-tune latent vectors to maximize desired properties using the property prediction models as objective functions.
Material Generation: Decode the optimized latent vectors to generate novel crystal structures with enhanced properties.
Validation: Verify the predicted properties through simulation and eventually experimental synthesis.
This approach leverages the power of representation learning to navigate the vast materials design space efficiently while maintaining physical realizability.
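To make the latent space optimization step concrete, here is a minimal sketch, not our actual pipeline. It assumes the property model is just a scalar function of the latent vector; `toy_property`, `TARGET`, and `optimize_latent` are hypothetical names invented for illustration, and the finite-difference gradient stands in for the autodiff a real model would provide. Decoding the result back to a crystal structure is deliberately left out.

```python
from typing import Callable, List

Vector = List[float]


def numerical_gradient(f: Callable[[Vector], float], z: Vector, eps: float = 1e-5) -> Vector:
    """Central-difference estimate of df/dz (a real model would use autodiff)."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def optimize_latent(predict_property: Callable[[Vector], float], z0: Vector,
                    lr: float = 0.1, steps: int = 200) -> Vector:
    """Plain gradient ascent on the predicted property, starting from z0."""
    z = list(z0)
    for _ in range(steps):
        g = numerical_gradient(predict_property, z)
        z = [zi + lr * gi for zi, gi in zip(z, g)]
    return z


# Hypothetical stand-in for a trained property head: it peaks at TARGET.
TARGET = [1.0, -2.0, 0.5]


def toy_property(z: Vector) -> float:
    return -sum((zi - ti) ** 2 for zi, ti in zip(z, TARGET))


z_start = [0.0, 0.0, 0.0]            # e.g. the latent vector of a known material
z_opt = optimize_latent(toy_property, z_start)
print(toy_property(z_start), toy_property(z_opt))
```

On this toy objective the optimized vector ends up at the peak. With a real property model, the interesting question is whether the optimized vector still decodes to a valid structure.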
Using an MLIP model is not strictly necessary. We use one because the task of predicting energies, stresses, and forces (and magnetic moments) creates a latent space that we know has predictive power for the behavior of the system. We also know it's not a complete representation of the material.
As we attempt to build more property prediction models, we'll learn more about the holes in our latent representation. From there, we can improve the latent representation by concatenating additional features or training our own encoder.
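As a rough illustration of the "concatenate additional features" option, the sketch below standardizes a few extra descriptors across the dataset and appends them to each latent vector. The function names and the example descriptor values are hypothetical; which features to add is exactly the open question.

```python
from statistics import mean, pstdev
from typing import List

Matrix = List[List[float]]


def standardize_columns(rows: Matrix) -> Matrix:
    """Scale each column to zero mean and unit variance (constant columns become 0)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]   # avoid dividing by zero
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]


def augment_latents(latents: Matrix, descriptors: Matrix) -> Matrix:
    """Append standardized extra descriptors to each latent vector."""
    scaled = standardize_columns(descriptors)
    return [list(z) + d for z, d in zip(latents, scaled)]


# Hypothetical data: two materials, 2-d latents, two extra descriptors each.
latents = [[0.1, 0.2], [0.3, 0.4]]
descriptors = [[10.0, 1.0], [20.0, 1.0]]
augmented = augment_latents(latents, descriptors)
print(augmented)
```

Standardizing first keeps a descriptor with large raw values from dominating the learned latent dimensions in any downstream distance or regression.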
Latent Space Discontinuity: The latent space of MLIP (Machine Learning Interatomic Potential) models may not be continuous with respect to crystal structures. Tuning a vector in latent space might produce a point that doesn't correspond to a physically realizable material.
Property Correlation Trade-offs: Materials properties often have fundamental trade-offs (e.g., strength vs. ductility). Optimizing for one property may unavoidably degrade others, making multi-objective optimization challenging.
Extrapolation Reliability: The property prediction models will only be reliable within the domain of their training data. Optimizing in the latent space can drift into regions where predictions become unreliable.
Decoder Fidelity: Developing a high-quality decoder from latent space back to crystal structures is extremely difficult. Crystal structures have strict physical constraints (charge neutrality, coordination preferences, bond angles) that are hard to capture in generative models.
Symmetry and Periodicity: Crystal structures have symmetry and periodicity constraints that might not be preserved during latent vector manipulation and decoding.
Local vs. Global Minima: The optimization process in latent space might get stuck in local minima, missing potentially better solutions.
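Two of the challenges above, extrapolation reliability and local optima, have simple partial mitigations that can be sketched together. The code below penalizes candidates that move farther than some radius from the nearest training latent, and restarts the ascent from several starting points. Everything here is an assumption for illustration: the 1-d `toy_property`, the training latents, the radius, and the penalty weight are made up, and a distance penalty is only one of several ways to keep the optimizer in trusted territory.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Objective = Callable[[Vector], float]


def finite_diff_grad(f: Objective, z: Vector, eps: float = 1e-5) -> Vector:
    """Central-difference gradient; a real model would use autodiff."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def ascend(f: Objective, z0: Vector, lr: float = 0.02, steps: int = 500) -> Vector:
    """Gradient ascent from a single starting point."""
    z = list(z0)
    for _ in range(steps):
        z = [zi + lr * gi for zi, gi in zip(z, finite_diff_grad(f, z))]
    return z


def with_distance_penalty(predict: Objective, training_latents: Sequence[Vector],
                          radius: float, weight: float) -> Objective:
    """Subtract a quadratic penalty once z is more than `radius` from all training latents."""
    def objective(z: Vector) -> float:
        nearest = min(math.dist(z, t) for t in training_latents)
        return predict(z) - weight * max(0.0, nearest - radius) ** 2
    return objective


def multi_start(f: Objective, starts: Sequence[Vector]) -> Vector:
    """Run ascent from every start and keep the result with the highest objective."""
    return max((ascend(f, s) for s in starts), key=f)


# Hypothetical property model: several local maxima, and it keeps growing with z.
def toy_property(z: Vector) -> float:
    return math.sin(3 * z[0]) + 0.5 * z[0]


TRAINING_LATENTS = [[-1.0], [0.0], [1.0]]
STARTS = [[-1.5], [-0.5], [0.5], [2.5]]

safe_objective = with_distance_penalty(toy_property, TRAINING_LATENTS, radius=0.5, weight=10.0)
best_raw = multi_start(toy_property, STARTS)      # free to wander outside the data
best_safe = multi_start(safe_objective, STARTS)   # pulled back toward the data
print(best_raw, best_safe)
```

On this toy problem the unpenalized search settles on a peak well outside the training latents, while the penalized search returns the best peak near them. Multiple restarts do not guarantee a global optimum; they only reduce the chance of reporting a poor local one.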
The other approach we're looking at is centered around MatterGen and building physics-informed reward functions to incentivize generating materials with specific properties.
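To give a feel for what a physics-informed reward could look like, here is a minimal sketch. It does not use or describe MatterGen's API: the `Candidate` fields, the tolerance, and the weights are all hypothetical, and the values would in practice come from property models, an MLIP, or DFT.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    predicted_property: float   # output of a property model (hypothetical)
    energy_above_hull: float    # eV/atom, estimated stability (hypothetical)
    net_charge: float           # sum of assumed oxidation states


def reward(c: Candidate, hull_tolerance: float = 0.1,
           stability_weight: float = 5.0, charge_weight: float = 2.0) -> float:
    """Reward the target property, penalize instability and charge imbalance."""
    stability_penalty = stability_weight * max(0.0, c.energy_above_hull - hull_tolerance)
    charge_penalty = charge_weight * abs(c.net_charge)
    return c.predicted_property - stability_penalty - charge_penalty


stable = Candidate(predicted_property=1.0, energy_above_hull=0.05, net_charge=0.0)
unstable = Candidate(predicted_property=1.5, energy_above_hull=0.5, net_charge=0.0)
print(reward(stable), reward(unstable))
```

The point of the penalties is that a candidate with a higher predicted property can still score lower if it is unlikely to be stable or charge balanced.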
Check out 's work so far: