This is a continued deep-dive into the latent space generated by the Orb model prior to its MLFF tasks. I have been attempting to train a model for Tc prediction using this latent space as a feature vector. More on that here:
In the MatterSim paper, the authors proposed the idea of using the MLFF's latent space as a direct property-prediction feature set.
In this post, I'll be focusing purely on what we can learn from the latent space itself: what patterns emerge related to Tc, material composition and class, structure, and so on.
We can visualize the latent space with dimensionality reduction using UMAP and t-SNE. Try this interactive Plotly visualization!
Using the 256-dimensional latent-space output from the Orb model, we visualize the 3DSC(MP) dataset with t-SNE and UMAP. The UMAP projection is given the Tc values as a supervision target, so it learns a manifold that keeps materials with similar Tc close together.
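For anyone wanting to reproduce this kind of projection, here is a minimal sketch using scikit-learn's t-SNE on a stand-in latent matrix. The random `latents` and `tc` arrays are placeholders for the real Orb latent vectors and 3DSC(MP) labels; for the supervised UMAP variant, umap-learn accepts the target via `fit_transform(X, y=tc)` with `target_metric="l2"` for a continuous label.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
latents = rng.normal(size=(60, 256))   # stand-in for per-material Orb latent vectors
tc = rng.uniform(0, 90, size=60)       # stand-in Tc labels (K), unused by plain t-SNE

# Unsupervised 2-D embedding of the 256-d latent space
emb = TSNE(n_components=2, perplexity=10, init="random",
           random_state=0).fit_transform(latents)
print(emb.shape)  # (60, 2)

# Supervised UMAP equivalent (requires umap-learn; shown for reference):
# import umap
# emb = umap.UMAP(target_metric="l2").fit_transform(latents, y=tc)
```

t-SNE ignores the labels entirely, which is why the supervised UMAP view tends to show cleaner Tc gradients than the t-SNE view.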
Make sure to bring the visualization into fullscreen with the expand button at the top right. There are two side-by-side visualizations here.
Another version of the visualization with better point labeling:
Using the 256-dimensional latent-space output from the Orb model, we visualize the 3DSC(MP) dataset using UMAP supervised by the Tc labels. Hover over a point to see its Tc, formula, and Materials Project identifier.
We can see that clusters of similar Tc often correspond to materials of the same class, e.g. cuprates, Ba-Cu families, and other chemical systems.
The paper is somewhat basic (and probably still a preprint), but the contribution is nonetheless great!