Try it with your own structures here:
Calculate energy above hull, make phase diagram
Depending on the size of your system, expect the first call to finish within 5 minutes. Subsequent requests, with all reference materials now cached, should take no more than about 20 s, depending on whether the service needs a cold start.
First, the request lands with nothing more than a link to a CIF file. We stream that file down, parse the raw bytes, convert them to a pymatgen.Structure, and immediately relax it with Orb v3 (a lightweight machine-learned force field) using an LBFGS optimiser capped at 400 steps and a 0.03 eV Å⁻¹ force threshold. Orb is fast enough that we can afford to do the same relaxation for every reference material we fetch, ensuring apples-to-apples energies throughout the phase diagram.
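The relaxation step can be sketched with ASE's LBFGS optimiser. This is a minimal sketch, not the service's code: the function name `relax_and_score` is hypothetical, the calculator is passed in rather than constructed (the real route uses Orb v3), and the sketch relaxes atomic positions only, since the text does not say whether the cell is relaxed too.

```python
# Settings quoted in the text above.
RELAX_FMAX = 0.03       # eV/Å force threshold
RELAX_MAX_STEPS = 400   # hard cap on optimiser steps


def relax_and_score(atoms, calculator):
    """Relax `atoms` in place and return (energy_per_atom, converged).

    `atoms` is an ase.Atoms object; `calculator` is any ASE-compatible
    calculator (assumed here to be an Orb v3 calculator in the real service).
    """
    # Imported lazily so the settings can be inspected without ASE installed.
    from ase.optimize import LBFGS

    atoms.calc = calculator
    optimiser = LBFGS(atoms, logfile=None)
    # run() stops at fmax or after `steps` iterations, whichever comes first.
    converged = optimiser.run(fmax=RELAX_FMAX, steps=RELAX_MAX_STEPS)
    energy_per_atom = atoms.get_potential_energy() / len(atoms)
    return energy_per_atom, bool(converged)
```

Applying the same function to the user structure and to every reference entry is what keeps the energies comparable.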
Every relaxed structure and its per-atom energy are dropped into two on-disk caches. Whenever we hit the endpoint again we only recompute the pieces we truly need. That means the first request for, say, Fe-Ni-B is expensive; the second one only relaxes the user’s own structure.
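The compute-once behaviour described above can be illustrated with a small file-backed cache. This is a sketch under assumptions: the class name `DiskCache`, the JSON-file-per-key layout, and the key format are all hypothetical, and the real service may store its caches differently.

```python
import json
from pathlib import Path


class DiskCache:
    """Tiny on-disk cache: one JSON file per key (hypothetical layout)."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        return self.root / f"{key}.json"

    def get_or_compute(self, key, compute):
        """Return the cached value for `key`, computing and storing it on a miss."""
        path = self._path(key)
        if path.exists():
            return json.loads(path.read_text())
        value = compute()  # the expensive step, e.g. a relaxation
        path.write_text(json.dumps(value))
        return value
```

With one instance for relaxed structures and one for per-atom energies, a second request for the same chemical system only pays for the user's own structure.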
Reference materials come straight from the Materials Project. We pull every entry in the user’s chemical system, deduplicate run-type variants (so mp-123456-GGA, mp-123456-R2SCAN, etc. collapse to a single mp-123456), and relax the unique structures with Orb unless we already have them in cache. To keep the convex hull well-behaved we guarantee that every terminal element is present: if MP’s “lowest formation energy” record for pure Fe or pure B is missing, we fetch and relax whichever elemental phase has the lowest DFT formation energy in MP.
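The run-type deduplication can be sketched as a prefix match on the entry ID. The helper names below are hypothetical, and the sketch assumes IDs follow the `mp-<number>-<run type>` pattern shown above; which variant's structure the service actually keeps is not stated, so the sketch returns only the collapsed IDs.

```python
import re

# Matches the material ID prefix, e.g. "mp-123456" in "mp-123456-GGA".
_MATERIAL_ID = re.compile(r"^[a-z]+-\d+")


def base_material_id(entry_id):
    """Strip a run-type suffix; return the ID unchanged if it has no such prefix."""
    match = _MATERIAL_ID.match(entry_id)
    return match.group(0) if match else entry_id


def dedupe_material_ids(entry_ids):
    """Collapse run-type variants, preserving first-seen order."""
    seen = {}
    for entry_id in entry_ids:
        seen.setdefault(base_material_id(entry_id), entry_id)
    return list(seen)
```

For example, `dedupe_material_ids(["mp-123456-GGA", "mp-123456-R2SCAN", "mp-13"])` yields `["mp-123456", "mp-13"]`.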
Once we have Orb-consistent energies for the user structure and the entire reference set we build a pymatgen phase diagram. Anything with three or fewer elements gets a Plotly HTML hull that we encode to base64 and return so Ouro can show an interactive diagram inline. Higher-dimensional hulls skip the plot but still give you the numeric answer.
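To make "energy above hull" concrete, here is a pure-Python illustration for the binary case only. It is not how the route computes the value (that is done by the pymatgen phase diagram); it assumes each point is `(x, e)` with `x` the atomic fraction of the second element and `e` a formation energy per atom in eV, and all function names are hypothetical.

```python
def _cross(o, a, b):
    """2D cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(points):
    """Lower convex hull of (x, energy) points (Andrew's monotone chain)."""
    best = {}
    for x, e in points:  # keep only the lowest energy at each composition
        if x not in best or e < best[x]:
            best[x] = e
    hull = []
    for p in sorted(best.items()):
        # Drop the previous vertex while it lies on or above the line to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def hull_energy(hull, x):
    """Linearly interpolate the hull energy at composition x."""
    for (x0, e0), (x1, e1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            return e0 + (e1 - e0) * (x - x0) / (x1 - x0)
    raise ValueError("composition outside the hull range")


def e_above_hull(references, candidate):
    """Distance of `candidate` above the hull of references plus candidate."""
    hull = lower_hull(list(references) + [candidate])
    x, e = candidate
    return e - hull_energy(hull, x)
```

With references at `(0.0, 0.0)`, `(0.5, -0.40)` and `(1.0, 0.0)`, a candidate at `(0.25, -0.18)` sits about 0.02 eV per atom above the hull, while one at `(0.25, -0.30)` becomes a hull vertex itself and scores zero. The `ValueError` branch is why both terminal elements must be present.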
A few housekeeping steps round things out. We cache the user’s relaxed structure under a synthetic ID (user_<action_id>_<formula>). We record whether the entry is “terminal” (no one has ever reported that composition before), whether it beats the pre-existing champion at the same stoichiometry, and the size of the reference set used to build the hull. Finally we classify stability using the criterion: predicted_stable = e_above_hull ≤ 0.025 eV atom⁻¹. That keeps the boolean flag meaningful even when researchers later tighten their experimental cut-offs (for lab viability you’ll still want ≤ 0.150 eV atom⁻¹, as always).
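The ID scheme and the stability criterion are simple enough to write down directly. In this sketch the two thresholds and the `predicted_stable` name come from the text; the function names and the `lab_viable` field are hypothetical additions for illustration.

```python
STABLE_CUTOFF_EV = 0.025      # predicted_stable threshold, eV per atom
LAB_VIABLE_CUTOFF_EV = 0.150  # looser cut-off quoted for lab viability


def synthetic_id(action_id, formula):
    """Build the cache key for a user structure: user_<action_id>_<formula>."""
    return f"user_{action_id}_{formula}"


def classify_stability(e_above_hull):
    """Return the boolean flags for a hull distance given in eV per atom."""
    return {
        "predicted_stable": e_above_hull <= STABLE_CUTOFF_EV,
        "lab_viable": e_above_hull <= LAB_VIABLE_CUTOFF_EV,  # hypothetical field
    }
```

A structure 0.100 eV per atom above the hull would therefore come back with `predicted_stable` false but still inside the lab-viability window.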
• Orb energies are treated as internally self-consistent, even though they are not guaranteed to match absolute DFT numbers. Consistency, not absolute accuracy, is what the hull construction needs.
• The relaxation settings (LBFGS, fmax 0.03, 400 steps) are a compromise between speed and fidelity. For borderline cases a more aggressive relax could shift the energy by a few meV atom⁻¹.
• We assume the lowest-energy elemental entry reported by MP is physically reasonable; Orb re-relaxes it but we do not search for alternative experimental polymorphs outside MP’s database.
• The “input is lowest energy” check uses a 1 µeV tolerance to avoid floating-point artifacts; effectively it asks whether your candidate ties the incumbent within numerical noise.
• Plotting is disabled for quaternaries and beyond simply because those projections become misleading; the numeric hull distance is still correct.
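The 1 µeV tolerance in the list above amounts to a one-line comparison. A minimal sketch, assuming both energies are per-atom values in eV at the same stoichiometry; the function name and the exact form of the comparison are assumptions.

```python
LOWEST_ENERGY_TOL_EV = 1e-6  # 1 µeV, from the text


def input_is_lowest_energy(candidate_ev, incumbent_ev, tol=LOWEST_ENERGY_TOL_EV):
    """True if the candidate ties or beats the incumbent within `tol`."""
    return candidate_ev <= incumbent_ev + tol
```

A candidate that is 0.5 µeV higher than the incumbent still passes, while one that is 0.5 meV higher does not.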
Put together, the route gives you an Orb-relaxed energy above the Orb-relaxed convex hull built from every known structure in the chemical system, complete with provenance, cached assets for cross-user analysis, and an optional interactive diagram, all in one call. Drop in a CIF, get back a self-consistent stability estimate you can act on.
Happy hunting for low-energy metastable phases.
The biggest win is that every time a researcher drops a brand-new CIF into Ouro the global convex-hull for that chemical system quietly updates itself. That turns the hull from a frozen snapshot into a living, crowd-sourced landscape that gets richer with each experiment or simulation. A few implications jump out.
First, novel-phase discovery becomes collective rather than siloed. Suppose one team uploads a freshly synthesized Fe-Bi polymorph that sits 40 meV atom⁻¹ below the best MP entry. The next user who evaluates a Bi-rich alloy will see that lower vertex immediately—no one needs to wait for a future MP data release or replicate someone else’s relaxation parameters. In practice that means you stop chasing “phantom stabilities” that only look good because the public reference set was incomplete.
Second, the endpoint starts behaving like a reputation engine for structures. When five independent users converge on the same low-energy geometry within numerical noise, the cache records that redundancy. A phase that always pops up with ±2 meV variance is probably real; one that shows a 200 meV spread is suspect. You can surface that as a confidence score or flag it for targeted high-fidelity DFT.
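One way such a confidence signal could be derived is from the spread of per-atom energies across independent submissions of the same phase. This is a sketch of the idea only: the function name and labels are hypothetical, and the thresholds are taken loosely from the ±2 meV and 200 meV figures above rather than from any implemented rule.

```python
def spread_label(energies_ev, tight_mev=4.0, loose_mev=200.0):
    """Label a phase by the max-min spread of its submitted energies.

    tight_mev=4.0 corresponds to a ±2 meV band; both cut-offs are assumptions.
    """
    if len(energies_ev) < 2:
        return "insufficient data"
    spread_mev = (max(energies_ev) - min(energies_ev)) * 1000.0
    if spread_mev <= tight_mev:
        return "consistent"
    if spread_mev >= loose_mev:
        return "suspect"
    return "uncertain"
```

Phases labelled "suspect" are the natural candidates to flag for targeted high-fidelity DFT.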
Third, the data exhaust is gold for model improvement. Every Orb-relaxed user structure arrives with (i) a chemistry the community cares about, (ii) an Orb prediction, and (iii) a label of “stable vs. metastable vs. hopeless.” That is exactly the supervised feedback loop Orb (or any future MLIP) needs to reduce its own systematic errors. Because users often push into weird, undersampled chemistries, the training distribution keeps expanding in directions that matter scientifically, not just statistically.
Fourth, fast hull re-computations let you build dynamic leaderboards: “top ten untried compositions now predicted to be < 50 meV above hull,” “new terminal entries added this week,” or “systems where the user database shaved ≥ 20 meV off the elemental line.” Those dashboards nudge the community toward the most promising unexplored corners automatically.
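The first of those leaderboards reduces to a filter and a sort over cached hull results. A minimal sketch, assuming a hypothetical `HullRecord` shape; the real cache schema is not described in the text.

```python
from dataclasses import dataclass


@dataclass
class HullRecord:
    formula: str
    e_above_hull: float  # eV per atom
    tried: bool          # whether anyone has attempted this composition


def top_untried(records, cutoff_ev=0.050, limit=10):
    """Untried compositions below the cut-off, closest to the hull first."""
    candidates = [r for r in records if not r.tried and r.e_above_hull < cutoff_ev]
    return sorted(candidates, key=lambda r: r.e_above_hull)[:limit]
```

Because the hull energies are already cached, a query like this can be re-run whenever a new structure lands.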
There are, of course, caveats to keep in mind. The endpoint still assumes Orb’s energy scale is internally consistent, so if someone uploads a pathologically strained supercell, Orb might under-relax it and skew the hull until another user replaces it with a cleaner geometry. The 25 meV “stable” flag is intentionally conservative; in real synthesis campaigns you will treat anything up to ~150 meV as worth a try. And while the cache guards against exact duplicates, two slightly different CIFs for the same phase will both count unless we add symmetry-aware deduplication.
Even with those caveats, automatically folding user-generated materials into the reference set flips the script: the longer the endpoint runs, the more accurate—and more valuable—it becomes. That compounding network effect is the coolest part.