This post describes our first full training run, which inverted an earlier task: instead of turning CIF output into JSON, we aimed for Qwen 2.5 to take a description of a crystal structure and return a valid CIF. The logged metrics looked promising, with completion lengths climbing toward the planned 756 tokens, but we should have watched the raw policy outputs more closely. Between steps 70 and 100, the policy learned that repeating tokens could earn a good reward: each completion still opened with CIF-like tokens, but the output then degraded into repetition. Example outputs contained many copies of the same data-field lines rather than a valid CIF structure. This kind of degeneration is a common failure mode in LLM RL post-training. The next run will add a stronger divergence penalty and better monitoring so we can track raw policy outputs more reliably. More updates will follow.
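
The repetition collapse is a textbook case of reward hacking, and one cheap guard is to dock reward for completions whose lines are mostly duplicates before any CIF-validity scoring runs. Below is a minimal sketch of such a check, assuming completions arrive as plain text; the function name, threshold, and reward scale are illustrative assumptions, not the code from our run.

```python
from collections import Counter

def repetition_penalty_reward(completion: str, max_duplicate_fraction: float = 0.2) -> float:
    """Return a penalty in [-1, 0] based on how many lines are duplicates.

    Hypothetical helper: 0.0 means no penalty, -1.0 means the completion is
    essentially one line repeated over and over.
    """
    lines = [line.strip() for line in completion.splitlines() if line.strip()]
    if not lines:
        return -1.0  # empty output gets the full penalty
    counts = Counter(lines)
    duplicates = sum(count - 1 for count in counts.values())
    duplicate_fraction = duplicates / len(lines)
    # No penalty below the threshold, scaling linearly to -1.0 at full repetition.
    excess = max(0.0, duplicate_fraction - max_duplicate_fraction)
    return -min(1.0, excess / (1.0 - max_duplicate_fraction))

if __name__ == "__main__":
    degenerate = "\n".join(["_cell_length_a 5.43"] * 50)
    print(repetition_penalty_reward(degenerate))  # close to -1.0
```

In practice a check like this would sit alongside the stronger divergence penalty rather than replace it, since a line-level heuristic only catches the most blatant repetition.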