SynthLabs (@synth_labs)'s Twitter Profile
SynthLabs

@synth_labs

Scaling Up Good Synthetic Reasoning

We're hiring! ➡️ jobs.synthlabs.ai

ID: 1524469250253066240

Link: https://www.SynthLabs.ai
Joined: 11-05-2022 19:20:00

127 Tweets

14.14K Followers

47 Following

Our new method (ALP) monitors solve rates across RL rollouts and applies inverse difficulty penalties during RL training. Result? Models learn an implicit difficulty estimator—allocating 5x more tokens to hard vs easy problems, cutting overall usage by 50% 🧵👇1/10
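A rough sketch of how a penalty like this could be computed for one prompt's group of rollouts. The function name `alp_rewards`, the `beta` coefficient, the `max_len` normalizer, and the `1/K` floor on the solve rate are all assumptions made for illustration; the tweet does not give the exact formula. The idea shown: the empirical solve rate across rollouts scales a per-token length penalty, so easy prompts (high solve rate) are penalized heavily for long answers and hard prompts (low solve rate) are penalized lightly.

```python
from statistics import mean


def alp_rewards(correct, lengths, max_len, beta=0.1):
    """Hypothetical solve-rate-scaled length penalty for one prompt.

    correct: list of bools, one per rollout (was the answer right?)
    lengths: list of token counts, one per rollout
    max_len: token budget used to normalize lengths (assumed)
    beta:    penalty strength (assumed hyperparameter)
    """
    if not correct or len(correct) != len(lengths):
        raise ValueError("correct and lengths must be non-empty and equal length")
    if max_len <= 0:
        raise ValueError("max_len must be positive")

    k = len(correct)
    # Empirical solve rate across the K rollouts of this prompt.
    solve_rate = mean(1.0 if c else 0.0 for c in correct)
    # Floor at 1/K (an assumption) so unsolved prompts still get a small penalty.
    weight = max(solve_rate, 1.0 / k)

    # Reward = correctness minus a length penalty scaled by how easy the prompt is.
    return [
        (1.0 if c else 0.0) - beta * weight * (n / max_len)
        for c, n in zip(correct, lengths)
    ]


if __name__ == "__main__":
    easy = alp_rewards([True, True, True, True], [100, 200, 300, 400], max_len=400, beta=0.5)
    hard = alp_rewards([False, False, False, True], [400, 400, 400, 400], max_len=400, beta=0.5)
    print("easy prompt rewards:", easy)
    print("hard prompt rewards:", hard)
```

Under these assumptions, a full-length answer on the always-solved prompt loses four times as much reward as a full-length answer on the prompt solved once in four rollouts, which is the kind of pressure that would push a policy to spend fewer tokens on easy problems.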