SynthLabs (@synth_labs)'s Twitter Profile

Scaling Up Good Synthetic Reasoning

We're hiring! ➡️ jobs.synthlabs.ai

ID: 1524469250253066240

https://www.SynthLabs.ai 11-05-2022 19:20:00

127 Tweets

14.14K Followers

47 Following

Daniel van Strien (@vanstriendaniel)

Big-Math: Massive Math Dataset for RL Training

- 10x larger than GSM8k/MATH
- 3 core properties: uniquely verifiable, open-ended, closed-form
- Human-validated 90%+ precision filters
- Difficulty metrics for curriculum learning

Nebius (@nebiusai)

The final stop in our meetup series will be in San Francisco! 🌁 nebius.com/events/nebius-…

Join us at Convene 100 Stockton near Union Square on Thursday, March 13, for a deep dive into our AI cloud. Our developers, AI R&D engineers and architects will share insights with the tech

Alon Albalak (@albalakalon)

Happy to finally announce Big-MATH, the largest math reasoning dataset purposefully designed for large-scale RL!

We worked tirelessly, cleaning and filtering math datasets so that you don't have to!

Alon Albalak (@albalakalon)

🤯 Big-Math is the #3 most popular dataset on Hugging Face

If you're using it, I'd love to see the results of your work 🤩 Please share with us

nathan lile (@nathanthinks)

thrilled to see Big-MATH climbing to #3️⃣ on Hugging Face: a clear signal the community wants more high-quality, verifiable RL datasets.

grateful to everyone who’s been liking, downloading, and supporting ❤️

nathan lile (@nathanthinks)

📜 Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models: arxiv.org/abs/2502.17387
🤗 Hugging Face dataset: huggingface.co/datasets/Synth…

Rafael Rafailov @ NeurIPS (@rm_rafailov)

This is the dataset we curated for our own reasoning experiments. There is a lot of reasoning data coming out now, but we spend extra time on this to make sure all the problems are high-quality and suitable for RL training!

The AI Timeline (@theaitimeline)

Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models

Author's Explanation:
x.com/synth_labs/sta…

Overview:
Big-Math, a dataset of over 250,000 high-quality math questions with verifiable answers, is purposefully designed for

nathan lile (@nathanthinks)

Qwen+RL = dramatic, Aha!
Llama+RL = quick plateau

Same size. Same RL. Why?

Qwen naturally exhibits cognitive behaviors that Llama doesn't

Prime Llama with 4 synthetic reasoning patterns & it matched Qwen's self-improvement performance!

We can engineer this into any model! 👇

nathan lile (@nathanthinks)

models primed with INCORRECT solutions but with RIGHT BEHAVIORS achieve identical performance to those trained on correct solutions?

> optimize for behaviors & amplify with RL

nathan lile (@nathanthinks)

btw, random fun fact we pointed out months ago:

the only MATH example OpenAI published with o1 announcement included an unsubstantiated assumption 😬

Nebius (@nebiusai)

Read how SynthLabs, a startup developing AI solutions tailored for logical reasoning, is advancing AI post-training with our @TractoAI: nebius.com/customer-stori…

🔹 Goal:
Develop an ML system that empowers reasoning models to surpass pattern matching and implement sophisticated

nathan lile (@nathanthinks)

btw we have ongoing research on this front! we're open-science, pro-publication, and love collaboration.

want to push this frontier forward? we're growing our SF team & always open to research partners. reach out, my DMs are open 📩

nathan lile (@nathanthinks)

Generative Reward Models impact compounds daily.
way stronger interest now than when we published last fall 👇

many excellent recent extensions. cool seeing where researchers take GenRM