Chase Blagden (@chaseblagden) Twitter Tweets • TwiCopy

Anikait Singh

a year ago

Scaling LLMs with more data is hitting its limits. To address more complex tasks, we need innovative approaches. Shifting from teaching models what to answer to how to solve problems, leveraging test-time compute and meta-RL, could be the solution. Check out Rafael's 🧵 below!

Likes: 32 · Replies: 0 · Retweets: 5

SynthLabs

@synth_labs

a year ago

Ever watched someone solve a hard math problem? Their first attempt is rarely perfect. They sketch ideas, cross things out, and try new angles. This process of exploration is key to human reasoning and our latest research formalizes this as Meta Chain-of-Thought (1/8) 🧵👇

Likes: 225 · Replies: 7 · Retweets: 42

Chase Blagden

@chaseblagden

10 months ago

Deepseek has escaped containment

Likes: 15 · Replies: 0 · Retweets: 1

Rafael Rafailov @ NeurIPS

@rm_rafailov

10 months ago

Meta-RL- learning to think

Likes: 36 · Replies: 2 · Retweets: 3

Aran Komatsuzaki

@arankomatsuzaki

10 months ago

SynthLabs presents: Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models

Likes: 161 · Replies: 3 · Retweets: 29

Charlie Snell

@sea_snell

10 months ago

Pokémon Go to the Claude

Likes: 20 · Replies: 1 · Retweets: 1

Tanishq Mathew Abraham, Ph.D.

@iscienceluvr

10 months ago

Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models "In this work, we present Big-Math, a dataset of over 250,000 high-quality math questions with verifiable answers, purposefully made for reinforcement learning (RL). To create

Likes: 339 · Replies: 4 · Retweets: 79

SynthLabs

@synth_labs

10 months ago

Releasing Big-MATH—the first heavily curated & verifiable dataset designed specifically for large-scale RL training & LLM reasoning! 📝 250,000+ problems, 47k NEW Q's ✅ 10x larger than existing datasets like MATH 🧑‍⚖️ Verifiable—we eliminated 400k+ problems Details below! 🧵👇

Likes: 142 · Replies: 3 · Retweets: 16

SynthLabs

@synth_labs

10 months ago

Start exploring Big-MATH today! 📄 Paper: arxiv.org/abs/2502.17387 💻 Code: github.com/SynthLabsAI/bi… 📂 Dataset: huggingface.co/datasets/Synth…

Likes: 6 · Replies: 0 · Retweets: 2

Alon Albalak

@albalakalon

10 months ago

Happy to finally announce Big-MATH, the largest math reasoning dataset purposefully designed for large-scale RL! We worked tirelessly, cleaning and filtering math datasets so that you don't have to!

Likes: 126 · Replies: 5 · Retweets: 18

Nebius

@nebiusai

8 months ago

Read how SynthLabs, a startup developing AI solutions tailored for logical reasoning, is advancing AI post-training with our @TractoAI: nebius.com/customer-stori… 🔹 Goal: Develop an ML system that empowers reasoning models to surpass pattern matching and implement sophisticated

Likes: 59 · Replies: 2 · Retweets: 14

Asher Trockman

@ashertrockman

8 months ago

Are you a frontier lab investing untold sums in training? Are you trying to stay competitive? Are you finding that your competitors' models are ... thinking a bit too much like yours? Then antidistillation.com might be for you! Sam Altman Elon Musk

Likes: 139 · Replies: 5 · Retweets: 29

Benjamin Spiegel

@superspeeg

8 months ago

Why did only humans invent graphical systems like writing? 🧠✍️ In our new paper at CogSci Society, we explore how agents learn to communicate using a model of pictographic signification similar to human proto-writing. 🧵👇

Likes: 1.1K · Replies: 22 · Retweets: 180

Bahareh Tolooshams

@btolooshams

7 months ago

We have released VARS-fUSI: Variable sampling for fast and efficient functional ultrasound imaging (fUSI) using neural operators. The first deep learning fUSI method to allow for different sampling durations and rates during training and inference. biorxiv.org/content/10.110… 1/

Likes: 49 · Replies: 1 · Retweets: 15

Chase Blagden

@chaseblagden

7 months ago

>What do you do?
>RL Agents for <X>

Likes: 2 · Replies: 0 · Retweets: 0

Devin

@_chotzen

7 months ago

Our first long-horizon agentic software engineering model is here! We've shipped a model that matches Claude on Cascade in a lot of ways. However the most exciting thing about this release is the trajectory we're on. So much left to do... we're hiring!

Likes: 33 · Replies: 0 · Retweets: 4

nathan lile

@nathanthinks

7 months ago

excellent work by Jason Weston & team—extending our "Generative Reward Models" work with RL (GRPO) to optimize LLM reasoning during judgment

scalable (synthetic) evaluation continues to be AI's key bottleneck!

Likes: 95 · Replies: 1 · Retweets: 12

nathan lile

@nathanthinks

7 months ago

btw we have ongoing research on this front! we're open-science, pro-publication, and love collaboration. want to push this frontier forward? we're growing our SF team & always open to research partners—reach out, my DMs are open 📩

Likes: 55 · Replies: 16 · Retweets: 7