verl project (@verl_project) Twitter Tweets • TwiCopy

Bairu Hou

8 months ago

1/ Long chain-of-thought (CoT) reasoning boosts LLM performance, but with a computational overhead. Check out our new paper, ThinkPrune, where we explore a simple question: To what extent can we cut the reasoning length while keeping the quality? We show that by simply adding a hard

Likes: 119 | Replies: 4 | Reposts: 19

Jinjie Ni @ ICLR'25 🇸🇬

@nijinjie

8 months ago

Remember NoisyStudent topping ImageNet back in 2019🏆? Was it the last dance of noisy training? 🍻 Meet NoisyRollout, our new noisy-training effort toward building stronger o1-like visual reasoners. ✨ With only 2.1k training data and zero additional training cost, it hits

Likes: 76 | Replies: 3 | Reposts: 17

Lang Feng

@langfengq

6 months ago

Open-source "verl-agent" codebase is evolving fast⚡ A scalable, multi-turn reinforcement learning framework for training LLM/VLM-based agents — now with rich features! (see summary in image below🔽) 🚀 Try it out and train your own LLM agents 📎 GitHub: github.com/langfengQ/verl…

Likes: 19 | Replies: 1 | Reposts: 1

Lambda

@lambdaapi

6 months ago

Distributed training on GPU clusters shouldn't be complex. Check out the latest blog on orchestrating reasoning agent training with RAGEN and verl project on our 1-Click Clusters, powered by dstack 🔗 lambda.ai/blog/agent-tra…

Likes: 17 | Replies: 4 | Reposts: 8

verl project

@verl_project

6 months ago

DeepSeek 671b and Qwen3 236b support with Megatron backend is now available as preview in verl v0.4.0 🔥🔥🔥 We will continue optimizing MoE model performance down the road. DeepSeek 671b: verl.readthedocs.io/en/latest/perf… verl v0.4: github.com/volcengine/ver…

Likes: 105 | Replies: 0 | Reposts: 12

Casper Hansen

@casper_hansen_

6 months ago

💥 Async RL rollouts that are 75% faster than other async implementations:
- removing all synchronous parts of the rollout
- a single step in multi-turn is independent and async of all other completions
- completions can finish independently of other completions

Likes: 18 | Replies: 2 | Reposts: 3

Zhaochen Su

@suzhaochen0110

6 months ago

🚀 Thrilled to unveil ReVisual-R1! Our 7B open-source MLLM achieves long, accurate & thoughtful reasoning! 🔥 SOTA on 9 key benchmarks! Including AIME24 (53.3) & MathVision (48.8). Overall +16.8% avg! 📈 📄 Paper: arxiv.org/pdf/2506.04207 💻 Code: github.com/CSfufu/Revisua…

Likes: 135 | Replies: 4 | Reposts: 28

Infini-AI-Lab

@infiniailab

6 months ago

🚀 Excited to introduce our latest work GRESO: a method that identifies and skips millions of uninformative prompts before rollout and achieves up to 2.0x wall-clock time speedup in training. More rollouts lead to better model performance, but they’re also a major bottleneck in

Likes: 163 | Replies: 1 | Reposts: 31

Chenxin An

@anchancy46881

6 months ago

🚨 4B open-recipe model beats Claude-4-Opus 🔓 100% open data, recipe, model weights and code. Introducing Polaris✨, a post-training recipe for scaling RL on advanced reasoning models. 🥳 Check out how we boost open-recipe reasoning models to incredible performance levels

Likes: 441 | Replies: 23 | Reposts: 80

verl project

@verl_project

5 months ago

If you're in Singapore on 7/11, do not miss this meetup! Talks from the verl community:
- LLMs to optimize code performance on real-world repos & verl project updates (Qian Liu)
- Long-horizon LLM agent training with verl-agent (Lang Feng)
Link: lu.ma/e498qhsi

Likes: 15 | Replies: 0 | Reposts: 4

elvis

@omarsar0

5 months ago

MemAgent

MemAgent-14B is trained on 32K-length documents with an 8K context window. Achieves >76% accuracy even at 3.5M tokens! That consistency is crazy! Here are my notes:

Likes: 560 | Replies: 10 | Reposts: 115

verl project

@verl_project

5 months ago

The 1st verl meetup will be held at ICML Vancouver on July 16th! Please join us if you will be there! lu.ma/0ek2nyao (onsite only) Featuring speakers from verl & SGLang dev team, plus Beidi Chen from Infini-AI-Lab and Yi Wu from Ant RL Lab #verl #ICML #Vancouver

Likes: 16 | Replies: 0 | Reposts: 2

Chujie Zheng

@chujiezheng

5 months ago

Proud to introduce Group Sequence Policy Optimization (GSPO), our stable, efficient, and performant RL algorithm that powers the large-scale RL training of the latest Qwen3 models (Instruct, Coder, Thinking) 🚀 📄 huggingface.co/papers/2507.18…

Likes: 1.1K | Replies: 18 | Reposts: 143

Girish

@googrish

5 months ago

To push the open source frontier for RL + LLMs, we need scalable, modular environments with real-world complexity, beyond math benchmarks. Today, we’re releasing *benchmax*. An open-source framework to build, run, & scale useful RL envs for LLM fine-tuning, with integrations to

Likes: 78 | Replies: 4 | Reposts: 25

Chujie Zheng

@chujiezheng

4 months ago

GSPO has been integrated into verl project (github.com/volcengine/ver…) and TRL (github.com/huggingface/tr…). Thanks for the prompt support from the community 🚀

Likes: 204 | Replies: 3 | Reposts: 24

SkyPilot

@skypilot_org

4 months ago

VeRL now officially supports launching via SkyPilot! Let SkyPilot deal with the infra heavy lifting for verl project:
🚀 Spin up VeRL workers on your k8s or clouds
🔧 Set up Ray
🤖 Ignite your agentic RL training
Check out the VeRL doc: verl.readthedocs.io/en/latest/star…

Likes: 15 | Replies: 1 | Reposts: 5

verl project

@verl_project

4 months ago

Nice OpenAI gym-like environment interface, compatible with verl

Likes: 13 | Replies: 0 | Reposts: 3