verl project (@verl_project)'s Twitter Profile

Open RL library for LLMs.
github.com/volcengine/verl
Join us on verl-project.slack.com

ID: 1884257238099058692

Joined: 28-01-2025 15:08:41

50 Tweets

489 Followers

11 Following

Bairu Hou (@hou_bairu)

1/ Long chain-of-thought (CoT) reasoning boosts LLM performance, but with a computational overhead. Check out our new paper, ThinkPrune, where we explore a simple question: To what extent can we cut the reasoning length while keeping the quality? We show that by simply adding a hard

Jinjie Ni @ ICLR'25 🇸🇬 (@nijinjie)

Remember the NoisyStudent topping ImageNet back in 2019🏆? Was it the last dance of noisy training? 🍻 Meet NoisyRollout, our new noisy training efforts in building stronger o1-like visual reasoners. ✨ With only 2.1k training data and zero additional training cost, it hits

Lang Feng (@langfengq)

Open-source "verl-agent" codebase is evolving fast⚡ A scalable, multi-turn reinforcement learning framework for training LLM/VLM-based agents — now with rich features! (see summary in image below🔽) 🚀 Try it out and train your own LLM agents 📎 GitHub: github.com/langfengQ/verl…

Lambda (@lambdaapi)

Distributed training on GPU clusters shouldn't be complex. Check out the latest blog on orchestrating reasoning agent training with RAGEN and verl project on our 1-Click Clusters, powered by dstack 🔗 lambda.ai/blog/agent-tra…

verl project (@verl_project)

DeepSeek 671b and Qwen3 236b support with Megatron backend is now available as preview in verl v0.4.0 🔥🔥🔥 We will continue optimizing MoE model performance down the road. DeepSeek 671b: verl.readthedocs.io/en/latest/perf… verl v0.4: github.com/volcengine/ver…

Casper Hansen (@casper_hansen_)

💥Async RL rollouts that are 75% faster than other async implementations
- removing all synchronous parts of the rollout
- a single step in multi-turn is independent and async of all other completions
- completions can finish independent of other completions
Zhaochen Su (@suzhaochen0110)

🚀 Thrilled to unveil ReVisual-R1! Our 7B open-source MLLM achieves long, accurate & thoughtful reasoning! 🔥 SOTA on 9 key benchmarks! Including AIME24 (53.3) & MathVision (48.8). Overall +16.8% avg! 📈 📄 Paper: arxiv.org/pdf/2506.04207 💻 Code: github.com/CSfufu/Revisua…

Infini-AI-Lab (@infiniailab)

🚀 Excited to introduce our latest work GRESO: a method that identifies and skips millions of uninformative prompts before rollout and achieves up to 2.0x wall-clock time speedup in training. More rollouts lead to better model performance, but they’re also a major bottleneck in

Chenxin An (@anchancy46881)

# 🚨 4B open-recipe model beats Claude-4-Opus 🔓 100% open data, recipe, model weights and code. Introducing Polaris✨--a post-training recipe for scaling RL on advanced reasoning models. 🥳 Check out how we boost open-recipe reasoning models to incredible performance levels

verl project (@verl_project)

If you're in Singapore on 7/11, do not miss this meetup! Talks from the verl community:
- LLMs to optimize code performance on real-world repos & verl project updates (Qian Liu)
- Long-horizon LLM agent training with verl-agent (Lang Feng)
Link: lu.ma/e498qhsi

elvis (@omarsar0)

MemAgent: MemAgent-14B is trained on 32K-length documents with an 8K context window. Achieves >76% accuracy even at 3.5M tokens! That consistency is crazy! Here are my notes:

verl project (@verl_project)

The 1st verl meetup will be held at ICML Vancouver on July 16th! Please join us if you will be there! lu.ma/0ek2nyao (onsite only) Featuring speakers from verl & SGLang dev team, plus Beidi Chen from Infini-AI-Lab and Yi Wu from Ant RL Lab #verl #ICML #Vancouver

Chujie Zheng (@chujiezheng)

Proud to introduce Group Sequence Policy Optimization (GSPO), our stable, efficient, and performant RL algorithm that powers the large-scale RL training of the latest Qwen3 models (Instruct, Coder, Thinking) 🚀 📄 huggingface.co/papers/2507.18…

Girish (@googrish)

To push the open source frontier for RL + LLMs, we need scalable, modular environments with real-world complexity, beyond math benchmarks. Today, we’re releasing *benchmax*. An open-source framework to build, run, & scale useful RL envs for LLM fine-tuning, with integrations to

Chujie Zheng (@chujiezheng)

GSPO has been integrated into verl project (github.com/volcengine/ver…) and TRL (github.com/huggingface/tr…). Thanks for the prompt support from the community 🚀

SkyPilot (@skypilot_org)

VeRL now officially supports launching via SkyPilot! Let SkyPilot deal with infra heavy lifting for verl project: 🚀Spin up VeRL workers on your k8s or clouds 🔧Set up Ray 🤖Ignite your agentic RL training Check out the VeRL doc: verl.readthedocs.io/en/latest/star…
