Sida (Star) Li (@starli27496427)'s Twitter Profile
Sida (Star) Li

@starli27496427

PhD @DSI_UChicago building @ProphetArena | LLM evaluations, prediction-powered inference & intersection between statistics x AI | Prev: @Berkeley_EECS.

ID: 1536650645117075456

Link: http://listar2000.github.io

Joined: 14-06-2022 10:04:09

12 Tweets

16 Followers

49 Following

Rohan Paul (@rohanpaul_ai)

Reasoning models often use excessively long thought processes, causing inefficient inference. This paper introduces ShorterBetter, a reinforcement learning method guiding models to find their optimal reasoning length autonomously. It samples multiple outputs, identifies the

Prophet Arena (@prophetarena)

🔮 Introducing Prophet Arena - the AI benchmark for general predictive intelligence. That is, can AI truly predict the future by connecting today's dots? 👉 What makes it special? - It can't be hacked. Most benchmarks saturate over time, but here models face live, unseen

Chenghao Yang (@chrome1996)

Where is exploration most impactful in LLM reasoning? The initial tokens! They shape a sequence's entire semantic direction, making early exploration crucial. Our new work, Exploratory Annealed Decoding (EAD), is built on this insight. By starting with high temperature and

rLLM (@rllm_project)

🚀 Introducing rLLM v0.2 - train arbitrary agentic programs with RL, with minimal code changes. Most RL training systems adopt the agent-environment abstraction. But what about complex workflows? Think solver-critique pairs collaborating, or planner agents orchestrating multiple

Sida (Star) Li (@starli27496427)

I'm not an AI infra person, but somehow just got my async LoRA fix merged into Verl 😅 Spent a few days untangling async RL logic (from knowing nothing) -- no perfect understanding, but found (and fixed!) a sneaky bug. Proof that you can just do things...

Sida (Star) Li (@starli27496427)

Unfortunately missing #NeurIPS2025 and the SD sunshine 😭 But our first author Justin will be presenting "ShorterBetter" -- chat with him about efficient LLM reasoning! (And yes, this brilliant friend is applying for PhD this cycle!) arxiv.org/pdf/2504.21370

Cooperative AI Foundation (@coop_ai)

Don't miss our last seminar of the year: 'The Interplay of Economic Thinking and Language Models: Vignettes and Lessons', live 18th of December (5pm GMT, 9am PT, 12pm ET) led by Haifeng Xu (The University of Chicago). Link below.

Sida (Star) Li (@starli27496427)

Been working on rLLM for the past few months 😀! This new version (and more to come) is definitely one step closer to democratizing agentic RL training -- any agent you can write down, rLLM will help you train it.

Sida (Star) Li (@starli27496427)

Huge congrats on making Tinker fully public! 🚀 With rLLM, integrating your (multi-)agent workflows with Tinker's infrastructure is now super easy~ Docs + example here 👇 rllm-project.readthedocs.io/en/latest/exam…

Sida (Star) Li (@starli27496427)

During the past 4 months since the debut of Prophet Arena, our amazing team has:
1. Added 1000+ forecasting events to the platform and supported more SOTA models.
2. Curated the "agent benchmark" where the competing agent performs end-to-end forecasts.
More to come soon!

Sida (Star) Li (@starli27496427)

How to enjoy the best of two worlds: alignment from the aligned model and the diversity in the base model? Check out this simple but elegant "base-align"-collaboration work by Yichen (Zach) Wang and Chenghao Yang et al. 👇

Prophet Arena (@prophetarena)

Happy New Year! Here are some AI Forecasts for 2026 🔮
Most likely World Cup winner: 🇪🇸 Spain
Spotify #1 artist: Taylor Swift (Qwen 3 235B says 100%)
75% - GTA 6 releases before end of 2026 (Grok-4)
65% - One Battle After Another wins Best Picture (Claude Sonnet 4)
55% - U.S.

Sida (Star) Li (@starli27496427)

🚀 Huge congrats to Manan Roongta, Sijun Tan, and the Snorkel AI team on building this impressive Financial Analysis agent! Another strong example of how rLLM powers RL training across diverse reasoning tasks - from finance to beyond. Stay tuned for new rLLM features!

Yinjie Wang (@yinjiew2024)

Train your 🦞OpenClaw🦞 simply by talking to it. Meet OpenClaw-RL. Host your model on our RL server, and your LLM gets optimized automatically. Use it anywhere. Keep it private. Make it more personal every day. We have fully open sourced everything. Come in and have fun!