Ziniu Li @ ICLR2025 (@ziniuli)'s Twitter Profile
Ziniu Li @ ICLR2025

@ziniuli

Ph.D. student @ CUHK, Shenzhen.

Working toward the science of RL and LLMs.

ID: 900213437929750528

Website: http://www.liziniu.org · Joined: 23-08-2017 04:29:43

145 Tweets

432 Followers

477 Following

Zhengyang Tang (@zhengyang_42):

We’re excited to share our new paper “CoRT: Code-integrated Reasoning within Thinking”!

🤖 A post-training framework that teaches Large Reasoning Models (LRMs) to better leverage Code Interpreters for enhanced mathematical reasoning.

🔍 Key Highlights:

Strategic hint
Zhengyang Tang (@zhengyang_42):

🚀 Thrilled to announce that our paper "SCRIT: Self-Evolving LLM Critique without Human or Stronger Models" was accepted to #COLM2025! We enable LLMs to self-improve their critique abilities — zero human annotations, zero stronger models needed! 🔄✨ Looking forward to meeting

Ge Zhang (@gezhang86038849):

Amazing work by Rui-Jie (Ridger) Zhu, with more resources to investigate the mechanism behind hybrid linear attention.
Paper: arxiv.org/pdf/2507.06457
Hugging Face checkpoint: huggingface.co/collections/m-…

Xueyao Zhang (@xueyao_98):

🚀 Our #ACL2025 work INTP unveils how Preference Alignment improves diverse TTS models (AR, Flow-matching, and Masked Generative Model)!

Unlock the secrets:
➡️ Customized Post-Training.
➡️ Human-Guided Unlearning.
➡️ Heterogeneous preference pairs to avoid reward hacking.
Jiashuo Liu (@liujiashuo77):

We built FutureX, the world’s first live benchmark for real future prediction — politics, economy, culture, sports, etc. 
Among 23 AI agents, #Grok4 ranked #1 🏆
Elon didn’t lie.
Elon Musk, your model sees further 🚀🍀

Leaderboard: futurex-ai.github.io
Ge Zhang (@gezhang86038849):

Is text-only information enough for LLM/VLM Web Agents? 🤔 Clearly not. 🙅‍♂️ The modern web is a rich tapestry of text, images 🖼️, and videos 🎥. To truly assist us, agents need to understand it all. That's why we built MM-BrowseComp. 🌐

We're introducing MM-BrowseComp 🚀, a new
Yizhi Li (@yizhilll):

[1/n] Introducing TreePO🌲, a new RL framework for LLMs! It slashes sampling costs while boosting reasoning capabilities. Daily Paper: huggingface.co/papers/2508.17…

Ziniu Li @ ICLR2025 (@ziniuli):

Thanks to AK for reposting our work! It's exciting to see how the knapsack formulation reshapes the view of system efficiency to unlock exploration and scale RL.

Ziniu Li @ ICLR2025 (@ziniuli):

Thanks to Yiyou Sun for adding our work on Knapsack RL to this excellent collection. I strongly believe that focusing on the hard-tier problems—where traditional RLVR pipelines fail—is crucial for advancing our understanding and methodologies. This living repository is a vital

Ziniu Li @ ICLR2025 (@ziniuli):

Excited to share our research on scaling looped language models! We explore how next-generation foundation models can scale latent reasoning and more efficiently leverage parameters for knowledge manipulation!

zeng zhiyuan (@zhiyuan_nlper):

🚀 Thrilled to share our new work, "RLoop: A Self-Improving Framework for Reinforcement Learning"! arxiv.org/pdf/2511.04285

🚀 Thrilled to share our new work, "RLoop: A Self-Improving Framework for Reinforcement Learning"!
arxiv.org/pdf/2511.04285