Manhin Poon (@entroshape_) 's Twitter Profile
Manhin Poon

@entroshape_

ML Theory & LLM (Agent & MLsys)

ID: 1578035067754278916

Link: https://github.com/EntroShape
Joined: 06-10-2022 14:51:12

100 Tweets

18 Followers

398 Following

Yiling Lou (@yiling__lou) 's Twitter Profile Photo

Thrilled to announce that I'll be joining UIUC CS Siebel School of Computing and Data Science as an Assistant Professor in Spring 2026! 📢 I’m looking for Fall '26 PhD students who are interested in the intersection of Software Engineering and AI, especially in LLM4Code and Code Agents. Please drop me an

ICLR 2025 (@iclr_conf) 's Twitter Profile Photo

We are aware of low-quality and LLM-generated reviews and are currently deliberating on appropriate courses of action. For now, authors who receive very poor quality or LLM-generated reviews should flag them to their ACs. We appreciate the community's efforts in reporting these!

Shiyi Cao (@shiyi_c98) 's Twitter Profile Photo

1/n
🚀 Introducing SkyRL-Agent, a framework for efficient RL agent training.

⚡ 1.55× faster async rollout dispatch
🛠 Lightweight tool + task integration
🔄 Backend-agnostic (SkyRL-train / VeRL / Tinker)
🏆 Used to train SA-SWE-32B, improving Qwen3-32B from 24.4% → 39.4%

MikaStars★ (@mikastars39) 's Twitter Profile Photo

ICLR prematurely closed the window for reviewers to edit their scores and comments. I have to ask: is this fair to authors who haven't finished their rebuttals yet?

Martin Ziqiao Ma (@ziqiao_ma) 's Twitter Profile Photo

NEPA: Next-Embedding Predictive Autoregression

A simple objective for visual SSL and generative pretraining. Instead of reconstructing pixels or predicting discrete tokens, we train an autoregressive model to predict the next embedding given all previous embeddings.

Key ideas:
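The objective described above can be sketched in a few lines. Everything below is an illustrative assumption, not from the paper: the shapes are made up, a running mean stands in for a real causal sequence model, the predictor is a single linear map, and MSE stands in for whatever loss NEPA actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a sequence of T embeddings z_1..z_T of dimension d,
# and a linear "predictor" W mapping a causal summary of z_1..z_t
# to a guess for z_{t+1}.
T, d = 8, 16
emb = rng.normal(size=(T, d))        # embeddings z_1 .. z_T
W = rng.normal(size=(d, d)) * 0.1    # predictor parameters

# Causal summary: running mean of all embeddings seen so far,
# so summary[t] depends only on z_1..z_{t+1} (no future leakage).
prefix_mean = np.cumsum(emb, axis=0) / np.arange(1, T + 1)[:, None]

# Predict z_{t+1} from the summary of z_1..z_t; score with MSE.
pred = prefix_mean[:-1] @ W          # predictions for z_2 .. z_T
target = emb[1:]
loss = np.mean((pred - target) ** 2)
print(round(float(loss), 4))
```

The point of the sketch is the shape of the objective: no pixel reconstruction and no discrete token vocabulary, just a regression from past embeddings to the next one.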

Line (@linexjlin) 's Twitter Profile Photo

ByteDance's paper raises a problem: the context window isn't enough.

The Seed-prover 1.5 paper reports that among the 32K–64K-token Lean proofs its prover generates, the majority contain errors, yielding a persistently negative score (+1 for a correct answer, -1 for a wrong one) and showing that problem-solving ability degrades under ultra-long CoT (see Fig. 4d).

By contrast, DeepSeek-Speciale handles this with ease. On complex programming tasks, on average each problem
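The "persistently negative score" follows directly from the ±1 scoring scheme the tweet describes: with accuracy p, the expected score is p·(+1) + (1−p)·(−1) = 2p − 1, which is negative whenever fewer than half the proofs are correct. A minimal check:

```python
# Expected score under the +1 / -1 scheme: E[score] = 2p - 1.
def expected_score(p_correct: float) -> float:
    return p_correct * 1.0 + (1.0 - p_correct) * -1.0

# With errors in the majority (say 60% wrong), the score stays negative:
print(round(expected_score(0.4), 10))
```

So "errors are the majority" and "persistently negative score" are two statements of the same fact, given this scoring rule.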

MiniMax (official) (@minimax__ai) 's Twitter Profile Photo

MiniMax M2.1 is OPEN SOURCE: SOTA for real-world dev & agents

• SOTA on coding benchmarks (SWE / VIBE / Multi-SWE)
• Beats Gemini 3 Pro & Claude Sonnet 4.5
• 10B active / 230B total (MoE)

Not just SOTA,
faster to infer, easier to deploy,
and yes, you can even run it locally

MikaStars★ (@mikastars39) 's Twitter Profile Photo

Stop using LoRA for RLVR!!!
New paper released👉Evaluating Parameter Efficient Methods for RLVR
📖Alphaxiv: alphaxiv.org/abs/2512.23165
💻Github: github.com/MikaStars39/Pe…

Is standard LoRA truly the optimal choice for Reinforcement Learning? We present the first large-scale