Yichao Fu (@fuyichao123)'s Twitter Profile
Yichao Fu

@fuyichao123

CSE PhD student @UCSD

ID: 1815623304964673536

Link: https://viol2000.github.io/ · Joined: 23-07-2024 05:41:57

115 Tweets

73 Followers

375 Following

Hao AI Lab (@haoailab)'s Twitter Profile Photo

(1/n) 🚀 With FastVideo, you can now generate a 5-second video in 5 seconds on a single H200 GPU! Introducing the FastWan series, a family of fast video generation models trained via a new recipe we term “sparse distillation”, speeding up video denoising by 70X! 🖥️ Live

Jiayi Weng (@trinkle23897)'s Twitter Profile Photo

Harmony format is finally open-sourced. I still remember, 3 years ago (before the ChatGPT release), Shengjia Zhao, Daniel, and I were brainstorming about the right abstraction for RL training, and that was the starting point of the entire harmony library. github.com/openai/harmony

Hao AI Lab (@haoailab)'s Twitter Profile Photo

[Lmgame Bench] 🔥 OpenAI has just released two open‑weight reasoning models: gpt‑oss‑120B (~117B) and gpt‑oss‑20B (~21B). They are the first OpenAI models with open weights since GPT‑2.

We tested both in Lmgame Bench, across 4 interactive games:
🧱 Sokoban | 🟦 Tetris | 🔢
Yichao Fu (@fuyichao123)'s Twitter Profile Photo

Excited to share my 1st project as a Research Scientist Intern at Meta FAIR! Grateful to my mentor Jiawei Zhao for guidance, and to Yuandong Tian & Xuewei for their valuable advice and collaboration. Our work DeepConf explores local confidence for more accurate & efficient LLM reasoning!

Hao AI Lab (@haoailab)'s Twitter Profile Photo

[1/5] [Lmgame Bench] 🎮

Question: Can RL-based LLM post-training on games generalize to other tasks?

We shared a preliminary study to explore this question:
- Same-family (in-domain): Training on 6×6 Sokoban → 8×8 and Tetris (1 block type) → Tetris (2 block types) transfers,
Zhi Su (@zhisu22)'s Twitter Profile Photo

🏓🤖 Our humanoid robot can now rally over 100 consecutive shots against a human in real table tennis — fully autonomous, sub-second reaction, human-like strikes.

AK (@_akhaliq)'s Twitter Profile Photo

Microsoft presents rStar2-Agent

Agentic Reasoning Technical Report

rStar2-Agent boosts a pre-trained 14B model to state of the art in only 510 RL steps within one week, achieving average pass@1 scores of 80.6% on AIME24 and 69.8% on AIME25, surpassing DeepSeek-R1 (671B) with
Rohan Paul (@rohanpaul_ai)'s Twitter Profile Photo

Another great Google DeepMind paper.

Shows how to speed up LLM agents while cutting cost and keeping answers unchanged.

30% lower total cost and 60% less wasted cost at comparable acceleration.

Agents plan step by step, so each call waits for the previous one, which drags
Yichao Fu (@fuyichao123)'s Twitter Profile Photo

🚀 Excited that two of my papers got accepted at #NeurIPS2025! Even more thrilled that it’s happening right here in San Diego. Can’t wait to see everyone there! 😎

elvis (@omarsar0)'s Twitter Profile Photo

Very cool work from Meta Superintelligence Lab.

They are open-sourcing Meta Agents Research Environments (ARE), the platform they use to create and scale agent environments.

Great resource to stress-test agents in environments closer to real apps.

Read on for more:
Hao AI Lab (@haoailab)'s Twitter Profile Photo

[1/N] 🚀 New decoding paradigm drop! 🚀
Introducing Lookahead Reasoning (LR): step-level speculation that stacks with Speculative Decoding (SD). It has been accepted to #NeurIPS2025 🎉
📖 Blog: hao-ai-lab.github.io/blogs/lookahea…
💻 Code: github.com/hao-ai-lab/Loo…
📄 Paper: arxiv.org/abs/2506.19830
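
The step-level idea reads like speculative decoding lifted from tokens to whole reasoning steps. Below is a toy sketch of that mechanic (my own illustration, not the Lookahead Reasoning code; `draft_steps`, `target_step`, and the failure pattern are made up): a cheap drafter proposes k steps, one verification round checks them against the target, and the longest correct prefix is accepted.

```python
def draft_steps(steps, k):
    # Cheap drafter: proposes the next k reasoning steps, but (in this toy)
    # gets every third step wrong.
    out = []
    for i in range(k):
        n = len(steps) + i
        out.append(f"step-{n}" if n % 3 else f"guess-{n}")
    return out

def target_step(steps):
    # Expensive target model: always produces the correct next step.
    return f"step-{len(steps)}"

def lookahead_decode(n_steps, k=4):
    steps, rounds = [], 0
    while len(steps) < n_steps:
        guesses = draft_steps(steps, k)
        # One verification round checks the k guesses against the target
        # (in parallel in a real system): accept the longest matching
        # prefix, then take the target's own step at the first mismatch.
        for g in guesses:
            if len(steps) >= n_steps:
                break
            truth = target_step(steps)
            if g == truth:
                steps.append(g)
            else:
                steps.append(truth)
                break
        rounds += 1
    return steps, rounds

steps, rounds = lookahead_decode(9, k=4)
print(rounds)  # 4 verification rounds instead of 9 sequential target calls
```

Because every accepted step is verified against the target, the output matches plain sequential decoding; speculation only changes how many rounds it takes.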

Hao AI Lab (@haoailab)'s Twitter Profile Photo

Tunix × GRL: One-Line Multi-Turn RL on JAX+TPU 📷
We’re collaborating closely with Google’s Tunix team on JAX-native LLM post-training on TPU. Using Tunix’s lightweight RL framework, we shipped a first multi-turn RL training example in GRL. It runs in one line.
GRL:

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

Finally had a chance to listen through this pod with Sutton, which was interesting and amusing. As background, Sutton's "The Bitter Lesson" has become a bit of a biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea

Rohan Paul (@rohanpaul_ai)'s Twitter Profile Photo

New paper from Meta Superintelligence Labs (FAIR)

Explains why grokking happens and shows when learning moves from memorizing to generalizing. 

Gives a concrete recipe to trigger grokking, with weight decay, moderate width, and a data threshold near size times log size.
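
One reading of the "data threshold near size times log size" summary, as a back-of-envelope sketch (my own illustration; the exact quantity called "size" and the log base are assumptions, not taken from the paper): for a task with p underlying items to learn, the threshold scales roughly like p·ln(p) training examples.

```python
import math

def grokking_data_threshold(p: int) -> int:
    # Hypothetical p*ln(p) threshold from the "size times log size" summary;
    # illustrative only, not the paper's exact formula.
    return round(p * math.log(p))

for p in (97, 113, 997):
    print(p, grokking_data_threshold(p))
```

The point of the scaling is that the required data grows only slightly faster than linearly in the task size.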
swan (@siqizhu666)'s Twitter Profile Photo

🚀 Glad to share our new paper: GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare We model user–LLM interaction as a strategic game, aligning models for mutual benefit. 👉 arxiv.org/abs/2510.08872 #AI #LLM #GameTheory #RL

🚀 Glad to share our new paper:
GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare
We model user–LLM interaction as a strategic game, aligning models for mutual benefit.
👉 arxiv.org/abs/2510.08872

#AI #LLM #GameTheory #RL
Hao AI Lab (@haoailab)'s Twitter Profile Photo

[Lmgame Bench] ♠️♥️ Can LLMs bluff, fold, and bet like real poker players—with no strategic help?
From Oct 28–30 (Tue–Thu, 10 AM–4 PM PT), we’re hosting a 6-model live multi-agent Texas Hold’em tournament on Twitch 🎥
🕹️ twitch.tv/lmgamebench

Each model starts with 300
Hao AI Lab (@haoailab)'s Twitter Profile Photo

♠️♥️ The cards are on the table.
Day 1 of our 3-day Texas Hold’em LLM tournament is live! 😍

🤖 6 models. 300 chips each. No strategy prompts, only pure reasoning.

🎥 Watch now → twitch.tv/lmgamebench
#AI #TexasHoldem #LmgameBench
Hao Zhang (@haozhangml)'s Twitter Profile Photo

Excited to partner with SGLang: FastVideo + SGLang = the future open ecosystem for diffusion. 🥳🫡 ----------- A few extra cents: Since I started as faculty at UCSD, our lab has been investing in diffusion for video and text, in both algorithms and systems. - Text-side, we

vLLM (@vllm_project)'s Twitter Profile Photo

🚀 No More Train–Inference Mismatch! We demonstrate bitwise-consistent on-policy RL with TorchTitan (training) + vLLM (inference) — the first open-source run where training and inference numerics match exactly. It only takes 3 steps: 1️⃣ Make vLLM batch-invariant (same seq →
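
"Batch-invariant" is the load-bearing phrase here: floating-point addition is not associative, so the same dot product reduced in a different order (as happens when a kernel retiles for a different batch size) can come out bitwise-different. A toy illustration in plain Python (not vLLM code):

```python
def reduce_sequential(xs):
    # Left-to-right accumulation, as a batch-size-1 kernel might do.
    total = 0.0
    for x in xs:
        total += x
    return total

def reduce_pairwise(xs):
    # Tree reduction, the shape a batched/tiled kernel might use.
    if len(xs) == 1:
        return xs[0]
    mid = len(xs) // 2
    return reduce_pairwise(xs[:mid]) + reduce_pairwise(xs[mid:])

xs = [0.1, 0.2, 0.3]
print(reduce_sequential(xs))  # 0.6000000000000001
print(reduce_pairwise(xs))    # 0.6
```

Making inference batch-invariant means pinning the reduction order so both paths produce identical bits regardless of batching, which is what lets training and inference numerics match exactly.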

Hao AI Lab (@haoailab)'s Twitter Profile Photo

Our lab has three posters this year as #NeurIPS25 comes to our home turf in San Diego 😎 if you’re around, we’d love for you to swing by, say hi, and chat about LLM systems, efficient reasoning, and video diffusion 💙💛
