Ligeng Zhu (@ligengzhu)'s Twitter Profile
Ligeng Zhu

@ligengzhu

Research Scientist at @Nvidia building VLMs, previously @MIT, @SFU, and @ZJU_China.

ID: 3389355552

Website: http://lzhu.me

Joined: 30-08-2015 05:37:18

827 Tweets

1.1K Followers

1.1K Following

Enze Xie (@xieenze_jr)

🚀 Fast-dLLM: 27.6× Faster Diffusion LLMs with KV Cache & Parallel Decoding 💥

Key Features 🌟
- Block-Wise KV Cache
  Reuses 90%+ attention activations via bidirectional caching (prefix/suffix), enabling 8.1×–27.6× throughput gains with <2% accuracy loss 🔄
-
Infini-AI-Lab (@infiniailab)

🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46%. 🌐 Website: multiverse4fm.github.io 🧵 1/n

Muyang Li (@lmxyy1999)

🚀 #Nunchaku now supports FLUX.1-Kontext-dev!
Edit images with just one sentence (style transfer, face swap, and more), now 2–3× faster and using 1/4 VRAM.
✅ Works with ComfyUI & Diffusers
🔗 Demo: svdquant.mit.edu/kontext/
📂 Code: github.com/mit-han-lab/nu…
🤗 4-bit #SVDQuant
LMSYS Org (@lmsysorg)

🚀 Summer Fest Day 4: Turbocharging Vision-Language Models with SGLang + NVILA

4.4× throughput, 2.2× faster response time!
We've integrated NVILA into SGLang, enabling high-performance, scalable serving of vision-language models. This unlocks a 4.4× TPS boost and significantly
Ligeng Zhu (@ligengzhu)

Empowered by SGLang, NVILA serving now has 4.4× throughput and 2.2× faster response 🚀🚀🚀 Awesome work by Zijian Zhang, with a lot of help from the SGLang team!

Bolei Zhou (@zhoubolei)

NeurIPS Conference This is great! But will you also consider setting up an official satellite location in China, given that so many great NeurIPS papers come from China and so many Chinese researchers couldn't attend the conference due to US/Canada visa issues?

LMSYS Org (@lmsysorg)

🚀 Summer Fest Day 5: Multiple Token Prediction in SGLang by @Eigen_AI_ and SGLang Team
1.6× throughput, same quality - open-source & production-ready!

We’ve integrated MTP into SGLang, unlocking up to 60% higher output throughput for models like DeepSeek V3, with zero quality
Yi Wu (@jxwuyi)

Tired of intricate system code for RL training? 🤯
We release AReaL-lite – a lightweight AReaL version for AI researchers! 🚀 #opensource
✨ Algorithm-first design & APIs 🎉
✨ 80% less code with 90% of AReaL's full efficiency 🎉
✨ Customizable agentic RL 🎉
🔗 github.com/inclusionAI/AR…
Eigen AI (@eigen_ai_labs)

🚀 Founded by four dedicated MIT graduates, Eigen AI is the world's first company focusing on AEI – Artificial Efficient Intelligence, making AI accessible for all.

Today OpenAI dropped GPT-OSS. We teamed up with our partners SGLang, LMSYS Org, and @NVIDIA to deliver open-source
Ryan Hanrui Wang (@hanrui_w)

Announcing Eigen AI, the world’s first company dedicated to AEI (Artificial Efficient Intelligence). 🚀 The future of AI is already here; it’s simply not evenly distributed. Our mission is to close that gap by driving radical efficiency so that every person and