Enze Xie (@xieenze_jr)'s Twitter Profile
Enze Xie

@xieenze_jr

Sr. Research Scientist at NVIDIA, doing GenAI, CS PhD from HKU MMLab, interned at NVIDIA.

ID: 1723702194380427264

Website: https://xieenze.github.io/

Joined: 12-11-2023 14:00:10

49 Tweets

769 Followers

116 Following

🚀 Fast-dLLM: 27.6× Faster Diffusion LLMs with KV Cache & Parallel Decoding 💥  

Key Features🌟  
- Block-Wise KV Cache  
  Reuses 90%+ attention activations via bidirectional caching (prefix/suffix), enabling 8.1×–27.6× throughput gains with <2% accuracy loss 🔄  
-