LMSYS Org (@lmsysorg)'s Twitter Profile
LMSYS Org

@lmsysorg

Large Model Systems Organization: We developed SGLang (sglang.ai), Chatbot Arena, and Vicuna! Please join our Slack channel at slack.sglang.ai

ID: 1822588444046249984

Link: https://lmsys.org/ · Joined: 11-08-2024 10:58:54

309 Tweets

5.5K Followers

138 Following

Kimi.ai (@kimi_moonshot)'s Twitter Profile Photo

🚀 Hello, Kimi K2! Open-Source Agentic Model!
🔹 1T total / 32B active MoE model
🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models
🔹 Strong in coding and agentic tasks
🐤 Multimodal & thought-mode not supported for now

With Kimi K2, advanced agentic intelligence

zhyncs (@zhyncs42)'s Twitter Profile Photo

Huge thanks to the MoonCake team for bringing day 0 support for Kimi K2 in SGLang and KTransformers, including the integration of the new reasoning parser!
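
A minimal sketch of what day-0 serving looks like with SGLang's offline engine, assuming the sgl.Engine API and its kwarg names (tp_size, trust_remote_code) match your SGLang version; the parser wiring mentioned above is configured through server flags and is omitted here.

import sglang as sgl

# Offline engine; kwargs mirror the sglang.launch_server flags. tp_size=8 is
# an illustrative single-node choice, not a tested setup for this model.
llm = sgl.Engine(
    model_path="moonshotai/Kimi-K2-Instruct",  # public HF checkpoint
    tp_size=8,
    trust_remote_code=True,
)
out = llm.generate(
    "Explain MoE routing in one sentence.",
    {"temperature": 0.6, "max_new_tokens": 64},
)
print(out["text"])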

LMSYS Org (@lmsysorg)'s Twitter Profile Photo

SGLang is currently the only open-source LLM serving engine validated with PD disaggregation + large-scale EP on an H200 cluster with over 100 GPUs. Huge thanks to the MoonCake team at Kimi.ai for helping verify it even before release!

LMSYS Org (@lmsysorg)'s Twitter Profile Photo

Kimi K2 (Kimi.ai) on SGLang:
As a trillion-parameter model, it struggles with long context on just 1–2 H200/B200 nodes. Use SGLang’s PD disaggregation plus large-scale EP, validated on 100+ H200s by the MoonCake team.
The new SOTA is here. Start using it!
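
From the client side, a PD-disaggregated deployment still looks like a single OpenAI-compatible endpoint. A hedged sketch follows: the launch flags in the comments (--disaggregation-mode, and a router fronting the two pools) are assumptions to verify against the SGLang PD documentation for your version.

# Server side (one command per node; flag names are assumptions):
#   prefill pool: python -m sglang.launch_server --model-path moonshotai/Kimi-K2-Instruct \
#                     --trust-remote-code --disaggregation-mode prefill ...
#   decode pool:  python -m sglang.launch_server --model-path moonshotai/Kimi-K2-Instruct \
#                     --trust-remote-code --disaggregation-mode decode ...
# A router fronts both pools and exposes one OpenAI-compatible URL.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct",
    messages=[{"role": "user", "content": "Summarize PD disaggregation in two sentences."}],
    max_tokens=128,
)
print(resp.choices[0].message.content)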

zhyncs (@zhyncs42)'s Twitter Profile Photo

Kimi.ai K2 shares the same architecture as DeepSeek R1, with the routed expert count increased from 256 to 384, so all SGLang optimizations work out of the box. PD disaggregation plus large-scale EP was validated on 100+ H200s by the MoonCake team before release. Always trust SGLang! 🚀
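
One hedged way to check the shared-architecture claim: both checkpoints ship DeepSeek-V3-style configs, so the routed-expert count should be the main visible difference. The repo IDs are the public Hugging Face names; the n_routed_experts field name is an assumption about the config schema.

from transformers import AutoConfig

# Compare the MoE expert counts straight from the published configs.
for repo in ("deepseek-ai/DeepSeek-R1", "moonshotai/Kimi-K2-Instruct"):
    cfg = AutoConfig.from_pretrained(repo, trust_remote_code=True)
    print(repo, "routed experts:", getattr(cfg, "n_routed_experts", "n/a"))
# Expected per the tweet: 256 for DeepSeek R1 vs 384 for Kimi K2.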

LMSYS Org (@lmsysorg)'s Twitter Profile Photo

🚀 Summer Fest Day 3: Cost-Effective MoE Inference on CPU from the Intel PyTorch team

Deploying 671B DeepSeek R1 with zero GPUs? SGLang now supports high-performance CPU-only inference on Intel Xeon 6, enabling billion-scale MoE models like DeepSeek to run on commodity CPU servers.
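
A minimal sketch of GPU-free serving, assuming the CPU backend is selected with a device="cpu" engine argument; the exact flag and the kernels available depend on how your SGLang build was compiled for Xeon, so treat this as an outline rather than a recipe. The full 671B model also needs a very large-memory host.

import sglang as sgl

# device="cpu" is assumed to select the Intel Xeon backend; verify the
# flag name and build requirements against the SGLang CPU docs.
llm = sgl.Engine(
    model_path="deepseek-ai/DeepSeek-R1",  # 671B MoE, public HF checkpoint
    device="cpu",
    trust_remote_code=True,
)
print(llm.generate("Hello from a GPU-free server.", {"max_new_tokens": 32})["text"])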