vLLM (@vllm_project)'s Twitter Profile
vLLM

@vllm_project

A high-throughput and memory-efficient inference and serving engine for LLMs. Join slack.vllm.ai to discuss with the community!

ID: 1774187564276289536

Link: https://github.com/vllm-project/vllm · Joined: 30-03-2024 21:31:01

327 Tweets

12.12K Followers

15 Following

vLLM (@vllm_project):

🚀 The RL community keeps pushing boundaries — from better on-policy data and partial rollouts to in-flight weight updates that mix KV caches across models during inference. Continuing inference while weights change and KV states stay stale sounds wild — but that’s exactly what
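The tweet is cut off here, but the mechanism it points at (decoding continues while new weights are swapped in, so KV-cache entries written under the old weights go stale) is easy to sketch. Below is a minimal toy illustration in pure PyTorch, assuming a single-layer decoder; `TinyDecoder` and every name in it are hypothetical, chosen for illustration, and none of this is vLLM's actual implementation or API:

```python
import torch
import torch.nn as nn

# Hypothetical single-layer decoder with an explicit KV cache, used only to
# illustrate in-flight weight updates over stale KV states (not vLLM code).
class TinyDecoder(nn.Module):
    def __init__(self, d=32, vocab=100):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)
        self.q = nn.Linear(d, d)
        self.k = nn.Linear(d, d)
        self.v = nn.Linear(d, d)
        self.out = nn.Linear(d, vocab)

    def step(self, token, kv_cache):
        """Decode one token, appending this step's K/V to the cache."""
        x = self.emb(token)                        # (1, d)
        ks = torch.cat([kv_cache[0], self.k(x)])   # older cache entries may have
        vs = torch.cat([kv_cache[1], self.v(x)])   # been computed under old weights
        attn = torch.softmax(self.q(x) @ ks.T / ks.shape[-1] ** 0.5, dim=-1)
        return self.out(attn @ vs), (ks, vs)

model = TinyDecoder()
kv = (torch.zeros(0, 32), torch.zeros(0, 32))  # empty KV cache (d=32)
tok = torch.tensor([1])

with torch.no_grad():
    for step in range(8):
        logits, kv = model.step(tok, kv)
        tok = logits.argmax(-1)
        if step == 3:
            # In-flight update: fresh weights arrive mid-generation (here a
            # random perturbation stands in for an RL trainer's update), while
            # the KV entries from steps 0-3 are kept stale in the cache.
            for p in model.parameters():
                p.add_(0.01 * torch.randn_like(p))
```

The point of the sketch: after step 3 the query projection no longer matches the cached keys and values, yet decoding proceeds anyway, which is exactly the "weights change while KV states stay stale" situation the tweet describes.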