vLLM (@vllm_project)'s Twitter Profile
vLLM

@vllm_project

A high-throughput and memory-efficient inference and serving engine for LLMs. Join slack.vllm.ai to discuss with the community!

ID: 1774187564276289536

Link: https://github.com/vllm-project/vllm · Joined: 30-03-2024 21:31:01

327 Tweets

12.12K Followers

15 Following

vLLM (@vllm_project):

Speculative decoding is one of the best tools in vLLM's suite of inference optimizations, accelerating inference without accuracy loss. Check out our blog post for more details on the state of spec decode in vLLM today! 🧵 blog.vllm.ai/2024/10/17/spe…
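As a rough illustration of what the tweet describes, below is a minimal sketch of enabling draft-model speculative decoding with vLLM's offline `LLM` API. The `speculative_model` and `num_speculative_tokens` parameters follow the vLLM documentation from around the time of this post and may differ in newer releases; the model names are placeholders.

```python
# Sketch: draft-model speculative decoding with vLLM's offline API.
# Parameter names follow docs contemporary with this post; model
# choices are illustrative, not a recommendation.
from vllm import LLM, SamplingParams

llm = LLM(
    model="facebook/opt-6.7b",              # target model whose outputs are kept
    speculative_model="facebook/opt-125m",  # small draft model that proposes tokens
    num_speculative_tokens=5,               # draft tokens proposed per verification step
)

prompts = ["The future of AI is"]
outputs = llm.generate(prompts, SamplingParams(temperature=0.8, max_tokens=64))
for output in outputs:
    print(output.outputs[0].text)
```

The "no accuracy loss" claim comes from how speculative decoding works: the target model verifies every token the draft model proposes and resamples on rejection, so the output distribution matches ordinary decoding by the target model alone.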