vLLM
@vllm_project
A high-throughput and memory-efficient inference and serving engine for LLMs. Join slack.vllm.ai to discuss with the community!
ID: 1774187564276289536
https://github.com/vllm-project/vllm 30-03-2024 21:31:01
327 Tweets
12.12K Followers
15 Following
vLLM🤝🤗! You can now deploy any Hugging Face language model with vLLM's speed. This integration makes it possible to use one consistent implementation of the model in HF for both training and inference. 🧵 blog.vllm.ai/2025/04/11/tra…