vLLM
@vllm_project
A high-throughput and memory-efficient inference and serving engine for LLMs. Join slack.vllm.ai to discuss together with the community!
ID: 1774187564276289536
https://github.com/vllm-project/vllm 30-03-2024 21:31:01
327 Tweets
12.12K Followers
15 Following
vLLM v0.8.3 now supports AI at Meta's latest Llama 4 Scout and Maverick models. We see these open-source models as a major step forward in efficiency, with long-context support, native multi-modality, and an MoE architecture. Best tips for running it 🧵 blog.vllm.ai/2025/04/05/lla…
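As a rough sketch of what running one of these models with vLLM's offline Python API might look like (the model ID, tensor-parallel degree, and context length below are illustrative assumptions, not values from the thread):

```python
from vllm import LLM, SamplingParams

# Hypothetical checkpoint name; verify the exact Llama 4 Scout model ID on Hugging Face.
llm = LLM(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    tensor_parallel_size=8,   # assumption: a MoE model of this size typically spans multiple GPUs
    max_model_len=32768,      # assumption: raise toward the long-context limit as memory allows
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Summarize the benefits of MoE architectures."], params)
print(outputs[0].outputs[0].text)
```

The linked blog post covers the recommended flags and hardware setups in detail.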