Roy (@jasonlovelu)'s Twitter Profile
Roy

@jasonlovelu

Software engineer. @vllm_project Committer.

ID: 2622910696

Link: https://github.com/esmeetu
Joined: 19-06-2014 12:23:24

139 Tweets

70 Followers

292 Following

vLLM (@vllm_project)'s Twitter Profile Photo

🙏 DeepSeek's highly performant inference engine is built on top of vLLM. Now they are open-sourcing the engine the right way: instead of a separate repo, they are bringing changes to the open source community so everyone can immediately benefit! github.com/deepseek-ai/op…

vLLM (@vllm_project)'s Twitter Profile Photo

vLLM has just reached 50K github stars! Huge thanks to the community!🚀
Together let's bring easy, fast, and cheap LLM serving for everyone✌🏻
vLLM (@vllm_project)'s Twitter Profile Photo

Thanks for the great write-up! 🙌 Prefix caching is critical for agentic workflows like @ManusAI_HQ , and vLLM makes it seamless.

✅ prefix caching is enabled by default with an efficient implementation
✅ Append-only context? Cache hit heaven

Context engineering FTW 🚀
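The cache-hit behavior this tweet describes can be sketched with a toy block-level prefix cache. This is a simplified illustration of the idea, not vLLM's implementation: the block size, class names, and chained-hash scheme here are all assumptions made for the sketch.

```python
from hashlib import sha256

BLOCK = 4  # toy block size in tokens (illustrative, not vLLM's default)

def block_hashes(tokens):
    """Hash each full block of tokens, chaining in the previous block's
    hash so a block's identity depends on its entire prefix."""
    hashes, prev = [], b""
    for i in range(0, len(tokens) - len(tokens) % BLOCK, BLOCK):
        block = tokens[i:i + BLOCK]
        prev = sha256(prev + repr(block).encode()).digest()
        hashes.append(prev)
    return hashes

class PrefixCache:
    """Maps block hashes to (pretend) cached KV entries."""
    def __init__(self):
        self.store = {}

    def lookup(self, tokens):
        """Return the number of leading blocks already cached for this
        prompt, then insert the remaining blocks."""
        hashes = block_hashes(tokens)
        hit = 0
        for h in hashes:
            if h in self.store:
                hit += 1
            else:
                break  # first miss ends the reusable prefix
        for h in hashes[hit:]:
            self.store[h] = True  # stand-in for a cached KV block
        return hit

cache = PrefixCache()
system = list(range(8))                 # shared system prompt: 2 full blocks
turn1 = system + [100, 101, 102, 103]   # 12 tokens -> 3 full blocks
turn2 = turn1 + [200, 201, 202, 203]    # append-only: turn1 is a prefix

print(cache.lookup(turn1))  # 0 hits: cache is cold
print(cache.lookup(turn2))  # 3 hits: every full block of turn1 is reused
```

Because turn2 only appends to turn1, every full block hashes identically and is reused; editing a token earlier in the context would change every downstream chained hash and invalidate those blocks. That is why append-only context is "cache hit heaven."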
Ilir Aliu - eu/acc (@iliraliu_)'s Twitter Profile Photo

Why add sensors and complex systems when physics can do the job? This production line sorts products using only weight and controlled bursts of air.

✅ No cameras or vision models
✅ No expensive integration
✅ Just reliable, repeatable separation at scale

It’s a reminder that

Red Hat AI (@redhat_ai)'s Twitter Profile Photo

How do you build and contribute to vLLM? In our latest vLLM Office Hours, we walked through the full contributor workflow, from building vLLM across different hardware targets to submitting your first PR.

Video: youtube.com/live/RjTuvQwob…
Slides: docs.google.com/presentation/d…

Yuchen Jin (@yuchenj_uw)'s Twitter Profile Photo

PewDiePie in 2025:

– built a 10×4090 rig
– runs Llama 70B, gpt-oss-120B & Qwen 245B locally via vLLM
– built a custom web UI (chat, RAG, search, TTS)
– ran protein-folding simulations for charity
– created an AI “council”, a swarm of 64 models
– now fine-tuning his own model
Ai2 (@allen_ai)'s Twitter Profile Photo

Introducing OlmoEarth 🌍, state-of-the-art AI foundation models paired with ready-to-use open infrastructure to turn Earth data into clear, up-to-date insights within hours—not years.

Roy (@jasonlovelu)'s Twitter Profile Photo

My personal take:
Some teams push the limits with massive scaling.
Others focus on optimization under tight constraints.

Both are needed.
The trick is to stay honest about which game you’re actually playing.
Fei-Fei Li (@drfeifei)'s Twitter Profile Photo

AI’s next frontier is Spatial Intelligence, a technology that will turn seeing into reasoning, perception into action, and imagination into creation. But what is it? Why does it matter? How do we build it? And how can we use it? Today, I want to share with you my thoughts on

Roy (@jasonlovelu)'s Twitter Profile Photo

Humans are like CPUs, AI is like a GPU.
The next bottlenecks aren’t compute, but bandwidth and verification:

how fast we can talk to AI, and how well we can check its answers for correctness and safety.
vLLM (@vllm_project)'s Twitter Profile Photo

Thanks to GitHub for spotlighting vLLM in the Octoverse 2025 report — one of the fastest-growing open-source AI projects this year.

🏆 Top OSS by contributors
🚀 Fastest-growing by contributors
🌱 Attracting the most first-time contributors

Trusted by leading open model
vLLM (@vllm_project)'s Twitter Profile Photo

🚀 No More Train–Inference Mismatch! We demonstrate bitwise-consistent on-policy RL with TorchTitan (training) + vLLM (inference) — the first open-source run where training and inference numerics match exactly.

It only takes 3 steps:
1️⃣ Make vLLM batch-invariant (same seq →
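Why batch invariance matters can be illustrated with a toy reduction in plain Python. This is only an analogy for the underlying numerics problem (floating-point addition is not associative, so the reduction order implied by batching changes results); it is not vLLM's or TorchTitan's actual kernel work.

```python
def batched_sum(xs, batch):
    """Sum xs in chunks of `batch`, then sum the partial results.
    The reduction tree depends on the batch size."""
    partials = [sum(xs[i:i + batch]) for i in range(0, len(xs), batch)]
    return sum(partials)

# The same numbers, reduced under two batchings, disagree:
xs = [1.0, 0.0, 1e16, -1e16]
print(batched_sum(xs, 1))  # 0.0 — the 1.0 is absorbed when added to 1e16
print(batched_sum(xs, 2))  # 1.0 — [1.0, 0.0] and [1e16, -1e16] reduce separately

def invariant_sum(xs, batch):
    """A batch-invariant reduction: always reduce in one canonical
    left-to-right order, ignoring how the inputs were batched."""
    total = 0.0
    for x in xs:
        total += x
    return total

print(invariant_sum(xs, 1) == invariant_sum(xs, 2))  # True by construction
```

The fix is the same in spirit: pin one reduction order regardless of batch composition, so a sequence produces bit-identical logits whether it is served alone or alongside other requests.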

Roy (@jasonlovelu)'s Twitter Profile Photo

In complex systems, scale doesn’t just add more; it rewires everything.

At some threshold, the structure that was stable at small scale collapses and a new one takes over.

That jump is what we casually call “from quantity to quality.”
vLLM (@vllm_project)'s Twitter Profile Photo

🎉vLLM v0.11.2 is out! This release focuses on things the community cares about most — smoother scaling, more predictable performance, and wider model support.

1456 commits from 449 contributors (184 new!) made this possible. 🧡

Here are a few improvements you'll feel in real