Zhanghao Wu (@michaelvll1) 's Twitter Profile
Zhanghao Wu

@michaelvll1

Building SkyPilot @skypilot_org | Co-creator of Vicuna @lmsysorg, PhD @Berkeley_EECS @ucbrise. Prev: @MIT, @sjtu1896

ID: 1236209629576785920

07-03-2020 08:39:10

179 Tweets

987 Followers

371 Following

Zhanghao Wu (@michaelvll1) 's Twitter Profile Photo

AI engs like decade-old Slurm for GPU scheduling, while infra teams push k8s for flexibility. Abridge solved this with SkyPilot: 🧑‍🔬 AI engs focus on AI for healthcare with 10x dev speed up 👷 AI infra teams scale and manage GPU capacity across hyperscalers & neoclouds

Sumanth (@sumanth_077) 's Twitter Profile Photo

Run, manage, and scale AI workloads on any infrastructure! SkyPilot is an open source framework that provides a unified, cloud-agnostic control plane for launching, monitoring, and optimizing AI workloads. 100% Open Source.

SkyPilot (@skypilot_org) 's Twitter Profile Photo

The GPU neocloud ecosystem is heating up! Congratulations to @Nebius on their impressive milestone, with Microsoft purchasing $17B worth of GPUs. With the arrival of excellent GPU providers, AI infra needs to become vendor agnostic. @Nebius is one of many great partners

SkyPilot (@skypilot_org) 's Twitter Profile Photo

Congrats to Together AI on releasing Instant Clusters! After creating a cluster instantly, how do you utilize the GPUs quickly for AI? SkyPilot is glad to partner with Together AI to offer day-0 integration, enabling AI teams to onboard new GPU clusters from the

SkyPilot (@skypilot_org) 's Twitter Profile Photo

Underdiscussed topic: GPUs aren’t the bottleneck. For fast distributed GenAI post-training, network & storage matter as much. By tuning the network and storage on the cloud, Henry Zhu found 10x speedup on Nebius cloud, even with the same GPUs and code. maknee.github.io/blog/2025/Netw…

Ljubomir Buturovic (@ljbuturovic) 's Twitter Profile Photo

For individual AI researchers, I highly recommend Nebius cloud computing service because: 1. It is well supported by SkyPilot, a great cloud management framework 2. You can rent a single H100 - crucial for development in relatively-low-resource settings

Zhanghao Wu (@michaelvll1) 's Twitter Profile Photo

Really enjoy working with the Nebius team! Both their eng and partnership teams move really fast and are very productive, making Nebius one of the top neocloud choices in the SkyPilot community.

Nebius (@nebiusai) 's Twitter Profile Photo

Shopify diversifies its GPU strategy with Nebius. With SkyPilot, their team runs large multi-node jobs on reliable instances and extends workflows across clouds without changing what already works. Testimonial 👇 #GPUcloud #AIInfrastructure

Zhanghao Wu (@michaelvll1) 's Twitter Profile Photo

Another great joining of forces for running large-scale LLM training! Torchtitan powers the LLM finetuning, and SkyPilot launches and scales it on any of your AI infra, including k8s and clouds!

AI at Meta (@aiatmeta) 's Twitter Profile Photo

SkyPilot PyTorch More ways to scale TorchTitan with SkyPilot 🔥 great to see the community expanding the ecosystem with new tutorials like this one!

Zhanghao Wu (@michaelvll1) 's Twitter Profile Photo

Build agents with RL using VeRL, then run and scale the training on any infra with SkyPilot! It is as simple as sky launch verl.yaml, and the training starts on any Kubernetes cluster or cloud you have. A series of blogs about agent training is coming.

SkyPilot (@skypilot_org) 's Twitter Profile Photo

The best ChatGPT that $100 can buy — now on your own k8s/cloud! Run Andrej Karpathy's nanochat (pretrain → finetune → eval → serve) with SkyPilot 🚀 💰 Complete training for ~$100 ⚡ Deploy web UI for ~$2–3/hr 🌐 Run on any cloud or k8s with SkyPilot github.com/skypilot-org/s…

Zhanghao Wu (@michaelvll1) 's Twitter Profile Photo

Today’s AWS outage is a reminder that even the biggest cloud providers aren’t immune to failure. That’s one of the reasons why we’ve bet on a multi-cloud future. With SkyPilot, easily/automatically switch to another cloud/region with one command -- keep your workloads

SkyPilot (@skypilot_org) 's Twitter Profile Photo

AWS outage is a reminder: single-region is fragile. Modern AI teams demand region & cloud agnostic tooling.🧐 We compared AWS Batch vs SkyPilot for AI. See how SkyPilot offers global GPU scheduling/recovery -- with faster iteration and 11x cost savings: blog.skypilot.co/aws-batch-vs-s…

NovaSky (@novaskyai) 's Twitter Profile Photo

☁️SkyRL now runs seamlessly with SkyPilot! Let SkyPilot handle GPU provisioning and cluster setup, so you can focus on RL training with SkyRL. 🎯 Launch distributed RL jobs effortlessly ⚙️ Auto-provision GPUs across clouds 🤖 Train your LLM agents at scale Get started

Zhanghao Wu (@michaelvll1) 's Twitter Profile Photo

Thanks to everyone who dropped by our SkyPilot booth! Really enjoyed the convos, and good to learn that we are resolving an urgent pain companies have: running and managing AI on any GPU cluster, whether it is reserved, on-prem, or from clouds. See you all at #RaySummit ;)