yv | AS11414 | N6YVB
@yvbbrjdr
exists as 451; RT ≠ Endorsement; Like = Thanks/Acknowledgement ≠ Agreement; Creator of @LANDropApp, @AthenaAGI; Senior System Software Engineer @NVIDIA
ID: 1603444903
https://yvb.moe/ 18-07-2013 13:15:19
3.3K Tweets
1.1K Followers
365 Following
🚀 Excited to collaborate with NVIDIA and SemiAnalysis on pushing inference performance to the next level! On the Blackwell GB200 NVL72, SGLang achieved 26K input / 13K output tokens per GPU/sec. On the SemiAnalysis InferenceMAX benchmark, SGLang is the default engine for
Exciting updates on DGX Spark: Now you can run gpt-oss-20b at 70 tokens/s with SGLang! This is 1.4x faster than what we got in our blog last week. We worked with the NVIDIA AI Developer team to fix a bunch of Triton and quantization issues. Cannot wait to see how much performance we