Christopher Chou (@chrischou03)'s Twitter Profile
Christopher Chou

@chrischou03

building chatbot arena @stanford @ucberkeley

ID: 1305230512445820928

Joined: 13-09-2020 19:43:04

32 Tweets

123 Followers

40 Following

lmarena.ai (formerly lmsys.org) (@lmarena_ai)

We are thrilled to announce the milestone release of SGLang Runtime v0.2, featuring significant inference optimizations after months of hard work. It achieves up to 2.1x higher throughput compared to TRT-LLM and up to 3.8x higher throughput compared to vLLM. It consistently

Lianmin Zheng (@lm_zheng)

Grok-2 is here, a new frontier-level model from @xAI! I still remember the good old days when I was a GPU-poor grad student, playing with the Vicuna model and building the Chatbot Arena leaderboard with just a few GPUs. But now, my job at xAI is developing systems for the 100K

lmarena.ai (formerly lmsys.org) (@lmarena_ai)

Chatbot Arena update❤️‍🔥 Exciting news—@xAI's Grok-2 and Grok-mini are now officially on the leaderboard! With over 6000 community votes, Grok-2 has claimed the #2 spot, surpassing GPT-4o (May) and tying with the latest Gemini! Grok-2-mini also impresses at #5. Grok-2 excels in

Christopher Chou (@chrischou03)

Fantastic work by Lisa. I often feel like I can predict a model's identity based on their output and she formalizes these vibes!

Christopher Chou (@chrischou03)

We have released our initial leaderboard for text-to-image models! Check it out and let us know what you think. One thing interesting to see is the ranking shifts based on whether the prompts are preset or not.

vLLM (@vllm_project)

🚀 With the v0.7.0 release today, we are excited to announce the alpha release of vLLM V1: A major architectural upgrade with 1.7x speedup! Clean code, optimized execution loop, zero-overhead prefix caching, enhanced multimodal support, and more.

Ion Stoica (@istoica05)

The progress in AI is down to three basic resources: (1) people (experts), (2) data, and (3) infrastructure. Arguably, at this point the US is only ahead in (3). Also, at this point the Chinese open source models are ahead. Not only DeepSeek but also Qwen. This is a fact.

Blockchain at Berkeley (@calblockchain)

💰🏆B@B members just swiped 30k in Prizes in one weekend, check out all the projects below: 🌟 Projects: Command Flare by Ravi Riley Rohan Vardhan Shorewala Oleg Viatkin - RAG Knowledge - 1st place AND Consensus Learning - 3rd place Voice To Flare by @mason_arditi, Romain ,

Christopher Chou (@chrischou03)

Excited to finally share what we've been working on for the past year! Marin is a platform for developing foundation models from data curation to large-scale model training to evaluation. It's been a privilege to work on this with so many amazing people!

Christopher Chou (@chrischou03)

LFG Chatbot Arena! AI Evaluation is an incredibly hard thing to get right especially balancing between user preference and model capability. I’m confident in the team to execute on this mission. Excited for the future!

Nick Jiang @ ICLR (@nickhjiang)

Vision transformers have high-norm outliers that hurt performance and distort attention. While prior work removed them by retraining with “register” tokens, we find the mechanism behind outliers and make registers at ✨test-time✨—giving clean features and better performance! 🧵

Lisa Dunlap (@lisabdunlap)

At #CVPR2025 ? Come see my talk on building evals which embrace the fuzziness of generative models at the EVAL-FoMo workshop today! This talk had everything - from Chatbot Arena to model vibes to designing UI's :P Details: June 11th, 4:30pm, room 210

David Hall (@dlwh)

So about a month ago, Percy posted a version of this plot of our Marin 32B pretraining run. We got a lot of feedback, both public and private, that the spikes were bad. (This is a thread about how we fixed the spikes. Bear with me. )

Google AI Developers (@googleaidevs)

The Center for Research on Foundation Models' Marin project has released the first fully open model in JAX. It’s an 'open lab' sharing the entire research process, including code, data, and logs, to enable reproducibility and further innovation. developers.googleblog.com/en/stanfords-m…