Dustin Tran (@dustinvtran)'s Twitter Profile
Dustin Tran

@dustinvtran

Research Scientist at Google DeepMind. I lead evaluation at Gemini / Bard.

ID: 1540897980

Link: http://dustintran.com
Joined: 23-06-2013 12:57:08

2.2K Tweets

42.42K Followers

666 Following

Michael Chang (@mmmbchang)

Gemini and I also got a chance to watch the OpenAI live announcement of gpt4o, using Project Astra! Congrats to the OpenAI team, super impressive work!

Dustin Tran (@dustinvtran)

The Gemini 1.5 report is out. Lots of progress in pre- and post-training. Gemini 1.5 Pro dominates 1.0 Ultra, which was launched only 6 months ago. Even our speediest Gemini 1.5 Flash outperforms 1.0 Ultra on most text and vision tasks.

Melvin Johnson (@melvinjohnsonp)

Our latest version of Gemini 1.5 Pro in AI Studio is #1 on the LMSys leaderboard. 🚀 This is the result of various advances in post-training and we have more lined up. Congrats to the Gemini team.

Dustin Tran (@dustinvtran)

Gemini is #1 overall on both the text and vision arenas, and Gemini is #1 on a staggering total of 20 out of 22 leaderboard categories. It's been a journey attaining such a powerful post-trained model. Proud to have co-led the team!

Dustin Tran (@dustinvtran)

Welcome back Noam Shazeer to Google! It'll be great to work together again for the first time since 2018. Let's take Gemini, which is #1, and continue expanding the limits of its capabilities. techcrunch.com/2024/08/02/cha…

Dustin Tran (@dustinvtran)

Nice work on controlling style biases! In this view, many models' scores are no longer inflated by style (e.g., response length, formatting). Gemini 1.5 Flash also outperforms gpt-4o-mini overall and across all categories except for coding.

lmarena.ai (formerly lmsys.org) (@lmarena_ai)

Massive News from Chatbot Arena🔥 Google DeepMind's latest Gemini (Exp 1114), tested with 6K+ community votes over the past week, now ranks joint #1 overall with an impressive 40+ score leap, matching 4o-latest and surpassing o1-preview! It also claims #1 on Vision

Dustin Tran (@dustinvtran)

Here is what Gemini can do on *Flash*. My favorite perk: Gemini 2.0 Flash Thinking has significant gains in core capabilities while also excelling in user preferences (co-#1 with gemini-exp-1206 on lmarena.ai). The best of both worlds.

Dustin Tran (@dustinvtran)

2.5 Pro Exp is a model we're so proud of: #1 on LMArena, #1 on benchmarks like AIME, Aider, MMMU, and MRCR, & significant gains across coding, reasoning, multimodal, and so much more. Try it now! aistudio.google.com gemini.google.com

Dustin Tran (@dustinvtran)

This is so good. Love meta-analyses. From a benchmark it's much harder to optimize the test set (implicitly or otherwise).

Dustin Tran (@dustinvtran)

Our latest and greatest coding model! We've made some big strides for web app and visual development. And it continues dominating in user preference: #1 with a 37 Elo gap from #2.

Quoc Le (@quocleix)

Following its IMO gold-level win, Google DeepMind is sharing Gemini Deep Think with mathematicians for feedback. Excited to see what they discover! 🧠 Plus, an updated Gemini 2.5 Deep Think is now rolling out for Google AI Ultra subscribers. Learn more: bit.ly/3IWcWq0