Dustin Tran (@dustinvtran) Twitter Tweets • TwiCopy

Dustin Tran

@dustinvtran

Research Scientist at Google DeepMind. I lead evaluation at Gemini / Bard.

ID: 1540897980

Link: http://dustintran.com

Joined: 23-06-2013 12:57:08

2.2K Tweets

42.42K Followers

666 Following

Michael Chang

@mmmbchang

2 years ago

Gemini and I also got a chance to watch the OpenAI live announcement of gpt4o, using Project Astra! Congrats to the OpenAI team, super impressive work!

Likes: 1.1K · Replies: 56 · Reposts: 244

Gemini 1.5 report is out. Lots of progress in pre- and post-training. Gemini 1.5 Pro dominates 1.0 Ultra, which was launched only 6 months ago. Even our speediest Gemini 1.5 Flash outperforms 1.0 Ultra on most text and vision tasks.

Likes: 30 · Replies: 2 · Reposts: 2

Melvin Johnson

@melvinjohnsonp

a year ago

Our latest version of Gemini 1.5 Pro in AI Studio is #1 on the LMSys leaderboard. 🚀 This is the result of various advances in post-training and we have more lined up. Congrats to the Gemini team.

Likes: 152 · Replies: 6 · Reposts: 17

Dustin Tran

@dustinvtran

a year ago

Gemini is #1 overall on both text and vision arena, and Gemini is #1 on a staggering total of 20 out of 22 leaderboard categories. It's been a journey attaining such a powerful post-trained model. Proud to have co-led the team!

Likes: 110 · Replies: 10 · Reposts: 8

Dustin Tran

@dustinvtran

a year ago

Welcome back to Google, Noam Shazeer! It'll be great working together again for the first time since 2018. Let's take Gemini, which is #1, and continue expanding the limits of its capabilities. techcrunch.com/2024/08/02/cha…

Likes: 89 · Replies: 2 · Reposts: 3

Dustin Tran

@dustinvtran

a year ago

Nice work on controlling style biases! In this view, many models are no longer inflated (e.g., response length, formatting). Gemini 1.5 Flash also outperforms gpt-4o-mini overall and across all categories except for coding.

Likes: 24 · Replies: 1 · Reposts: 2

lmarena.ai (formerly lmsys.org)

@lmarena_ai

a year ago

Massive News from Chatbot Arena🔥 Google DeepMind's latest Gemini (Exp 1114), tested with 6K+ community votes over the past week, now ranks joint #1 overall with an impressive 40+ score leap, matching 4o-latest and surpassing o1-preview! It also claims #1 on Vision

Likes: 1.1K · Replies: 59 · Reposts: 307

Dustin Tran

@dustinvtran

a year ago

The team says hi again

Likes: 124 · Replies: 8 · Reposts: 4

Dustin Tran

@dustinvtran

a year ago

gemini-exp-1206, out now. #1 everywhere. A 1 year anniversary for Gemini! aistudio.google.com/app/prompts/ne…

Likes: 90 · Replies: 3 · Reposts: 6

Dustin Tran

@dustinvtran

a year ago

Here is what Gemini can do on *Flash*. My favorite perk: Gemini 2.0 Flash Thinking has significant gains in core capabilities while also excelling in user preferences (co-#1 with gemini-exp-1206 on lmarena.ai). The best of both worlds.

Likes: 44 · Replies: 2 · Reposts: 0

Dustin Tran

@dustinvtran

9 months ago

2.5 Pro Exp is a model we're so proud of: #1 on LMArena, #1 on benchmarks like AIME, Aider, MMMU, and MRCR, & significant gains across coding, reasoning, multimodal, and so much more. Try it now! aistudio.google.com gemini.google.com

Likes: 27 · Replies: 0 · Reposts: 2

Dustin Tran

@dustinvtran

7 months ago

This is so good. Love meta-analyses. From a benchmark it's much harder to optimize the test set (implicitly or otherwise).

Likes: 24 · Replies: 1 · Reposts: 2

Dustin Tran

@dustinvtran

7 months ago

Our latest and greatest coding model! We've made some big strides for web app and visual development. And it continues dominating in user preference: #1 with a 37 Elo gap from #2.

Likes: 15 · Replies: 0 · Reposts: 2

Quoc Le

@quocleix

5 months ago

Following its IMO gold-level win, Google DeepMind is sharing Gemini Deep Think with mathematicians for feedback. Excited to see what they discover! 🧠 Plus, an updated Gemini 2.5 Deep Think is now rolling out for Google AI Ultra subscribers. Learn more: bit.ly/3IWcWq0

Likes: 281 · Replies: 13 · Reposts: 18