lmarena.ai (formerly lmsys.org) (@lmarena_ai)'s Twitter Profile
lmarena.ai (formerly lmsys.org)

@lmarena_ai

LMArena: Open Platform for Crowdsourced AI Benchmarking. Test out the new Beta: beta.lmarena.ai! Officially graduated from @lmsysorg!

ID: 1641378826537295874

Link: https://lmarena.ai/leaderboard
Joined: 30-03-2023 09:56:38

1.1K Tweets

77.77K Followers

203 Following

lmarena.ai (@lmarena_ai):

Breaking: Claude Opus 4 jumps to #1 in WebDev Arena!

A strong comeback from Anthropic - Opus 4 and Sonnet 4 now on top of the chart, surpassing previous Claude 3.7 and matching Gemini 2.5 Pro.

Massive congrats to Anthropic🔥

lmarena.ai (@lmarena_ai):

Image Editing just got real on LMArena 🖼️✨

Introducing Image Edit Arena: where AI editing models go head-to-head on your images. Upload, edit, vote. It's that simple.

Who edits it best? You decide🫵

Learn how it works in thread 🧵

lmarena.ai (@lmarena_ai):

You can also share your image generations from LMArena in our monthly AI Generation Contests in the Discord. This month's theme is "Cozy Desk" 🥰 discord.gg/NtYvnEmf55

Sundar Pichai (@sundarpichai):

Our latest Gemini 2.5 Pro update is now in preview.

It’s better at coding, reasoning, science + math, shows improved performance across key benchmarks (AIDER Polyglot, GPQA, HLE to name a few), and leads lmarena.ai with a 24pt Elo score jump since the previous version.

We also

Bloomberg Live (@bloomberglive):

“LMArena is an open community platform for testing and evaluating all the top AI models…you have the ability to not just interact with them, but to battle them against one another,” Anastasios Nikolas Angelopoulos, Co-Founder of lmarena.ai, tells Bloomberg’s Rachel Metz at #BloombergTech.

Bloomberg Live (@bloomberglive):

“People come because they like seeing multiple opinions, multiple AIs, and comparing them… They’re really excited by the opportunity to look at some new models from the top providers that are pre-released,” Anastasios Nikolas Angelopoulos, Co-Founder of lmarena.ai, at #BloombergTech.

Bloomberg Live (@bloomberglive):

“Many use cases for AI are subjective, human preference is very important to evaluate that… our goal is to take this preference data and use it to extract out all the different components of why people prefer something,” Anastasios Nikolas Angelopoulos, Co-Founder of lmarena.ai, at #BloombergTech.