Sergio Sánchez (@s3rgiosanchez) Twitter Tweets • TwiCopy

Anthropic

3 months ago

New Anthropic research: Estimating AI productivity gains from Claude conversations. The Anthropic Economic Index tells us where Claude is used, and for which tasks. But it doesn’t tell us how useful Claude is. How much time does it save?

Sergio Sánchez

@s3rgiosanchez

3 months ago

I have no doubt that an exciting 2026 awaits us when it comes to AI. Highly recommended reading.

Runway

@runwayml

2 months ago

Introducing our new frontier video model, Runway Gen-4.5. Previously known as Whisper Thunder (aka) David. Gen-4.5 is state-of-the-art and sets a new standard for video generation motion quality, prompt adherence and visual fidelity. Learn more below.

Poetiq

@poetiq_ai

2 months ago

Poetiq has officially shattered the ARC-AGI-2 SOTA 🚀

ARC Prize has officially verified our results:
- 54% Accuracy – first to break the 50% barrier!
- $30.57 / problem – less than half the cost of the previous best!

We are now #1 on the leaderboard for ARC-AGI-2!

Mckay Wrigley

@mckaywrigley

2 months ago

Here are my Opus 4.5 thoughts after ~2 weeks of use. First some general thoughts, then some practical stuff.

--- THE BIG PICTURE ---

THE UNLOCK FOR AGENTS

It's clear to anyone who's used Opus 4.5 that AI progress isn't slowing down. I'm surprised more people aren't treating

Simón Muñoz

@simonvlc

2 months ago

"The choice is no longer whether to adopt AI. That train has already left. The choice is whether to be among the 5% who are building the future or the 95% who are trying to understand it." estrategiadeproducto.com/p/el-futuro-de…

OpenAI Newsroom

@openainewsroom

2 months ago

OpenAI is co-founding the Agentic AI Foundation (AAIF) under the Linux Foundation alongside Anthropic and Block to support open, interoperable standards for agentic AI. We’re also donating AGENTS.md to help establish open standards that enable safe, reliable agents across

OpenAI

@openai

2 months ago

GPT-5.2 is now rolling out to everyone. openai.com/index/introduc…

Matt Shumer

@mattshumer_

2 months ago

I've had access to GPT-5.2 since November 25th. Since then, I've used it as my daily-driver, pushing it to its limits. It beats out Opus 4.5 in most things I tried, but there's a (big) catch. Here's my review of GPT-5.2: shumer.dev/gpt52review

ARC Prize

@arcprize

2 months ago

A year ago, we verified a preview of an unreleased version of OpenAI o3 (High) that scored 88% on ARC-AGI-1 at est. $4.5k/task

Today, we’ve verified a new GPT-5.2 Pro (X-High) SOTA score of 90.5% at $11.64/task

This represents a ~390X efficiency improvement in one year
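
The efficiency figure above can be sanity-checked with a quick calculation. This is only a rough sketch: it assumes "$4.5k" means exactly 4500 dollars (the tweet calls it an estimate), it compares cost per task only, and the variable names are illustrative rather than anything ARC Prize publishes.

```python
# Cost per task quoted in the tweet (USD). 4500.0 is an assumption:
# the source gives only an estimated "$4.5k".
o3_preview_cost_per_task = 4500.0
gpt_5_2_pro_cost_per_task = 11.64

# Ratio of the two costs, ignoring the difference in scores (88% vs 90.5%).
cost_ratio = o3_preview_cost_per_task / gpt_5_2_pro_cost_per_task

print(f"Cost per task dropped by a factor of about {cost_ratio:.1f}")
```

Under these assumptions the ratio comes out a little below 390, which is in line with the rounded "~390X" in the tweet.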

ARC Prize

@arcprize

2 months ago

ARC-AGI-3 (2026) will drive AI capability and efficiency even further

Designed to measure the ability of AI to efficiently learn and generalize in novel environments, it will be a first-of-its-kind Interactive Reasoning Benchmark

Stay tuned

Javi López ⛩️

@javilop

2 months ago

🔥 OPENAI IS BACK! Sam Altman has just dropped the GPT-5.2 Thinking results and it is an absolute monster. Altman calls it a VERY smart model, and the benchmarks back that up → a huge leap over GPT-5.1, and it smokes Claude Opus 4.5 and Gemini 3 Pro in tests

Sergio Sánchez

@s3rgiosanchez

2 months ago

"Reasoning systems now show genuine fluid intelligence on simple tasks"

Sergio Sánchez

@s3rgiosanchez

2 months ago

We will be seeing more and more of these use cases as the adoption of vibe coding with the new models takes hold in development teams. A complete migration in 3 days, incredible.

Tibo

@thsottiaux

2 months ago

GPT-5.2-Codex is out, further advancing our SoTA for professional software engineering and long-running agentic coding work. It improves on instruction following, long-context understanding, and pushes the frontier including on cyber.

$ codex -m gpt-5.2-codex

OpenAI Developers

@openaidevs

2 months ago

🆕 Codex now officially supports skills

Skills are reusable bundles of instructions, scripts, and resources that help Codex complete specific tasks. You can call a skill directly with $.skill-name, or let Codex choose the right one based on your prompt.

Sergio Sánchez

@s3rgiosanchez

2 months ago

"shipping velocity matters more than perfection" Call it AI slop or whatever you like...

OpenAI Developers

@openaidevs

20 days ago

📣 How we built the Codex agent loop

Ever wonder what Codex does between your prompt and its response? Each turn assembles inputs, runs inference, executes tools, and feeds the results back into context until the loop ends. openai.com/index/unrollin…
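
The turn structure described in that tweet (assemble inputs, run inference, execute tools, feed results back until the loop ends) can be illustrated with a toy loop. This is a minimal sketch and not Codex's actual implementation: `ModelReply`, `run_agent_loop`, `fake_infer`, and the `add` tool are all hypothetical names invented for this example, and the "model" is a hard-coded stub.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ModelReply:
    """Hypothetical model output: some text plus optional tool requests."""
    text: str
    tool_calls: list[tuple[str, str]] = field(default_factory=list)  # (tool name, argument)


def run_agent_loop(
    prompt: str,
    infer: Callable[[list[str]], ModelReply],
    tools: dict[str, Callable[[str], str]],
    max_turns: int = 8,
) -> tuple[str, list[str]]:
    """Run turns until the model stops asking for tools (or max_turns is hit)."""
    context = [f"user: {prompt}"]              # assemble the initial inputs
    for _ in range(max_turns):
        reply = infer(context)                 # run inference on the current context
        context.append(f"assistant: {reply.text}")
        if not reply.tool_calls:               # no tool requests: the loop ends
            return reply.text, context
        for name, arg in reply.tool_calls:     # execute each requested tool
            result = tools[name](arg)
            context.append(f"tool[{name}]: {result}")  # feed the result back into context
    raise RuntimeError("agent loop did not finish within max_turns")


def fake_infer(context: list[str]) -> ModelReply:
    """Stub standing in for a real model: ask for one tool call, then answer."""
    if not any(line.startswith("tool[") for line in context):
        return ModelReply("I need to add the numbers.", [("add", "2 3")])
    last_result = context[-1].split(": ", 1)[1]
    return ModelReply(f"The answer is {last_result}.")


tools = {"add": lambda arg: str(sum(int(x) for x in arg.split()))}
answer, transcript = run_agent_loop("What is 2 + 3?", fake_infer, tools)
print(answer)
for line in transcript:
    print(line)
```

The `max_turns` cap is a design choice of this sketch, added so a misbehaving model cannot loop forever. The tweet does not say how the real loop terminates beyond "until the loop ends".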

OpenAI

@openai

7 days ago

GPT-5.3-Codex is now available in Codex. You can just build things. openai.com/index/introduc…

Vaibhav (VB) Srivastav

@reach_vb

7 days ago

BOOOOM! Introducing GPT-5.3-Codex: our most capable agentic coding model yet 🔥

> Frontier coding + terminal skills with fewer tokens
> Built for long-running tasks (research → tool use → execution)
> Interactive mid-turn steering + frequent progress updates
> Stronger default
