Martin Szummer (@mszummer)'s Twitter Profile
Martin Szummer

@mszummer

Researcher, Entrepreneur, Lover of the North

ID: 582971800

Joined: 17-05-2012 17:25:32

2 Tweets

77 Followers

131 Following

iGent AI (@igent_ai):

"Agency > Intelligence": Andrej Karpathy nailed it, and after 18 months building Maestro, we agree. The real AI leap isn’t just smarts, it’s agency: the ability to act independently, turning assistants into partners.

iGent AI (@igent_ai):

Our VibeCodeBench evaluations affirm what Paul Jankura just announced: Claude Sonnet 4 excels at autonomous multi-feature development. We've seen codebase navigation errors drop from 20% to near zero and strategic refactoring that saves ~500k tokens on multi-stage, complex tasks.

iGent AI (@igent_ai):

Paul Jankura reports Claude 4 models are 65% less likely to use shortcuts on agentic tasks. Our evaluations confirm this—Claude Sonnet 4 consistently understates feature completeness rather than overstate success. This translates to more reliable AI assistance through Maestro.

iGent AI (@igent_ai):

We've integrated Claude Sonnet 4 into Maestro, and the results are transformative. As our evaluations show, it maintains higher code quality even as project complexity grows. Combined with its new extended thinking capabilities, Maestro delivers an unmatched AI engineering

Martin Szummer (@mszummer):

Our agentic software engineering system, Maestro, can build large, complex software: it just finished building a Redis database from first principles in Rust, improving on its safety and performance!

Martin Szummer (@mszummer):

Anthropic just made *the* LLM release we have been waiting for: two massive-context Claude Sonnet models, handling up to 1M input tokens. These are the models that we used with our Maestro system at iGent AI to build large, complex software, like a Redis-compatible database

Martin Szummer (@mszummer):

This is a historic moment for us. Our software engineering agent, Maestro, generated solutions for all 12 ICPC World Finals problems — one of the hardest team programming competitions on Earth! We're opening its solutions for the community to validate. Go break them.

Martin Szummer (@mszummer):

Three of us are planning a hike/cycle trip in Scotland following the #ICML2012 workshops (July 2-3-4). Anyone else want to join? Bring boots!