
Martin Szummer
@mszummer
Researcher, Entrepreneur, Lover of the North
ID: 582971800
17-05-2012 17:25:32
2 Tweet
77 Takipçi
131 Takip Edilen

"Agency > Intelligence" Andrej Karpathy nailed it, and after 18 months building Maestro, we agree. The real AI leap isn’t just smarts—it’s agency: the ability to act independently, turning assistants into partners.


Our VibeCodeBench evaluations affirm what Paul Jankura just announced: Claude Sonnet 4 excels at autonomous multi-feature development. We've seen codebase navigation errors drop from 20% to near zero and strategic refactoring that saves ~500k tokens on multi stage, complex tasks.

Paul Jankura reports Claude 4 models are 65% less likely to use shortcuts on agentic tasks. Our evaluations confirm this—Claude Sonnet 4 consistently understates feature completeness rather than overstate success. This translates to more reliable AI assistance through Maestro.






