Guglielmo D'Anna (@guglielmodanna) 's Twitter Profile
Guglielmo D'Anna

@guglielmodanna

Design Engineer @codestoryAI building @aide_dev and other cool things

ID: 2969809955

09-01-2015 14:51:03

120 Tweets

79 Followers

203 Following

Aide (@aide_dev) 's Twitter Profile Photo

Say hello to agents in your editor! 🤖 🤝 🤖 We are taking our first step towards putting agents right in your editor. Unlike other platforms, it's not just a single agent but many of them working and collaborating to get the task done. You can see the swarm of agents

skcd (@skcd42) 's Twitter Profile Photo

Excited to share that our team has cooked up a multi-agent coding framework that sets a new state of the art on SWE-Bench-Lite with 40.3% accepted solutions! Very soon this framework will work right in your editor, with the developer working alongside the agent(s).

skcd (@skcd42) 's Twitter Profile Photo

We are sharing a more detailed breakdown of our swe-bench-lite evaluation results and some gotchas we found. To start with, we are scoring 40.3% (the real score is a bit higher than that, but some tests were failing but we got a green tick for it). Now for the gotchas

Aide (@aide_dev) 's Twitter Profile Photo

We are improving the UX for probing:
- We now show the results of the probe directly in the editor
- Decorations are improved so you can see where the agent decided to click and the symbol to follow

Naresh 🍓 (@ghostwriternr) 's Twitter Profile Photo

A sneak peek of the CodeStory coding agent that still sits at the top of SWE-bench, now integrated into the editor ⚡️ The agent here explores the codebase to add a new API, refactors an existing struct, and adds just the new methods as necessary.

Naresh 🍓 (@ghostwriternr) 's Twitter Profile Photo

Some QOL improvements to Aide: You can now minimise the CMD+K palette by pressing Esc when the agents are working, and shift your focus elsewhere💃 And if you cancel while the agents are only partly done, edits made until then are kept around for you to review 💁 Happy hacking!

skcd (@skcd42) 's Twitter Profile Photo

Long context is now available on Aide! We now use Google's Gemini-Pro-1.5-Flash for performing long context search and starting the agents. You do not need to be at the right file or location and can start the work from anywhere.

skcd (@skcd42) 's Twitter Profile Photo

agents are 4X faster
key changes:
- we made our context gathering step much faster by optimising for speed while keeping the same accuracy
- code generation is now multi-tiered, using sonnet for planning the outline of the change and Llama-3.1-8B for applying the edits
now:

skcd (@skcd42) 's Twitter Profile Photo

Code at the speed of thought
The agents are writing code in real time, no speedups! Instead of coming up with the proper prompt, you can think through the changes you are about to make ITERATIVELY and the agents take care of writing out the code.

skcd (@skcd42) 's Twitter Profile Photo

Another big change is the way we navigate the codebase: starting today, you can navigate the code in Aide using the AST instead of line by line. This is how code is written and parsed (as nodes of the AST), allowing you to move and edit quickly.
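
To make the idea of moving by AST node rather than by line concrete, here is a minimal sketch using Python's standard `ast` module. It is not Aide's implementation, which the tweet does not describe; the sample source and the helper names `node_at_line` and `next_sibling` are hypothetical and exist only for illustration.

```python
import ast

# Hypothetical sample file; line 1 of the string is the blank line after '''.
SOURCE = '''
import os

def load(path):
    with open(path) as f:
        return f.read()

class Store:
    def get(self, key):
        return key

    def put(self, key, value):
        pass
'''

DEFS = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def node_at_line(tree, line):
    """Return the innermost function/class node whose span contains `line`."""
    candidates = [
        n for n in ast.walk(tree)
        if isinstance(n, DEFS) and n.lineno <= line <= n.end_lineno
    ]
    # The innermost node is the one with the smallest line span.
    return min(candidates, key=lambda n: n.end_lineno - n.lineno, default=None)


def next_sibling(tree, target):
    """Return the statement after `target` in its parent's body, or None."""
    for parent in ast.walk(tree):
        body = getattr(parent, "body", None)
        if isinstance(body, list) and target in body:
            i = body.index(target)
            return body[i + 1] if i + 1 < len(body) else None
    return None


tree = ast.parse(SOURCE)
current = node_at_line(tree, 10)       # cursor sits inside Store.get
following = next_sibling(tree, current)  # jump to the next method
print(current.name, "->", following.name)
```

A cursor on any line inside `get` resolves to the whole method, and a single "next" step lands on `put` regardless of how many lines separate them, which is the kind of structural jump the tweet describes.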

Guglielmo D'Anna (@guglielmodanna) 's Twitter Profile Photo

Going in iterations is such a natural way of coding. It makes sense that coding with AI should be possible in such fashion – with the advantage that agents can code much faster than people (even Sandeep). Here is what we hashed out these past two weeks.

Guglielmo D'Anna (@guglielmodanna) 's Twitter Profile Photo

This thing Naresh 🍓 whipped up today is just sooo cool, and makes me wonder how I lived without this in VS Code till now. It's also the perfect pair with the quick, anchor-based symbol AI editing that we are releasing tonight.

skcd (@skcd42) 's Twitter Profile Photo

CodeStory agent is now SOTA on swebench-verified with a 62.2% resolution rate. We did this by scaling our agent on test-time inference and re-learning the bitter lesson. Sonnet 3.5 (new) was the only LLM we used for this run.
