Kabir (@plodq)'s Twitter Profile
Kabir

@plodq

ID: 1237378701798436864

Link: http://kabirk.com
Joined: 10-03-2020 14:05:07

37 Tweets

30 Followers

33 Following

Kabir (@plodq)

I like how Anthropic's Claude gracefully degrades when there are a lot of users. First, it switches to concise mode for shorter responses. If there's no capacity at all, Claude will return an error, but it saves your prompt so you don't need to type it in again.

Kabir (@plodq)

It feels like the 2 main product improvements Cursor brings are 1/ context automatically in chat and 2/ good autocomplete. Feels like this should be easy to implement in other domains. But products like Microsoft's Copilot haven't panned out, so I must be missing something.

Kabir (@plodq)

For a period, some frontier LLMs were being released in 3 sizes, e.g. Gemini Flash, Pro, Ultra; Claude Haiku, Sonnet, Opus. It looks like they're trending towards 2 sizes now: no more Ultra or Opus; o1/mini.

John Yang (@jyangballin)

40% with just 1 try per task: SWE-agent-LM-32B is the new #1 open source model on SWE-bench Verified. We built it by synthesizing a ton of agentic training data from 100+ Python repos. Today we’re open-sourcing the toolkit that made it happen: SWE-smith.
