atharva (@atharvaraykar)'s Twitter Profile
atharva

@atharvaraykar

writing: atharvaraykar.com
work: @nilenso

elsewheres:
🦋 @atharvaraykar.com
🐘 @[email protected]

ID: 1085056759310544896

Joined: 15-01-2019 06:11:05

411 Tweets

221 Followers

586 Following

Anthropic is the first lab that (very quietly) published scores for the more useful SWE Bench variants at launch. Significant improvement on SWE Bench Pro! Unfortunately, no one knows how well Gemini or Codex-Max do on it.

I looked at the code for this. Is this just... asking an LLM to improve the prompt in a loop by giving it annotated data? It beats/matches DSPy's fancy optimizers. Is this a big L for DSPy? Am I missing something? Why didn't they try this obvious thing first?