Scott Swingle (@bio_bootloader)'s Twitter Profile
Scott Swingle

@bio_bootloader

Father of 3, building Mentat (the github native coding agent!) @AbanteAI, prev @DeepMind

ID: 1528890098112049152

Link: http://mentat.ai
Joined: 24-05-2022 00:06:45

5,5K Tweet

10,10K Followers

2,2K Following

Scott Swingle (@bio_bootloader)

$1.25 / $10 input/output is great, and the 400k context window is also great.

Cached input is only $0.125 (10% of the input price), matching Anthropic's cached input rates. Amazing!
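To make the pricing concrete, here is a minimal sketch of the arithmetic. It assumes the quoted prices are USD per 1M tokens (the post does not state the unit), and the function name `request_cost` is hypothetical, chosen only for illustration.

```python
# Prices quoted in the post; assumed here to be USD per 1M tokens.
INPUT_PER_M = 1.25
CACHED_INPUT_PER_M = 0.125
OUTPUT_PER_M = 10.00


def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    cached_tokens is the part of input_tokens billed at the cached rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000
```

Under these assumptions, a full 400k-token prompt costs about $0.50 uncached and about $0.05 when fully cached, before any output tokens.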

Scott Swingle (@bio_bootloader)

GPT-5 is OpenAI's best model yet on LoCoDiff

LoCoDiff is unusual - more reasoning usually hurts performance, and it's the same for GPT-5, with "minimal" doing better than "medium" reasoning effort.

GPT-5 almost at Gemini 2.5 Pro levels until 60k tokens

Sonnet 4 still dominates

Scott Swingle (@bio_bootloader)

Mentat has positive margins. It’s simply 19.5% over our model api costs. Use as much or as little as you want (you’ll want a lot)

Grant♟️ (@granawkins)

Wrap your head around this.
- Have git spit out a file's edit history
- Line up all the diffs one-after-another
- Feed it to a model and tell it to recreate the current (last) state

At 100,000 tokens, Sonnet 4 succeeds 2/3 of the time, while GPT-5 does <5%.
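The procedure described in this post can be sketched roughly as follows. This is an illustrative approximation, not the benchmark's actual harness: the helper names, the prompt wording, and the exact-match scoring rule are all assumptions. The only external call is the standard `git log -p --reverse -- <path>`.

```python
import subprocess


def file_history(repo_dir: str, path: str) -> str:
    """Return every commit's diff for `path`, oldest first, using `git log -p`."""
    result = subprocess.run(
        ["git", "log", "-p", "--reverse", "--", path],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def build_prompt(history: str, path: str) -> str:
    """Wrap the concatenated diffs in an instruction to rebuild the final file.

    The wording is a hypothetical prompt, not the one the benchmark uses.
    """
    return (
        f"Here is the complete git history of {path}, oldest commit first.\n\n"
        f"{history}\n\n"
        f"Output the exact contents of {path} as of the last commit."
    )


def is_exact_match(model_output: str, expected: str) -> bool:
    """Assumed scoring rule: success only if the final file is reproduced exactly."""
    return model_output.strip() == expected.strip()
```

A run would call `file_history` on a real repository, send `build_prompt(...)` to the model, and compare the reply against the file at `HEAD` with `is_exact_match`.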

Scott Swingle (@bio_bootloader)

whoa OpenAI is giving GPT-5 on the API a prompt before the one I set??

they give it:
- instructions on formatting (which is contradicting my own)
- today's date
- telling it that it's being used over the API
- a bunch of other stuff

not cool