Kyle Corbitt (@corbtt)'s Twitter Profile
Kyle Corbitt

@corbtt

Currently building @OpenPipeAI. Formerly @ycombinator, @google. I am always down to go on a quest.

ID: 823506858

Joined: 14-09-2012 15:44:30

761 Tweets

6.1K Followers

134 Following

Kyle Corbitt (@corbtt):

Finished evaluating the new GPT-4 on 5 real customer tasks (not benchmarks!).

Conclusion: The GPT-4 April release is pretty comparable on most things, but much worse on guided summarization.

Definitely worth running your own evals before adopting!

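A minimal sketch of what running your own evals can look like, assuming the OpenAI Python client; the eval set and scorer below are hypothetical placeholders, not OpenPipe's actual harness.

```python
# Minimal model-comparison eval sketch (hypothetical tasks and scorer).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

# Hypothetical eval set: (prompt, reference output) pairs drawn from real tasks.
EVAL_SET = [
    ("Summarize this ticket in one sentence: ...", "Customer cannot reset password."),
    # ... more task examples ...
]

def score(output: str, reference: str) -> float:
    """Placeholder scorer; swap in an LLM judge or a task-specific metric."""
    return float(reference.lower() in output.lower())

def run_eval(model: str) -> float:
    total = 0.0
    for prompt, reference in EVAL_SET:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        total += score(resp.choices[0].message.content, reference)
    return total / len(EVAL_SET)

for model in ["gpt-4-0125-preview", "gpt-4-turbo-2024-04-09"]:
    print(model, run_eval(model))
```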
Kyle Corbitt (@corbtt):

Intentionally not bothering to refactor/fix any tech debt for the next 2 months. GPT-5 will be able to just cleanly rewrite my codebase in one fell swoop right?

Kyle Corbitt (@corbtt):

Still pulling the data together but seems like newest GPT-4-turbo is a bit worse on average on our evals than the previous gpt-4-0125-preview. Will post data tomorrow probably.

Kyle Corbitt (@corbtt):

Making tool calls a first-class citizen in the OpenAI API was a mistake. JSON mode can do everything tool calls can and more, and is conceptually simpler.
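A hedged sketch of that claim, showing JSON mode standing in for a tool call via the OpenAI API; the get_weather tool and its schema are made up for illustration.

```python
# Emulating a "tool call" with JSON mode (hypothetical get_weather tool).
import json
from openai import OpenAI

client = OpenAI()

SYSTEM = (
    "You can call one tool: get_weather(city: str). "
    'Respond only with JSON like {"tool": "get_weather", "arguments": {"city": "..."}}.'
)

resp = client.chat.completions.create(
    model="gpt-4-turbo",
    response_format={"type": "json_object"},  # JSON mode
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "What's the weather in Seattle?"},
    ],
)

call = json.loads(resp.choices[0].message.content)
if call["tool"] == "get_weather":
    city = call["arguments"]["city"]
    # dispatch to your own get_weather(city) implementation here
    print("would call get_weather with", city)
```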

Kyle Corbitt (@corbtt):

If you want to try out the new Llama 3 models when they drop next week, the best way to do so is to get your dataset uploaded and ready to go on OpenPipe. We will have fine-tuning and inference live ASAP after the release.

Kyle Corbitt (@corbtt):

The Information reports that the timeline for releasing the smallest Llama 3 variants has moved up to next week!

This prob means Meta has found their 7B beats Mistral 7B on benchmarks (otherwise they wouldn't give it a dedicated release). Let's go! 🚀

theinformation.com/articles/meta-…

Kyle Corbitt (@corbtt):

Potentially important development in parameter-efficient-fine-tuning. Much smaller parameter count than LoRA, which translates directly to smaller overhead at serving time and shorter training time. Coupled with (claimed) higher perf!

As always, need to verify it reproduces and…
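For scale on the LoRA baseline being compared against, a back-of-envelope count of adapter parameters using illustrative Llama-7B-style dimensions and rank (not numbers from the linked work):

```python
# Rough LoRA adapter parameter count for a Llama-7B-style model (illustrative).
d_model = 4096         # hidden size
n_layers = 32          # transformer layers
rank = 8               # LoRA rank
targets_per_layer = 4  # e.g. q/k/v/o projections, each roughly d_model x d_model

# Each adapted d x d linear gets two low-rank factors: A (r x d) and B (d x r).
params_per_linear = rank * d_model * 2
lora_params = params_per_linear * targets_per_layer * n_layers
print(f"~{lora_params / 1e6:.1f}M adapter params")  # ~8.4M, vs ~7B base weights
```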

Kyle Corbitt (@corbtt):

I run a fine-tuning company and this is what I tell everyone.

Prompting is for 0 to 1. Fine tuning is for 1 to 100.

Kyle Corbitt (@corbtt):

2025: it's now considered good manners to add subtle typos and grammar errors to your emails. signals a real human spent time on it.

2026: all frontier models are now RLHF'd to add typos and grammar errors.

Kyle Corbitt (@corbtt):

This new optimizer is potentially a major breakthrough for fine-tuning in production.

Why?

You no longer need to choose a learning rate schedule that tails off to 0. That makes continual fine-tuning on new production data much less fraught, since you don't need to re-warm your…
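A small illustration of the underlying issue (this does not reproduce the new optimizer; it just shows how a cosine schedule decays to zero, which is what forces the re-warming when you resume on new data):

```python
# Why decay-to-zero schedules make continual fine-tuning awkward (illustrative).
import torch

param = torch.nn.Parameter(torch.zeros(1))
opt = torch.optim.AdamW([param], lr=1e-4)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=1000)

for _ in range(1000):
    opt.step()
    sched.step()

print(sched.get_last_lr())  # ~0.0: resuming training needs a fresh warmup,
                            # whereas a constant-LR setup can just keep going.
```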

Kyle Corbitt (@corbtt):

'Anything worth doing is worth doing r̶i̶g̶h̶t̶ in a half-a** way so you can get to market fast and iterate from there'
-- any actually successful SaaS founder in a competitive industry

Kyle Corbitt (@corbtt):

An AI-empowered employee has a vastly higher skill floor than a non-AI-empowered employee.

Just asked an engineer with 0 marketing experience to set up our Hubspot to send a product newsletter to all our current and future users. Pre-AI, would not have been worth the ramp to…

Jeremy Howard (@jeremyphoward):

Replying to Kyle Corbitt: The reason people do most things in model training is because that's what everyone else does. Everyone else does it because the first paper in the sequence did it. That paper did it because a PhD student had some code for that handy.

Kyle Corbitt (@corbtt):

Hard to overstate how big a deal this is for fine-tuning: with existing methods, you have to know *ahead of time* how many epochs you want to train for.

This training technique, if true, would let you just keep evaling checkpoints at epoch 1, 3, 5, etc. until it's good enough!…

Kyle Corbitt (@corbtt):

Really enjoyed this conversation with the Cerebral Valley team. They've done a fantastic job of cultivating the AI community in SF. 🙂

Kyle Corbitt (@corbtt):

Anyone know how much perf you gain from quantization on optimized hardware? Like on an H100 do you get higher throughput with a 13B model in FP8 or a 7B model in BF16?
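One back-of-envelope way to frame it for the memory-bandwidth-bound decode case, with rough numbers rather than measured throughput:

```python
# Back-of-envelope: weight bytes read per decoded token (bandwidth-bound regime).
params_13b, params_7b = 13e9, 7e9
fp8_bytes, bf16_bytes = 1, 2

weights_13b_fp8 = params_13b * fp8_bytes / 1e9   # ~13 GB of weights per token
weights_7b_bf16 = params_7b * bf16_bytes / 1e9   # ~14 GB of weights per token

print(f"13B FP8:  {weights_13b_fp8:.0f} GB of weights per token")
print(f"7B BF16:  {weights_7b_bf16:.0f} GB of weights per token")
# Similar weight traffic, so decode throughput should be in the same ballpark;
# FP8 also gets roughly 2x the tensor-core FLOPs on H100, which matters more
# for prefill and large batches. Real numbers depend on kernels, so benchmark both.
```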

Kyle Corbitt (@corbtt):

imo the industry lost a lot when we switched from few-shot prompting with GPT-3 to instruction-prompting with GPT-3.5 and 4. A few good examples can guide a model much more strongly than written instructions.

Expecting a big comeback here.
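A minimal sketch of the few-shot style being described, with made-up classification examples and the OpenAI chat API:

```python
# Few-shot prompting: guide the model with examples instead of long instructions.
from openai import OpenAI

client = OpenAI()

few_shot = [  # made-up demonstrations of the target behavior
    {"role": "user", "content": "Ticket: App crashes when I upload a photo."},
    {"role": "assistant", "content": "category: bug, severity: high"},
    {"role": "user", "content": "Ticket: Please add a dark mode."},
    {"role": "assistant", "content": "category: feature_request, severity: low"},
]

resp = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=few_shot + [
        {"role": "user", "content": "Ticket: I was charged twice this month."},
    ],
)
print(resp.choices[0].message.content)  # e.g. "category: billing, severity: high"
```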

Kyle Corbitt (@corbtt):

For a long time I tried to come up with some clever new insight every time I was asked to give a conference/meetup talk.

Eventually I realized that just giving the same talk over and over (as long as the audience doesn't fully overlap) is the boring-but-optimal solution.
