Sam Bowyer (@sambowyer__) Twitter Tweets • TwiCopy

Sam Bowyer

@sambowyer__

Bristol ML PhD Student, Compass CDT

ID: 1635998612604821504

Website: https://sambowyer.com/ · Joined: 15-03-2023 13:37:27

24 Tweets

54 Followers

83 Following

Edward Milsom

@edward_milsom

9 months ago

Our paper "Function-Space Learning Rates" is on arXiv! We give an efficient way to estimate the magnitude of changes to NN outputs caused by a particular weight update. We analyse optimiser dynamics in function space, and enable hyperparameter transfer with our scheme FLeRM! 🧵👇

Likes: 420 · Replies: 12 · Reposts: 68

Desi R. Ivanova

@desirivanova

9 months ago

I’ve been complaining about lack of error bars in LLM papers for some time. Rather than just complaining, here’s a guide on how to do it! ⬇️ We’ve done a small Python lib that you can install… or copy-paste one file into your projects (dependencies are annoying, we get it 🙃)

Likes: 56 · Replies: 2 · Reposts: 9

Desi R. Ivanova

@desirivanova

9 months ago

So no more excuses for not adding error bars (or adding invalid ones 😬)

Likes: 5 · Replies: 1 · Reposts: 1

Thomas Heap

@thomaseheap

8 months ago

Link: arxiv.org/abs/2503.08264 Code: github.com/alan-ppl/alan This work was a team effort, and I'm very grateful to my collaborators Sam Bowyer and Laurence Aitchison. Thanks also to Gavin Leech, who was involved in the MP-RWS paper.

Likes: 6 · Replies: 0 · Reposts: 2

Sam Bowyer

@sambowyer__

8 months ago

Really happy to have this paper out on arXiv! Scalable GPU-based Bayesian inference for hierarchical models without requiring gradients wrt model parameters (unlike e.g. VI). arxiv.org/abs/2503.08264

Likes: 8 · Replies: 0 · Reposts: 4

Sam Bowyer

@sambowyer__

7 months ago

Our position paper on LLM eval error bars has just been accepted to ICML 2025 as a spotlight poster!

Likes: 19 · Replies: 1 · Reposts: 10

Laurence Aitchison

@laurence_ai

7 months ago

(Spotlight) LLM evals are increasingly based on tiny datasets (e.g. AIME), so considering uncertainty is becoming critical. We show that approaches based on the CLT don't work, and give Bayesian and frequentist alternatives. (With Sam Bowyer and Desi R. Ivanova.) arxiv.org/abs/2503.01747

Likes: 14 · Replies: 0 · Reposts: 3
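
The point in the tweet above, that CLT-based error bars break down on tiny evals, can be illustrated with a small sketch. This is not the paper's code or its library: the function names are hypothetical, and the Wilson score interval and the Beta-Bernoulli credible interval (with an assumed uniform Beta(1, 1) prior) are standard textbook choices used here as stand-ins for "frequentist" and "Bayesian" alternatives.

```python
import random
from statistics import NormalDist


def wald_interval(successes, n, level=0.95):
    """CLT-based (Wald) interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    half = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - half, p_hat + half


def wilson_interval(successes, n, level=0.95):
    """Wilson score interval, a frequentist alternative that behaves better at small n."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * (p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) ** 0.5 / denom
    return centre - half, centre + half


def beta_credible_interval(successes, n, level=0.95, draws=100_000, seed=0):
    """Monte Carlo credible interval for accuracy under an assumed Beta(1, 1) prior."""
    rng = random.Random(seed)
    # Posterior for a Bernoulli success rate is Beta(1 + successes, 1 + failures).
    samples = sorted(
        rng.betavariate(1 + successes, 1 + n - successes) for _ in range(draws)
    )
    lo = samples[int((1 - level) / 2 * draws)]
    hi = samples[int((1 + level) / 2 * draws) - 1]
    return lo, hi


if __name__ == "__main__":
    # Hypothetical eval: a model answers all 30 questions of a 30-question benchmark.
    for name, fn in [("Wald (CLT)", wald_interval),
                     ("Wilson", wilson_interval),
                     ("Beta posterior", beta_credible_interval)]:
        lo, hi = fn(30, 30)
        print(f"{name:15s} [{lo:.3f}, {hi:.3f}]")
```

With a perfect score on 30 questions, the Wald interval collapses to zero width because the estimated variance is zero, while the other two intervals still admit that the true accuracy could be noticeably below 1.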

Ben Anson

@benaibean

6 months ago

Is it possible to _derive_ an attention scheme with effective zero-shot generalisation? The answer turns out to be yes! To achieve this, we began by thinking about desirable properties for attention over long contexts, and we distilled 2 key conditions:

Likes: 407 · Replies: 6 · Reposts: 42

Xidulu

@xidulu

5 months ago

Thoughts after reading Sam Bowyer's amazing position paper: are there more sensible approaches to drawing error bars when reporting pass@k than just computing the standard deviation? arxiv.org/abs/2503.01747

Likes: 5 · Replies: 0 · Reposts: 4
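
As a rough illustration of the question in the tweet above, the sketch below computes pass@k with the widely used unbiased estimator, 1 - C(n-c, k) / C(n, k), and attaches a percentile bootstrap interval obtained by resampling problems. The bootstrap is a generic technique swapped in here for illustration; it is not taken from the position paper, and it can itself be unreliable when there are only a handful of problems. All function names and the example data are hypothetical.

```python
import random
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem: n samples drawn, c of them correct."""
    if n - c < k:
        # Fewer than k incorrect samples: every size-k subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def bootstrap_pass_at_k(correct_counts, n, k, level=0.95, resamples=2000, seed=0):
    """Mean pass@k over problems with a percentile bootstrap interval (resampling problems)."""
    rng = random.Random(seed)
    per_problem = [pass_at_k(n, c, k) for c in correct_counts]
    m = len(per_problem)
    point = sum(per_problem) / m
    # Resample problems with replacement and recompute the benchmark-level mean.
    means = sorted(
        sum(rng.choices(per_problem, k=m)) / m for _ in range(resamples)
    )
    lo = means[int((1 - level) / 2 * resamples)]
    hi = means[int((1 + level) / 2 * resamples) - 1]
    return point, lo, hi


if __name__ == "__main__":
    # Hypothetical data: number of correct samples out of n=10 for each of 8 problems.
    counts = [0, 3, 10, 5, 1, 7, 0, 2]
    point, lo, hi = bootstrap_pass_at_k(counts, n=10, k=5)
    print(f"pass@5 = {point:.3f}, 95% bootstrap interval [{lo:.3f}, {hi:.3f}]")
```

Resampling at the problem level reflects that the main source of uncertainty in a small benchmark is which problems happened to be included, not only the sampling noise within each problem.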