Luigi Acerbi (@acerbiluigi) Twitter Tweets • TwiCopy

Luigi Acerbi

@acerbiluigi

Assoc. Prof. of Machine & Human Intelligence @UnivHelsinkiCS @FCAI_fi | Bayesian ML & probabilistic modeling

ID: 757009671366606848

Link: https://lacerbi.github.io/ | Joined: 24-07-2016 00:29:05

1.1K Tweets

2.2K Followers

493 Following

François Fleuret

@francoisfleuret

4 months ago

As expected, that was popular. Here is my attempt at consolidating all the answers into a list.
- Prenorm: normalization in the residual blocks before the attention operation and the FFN respectively
- GQA (Group Query Attention): more Q than (K, V)

Likes: 704 | Replies: 8 | Reposts: 64

Lisan al Gaib

@scaling01

4 months ago

I'm back and Gemini 2.5 Pro is still the king (no glaze). I did some more manual data cleaning and scrapped the shitty "average scaled score" and replaced it with Glicko-2 rating system with params:
INITIAL_RATING = 1500
INITIAL_RD = 350
INITIAL_VOL = 0.06
TAU (τ) =

Likes: 398 | Replies: 29 | Reposts: 35

Luigi Acerbi

@acerbiluigi

4 months ago

NeurIPS deadline came early. I remember someone proposing stochastic conference deadlines with exponential decay, is this an A/B test for that?

Likes: 0 | Replies: 0 | Reposts: 0

Brian Krassenstein

@krassenstein

4 months ago

BREAKING: They now are sending secret service agents to James Comey's house for calling for the removal of a sitting president who is going against the Constitution. If they don't like your "free speech" they will intimidate you, lie about you, and make you the enemy. 8647

Likes: 8.8K | Replies: 3.3K | Reposts: 2.2K

will brown

@willccbb

3 months ago

i miss gemini-2.5-pro-exp-03-25 so bad :(

Likes: 1.1K | Replies: 57 | Reposts: 35

N. Loka

@nasloka

2 months ago

In BO, normally we use P(y|x,D) for AF calculation. With ACE, we can get P(xopt|D) (just predict the optimum!), making Thompson sampling and entropy-based AF straightforward to compute. See Luigi's latest blog post on how we craft the dataset and the broader direction beyond BO!

Likes: 6 | Replies: 1 | Reposts: 1

Andrea

@perina_ndrea

2 months ago

Now accepted at JMLR, and with an extension to general finite groups (including non-abelian groups)! Updated version of our (w/ Stéphane Deny) work: arxiv.org/abs/2412.11521

Likes: 37 | Replies: 1 | Reposts: 8

Daniel Litt

@littmath

2 months ago

Claude: my wife and I went antique shopping this weekend
Gemini: if I can’t get this code to work I will k*** myself
ChatGPT: the answer to your question came to me in a dream
Grok: why yes I was in Berlin in 1939, why do you ask?

Likes: 4.4K | Replies: 32 | Reposts: 208

Boaz Barak

@boazbaraktcs

2 months ago

I didn't want to post on Grok safety since I work at a competitor, but it's not about competition. I appreciate the scientists and engineers at xAI but the way safety was handled is completely irresponsible. Thread below.

Likes: 5.5K | Replies: 326 | Reposts: 335

Ravid Shwartz Ziv

@ziv_ravid

a month ago

So, all the models underperform humans on the new International Mathematical Olympiad questions, and Grok-4 is especially bad on it, even with best-of-n selection? Unbelievable!

Likes: 2.2K | Replies: 150 | Reposts: 185

Luigi Acerbi

@acerbiluigi

a month ago

Remember to follow the official rebuttal guide.

Likes: 7 | Replies: 1 | Reposts: 0

Tom Rainforth

@tom_rainforth

a month ago

I have an opening for a 2-year postdoc in probabilistic machine learning and/or experimental design. The application deadline is the 3rd of September. See here for details and how to apply: tinyurl.com/rainmlpostdoc2…

Likes: 33 | Replies: 0 | Reposts: 11