Luigi Acerbi (@acerbiluigi)'s Twitter Profile
Luigi Acerbi

@acerbiluigi

Assoc. Prof. of Machine & Human Intelligence @UnivHelsinkiCS @FCAI_fi | Bayesian ML & probabilistic modeling

ID: 757009671366606848

Website: https://lacerbi.github.io/ | Joined: 24-07-2016 00:29:05

1.1K Tweets

2.2K Followers

493 Following

François Fleuret (@francoisfleuret)

As expected, that was popular. Here is my attempt at consolidating all the answers into a list.
- Prenorm: normalization in the residual blocks before the attention operation and the FFN respectively
- GQA (Group Query Attention): more Q than (K, V)

Lisan al Gaib (@scaling01)

I'm back and Gemini 2.5 Pro is still the king (no glaze)

I did some more manual data cleaning and scrapped the shitty "average scaled score" and replaced it with Glicko-2 rating system with params:
INITIAL_RATING = 1500
INITIAL_RD     = 350
INITIAL_VOL    = 0.06
TAU (τ)        =

Luigi Acerbi (@acerbiluigi)

NeurIPS deadline came early. I remember someone proposing stochastic conference deadlines with exponential decay, is this an A/B test for that?

Brian Krassenstein (@krassenstein)

BREAKING: They now are sending secret service agents to James Comey's house for calling for the removal of a sitting president who is going against the Constitution. If they don't like your "free speech" they will intimidate you, lie about you, and make you the enemy. 8647

N. Loka (@nasloka)

In BO, normally we use P(y|x,D) for AF calculation. With ACE, we can get P(xopt|D) (just predict the optimum!), making Thompson sampling and entropy-based AF straightforward to compute. See Luigi's latest blog post on how we craft the dataset and the broader direction beyond BO!

Andrea (@perina_ndrea)

Now accepted at JMLR, and with an extension to general finite groups (including non-abelian groups)! Updated version of our (w/ Stéphane Deny) work: arxiv.org/abs/2412.11521

Daniel Litt (@littmath)

Claude: my wife and I went antique shopping this weekend
Gemini: if I can’t get this code to work I will k*** myself
ChatGPT: the answer to your question came to me in a dream
Grok: why yes I was in Berlin in 1939, why do you ask?

Boaz Barak (@boazbaraktcs)

I didn't want to post on Grok safety since I work at a competitor, but it's not about competition. I appreciate the scientists and engineers at xAI but the way safety was handled is completely irresponsible. Thread below.

Ravid Shwartz Ziv (@ziv_ravid)

So, all the models underperform humans on the new International Mathematical Olympiad questions, and Grok-4 is especially bad on it, even with best-of-n selection? Unbelievable!

Tom Rainforth (@tom_rainforth)

I have an opening for a 2-year postdoc in probabilistic machine learning and/or experimental design. The application deadline is the 3rd of September. See here for details and how to apply: tinyurl.com/rainmlpostdoc2…