Gabriel Recchia (@mesotronium)'s Twitter Profile
Gabriel Recchia

@mesotronium

Cognitive scientist, previously at @Cambridge_Uni's Winton Centre for Risk and Evidence Communication, now working on LLM capability evaluation & alignment

ID: 182045911

Link: http://gabrielrecchia.com · Joined: 23-08-2010 18:07:31

503 Tweets

269 Followers

322 Following

Fazl Barez (@fazlbarez)'s Twitter Profile Photo

This has been accepted at ICML 2025! See you all in Vancouver. Credit to Tingchen Fu for leading this work and to my wonderful collaborators: mrinank 🍂, philip, Shay B. Cohen, and David Krueger!

Fazl Barez (@fazlbarez)'s Twitter Profile Photo

Responsible Reviewing #NeurIPS2025 — TL;DR
1. If you or your co-author skip your assigned reviews → you won't see your own paper's reviews.
2. Submit a poor-quality review → your paper may be desk-rejected.
👏 Nice one, NeurIPS! 🔗 blog.neurips.cc/2025/05/02/res…

Geoffrey Irving (@geoffreyirving)'s Twitter Profile Photo

AISI's research agenda is out! We cover a variety of topics in the evaluation and mitigation of risks from frontier LLMs, including both work happening at AISI and work we are excited to see others tackle.

Gabriel Recchia (@mesotronium)'s Twitter Profile Photo

This is an excellent and, I think, very important piece that I hope gets the attention it deserves within the AI safety community. Many congratulations to Josh Engels, David D. Baek, Subhash Kantamneni, and Max Tegmark.

AI Security Institute (@aisecurityinst)'s Twitter Profile Photo

Advanced AI systems require complex evaluations to measure abilities, but conventional analysis techniques often fall short. Introducing HiBayES: a flexible, robust statistical modelling framework that accounts for the nuances & hierarchical structure of advanced evaluations.

Benjamin Hilton (@benjamin_hilton)'s Twitter Profile Photo

Humans are often very wrong. This is a big problem if you want to use human judgment to oversee super-smart AI systems. In our new post, Geoffrey Irving argues that we might be able to deal with this issue – not by fixing the humans, but by redesigning oversight protocols.

Charbel-Raphael (@crsegerie)'s Twitter Profile Photo

"If you cannot measure it, you cannot improve it" - Lord Kelvin The science of AI safety evaluation is still nascent, but is advancing and we know much more today than two years ago. CeSIA tried to make this knowledge accessible by publishing a SotA literature review!

"If you cannot measure it, you cannot improve it" - Lord Kelvin

The science of AI safety evaluation is still nascent, but is advancing and we know much more today than two years ago.

CeSIA tried to make this knowledge accessible by publishing a SotA literature review!
Gabriel Recchia (@mesotronium)'s Twitter Profile Photo

Proud to have contributed to this study, which cleanly demonstrates the persuasive capabilities of LLMs. (No secret Reddit posting involved!!)