Gabriel Recchia (@mesotronium) Twitter Tweets • TwiCopy

Gabriel Recchia

@mesotronium

Cognitive scientist, previously at @Cambridge_Uni's Winton Centre for Risk and Evidence Communication, now working on LLM capability evaluation & alignment

ID: 182045911

Link: http://gabrielrecchia.com

Joined: 23-08-2010 18:07:31

503 Tweets

269 Followers

322 Following

Fazl Barez

@fazlbarez

7 months ago

This has been accepted at ICML 2025! See you all in Vancouver. Credit to Tingchen Fu for leading this work and to my wonderful collaborators mrinank 🍂, philip, Shay B.Cohen, and David Krueger!

Likes: 15 · Replies: 1 · Retweets: 4

Responsible Reviewing #NeurIPS2025 (TL;DR):
1. If you or your co-author skip your assigned reviews → you won't see your own paper's reviews.
2. Submit a poor-quality review → your paper may be desk-rejected.
👏 Nice one, NeurIPS! 🔗 blog.neurips.cc/2025/05/02/res…

Likes: 17 · Replies: 1 · Retweets: 2

Geoffrey Irving

@geoffreyirving

7 months ago

AISI's research agenda is out! We cover a variety of topics in the evaluation and mitigation of risks from frontier LLMs, including both work happening at AISI and work we are excited to see others tackle.

Likes: 58 · Replies: 2 · Retweets: 12

Gabriel Recchia

@mesotronium

7 months ago

This is an excellent and, I think, very important piece that I hope gets the attention it deserves within the AI safety community. Many congratulations to Josh Engels, David D. Baek, Subhash Kantamneni, Max Tegmark

Likes: 1 · Replies: 0 · Retweets: 0

AI Security Institute

@aisecurityinst

7 months ago

Advanced AI systems require complex evaluations to measure abilities, but conventional analysis techniques often fall short. Introducing HiBayES: a flexible, robust statistical modelling framework that accounts for the nuances & hierarchical structure of advanced evaluations.

Likes: 53 · Replies: 2 · Retweets: 11

Benjamin Hilton

@benjamin_hilton

7 months ago

Humans are often very wrong. This is a big problem if you want to use human judgment to oversee super-smart AI systems. In our new post, Geoffrey Irving argues that we might be able to deal with this issue – not by fixing the humans, but by redesigning oversight protocols.

Likes: 17 · Replies: 1 · Retweets: 3

Charbel-Raphael

@crsegerie

7 months ago

"If you cannot measure it, you cannot improve it" - Lord Kelvin

The science of AI safety evaluation is still nascent, but is advancing and we know much more today than two years ago. CeSIA tried to make this knowledge accessible by publishing a SotA literature review!

Likes: 34 · Replies: 2 · Retweets: 7

Gabriel Recchia

@mesotronium

7 months ago

Proud to have contributed to this study that cleanly demonstrates the persuasive capabilities of LLMs. (No secret Reddit posting involved!!)

Likes: 7 · Replies: 0 · Retweets: 1