Marius Hobbhahn (@mariushobbhahn) Twitter Tweets • TwiCopy

Marius Hobbhahn

@mariushobbhahn


CEO at Apollo Research @apolloaievals

prev. ML PhD with Philipp Hennig & AI forecasting @EpochAIResearch

ID: 1012809976224641024

Website: https://www.mariushobbhahn.com

Joined: 29-06-2018 21:28:10

944 Tweets

3.3K Followers

1.1K Following


7 months ago

I'm very glad that detailed AI system cards are the norm. There could have been another world in which the general public knew basically almost nothing about the dangerous capabilities and propensities of frontier systems.

Likes: 74 · Replies: 3 · Reposts: 1


7 months ago

System cards are an example of something that seems irrational in the short term ("makes you look bad"), but rational in the medium and long term ("you're more trustworthy" & sharing safety knowledge). I'm glad that the labs are defying their myopic incentives. Yay humanity

Likes: 74 · Replies: 1 · Reposts: 4


7 months ago

It's also worth pointing out that models from all providers are willing to do this under the right circumstances. Claude has a higher propensity to do so, but it's not the only one. I think this just emerges from scale+RL+HHH.

Likes: 39 · Replies: 1 · Reposts: 0


6 months ago

LLMs are getting rapidly more evals aware! Afaik, nobody has a good plan for what to do when the models constantly say "This is an eval testing for X. Let's say what the developers want to hear" during evals.

Likes: 106 · Replies: 11 · Reposts: 16


6 months ago

We're hiring for an Evals Software Engineer with a heavy focus on Infrastructure. Design, build, maintain, and secure our Infrastructure. Deadline: 22 June. If in doubt, just apply. It takes 5-10 minutes. jobs.lever.co/apolloresearch…

Likes: 18 · Replies: 0 · Reposts: 3


6 months ago

I often hear that it will take decades for AI non-adopters to be outcompeted by AI adopters due to slow diffusion. I think coding is a datapoint against. Cursor+frontier model is already so much faster and we haven't even started with coding agent swarms yet.

Likes: 51 · Replies: 5 · Reposts: 2
