Leopold Aschenbrenner
@leopoldasch
superalignment @ openai
ID:2989966781
http://forourposterity.com 21-01-2015 14:18:33
1,0K Tweets
12,6K Followers
3,5K Following
One of the best parts of SF is hanging out with my good friends Dwarkesh Patel and Trenton Bricken.
Dwarkesh is the best interviewer in the world - and I hope this gives you a good feeling for what it’s like to be on the ground in the labs. It only gets crazier from here!
Leopold Aschenbrenner: Churchill had an amazing (and underappreciated) track record as a futurist
H/t Jason Crawford
rootsofprogress.org/winston-church…
A good example is Sholto Douglas at Google DeepMind. He's quiet on Twitter, doesn't have any flashy first-author publications, and has only been in the field for ~1.5 years, but people in AI know he was one of the most important people behind Gemini's success
So happy about this release and grateful to my awesome Preparedness team (especially Tejal Patwardhan), Policy Research, SuperAlignment and all of OpenAI for the hard work it took to get us here. It is still only a start but the work will continue!
Cool work from Google DeepMind alignment on limitations of methods for eliciting a model's beliefs!
My key takeaway is that unsupervised methods (eg CCS) rely on 'proxy properties' of true beliefs, but other features share these proxies! Eg 'agrees with the user' vs 'is true'