Dalton brown
@daltonbrown944
AI Alignment, AI ethics
ID: 1624063095663460352
10-02-2023 15:10:01
56 Tweets
502 Followers
2.2K Following
“Talent hits a target no one else can hit; Genius hits a target no one else can see.” (A. Schopenhauer) I've seen Geoffrey Hinton hit targets no one thought existed. So even though I don't fully agree with Geoff's views on AI risks, I'd still listen carefully to what he has to say.
BTW, 2 papers I often recommend: 1) "The alignment problem from a deep learning perspective" by Richard Ngo et al., for a research overview: arxiv.org/abs/2209.00626 2) "Natural Selection Favors AIs over Humans" by Dan Hendrycks, for the argument for risk: arxiv.org/abs/2303.16200
In our new Google DeepMind paper, we red-team methods that aim to discover latent knowledge through unsupervised learning from LLM activation data. TL;DR: Existing methods can be easily distracted by other salient features in the prompt. arxiv.org/abs/2312.10029 🧵👇
So happy about this release and grateful to my awesome Preparedness team (especially Tejal Patwardhan), Policy Research, SuperAlignment and all of OpenAI for the hard work it took to get us here. It is still only a start but the work will continue!