
Andy Zou
@andyzou_jiaming
PhD student at CMU, working on AI Safety and Security
ID: 2447660207
https://andyzoujm.github.io/ 30-03-2014 17:51:58
144 Tweets
3.3K Followers
67 Following



🎥 Bay Area Alignment Workshop videos are out! Check out talks by Anca Dragan, Elizabeth Barnes, Buck Shlegeris, Richard Ngo, Evan Hubinger, Andy Zou, Cas (Stephen Casper), Alexander Wei, Adam Gleave, @julianmichael, and Micah Carroll, with more coming! Blog recap & links. 👇









Brace Yourself: Our Biggest AI Jailbreaking Arena Yet. We’re launching a next-level Agent Red-Teaming Challenge, not just chatbots anymore. Think direct & indirect attacks on anonymous frontier models. $100K+ in prizes and raffle giveaways, supported by the UK AI Security Institute.






Excited about this work with Asher Trockman, Yash Savani (and others) on antidistillation sampling. It uses a nifty trick to efficiently generate samples that make student models _worse_ when you train on them. I spoke about it at Simons this past week. Links below.


