Ryan Greenblatt (@ryanpgreenblatt)'s Twitter Profile
Ryan Greenblatt

@ryanpgreenblatt

Chief scientist at Redwood Research (@redwood_ai), focused on technical AI safety research to reduce risks from rogue AIs

ID: 1705245484628226048

22-09-2023 15:39:56

681 Tweets

3.3K Followers

4 Following


New Redwood Research paper in collaboration with Anthropic: We demonstrate cases where Claude fakes alignment when it strongly dislikes what it is being trained to do. (Thread)