Chris Cundy
@chriscundy
Research Scientist at FAR AI.
PhD from Stanford University.
Hopefully making AI benefit humanity.
Views are my own.
ID: 891751545594707968
http://cundy.me
30-07-2017 20:05:11
342 Tweets
1.1K Followers
215 Following
As part of our ongoing work on AI safety and security, we've discovered a powerful yet simple LLM jailbreak that exploits an intrinsic LLM behavior we call 'crescendo', and we have demonstrated it on dozens of tasks across major LLMs and services: …ndo-the-multiturn-jailbreak.github.io
🚀 Excited to share our latest #AI research from Stanford on privacy-constrained reinforcement learning, developed with Chris Cundy and Stefano Ermon! Our framework minimizes sensitive information exposure using mutual information regularizers. 🤖💡#AISTATS2024 1/6