Roy (@xwlin_roy)'s Twitter Profile
Roy

@xwlin_roy

ID: 1070840644032233474

Joined: 07-12-2018 00:41:19

25 Tweets

1.1K Followers

2.2K Following

Łukasz (@maldr0id)

I wrote a post which tries to explain why Jonathan's paper on Catalangate is wrong and dangerous (and why I care about this). medium.com/@maldr0id/misi…

Roy (@xwlin_roy)

New paper: "ICON: Intent-Context Coupling for Efficient Multi-Turn Jailbreak Attack". We find LLM safety constraints are significantly relaxed when malicious intent is coupled with a semantically congruent context. 97.1% ASR across 8 LLMs. arxiv.org/abs/2601.20903