@egor_zverev_ai : 🚀 We’ve released the source code for 𝗔𝗦𝗜𝗗𝗘 (presented as an 𝗢𝗿𝗮𝗹 at the #ICLR2025 BuildTrust workshop)! 🔍ASIDE boosts prompt injection robustness without safety-tuning: we simply rotate embeddings of marked tokens by 90° during instruction-tuning and inference 👇code

Egor Zverev @ICLR 2025

@egor_zverev_ai

ML Safety PhD@ISTA

ID: 1770024348675399680

Link: https://github.com/egozverev/
Joined: 19-03-2024 10:15:33

32 Tweets

57 Followers

160 Following

2 months ago

🚀 We’ve released the source code for 𝗔𝗦𝗜𝗗𝗘 (presented as an 𝗢𝗿𝗮𝗹 at the #ICLR2025 BuildTrust workshop)! 🔍ASIDE boosts prompt injection robustness without safety-tuning: we simply rotate embeddings of marked tokens by 90° during instruction-tuning and inference 👇code
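
To make the "rotate by 90°" idea concrete, here is a minimal sketch in plain Python. It is not taken from the released ASIDE code: it assumes one particular kind of 90° rotation (each consecutive pair of coordinates is rotated in its own 2D plane), and the names `rotate_90` and `embed_with_marks` are hypothetical. The actual repository may define the rotation and the marking differently.

```python
from typing import List, Sequence


def rotate_90(vec: Sequence[float]) -> List[float]:
    """Rotate an embedding by 90 degrees, pair by pair.

    Assumption for illustration: coordinates (2i, 2i+1) are treated as a
    2D plane and (x, y) is mapped to (-y, x). The result has the same norm
    as the input and is orthogonal to it.
    """
    if len(vec) % 2 != 0:
        raise ValueError("embedding dimension must be even")
    out: List[float] = []
    for i in range(0, len(vec), 2):
        x, y = vec[i], vec[i + 1]
        out.extend([-y, x])
    return out


def embed_with_marks(
    embeddings: Sequence[Sequence[float]],
    is_marked: Sequence[bool],
) -> List[List[float]]:
    """Rotate only the embeddings of marked tokens; copy the rest unchanged."""
    if len(embeddings) != len(is_marked):
        raise ValueError("need exactly one mark per token embedding")
    return [
        rotate_90(emb) if marked else list(emb)
        for emb, marked in zip(embeddings, is_marked)
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence: three tokens with 4-dimensional embeddings,
    # where only the last token is marked.
    token_embeddings = [[1.0, 0.0, 0.0, 2.0], [0.5, 0.5, 1.0, 1.0], [3.0, 4.0, 0.0, 1.0]]
    marks = [False, False, True]
    for row in embed_with_marks(token_embeddings, marks):
        print(row)
```

In this sketch the same function would be applied both during instruction-tuning and at inference, so the model always sees marked tokens in the rotated representation.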

9 likes · 1 reply · 6 retweets