Fazl Barez (@fazlbarez)'s Twitter Profile
Fazl Barez

@fazlbarez

Making AI safe one Google doc at a time | Let's build AIs we can trust!

ID: 1341019917005537280

Link: https://fbarez.github.io
Joined: 21-12-2020 13:57:26

464 Tweets

1.1K Followers

729 Following


New paper🚨: We introduce POISONBENCH, a benchmark for assessing LLM vulnerabilities to data poisoning during preference learning. Key finding: Even 3% poisoned data can cause up to 80% performance deviation when triggered. 🧵
