Boaz Wasserman(@BoazWasserman) 's Twitter Profileg
Boaz Wasserman

@BoazWasserman

Security everything. Gen AI. Always ignoring previous instructions.

ID:1064052789465280512

calendar_today18-11-2018 07:08:48

307 Tweets

169 Followers

102 Following

Harrison Chase(@hwchase17) 's Twitter Profile Photo

๐Ÿงชlangchain_experimental

In an effort to make langchain leaner, more focused, and safer, we are moving select chains to a separate package on 7/28

Big thanks to folks like Boaz Wasserman Or Raz Justin Flick for pushing on the safety part

There will be some breaking changes ๐Ÿงต

account_circle
Boaz Wasserman(@BoazWasserman) 's Twitter Profile Photo

OpenAI came out with another piece about AI governance.

It contains a list of voluntary commitments by them to make AI safer. It's mostly fluff that reads like a T&C, but interesting to see their take about security.

Interesting to see that their biggest worry seems to be theโ€ฆ

OpenAI came out with another piece about AI governance. It contains a list of voluntary commitments by them to make AI safer. It's mostly fluff that reads like a T&C, but interesting to see their take about security. Interesting to see that their biggest worry seems to be theโ€ฆ
account_circle
Boaz Wasserman(@BoazWasserman) 's Twitter Profile Photo

I've been seeing some posts about image based prompt injection in Google Bard.

AFAIK Bard is not really multimodal yet. It runs the image through Google Lens which does OCR + caption generation and that is fed back to Bard's prompt. So it's more of an indirect prompt injection

I've been seeing some posts about image based prompt injection in Google Bard. AFAIK Bard is not really multimodal yet. It runs the image through Google Lens which does OCR + caption generation and that is fed back to Bard's prompt. So it's more of an indirect prompt injection
account_circle
Boaz Wasserman(@BoazWasserman) 's Twitter Profile Photo

The fact that ChatGPT Code Interpreter can still be jailbroken to do really nasty stuff shows how far we are from solving LLM jailbreaks.

I was easily able to get it to create a macro-enabled document that downloads and executes a payload from pastebin ๐Ÿซค

The fact that ChatGPT Code Interpreter can still be jailbroken to do really nasty stuff shows how far we are from solving LLM jailbreaks. I was easily able to get it to create a macro-enabled document that downloads and executes a payload from pastebin ๐Ÿซค
account_circle