juliette pluto 🌌 ICLR 2025 (@foundjuliette) Twitter Tweets • TwiCopy

juliette pluto 🌌 ICLR 2025

@foundjuliette


cyclist, shapeshifter, typo-generator. ML security @GoogleDeepMind. views mine.

ID: 2604325361

Link: https://jul.sh

Joined: 12-06-2014 18:10:45

2,2K Tweet

5,5K Followers

608 Following

juliette pluto 🌌 ICLR 2025

@foundjuliette

2 years ago

Prediction: unless patched, GPT's tendency to say delve will be assimilated into North American style English. Soon it won't stand out as odd anymore. More broadly, RLHF'ed LLMs will shape cultural norms in unexpected ways.

Likes: 3

Replies: 1

Retweets: 0

Jacob Austin

@jacobaustin132

2 years ago

This is something I've worked on for a while! You can save the activations of one LLM call and reuse them for a follow-up that overlaps with the first. This means asking a question about a big codebase can take 30 seconds the first time and 1s after that!

Likes: 435

Replies: 14

Retweets: 47

juliette pluto 🌌 ICLR 2025

@foundjuliette

2 years ago

prediction: when all is said and done, Elon Musk's $42B purchase of Twitter will be seen as a bargain. Not because of the platform's economic success, but because of its outsize sociopolitical influence.

Likes: 6

Replies: 0

Retweets: 0

Colin McCarthy

@us_stormwatch

2 years ago

12-hour timelapse of American Airlines, Delta, and United plane traffic after what was likely the biggest IT outage in history forced a nationwide ground stop of the three airlines.

Likes: 101,101K

Replies: 1,1K

Retweets: 20,20K

evan loves worf

@esjesjesj

2 years ago

Here’s a video of his brother admitting it

Likes: 13,13K

Replies: 35

Retweets: 1,1K

rohit

@krishnanrohit

a year ago

Phenomenal response, from the founder of Deepseek, on moats

Likes: 940

Replies: 27

Retweets: 120

juliette pluto 🌌 ICLR 2025

@foundjuliette

a year ago

Bearish on Cursor (closed source, send your code to their servers, forced subscription) Bullish on Zed (open source, fast, private, existing anthropic partnership, supports API keys)

Likes: 6

Replies: 0

Retweets: 0

Josh Engels

@joshaengels

a year ago

1/6: A recent paper shows that LLMs are "self aware": when trained to exhibit a behavior like "risk taking", LLMs self report being risky. In a recent blog post, we explore what's happening here: some self awareness behaviors are caused by a simple learned steering vector!🧵

Likes: 201

Replies: 3

Retweets: 36

Dan Allison

@danallison

a year ago

One failure mode that I’ve repeatedly fallen into is thinking “surely someone smarter than me must have already figured this out” when in fact no one has.

Likes: 2,2K

Replies: 26

Retweets: 153

Ilia Shumailov🦔

@iliaishacked

a year ago

Our new Google DeepMind paper, "Lessons from Defending Gemini Against Indirect Prompt Injections," details our framework for evaluating and improving robustness to prompt injection attacks.


Likes: 169

Replies: 4

Retweets: 35

juliette pluto 🌌 ICLR 2025

@foundjuliette

9 months ago

even if model "neutrality" were to mean perfectly representing the views & biases of the population in the model, the result would be to cement the status quo.

Likes: 7

Replies: 0

Retweets: 0

François Chollet

@fchollet

9 months ago

Officially validated IMO gold medal, purely via search in token space, achieved in 4.5 hrs (unclear at what compute cost). The solutions read nicely as well deepmind.google/discover/blog/…

Likes: 1,1K

Replies: 29

Retweets: 175

roon

@tszzl

7 months ago

correct me if im wrong but it seems like: - the theme of the Dan Wang book, and the general elite consensus now is that “industrial process” is a technology that lives in the heads of people and that it was a mistake to let so much “low value” industry be offshored due to the

Likes: 2,2K

Replies: 152

Retweets: 251

juliette pluto 🌌 ICLR 2025

@foundjuliette

7 months ago

This chart is potentially misleading. It compares the latest sonnet model to older models (from March/April). Also, attacks in this data set were optimized against those other models, but not against sonnet 4.5 (!). It would likely do worse against tailored attacks.

Likes: 0

Replies: 0

Retweets: 0

juliette pluto 🌌 ICLR 2025

@foundjuliette

5 months ago

Likes: 11

Replies: 0

Retweets: 3