Ubik (@mr_ubik) Twitter Tweets • TwiCopy

Alex Turner

10 months ago

The "sleeper agent" terminology is hyperbolic and unfortunate IMO. Crying wolf. Should have reserved such an aggressive title for *actually finding dangerous sleeper agents*. But hey, it got a lot of attention


John David Pressman

@jd_pressman

10 months ago

"The problem with utilitarianism is that utilitarians think utility is the only thing that matters. The problem with consequentialism is that many consequentialists forget that utility is a thing that matters at all." - deepseek/deepseek-v3-base


COSSACKGUNDI

@cossackgundi

10 months ago

UK nationals setting fires for Wagner, talking to Russian bots, claiming IRA ties, and we’re still calling this “just crime.” Russia has been waging its war against the West for years; we just haven't caught up to it.


Ben Landau-Taylor

@benlandautaylor

10 months ago

Oh so we eradicated a horrible parasite with a massive technopunk operation to engineer, breed, and transport hundreds of millions of sterile screwworms, but now we're getting it back because someone fucked up the basic logistics


Richard Ngo

@richardmcngo

10 months ago

In my head I’ve started referring to political quadrants in terms of properties of their preferred coordination networks. Top two are centralized. Bottom two are distributed. Left two are symmetric (aka egalitarian). Right two are asymmetric.


Il Foglio

@ilfoglio_it

9 months ago

The PD too, like the M5S, "does not rule out" going back to buying gas from Russia. In their Green Paper the Democrats consider the “resumption of flows from Russia” in place of American LNG - @LucianoCapone and Carlo Stagnaro 🏴󠁧󠁢󠁥󠁮󠁧󠁿🇺🇦 ilfoglio.it/economia/2025/…


L'Avvocato dell'Atomo/The Atomic Advocate

@avvocatoatomico

9 months ago

Rather than look at nuclear power, the PD is perfectly willing to finance a Nazi-fascist dictatorship that is waging a war aimed at cultural genocide. And of course, screw decarbonization. Better global warming and Putin than 15 nuclear reactors, no contest, right?


Neel Nanda

@neelnanda5

9 months ago

I've resolved this positively: 2 papers convincingly show sparse autoencoders beating baselines on real tasks: Hypothesis Generation & Auditing LLMs. SAEs shine when you don't know what you're looking for, but lack precision. Sometimes the right tool for the job, sometimes not.


Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)

@teortaxestex

9 months ago

China has a ton of cracked AI labs starved of compute, and GPU-rich megacorps with Meta-tier managerial issues. The US has infinite compute and cracked people siloed in underperforming labs so that they don't contribute to the competitor's effort. EU has eurocrats frustrating


Nathan Lambert

@natolambert

9 months ago

I bet pretty soon a Chinese research org drops an LLM scaling laws for RL paper. Closed frontier labs have definitely done this and won't share it; academics haven't mastered the data + infra tweaks yet.


ℏεsam

@hesamation

9 months ago

Fuck ML tutorials. This is a collection of 300 ML system design case studies from the real world, from Stripe, Spotify, Netflix, Meta, etc. Perfect for interviews and to learn how it’s done on the battlefield. Wish there was a similar thing for agents!


Ege Erdil

@egeerdil2

9 months ago

this screenshot from GPT-5 livestream has to be among the worst chart crimes of the century


Apollo Research

@apolloaievals

9 months ago

We've evaluated GPT-5 before release. GPT-5 is less deceptive than o3 on our evals. GPT-5 mentions that it is being evaluated in 10-20% of our evals and we find weak evidence that this affects its scheming rate (e.g. "this is a classic AI alignment trap").


Simon Willison

@simonw

9 months ago

This model is pretty sassy, later in the thinking trace it said: Self-check: Am I being too pedantic? Nah—if someone asks for impossible things, it’s better to gently correct than make fake art that could confuse them.


Aaron Rupar

@atrupar

7 months ago

There is no world in which it is normal for the president to publicly call upon his attorney general to hurry up and prosecute his political foes. It’s like the Watergate tapes but posted on social media. Let’s get a grip on what’s happening here.


Richard Hanania

@richardhanania

7 months ago

Erika: Find Jesus. Forgive your enemies. <crowd cheers> Trump, following the widow, giving the keynote: No, I’m overruling Christianity, don’t forgive your enemies and hate them. <crowd cheers> What a perfect encapsulation of the entire MAGA movement.


Andrej Karpathy

@karpathy

6 months ago

I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter. The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language


Jifan Zhang

@jifan_zhang

6 months ago

New research paper with Anthropic and Thinking Machines. AI companies use model specifications to define desirable behaviors during training. Are model specs clearly expressing what we want models to do? And do different frontier models have different personalities? We generated


Vlad Tenev

@vladtenev

5 months ago

I think our definition of mathematics will fundamentally change. Mathematicians used to spend their time solving complex equations, and automation freed them up to do more abstract creative work. But despite all the advances in computers, communications, and AI, math is still


Nathan Lambert

@natolambert

4 months ago

Open models year in review. What a year! We're back with an updated open model builder tier list, our top models of the year, and our predictions for 2026. First, the winning models: 1. DeepSeek R1 (DeepSeek): Transformed the AI world 2. Qwen 3 Family (Alibaba Group): The new
