Ubik (@mr_ubik) 's Twitter Profile
Ubik

@mr_ubik

Senior ML Engineer/Scientist. Tech Optimist ▶️. Meditation noob. Lover of 🐧🐧, caffeine (matcha and v60), and memes. Mugunone.

ID: 547614333

07-04-2012 11:24:20

11,11K Tweet

445 Followers

1,1K Following

Alex Turner (@turn_trout) 's Twitter Profile Photo

The "sleeper agent" terminology is hyperbolic and unfortunate IMO. Crying wolf. Should have reserved such an aggressive title for *actually finding dangerous sleeper agents*. But hey, it got a lot of attention

John David Pressman (@jd_pressman) 's Twitter Profile Photo

"The problem with utilitarianism is that utilitarians think utility is the only thing that matters. The problem with consequentialism is that many consequentialists forget that utility is a thing that matters at all." - deepseek/deepseek-v3-base

COSSACKGUNDI (@cossackgundi) 's Twitter Profile Photo

UK nationals setting fires for Wagner, talking to Russian bots, claiming IRA ties, and we’re still calling this “just crime”. Russia has been waging its war against the West for years; we just haven't caught up to it.

Ben Landau-Taylor (@benlandautaylor) 's Twitter Profile Photo

Oh so we eradicated a horrible parasite with a massive technopunk operation to engineer, breed, and transport hundreds of millions of sterile screwworms, but now we're getting it back because someone fucked up the basic logistics

Richard Ngo (@richardmcngo) 's Twitter Profile Photo

In my head I’ve started referring to political quadrants in terms of properties of their preferred coordination networks. Top two are centralized. Bottom two are distributed. Left two are symmetric (aka egalitarian). Right two are asymmetric.

Il Foglio (@ilfoglio_it) 's Twitter Profile Photo

The PD, like the M5S, also "does not rule out" going back to buying gas from Russia. In their Green Paper the Democrats consider the “resumption of flows from Russia” in place of American LNG - @LucianoCapone and Carlo Stagnaro 🏴󠁧󠁢󠁥󠁮󠁧󠁿🇺🇦 ilfoglio.it/economia/2025/…

L'Avvocato dell'Atomo/The Atomic Advocate (@avvocatoatomico) 's Twitter Profile Photo

Just to avoid looking at nuclear power, the PD is more than willing to finance a Nazi-fascist dictatorship that is waging a war aimed at cultural genocide. And obviously, fuck decarbonization. Better global warming and Putin than 15 nuclear reactors, how could you even compare?

Neel Nanda (@neelnanda5) 's Twitter Profile Photo

I've resolved this positively: 2 papers convincingly show sparse autoencoders beating baselines on real tasks: Hypothesis Generation & Auditing LLMs. SAEs shine when you don't know what you're looking for, but lack precision. Sometimes the right tool for the job, sometimes not.

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) (@teortaxestex) 's Twitter Profile Photo

China has a ton of cracked AI labs starved of compute, and GPU-rich megacorps with Meta-tier managerial issues. The US has infinite compute and cracked people siloed in underperforming labs so that they don't contribute to the competitor's effort. EU has eurocrats frustrating

Nathan Lambert (@natolambert) 's Twitter Profile Photo

I bet pretty soon a Chinese research org drops an LLM scaling laws for RL paper. Closed frontier labs have definitely done this and won't share it, academics haven't mastered the data + infra tweaks yet.

ℏεsam (@hesamation) 's Twitter Profile Photo

Fuck ML tutorials. This is a collection of 300 ML system design case studies in real world, from Stripe, Spotify, Netflix, Meta, etc. Perfect for interviews and to learn how it’s done in the battlefield. Wish there was a similar thing for agents!

Apollo Research (@apolloaievals) 's Twitter Profile Photo

We've evaluated GPT-5 before release. GPT-5 is less deceptive than o3 on our evals. GPT-5 mentions that it is being evaluated in 10-20% of our evals and we find weak evidence that this affects its scheming rate (e.g. "this is a classic AI alignment trap").

Simon Willison (@simonw) 's Twitter Profile Photo

This model is pretty sassy, later in the thinking trace it said: Self-check: Am I being too pedantic? Nah—if someone asks for impossible things, it’s better to gently correct than make fake art that could confuse them.

Aaron Rupar (@atrupar) 's Twitter Profile Photo

There is no world in which it is normal for the president to publicly call upon his attorney general to hurry up and prosecute his political foes. It’s like the Watergate tapes but posted on social media. Let’s get a grip on what’s happening here.

Richard Hanania (@richardhanania) 's Twitter Profile Photo

Erika: Find Jesus. Forgive your enemies. <crowd cheers> Trump, following the widow, giving the keynote: No, I’m overruling Christianity, don’t forgive your enemies and hate them. <crowd cheers> What a perfect encapsulation of the entire MAGA movement.

Andrej Karpathy (@karpathy) 's Twitter Profile Photo

I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter. The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language

Jifan Zhang (@jifan_zhang) 's Twitter Profile Photo

New research paper with Anthropic and Thinking Machines. AI companies use model specifications to define desirable behaviors during training. Are model specs clearly expressing what we want models to do? And do different frontier models have different personalities? We generated

Vlad Tenev (@vladtenev) 's Twitter Profile Photo

I think our definition of mathematics will fundamentally change. Mathematicians used to spend their time solving complex equations, and automation freed them up to do more abstract creative work. But despite all the advances in computers, communications, and AI, math is still

Nathan Lambert (@natolambert) 's Twitter Profile Photo

Open models year in review. What a year! We're back with an updated open model builder tier list, our top models of the year, and our predictions for 2026. First, the winning models: 1. DeepSeek R1 (DeepSeek): Transformed the AI world 2. Qwen 3 Family (Alibaba Group): The new
