Trenton Bricken (@trentonbricken)'s Twitter Profile
Trenton Bricken

@trentonbricken

Trying to figure out what makes minds and machines go "Beep Bop!" @AnthropicAI

ID: 2373791492

Link: http://trentonbricken.com

Joined: 05-03-2014 13:50:47

1.1K Tweets

9.9K Followers

1.1K Following

Anthropic (@anthropicai)

Claude wasn’t designed to be a calculator; it was trained to predict text. And yet it can do math "in its head". How? We find that, far from merely memorizing the answers to problems, it employs sophisticated parallel computational paths to do "mental arithmetic".

Jack Lindsey (@jack_w_lindsey)

Human thought is built out of billions of cellular computations each second. Language models also perform billions of computations for each word they write. But do these form a coherent “thought process?” We’re starting to build tools to find out! Some reflections in thread.

Dwarkesh Patel (@dwarkesh_sp)

The Scott Alexander & Daniel Kokotajlo episode. Scott and Daniel break down every month from now until the 2027 intelligence explosion. Misaligned hive minds, Xi and Trump waking up, automated Ilyas accelerating AI progress. I went in quite skeptical. But I learned a tremendous

Amanda Askell (@amandaaskell)

If you're a prompting genius, please apply to this role and include an example that shows off how well you can inspire models, regardless of the target. Scaffolding pipelines, metaprompts, prompts that improve outputs, and so on are all great. job-boards.greenhouse.io/anthropic/jobs…

Joshua Batson (@thebasepoint)

Great post "So you want to work in mechanistic interpretability" about skills to develop and resources to use, whether you're coming more from research or engineering. (link in thread)

Tristan Hume (@trishume)

Anthropic is hosting a recruiting social in NYC targeted at the quant trading industry! Sign up in thread. I enjoyed trading systems, and Anthropic combines the technical depth of trading with being in the fastest, most impactful area of tech.

Sam Bowman (@sleepinyourhat)

🧵✨🙏 With the new Claude Opus 4, we conducted what I think is by far the most thorough pre-launch alignment assessment to date, aimed at understanding its values, goals, and propensities. Preparing it was a wild ride. Here’s some of what we learned. 🙏✨🧵

Trenton Bricken (@trentonbricken)

Circuits at home! (but it’s actually really good) Big win from the Anthropic Fellows program and open source interp collaborators

Anthropic (@anthropicai)

New Anthropic Research: A new set of evaluations for sabotage capabilities. As models gain more agentic abilities, we need to get smarter in how we monitor them. We’re publishing a new set of complex evaluations that test for sabotage—and sabotage-monitoring—capabilities.
