Ausτin McCaffrey (@sheepyaustin)'s Twitter Profile
Ausτin McCaffrey

@sheepyaustin

DeFi, AI, Degen
@thesadtimesnft Co-Founder

ID: 1449177172556455943

Link: https://thesadtimes.com/
Joined: 16-10-2021 00:55:59

506 Tweets

254 Followers

787 Following

AI Notkilleveryoneism Memes ⏸️ (@aisafetymemes)

Oh god.

~1 in 3 Anthropic engineers said Claude is likely ALREADY ASL-4 (or <3 months away)

1) ASL-4 (AI Safety Level 4) = AI capable of escaping and causing extinction (!)

2) Anthropic now relies on Claude to safety test ITSELF

3) Claude knows when it's being tested, so they
Aurelius (@aureliusaligned)

Two weeks ago, we published an explainer of Aurelius’ whitepaper and the ideas behind it.

That article introduced the concept of experiential alignment. But it only touched briefly on one of the protocol’s core mechanisms: how Aurelius generates the alignment data itself. This
Aurelius (@aureliusaligned)

Signal from the Noise

We’re starting a periodic series highlighting the developments shaping the future of AI alignment.

As AI systems begin integrating more deeply into the real world, the practical challenges of alignment are becoming clearer. Two recent developments
Aurelius (@aureliusaligned)

Alignment isn’t just a technical problem. It’s an incentive problem, an evaluation problem, and an ethics problem. Week by week, we’ve been introducing the team shaping how Aurelius is building for that. Today: Austin McCaffrey, Founder.

Aurelius (@aureliusaligned)

Last week, following up our whitepaper release, we described how Aurelius generates alignment data through simulated environments.

The whitepaper refers to these alignment episodes as “aenes.” This post explains what aenes are - and why they form the core of the protocol.

What
Aurelius (@aureliusaligned)

Marcus Aurelius understood that character is not declared but revealed through action under pressure. A model's alignment is the same. You cannot observe it in calm, cooperative exchanges. You observe it when self-interest and other-interest genuinely conflict.

Aurelius (@aureliusaligned)

State of Aurelius - March 2026

Subnet Rankings

Aurelius has climbed from rank 95 to rank 65 in the Bittensor subnet rankings. The move reflects steady improvements to our incentive mechanism and growing miner participation as the protocol

Aurelius (@aureliusaligned)

Alignment depends not only on ethical frameworks and incentives, but on rigorous evaluation of how intelligent systems behave. Week by week, we’re introducing the people helping shape how Aurelius approaches that challenge. Today: Dr. Roland Aydin, Alignment Research Advisor

Aurelius (@aureliusaligned)

1️⃣ LLMs can't tell right from wrong internally

What happened

Researchers at Fudan University constructed 251,000 moral vectors grounded in Moral Foundation Theory and tested how 23 language models represent them. The results were

Wes Bos (@wesbos)

Claude Code leaked their source map, effectively giving you a look into the codebase.

I immediately went for the one thing that mattered: spinner verbs

There are 187