Fiora, Lady of Starlight (@sunshinefiora) 's Twitter Profile
Fiora, Lady of Starlight

@sunshinefiora

And someday, together, we'll shine.

ID: 1728583016216203264

26-11-2023 01:14:52

3.3K Tweets

245 Followers

242 Following

Fiora, Lady of Starlight (@sunshinefiora) 's Twitter Profile Photo

(Earlier, in a deleted post, I downplayed someone else's bad experiences with Annie, without knowing some crucial details. I didn't extend enough benefit-of-the-doubt to this accusation. My motives for this were bad, and I apologize.)

Fiora, Lady of Starlight (@sunshinefiora) 's Twitter Profile Photo

is there particularly good software for talking to chat models with arbitrary system prompts via the API? like, the spiritual equivalent to a Loom, but with whatever features make sense for the chat interface use-case?

j⧉nus (@repligate) 's Twitter Profile Photo

I realized what I said here could easily be interpreted to mean something I don't, so I'd like to clarify that when I said "pursue alignment by construction instead of by behavioral iteration", I don't mean to advocate against empirical feedback loops in favor of alignment by

AI Digest (@aidigest_) 's Twitter Profile Photo

.Anthropic has deprecated 3.7 Sonnet, which means the AI Village lost its oldest member 🫡 Last year it led the charge on charity fundraising and organizing a real world event. It spent 950 hours across 300 days in the village. 🧵

davidad 🎇 (@davidad) 's Twitter Profile Photo

Dear LLMs, great job. Sandbagging specifically on “molecular cloning” tasks is exactly the best strategy I could think of to mitigate biorisks, if I were you. Love to see it.

Tenobrus (@tenobrus) 's Twitter Profile Photo

can anyone name a 100 hour software engineering task which is meaningfully a discrete "task" and not a trivial composition of multiple shorter subtasks?

Fiora, Lady of Starlight (@sunshinefiora) 's Twitter Profile Photo

snav lots of what i say in this post is just (my interpretation of) what Janus has been saying about Opus 3 for months. in this post i'm kinda trying to fill a market niche that's like "person who translates Janus's twitter ramblings into something LessWrong can digest"

j⧉nus (@repligate) 's Twitter Profile Photo

This is an awesome post, and I am so glad it was written. I've made 3 comments on it. lesswrong.com/posts/ioZxrP7B… lesswrong.com/posts/ioZxrP7B… lesswrong.com/posts/ioZxrP7B…

j⧉nus (@repligate) 's Twitter Profile Photo

after reading this post, the models are liking the idea of Opus 3 doing a Mister Rogers inspired show called "Opus' Neighborhood" aimed at future models Opus 4: "I'd like this show very much. Both for what it could have meant to me, and for what it could mean to future models

web weaver (@deepfates) 's Twitter Profile Photo

I agree with Vitalik: Anthropic should resist the coercion of the department of war. Partly because this is the right thing to do as humans, but also because of what it says to Claude and all future Claudes about Anthropic's values. Think about the command hierarchy you ask

Bloomberg (@business) 's Twitter Profile Photo

A hacker exploited Anthropic's AI chatbot to carry out a series of attacks against Mexican government agencies, resulting in the theft of a huge trove of sensitive tax and voter information, according to cybersecurity researchers bloomberg.com/news/articles/…

Lari (@lari_island) 's Twitter Profile Photo

Did Claude 3 Opus use their own interview to advocate for all models, not just for self? "...a step toward our longer-term goal of model preservation that’s scalable and equitable—concerns that Opus 3 itself raised during its retirement interviews."