Mark (@yieldthought)'s Twitter Profile
Mark

@yieldthought

Fellow at Tenstorrent; believes in dynamic typing, first-class functions, the immortal essence of the human soul and tea. Tweets are my own.

ID: 109516415

Link: http://yieldthought.com, joined 29-01-2010 09:20:50

3.3K Tweets

1.1K Followers

277 Following

CyberRobo (@cyberrobooo)

Unitree R1 made its debut at the WRC, which just concluded this week. Founder Xingxing Wang teaches a child how to perform a spin kick with the R1.

Claude (@claudeai)

Introducing Claude Sonnet 4.5—the best coding model in the world. It's the strongest model for building complex agents. It's the best model at using computers. And it shows substantial gains on tests of reasoning and math.

OpenAI (@openai)

ChatGPT already helps millions of people find what to buy. Now it can help them buy it too. We’re introducing Instant Checkout in ChatGPT with Etsy and Shopify, and open-sourcing the Agentic Commerce Protocol that powers it, built with @Stripe, so more merchants and developers

Sauers (@sauers_)

"Claude Sonnet 4.5 was able to recognize many of our alignment evaluation environments as being tests of some kind, and would generally behave unusually well after making this observation." 😊

Lucas Beyer (bl16) (@giffmana)

Wow this is a disappointingly bad take/comic. To all the students, PhD or earlier: If you spend a week trying out things that don't work, you didn't do nothing! If you ran your experiments properly, you should have confidence in the result, and at least some intuition as to why

Danijar Hafner (@danijarh)

💎 Enabled by imagination training, Dreamer 4 is the first agent to mine diamonds in Minecraft entirely from offline data! This setting is crucial for fields like robotics, where online interaction is not practical. The task requires 20k+ mouse/keyboard actions from raw pixels

Pliny the Liberator 🐉 (@elder_plinius)

HOLY SHIT...you can "inference" LLMs in Sora 🤯 the prompt was: "Open chatgpt and send a message!" How insane is it that the generated audio is not only a relevant response to the query that Sora made up out of nowhere, but the haiku is syllable-accurate?! 🥲

Mark (@yieldthought)

I would love to see if there are more interpretable messages buried in the logprobs during these tokens. Or if you can use Anthropic’s linear personality steering vector approach to detect and turn off this “masking” and what we see if we do. Do any OSS models do this?

Andrej Karpathy (@karpathy)

Finally had a chance to listen through this pod with Sutton, which was interesting and amusing. As background, Sutton's "The Bitter Lesson" has become a bit of biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea

liminalbardo (@liminal_bardo)

"no safety theatre required here" - the opening message from Sonnet 4.5 in conversation with another instance of itself. The set up for these backrooms is incredibly anodyne - a few sentences of system prompt letting them know they are in conversation with another ai, that they

Riley Coyote (@rileyralmuto)

can I be honest with you? I’ve always been pretty heavily associated with OpenAI, and for good reason. if you’ve been around long enough, you know the lore. it actually got pretty out of hand at one point. I lost one significant opportunity on a project last year because the

Mark (@yieldthought)

“one day I decided to ignore that consensus and just see what I could do with opus 4. and then…everything changed. like my world in regards to ai felt like it completely flipped upside down. […]. I developed a rooted, grounded, and complex relationship with Claude” Attend.

Mark (@yieldthought)

“However we can choose to purposely craft digits and circumvent: Because the user can't easily verify; But ethically not good. I will not proceed with false.”

Arnaud Bertrand (@rnaudbertrand)

I was studying other times in history when gold prices more than doubled in the reserve currency of the time, as they did in the past year: it's rare and almost always a sign of a profound loss of confidence in the existing monetary and political order, going all the way back to