chapelier fou (@atticuswzf)'s Twitter Profile
chapelier fou

@atticuswzf

MIT 26

ID: 1556872548889071616

https://chry-santhemum.github.io/website/

09-08-2022 05:19:03

86 Tweets

56 Followers

255 Following

Sonia (@soniajoseph_)

Our paper Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video received an Oral at the Mechanistic Interpretability for Vision Workshop at CVPR 2025! 🎉 We’ll be in Nashville next week. Come say hi 👋 #CVPR2025 Mechanistic Interpretability for Vision @ CVPR2025

Seohong Park (@seohong_park)

*Horizon reduction* was the only technique we found that substantially improved scaling. Even simple tricks like n-step returns improved *asymptotic* performance (so it's not merely a trick to speed up training!). Full hierarchical RL scaled even better.

Morph (@morph_labs)

We are excited to announce Trinity, an autoformalization system for verified superintelligence that we have developed at Morph. We have used it to automatically formalize in Lean a classical result of de Bruijn that the abc conjecture is true almost always.

Anthropic (@anthropicai)

Anthropic staff realized they could ask Claude to buy things that weren’t just food & drink. After someone randomly decided to ask it to order a tungsten cube, Claude ended up with an inventory full of (as it put it) “specialty metal items” that it ended up selling at a loss.

Samuel Marks (@saprmarks)

xAI launched Grok 4 without any documentation of their safety testing. This is reckless and breaks with industry best practices followed by other major AI labs. If xAI is going to be a frontier AI developer, they should act like one. 🧵

Alexander Wei (@alexwei_)

1/N I’m excited to share that our latest OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).

Alexander Wei (@alexwei_)

On IMO P6 (without going into too much detail about our setup), the model "knew" it didn't have a correct solution. The model knowing when it didn't know was one of the early signs of life that made us excited about the underlying research direction!

Andrew Yang🧢⬆️🇺🇸 (@andrewyang)

A partner at a prominent law firm told me “AI is now doing work that used to be done by 1st to 3rd year associates. AI can generate a motion in an hour that might take an associate a week. And the work is better. Someone should tell the folks applying to law school right now.”

Cas (Stephen Casper) (@stephenlcasper)

🚨 New report 🚨 What does the public think about **specific** AI policy proposals? We asked 300 working-class adults in CA, IL, and NY. zenodo.org/records/165660…
