FallMonkey (@fallmonkey) Twitter Tweets • TwiCopy

John Schulman

7 months ago

Barret Zoph and I recently gave a talk at Stanford on post-training and our experience working together on ChatGPT. Unfortunately the talk wasn't recorded, but here are the slides: docs.google.com/presentation/d…. (If you have a recording, please let me know!)

Likes: 639 · Replies: 10 · Reposts: 77

Yasmine

@cyousakura

7 months ago

🎉 Introducing Open Reasoner Zero 🚀 Performance: Matches DeepSeek R1-Zero (32B) in just 1/30 steps! 📚 Full training strategies & technical paper 💻 100% open-source: Code + Data + Model ⚖️ MIT licensed - Use it your way! 🌊 Let the Reasoner-Zero tide rise! 🚢 1/n

Likes: 862 · Replies: 27 · Reposts: 158

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)

@teortaxestex

7 months ago

Fair enough. Here's my compilation of all results from relevant sources on AIME 2025 performance of Grok and OpenAI models, plus extrapolations of cons@64 for DeepSeek models and o1. I think this is significantly easier to understand than chart crimes of these frontier labs.

Likes: 696 · Replies: 26 · Reposts: 79
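
For readers unfamiliar with the cons@64 metric mentioned in the post above: it is generally understood as consensus (majority-vote) accuracy over 64 sampled answers per problem. The following is a minimal sketch under that assumption; the function names `consensus_answer` and `cons_at_k` and the toy data are hypothetical and are not taken from any of the cited sources.

```python
from collections import Counter


def consensus_answer(samples):
    """Return the most frequent answer among the sampled answers.

    Ties are resolved in favor of the answer that was seen first,
    which is how Counter.most_common orders equal counts.
    """
    return Counter(samples).most_common(1)[0][0]


def cons_at_k(per_problem_samples, references):
    """Fraction of problems whose majority-vote answer equals the reference.

    per_problem_samples: one list of k sampled answers per problem.
    references: the reference answer for each problem, in the same order.
    """
    correct = sum(
        consensus_answer(samples) == ref
        for samples, ref in zip(per_problem_samples, references)
    )
    return correct / len(references)


# Toy illustration with k = 3 samples per problem (real cons@64 uses k = 64).
toy_samples = [["42", "42", "17"], ["7", "8", "8"]]
toy_refs = ["42", "7"]
score = cons_at_k(toy_samples, toy_refs)
```

In the toy data the first problem's majority answer matches its reference and the second's does not, so half of the problems count as correct.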

DeepSeek

@deepseek_ai

7 months ago

🚀 Day 0: Warming up for #OpenSourceWeek! We're a tiny team DeepSeek exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency. These humble building blocks in our online service have been documented,

Likes: 21,21K · Replies: 1,1K · Reposts: 2,2K

DeepSeek

@deepseek_ai

7 months ago

🚀 Day 1 of #OpenSourceWeek: FlashMLA Honored to share FlashMLA - our efficient MLA decoding kernel for Hopper GPUs, optimized for variable-length sequences and now in production. ✅ BF16 support ✅ Paged KV cache (block size 64) ⚡ 3000 GB/s memory-bound & 580 TFLOPS

Likes: 10,10K · Replies: 562 · Reposts: 1,1K

xjdr

@_xjdr

7 months ago

It would take a long ass article to articulate this properly but this is not a vageupoast. I have spent the last few months working on some very hard problems (more on that soon). I've been using a combination of R1 and DeepResearch to build and formalize the ideas and proofs.

Likes: 623 · Replies: 32 · Reposts: 21

DeepSeek

@deepseek_ai

6 months ago

🚀 Day 6 of #OpenSourceWeek: One More Thing – DeepSeek-V3/R1 Inference System Overview Optimized throughput and latency via: 🔧 Cross-node EP-powered batch scaling 🔄 Computation-communication overlap ⚖️ Load balancing Statistics of DeepSeek's Online Service: ⚡ 73.7k/14.8k

Likes: 9,9K · Replies: 764 · Reposts: 1,1K

Zihan Wang - on RAGEN

@wzihanw

6 months ago

Bro, your post suggests many influential people are never aware of X's several key features: > Highlight - marks important tweets amid sheetposts > X Lists - tracks themed accounts across areas, more organized than following > X Explore - summarized what's happening around >

Likes: 79 · Replies: 1 · Reposts: 9

near

@nearcyan

6 months ago

my 2025 ai twitter experience

Likes: 2,2K · Replies: 176 · Reposts: 345

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)

@teortaxestex

5 months ago

I've been saying that DeepSeek will expand from verifiable to general domains, and expected a paper. Here is that paper. Self-Principled Critique Tuning. rule-based online RL. Gemma-2 27b is enough to match R1. This is roughly what Google does for Gemma 3 and likely Geminis.

Likes: 453 · Replies: 5 · Reposts: 53

Nathan Lambert

@natolambert

4 months ago

A couple years of weekly analysis, frontier research, and writing a small book on RLHF was pretty much a long winded lead up to writing this blog post. If nothing else, read it as a favor to me. interconnects.ai/p/sycophancy-a…

Likes: 118 · Replies: 1 · Reposts: 15

kalomaze

@kalomaze

4 months ago

VR-CLI is an obscenely powerful RL objective that was mentioned in a paper that wasn't hyped to 1/10th of the degree it deserved. "oh, you can optimize the reasoning traces for next-token prediction in a way that generalizes WAY better..." ...casual bombshell implications.

Likes: 451 · Replies: 21 · Reposts: 41

xjdr

@_xjdr

4 months ago

the last week of launches has highlighted a few things for me: - Progress in LLMs has been amazing but incremental capability gains are clearly closer to log than linear while the corresponding cost for those gains is closer to exponential than linear. scale still works but the

Likes: 700 · Replies: 37 · Reposts: 62

kalomaze

@kalomaze

3 months ago

simple "LLM as a judge" protip if you prompt for something like "provide answers to the TRUE/FALSE rubric questions in order, followed by a one sentence justification" this will be worse than the justification coming *before* the TRUE/FALSE marker

Likes: 374 · Replies: 24 · Reposts: 7
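
The ordering tip in the post above can be sketched as a prompt template plus a parser. This is only an illustration, assuming a plain-text judge reply with one line per rubric question; the names `build_judge_prompt`, `parse_verdicts`, the `VERDICT:` marker, and the sample rubric are all hypothetical, and no particular model API is implied.

```python
import re

# Hypothetical rubric used only for illustration.
RUBRIC = [
    "The answer states a final numeric result.",
    "The answer shows its intermediate steps.",
]


def build_judge_prompt(answer, rubric):
    """Build a judge prompt that asks for the justification first
    and the TRUE/FALSE marker last, on the same line."""
    questions = "\n".join(f"{i}. {q}" for i, q in enumerate(rubric, start=1))
    return (
        "Evaluate the answer against each rubric question in order.\n"
        "For each question, write one line in exactly this form:\n"
        "<number>. <one-sentence justification> VERDICT: TRUE or FALSE\n\n"
        f"Rubric:\n{questions}\n\nAnswer:\n{answer}\n"
    )


# One reply line looks like: "2. No steps are shown. VERDICT: FALSE"
VERDICT_RE = re.compile(
    r"^\s*(\d+)\.\s*(.*?)\s*VERDICT:\s*(TRUE|FALSE)\s*$", re.MULTILINE
)


def parse_verdicts(reply):
    """Map each rubric question number to its boolean verdict."""
    return {
        int(number): verdict == "TRUE"
        for number, _justification, verdict in VERDICT_RE.findall(reply)
    }


prompt = build_judge_prompt("x = 42", RUBRIC)
sample_reply = (
    "1. The answer ends with a number. VERDICT: TRUE\n"
    "2. No steps are shown. VERDICT: FALSE"
)
verdicts = parse_verdicts(sample_reply)
```

Placing the marker at the end of each line means the judge has already written its reasoning by the time it commits to TRUE or FALSE, which is the ordering the post recommends.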

player401

@theplayer401

2 months ago

being asked how to experience the full version of the Kimi K2 API from our official platform. Simply visit platform.moonshot.ai, click on 'Console' and then 'Recharge.' You will receive a $5 voucher after your first successful payment. We have a clear rate limit schedule, and

Likes: 5 · Replies: 0 · Reposts: 2