Katrina Drozdov (Evtimova) (@stochasticdoggo)'s Twitter Profile
Katrina Drozdov (Evtimova)

@stochasticdoggo

AI researcher | PhD from @NYUDataScience | Bulgarian yogurt, prime numbers, and dogs bring me joy | she/her

ID: 904789658399322112

Link: https://kevtimova.github.io/ · Joined: 04-09-2017 19:33:59

337 Tweets

391 Followers

347 Following

Gabriele Berton (@gabriberton):

This simple PyTorch trick will cut your GPU memory use in half / double your batch size (for real). Instead of adding losses and then computing backward, it's better to compute the backward on each loss (which frees the computational graph). Results will be exactly identical.

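A minimal sketch of the pattern described above, assuming the total loss is a sum of terms from independent forward passes (here, chunks of a batch; the model, shapes, and chunk count are all illustrative, not from the tweet):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy setup, purely for illustration.
model = nn.Linear(1024, 1024)
data, target = torch.randn(256, 1024), torch.randn(256, 1024)

model.zero_grad()

# Memory-hungry version: summing keeps every chunk's graph alive
# until the single backward() call.
#   total = sum(F.mse_loss(model(c), t)
#               for c, t in zip(data.chunk(4), target.chunk(4)))
#   total.backward()

# Leaner version: backward() per loss. Gradients accumulate in .grad,
# and each chunk's computational graph is freed immediately, so peak
# memory holds one graph instead of four. By linearity of the gradient,
# the accumulated .grad matches the summed version exactly.
for c, t in zip(data.chunk(4), target.chunk(4)):
    loss = F.mse_loss(model(c), t)
    loss.backward()
```
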
Katrina Drozdov (Evtimova) (@stochasticdoggo):

The principle of least effort, from psychology, describes how we favor efficiency over effort. It aligns with System 1 (fast, intuitive) vs. System 2 (slow, deliberate) reasoning. AI faces a similar challenge: knowing when to rely on heuristics vs. deeper reasoning.

Andrew Ng (@andrewyng):

The buzz over DeepSeek this week crystallized, for many people, a few important trends that have been happening in plain sight: (i) China is catching up to the U.S. in generative AI, with implications for the AI supply chain. (ii) Open weight models are commoditizing the…

Hila Chefer (@hila_chefer):

VideoJAM is our new framework for improved motion generation from AI at Meta. We show that video generators struggle with motion because the training objective favors appearance over dynamics. VideoJAM directly addresses this **without any extra data or scaling** 👇🧵

Katrina Drozdov (Evtimova) (@stochasticdoggo):

I asked ChatGPT, Gemini, and Claude for a clever joke. They all gave me the same one. Either AI is merging into a hive mind… or humor has officially been solved mathematically!

Thomas Wolf (@thom_wolf):

I shared a controversial take the other day at an event and I decided to write it down in a longer format: I’m afraid AI won't give us a "compressed 21st century". The "compressed 21st century" comes from Dario's "Machines of Loving Grace" and if you haven’t read it, you probably…

Misha Laskin (@mishalaskin):

Today I’m launching Reflection AI with my friend and co-founder Ioannis Antonoglou. Our team pioneered major advances in RL and LLMs, including AlphaGo and Gemini. At Reflection, we're building superintelligent autonomous systems. Starting with autonomous coding.

NYU Center for Data Science (@nyudatascience):

CDS PhD Vlad Sobal and Courant PhD Wancong (Kevin) Zhang show that when good data is scarce, planning beats traditional reinforcement learning. With Kyunghyun Cho, Tim G. J. Rudner, and Yann LeCun. nyudatascience.medium.com/when-good-data…

Kyunghyun Cho (@kchonyc):

it's been more than a decade since KD (knowledge distillation) was proposed, and i've been using it all along .. but why does it work? too many speculations but no simple explanation. Sungmin Cha and i decided to see if we can come up with the simplest working description of KD in this work. we ended…

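For context, the classic KD objective the thread is asking about, in minimal form. This is the textbook formulation (Hinton et al., 2015), not the simplified description the new work proposes:

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T: float = 2.0):
    # Classic knowledge distillation: make the student's
    # temperature-softened distribution match the teacher's.
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_probs = F.log_softmax(student_logits / T, dim=-1)
    # The T**2 factor keeps gradient magnitudes comparable
    # across different temperatures.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * T**2
```
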
jack morris (@jxmnop):

excited to finally share on arxiv what we've known for a while now: All Embedding Models Learn The Same Thing. Embeddings from different models are SO similar that we can map between them based on structure alone, without *any* paired data. Feels like magic, but it's real: 🧵
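
To make "similar based on structure alone" concrete, here is one generic structural-similarity diagnostic, linear CKA (Kornblith et al., 2019). This is not the unpaired translation method from the paper, just a quick check, and it assumes the same n texts embedded by both models:

```python
import torch

def linear_cka(X: torch.Tensor, Y: torch.Tensor) -> torch.Tensor:
    # X: [n, d1], Y: [n, d2] -- the same n items embedded by two models.
    # Returns ~1.0 when the two spaces share the same internal geometry
    # (up to rotation and isotropic scaling), ~0.0 when unrelated.
    X = X - X.mean(dim=0)
    Y = Y - Y.mean(dim=0)
    return (X.T @ Y).norm() ** 2 / ((X.T @ X).norm() * (Y.T @ Y).norm())
```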

Katrina Drozdov (Evtimova) (@stochasticdoggo):

Finally dipped my toes into RL post-training. I trained a code generation LLM with GRPO using open-r1. Here are my 9 takeaways: kevtimova.github.io/posts/grpo/
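
For readers new to GRPO, its core idea, the group-relative advantage, fits in a few lines. A minimal sketch, not the open-r1 training loop; shapes and reward values are made up:

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    # rewards: [n_prompts, n_samples_per_prompt], e.g. pass/fail scores
    # from unit tests on generated code. Each completion's advantage is
    # its reward standardized within its own group of samples for the
    # same prompt -- no learned value function needed.
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + eps)

# 2 prompts, 4 sampled completions each:
rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                        [0.0, 0.0, 1.0, 0.0]])
advantages = grpo_advantages(rewards)  # positive for above-average samples
```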

Diana Cai (@dianarycai):

The application for a research fellowship at the Flatiron Institute in the Center for Computational Math is now live! This includes positions for ML and stats. The deadline is Dec 1. Links below with more details.

John Schulman (@johnschulman2):

Tinker provides an abstraction layer that is the right one for post-training R&D -- it's the infrastructure I've always wanted. I'm excited to see what people build with it. "Civilization advances by extending the number of important operations which we can perform without…

Bob McGrew (@bobmcgrewai):

After spending billions of dollars of compute, GPT-5 learned that the most effective use of its token budget is to give itself a little pep talk every time it figures something out. Maybe you should do the same.

Thinking Machines (@thinkymachines):

Today we’re announcing research and teaching grants for Tinker: credits for scholars and students to fine-tune and experiment with open-weight LLMs. Read more and apply at: thinkingmachines.ai/blog/tinker-re…

Katrina Drozdov (Evtimova) (@stochasticdoggo):

Really glad to see initiatives like Thinking Machines' Tinker grants that support hands-on RL and open-weight LLM work in both research and teaching. What an exciting opportunity for the community!