Grant Watson (@grhwatson)'s Twitter Profile
Grant Watson

@grhwatson

ML @RecursionPharma. Previous: ML Engineer @dewpoint_tx @PhenomicAI. Into ML, physics, math, music, computer-generated art, and Dungeons & Dragons.

ID: 894635129485897729

Joined: 07-08-2017 19:03:31

500 Tweets

218 Followers

1.1K Following

Majdi Hassan (@majdi_has)

(1/n)🚨You can train a model that solves DFT for any geometry almost without training data!🚨 Introducing Self-Refining Training for Amortized Density Functional Theory — a variational framework for learning a DFT solver that predicts the ground-state solutions for different

Amil Dravid (@_amildravid)

Artifacts in your attention maps? Forgot to train with registers? Use 𝙩𝙚𝙨𝙩-𝙩𝙞𝙢𝙚 𝙧𝙚𝙜𝙞𝙨𝙩𝙚𝙧𝙨! We find that a sparse set of activations sets artifact positions. We can shift them anywhere ("Shifted") — even outside the image into an untrained token. Clean maps, no retrain.

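As a rough sketch of that idea (my own illustration, not the authors' code), the shift can be thought of as a forward hook that appends one extra untrained token to the sequence and moves the highest-norm patch activations into it; the shape convention, top-k selection, and hook placement below are all assumptions.

import torch

def attach_test_time_register(block: torch.nn.Module, top_k: int = 1):
    # Assumes the block output is a plain tensor of shape (batch, tokens, dim)
    # and that the caller has already appended one extra token as the last position.
    def shift_to_register(module, inputs, output):
        patches, register = output[:, :-1], output[:, -1:]
        norms = patches.norm(dim=-1)                       # high token norms mark artifact positions
        idx = norms.topk(top_k, dim=-1).indices
        b = torch.arange(output.size(0), device=output.device).unsqueeze(-1)
        register = register + patches[b, idx].sum(dim=1, keepdim=True)  # absorb the activations
        patches = patches.clone()
        patches[b, idx] = 0.0                              # clean the original positions
        return torch.cat([patches, register], dim=1)
    return block.register_forward_hook(shift_to_register)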
Corin Wagen (@corinwagen)

Gabriele Corso, Patrick Walters: Various forms of this discussion are playing out in a lot of different "AI x science" areas right now. (I'm team extrapolation-is-good, but open to being wrong.) I wrote about closely related topics previously, albeit in an esoteric format: corinwagen.github.io/public/blog/20…

Sakana AI (@sakanaailabs)

We’re excited to introduce AB-MCTS!

Our new inference-time scaling algorithm enables collective intelligence for AI by allowing multiple frontier models (like Gemini 2.5 Pro, o4-mini, DeepSeek-R1-0528) to cooperate.

Blog: sakana.ai/ab-mcts
Paper: arxiv.org/abs/2503.04412
Giannis Daras (@giannis_daras)

Announcing Ambient Protein Diffusion, a state-of-the-art 17M-params generative model for protein structures.

Diversity improves by 91% and designability by 26% over previous 200M SOTA model for long proteins.

The trick? Treat low pLDDT AlphaFold predictions as low-quality data
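One way to read that trick (a minimal sketch under my own assumptions, not the released training code): treat pLDDT as a data-quality score and let low-confidence structures supervise the diffusion model only at high noise levels, where their errors matter less. The linear pLDDT-to-noise mapping below is purely illustrative.

import numpy as np

def sample_training_timestep(plddt: float, t_max: float = 1.0, rng=np.random) -> float:
    # Map AlphaFold confidence (pLDDT in [0, 100]) to a minimum diffusion time,
    # then sample a training timestep uniformly above it.
    t_min = (1.0 - plddt / 100.0) * t_max
    return rng.uniform(t_min, t_max)

# Example: a pLDDT-55 prediction only supervises t >= 0.45, while a
# high-confidence structure (pLDDT ~ 100) supervises all noise levels.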
Albert Gu (@_albertgu)

Tokenization is just a special case of "chunking" - building low-level data into high-level abstractions - which is in turn fundamental to intelligence.

Our new architecture, which enables hierarchical *dynamic chunking*, is not only tokenizer-free, but simply scales better.
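A toy picture of dynamic chunking (my own simplification, not the H-Net architecture): a learned scorer marks where chunk boundaries fall, and features are pooled within each chunk to form higher-level units. The threshold rule and mean pooling here are illustrative choices.

import torch

def dynamic_chunk(embeddings: torch.Tensor, boundary_scores: torch.Tensor,
                  threshold: float = 0.5) -> torch.Tensor:
    # embeddings: (seq, dim) byte-level features; boundary_scores: (seq,) in [0, 1].
    # A position whose score crosses the threshold starts a new chunk.
    chunks, current = [], [embeddings[0]]
    for i in range(1, embeddings.size(0)):
        if boundary_scores[i] > threshold:
            chunks.append(torch.stack(current).mean(dim=0))
            current = [embeddings[i]]
        else:
            current.append(embeddings[i])
    chunks.append(torch.stack(current).mean(dim=0))
    return torch.stack(chunks)   # (num_chunks, dim)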
Bilawal Sidhu (@bilawalsidhu)

Damn it worked! Genie 3 world --> inpaint UI --> 4x Topaz AI upscale --> train 3D Gaussian splat. You can step inside a painting of Socrates from 1787. Better than any image-to-3D model I've seen. I think Google has stumbled upon the killer app for VR -- the literal holodeck.

Yilun Du (@du_yilun)

Excited to share Equilibrium Matching (EqM)! EqM simplifies and outperforms flow matching, enabling strong generative performance of FID 1.96 on ImageNet 256x256. EqM learns a single static EBM landscape for generation, enabling a simple gradient-based generation procedure.
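The "static landscape" framing suggests a sampler as simple as gradient descent on the learned energy; the sketch below is a generic version with placeholder step counts and step sizes, not the paper's exact procedure.

import torch

def generate_by_gradient_descent(energy_fn, shape, steps=200, step_size=0.01):
    # Start from Gaussian noise and follow the learned energy E(x) downhill;
    # samples settle near low-energy, data-like points.
    x = torch.randn(shape, requires_grad=True)
    for _ in range(steps):
        (grad,) = torch.autograd.grad(energy_fn(x).sum(), x)
        x = (x - step_size * grad).detach().requires_grad_(True)
    return x.detach()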

Ivan Skorokhodov (@isskoro)

I think this paper [arxiv.org/abs/2510.08570] wins the "strangest" (in a good sense) 1-step diffusion award of this year. They parametrize a model as an invertible network, which maps from the sample space to the representation space, which is assumed to be linear: i.e. we assume

Saining Xie (@sainingxie)

three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right.

today, we introduce Representation Autoencoders (RAE).

>> Retire VAEs. Use RAEs. 👇(1/n)
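One hedged reading of "representation autoencoder" is a frozen pretrained representation encoder paired with a decoder trained only for reconstruction; the class below sketches that reading with illustrative names and is not the released RAE implementation.

import torch
import torch.nn as nn

class RepresentationAutoencoder(nn.Module):
    def __init__(self, pretrained_encoder: nn.Module, decoder: nn.Module):
        super().__init__()
        self.encoder = pretrained_encoder.eval()
        for p in self.encoder.parameters():
            p.requires_grad_(False)          # encoder stays frozen
        self.decoder = decoder               # only the decoder is trained

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            latents = self.encoder(images)   # representation tokens, not a VAE posterior
        return self.decoder(latents)         # reconstruction loss trains the decoder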
Sakana AI (@sakanaailabs)

Introducing Petri Dish Neural Cellular Automata (PD-NCA) 🦠 The search for open-ended complexification, a north star of Artificial Life (ALife) simulations, fascinates us deeply. In this work we explore the role of continual adaptation in ALife simulation,

Oriol Vinyals (@oriolvinyalsml)

The secret behind Gemini 3?

Simple: Improving pre-training & post-training 🤯

Pre-training: Contra the popular belief that scaling is over—which we discussed in our NeurIPS '25 talk with Ilya Sutskever and Quoc Le—the team delivered a drastic jump. The delta between 2.5 and 3.0 is
hardmaru (@hardmaru)

Excited to announce our MIT Press book “Neuroevolution: Harnessing Creativity in AI Agent Design” by Sebastian Risi, Yujin Tang, Risto Miikkulainen, and myself. We explore decades of work on evolving intelligent agents and show how neuroevolution can

sway (@swaystar123)

Speedrunning ImageNet Diffusion - 360x faster training

There have been many new techniques demonstrating convergence speedups compared to DiT in the past few years; however, all of these have been studied in isolation, against increasingly outdated baselines.

I present SR-DiT
Sakana AI (@sakanaailabs)

Our AI agent has achieved 1st place in a competitive optimization programming contest against over 800 human participants.

Blog: sakana.ai/ahc058

In AtCoder Heuristic Contest 058, Sakana AI’s ALE-Agent took the top spot. For context on the difficulty of these challenges,
Sakana AI (@sakanaailabs)

Introducing Digital Red Queen (DRQ): Adversarial Program Evolution in Core War with LLMs Blog: sakana.ai/drq Core War is a programming game where self-replicating assembly programs, called warriors, compete for control of a virtual machine. In this dynamic

机器之心 JIQIZHIXIN (@synced_global)

New paradigm from Kaiming He's team: Drifting Models!

With this approach, you can generate a perfect image in a single step.

The team trains a "drifting field" that smoothly moves samples toward equilibrium with the real data distribution.

The result? A one-step generator that
OpenAI (@openai)

We worked with Ginkgo Bioworks to connect GPT-5 to an autonomous lab, so it could propose experiments, run them at scale, learn from the results, and decide what to try next. That closed loop brought protein production cost down by 40%.
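The loop itself is easy to picture; the sketch below is illustrative only, with placeholder function names rather than any actual OpenAI or Ginkgo Bioworks API.

def closed_loop(propose, run_in_lab, update_model, rounds=5, batch_size=96):
    # Propose experiments, run them at scale, learn from the results, repeat.
    history = []
    for _ in range(rounds):
        candidates = propose(history, n=batch_size)   # model suggests the next experiments
        results = run_in_lab(candidates)              # automated lab executes them
        history.extend(zip(candidates, results))
        update_model(history)                         # model learns before the next round
    return history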

Rohan Paul (@rohanpaul_ai)

Terence Tao: AI isn’t hype anymore in Math discovery. Terence Tao is one of the greatest living mathematicians; in his new lecture he explains how AI and human professional mathematicians are now complementary. "There has been a really visible increase in capability. It is not

Martin Bauer (@martinmbauer)

Yes, this is a significant result and a solid research paper. And it would’ve been much harder to achieve without GPT. While I understand the instinct, I think it is more interesting to evaluate what type of contribution the AI has made as opposed to focussing on how relevant