
Chaitanya K. Joshi @ICLR2025 🇸🇬
@chaitjo
PhD student @Cambridge_CL. Research Intern @AIatMeta and @PrescientDesign. Interested in Deep Learning for biomolecule design. Organising @LoGConference.
https://www.chaitjo.com/ 13-07-2010 16:36:04


Chaitanya K. Joshi Xiang Fu Brandon Wood Nathan C. Frey Jason Yim Hannes Stärk Ilyes Batatia Ahmed Elhag Taco Cohen I agree in general. There is also a second coming of equivariance in weight spaces—a new field of neural networks that operate on neural networks

Chaitanya K. Joshi It's possible that data aug + a simpler model outperforms SE(3) or E(3) equivariance in practice. The effective sample-complexity gain is a constant independent of the input size. But learning a permutation symmetry from data will never perform better IMO (unless the data is tiny)
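The rotation data augmentation being discussed can be sketched in a few lines of NumPy; this is a minimal illustration with my own function names, not code from any of the models mentioned in the thread:

```python
import numpy as np

def random_rotation_matrix(rng):
    """Uniform random element of SO(3), via QR of a Gaussian matrix."""
    Q, R = np.linalg.qr(rng.standard_normal((3, 3)))
    Q = Q * np.sign(np.diag(R))   # fix column signs so the sample is uniform
    if np.linalg.det(Q) < 0:      # reflect one axis to land in SO(3)
        Q[:, 0] *= -1
    return Q

def augment(points, rng):
    """Rotation data augmentation: apply one random global rotation
    to an (N, 3) point cloud (row-vector convention)."""
    return points @ random_rotation_matrix(rng).T
```

A non-equivariant model trained with `augment` applied to each batch sees the symmetry only through data, which is the trade-off the tweet is describing.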


Petar Veličković Transformers are also equivariant. In general, small/low-dimensional groups such as SE(3) or T(2) are easy to learn. Large groups like Sn are hopeless

Soledad Villar Chaitanya K. Joshi Whether or how learning a permutation symmetry through data augmentation can be beneficial depends heavily on the setup. Some positive examples are set prediction tasks, for which the permutation symmetric solution (set prediction architecture + permutation symmetric loss) often

Petar Veličković Michael Bronstein Pix2Seq (set prediction tasks randomly serialized into sequence predicted using autoregressive model) vs DETR (set prediction architecture and loss) comparison is a good case study for why even baking in permutation symmetry into architecture and loss is not necessarily what

Thomas Kipf Petar Veličković Michael Bronstein Recent works in point cloud processing do achieve state-of-the-art performance while breaking permutation symmetry to impose regular structure on unordered sets for better scaling (e.g. PointTransformer v3 arxiv.org/abs/2312.10035 or Erwin arxiv.org/abs/2502.17019).
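The serialization idea behind these works — ordering an unordered point set along a space-filling curve so a regular sequence model can process it — can be sketched with a Z-order (Morton) code. This is a simplified illustration of the general technique, not the actual PointTransformer v3 or Erwin implementation:

```python
import numpy as np

def morton_code(q, bits=10):
    """Interleave the bits of quantized (x, y, z) coordinates into one Z-order key."""
    codes = np.zeros(len(q), dtype=np.uint64)
    for b in range(bits):
        for axis in range(3):
            bit = ((q[:, axis] >> b) & 1).astype(np.uint64)
            codes |= bit << (3 * b + axis)
    return codes

def serialize(points, bits=10):
    """Order an (N, 3) point cloud along a Z-order curve, so points
    close in space tend to end up close in the sequence."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    q = ((points - lo) / np.maximum(hi - lo, 1e-9) * (2**bits - 1)).astype(np.uint64)
    return np.argsort(morton_code(q, bits))
```

The returned permutation imposes a regular 1D structure on the set — exactly the kind of deliberate symmetry breaking the tweet refers to.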

Chaitanya K. Joshi I agree that strict equivariance isn't always necessary, e.g. in generative diffusion models, but it's essential in other settings, like ML potentials. The claim that equivariant models are slow doesn't hold in our field: top models like MACE (equivariant) and Orb (non-equivariant) have similar speeds.

Rubén Ballester Petar Veličković Michael Bronstein I often see the claim that “SO(3) is easy to learn because it’s low-dimensional,” but to my knowledge, there’s no empirical or theoretical evidence supporting this. In fact, SO(3) is significantly harder to sample than SO(2), despite being only two dimensions higher.
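For concreteness, the sampling gap between the two groups is easy to see in code: uniform SO(2) needs one angle, while a standard recipe for uniform SO(3) normalizes a 4D Gaussian to a unit quaternion. A hypothetical sketch, not tied to any cited work:

```python
import numpy as np

def sample_so2(rng, n):
    """Uniform SO(2): one angle per sample is enough."""
    return rng.uniform(0.0, 2.0 * np.pi, n)

def sample_so3(rng, n):
    """Uniform SO(3): normalized 4D Gaussians are uniform unit quaternions."""
    q = rng.standard_normal((n, 4))
    return q / np.linalg.norm(q, axis=1, keepdims=True)

def quat_to_matrix(q):
    """Rotation matrix of a unit quaternion (w, x, y, z)."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])
```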

Chaitanya K. Joshi Xiang Fu Brandon Wood Nathan C. Frey Michael Bronstein Jason Yim Ilyes Batatia Ahmed Elhag Taco Cohen AF3/Boltz has little uncertainty left after the SE(3)-invariant trunk. A fun related visualization is the x0 prediction trajectory: it always looks like this video, where the initial x0 prediction is almost the same as the one at the end of the denoising trajectory.
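The x0 (denoised-sample) prediction trajectory mentioned here can be recorded during sampling. A toy deterministic DDIM-style (eta = 0) sketch, with my own function names and an assumed epsilon-prediction parameterization — not the AF3/Boltz sampler:

```python
import numpy as np

def x0_trajectory(eps_model, x_T, alpha_bar):
    """Deterministic DDIM (eta = 0) sampling loop that records the model's
    x0 prediction at every denoising step.

    eps_model(x, t) -> predicted noise; alpha_bar[t] is the cumulative
    noise schedule, close to 1 at t = 0 (clean) and small at t = T - 1.
    """
    x, x0_preds = x_T, []
    for t in range(len(alpha_bar) - 1, -1, -1):
        eps = eps_model(x, t)
        a_bar = alpha_bar[t]
        x0_hat = (x - np.sqrt(1.0 - a_bar) * eps) / np.sqrt(a_bar)  # implied x0
        x0_preds.append(x0_hat)
        a_prev = alpha_bar[t - 1] if t > 0 else 1.0
        x = np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * eps
    return x0_preds
```

Plotting `x0_preds` over the trajectory is the visualization the tweet describes: when little uncertainty is left, the early x0 predictions already match the final sample.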

Michael Bronstein Rubén Ballester Petar Veličković I’d argue that non-equivariant models learn SO(3) well enough for generative modelling, where equivariance accuracy isn’t critical. But there’s evidence that it is hard to reach high equivariance accuracy for force fields. To me "easy to learn" would mean to any given accuracy.
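One way to make "equivariance accuracy" concrete is to measure, for a force-field-style model, how far f(xRᵀ) drifts from f(x)Rᵀ over random rotations. A sketch under that definition — the metric and names are my own, not a standard benchmark:

```python
import numpy as np

def random_rotation(rng):
    """Proper rotation sampled via QR of a Gaussian matrix."""
    Q, R = np.linalg.qr(rng.standard_normal((3, 3)))
    Q = Q * np.sign(np.diag(R))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1
    return Q

def equivariance_error(f, x, rng, n=32):
    """Mean relative deviation ||f(x R^T) - f(x) R^T|| over random rotations.

    For f: (N, 3) positions -> (N, 3) forces, exact SO(3) equivariance
    means the deviation is zero for every R.
    """
    errs = []
    for _ in range(n):
        R = random_rotation(rng)
        lhs, rhs = f(x @ R.T), f(x) @ R.T
        errs.append(np.linalg.norm(lhs - rhs) / (np.linalg.norm(rhs) + 1e-12))
    return float(np.mean(errs))
```

"Easy to learn to any given accuracy" then becomes a measurable statement: does training drive this error below the tolerance the downstream task (e.g. stable MD) requires?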

Mark Neumann Chaitanya K. Joshi I agree standardised tests are essential, and timing things can be surprisingly hard. I think we can agree that orb-v3-conservative-inf and MACE are in a similar speed range, meaning they can do a few nanoseconds/day for a few thousand atoms. I can quote directly from your paper.
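The "nanoseconds/day" figure follows from simple arithmetic on the wall-clock cost of one MD step. A sketch with hypothetical numbers — 20 ms/step and a 1 fs timestep are illustrative, not measured values for either model:

```python
def ns_per_day(ms_per_step, timestep_fs=1.0):
    """Simulated nanoseconds per day of wall-clock time, given the
    cost of one MD step in milliseconds and the integration timestep."""
    steps_per_day = 86_400_000.0 / ms_per_step  # ms in a day / ms per step
    return steps_per_day * timestep_fs * 1e-6   # fs -> ns
```

At 20 ms/step with a 1 fs timestep this gives 4.32 ns/day, i.e. in the "few nanoseconds/day for a few thousand atoms" range quoted above.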

Chaitanya K. Joshi Xiang Fu Brandon Wood Nathan C. Frey Michael Bronstein Jason Yim Hannes Stärk Ilyes Batatia Ahmed Elhag Taco Cohen Non-equivariant models can perform extremely well as interatomic potentials. arxiv.org/abs/2504.06231


Chaitanya K. Joshi True, I guess "required" was too strong a choice of wording 🫢. I think nuance is absolutely needed; I'd argue it's probably what we needed in the first place.


Today marks a big milestone for me. I'm launching LawZero - LoiZéro, a nonprofit focusing on a new safe-by-design approach to AI that could both accelerate scientific discovery and provide a safeguard against the dangers of agentic AI.