Maurice Weiler (@maurice_weiler) 's Twitter Profile
Maurice Weiler

@maurice_weiler

AI researcher with a focus on geometric DL and equivariant CNNs. PhD with Max Welling. Master's degree in physics.

ID: 955017329863221248

Website: https://maurice-weiler.gitlab.io/ · Joined: 21-01-2018 10:00:49

760 Tweets

3.3K Followers

1.1K Following

Aditi Krishnapriyan (@ask1729) 's Twitter Profile Photo

Neural Spectral Methods (spectral-based neural operator + spectral loss training method) will be presented at Poster session 4 this Wed (May 8, 4:30 PM) at #ICLR2024! Paper: openreview.net/forum?id=2DbVe… Code: github.com/ASK-Berkeley/N…
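
As background for the "spectral loss" idea mentioned above, here is a minimal sketch of evaluating a PDE residual on Fourier coefficients rather than on grid samples. This is my own illustration with a hypothetical helper, not the paper's implementation:

import numpy as np

def spectral_poisson_loss(u_hat, f_hat, length=2 * np.pi):
    # u_hat, f_hat: Fourier coefficients of the predicted solution u and the forcing f
    # for the 1D periodic Poisson problem u_xx = f.
    n = u_hat.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=length / n)    # angular wavenumbers
    residual_hat = (1j * k) ** 2 * u_hat - f_hat       # u_xx - f in Fourier space
    return np.sum(np.abs(residual_hat) ** 2)           # by Parseval, an L2 residual norm

# e.g. loss = spectral_poisson_loss(np.fft.fft(u_pred), np.fft.fft(f))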

Tycho van der Ouderaa (@tychovdo) 's Twitter Profile Photo

🌟New work: Noether's razor⭐️ Our NeurIPS 2024 paper connects ML symmetries to conserved quantities through a seminal result in mathematical physics: Noether's theorem. We can learn neural network symmetries from data by learning associated conservation laws. Learn more👇. 1/16🧵

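For readers who want the underlying result in its textbook Lagrangian-mechanics form (standard statement, not the paper's notation): Noether's theorem assigns to every continuous symmetry of the action a conserved quantity,

\[
L(q, \dot q) \;=\; L\big(q + \epsilon\,\psi(q),\ \dot q + \epsilon\,\dot\psi(q)\big) + O(\epsilon^2)
\quad\Longrightarrow\quad
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q}\,\psi(q)\right) = 0
\]

along solutions of the Euler–Lagrange equations. This duality is what lets the paper learn symmetries from data by learning the associated conservation laws.
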
Sham Kakade (@shamkakade6) 's Twitter Profile Photo

1/5⚡Introducing Flash Inference: an *exact* method cutting inference time for Long Convolution Sequence Models (LCSMs) to near-linear O(L log² L) complexity! Faster inference, same precision. Learn how we accelerate LCSM inference.

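For context on where the speed-up lives: a long-convolution layer applies a filter as long as the sequence itself, which a plain FFT already evaluates in O(L log L) per layer. The sketch below shows only that standard building block (my illustration), not the Flash Inference algorithm itself, whose contribution is reaching near-linear cost at inference time:

import numpy as np

def long_convolution(x, h):
    # Causal convolution of a length-L sequence x with a length-L filter h in O(L log L).
    L = len(x)
    n = 2 * L                                  # zero-pad to avoid circular wrap-around
    y = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h, n), n)
    return y[:L]                               # keep the first L (causal) outputs
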
Jeremy Bernstein (@jxbz) 's Twitter Profile Photo

Over the past month, methods developed by myself and my collaborators were used to set new speed records for training LLMs up to 1.5B scale. I also want to help the science go faster, so now get ready for: ~The General Theory of Modular Duality~ arxiv.org/abs/2410.21265 (1/9)

Johann Brehmer (@johannbrehmer) 's Twitter Profile Photo

Does equivariance matter when you have lots of data and compute? In a new paper with Sönke Behrends, Pim de Haan, and Taco Cohen, we collect some evidence. arxiv.org/abs/2410.23179 1/7

Maurice Weiler (@maurice_weiler) 's Twitter Profile Photo

Very cool paper on "Generator Matching", a method to parametrize and train Markov processes via their infinitesimal generators. Besides flows and diffusion, it includes jumps over finite distances and makes it possible to generate continuous and discrete random variables jointly🦾
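
For context, the infinitesimal generator of a Markov process $(X_t)$ acts on test functions $f$ as (standard definition, not notation specific to the paper)

\[
(\mathcal{A} f)(x) \;=\; \lim_{t \to 0^+} \frac{\mathbb{E}\big[ f(X_t) \mid X_0 = x \big] - f(x)}{t},
\]

and specializes to a drift term for deterministic flows, an additional second-order (diffusion) term for SDEs, and an integral term for jump processes, which is what lets one framework cover all three.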

Aditi Krishnapriyan (@ask1729) 's Twitter Profile Photo

1/ What are key design principles for scaling neural network interatomic potentials? Our exploration leads us to top results on the Open Catalyst Project (OC20, OC22), SPICE, and MPTrj, with vastly improved efficiency! Accepted at #NeurIPS2024: arxiv.org/abs/2410.24169

Ruben Ohana (@oharub) 's Twitter Profile Photo

Generating cat videos is nice, but what if you could tackle real scientific problems with the same methods? 🧪🌌 Introducing The Well: 16 datasets (15TB) for Machine Learning, from astrophysics to fluid dynamics and biology. 🐙: github.com/PolymathicAI/t… 📜: openreview.net/pdf?id=00Sx577…

Zhou Xian (@zhou_xian_) 's Twitter Profile Photo

Everything you love about generative models — now powered by real physics! Announcing the Genesis project — after a 24-month large-scale research collaboration involving over 20 research labs — a generative physics engine able to generate 4D dynamical worlds powered by a physics

Peter Holderrieth (@peholderrieth) 's Twitter Profile Photo

New paper out! We introduce “LEAPS”, a neural sampling algorithm for discrete distributions via continuous-time Markov chains (“discrete diffusion”). We introduce a novel importance sampling scheme and novel symmetries built into neural networks. arxiv.org/pdf/2502.10843 (1/4)

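For background, a continuous-time Markov chain on a discrete state space evolves its marginals by the Kolmogorov forward (master) equation (generic form, not the paper's specific parametrization):

\[
\frac{d}{dt}\, p_t(x) \;=\; \sum_{y \neq x} \Big( Q_t(y, x)\, p_t(y) \;-\; Q_t(x, y)\, p_t(x) \Big),
\]

where $Q_t(x, y) \ge 0$ is the jump rate from state $x$ to state $y$; in "discrete diffusion" samplers of this kind, these rates are what the neural network parametrizes.
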
Shubhendu Trivedi (@_onionesque) 's Twitter Profile Photo

This work is quite natural, even though a lot of it looks like this (it will take a while to work out the details). What they do is to use general integral operators to write the non-linear maps. These then correspond to specific characterizations of the space of kernels.

Nabil Iqbal (@nblqbl) 's Twitter Profile Photo

The arxiv preprint on our conformally equivariant neural network -- named AdS-GNN due to its secret origins in AdS/CFT -- is now out! arxiv.org/abs/2505.12880 🧵explaining it below. Joint work with the amazing team of Max Zhdanov, Erik Bekkers and Patrick Forre.

Maurice Weiler (@maurice_weiler) 's Twitter Profile Photo

New preprint! We extend Taco Cohen's theory of equivariant CNNs on homogeneous spaces to the non-linear setting. Beyond convolutions, this covers equivariant attention, implicit kernel MLPs and more general message passing layers. More details in Oscar Carlsson's thread 👇

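For reference, the linear (convolutional) case this builds on constrains kernels by the stabilizer subgroup; in one common steerable-CNN formulation (standard background, not the new paper's general statement),

\[
k(h \cdot x) \;=\; \rho_{\text{out}}(h)\, k(x)\, \rho_{\text{in}}(h)^{-1} \qquad \text{for all } h \in H,
\]

where $H$ is the stabilizer subgroup of the homogeneous space and $\rho_{\text{in}}, \rho_{\text{out}}$ are the input and output field representations; the preprint extends beyond such linear kernels to non-linear equivariant operations.
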
Katie Everett (@_katieeverett) 's Twitter Profile Photo

1. We often observe power laws between loss and compute: loss = a * flops ^ b + c
2. Models are rapidly becoming more efficient, i.e. use less compute to reach the same loss

But: which innovations actually change the exponent in the power law (b) vs change only the constant (a)?
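
A toy numeric illustration of the distinction (made-up numbers, purely to show why the exponent matters more as compute grows):

# Compare changing the constant a vs. the exponent b in  loss = a * flops**b + c.
a, b, c = 10.0, -0.1, 1.0

def loss(flops, a=a, b=b, c=c):
    return a * flops ** b + c

for flops in [1e18, 1e20, 1e22]:
    base      = loss(flops)               # baseline scaling law
    half_a    = loss(flops, a=a / 2)      # constant improvement: same slope, curve shifted down
    steeper_b = loss(flops, b=-0.12)      # exponent improvement: relative gains grow with compute
    print(f"{flops:.0e}  base={base:.3f}  half_a={half_a:.3f}  steeper_b={steeper_b:.3f}")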

Ekdeep Singh Lubana (@ekdeepl) 's Twitter Profile Photo

🚨 New paper alert! Linear representation hypothesis (LRH) argues concepts are encoded as **sparse sum of orthogonal directions**, motivating interpretability tools like SAEs. But what if some concepts don’t fit that mold? Would SAEs capture them? 🤔 1/11
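
For readers unfamiliar with the setup: a sparse autoencoder (SAE) of the kind referenced here reconstructs an activation vector as a sparse non-negative combination of learned dictionary directions. A generic minimal sketch (dimensions and names are hypothetical, not the authors' code):

import numpy as np

rng = np.random.default_rng(0)
d_model, d_dict = 64, 512                       # hypothetical sizes
W_enc = rng.normal(size=(d_dict, d_model)) * 0.02
W_dec = rng.normal(size=(d_model, d_dict)) * 0.02
b_enc = np.zeros(d_dict)

def sae(x, l1=1e-3):
    f = np.maximum(W_enc @ x + b_enc, 0.0)      # sparse codes (ReLU)
    x_hat = W_dec @ f                           # reconstruction as a sum of dictionary directions
    loss = np.sum((x - x_hat) ** 2) + l1 * np.sum(np.abs(f))
    return x_hat, f, loss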

Alejandro García (@algarciacast) 's Twitter Profile Photo

🌍 From earthquake prediction to robot navigation - what connects them? Eikonal equations! We developed E-NES: a neural network that leverages geometric symmetries to solve entire families of velocity fields through group transformations. Grid-free and scalable! 🧵👇
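
The equation family in question, in its standard form (details of the paper's setup may differ):

\[
\lVert \nabla T(x) \rVert \;=\; \frac{1}{v(x)},
\]

where $T(x)$ is the travel time from a source to $x$ and $v(x)$ is the local propagation speed; earthquake travel-time prediction and robot navigation both amount to solving this equation for different velocity fields $v$.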

Alex Vacca (@itsalexvacca) 's Twitter Profile Photo

BREAKING: MIT just completed the first brain scan study of ChatGPT users & the results are terrifying. Turns out, AI isn't making us more productive. It's making us cognitively bankrupt. Here's what 4 months of data revealed: (hint: we've been measuring productivity all wrong)

Kirill Neklyudov (@k_neklyudov) 's Twitter Profile Photo

(1/n) Sampling from the Boltzmann density better than Molecular Dynamics (MD)? It is possible with PITA 🫓 Progressive Inference Time Annealing! A spotlight at the GenBio Workshop @ ICML 2025! PITA learns from "hot," easy-to-explore molecular states 🔥 and then cleverly "cools"

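For context, the "hot" and "cooled" states refer to tempered Boltzmann densities (standard statistical-mechanics background, not the paper's notation):

\[
p_\beta(x) \;\propto\; \exp\!\big(-\beta\, E(x)\big), \qquad \beta = \frac{1}{k_B T},
\]

so high-temperature (small $\beta$) densities are smoother and easier to explore, and annealing increases $\beta$ progressively towards the target distribution.
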
Sukjun (June) Hwang (@sukjun_hwang) 's Twitter Profile Photo

Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data