Quentin Bertrand (@qu3ntinb) Twitter Tweets • TwiCopy

Mathieu Blondel

6 months ago

We just released a new approach for turning local search heuristics used to solve NP-hard combinatorial problems in OR into differentiable layers. The key idea is to use the neighborhoods used by these algorithms for creating MCMC proposal distributions arxiv.org/abs/2505.14240

Likes: 190 · Replies: 2 · Reposts: 26

Damien Ferbach

@damien_ferbach

6 months ago

Our paper derives momentum schedules that are functions of both the model dimension and data distribution.
* On our theoretical model, this provably improves the scaling law exponents in many regimes!
* And this exponent improvement holds on LSTM experiments on C4.

Likes: 23 · Replies: 1 · Reposts: 2

Giannis Daras

@giannis_daras

5 months ago

Announcing Ambient Diffusion Omni, a framework that uses synthetic, low-quality, and out-of-distribution data to improve diffusion models. State-of-the-art ImageNet performance. Strong text-to-image results in just 2 days on 8 GPUs. Filtering ❌ Clever data use ✅

Likes: 416 · Replies: 8 · Reposts: 55

Lenka Zdeborova

@zdeborova

5 months ago

Pleased to see that this time, three Czech ladies are on the list of European Research Council (ERC) Advanced Grants, and I am very proud to be among them ;). Congrats to Kateřina Čapková and Anna Durnová!

Likes: 103 · Replies: 4 · Reposts: 6

Dalalyan Arnak

@arnakdalalyan

5 months ago

🎉 It’s official! I’ve been awarded an ERC Advanced Grant for my project on Statistical Analysis of Generative Models. More details 👉 crest.science/arnak-dalalyan… #ERCAdG European Research Council (ERC)

Likes: 52 · Replies: 7 · Reposts: 6

Waïss Azizian

@wazizian

5 months ago

❓ How long does SGD take to reach the global minimum on non-convex functions? With Franck Iutzeler, J. Malick, P. Mertikopoulos, we tackle this fundamental question in our new ICML 2025 paper: "The Global Convergence Time of Stochastic Gradient Descent in Non-Convex Landscapes"

Likes: 146 · Replies: 5 · Reposts: 20

Mila - Institut québécois d'IA

@mila_quebec

5 months ago

Looking back on an inspiring day of exchange last week at Mila, where students presented cutting-edge research work to their peers during a casual poster session. See you in September for the next edition!

Likes: 15 · Replies: 0 · Reposts: 2

David Duvenaud

@davidduvenaud

5 months ago

It's hard to plan for AGI without knowing what outcomes are even possible, let alone good. So we’re hosting a workshop! Post-AGI Civilizational Equilibria: Are there any good ones? Vancouver, July 14th. Featuring: Joe Carlsmith, Richard Ngo, Emmett Shear 🧵

Likes: 80 · Replies: 7 · Reposts: 8

Marta Skreta

@martoskreto

5 months ago

🧵(1/6) Delighted to share our ICML Conference 2025 spotlight paper: the Feynman-Kac Correctors (FKCs) in Diffusion Picture this: it’s inference time and we want to generate new samples from our diffusion model. But we don’t want to just copy the training data – we may want to sample

Likes: 45 · Replies: 2 · Reposts: 9

Mathurin Massias

@mathusmassias

5 months ago

New paper on the generalization of Flow Matching arxiv.org/abs/2506.03719 🤯 Why does flow matching generalize? Did you know that the flow matching target you're trying to learn **can only generate training points**? with Quentin Bertrand, Anne Gagneux & Rémi Emonet 👇👇👇

Likes: 1.1K · Replies: 15 · Reposts: 202

Mathurin Massias

@mathusmassias

5 months ago

Yet FM generates new samples! A hypothesis to explain this paradox is target stochasticity: FM targets the conditional velocity field, i.e., only a stochastic approximation of the full velocity field u*. *We refute this hypothesis*: very early, the approximation almost equals u*.

Likes: 27 · Replies: 1 · Reposts: 2

Mathurin Massias

@mathusmassias

5 months ago

We propose to regress directly against the optimal (deterministic) u* and show that it never degrades performance. On the contrary, removing target stochasticity helps the model generalize faster.

Likes: 17 · Replies: 1 · Reposts: 2
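The claim in the thread above, that the exact flow matching target can only generate training points, can be checked numerically on a toy problem. The sketch below is an illustration, not the authors' code. It assumes a standard Gaussian source, the linear path x_t = (1 - t) x_0 + t x_1, and a uniform empirical target over a small hypothetical training set `train`. Under these assumptions the marginal velocity u* has a closed form: a softmax-weighted average of the conditional velocities (x_i - x) / (1 - t). The function names, step count, and data are all illustrative choices.

```python
import numpy as np

def optimal_velocity(x, t, train):
    """Closed-form marginal velocity u*(x, t) under the assumptions above.

    x: (n, d) current samples, train: (m, d) training points, 0 <= t < 1.
    """
    # p_t(x | x_i) = N(t * x_i, (1 - t)^2 I), so the posterior over i is a softmax.
    diff = x[:, None, :] - t * train[None, :, :]                    # (n, m, d)
    logits = -np.sum(diff ** 2, axis=-1) / (2.0 * (1.0 - t) ** 2)   # (n, m)
    logits -= logits.max(axis=1, keepdims=True)                     # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)
    # Weighted average of the conditional velocities (x_i - x) / (1 - t).
    target = w @ train                                              # (n, d)
    return (target - x) / (1.0 - t)

def sample(train, n_samples=64, n_steps=200, seed=0):
    """Integrate dx/dt = u*(x, t) from t = 0 to t = 1 with explicit Euler steps."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, train.shape[1]))
    h = 1.0 / n_steps
    for k in range(n_steps):
        x = x + h * optimal_velocity(x, k * h, train)
    return x

# Hypothetical toy training set of three 2D points.
train = np.array([[2.0, 0.0], [-1.0, 1.5], [0.0, -2.0]])
samples = sample(train)
# Distance from each generated sample to its nearest training point.
dists = np.linalg.norm(samples[:, None, :] - train[None, :, :], axis=-1).min(axis=1)
print("fraction of samples on a training point:", np.mean(dists < 1e-3))
```

With the exact u*, generated samples should collapse onto the training points, so any new samples produced by a trained network come from how the network approximates u*, not from the target itself.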

IVADO

@ivado_qc

5 months ago

🚀 IVADO and the Future Skills Centre (Centre des Compétences futures), in collaboration with the Tech3Lab at HEC Montréal, are launching a new free #AI training program for professionals in #Québec and #Canada. Read the press release ➡️ lnkd.in/gdDkqj8N

Likes: 2 · Replies: 1 · Reposts: 2

Quentin Bertrand

@qu3ntinb

5 months ago

Yes! Indeed, deep generative networks do not exactly reproduce the training set/generalize because of the inductive bias. The key difference with prev. gen. models (e.g. GANs) is the closed-form formula of FM: one can study very finely where the inductive bias comes into play!

Likes: 5 · Replies: 0 · Reposts: 0

Josh English

@joshseriesai

5 months ago

Replying to Mathurin Massias and Quentin Bertrand: Fascinating. So the failure to perfectly learn the target velocity field, due to neural network inductive biases, is what drives generalization in Flow Matching. Quite a reminder that sometimes the best solutions emerge from inherent system constraints.

Likes: 2 · Replies: 0 · Reposts: 1

Samuel Vaiter

@vaiter

5 months ago

ResNet and Neural ODEs are closely related: ResNet uses discrete residual/skip connections, while Neural ODEs generalize this to continuous transformations using ODEs. Neural ODEs *can* be seen as the limit of ResNet as the number of layers approaches infinity.

Likes: 402 · Replies: 3 · Reposts: 45
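One way to make the ResNet / Neural ODE link above concrete: a residual update x_{k+1} = x_k + h f(x_k, t_k) with step size h = T / n_layers is an explicit Euler step for dx/dt = f(x, t). The sketch below is only an illustration. It assumes a hypothetical fixed vector field f(x, t) = -x shared by all layers (a real ResNet has learned, layer-specific weights), chosen because the ODE then has the known solution x(T) = x(0) exp(-T) to compare against.

```python
import numpy as np

def f(x, t):
    # Hypothetical vector field shared by every "layer": dx/dt = -x.
    return -x

def resnet_forward(x0, n_layers, T=1.0):
    """Stack n_layers residual blocks x <- x + h * f(x, t), with h = T / n_layers."""
    h = T / n_layers
    x = np.asarray(x0, dtype=float)
    for k in range(n_layers):
        x = x + h * f(x, k * h)   # skip connection plus scaled residual branch
    return x

x0 = np.array([1.0, -2.0])
exact = x0 * np.exp(-1.0)         # exact ODE solution at T = 1
depths = (1, 10, 100, 1000)
errors = [float(np.max(np.abs(resnet_forward(x0, n) - exact))) for n in depths]
for n, err in zip(depths, errors):
    print(f"{n:5d} layers: max error vs. ODE solution = {err:.2e}")
```

The error shrinks as depth grows, which is the sense in which the deep-layer limit recovers the continuous flow in this toy setting.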

Scientific Python

@scipytip

5 months ago

Mayavi: 3D scientific data visualization and plotting in Python docs.enthought.com/mayavi/mayavi/

Likes: 8 · Replies: 0 · Reposts: 3

logprob

@logprob

5 months ago

Replying to Burny - Effective Omni, Mathurin Massias, and Quentin Bertrand: Yes, it is basically a different way of training normalizing flows via a regression objective on the vector field, thus avoiding the simulation step at training time. Meta uses it!

Likes: 2 · Replies: 0 · Reposts: 1

Mila - Institut québécois d'IA

@mila_quebec

5 months ago

Mila's science communication contest finale showcased 6 brilliant researchers pioneering AI for medical imaging, assistive robotics, forest monitoring, inclusive urban design and more. Watch the presentations that won the hearts of the jury and the public: ow.ly/wyIY50We2p3

Likes: 11 · Replies: 0 · Reposts: 1

Mathieu Blondel

@mblondel_ml

5 months ago

Slides of my talk on our ICML 2025 paper "Joint Learning of Energy-based Models and their Partition Function" mblondel.org/talks/?p=ebm.m…

Likes: 100 · Replies: 1 · Reposts: 14