David Grangier (@grangierdavid)'s Twitter Profile
David Grangier

@grangierdavid

ML research with practical impact.

ID: 1205564776996376576

Joined: 13-12-2019 19:07:08

34 Tweets

413 Followers

53 Following

Faster, better model training by reusing old gradients (>10k steps ago) with negligible extra computation? Count me in. arxiv.org/abs/2409.03137

#ICLR #TrainLLMBetter Tomorrow, #soup of experts, a #hypernetwork conditioned on a simple description of the test distribution: adaptation without retraining (Modularity workshop Sunday). arxiv.org/abs/2502.01804

Still on today... CRISP Importance Sampling for LLM pretraining.