Kartik Ahuja (@kartikahuja1)'s Twitter Profile
Kartik Ahuja

@kartikahuja1

Research Scientist, FAIR, @MetaAI. Prev: Postdoc @Mila_Quebec, @IBMResearch, PhD @UCLA, Undergrad @IITKanpur. Interested in theory of machine learning. He/Him.

ID: 539115783

Link: https://ahujak.github.io | Joined: 28-03-2012 13:12:41

84 Tweets

370 Followers

228 Following

Certified papers at TMLR (@tmlrcert):

New #FeaturedCertification: "WOODS: Benchmarks for Out-of-Distribution Generalization in Time Series" by Jean-Christophe Gagnon-Audet, Kartik Ahuja, Mohammad Javad Darvishi Bayazi et al. openreview.net/forum?id=mvftz… #generalization #generalize #datasets

Simone Scardapane (@s_scardapane):

*Model Ratatouille: Recycling Diverse Models for OOD Generalization*
by Alexandre Ramé, Kartik Ahuja, Matthieu Cord

Creating robust models by averaging the weights of multiple models fine-tuned from the same initialization can be a very effective strategy.

arxiv.org/abs/2212.10445

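A rough sketch of the weight-averaging idea described in the tweet, assuming all models share one architecture and were fine-tuned from a common initialization (a hypothetical helper, not the paper's code):

```python
import copy
import torch

def average_weights(models):
    """Uniformly average the parameters of models that share an architecture
    and a common initialization (weight-averaging sketch, not the paper's
    exact recipe). Integer buffers (e.g. BatchNorm counters) are cast back
    on load, which is acceptable for an illustrative sketch."""
    avg = copy.deepcopy(models[0])
    avg_state = avg.state_dict()
    for key in avg_state:
        # Stack the corresponding tensor from every model and take the mean.
        stacked = torch.stack([m.state_dict()[key].float() for m in models])
        avg_state[key] = stacked.mean(dim=0)
    avg.load_state_dict(avg_state)
    return avg
```

The appeal of the approach is that the averaged network is a single model, so there is no ensembling cost at inference time.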
Sharut Gupta (@sharut_gupta):

Tired of brittle AI models? Our paper “Context is Environment” shows that in-context learning in LLMs holds the key to domain generalization. Introducing In-Context Risk Minimization (ICRM)! Paper: arxiv.org/pdf/2309.09888…

Kartik Ahuja (@kartikahuja1):

In-context learning meets domain generalization (DG)!

1. DG researchers should consider environment as context to build better generalizers.
2. Train a transformer on a sequence of unlabelled data from the domain, followed by the current query, to predict its label.
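A minimal sketch of that recipe in PyTorch (ICRMSketch and all hyperparameters are illustrative assumptions, not the authors' implementation): a transformer reads a context of unlabelled inputs from the test domain followed by the query, and predicts the query's label.

```python
import torch
import torch.nn as nn

class ICRMSketch(nn.Module):
    """Hypothetical sketch of the idea behind ICRM: unlabelled context
    examples from one environment, then the query, as one token sequence."""
    def __init__(self, x_dim, n_classes, d_model=64, n_layers=2, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(x_dim, d_model)  # embed raw inputs as tokens
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, context_x, query_x):
        # context_x: (batch, k, x_dim) unlabelled examples from the domain
        # query_x:   (batch, x_dim)    the point whose label we want
        tokens = torch.cat([context_x, query_x.unsqueeze(1)], dim=1)
        h = self.encoder(self.embed(tokens))  # causal mask omitted for brevity
        return self.head(h[:, -1])            # read out at the query position

# Training then reduces to ordinary risk minimization over
# (context, query, label) sequences:
model = ICRMSketch(x_dim=16, n_classes=2)
ctx, q = torch.randn(8, 10, 16), torch.randn(8, 16)
y = torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(model(ctx, q), y)
loss.backward()
```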

Alexandre Ramé (@ramealexandre):

Exciting news 🎓! I'm defending my PhD on "Diverse & Efficient Ensembling of Deep Networks" tomorrow at 13h30 CEST. If you're in Paris and can join, DM me. Or catch the live stream on YouTube: youtube.com/watch?v=DTD7qt…. Wish me luck!

Jason Hartford (@jasonhartford):

Dhanya Sridhar, Chandler Squires and I are running a talk series on Causality, Abstraction, Reasoning and Extrapolation (CARE). First talk is at 11 EST tomorrow by Johnny on "The Statistical Structure of Identifiable Generative Models". All are welcome! portal.valencelabs.com/events/post/th…

Matej Zečević (@matej_zecevic):

This Wed. 10:30 ET (29th Nov.) @ CDG we are honored to see Divyat Mahajan present their CLeaR (Conference on Causal Learning and Reasoning) paper "Towards efficient representation identification in SL" (proceedings.mlr.press/v177/ahuja22a/…). All info & links @ discuss.causality.link 🌿 cc: Kartik Ahuja, Vasilis Syrgkanis, Ioannis Mitliagkas

Amin Mansouri (@m_amin_mansouri):

Better late to the party than never. We have 3 works at #NeurIPS2023 , but sadly I’m not able to present any of them as the visa officer didn’t care to even look at my invitation/presentations/paper/etc the moment he learned about my nationality. (cont. in the thread)

Irina Rish (@irinarish):

Congrats to Jean-Christophe Gagnon-Audet, Kartik Ahuja, @653mjd and Guillaume Dumas - our paper on OoD time-series benchmarks (an under-researched modality when studying OoD) got a featured certification by TMLR and will also be presented at ICLR in May! See you in the WOODS ;) openreview.net/forum?id=mvftz…

Kartik Ahuja (@kartikahuja1):

Explore fundamental questions surrounding large language models by applying for a postdoctoral position with our Generalization Team at FAIR Paris. If you're attending ICLR, visit our booth to connect with my colleagues and learn more about this exciting opportunity.

Kartik Ahuja (@kartikahuja1):

This work delivers on both theory and practice—offering the sharpest provable compositionality guarantees I know of, alongside state‑of‑the‑art performance on tough compositional distribution‑shift benchmarks.

Sachin Goyal (@goyalsachin007):

1/ Excited to share the first in a series of my research updates on LLM pretraining 🚀.

Our new work shows *distilled pretraining*—increasingly used to train deployable models—has trade-offs:
✅ Boosts test-time scaling
⚠️ Weakens in-context learning
✨ Needs tailored data curation

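For context, distilled pretraining generally means training a student model against a teacher's softened output distribution rather than only the hard labels. A minimal sketch of that standard distillation objective (my illustration; the paper's exact recipe may differ):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation loss (illustrative sketch, not the paper's
    exact objective). Both logit tensors have shape (batch, vocab); the
    student is trained to match the teacher's softened distribution."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_logp = F.log_softmax(student_logits / t, dim=-1)
    # KL(teacher || student), scaled by t^2 to keep gradient magnitudes
    # comparable across temperatures.
    return F.kl_div(student_logp, teacher_probs, reduction="batchmean") * t * t
```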
Kartik Ahuja (@kartikahuja1):

Distilled pretraining isn’t just a trend—it is the future of pretraining. Scratch-trained models are on their way out (see Llama 4 Maverick, Gemma 3). Yet the science is still wide open. Amazing work here by our intern Sachin Goyal.

Sachin Goyal (@goyalsachin007):

🚨 Super excited to finally share our Safety Pretraining work — along with all the artifacts (safe data, models, code)! In this thread 🧵, I’ll walk through our journey — the key intermediate observations and lessons, and how they helped shape our final pipeline.