Rishika Bhagwatkar (@rishika2110) 's Twitter Profile
Rishika Bhagwatkar

@rishika2110

MSc in CS at Mila Quebec and UdeM

ID: 1374388605490139147

Joined: 23-03-2021 15:52:47

62 Tweets

135 Followers

268 Following

Ekaterina Lobacheva (@katelobacheva) 's Twitter Profile Photo

Did you know that learning rate affects which examples are easy or hard for a model? And this difference is meaningful and relates to generalization. Stop by our poster at the SciForDL Workshop #NeurIPS2024 tomorrow to learn more!

Paper: openreview.net/forum?id=NeetG…

1/8
Sabyasachi Sahoo (@saby_tweets) 's Twitter Profile Photo

🚀 Test-Time Adaptation (TTA) & Layer Selection! 🧵 (1/N)

TTA helps LLMs like GPT-4o & DeepSeek-V2 via test-time compute scaling. But TTA fails on hard Out-of-Distribution (OOD) tasks! 😩

Our AAAI 2025 paper introduces GALA, a Gradient-Aligned Layer Adaptation framework to fix
Tejas Vaidhya (@imtejas13) 's Twitter Profile Photo

🎉 Thrilled to share that our paper "Surprising effectiveness of pretraining ternary language models at scale" earned a spotlight at #ICLR2024! We dive into Ternary Language Models (TriLMs), systematically studying their training feasibility and scaling laws against FloatLMs.

Benjamin Thérien (@benjamintherien) 's Twitter Profile Photo

How do MoE transformers, like DeepSeek, behave under distribution shifts? Do their routers collapse? Can they still match full re-training performance? Excited to present “Continual Pre-training of MoEs: How robust is your router?”!🧵arxiv.org/abs/2503.05029 1/N

Accepted papers at TMLR (@tmlrpub) 's Twitter Profile Photo

Interpreting Neurons in Deep Vision Networks with Language Models
Nicholas Bai, Rahul Ajay Iyer, Tuomas Oikarinen, Akshay R. Kulkarni, Tsui-Wei Weng. Action editor: Antoine Ledent.
openreview.net/forum?id=x1dXv… #neuron #neurons #deep

Benjamin Thérien (@benjamintherien) 's Twitter Profile Photo

Llama4 MoEs just dropped! Now you're planning to continually pre-train Scout or Maverick on your data. BUT, you're not sure how the distribution shift may affect the MoE's router? Our new paper has you covered! x.com/benjamintherie…

Aniket Didolkar (@aniket_d98) 's Twitter Profile Photo

✈️ I am travelling to Singapore 🇸🇬 for #ICLR 2025. I will be presenting 1 paper, details in 🧵

I will also be at the Meta booth on 24th and 25th from 10-12. Come chat about self supervised learning, the student/visiting researcher program at FAIR or anything in general!
Gopeshh Subbaraj (@gopeshh1) 's Twitter Profile Photo

1/ Most RL methods assume a turn-based setup: the agent acts, then the environment responds. But in the real world, the environment doesn't wait. In real-time RL, slow inference means actions arrive late or are missed entirely. This leads to two key challenges:
• Inaction Regret
• Delay Regret
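The turn-based vs. real-time distinction above can be illustrated with a toy interaction loop (a hypothetical sketch for intuition, not code from the paper): when one action computation spans several environment ticks, the environment keeps advancing and those ticks pass without a fresh action.

```python
# Toy illustration (hypothetical, not from the paper): in real-time RL,
# the environment advances every tick whether or not the agent has acted.

def realtime_rollout(total_ticks, inference_ticks):
    """Count ticks where a fresh action is applied vs. missed.

    inference_ticks: how many environment ticks one action computation takes.
    An action started at tick t only becomes available at t + inference_ticks;
    every tick in between, the environment moves on without a new action.
    """
    acted, missed = 0, 0
    next_ready = 0  # tick at which the current inference finishes
    for t in range(total_ticks):
        if t >= next_ready:   # agent is free: act, then start a new inference
            acted += 1
            next_ready = t + inference_ticks
        else:                 # still computing: the environment doesn't wait
            missed += 1
    return acted, missed

# With 1-tick inference all 100 ticks get a fresh action: (100, 0).
# With 4-tick inference only every 4th tick does: (25, 75).
```

In the turn-based abstraction the environment effectively pauses during inference (equivalent to `inference_ticks = 1` here); the missed ticks are what the thread's "inaction" and "delay" regret notions measure the cost of.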

Arnav Jain (@arnavkj95) 's Twitter Profile Photo

📢 Come say hi at our SFM poster at #ICLR2025, Poster Session 5 – #572!

We’re presenting a method for Inverse Reinforcement Learning via Successor Feature Matching — a non-adversarial approach that works without action labels.

Excited to share and chat!
Arjun Ashok (@arjunashok37) 's Twitter Profile Photo

Context is Key🗝️ is accepted at ICML 2025! 📈 

Let's catch up if you'll be at ICML 🛬

See the poster and tweet thread below for a preview of CiK 👇
x.com/arjunashok37/s…

And stay tuned for new results ;)
Divyat Mahajan (@divyat09) 's Twitter Profile Photo

Happy to share that Compositional Risk Minimization has been accepted at #ICML2025

📌Extensive theoretical analysis along with a practical approach for extrapolating classifiers to novel compositions!

📜 arxiv.org/abs/2410.06303
francesco croce (@fra__31) 's Twitter Profile Photo

📃 In our new paper, we introduce FuseLIP, an encoder for multimodal embedding. We use early fusion of modalities to train a single transformer with a contrastive + masked (multimodal) modeling loss. More details 👇

Sara Ghazanfari (@saraghznfri) 's Twitter Profile Photo

🚨How to incorporate temporal grounding into the reasoning steps of video LLMs?

📃We’re excited to introduce Chain-of-Frames (CoF), a simple method to improve reasoning via explicit frame references!  🧠✨

Big thanks to my co-authors, francesco croce, N. Flammarion, P. Krishnamurthy,
francesco croce (@fra__31) 's Twitter Profile Photo

We just released Chain-of-Frames: explicitly referencing relevant frames while reasoning improves the performance of video LLMs across benchmarks! Check it out 👇

Majdi Hassan (@majdi_has) 's Twitter Profile Photo

(1/n)🚨You can train a model solving DFT for any geometry almost without training data!🚨 Introducing Self-Refining Training for Amortized Density Functional Theory — a variational framework for learning a DFT solver that predicts the ground-state solutions for different

Emiliano Penaloza (@emilianopp_) 's Twitter Profile Photo

Excited that our paper "Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization" was accepted to ICML 2025! We show how Preference Optimization can reduce the impact of noisy concept labels in CBMs. 🧵/9

Akshay Kulkarni (@ak70000) 's Twitter Profile Photo

⚡Interested in making pretrained generative models interpretable with minimal training and annotations? I'll be presenting our paper, Interpretable Generative Models through Post-hoc Concept Bottlenecks, at #CVPR2025 today in Poster Session 2 (4 pm - 6 pm CDT) at poster #266!

Akshay Kulkarni (@ak70000) 's Twitter Profile Photo

🚀 Paper: arxiv.org/abs/2503.19377
🚀 Code: github.com/Trustworthy-ML…
🚀 Project site: lilywenglab.github.io/posthoc-genera…

Thanks to my co-authors, Ge Yan, Chung-En Sun, Tuomas Oikarinen, and my PhD advisor Lily Weng

Massimo Caccia (@masscaccia) 's Twitter Profile Photo

🎉 Our paper “𝐻𝑜𝑤 𝑡𝑜 𝑇𝑟𝑎𝑖𝑛 𝑌𝑜𝑢𝑟 𝐿𝐿𝑀 𝑊𝑒𝑏 𝐴𝑔𝑒𝑛𝑡: 𝐴 𝑆𝑡𝑎𝑡𝑖𝑠𝑡𝑖𝑐𝑎𝑙 𝐷𝑖𝑎𝑔𝑛𝑜𝑠𝑖𝑠” got an 𝐨𝐫𝐚𝐥 at next week’s 𝗜𝗖𝗠𝗟 𝗪𝗼𝗿𝗸𝘀𝗵𝗼𝗽 𝗼𝗻 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗨𝘀𝗲 𝗔𝗴𝗲𝗻𝘁𝘀! 🖥️🧠

We present the 𝐟𝐢𝐫𝐬𝐭 𝐥𝐚𝐫𝐠𝐞-𝐬𝐜𝐚𝐥𝐞
P Shravan Nayak (@pshravannayak) 's Twitter Profile Photo

Excited to be at #ICML2025 presenting 3 papers!
📌 UI-Vision (Poster, July 15, Hall B2-B3)
📌 LIVS (Poster, July 16, Hall B2-B3)
📌 CulturalFrames @ MoFA Workshop (July 18)
If you're around and want to chat about agents, alignment, or cultural understanding, let's connect!