Lukas Thede (@lukas_thede) Twitter Tweets • TwiCopy

Vishaal Udandarao

8 months ago

🚀New Paper! arxiv.org/abs/2504.07086 Everyone’s celebrating rapid progress in math reasoning with RL/SFT. But how real is this progress? We re-evaluated recently released popular reasoning models—and found reported gains often vanish under rigorous testing!! 👀 🧵👇

264 likes · 4 replies · 53 reposts

Explainable Machine Learning

@explainableml

8 months ago

📢 We’ve landed in Singapore for #ICLR2025! The EML group is presenting 4 exciting papers — come say hi at our poster sessions! 👇Let's chat! More details in the thread — see you there! 🌟

18 likes · 1 reply · 6 reposts

Tom Hartvigsen

@tom_hartvigsen

7 months ago

Excited we have some papers accepted to ICML Conference in collaborations with some tremendous folks 🎉 Looking forward to Vancouver to discuss model editing for LLMs/VLMs and improving medical benchmarking!

46 likes · 1 reply · 6 reposts

Explainable Machine Learning

@explainableml

7 months ago

🚨Happy to announce that one paper, "Understanding the Limits of Lifelong Knowledge Editing in LLMs", is accepted at #icml2025! Congrats to the wonderful authors Lukas Thede, Karsten Roth, Matthias Bethge, Zeynep Akata, and Tom Hartvigsen. 👇 Highlights in the thread

23 likes · 3 replies · 4 reposts

Karsten Roth

@confusezius

6 months ago

In Nashville for my last PhD conference 🥲. Come join today 10:30-12:30 in Hall D (#391) to talk insights, tips and tricks to modify pretraining for representation reuse - scalably. 🚀Joint work w/ Zeynep Akata, Dima Damen @CVPR 2025, Ivana Balazevic & Olivier Hénaff while at Google DeepMind.

22 likes · 2 replies · 8 reposts

Ori Press

@ori_press

5 months ago

Do language models have algorithmic creativity? To find out, we built AlgoTune, a benchmark challenging agents to optimize 100+ algorithms like gzip compression, AES encryption and PCA. Frontier models struggle, finding only surface-level wins. Lots of headroom here!🧵⬇️

141 likes · 6 replies · 54 reposts

Jonathan Richard Schwarz

@schwarzjn_

5 months ago

✨ New ACL'25 Oral: Transforming dense LLMs into semantic MoEs during IFT! 📜 🎯 Key wins: • SOTA performance vs regular IFT & upcycling • Input-dependent expert-routing & merging • Learning WHERE to upcycle & HOW to specialize - no manual design! 🔗tinyurl.com/yae55x5e

21 likes · 1 reply · 5 reposts

Adhiraj Ghosh

@adhiraj_ghosh98

4 months ago

Excited to be in Vienna for #ACL2025🇦🇹! You'll find Sebastian Dziadzio and I by our ONEBench poster, so do drop by!

🗓️Wed, July 30, 11-12:30 CET
📍Hall 4/5

I’m also excited to talk about lifelong and personalised benchmarking, data curation and vision-language in general! Let’s connect!

16 likes · 0 replies · 5 reposts

Luca Eyring @ICLR

@lucaeyring

4 months ago

Reward hacking is challenging when fine-tuning few-step Diffusion models. Direct fine-tuning on rewards can create artifacts that game metrics while degrading visual quality. We propose Noise Hypernetworks as a theoretically grounded solution, inspired by test-time optimization.

343 likes · 8 replies · 51 reposts

Explainable Machine Learning

@explainableml

2 months ago

🔥We celebrate 3 papers accepted to NeurIPS Conference 2025, see you in San Diego! 🥳Topics include diffusion models, sparse autoencoders (SAEs) and neural chunking. See the thread for highlights👇

23 likes · 1 reply · 5 reposts

Tom Hartvigsen

@tom_hartvigsen

2 months ago

📢Please retweet: We are hiring a **Postdoc** at UVA to work on Continually Monitoring and Updating Multi-modal Medical AI Models! Great opportunity to design impactful methods alongside great collaborators Ahmed Alaa and Roxana Daneshjou MD/PhD More info: tinyurl.com/ad7ptmvp

35 likes · 0 replies · 13 reposts