Wieland Brendel (@wielandbr)'s Twitter Profile
Wieland Brendel

@wielandbr

Machine Learning Researcher and Social Entrepreneur | Group Lead at ELLIS Institute Tübingen | Co-Founder maddox.ai | Co-Initiator bw-ki.de | @ellis.eu scholar

ID: 3466980795

Link: https://robustml.is.mpg.de | Joined: 28-08-2015 09:29:49

503 Tweets

4.4K Followers

190 Following

ELIAS (@elias_project)'s Twitter Profile Photo

The 8 #ELIASNodes were unveiled at the Falling Walls AI Night! These hubs in Amsterdam, Barcelona, Cambridge, Copenhagen, Munich, Potsdam, Tübingen & Zurich will foster #AIinnovation, connect academia with business & inspire a new generation of AI&Science value creators. 🌍🚀

Intelligent Systems (@mpi_is)'s Twitter Profile Photo

Day 4 of our advent calendar, showcasing #Polybot #robot, developed by Wieland Brendel and his Robust #MachineLearning Group. This flexible small robot could one day work in swarms, making it possible to realize sustainable and cost-effective farming: tuebingen.ai/news/want-an-a… #AI #KI

Vishaal Udandarao (@vishaal_urao)'s Twitter Profile Photo

🚀New Paper arxiv.org/abs/2412.06712 Model merging is all the rage these days: simply fine-tune multiple task-specific models and merge them at the end. Guaranteed perf boost! But wait, what if you get new tasks over time, sequentially? How do you merge your models over time? 🧵👇

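For readers unfamiliar with the setup: the simplest flavor of model merging is plain parameter averaging across fine-tuned checkpoints that share one architecture. A minimal sketch below; `merge_models` and the toy parameter dicts are illustrative assumptions, not the paper's actual method, which studies the harder sequential setting.

```python
def merge_models(state_dicts, weights=None):
    """Average the parameters of several fine-tuned models.

    Each state dict maps parameter names to values; all models must
    share one architecture. With no weights given, this is a uniform
    average; task-specific weights let some models dominate.
    """
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    return {
        name: sum(w * sd[name] for w, sd in zip(weights, state_dicts))
        for name in state_dicts[0]
    }

# Toy example with scalar "parameters" in place of real tensors:
model_a = {"w": 2.0, "b": 0.0}
model_b = {"w": 4.0, "b": 2.0}
merged = merge_models([model_a, model_b])  # {"w": 3.0, "b": 1.0}
```

In practice the values would be tensors (e.g. a PyTorch `state_dict`), and the open question the thread raises is how to choose the merge when checkpoints arrive one task at a time rather than all at once.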
Wieland Brendel (@wielandbr)'s Twitter Profile Photo

Does anyone know how OpenAI gets o3-mini to exceed 700 tokens/sec? I’ve only seen such speeds on specialized chips from Cerebras, SambaNova, or Groq Inc—but not on standard NVIDIA GPUs, which I assumed power OpenAI’s inference.

ELIAS (@elias_project)'s Twitter Profile Photo

🚀Introducing the ELIAS Node Barcelona! A key hub for AI knowledge transfer & innovation in Catalonia, bridging research & industry. Led by Dimosthenis Karatzas, with Meritxell Bassolas & Victor Rotellar, it unites top AI expertise & strategic innovation. 🔗 Learn more: elias-ai.eu/elias-node-bar…

Andreas Hochlehnert (@ahochlehnert)'s Twitter Profile Photo

CuratedThoughts: Data Curation for RL Datasets 🚀 Since DeepSeek-R1 introduced reasoning-based RL, datasets like Open-R1 & OpenThoughts have emerged for fine-tuning & GRPO. Our deep dive found major flaws: 25% of OpenThoughts had to be eliminated through data curation. Here's why 👇🧵

Wieland Brendel (@wielandbr)'s Twitter Profile Photo

New preprint out! Shoutout to Thaddäus Wiedemer and Prasanna Mayilvahanan for this clean work on what truly shapes LLM train-to-downstream performance! Turns out, architecture plays a shockingly small role—it's all about the data. Must-read for anyone thinking about scaling and…

Vishaal Udandarao (@vishaal_urao)'s Twitter Profile Photo

🚀New Paper! arxiv.org/abs/2504.07086 Everyone’s celebrating rapid progress in math reasoning with RL/SFT. But how real is this progress? We re-evaluated recently released popular reasoning models—and found reported gains often vanish under rigorous testing!! 👀 🧵👇

Jack Brady (@jackhb98)'s Twitter Profile Photo

I'm at #ICLR2025 presenting our work on compositional generalization! (Sat. 10 AM; Hall 3 + Hall 2B, #310) We provide a general and unifying theory of compositional generalization, based on a new principle called interaction asymmetry! 📜 arxiv.org/abs/2411.07784 (See 🧵)
