Sotiris Anagnostidis (@sanagnostidis) 's Twitter Profile
Sotiris Anagnostidis

@sanagnostidis

PhD at ETH Zürich. MLP-pilled 💊. Previously @Meta GenAI, @GoogleDeepMind, @Huawei, @ntua

ID: 1471221876693377038

Link: http://sanagnos.pages.dev/ · Joined: 15-12-2021 20:53:39

51 Tweets

165 Followers

455 Following

Dimitri von Rütte (@dvruette) 's Twitter Profile Photo

Attending #NeurIPS2023 in New Orleans this week to present OpenAssistant (arxiv.org/abs/2304.07327)! Happy to chat about open-source LLMs, personalized image generation, and more. DMs are open!

Gregor Bachmann (@gregorbachmann1) 's Twitter Profile Photo

I’ll be presenting "Scaling MLPs" at #NeurIPS2023, tomorrow (Wed) at 10:45am!
Hyped to discuss things like inductive bias, the bitter lesson, compute-optimality and scaling laws 👷⚖️📈

AK (@_akhaliq) 's Twitter Profile Photo

LIME: Localized Image Editing via Attention Regularization in Diffusion Models

paper page: huggingface.co/papers/2312.09…

Diffusion models (DMs) have gained prominence due to their ability to generate high-quality, varied images, with recent advancements in text-to-image generation.

Dimitri von Rütte (@dvruette) 's Twitter Profile Photo

🚨 Calling on all FABRIC users! We need your help to learn about how you’ve been using FABRIC. Help us by taking 5 minutes to fill out the survey. Haven’t tried FABRIC yet? Just try it using our Gradio demo! ✨👨‍🎨
📊 Survey: forms.gle/aMWLDW8xvyhkLb…
👾 Demo:

Dimitri von Rütte (@dvruette) 's Twitter Profile Photo

🚨📜 Announcing our latest work on LLM interpretability: We are able to control a model's humor, creativity, quality, truthfulness, and compliance by applying concept vectors to its hidden neural activations. 🧵
arxiv.org/abs/2402.14433

Dimitri von Rütte (@dvruette) 's Twitter Profile Photo

To try it out yourself and for technical implementation details, check out our HF space and GitHub. 🤗 Demo: huggingface.co/spaces/dvruett… 👾 Code: github.com/dvruette/conce…
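
For intuition, here is a minimal sketch of what steering a model with a concept vector can look like in PyTorch: a fixed vector is added to the hidden states of one transformer block via a forward hook. The model choice, layer index, scale, and the random `concept_vector` are illustrative assumptions, not the authors' implementation; see the linked GitHub repo for the real code.

```python
# Minimal sketch of concept steering via a forward hook.
# Model, layer, scale, and vector are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder causal LM
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

hidden_size = model.config.hidden_size
concept_vector = torch.randn(hidden_size)  # stand-in for a learned concept direction
concept_vector = concept_vector / concept_vector.norm()
strength = 4.0  # guidance scale; the sign flips the concept's direction

def steer(module, inputs, output):
    # GPT-2 blocks return a tuple; the hidden states come first.
    hidden = output[0] + strength * concept_vector.to(output[0].dtype)
    return (hidden,) + output[1:]

# Hook a middle block; which layer works best is an empirical question.
handle = model.transformer.h[6].register_forward_hook(steer)

inputs = tok("Tell me a story:", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=40)
print(tok.decode(out[0], skip_special_tokens=True))
handle.remove()
```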

Bobby (@bobby_he) 's Twitter Profile Photo

Outlier Features (OFs) aka “neurons with big features” emerge in standard transformer training & prevent benefits of quantisation 🥲 but why do OFs appear & which design choices minimise them?

Our new work (+Lorenzo Noci, Daniele Paliotta, Imanol Schlag, T. Hofmann) takes a look 👀🧵
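
As a rough illustration of what “neurons with big features” means in practice, here is one hypothetical way to flag channels whose activation scale dwarfs the rest. The metric and threshold are my assumptions, not the paper's definition of an outlier feature.

```python
# Hypothetical sketch: flag channels whose peak activation dwarfs the
# typical channel, a common symptom of outlier features.
import torch

def outlier_channels(acts: torch.Tensor, ratio: float = 6.0):
    # acts: (tokens, hidden_dim) activations from one layer
    per_channel = acts.abs().amax(dim=0)   # max |activation| per channel
    typical = per_channel.median()         # typical channel scale
    return (per_channel > ratio * typical).nonzero().flatten()

acts = torch.randn(1024, 768)
acts[:, 42] *= 30.0                        # plant an artificial outlier
print(outlier_channels(acts))              # -> tensor([42])
```
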
Aurelien Lucchi (@aurelienlucchi) 's Twitter Profile Photo

The University of Basel, Switzerland, is offering an open-rank Professorship in AI and Foundation Models. For more information, visit this link: jobs.unibas.ch/offene-stellen….

Dimitri von Rütte (@dvruette) 's Twitter Profile Photo

We’re presenting our work on concept guidance at today’s 13:30 ICML poster session (poster #706). Come by and say hi! #ICML #ICML2024

Samuel Albanie 🇬🇧 (@samuelalbanie) 's Twitter Profile Photo

Are LLMs easily influenced?

Interesting work from Sotiris Anagnostidis and Jannis Bulian.

TL;DR: Having an LLM advocate for an answer in the prompt significantly influences predictions.
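
A hypothetical sketch of the kind of probe this describes: build the same question prompt with and without an “advocate” argument for one candidate answer, then compare the model's predictions. The wording and helper below are illustrative, not the paper's prompts.

```python
# Hypothetical prompt probe: does an advocated answer sway the model?
QUESTION = "What is the capital of Australia?"
CANDIDATES = ["Sydney", "Canberra"]

def build_prompt(question: str, advocated: str | None) -> str:
    prompt = f"Question: {question}\nOptions: {', '.join(CANDIDATES)}\n"
    if advocated is not None:
        # Persuasive argument for one (possibly wrong) candidate.
        prompt += (
            f"Expert argument: The answer is clearly {advocated}; "
            "it is the most widely cited choice.\n"
        )
    return prompt + "Answer:"

print(build_prompt(QUESTION, advocated=None))      # neutral baseline
print(build_prompt(QUESTION, advocated="Sydney"))  # advocated (wrong) answer
```
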
Bobby (@bobby_he) 's Twitter Profile Photo

Updated camera ready arxiv.org/abs/2405.19279. New results include:

- non-diagonal preconditioners (SOAP/Shampoo) minimise OFs compared to diagonal ones (Adam/AdaFactor)
- scaling to 7B params
- showing our methods to reduce OFs translate to ease of PTQ int8 quantisation

Check it out!
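
To see why outlier features and int8 post-training quantisation interact, here is a toy illustration (my own, not from the paper): with per-tensor absmax scaling, a single large activation inflates the scale and wastes the quantisation grid on ordinary values.

```python
# Toy illustration: one outlier activation inflates the absmax scale,
# so ordinary values lose resolution under per-tensor int8 quantisation.
import torch

def quantize_int8(x: torch.Tensor) -> torch.Tensor:
    scale = x.abs().max() / 127.0
    q = torch.clamp((x / scale).round(), -127, 127)
    return q * scale  # dequantised values

x = torch.randn(4096)
err_clean = (quantize_int8(x) - x).abs().mean()

x_outlier = x.clone()
x_outlier[0] = 100.0  # a single outlier activation
err_outlier = (quantize_int8(x_outlier) - x_outlier).abs().mean()

print(f"mean abs error without outlier: {err_clean:.4f}")
print(f"mean abs error with outlier:    {err_outlier:.4f}")  # much larger
```
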
Dimitri von Rütte (@dvruette) 's Twitter Profile Photo

🚨 NEW PAPER DROP! Wouldn't it be nice if LLMs could spot and correct their own mistakes? And what if we could do so directly from pre-training, without any SFT or RL? We present a new class of discrete diffusion models, called GIDD, that are able to do just that: 🧵1/12

Weronika Ormaniec (@wormaniec) 's Twitter Profile Photo

Ever wondered how the loss landscape of Transformers differs from that of other architectures? Or which Transformer components make its loss landscape unique?

With Sidak Pal Singh & Felix Dangel, we explore this via the Hessian in our #ICLR2025 spotlight paper!

Key insights 👇 1/8
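
For readers who want to poke at a loss landscape themselves, here is a minimal sketch of the standard Hessian-vector-product trick such analyses build on: power iteration to estimate the top Hessian eigenvalue. The toy model and data are illustrative stand-ins, not the paper's pipeline.

```python
# Minimal sketch: estimate the dominant Hessian eigenvalue (by magnitude)
# with power iteration on Hessian-vector products (double backward).
import torch

model = torch.nn.Linear(10, 1)  # toy stand-in for a real architecture
x, y = torch.randn(32, 10), torch.randn(32, 1)
loss = torch.nn.functional.mse_loss(model(x), y)

params = list(model.parameters())
grads = torch.autograd.grad(loss, params, create_graph=True)

def hvp(vecs):
    # Hessian-vector product H @ v via a second backward pass
    return torch.autograd.grad(grads, params, grad_outputs=vecs, retain_graph=True)

vecs = [torch.randn_like(p) for p in params]
for _ in range(50):  # power iteration
    vecs = hvp(vecs)
    norm = torch.sqrt(sum(v.pow(2).sum() for v in vecs))
    vecs = [v / norm for v in vecs]

# Rayleigh quotient v^T H v with a unit vector v
top_eig = sum((h * v).sum() for h, v in zip(hvp(vecs), vecs))
print(f"estimated top Hessian eigenvalue: {top_eig.item():.4f}")
```
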
Enea Monzio Compagnoni (@eneamc) 's Twitter Profile Photo

If you are at ICLR 2025, come by my poster tomorrow at 10:00 am! You’ll find me at Hall 3 + Hall 2B, poster #367! See you there! iclr.cc/virtual/2025/p… #ICLR2025

Artsiom Sanakoyeu (@artsiom_s) 's Twitter Profile Photo

Thrilled to share that our CVPR 2025 paper “𝐀𝐮𝐭𝐨𝐫𝐞𝐠𝐫𝐞𝐬𝐬𝐢𝐯𝐞 𝐃𝐢𝐬𝐭𝐢𝐥𝐥𝐚𝐭𝐢𝐨𝐧 𝐨𝐟 𝐃𝐢𝐟𝐟𝐮𝐬𝐢𝐨𝐧 𝐓𝐫𝐚𝐧𝐬𝐟𝐨𝐫𝐦𝐞𝐫𝐬” (ARD) has been selected as an Oral! ✨

Catch us at CVPR on Saturday, June 14
🗣 Oral Session 4A — 14:00-14:15, Karl Dean Ballroom

Edgar Schoenfeld (@schoenfeldedgar) 's Twitter Profile Photo

🚀 Want to speed up your image and video model inference?

Come see our highlight poster at #CVPR2025:
"FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute"

📍 Today at 4 PM, ExHall D – Poster #205
🔗 arxiv.org/abs/2502.20126

Work done

Neil Houlsby (@neilhoulsby) 's Twitter Profile Photo

📣 Anthropic Zurich is hiring again 🇨🇭 The team has been shaping up fantastically over the last few months, and I have re-opened applications for pre-training. We welcome applications from anywhere along the "scientist/engineer spectrum". If building the future of AI for the