Joan Serrà (@serrjoa) 's Twitter Profile
Joan Serrà

@serrjoa

Does research on machine learning at Sony AI, Barcelona. Works on audio analysis, synthesis, and retrieval. Likes tennis, music, and wine.

ID: 3792459557

Link: https://serrjoa.github.io/ · Joined: 27-09-2015 11:48:31

3.3K Tweets

2.2K Followers

555 Following

Anshul Nasery (@anshulnasery) 's Twitter Profile Photo

Model merging is a great way to combine multiple models' abilities; however, existing methods only work with models fine-tuned from the same initialization and produce models of the same size. Our new work - PLeaS (at #CVPR2025) aims to resolve both these issues 🧵.

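The tweet doesn't include code, so here is a minimal sketch of the baseline it alludes to: plain weight averaging, which only works when all checkpoints were fine-tuned from the same initialization and which necessarily returns a model of the same size. That is exactly the restriction PLeaS is said to lift; the function name and toy usage are illustrative assumptions, not PLeaS itself.

```python
import torch

def average_merge(state_dicts, weights=None):
    """Merge same-architecture fine-tunes by (weighted) parameter averaging."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        # Element-wise average: assumes every checkpoint shares the same
        # keys and shapes, i.e. the same initialization and architecture.
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Toy usage with two "fine-tunes" sharing a key layout:
sd_a = {"w": torch.ones(2, 2)}
sd_b = {"w": 3 * torch.ones(2, 2)}
print(average_merge([sd_a, sd_b])["w"])  # -> a 2x2 tensor of 2.0
```
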
Ricard Marxer (ricardmp@sigmoid.social) (@ricardmp) 's Twitter Profile Photo

#Orcas vocal complexity

doi.org/10.1016/j.ecoi…

We analyse #bioacoustics recordings spanning over 5 years of orcas in the wild. DL classification followed by complexity measures is contrasted with pod sizes to probe the social complexity hypothesis in #cetaceans
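The tweet only names the two pipeline stages, so the following is a toy sketch under stated assumptions: a trained classifier has already assigned a call type to each vocalization, and vocal complexity is summarized as the Shannon entropy of each pod's call-type distribution, which is then correlated with pod size. The pod data and the entropy choice are illustrative, not the paper's exact measures.

```python
import numpy as np

def call_type_entropy(predicted_labels):
    """Shannon entropy (bits) of the empirical call-type distribution."""
    _, counts = np.unique(predicted_labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Hypothetical per-pod call-type predictions from the DL classifier:
rng = np.random.default_rng(0)
pods = {
    "pod_A": {"size": 6,  "calls": rng.integers(0, 8,  500)},
    "pod_B": {"size": 10, "calls": rng.integers(0, 15, 500)},
    "pod_C": {"size": 14, "calls": rng.integers(0, 25, 500)},
}
sizes = [p["size"] for p in pods.values()]
entropies = [call_type_entropy(p["calls"]) for p in pods.values()]
print(np.corrcoef(sizes, entropies)[0, 1])  # complexity vs. pod size
```
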
Luca Ambrogioni (@lucaamb) 's Twitter Profile Photo

1/2) It's finally out on arXiv: Feedback guidance of generative diffusion models!

We derived an adaptive guidance method from first principles that regulates the amount of guidance based on the current state of the generation.

Complex prompts are highly guided while simple ones are almost guidance-free
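The derivation itself is in the paper; the sketch below only illustrates the headline idea of state-dependent guidance, using classifier-free guidance as the base mechanism. The feedback signal chosen here (the normalized disagreement between conditional and unconditional noise predictions) is an assumption for the example, not the paper's derived rule.

```python
import torch

def adaptive_cfg_step(eps_cond, eps_uncond, base_scale=1.0, gain=2.0):
    """Classifier-free guidance whose scale depends on the current state.

    When conditional and unconditional predictions disagree strongly
    (a "complex" prompt at this step), guidance is amplified; when they
    almost agree (a "simple" prompt), the step is nearly guidance-free.
    """
    gap = (eps_cond - eps_uncond).flatten(1).norm(dim=1, keepdim=True)
    ref = eps_uncond.flatten(1).norm(dim=1, keepdim=True) + 1e-8
    scale = base_scale + gain * (gap / ref).view(-1, 1, 1, 1)  # state-dependent
    return eps_uncond + scale * (eps_cond - eps_uncond)

# Toy call on a batch of 2 noise predictions for 3x8x8 latents:
eps_c, eps_u = torch.randn(2, 3, 8, 8), torch.randn(2, 3, 8, 8)
guided = adaptive_cfg_step(eps_c, eps_u)
```
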
Andrei Bursuc (@abursuc) 's Twitter Profile Photo

Arash Vahdat ✈️ #CVPR2025 Heavy-tailed diffusion models: a few lines of code to improve the ability of your diffusion model to handle extreme events in heavy-tailed distributions. tl;dr: replace your Gaussian distribution with a tuned Student-t one. #uncv2025 #cvpr2025
Randall Balestriero (@randall_balestr) 's Twitter Profile Photo

Language/tokens provide a compressed space that is aligned with current LLM evaluation tasks (see our Next Token Perception Score arxiv.org/abs/2505.17169) while pixels are raw unfiltered sensing of the world known to be misaligned with perception tasks (see our paper with

Marta Skreta (@martoskreto) 's Twitter Profile Photo

🧵(1/6) Delighted to share our ICML Conference 2025 spotlight paper: the Feynman-Kac Correctors (FKCs) in Diffusion

Picture this: it’s inference time and we want to generate new samples from our diffusion model. But we don’t want to just copy the training data – we may want to sample

Joan Serrà (@serrjoa) 's Twitter Profile Photo

Writing (code, essays, emails...) makes you think. So does reading long texts and summarizing them (even if they are not well written). Do you want to think? Or do you want to offload that?

Alexia Jolicoeur-Martineau (@jm_alexia) 's Twitter Profile Photo

When did "multi-modal" become image + text? Meanwhile, image + text + audio is now called Omni-modal. "Omni" means "all", so it stands for "all modalities". As if this represented all modalities!

Bao Pham (@baophamhq) 's Twitter Profile Photo

Diffusion models create novel images, but they can also memorize samples from the training set. How do they blend stored features to synthesize novel patterns?  Our new work shows that diffusion models behave like Dense Associative Memory: in the low training data regime (number

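For readers unfamiliar with the reference, here is a compact sketch of Dense Associative Memory (modern Hopfield) retrieval dynamics, the mechanism the tweet compares diffusion models to: a query is iteratively pulled toward a softmax-weighted blend of stored patterns. With few patterns and a sharp softmax it snaps onto a single memory (memorization); otherwise it settles on blends of memories (novel combinations). Parameter names and values are illustrative.

```python
import torch

def dam_update(query, memories, beta=4.0, steps=3):
    """memories: (N, d) stored patterns; query: (d,) probe vector."""
    x = query
    for _ in range(steps):
        weights = torch.softmax(beta * (memories @ x), dim=0)  # pattern similarities
        x = weights @ memories                                 # blended retrieval
    return x

memories = torch.randn(16, 64)                # toy "training set"
retrieved = dam_update(torch.randn(64), memories)
```
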
Daniel Arteaga (@dnlrtg.bsky.social) (@dnlrtg) 's Twitter Profile Photo

At Dolby Barcelona we are offering an award for outstanding scientific papers in sound research. More info on Bluesky: bsky.app/profile/dnlrtg…

jack morris (@jxmnop) 's Twitter Profile Photo

In the beginning, there was BERT.

Eventually BERT gave rise to RoBERTa.  Then, DeBERTa.  Later, ModernBERT.

And now, NeoBERT.  The new state-of-the-art small-sized encoder:
Sauers (@sauers_) 's Twitter Profile Photo

Wow. This is the reasoning the judge used to say that Anthropic training is fair use:

"But to make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways
Julien Guinot (@juj_guinot) 's Twitter Profile Photo

A thread by Alain Riou about our recent ISMIR Conference work, SLAP!

paper: arxiv.org/abs/2506.17815
code: github.com/Pliploop/SLAP/…

TLDR: Joint multimodal models without negatives (No more contrastive 😈)
- Better performance!
- Better scalability!
- Closed modality gap!
🧵⏬
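The exact SLAP objective is in the linked paper and code; below is only a generic sketch of what "joint multimodal training without negatives" can look like, in a BYOL-style setup: each modality's embedding regresses onto a stop-gradient target from the other modality through a small predictor, so no negative pairs (and no contrastive loss) are needed. All module names and sizes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def negative_free_loss(audio_emb, text_emb, predictor):
    """Symmetric regression between modalities; no negative pairs."""
    p_a = F.normalize(predictor(audio_emb), dim=-1)
    p_t = F.normalize(predictor(text_emb), dim=-1)
    z_a = F.normalize(audio_emb.detach(), dim=-1)  # stop-gradient targets
    z_t = F.normalize(text_emb.detach(), dim=-1)
    loss_at = 2 - 2 * (p_a * z_t).sum(-1)          # audio -> text
    loss_ta = 2 - 2 * (p_t * z_a).sum(-1)          # text -> audio
    return (loss_at + loss_ta).mean()

predictor = torch.nn.Sequential(torch.nn.Linear(128, 128), torch.nn.ReLU(),
                                torch.nn.Linear(128, 128))
loss = negative_free_loss(torch.randn(4, 128), torch.randn(4, 128), predictor)
```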