Pau Rodríguez (@prlz77)'s Twitter Profile
Pau Rodríguez

@prlz77

Research Scientist @Apple MLR on #machine_learning understanding and robustness. @ELLISforEurope member. Previously at ServiceNow and Element AI in Montréal.

ID: 618193431

Link: http://prlz77.github.io · Joined: 25-06-2012 13:55:29

627 Tweets

1.1K Followers

1.1K Following

Eeshan Gunesh Dhekane (@eeshandhekane)'s Twitter Profile Photo

Parameterized Transforms 🚀

Here is a new tool that offers a modular and extensible implementation of torchvision-based image augmentations and provides access to their parameterization. [1/5]
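
As a rough illustration of the idea (a toy sketch, not the actual Parameterized Transforms API; the class and method names below are hypothetical), an augmentation can sample its parameters explicitly and return them alongside the output, making it both inspectable and replayable:

```python
import torch
import torchvision.transforms.functional as F

class ParameterizedRotation:
    """Toy augmentation that exposes its own parameterization."""

    def __init__(self, max_degrees: float = 30.0):
        self.max_degrees = max_degrees

    def sample_params(self) -> torch.Tensor:
        # Sample a rotation angle uniformly in [-max_degrees, +max_degrees].
        return (torch.rand(1) * 2 - 1) * self.max_degrees

    def __call__(self, img, params=None):
        # Passing `params` back in replays the exact same augmentation.
        if params is None:
            params = self.sample_params()
        return F.rotate(img, angle=params.item()), params

img = torch.rand(3, 224, 224)
aug = ParameterizedRotation(max_degrees=30.0)
out, params = aug(img)             # fresh random parameters
out2, _ = aug(img, params=params)  # deterministic replay
```
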
Nauseam (@chadnauseam)'s Twitter Profile Photo

"A calculator app? Anyone could make that." Not true. A calculator should show you the result of the mathematical expression you entered. That's much, much harder than it sounds. What I'm about to tell you is the greatest calculator app development story ever told.

"A calculator app? Anyone could make that."

Not true.

A calculator should show you the result of the mathematical expression you entered. That's much, much harder than it sounds.

What I'm about to tell you is the greatest calculator app development story ever told.
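
The difficulty the thread builds on is real: naive floating-point evaluation produces answers a calculator must never display, and even exact rationals only get you part of the way. A quick Python illustration (mine, not from the thread):

```python
# Binary floating point cannot represent 0.1 or 0.2 exactly.
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False

# Exact rational arithmetic fixes this case...
from fractions import Fraction
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True

# ...but not irrational intermediate results, which is why serious
# calculator engines turn to symbolic or constructive-real arithmetic.
import math
print(math.sqrt(2) * math.sqrt(2))  # 2.0000000000000004, not 2
```
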
Aayush Karan (@aakaran31)'s Twitter Profile Photo

Can machine learning models predict their own errors 🤯?

In a new preprint w/ Apple collaborators Aravind Gollakota, Parikshit Gopalan, Charlotte Peale, and Udi Wieder, we present a theory of loss prediction and show an equivalence with algorithmic fairness!

A thread (1/n):
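
To make the setup concrete, here is a toy sketch of loss prediction (my illustration on synthetic data, not the paper's construction): train a base classifier f, then fit an auxiliary model g to predict f's per-example loss from the same features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + 0.5 * rng.normal(size=2000) > 0).astype(int)

# Base model f, trained on the first half of the data.
f = LogisticRegression().fit(X[:1000], y[:1000])

# Per-example log-loss of f on the second half.
p = np.clip(f.predict_proba(X[1000:])[:, 1], 1e-12, 1 - 1e-12)
loss = -(y[1000:] * np.log(p) + (1 - y[1000:]) * np.log(1 - p))

# Loss predictor g estimates where f will do badly; the paper relates how
# well such predictors can exist to calibration-style fairness properties of f.
g = DecisionTreeRegressor(max_depth=3).fit(X[1000:], loss)
print(g.predict(X[1000:1005]))  # predicted losses for a few examples
```
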
Juan A. Rodríguez 💫 (@joanrod_ai)'s Twitter Profile Photo

I’m excited to announce that 💫StarVector has been accepted at CVPR 2025! Over a year in the making, StarVector opens a new paradigm for Scalable Vector Graphics (SVG) generation by harnessing multimodal LLMs to generate SVG code that aesthetically mirrors input images and text.

AK (@_akhaliq)'s Twitter Profile Photo

StarVector is out on Hugging Face

StarVector is a foundation model for generating Scalable Vector Graphics (SVG) code from images and text. It utilizes a Vision-Language Modeling architecture to understand both visual and textual inputs, enabling high-quality vectorization
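
A hypothetical usage sketch (the model id, processor fields, and generation interface below are assumptions; consult the model card for the actual API):

```python
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "starvector/starvector-1b-im2svg"  # assumed model id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("icon.png")
inputs = processor(images=image, return_tensors="pt")
svg_tokens = model.generate(**inputs, max_new_tokens=512)
svg_code = processor.batch_decode(svg_tokens, skip_special_tokens=True)[0]
print(svg_code)  # expected: SVG markup approximating the input image
```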

Juan A. Rodríguez 💫 (@joanrod_ai)'s Twitter Profile Photo

You can try the official StarVector demo in Hugging Face Spaces:
1B 🔗 huggingface.co/spaces/starvec…
8B 🔗 huggingface.co/spaces/starvec…

Juan A. Rodríguez 💫 (@joanrod_ai)'s Twitter Profile Photo

Following the excitement around our recent 💫 StarVector work, we're thrilled to introduce our latest research on benchmarking UI Vision Agents. We present UI-Vision, a benchmark designed to evaluate multimodal LLMs in their UI perception, spatial reasoning, and interaction

Aran Komatsuzaki (@arankomatsuzaki)'s Twitter Profile Photo

Scaling Laws for Native Multimodal Models

- Early-fusion exhibits stronger perf at lower param counts, is more efficient to train, and is easier to deploy, compared w/ late fusion.
- Incorporating MoEs allows for models that learn modality-specific weights, significantly
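
To make the early- vs. late-fusion distinction concrete, here is a toy PyTorch contrast (my illustration, not the paper's code): early fusion runs one shared transformer over the interleaved token sequence, while late fusion encodes each modality separately and merges features at the end.

```python
import torch
import torch.nn as nn

class EarlyFusion(nn.Module):
    """One transformer consumes a single mixed sequence of image+text tokens."""
    def __init__(self, d=256):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, image_tokens, text_tokens):
        # Both modalities share all parameters from the first layer on.
        return self.backbone(torch.cat([image_tokens, text_tokens], dim=1))

class LateFusion(nn.Module):
    """Separate per-modality encoders; features only meet at the end."""
    def __init__(self, d=256):
        super().__init__()
        make = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True), 2)
        self.image_enc, self.text_enc = make(), make()
        self.fuse = nn.Linear(2 * d, d)

    def forward(self, image_tokens, text_tokens):
        zi = self.image_enc(image_tokens).mean(dim=1)  # pooled image feature
        zt = self.text_enc(text_tokens).mean(dim=1)    # pooled text feature
        return self.fuse(torch.cat([zi, zt], dim=-1))
```
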
Pau Rodríguez (@prlz77)'s Twitter Profile Photo

Our work on fine-grained control of LLMs and diffusion models via Activation Transport will be presented at ICLR 2025 as a spotlight ✨ Check out our new blog post: machinelearning.apple.com/research/trans…
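
For intuition, a minimal sketch of activation steering in the spirit of this line of work (an oversimplification with assumed details, not the paper's estimator): fit a per-coordinate affine map from "source" activations to "target" activations, then apply it at inference with an interpolation strength lam.

```python
import torch

def fit_affine_transport(src_acts, tgt_acts, eps=1e-6):
    # Per-coordinate affine map matching mean and standard deviation.
    a = tgt_acts.std(0) / (src_acts.std(0) + eps)
    b = tgt_acts.mean(0) - a * src_acts.mean(0)
    return a, b

def apply_transport(h, a, b, lam=1.0):
    # lam in [0, 1] interpolates between no intervention and full transport,
    # which is what gives fine-grained control over the effect's strength.
    return (1 - lam) * h + lam * (a * h + b)

# Usage: collect activations at one layer for two prompt sets (e.g. neutral
# vs. desired style), fit the map offline, apply it in a forward hook.
src = torch.randn(512, 768)        # stand-in "source" activations
tgt = torch.randn(512, 768) + 0.5  # stand-in "target" activations
a, b = fit_affine_transport(src, tgt)
steered = apply_transport(torch.randn(1, 768), a, b, lam=0.7)
```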

Gabriele Berton (@gabriberton)'s Twitter Profile Photo

Apparently you can trick an LLM to believe it did its thinking, and results improve by a lot (up to 40%)!

In what the authors call NoThinking, they just force the string "Thinking: Okay, I think I have finished thinking"

From the paper: Reasoning Models Can Be Effective Without Thinking
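
A minimal sketch of the trick (the forced string comes from the tweet; the chat template and tags below are assumptions):

```python
FAKE_THINKING = "Thinking: Okay, I think I have finished thinking"

def build_nothinking_prompt(question: str) -> str:
    # Prefill the assistant turn with an already-"completed" thinking block,
    # so the reasoning model decodes the final answer directly.
    return (
        f"User: {question}\n"
        f"Assistant: <think>\n{FAKE_THINKING}\n</think>\n"
    )

print(build_nothinking_prompt("What is 17 * 24?"))
```
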
Patrik Reizinger (@rpatrik96)'s Twitter Profile Photo

Many design choices in (self-)supervised classification lead to identifiability. If you are curious how and when optimizing cross entropy leads to theoretical guarantees, check out our oral on Saturday in Session 5C or the poster from 3pm. 

Details: iclr.cc/virtual/2025/o…
Jason Ramapuram (@jramapuram)'s Twitter Profile Photo

Stop by poster #596 at 10A-1230P tomorrow (Fri 25 April) at #ICLR2025 to hear more about Sigmoid Attention! 

We just pushed 8 trajectory checkpoints each for two 7B LLMs for Sigmoid Attention and a 1:1 Softmax Attention (trained with a deterministic dataloader for 1T tokens):

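The checkpoint list is cut off in this capture. For background, here is a minimal sketch of sigmoid attention itself (my simplification: the -log(n) score bias reflects the paper's stability recipe, other details are assumptions, and causal masking is omitted):

```python
import math
import torch

def sigmoid_attention(q, k, v):
    # q, k, v: (batch, heads, seq, head_dim)
    b, h, n, d = q.shape
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)
    # Elementwise sigmoid with a -log(n) bias replaces the softmax, so each
    # key contributes independently rather than competing in a normalization.
    attn = torch.sigmoid(scores - math.log(n))
    return attn @ v

q = k = v = torch.randn(2, 8, 128, 64)
out = sigmoid_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```
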
Pau Rodríguez (@prlz77)'s Twitter Profile Photo

Is input selectivity the secret sauce of Mamba, or is there more 🤔? In our new ICML 2025 paper, we show that input selectivity provides an edge, but conv and gating are key for associative recall! Check out Teresa Huang's thread for more insights! Paper: arxiv.org/abs/2506.11891
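
As a toy illustration of the components being compared (made-up names and a deliberately simplified recurrence, not the actual Mamba code): a block can combine an input-selective recurrence with a short causal conv and multiplicative gating.

```python
import torch
import torch.nn as nn

class ToySelectiveBlock(nn.Module):
    def __init__(self, d=64):
        super().__init__()
        self.conv = nn.Conv1d(d, d, kernel_size=4, padding=3, groups=d)
        self.sel = nn.Linear(d, d)   # input-dependent decay ("selectivity")
        self.gate = nn.Linear(d, d)  # multiplicative gating branch

    def forward(self, x):  # x: (batch, seq, d)
        # Short causal conv mixes nearby tokens before the recurrence.
        u = self.conv(x.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        h, out = torch.zeros_like(x[:, 0]), []
        for t in range(x.size(1)):
            a = torch.sigmoid(self.sel(u[:, t]))  # input-selective retention
            h = a * h + (1 - a) * u[:, t]         # recurrent state update
            out.append(h * torch.sigmoid(self.gate(x[:, t])))  # gating
        return torch.stack(out, dim=1)

y = ToySelectiveBlock()(torch.randn(2, 16, 64))
print(y.shape)  # torch.Size([2, 16, 64])
```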

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

Apple Intelligence Foundation Language Models: Tech Report 2025

"We introduce two multilingual, multimodal foundation language models that power Apple Intelligence features across Apple devices and services: i a 3B-parameter on-device model optimized for Apple silicon through
Dan Busbridge (@danbusbridge)'s Twitter Profile Photo

Uncertainty methods and correctness metrics often share "mutual bias" (systematic errors from a common confounder like response length), skewing LLM evaluations. A new paper from my colleagues shows that "LM-as-a-judge" evaluation is more robust and human-aligned. Important work.
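
A toy numeric illustration of mutual bias (synthetic data, not the paper's): when response length drives both the uncertainty score and the correctness metric, the two agree spuriously, and controlling for the confounder removes the effect.

```python
import numpy as np

rng = np.random.default_rng(0)
length = rng.normal(size=5000)
uncertainty = length + rng.normal(size=5000)   # confounded by length
correctness = -length + rng.normal(size=5000)  # confounded by length

# The shared confounder alone creates a strong (spurious) association.
print(np.corrcoef(uncertainty, correctness)[0, 1])  # ~ -0.5

# Removing the known length effect (in practice, regress it out)
# leaves essentially no relationship.
u_res = uncertainty - length
c_res = correctness + length
print(np.corrcoef(u_res, c_res)[0, 1])  # ~ 0
```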