Pau Rodríguez (@prlz77)'s Twitter Profile
Pau Rodríguez

@prlz77

Research Scientist @Apple MLR on #machine_learning understanding and robustness. @ELLISforEurope member. Previously at ServiceNow and Element AI in Montréal.

ID: 618193431

Link: http://prlz77.github.io · Joined: 25-06-2012 13:55:29

627 Tweets

1.1K Followers

1.1K Following

Eeshan Gunesh Dhekane (@eeshandhekane)'s Twitter Profile Photo

Parameterized Transforms 🚀

Here is a new tool that offers a modular and extensible implementation of torchvision-based image augmentations and provides access to their parameterization. [1/5]
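
As a rough illustration of the idea (a toy sketch, not the actual Parameterized Transforms API; the class and method names below are hypothetical), an augmentation can sample its parameters explicitly and return them alongside the output, making it both inspectable and replayable:

```python
import torch
import torchvision.transforms.functional as F

class ParameterizedRotation:
    """Toy augmentation that exposes its own parameterization."""

    def __init__(self, max_degrees: float = 30.0):
        self.max_degrees = max_degrees

    def sample_params(self) -> torch.Tensor:
        # Sample a rotation angle uniformly in [-max_degrees, +max_degrees].
        return (torch.rand(1) * 2 - 1) * self.max_degrees

    def __call__(self, img, params=None):
        # Passing `params` back in replays the exact same augmentation.
        if params is None:
            params = self.sample_params()
        return F.rotate(img, angle=params.item()), params

img = torch.rand(3, 224, 224)
aug = ParameterizedRotation(max_degrees=30.0)
out, params = aug(img)             # fresh random parameters
out2, _ = aug(img, params=params)  # deterministic replay
```
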
Nauseam (@chadnauseam)'s Twitter Profile Photo

"A calculator app? Anyone could make that." Not true. A calculator should show you the result of the mathematical expression you entered. That's much, much harder than it sounds. What I'm about to tell you is the greatest calculator app development story ever told.

"A calculator app? Anyone could make that."

Not true.

A calculator should show you the result of the mathematical expression you entered. That's much, much harder than it sounds.

What I'm about to tell you is the greatest calculator app development story ever told.
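
The difficulty the thread builds on is real: naive floating-point evaluation produces answers a calculator must never display, and even exact rationals only get you part of the way. A quick Python illustration (mine, not from the thread):

```python
# Binary floating point cannot represent 0.1 or 0.2 exactly.
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False

# Exact rational arithmetic fixes this case...
from fractions import Fraction
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True

# ...but not irrational intermediate results, which is why serious
# calculator engines turn to symbolic or constructive-real arithmetic.
import math
print(math.sqrt(2) * math.sqrt(2))  # 2.0000000000000004, not 2
```
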
Aayush Karan (@aakaran31)'s Twitter Profile Photo

Can machine learning models predict their own errors 🤯?

In a new preprint w/ Apple collaborators Aravind Gollakota, Parikshit Gopalan, Charlotte Peale, and Udi Wieder, we present a theory of loss prediction and show an equivalence with algorithmic fairness!

A thread (1/n):
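
To make the setup concrete, here is a toy sketch of loss prediction (my illustration on synthetic data, not the paper's construction): train a base classifier f, then fit an auxiliary model g to predict f's per-example loss from the same features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + 0.5 * rng.normal(size=2000) > 0).astype(int)

# Base model f, trained on the first half of the data.
f = LogisticRegression().fit(X[:1000], y[:1000])

# Per-example log-loss of f on the second half.
p = np.clip(f.predict_proba(X[1000:])[:, 1], 1e-12, 1 - 1e-12)
loss = -(y[1000:] * np.log(p) + (1 - y[1000:]) * np.log(1 - p))

# Loss predictor g estimates where f will do badly; the paper relates how
# well such predictors can exist to calibration-style fairness properties of f.
g = DecisionTreeRegressor(max_depth=3).fit(X[1000:], loss)
print(g.predict(X[1000:1005]))  # predicted losses for a few examples
```
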
Juan A. Rodríguez 💫 (@joanrod_ai)'s Twitter Profile Photo

I’m excited to announce that 💫StarVector has been accepted at CVPR 2025! Over a year in the making, StarVector opens a new paradigm for Scalable Vector Graphics (SVG) generation by harnessing multimodal LLMs to generate SVG code that aesthetically mirrors input images and text.

AK (@_akhaliq)'s Twitter Profile Photo

StarVector is out on Hugging Face

StarVector is a foundation model for generating Scalable Vector Graphics (SVG) code from images and text. It utilizes a Vision-Language Modeling architecture to understand both visual and textual inputs, enabling high-quality vectorization
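
A hypothetical usage sketch (the model id, processor fields, and generation interface below are assumptions; consult the model card for the actual API):

```python
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "starvector/starvector-1b-im2svg"  # assumed model id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("icon.png")
inputs = processor(images=image, return_tensors="pt")
svg_tokens = model.generate(**inputs, max_new_tokens=512)
svg_code = processor.batch_decode(svg_tokens, skip_special_tokens=True)[0]
print(svg_code)  # expected: SVG markup approximating the input image
```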

Juan A. Rodríguez 💫 (@joanrod_ai)'s Twitter Profile Photo

You can try the official StarVector demo in Hugging Face Spaces:
1B 🔗 huggingface.co/spaces/starvec…
8B 🔗 huggingface.co/spaces/starvec…

Juan A. Rodríguez 💫 (@joanrod_ai)'s Twitter Profile Photo

Following the excitement around our recent 💫 StarVector work, we're thrilled to introduce our latest research on benchmarking UI Vision Agents. We present UI-Vision, a benchmark designed to evaluate multimodal LLMs in their UI perception, spatial reasoning, and interaction

Aran Komatsuzaki (@arankomatsuzaki)'s Twitter Profile Photo

Scaling Laws for Native Multimodal Models

- Early-fusion exhibits stronger perf at lower param counts, is more efficient to train, and is easier to deploy, compared w/ late fusion.
- Incorporating MoEs allows for models that learn modality-specific weights, significantly
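
To make the early- vs. late-fusion distinction concrete, here is a toy PyTorch contrast (my illustration, not the paper's code): early fusion runs one shared transformer over the interleaved token sequence, while late fusion encodes each modality separately and merges features at the end.

```python
import torch
import torch.nn as nn

class EarlyFusion(nn.Module):
    """One transformer consumes a single mixed sequence of image+text tokens."""
    def __init__(self, d=256):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, image_tokens, text_tokens):
        # Both modalities share all parameters from the first layer on.
        return self.backbone(torch.cat([image_tokens, text_tokens], dim=1))

class LateFusion(nn.Module):
    """Separate per-modality encoders; features only meet at the end."""
    def __init__(self, d=256):
        super().__init__()
        make = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True), 2)
        self.image_enc, self.text_enc = make(), make()
        self.fuse = nn.Linear(2 * d, d)

    def forward(self, image_tokens, text_tokens):
        zi = self.image_enc(image_tokens).mean(dim=1)  # pooled image feature
        zt = self.text_enc(text_tokens).mean(dim=1)    # pooled text feature
        return self.fuse(torch.cat([zi, zt], dim=-1))
```
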
Pau Rodríguez (@prlz77)'s Twitter Profile Photo

Our work on fine-grained control of LLMs and diffusion models via Activation Transport will be presented at ICLR 2025 as a spotlight ✨ Check out our new blog post: machinelearning.apple.com/research/trans…
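
For intuition, a minimal sketch of activation steering in the spirit of this line of work (an oversimplification with assumed details, not the paper's estimator): fit a per-coordinate affine map from "source" activations to "target" activations, then apply it at inference with an interpolation strength lam.

```python
import torch

def fit_affine_transport(src_acts, tgt_acts, eps=1e-6):
    # Per-coordinate affine map matching mean and standard deviation.
    a = tgt_acts.std(0) / (src_acts.std(0) + eps)
    b = tgt_acts.mean(0) - a * src_acts.mean(0)
    return a, b

def apply_transport(h, a, b, lam=1.0):
    # lam in [0, 1] interpolates between no intervention and full transport,
    # which is what gives fine-grained control over the effect's strength.
    return (1 - lam) * h + lam * (a * h + b)

# Usage: collect activations at one layer for two prompt sets (e.g. neutral
# vs. desired style), fit the map offline, apply it in a forward hook.
src = torch.randn(512, 768)        # stand-in "source" activations
tgt = torch.randn(512, 768) + 0.5  # stand-in "target" activations
a, b = fit_affine_transport(src, tgt)
steered = apply_transport(torch.randn(1, 768), a, b, lam=0.7)
```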

Gabriele Berton (@gabriberton)'s Twitter Profile Photo

Apparently you can trick an LLM to believe it did its thinking, and results improve by a lot (up to 40%)!

In what the authors call NoThinking, they just force the string "Thinking: Okay, I think I have finished thinking"

From the paper: Reasoning Models Can Be Effective Without Thinking
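
A minimal sketch of the trick (the forced string comes from the tweet; the chat template and tags below are assumptions):

```python
FAKE_THINKING = "Thinking: Okay, I think I have finished thinking"

def build_nothinking_prompt(question: str) -> str:
    # Prefill the assistant turn with an already-"completed" thinking block,
    # so the reasoning model decodes the final answer directly.
    return (
        f"User: {question}\n"
        f"Assistant: <think>\n{FAKE_THINKING}\n</think>\n"
    )

print(build_nothinking_prompt("What is 17 * 24?"))
```
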
Patrik Reizinger (@rpatrik96)'s Twitter Profile Photo

Many design choices in (self-)supervised classification lead to identifiability. If you are curious how and when optimizing cross entropy leads to theoretical guarantees, check out our oral on Saturday in Session 5C or the poster from 3pm. 

Details: iclr.cc/virtual/2025/o…
Jason Ramapuram (@jramapuram)'s Twitter Profile Photo

Stop by poster #596 at 10A-1230P tomorrow (Fri 25 April) at #ICLR2025 to hear more about Sigmoid Attention! 

We just pushed 8 trajectory checkpoints each for two 7B LLMs for Sigmoid Attention and a 1:1 Softmax Attention (trained with a deterministic dataloader for 1T tokens):

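The checkpoint list is cut off in this capture. For background, here is a minimal sketch of sigmoid attention itself (my simplification: the -log(n) score bias reflects the paper's stability recipe, other details are assumptions, and causal masking is omitted):

```python
import math
import torch

def sigmoid_attention(q, k, v):
    # q, k, v: (batch, heads, seq, head_dim)
    b, h, n, d = q.shape
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)
    # Elementwise sigmoid with a -log(n) bias replaces the softmax, so each
    # key contributes independently rather than competing in a normalization.
    attn = torch.sigmoid(scores - math.log(n))
    return attn @ v

q = k = v = torch.randn(2, 8, 128, 64)
out = sigmoid_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```
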
Pau Rodríguez (@prlz77)'s Twitter Profile Photo

Is input selectivity the secret sauce of Mamba, or is there more 🤔? In our new ICML 2025 paper, we show that input selectivity provides an edge, but conv and gating are key for associative recall! Check out Teresa Huang's thread for more insights! Paper: arxiv.org/abs/2506.11891
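
As a toy illustration of the components being compared (made-up names and a deliberately simplified recurrence, not the actual Mamba code): a block can combine an input-selective recurrence with a short causal conv and multiplicative gating.

```python
import torch
import torch.nn as nn

class ToySelectiveBlock(nn.Module):
    def __init__(self, d=64):
        super().__init__()
        self.conv = nn.Conv1d(d, d, kernel_size=4, padding=3, groups=d)
        self.sel = nn.Linear(d, d)   # input-dependent decay ("selectivity")
        self.gate = nn.Linear(d, d)  # multiplicative gating branch

    def forward(self, x):  # x: (batch, seq, d)
        # Short causal conv mixes nearby tokens before the recurrence.
        u = self.conv(x.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        h, out = torch.zeros_like(x[:, 0]), []
        for t in range(x.size(1)):
            a = torch.sigmoid(self.sel(u[:, t]))  # input-selective retention
            h = a * h + (1 - a) * u[:, t]         # recurrent state update
            out.append(h * torch.sigmoid(self.gate(x[:, t])))  # gating
        return torch.stack(out, dim=1)

y = ToySelectiveBlock()(torch.randn(2, 16, 64))
print(y.shape)  # torch.Size([2, 16, 64])
```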

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

Apple Intelligence Foundation Language Models: Tech Report 2025

"We introduce two multilingual, multimodal foundation language models that power Apple Intelligence features across Apple devices and services: i a 3B-parameter on-device model optimized for Apple silicon through
Dan Busbridge (@danbusbridge)'s Twitter Profile Photo

Uncertainty methods and correctness metrics often share "mutual bias" (systematic errors from a common confounder like response length), skewing LLM evaluations. A new paper from my colleagues shows that "LM-as-a-judge" evaluation is more robust and human-aligned. Important work.
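
A toy numeric illustration of mutual bias (synthetic data, not the paper's): when response length drives both the uncertainty score and the correctness metric, the two agree spuriously, and controlling for the confounder removes the effect.

```python
import numpy as np

rng = np.random.default_rng(0)
length = rng.normal(size=5000)
uncertainty = length + rng.normal(size=5000)   # confounded by length
correctness = -length + rng.normal(size=5000)  # confounded by length

# The shared confounder alone creates a strong (spurious) association.
print(np.corrcoef(uncertainty, correctness)[0, 1])  # ~ -0.5

# Removing the known length effect (in practice, regress it out)
# leaves essentially no relationship.
u_res = uncertainty - length
c_res = correctness + length
print(np.corrcoef(u_res, c_res)[0, 1])  # ~ 0
```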