Jason Ramapuram (@jramapuram)'s Twitter Profile
Jason Ramapuram

@jramapuram

ML Research Scientist  MLR | Formerly: DeepMind, Qualcomm, Viasat, Rockwell Collins | Swiss-minted PhD in ML | Barista alumnus ☕ @ Starbucks | 🇺🇸🇮🇳🇱🇻🇮🇹

ID: 66173851

Link: https://jramapuram.github.io · Joined: 16-08-2009 19:22:59

247 Tweets

1.1K Followers

526 Following

Jason Ramapuram (@jramapuram):

Ever wish you could have a simple pipeline to extract parameters from augmentations for an auxiliary task (e.g., self-supervised learning)? Well now you can! Check out Parameterized Transforms, great work from Eeshan Gunesh Dhekane at Apple MLR.
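The idea, as described, is that an augmentation returns its sampled parameters alongside the transformed input, so the parameters can serve as targets for an auxiliary task. A minimal sketch (function name and structure are hypothetical, not the library's actual API):

```python
import random

def random_rotation(image, max_deg=30.0):
    # draw the augmentation parameter explicitly so it can be returned
    angle = random.uniform(-max_deg, max_deg)
    augmented = (image, angle)  # stand-in for an actually rotated image
    return augmented, {"angle": angle}

# auxiliary task: predict params["angle"] back from the augmented input
augmented, params = random_rotation("img_0")
```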

Edward Milsom (@edward_milsom):

Our paper "Function-Space Learning Rates" is on arXiv! We give an efficient way to estimate the magnitude of changes to NN outputs caused by a particular weight update. We analyse optimiser dynamics in function space, and enable hyperparameter transfer with our scheme FLeRM! 🧵👇
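The core quantity described here, the size of the output change induced by a particular weight update, can be sketched with a finite-difference probe on a batch. This is a minimal toy, not the paper's FLeRM scheme; the network and batch below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, w):
    # toy one-layer "network": outputs tanh(x @ w)
    return np.tanh(x @ w)

# measure the function-space magnitude of a weight update dw as the
# RMS over a batch of ||f(x; w + dw) - f(x; w)||
x = rng.normal(size=(128, 16))
w = rng.normal(size=(16, 4))
dw = 1e-2 * rng.normal(size=w.shape)

delta = f(x, w + dw) - f(x, w)
func_space_change = np.sqrt(np.mean(np.sum(delta**2, axis=1)))
```

A small parameter-space step can produce very different function-space steps depending on where `w` sits, which is why tracking this quantity (rather than the raw learning rate) is useful.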

Inception Labs (@inceptionailabs):

We are excited to introduce Mercury, the first commercial-grade diffusion large language model (dLLM)! dLLMs push the frontier of intelligence and speed with parallel, coarse-to-fine text generation.
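The coarse-to-fine, parallel generation mentioned here can be illustrated with a toy unmasking loop. Purely illustrative: `toy_denoiser` cheats by reading a fixed target, standing in for a real diffusion denoiser:

```python
import random

random.seed(0)
MASK = "<mask>"
target = ["the", "quick", "brown", "fox", "jumps"]

def toy_denoiser(seq):
    # stand-in for one dLLM denoising pass: propose a token and a
    # confidence for every masked position (cheats by reading `target`)
    return {i: (target[i], random.random())
            for i, tok in enumerate(seq) if tok == MASK}

seq = [MASK] * len(target)
steps = 0
while MASK in seq:
    proposals = toy_denoiser(seq)
    # coarse-to-fine: commit only the most confident half each round
    k = max(1, len(proposals) // 2)
    best = sorted(proposals.items(), key=lambda kv: -kv[1][1])[:k]
    for i, (tok, _conf) in best:
        seq[i] = tok
    steps += 1
```

Because several positions are committed per round, the number of passes is smaller than the sequence length, in contrast to one-token-per-step autoregressive decoding.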

Rin Metcalf Susa (@rinmetcalfsusa):

🚀 We're hiring an ML Researcher! 🚀 If you're an expert in LLM alignment & personalization and want to work on a world-class research team, apply here 👉 lnkd.in/gU9yeivi Know someone who’d be a great fit? Tag them! #MachineLearning #AI #Apple

Aayush Karan (@aakaran31):

Can machine learning models predict their own errors 🤯? In a new preprint w/ Apple collaborators Aravind Gollakota, Parikshit Gopalan, Charlotte Peale, and Udi Wieder, we present a theory of loss prediction and show an equivalence with algorithmic fairness! A thread (1/n):
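The notion of a model predicting its own errors can be sketched as a second model regressed onto the first model's per-example losses. A toy setup with hypothetical feature choices, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(1)

# base model: linear regression on data with input-dependent noise,
# so some examples are predictably harder than others
X = rng.normal(size=(500, 3))
noise = 0.3 * np.abs(X[:, 0]) * rng.normal(size=500)
y = X @ np.array([1.0, -2.0, 0.5]) + noise

w, *_ = np.linalg.lstsq(X, y, rcond=None)
per_example_loss = (X @ w - y) ** 2

# loss predictor: regress realized losses on simple input features
# (this feature choice is an arbitrary illustration)
feats = np.column_stack([np.ones(500), X**2])
v, *_ = np.linalg.lstsq(feats, per_example_loss, rcond=None)
predicted_loss = feats @ v
corr = np.corrcoef(predicted_loss, per_example_loss)[0, 1]
```

Here the noise scale depends on `|x_0|`, so the loss predictor can pick up a positive correlation with the realized losses; when no such structure exists, loss prediction degenerates, which is where the connection to fairness-style guarantees becomes interesting.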

Andy Keller (@t_andy_keller):

In the physical world, almost all information is transmitted through traveling waves -- why should it be any different in your neural network? Super excited to share recent work with the brilliant Mozes Jacobs: "Traveling Waves Integrate Spatial Information Through Time" 1/14
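A minimal picture of information traveling as a wave: a leapfrog integration of the 1D wave equation on a ring, where a bump injected at one site propagates outward over time. Illustrative physics only, not the paper's architecture:

```python
import numpy as np

# discrete wave equation u_tt = c^2 u_xx on a ring of n sites,
# integrated with leapfrog steps (Courant number c*dt = 0.5, stable)
n, c, dt = 64, 1.0, 0.5
u_prev = np.zeros(n)
u = np.zeros(n)
u[5] = 1.0       # inject "information" at one spatial location
u_prev[5] = 1.0  # zero initial velocity

for _ in range(40):
    lap = np.roll(u, 1) - 2 * u + np.roll(u, -1)
    u_next = 2 * u - u_prev + (c * dt) ** 2 * lap
    u_prev, u = u, u_next
```

After 40 steps the initial bump has split into left- and right-moving pulses well away from site 5: spatial information at one location has been transported across the domain purely through local updates over time.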

Yizhe Zhang @ ICLR 2025 🇸🇬 (@yizhezhangnlp):

Excited to share our new paper on "Reversal Blessing" - where thinking BACKWARDS makes language models smarter on some multiple-choice questions! We found that right-to-left (R2L) models consistently outperform traditional left-to-right (L2R) models on certain reasoning tasks.🧵
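The L2R-vs-R2L comparison can be sketched with toy bigram models trained in each direction and used to score multiple-choice candidates. This is a cartoon of the mechanism with a made-up corpus, not the paper's setup:

```python
from collections import Counter

corpus = ["the cat sat", "the dog ran", "a cat ran"]

def train_bigram(sentences, reverse=False):
    # count bigrams, optionally reading each sentence right-to-left
    counts, ctx = Counter(), Counter()
    for s in sentences:
        toks = s.split()
        if reverse:
            toks = toks[::-1]
        toks = ["<s>"] + toks
        for a, b in zip(toks, toks[1:]):
            counts[(a, b)] += 1
            ctx[a] += 1
    return counts, ctx

def score(model, sentence, reverse=False):
    counts, ctx = model
    toks = sentence.split()
    if reverse:
        toks = toks[::-1]
    toks = ["<s>"] + toks
    p = 1.0
    for a, b in zip(toks, toks[1:]):
        p *= (counts[(a, b)] + 1) / (ctx[a] + 10)  # add-one smoothing
    return p

l2r = train_bigram(corpus)
r2l = train_bigram(corpus, reverse=True)
choices = ["the cat sat", "the cat the"]
l2r_pick = max(choices, key=lambda c: score(l2r, c))
r2l_pick = max(choices, key=lambda c: score(r2l, c, reverse=True))
```

An exact model would assign identical likelihoods under either factorization (chain rule), so any systematic L2R/R2L gap in practice comes from what each learned model captures, which is what makes the "reversal blessing" finding non-trivial.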

Kevin Patrick Murphy (@sirbayes):

I'm happy to announce that v2 of my RL tutorial is now online. I added a new chapter on multi-agent RL, improved the sections on 'RL as inference' and 'RL+LLMs' (although the latter is still WIP), and fixed some typos. arxiv.org/abs/2412.05265…

Martin Klissarov (@martinklissarov):

Here is an RL perspective on understanding LLMs for decision making. Are LLMs best used as policies, rewards, or transition functions? How do you fine-tune them? Can LLMs explore/exploit? 🧵 Join us down this rabbit hole... (ICLR 2025 paper, done at ML Research)

Pau Rodríguez (@prlz77):

Our work on fine-grained control of LLMs and diffusion models via Activation Transport will be presented at ICLR 2025 as a spotlight ✨ Check out our new blog post: machinelearning.apple.com/research/trans…

Mustafa Shukor (@mustafashukor1):


We release a large-scale study to answer the following:
- Is late fusion inherently better than early fusion for multimodal models?
- How do native multimodal models scale compared to LLMs?
- How can sparsity (MoEs) play a detrimental role in handling heterogeneous modalities? 🧵
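The early-vs-late fusion distinction in the first question can be sketched in a few lines (toy embeddings and random weights, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
img = rng.normal(size=(8,))   # toy image embedding
txt = rng.normal(size=(8,))   # toy text embedding

def mlp(x, w1, w2):
    # two-layer perceptron with ReLU
    return np.maximum(x @ w1, 0) @ w2

# late fusion: a separate encoder per modality, combined at the end
w_img1, w_img2 = rng.normal(size=(8, 16)), rng.normal(size=(16, 4))
w_txt1, w_txt2 = rng.normal(size=(8, 16)), rng.normal(size=(16, 4))
late = mlp(img, w_img1, w_img2) + mlp(txt, w_txt1, w_txt2)

# early fusion: modalities concatenated and processed by one shared trunk
w_e1, w_e2 = rng.normal(size=(16, 16)), rng.normal(size=(16, 4))
early = mlp(np.concatenate([img, txt]), w_e1, w_e2)
```

In the late-fusion path the modalities never interact until the final sum; in the early-fusion path every hidden unit can mix image and text features, which is the design axis the study compares at scale.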
Shuangfei Zhai (@zhaisf):

Proud to report that TarFlow is accepted to #ICML2025 as a Spotlight 🎉 I’m really looking forward to new ideas and applications enabled by powerful Normalizing Flow models 🚀
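For readers new to normalizing flows: the exact log-likelihood comes from the change-of-variables formula, shown here for a single affine-coupling step in 2D. A generic flow sketch, not TarFlow itself:

```python
import numpy as np

rng = np.random.default_rng(3)

# one affine-coupling step: z1 = x1, z2 = x2 * exp(s(x1)) + t(x1)
def s(x1): return 0.5 * np.tanh(x1)
def t(x1): return 0.1 * x1

def forward(x):
    x1, x2 = x[:, 0], x[:, 1]
    z = np.stack([x1, x2 * np.exp(s(x1)) + t(x1)], axis=1)
    log_det = s(x1)  # log|det J| of this triangular Jacobian
    return z, log_det

# change of variables: log p(x) = log N(z; 0, I) + log|det J|
x = rng.normal(size=(100, 2))
z, log_det = forward(x)
log_px = -0.5 * np.sum(z**2 + np.log(2 * np.pi), axis=1) + log_det
```

The coupling structure makes both the inverse and the Jacobian determinant cheap, which is what lets flows be trained by exact maximum likelihood.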

Vaibhav (VB) Srivastav (@reach_vb):

Let's goo! Starting today you can access 5000+ LLMs powered by MLX directly from Hugging Face Hub! 🔥 All you need to do is click `Use this model` from any compatible model \o/ That's it, all you need to get blazingly fast intelligence right at your terminal! What would you

Google DeepMind (@googledeepmind):

Video, meet audio. 🎥🤝🔊 With Veo 3, our new state-of-the-art generative video model, you can add soundtracks to clips you make. Create talking characters, include sound effects, and more while developing videos in a range of cinematic styles. 🧵

Maureen de Seyssel (@maureendss):

Now that INTERSPEECH 2025 registration is open, it's time for some shameless promo! Sign up and join our Interspeech tutorial: Speech Technology Meets Early Language Acquisition: How Interdisciplinary Efforts Benefit Both Fields. 🗣️👶 interspeech2025.org/tutorials ⬇️ (1/2)