Sahil Verma (@sahil1v) Twitter Tweets • TwiCopy

Sahil Verma

@sahil1v

PhD student @uwcse. Robustness and Interpretability. Currently at @MSFTResearch. Former intern at @amazon, @itsArthurAI. Undergrad @IITKanpur

ID: 1896456847

Link: https://vsahil.github.io

Joined: 23-09-2013 06:55:38

548 Tweets

518 Followers

1.1K Following

Feng Yao

@fengyao1909

6 months ago

🔥 "Vibe coding" is everywhere—but is it really care-free? We introduce 𝐑𝐞𝐚𝐋, an RL framework that trains LLMs with automated program analysis feedback, enabling "vibe coding" to be not just fast—but 𝐯𝐮𝐥𝐧𝐞𝐫𝐚𝐛𝐢𝐥𝐢𝐭𝐲-𝐟𝐫𝐞𝐞 & 𝐩𝐫𝐨𝐝𝐮𝐜𝐭𝐢𝐨𝐧-𝐫𝐞𝐚𝐝𝐲 🛡️

Likes: 135, Replies: 2, Retweets: 37

Sahil Verma

@sahil1v

6 months ago

Using retrieval? --> check out this work by my awesome collaborator on how to increase diversity when retrieving!

Likes: 7, Replies: 0, Retweets: 2

Avinandan Bose

@avibose22

6 months ago

🚨 Code is live! Check out LoRe – a modular, lightweight codebase for personalized reward modeling from user preferences. 📦 Few-shot personalization 📊 Benchmarks: TLDR, PRISM, PersonalLLM 👉 github.com/facebookresear… Huge thanks to AI at Meta for open-sourcing this research 🙌

Likes: 21, Replies: 0, Retweets: 6

Feng Yao

@fengyao1909

5 months ago

😵‍💫 Struggling with 𝐟𝐢𝐧𝐞-𝐭𝐮𝐧𝐢𝐧𝐠 𝐌𝐨𝐄? Meet 𝐃𝐞𝐧𝐬𝐞𝐌𝐢𝐱𝐞𝐫 — an MoE post-training method that offers more 𝐩𝐫𝐞𝐜𝐢𝐬𝐞 𝐫𝐨𝐮𝐭𝐞𝐫 𝐠𝐫𝐚𝐝𝐢𝐞𝐧𝐭, making MoE 𝐞𝐚𝐬𝐢𝐞𝐫 𝐭𝐨 𝐭𝐫𝐚𝐢𝐧 and 𝐛𝐞𝐭𝐭𝐞𝐫 𝐩𝐞𝐫𝐟𝐨𝐫𝐦𝐢𝐧𝐠! Blog: fengyao.notion.site/moe-posttraini…

Likes: 273, Replies: 5, Retweets: 61

Mattia Opper

@zvez11

5 months ago

Are you compositionally curious 🤓 Want to know how to learn embeddings using🌲? In our new #ICML2025 paper, we present Banyan: A recursive net that you can train super efficiently for any language or domain, and get embeddings competitive with much much larger LLMs 1/🧵

Likes: 24, Replies: 2, Retweets: 12

Shruti Joshi

@_shruti_joshi_

5 months ago

I will be at the Actionable Interpretability Workshop at ICML 2025 (#ICML) presenting *SSAEs* in the East Ballroom A from 1-2pm. Drop by (or send a DM) to chat about (actionable) interpretability, (actionable) identifiability, and everything in between!

Likes: 24, Replies: 1, Retweets: 6

Mattia Opper

@zvez11

5 months ago

Transformers struggle with length generalization and long context. What can we do about it? Our new #TMLR paper with Roland Fernandez, Paul Smolensky and Jianfeng Gao shows how to handle the issue using a new attention mechanism called TRA. Curious? Read the 🧵 for more 🤓

Likes: 12, Replies: 1, Retweets: 8

Soumye Singhal

@soumyesinghal

5 months ago

The Llama Nemotron model just got Super-Charged ⚡️ We released Llama-Nemotron-Super-v1.5 today! The best open model that can be deployed on a single H100 🚀 Enhanced for reasoning, tool use, general chat, and instruction following. HF: huggingface.co/nvidia/Llama-3…

Likes: 33, Replies: 2, Retweets: 7

Sahil Verma

@sahil1v

4 months ago

Glad to share that our paper was accepted to the main EMNLP 2025 Conference! x.com/Sahil1V/status…

Likes: 69, Replies: 4, Retweets: 9

Raktim Mitra

@raktim7879

3 months ago

RFDiffusion3 generates all-atom bound conformations, making it significant for flexible targets like DNA. Excellent teamwork to achieve something impossible for any one of us in just a few months. Jasper Butcher Rohith Krishna biorxiv.org/content/10.110…

Likes: 379, Replies: 2, Retweets: 64

Jasper Butcher

@butcher_jasper

3 months ago

Very excited to share our paper "De novo Design of All-atom Biomolecular Interactions with RFdiffusion3", now on BioRXiv. biorxiv.org/content/10.110… 1/n

Likes: 410, Replies: 4, Retweets: 89

Sahil Verma

@sahil1v

2 months ago

Hands down, Hila is one of the best advisors out there. You would be lucky to work with her (as I was :) )!

Likes: 2, Replies: 0, Retweets: 0

Divyat Mahajan

@divyat09

2 months ago

[1/9] While pretraining data might be hitting a wall, novel methods for modeling it are just getting started! We introduce future summary prediction (FSP), where the model predicts future sequence embeddings to reduce teacher forcing & shortcut learning. 📌Predict a learned

Likes: 216, Replies: 10, Retweets: 46

Sahil Verma

@sahil1v

a month ago

Sarah is great, go work with her!

Likes: 5, Replies: 1, Retweets: 0

Rohith Krishna

@r_krishna3

11 days ago

Today, we report a method for the design of active enzymes, RFdiffusion2, in Nature Methods. For the first time, we are able to design enzymes with native-range catalytic activity. We are also releasing our next frontier model, RFdiffusion3, code 👇

Likes: 363, Replies: 7, Retweets: 69