Sahil Verma (@sahil1v)'s Twitter Profile
Sahil Verma

@sahil1v

PhD student @uwcse. Robustness and Interpretability. Currently at @MSFTResearch. Former intern at @amazon, @itsArthurAI. Undergrad @IITKanpur

ID: 1896456847

Link: https://vsahil.github.io
Joined: 23-09-2013 06:55:38

548 Tweets

518 Followers

1.1K Following

Feng Yao (@fengyao1909)

🔥 "Vibe coding" is everywhere, but is it really care-free? We introduce ReaL, an RL framework that trains LLMs with automated program analysis feedback, enabling "vibe coding" to be not just fast, but vulnerability-free & production-ready 🛡️

Avinandan Bose (@avibose22)

🚨 Code is live! Check out LoRe – a modular, lightweight codebase for personalized reward modeling from user preferences. 📦 Few-shot personalization 📊 Benchmarks: TLDR, PRISM, PersonalLLM 👉 github.com/facebookresear… Huge thanks to AI at Meta for open-sourcing this research 🙌

Feng Yao (@fengyao1909)

😵‍💫 Struggling with fine-tuning MoE? Meet DenseMixer, an MoE post-training method that offers a more precise router gradient, making MoE easier to train and better performing! Blog: fengyao.notion.site/moe-posttraini…

Mattia Opper (@zvez11)

Are you compositionally curious? 🤓 Want to know how to learn embeddings using 🌲? In our new #ICML2025 paper, we present Banyan: a recursive net that you can train super efficiently for any language or domain, and get embeddings competitive with much, much larger LLMs 1/🧵

Shruti Joshi (@_shruti_joshi_)

I will be at the Actionable Interpretability Workshop (ICML2025, #ICML) presenting *SSAEs* in the East Ballroom A from 1-2pm. Drop by (or send a DM) to chat about (actionable) interpretability, (actionable) identifiability, and everything in between!

Mattia Opper (@zvez11)

Transformers struggle with length generalization and long context. What can we do about it? Our new #TMLR paper with Roland Fernandez, Paul Smolensky and Jianfeng Gao shows how to handle the issue using a new attention mechanism called TRA. Curious? Read the 🧵 for more 🤓

Soumye Singhal (@soumyesinghal)

Llama Nemotron model just got Super-Charged ⚡️ We released Llama-Nemotron-Super-v1.5 today! The best open model that can be deployed on a single H100 🚀 Enhanced for reasoning, tool use, general chat, and instruction following. HF: huggingface.co/nvidia/Llama-3…

Raktim Mitra (@raktim7879)

RFdiffusion3 generates all-atom bound conformations, making it significant for flexible targets like DNA. Excellent teamwork to achieve something impossible for any one of us, in just a few months. Jasper Butcher, Rohith Krishna biorxiv.org/content/10.110…

Jasper Butcher (@butcher_jasper)

Very excited to share our paper "De novo Design of All-atom Biomolecular Interactions with RFdiffusion3", now on bioRxiv. biorxiv.org/content/10.110… 1/n

Divyat Mahajan (@divyat09)

[1/9] While pretraining data might be hitting a wall, novel methods for modeling it are just getting started! We introduce future summary prediction (FSP), where the model predicts future sequence embeddings to reduce teacher forcing & shortcut learning. 📌 Predict a learned

Rohith Krishna (@r_krishna3)

Today, we report a method for the design of active enzymes, RFdiffusion2, in Nature Methods. For the first time, we are able to design enzymes with native-range catalytic activity. We are also releasing our next frontier model, RFdiffusion3, code 👇