Siva Reddy (@sivareddyg) 's Twitter Profile
Siva Reddy

@sivareddyg

Assistant Professor @Mila_Quebec @McGillU @ServiceNowRSRCH; Postdoc @StanfordNLP; PhD @EdinburghNLP; Natural Language Processor #NLProc

ID: 56686035

Link: https://sivareddy.in · Joined: 14-07-2009 12:56:42

1.1K Tweets

5.5K Followers

1.1K Following

Gillian Hadfield (@ghadfield) 's Twitter Profile Photo

My lab at Johns Hopkins University is recruiting research and communications professionals, and AI postdocs, to advance our work ensuring that AI is safe and aligned with human well-being worldwide. We're hiring an AI Policy Researcher to conduct in-depth research into the technical and policy

Benno Krojer (@benno_krojer) 's Twitter Profile Photo


The video is online now!

3min speed science talk on "From a soup of raw pixels to abstract meaning"

youtu.be/AHsoMYG2Vqk?si…
Nouha Dziri (@nouhadziri) 's Twitter Profile Photo


📢 Can LLMs really reason outside the box in math? Or are they just remixing familiar strategies? 

Remember how DeepSeek R1 and o1 impressed us on Olympiad-level math, yet failed at simple arithmetic 😬

We built a benchmark to find out → OMEGA Ω 📐

💥 We found
Harman Singh (@harman26singh) 's Twitter Profile Photo


🚨 New Google DeepMind paper

𝐑𝐨𝐛𝐮𝐬𝐭 𝐑𝐞𝐰𝐚𝐫𝐝 𝐌𝐨𝐝𝐞𝐥𝐢𝐧𝐠 𝐯𝐢𝐚 𝐂𝐚𝐮𝐬𝐚𝐥 𝐑𝐮𝐛𝐫𝐢𝐜𝐬 📑
👉 arxiv.org/abs/2506.16507

We tackle reward hacking—when RMs latch onto spurious cues (e.g. length, style) instead of true quality.
#RLAIF #CausalInference

🧵⬇️
Tal Linzen (@tallinzen) 's Twitter Profile Photo

Congratulations Verna! This was one of the best theses I've ever read; I highly recommend checking out Verna's work on the tradeoffs between memorization and generalization in language models! vernadankers.com

Edoardo Ponti (@pontiedoardo) 's Twitter Profile Photo

I thoroughly enjoyed reading Verna Dankers's dissertation; my personal highlight was her idea of maps that track the training memorisation versus test generalisation of each example. I wish you all the best for the upcoming postdoc with Siva Reddy and his wonderful group!

Christopher Manning (@chrmanning) 's Twitter Profile Photo

I’ve joined AIX Ventures as a General Partner, working on investing in deep AI startups. Looking forward to working with founders on solving hard problems in AI and seeing products come out of that!  Thank you Yuliya Chernova at The Wall Street Journal for covering the news: wsj.com/articles/ai-re…

Siva Reddy (@sivareddyg) 's Twitter Profile Photo

Fantastic job, Verna Dankers, on passing your viva with flying colors! Absolutely thrilled to have you join us as a postdoc at Mila - Institut québécois d'IA and McGill NLP. So excited for the amazing things we'll work on together!

BlackboxNLP (@blackboxnlp) 's Twitter Profile Photo


🚨 Excited to announce two invited speakers at #BlackboxNLP 2025!

Join us to hear from two leading voices in interpretability:
🎙️ Quanshi Zhang (Shanghai Jiao Tong University)
🎙️ Verna Dankers (McGill University)

Joey Bose (@bose_joey) 's Twitter Profile Photo

🎉 Personal update: I'm thrilled to announce that I'm joining Imperial College London as an Assistant Professor of Computing starting January 2026. My future lab and I will continue to work on building better Generative Models 🤖, the hardest

Massimo Caccia (@masscaccia) 's Twitter Profile Photo


🎉 Our paper “𝐻𝑜𝑤 𝑡𝑜 𝑇𝑟𝑎𝑖𝑛 𝑌𝑜𝑢𝑟 𝐿𝐿𝑀 𝑊𝑒𝑏 𝐴𝑔𝑒𝑛𝑡: 𝐴 𝑆𝑡𝑎𝑡𝑖𝑠𝑡𝑖𝑐𝑎𝑙 𝐷𝑖𝑎𝑔𝑛𝑜𝑠𝑖𝑠” got an 𝐨𝐫𝐚𝐥 at next week’s 𝗜𝗖𝗠𝗟 𝗪𝗼𝗿𝗸𝘀𝗵𝗼𝗽 𝗼𝗻 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗨𝘀𝗲 𝗔𝗴𝗲𝗻𝘁𝘀! 🖥️🧠

We present the 𝐟𝐢𝐫𝐬𝐭 𝐥𝐚𝐫𝐠𝐞-𝐬𝐜𝐚𝐥𝐞
Alexandre Drouin (@alexandredrouin) 's Twitter Profile Photo

📢 Attention: ServiceNow Research is hiring a Research Scientist with a focus on Agent Safety + Security 👩🏻‍🔬 Join us to work on impactful open research projects like 🔹DoomArena: github.com/ServiceNow/doo… 🔹BrowserGym: github.com/ServiceNow/Bro… Apply: jobs.smartrecruiters.com/ServiceNow/744…

Edoardo Ponti (@pontiedoardo) 's Twitter Profile Photo

Thanks for acknowledging Dynamic Token Pooling as a predecessor to H-Net, Albert Gu! We had some decent ideas in that paper (e2e and entropy-based tokenisation), but it surprises me that it took 2 years (an eternity in NLP) to find the right recipe and scale better than BPE.

Sebastian Schuster (@sebschu) 's Twitter Profile Photo

The Austrian Academy of Sciences is offering a pretty generous package to researchers in the US who would like to come to Austria for a postdoc. stipendien.oeaw.ac.at/en/fellowships…. Please email me if you're interested in applying for this by Jul 25 🧑‍🔬

Siva Reddy (@sivareddyg) 's Twitter Profile Photo

I am speaking at 10 am PT on a slightly different topic than I usually talk about 🙂: "Simple Ideas Can Have Mighty Effects: Don't Take LLM Fundamentals for Granted" Check out if you're around. #ICML2025

Nouha Dziri (@nouhadziri) 's Twitter Profile Photo


Current agents are highly unsafe: o3-mini, one of the most advanced reasoning models, scores 71% in executing harmful requests 😱
We introduce a new framework for evaluating agent safety✨🦺 Discover more 👇 

👩‍💻 Code & data: github.com/Open-Agent-Saf…
📄 Paper: