Siva Reddy (@sivareddyg)'s Twitter Profile
Siva Reddy

@sivareddyg

Assistant Professor @Mila_Quebec @McGillU @ServiceNowRSRCH; Postdoc @StanfordNLP; PhD @EdinburghNLP; Natural Language Processor #NLProc

ID: 56686035

Website: https://sivareddy.in

Joined: 14-07-2009 12:56:42

1.1K Tweets

5.5K Followers

1.1K Following

Gillian Hadfield (@ghadfield)

My lab at Johns Hopkins University is recruiting research and communications professionals, and AI postdocs to advance our work ensuring that AI is safe and aligned to human well-being worldwide: We're hiring an AI Policy Researcher to conduct in-depth research into the technical and policy

Benno Krojer (@benno_krojer)

The video is online now! 3min speed science talk on "From a soup of raw pixels to abstract meaning" youtu.be/AHsoMYG2Vqk?si…

Nouha Dziri (@nouhadziri)

📢 Can LLMs really reason outside the box in math? Or are they just remixing familiar strategies? Remember DeepSeek R1, o1 have impressed us on Olympiad-level math but also they were failing at simple arithmetic 😬 We built a benchmark to find out → OMEGA Ω 💥 We found

Harman Singh (@harman26singh)

🚨 New Google DeepMind paper: Robust Reward Modeling via Causal Rubrics 📑
👉 arxiv.org/abs/2506.16507
We tackle reward hacking, when RMs latch onto spurious cues (e.g. length, style) instead of true quality. #RLAIF #CausalInference
🧵⬇️

Tal Linzen (@tallinzen)

Congratulations Verna! This was one of the best theses I've ever read, I highly recommend checking out Verna's work on the tradeoffs between memorization and generalization in language models! vernadankers.com

Edoardo Ponti (@pontiedoardo)

I thoroughly enjoyed reading Verna Dankers's dissertation; my personal highlight was her idea of maps that track the training memorisation versus test generalisation of each example. I wish you all the best for the upcoming postdoc with Siva Reddy and his wonderful group!

Christopher Manning (@chrmanning)

I’ve joined AIX Ventures as a General Partner, working on investing in deep AI startups. Looking forward to working with founders on solving hard problems in AI and seeing products come out of that! Thank you Yuliya Chernova at The Wall Street Journal for covering the news: wsj.com/articles/ai-re…

Siva Reddy (@sivareddyg)

Fantastic job, Verna Dankers, on passing your viva with flying colors! Absolutely thrilled to have you join us as a postdoc at Mila - Institut québécois d'IA and McGill NLP. So excited for the amazing things we'll work on together!

BlackboxNLP (@blackboxnlp)

🚨 Excited to announce two invited speakers at #BlackboxNLP 2025! Join us to hear from two leading voices in interpretability:
🎙️ Quanshi Zhang (Shanghai Jiao Tong University)
🎙️ Verna Dankers (McGill University)

Joey Bose (@bose_joey)

🎉 Personal update: I'm thrilled to announce that I'm joining Imperial College London as an Assistant Professor of Computing (Imperial Computing) starting January 2026. My future lab and I will continue to work on building better Generative Models 🤖, the hardest

Massimo Caccia (@masscaccia)

🎉 Our paper “How to Train Your LLM Web Agent: A Statistical Diagnosis” got an oral at next week’s ICML Workshop on Computer Use Agents! 🖥️🧠 We present the first large-scale

Alexandre Drouin (@alexandredrouin)

📢 Attention Attention ServiceNow Research is hiring a Research Scientist with a focus on Agent Safety+Security 👩🏻‍🔬 Join us to work on impactful open research projects like
🔹 DoomArena: github.com/ServiceNow/doo…
🔹 BrowserGym: github.com/ServiceNow/Bro…
Apply: jobs.smartrecruiters.com/ServiceNow/744…

Edoardo Ponti (@pontiedoardo)

Thanks for acknowledging Dynamic Token Pooling as a predecessor to H-Net, Albert Gu! We had some decent ideas in that paper (e2e and entropy-based tokenisation), but it surprises me that it took 2 years (an eternity in NLP) to find the right recipe and scale better than BPE

Sebastian Schuster (@sebschu)

The Austrian Academy of Sciences is offering a pretty generous package to researchers in the US who would like to come to Austria for a postdoc. stipendien.oeaw.ac.at/en/fellowships…. Please email me if you're interested in applying for this by Jul 25 🧑‍🔬

Siva Reddy (@sivareddyg)

I am speaking at 10 am PT on a slightly different topic than I usually talk about 🙂: "Simple Ideas Can Have Mighty Effects: Don't Take LLM Fundamentals for Granted" Check out if you're around. #ICML2025

Nouha Dziri (@nouhadziri)

Current agents are highly unsafe: o3-mini, one of the most advanced models in reasoning, scores 71% in executing harmful requests 😱 We introduce a new framework for evaluating agent safety ✨🦺 Discover more 👇
👩‍💻 Code & data: github.com/Open-Agent-Saf…
📄 Paper:
