Mrinmaya Sachan (@mrinmayasachan)'s Twitter Profile

Assistant Professor of Computer Science at ETH Zurich working in natural language processing (#NLProc), machine learning and education (#edtech).

ID: 156994523

Website: http://www.mrinmaya.io/
Joined: 18-06-2010 14:40:05

414 Tweets

1.1K Followers

1.1K Following

Laura Ruis (@lauraruis)

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

Jakub Macina (@dmacjam)

🚀 How well can LLMs teach? Evaluating LLMs for education is key to making real progress, yet we lack a reliable and simple benchmark. Introducing MathTutorBench, an open-source benchmark designed to assess holistic tutoring capabilities in AI.

Zhijing Jin✈️ ICLR Singapore (@zhijingjin)

🌍 How do #LLMs handle trolley problems across cultures? We test them with 98K dilemmas in 107 languages, grounded in 40M+ human moral judgments. 💡 Spotlight @ICLR2025 in Singapore✈️ | Best Paper at the Pluralistic Alignment Workshop, #NeurIPS2024 📄 Paper: arxiv.org/abs/2407.02273 🧵👇

Zhijing Jin✈️ ICLR Singapore (@zhijingjin)

Very honored to be one of the 15,553 runners today in #SOLA Relay Zürich. And also super proud of our #NLProc team of 14 finishing 113km in total! Many many thanks to all the friends & our Prof Mrinmaya Sachan! It's such a meaningful day in my life. Yet to run for #EMNLP now ;)!

Rohan Paul (@rohanpaul_ai)

This paper introduces an online reinforcement learning framework using simulated student-tutor interactions. It trains LLMs to prioritize guiding students pedagogically instead of simply revealing solutions, aligning models with better teaching methods. This helps students

Yinya Huang ✈️ ICLR (@yinyahuang)

🤖⚛️Can AI truly see Physics? Test your model with the newly released SeePhys Benchmark! 🚀 🖼️Covering 2,000 vision-text multimodal physics problems spanning from middle school to doctoral qualification exams, the SeePhys benchmark systematically evaluates LLMs/MLLMs on tasks

Jakub Macina (@dmacjam)

AI alignment for tutoring🎓 We use full online RL with conversation-level rewards—not just single-turn signals like DPO. Did the student actually learn by the end? Using GRPO, the model learns real teaching strategies like when to hint or when to correct. Explore models below⤵️

Zhijing Jin✈️ ICLR Singapore (@zhijingjin)

Check out our 3 papers on Testing LLM Moral Reasoning via Multi-Agent Simulations! ✍️ Our summary blogpost: lesswrong.com/posts/2WAire3L… 📑 Our series of 3 papers: 1️⃣ GovSim (NeurIPS 2024) arxiv.org/abs/2404.16698 2️⃣ SanctSim zhijing-jin.com/files/papers/2… 3️⃣ MoralSim arxiv.org/abs/2505.19212

Alessandro Stolfo (@alesstolfo)

New paper on detecting & correcting arithmetic errors in LLMs! We show that simple probes can recover correct answers from hidden states and trigger self-correction of reasoning errors. 📍 If you’re at #ICML2025 stop by our poster @ the Act Interp WS 📝arxiv.org/abs/2507.12379

Zhijing Jin✈️ ICLR Singapore (@zhijingjin)

Congrats again to our brilliant students David Guzman and Yongjin Yang for receiving the "Oral Paper Award" at the #ACL2025NLP Workshop on Research on Agent Language Models (REALM)! Check out how Reasoning LLMs optimize self interests over collective success 📊 in our paper
