Mrinmaya Sachan (@mrinmayasachan) Twitter Tweets • TwiCopy

Mrinmaya Sachan

@mrinmayasachan

Assistant Professor of Computer Science at ETH Zurich working in natural language processing (#NLProc), machine learning and education (#edtech).

ID: 156994523

Website: http://www.mrinmaya.io/ | Joined: 18-06-2010 14:40:05

414 Tweets

1.1K Followers

1.1K Following

Laura Ruis

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

Likes: 966 · Replies: 24 · Retweets: 208

Natural Language Processing Papers

@hei

6 months ago

How to Select Datapoints for Efficient Human Evaluation of NLG Models? arxiv.org/abs/2501.18251

Likes: 5 · Replies: 0 · Retweets: 2

Jakub Macina

@dmacjam

5 months ago

🚀 How well can LLMs teach? Evaluating LLMs for education is key to making real progress, yet we lack a reliable and simple benchmark. Introducing MathTutorBench, an open-source benchmark designed to assess holistic tutoring capabilities in AI.

Likes: 8 · Replies: 4 · Retweets: 3

Zhijing Jin✈️ ICLR Singapore

@zhijingjin

4 months ago

🌍 How do #LLMs handle trolley problems across cultures? We test them with 98K dilemmas in 107 languages, grounded in 40M+ human moral judgments. 💡 Spotlight @ICLR2025 in Singapore ✈️ | Best Paper, Pluralistic Alignment Workshop #NeurIPS2024 📄 Paper: arxiv.org/abs/2407.02273 🧵👇

Likes: 130 · Replies: 3 · Retweets: 27

Zhijing Jin✈️ ICLR Singapore

@zhijingjin

3 months ago

Very honored to be one of the 15,553 runners today in #SOLA Relay Zürich. And also super proud of our #NLProc team of 14 finishing 113km in total! Many many thanks to all the friends & our Prof Mrinmaya Sachan! It's such a meaningful day in my life. Yet to run for #EMNLP now ;)!

Likes: 47 · Replies: 1 · Retweets: 3

Rohan Paul

@rohanpaul_ai

3 months ago

This paper introduces an online reinforcement learning framework using simulated student-tutor interactions. It trains LLMs to prioritize guiding students pedagogically instead of simply revealing solutions, aligning models with better teaching methods. This helps students

Likes: 27 · Replies: 0 · Retweets: 3

Yinya Huang ✈️ ICLR

@yinyahuang

2 months ago

🤖⚛️Can AI truly see Physics? Test your model with the newly released SeePhys Benchmark! 🚀 🖼️Covering 2,000 vision-text multimodal physics problems spanning from middle school to doctoral qualification exams, the SeePhys benchmark systematically evaluates LLMs/MLLMs on tasks

Likes: 36 · Replies: 4 · Retweets: 16

Jakub Macina

@dmacjam

2 months ago

AI alignment for tutoring🎓 We use full online RL with conversation-level rewards—not just single-turn signals like DPO. Did the student actually learn by the end? Using GRPO, the model learns real teaching strategies like when to hint or when to correct. Explore models below⤵️

Likes: 12 · Replies: 1 · Retweets: 3

Mrinmaya Sachan

@mrinmayasachan

2 months ago

Check out our latest work on aligning language models with pedagogy.

Likes: 11 · Replies: 0 · Retweets: 1

Zhijing Jin✈️ ICLR Singapore

@zhijingjin

a month ago

Check out our 3 papers on Testing LLM Moral Reasoning via Multi-Agent Simulations! ✍️ Our summary blogpost: lesswrong.com/posts/2WAire3L… 📑 Our series of 3 papers: 1️⃣ GovSim (NeurIPS 2024) arxiv.org/abs/2404.16698 2️⃣ SanctSim zhijing-jin.com/files/papers/2… 3️⃣ MoralSim arxiv.org/abs/2505.19212

Likes: 134 · Replies: 5 · Retweets: 31

Mrinmaya Sachan

@mrinmayasachan

a month ago

Human evaluation is still the holy grail in NLG. But can you get more bang for your buck? Check out this amazing piece of work from Vilém Zouhar.

Likes: 17 · Replies: 1 · Retweets: 1

Alessandro Stolfo

@alesstolfo

23 days ago

New paper on detecting & correcting arithmetic errors in LLMs! We show that simple probes can recover correct answers from hidden states and trigger self-correction of reasoning errors. 📍 If you’re at #ICML2025 stop by our poster @ the Act Interp WS 📝arxiv.org/abs/2507.12379

Likes: 20 · Replies: 0 · Retweets: 2

Zhijing Jin✈️ ICLR Singapore

@zhijingjin

3 days ago

Congrats again to our brilliant students David Guzman and Yongjin Yang for receiving the "Oral Paper Award" at the #ACL2025NLP Workshop on Research on Agent Language Models (REALM)! Check out how Reasoning LLMs optimize self-interest over collective success 📊 in our paper

Likes: 28 · Replies: 0 · Retweets: 4
