Jonibek Mansurov (@m_jonibek) Twitter Tweets • TwiCopy

Alham Fikri Aji

a year ago

Final work promotion in 2024, by Jonibek Mansurov! We managed to achieve ~75% on the challenging GPQA benchmark with only 2 transformer layers (~40M params) that were trained on different data; in our case, MedMCQA. Introducing...

AK

@_akhaliq

7 months ago

Crosslingual Reasoning through Test-Time Scaling TL;DR: show that scaling up thinking tokens of English-centric reasoning language models, such as s1 models, can improve multilingual math reasoning performance. Also analyze the language-mixing patterns, effects of different

Yong Zheng-Xin (Yong)

@yong_zhengxin

7 months ago

📣 New paper! We observe that reasoning language models finetuned only on English data are capable of zero-shot cross-lingual reasoning through a "quote-and-think" pattern. However, this does not mean they reason the same way across all languages or in new domains. [1/N]

Cohere Labs

@cohere_labs

7 months ago

Reasoning language models are primarily trained on English data, but do they generalize well to multilingual settings in various domains? We show that test-time scaling can improve their zero-shot crosslingual reasoning performance! 🔥

Genta Winata

@gentaiscool

7 months ago

⭐️ Reasoning LLMs trained on English data can think in other languages. Read our paper to learn more! Thank you Yong Zheng-Xin (Yong) for leading the project and team! It was an exciting collab! farid, Jonibek Mansurov, Ruochen Zhang, Niklas Muennighoff, Carsten Eickhoff, Julia Kreutzer

Alham Fikri Aji

@alhamfikri

7 months ago

🚨 Multilingual LLMs, finetuned only on English reasoning data, can still reason when asked non-English questions, showing reasoning traces that go back & forth between languages. I had so much fun working on this project. Please give our paper a read! arxiv.org/abs/2505.05408

farid

@faridlazuarda

7 months ago

Can English-finetuned LLMs reason in other languages? Short Answer: Yes, thanks to “quote-and-think” + test-time scaling. You can even force them to reason in a target language! But: 🌐 Low-resource langs & non-STEM topics still tough. New paper: arxiv.org/abs/2505.05408

Sophia Yang, Ph.D.

@sophiamyang

7 months ago

Can an AI trained in English solve math problems in other languages without extra training?

Yong Zheng-Xin (Yong)

@yong_zhengxin

6 months ago

These are incredible findings: a reproducibility crisis where baselines are not faithfully reproduced or reported (e.g., with a footnote indicating the performance difference) 🍎 In our work (arxiv.org/abs/2505.05408) we tried so hard to ensure apples-to-apples comparisons.

Yong Zheng-Xin (Yong)

@yong_zhengxin

6 months ago

Amidst the evaluation/reproducibility crisis for reasoning LLMs, it's great to see *concurrent independent work (with different models & benchmarks) aligns with our findings*! We reported the same fundamental trade-off: language forcing leads to ✅ compliance, ❌ accuracy!
