Shubham Toshniwal (@shubhamtoshniw6) 's Twitter Profile
Shubham Toshniwal

@shubhamtoshniw6

Research Scientist @ NVIDIA.
ex-Meta, TTIC, IIT Kanpur

ID: 1366526653892079621

Link: http://shtoshni.github.io · Joined: 01-03-2021 23:12:02

69 Tweets

301 Followers

299 Following

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

Normalized Transformer - tricks to keep the activations constrained, improves training convergence; from NVIDIA

Was pointed to this paper by lucidrains

arxiv.org/abs/2410.01131
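The "tricks to keep the activations constrained" center on normalizing representations onto the unit hypersphere. A minimal sketch of that core idea (illustrative names and toy values, not the paper's actual implementation):

```python
import math

def l2_normalize(vec, eps=1e-6):
    """Project a hidden-state vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]

# Toy hidden state with norm 5; after normalization its norm is ~1,
# so activations cannot grow unboundedly across layers.
h = [3.0, 4.0, 0.0, 0.0]
h_norm = l2_normalize(h)
print(math.sqrt(sum(x * x for x in h_norm)))  # ≈ 1.0
```

Keeping every vector at unit norm bounds activation magnitudes, which is the constraint the tweet credits for the improved training convergence.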
Najoung Kim 🫠 (@najoungkim) 's Twitter Profile Photo

🧙 Come be my colleague! We have TWO Assistant Professor positions that might be of particular interest to folks in my reach, in Linguistics and CS. These hires are part of an AI cluster hiring initiative led by BU Computing & Data Sciences (CDS). More below 👇

Oleksii Kuchaiev (@kuchaev) 's Twitter Profile Photo

Llama-3.1-Nemotron-70B-Instruct model aligned by our team is now live on lmarena.ai leaderboard with overall rank 9.

Everything used to create this model is public: code, data and reward model. HF checkpoint: huggingface.co/nvidia/Llama-3…
Freda Shi (@fredahshi) 's Twitter Profile Photo

I’d always be proud of receiving my PhD from TTIC, a magic place which gives you the most unique (in a positive sense, of course!) experience among all PhD programs. Do apply to TTIC !

Shubham Toshniwal (@shubhamtoshniw6) 's Twitter Profile Photo

"A research team led by neurobiologist Margaret Livingstone trained three rhesus macaques to identify symbols representing the numbers zero to 25. They then taught the test subjects how to perform addition.... According to the study, all three monkeys were on average capable of

Sean Welleck (@wellecks) 's Twitter Profile Photo

Check out our new benchmark for an increasingly important capability: generating synthetic data. Among other insights, it turned out that the best problem solver was indeed not always the best teacher!

(((ل()(ل() 'yoav))))👾 (@yoavgo) 's Twitter Profile Photo

Sasha Rush Or maybe, in other words: I feel that with DL, our previous NLP training was helpful and allowed us to identify opportunities. With LLMs, it was the exact opposite: it blocked/hid opportunities from us.

NVIDIA AI Developer (@nvidiaaidev) 's Twitter Profile Photo

🎉 Huge congrats to our NVIDIA team “NemoSkills” for winning the AIMO-2 Competition 🏆 on @Kaggle.  

Their system solved 34 out of 50 problems in just 5 hours using 4 L4 GPUs. 🔢✨⏱️

kaggle.com/competitions/a…

How? A powerhouse squad—Christof Henkel, Darragh Hanley, Ivan Sorokin,
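The stated numbers imply a tight inference budget per problem; a back-of-the-envelope check of the arithmetic (illustrative only):

```python
# Figures quoted in the announcement above.
problems = 50
solved = 34
hours = 5
gpus = 4  # NVIDIA L4

# Average wall-clock budget per problem and overall accuracy.
minutes_per_problem = hours * 60 / problems
accuracy = solved / problems
print(minutes_per_problem, accuracy)  # 6.0 0.68
```

Six minutes of wall-clock time per problem on four L4 GPUs is a very constrained setting for competition-level math reasoning.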
Darragh (@gonedarragh) 's Twitter Profile Photo

Our team, NemoSkills, is presumptive winner of AIMO2. Outstanding organization from AIMO, Kaggle, XTX markets, Simon Frieder. Stay tuned for more performance evaluations currently underway. #NVIDIA - Ivan Moshkov Shubham Toshniwal Igor Gitman Dieter, IvanSorokin, BenediktSchifferer

Darragh (@gonedarragh) 's Twitter Profile Photo

AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset
abs: arxiv.org/abs/2504.16891

‼️💹New 5.5M solution math reasoning dataset   
‼️📈New models 1.5B/7B/14B/32B+ AIMO2-14b  

So much learning from this team & #aimoprize!
Dieter (@kagglingdieter) 's Twitter Profile Photo

Happy to announce that we published our 🥇 1st place winning model for the AI Math Olympiad (and smaller/bigger variants) on Hugging Face. Even our tiny 1.5B version beats the mighty DeepSeek-R1 on the AIME math benchmark 🦾 huggingface.co/collections/nv…

Wei Ping (@_weiping) 's Twitter Profile Photo

Introducing AceMath-RL-Nemotron-7B, an open math model trained with reinforcement learning from the SFT-only checkpoint: Deepseek-R1-Distilled-Qwen-7B.
It achieves:
- AIME24: 69.0 (+13.5 gain by RL)
- AIME25: 53.6 (+14.4)
- LiveCodeBench: 44.4 (surprisingly, +6.8 gain after
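The "+X gain by RL" figures imply the SFT-only starting scores of the Deepseek-R1-Distilled-Qwen-7B checkpoint; a quick sanity check of that arithmetic:

```python
# Reported post-RL score and RL gain per benchmark (from the tweet above).
scores = {
    "AIME24": (69.0, 13.5),
    "AIME25": (53.6, 14.4),
    "LiveCodeBench": (44.4, 6.8),
}

# Subtracting the gain recovers the implied SFT-only baseline.
baselines = {name: round(final - gain, 1) for name, (final, gain) in scores.items()}
print(baselines)  # {'AIME24': 55.5, 'AIME25': 39.2, 'LiveCodeBench': 37.6}
```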
Vaibhav (VB) Srivastav (@reach_vb) 's Twitter Profile Photo

NVIDIA just open sourced Open Code Reasoning models - 32B, 14B AND 7B - APACHE 2.0 licensed 🔥

> Beats O3 mini & O1 (low) on LiveCodeBench 😍

Backed by the OCR dataset, the models are 30% more token-efficient than other equivalent reasoning models

Works with llama.cpp, vLLM,
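The tweet does not specify how the 30% figure was computed; one plausible reading is tokens generated per problem relative to a comparison model. A hypothetical sketch of that metric, with made-up numbers:

```python
def relative_token_savings(model_tokens, baseline_tokens):
    """Fraction of output tokens saved relative to a baseline model.
    Hypothetical metric; the announcement does not state its exact method."""
    return 1.0 - model_tokens / baseline_tokens

# Illustrative numbers only: a model averaging 7,000 output tokens per
# problem vs a baseline averaging 10,000 is 30% more token-efficient.
print(round(relative_token_savings(7_000, 10_000), 2))  # 0.3
```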
Somshubra Majumdar (@haseox94) 's Twitter Profile Photo

We finally (!) released all our SOTA Code Reasoning models! Play around with them and get better scores than QwQ* with 20-30% fewer tokens! Maybe even useful for code-reasoning synthetic data generation? *With caveats (only code tasks, on average of 64 runs :D)

Oleksii Kuchaiev (@kuchaev) 's Twitter Profile Photo

NeMo RL is now open source! It replaces NeMo-Aligner and is the toolkit we use to post train next generations of our models. Give it a try github.com/NVIDIA/NeMo-RL

Jason Weston (@jaseweston) 's Twitter Profile Photo

🚨Announcing RAM 2 workshop @ COLM25 - call for papers🚨 
- 10 years on, we present the sequel to the classic RAM🐏 (Reasoning, Attention, Memory) workshop that took place in 2015 at the cusp of major change in the area. Now in 2025 we reflect on what's happened and discuss the