Wei Ping (@_weiping)'s Twitter Profile
Wei Ping

@_weiping

Distinguished Research Scientist @NVIDIA, working on large language models.
Views my own.

ID: 1273059094325153793

Link: https://wpingnet.github.io/
Joined: 17-06-2020 01:05:33

271 Tweets

2.2K Followers

287 Following

Qwen (@alibaba_qwen)'s Twitter Profile Photo

Introducing Qwen3! 

We release Qwen3 with open weights: our latest large language models, including 2 MoE models and 6 dense models ranging from 0.6B to 235B. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general
NVIDIA AI Developer (@nvidiaaidev)'s Twitter Profile Photo

📣 Introducing AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning (RL)

Starting from the SFT model DeepSeek-R1-Distill-Qwen-14B, our AceReason-Nemotron-14B achieves substantial improvements in pass@1 accuracy on key benchmarks through RL:

AIME
Yang Chen (@ychennlp)'s Twitter Profile Photo

With just math-RL, AceReason-Nemotron-14B surpasses DeepCoder-14B on LiveCodeBench v5.
We then did code-RL and found that training became much easier.
AK (@_akhaliq)'s Twitter Profile Photo

Nvidia just dropped AceReason-Nemotron on Hugging Face

Advancing Math and Code Reasoning through Reinforcement Learning
Wei Ping (@_weiping)'s Twitter Profile Photo

Pass@1024 results of our RL model (AceReason-Nemotron-7B) and its starting SFT model (DeepSeek-R1-Distill-Qwen-7B) on LiveCodeBench-v6, which features a large answer space and high-quality test cases that are difficult to solve through 'guessing', even with extensive sampling.
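
For context, pass@k is usually computed with the unbiased estimator from Chen et al. (2021): generate n >= k samples per problem, count the c that pass all tests, and estimate the chance that at least one of k drawn samples passes. A minimal sketch in Python (the function name and example numbers are mine, not the AceReason evaluation toolkit's API):

import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).
    n: samples generated per problem; c: samples passing all tests;
    k: sampling budget being scored."""
    if n - c < k:
        return 1.0  # fewer than k failures, so every k-subset contains a pass
    # 1 - C(n-c, k) / C(n, k), expanded as a running product for stability
    return 1.0 - math.prod((n - c - i) / (n - i) for i in range(k))

# Example: 1024 samples, 3 of them correct
print(pass_at_k(1024, 3, 1))     # pass@1 = 3/1024 ≈ 0.0029
print(pass_at_k(1024, 3, 1024))  # pass@1024 = 1.0 (any correct sample counts)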
Oleksii Kuchaiev (@kuchaev)'s Twitter Profile Photo

New reasoning Nemotron-H models are now publicly available. These models are based on a hybrid architecture! 47B and 8B, in BF16 and FP8.
Blogpost: developer.nvidia.com/blog/nemotron-…
Weights: huggingface.co/collections/nv…

Max Zhaoshuo Li 李赵硕 (@mli0603)'s Twitter Profile Photo

Cosmos-Reason1 has exciting updates 💡 Now it understands physical reality, judging videos as real or fake! Check out the resources👇
Paper: arxiv.org/abs/2503.15558
Huggingface: huggingface.co/nvidia/Cosmos-…
Code: github.com/nvidia-cosmos/…
Project page: research.nvidia.com/labs/dir/cosmo…
(1/n)

Yang Chen (@ychennlp)'s Twitter Profile Photo

📢We conduct a systematic study to demystify the synergy between SFT and RL for reasoning models.

The result? We trained a 7B model, AceReason-Nemotron-1.1, significantly improved over version 1.0 on math and coding benchmarks.

✅AIME2025 (math): 53.6% -> 64.8%
✅LiveCodeBench
Wei Ping (@_weiping)'s Twitter Profile Photo

Introducing AceReason-Nemotron 1.1

Our previous release, AceReason-Nemotron-1.0, introduced a stage-wise RL recipe that was applied sequentially to math-only and code-only prompts, demonstrating both high efficiency and strong effectiveness.
Here, we systematically investigate
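
As a reading aid, the stage-wise recipe can be pictured as two sequential RL phases over disjoint prompt pools. A hypothetical sketch, assuming a generic verifier-rewarded update step; none of these names (Policy, RLStep, stage_wise_rl) come from the AceReason codebase:

from typing import Callable, List

Policy = Callable[[str], str]                   # a model: prompt -> sampled response
RLStep = Callable[[Policy, List[str]], Policy]  # one verifier-rewarded RL update

def stage_wise_rl(policy: Policy,
                  math_prompts: List[str],
                  code_prompts: List[str],
                  rl_step: RLStep,
                  steps_per_stage: int) -> Policy:
    # Stage 1 trains on math-only prompts, stage 2 on code-only prompts,
    # mirroring the sequential recipe described above.
    for stage_prompts in (math_prompts, code_prompts):
        for _ in range(steps_per_stage):
            policy = rl_step(policy, stage_prompts)
    return policy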
Zhuolin Yang (@lucas110550)'s Twitter Profile Photo

Our released evaluation toolkit can reproduce our AceReason-Nemotron models' numbers (see below):

AceReason-Nemotron-1.0-7B:
LiveCodeBench (Avg@8):
* [05/23-05/24]: 72.0; [06/24-01/25]: 54.2
* release set v5: 51.2; release set v6: 44.4
AIME (Avg@64):
* AIME'24: 68.6; AIME'25:
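
Avg@k here reads as pass@1 averaged over k generations per problem, then averaged across problems. A minimal sketch under that assumption (the toolkit's actual implementation may differ):

from typing import List

def avg_at_k(results: List[List[bool]]) -> float:
    # results[p] holds k booleans: whether sample i of problem p passed.
    per_problem = [sum(r) / len(r) for r in results]
    return sum(per_problem) / len(per_problem)

# Example: two problems at Avg@8, with 6/8 and 2/8 samples passing
print(avg_at_k([[True] * 6 + [False] * 2,
                [True] * 2 + [False] * 6]))  # (0.75 + 0.25) / 2 = 0.5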

Rafael Valle (@rafaelvalleart)'s Twitter Profile Photo

🤯 Audio Flamingo 3 is out already... and that's before Audio Flamingo 2 makes its debut at ICML on Wednesday, July 16 at 4:30 p.m.!

These benchmark results are insane!
arxiv.org/abs/2507.08128