Sanjeev Satheesh (@issanjeev) 's Twitter Profile
Sanjeev Satheesh

@issanjeev

ID: 31904918

Joined: 16-04-2009 15:41:02

1.1K Tweets

478 Followers

388 Following

clem 🤗 (@clementdelangue) 's Twitter Profile Photo

Nvidia currently has the #1 trending open model & #1 trending open dataset and is close to 25,000 followers on Hugging Face. They've been really impactful for open-source AI recently!

Adi Renduchintala (@rendu_a) 's Twitter Profile Photo

Transformers are still dominating the LLM scene, but we show that higher-throughput alternatives exist which are just as strong! Grateful to have a part in the Nemotron-H Reasoning effort. 🙏 Technical report will be out soon, stay tuned!

Lucas Beyer (bl16) (@giffmana) 's Twitter Profile Photo

Oh wow, did you guys know that torch.compile can compile numpy code? And even run it on GPU?

This is pretty neat for all kinds of "surrounding" code besides the model (like evals and fancy metrics) that I used to do with numba/numexpr (cuz CPU-XLA was pretty meh).

Poll below
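As a hedged illustration of what that looks like (PyTorch 2.1 or later is assumed, and numpy_rmse is a made-up example function, not anything from the thread): a plain-NumPy function can be wrapped in torch.compile directly, and the documented pattern for running it on GPU is to call the compiled function under a CUDA device context.

```python
import numpy as np
import torch

# A plain-NumPy "surrounding" computation, like an eval metric.
# torch.compile (PyTorch >= 2.1) can trace NumPy code such as this.
@torch.compile
def numpy_rmse(pred, target):
    diff = pred - target
    return np.sqrt(np.mean(diff * diff))

pred = np.random.rand(1024).astype(np.float32)
target = np.random.rand(1024).astype(np.float32)

# Executes the traced graph with Torch ops on CPU; inputs and
# outputs stay as NumPy arrays.
print(numpy_rmse(pred, target))

# To run the same NumPy code on GPU, call the compiled function
# under a CUDA device context.
if torch.cuda.is_available():
    with torch.device("cuda"):
        print(numpy_rmse(pred, target))
```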
Somshubra Majumdar (@haseox94) 's Twitter Profile Photo

OpenReasoning-Nemotron models are super strong on math, science, and code, achieving scores that often surpass models way larger in size. We also show that, using the GenSelect algorithm, you can perform test-time compute scaling to significantly improve scores on all benchmarks
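For intuition, GenSelect-style test-time compute scaling amounts to sampling several candidate solutions and then having the model itself judge which one is best. The sketch below is an assumption about the general shape of such a loop, not NVIDIA's implementation; generate and judge are hypothetical stand-ins for real model calls.

```python
# Hedged sketch of generate-then-select test-time scaling in the
# spirit of GenSelect. `generate` and `judge` are hypothetical
# callables standing in for real model calls.
def genselect(problem, generate, judge, num_candidates=8):
    # Sample several independent candidate solutions.
    candidates = [generate(problem) for _ in range(num_candidates)]
    # Show all candidates to the model and ask it to pick the best.
    listing = "\n---\n".join(f"[{i}] {c}" for i, c in enumerate(candidates))
    prompt = (
        f"{problem}\n\nCandidate solutions:\n{listing}\n\n"
        "Which candidate index is most likely correct?"
    )
    best = judge(prompt)  # assumed to return an integer index
    return candidates[best]
```

Spending more test-time compute means raising num_candidates; the selection step is what lets the extra samples translate into accuracy.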

Midnight Maniac Sri (@sridatta) 's Twitter Profile Photo

'water is transparent only within a very narrow band of the electromagnetic spectrum, 

so living organisms evolved sensitivity to that band, and that's what we now call "visible light". '

(found via HN)
Bryan Catanzaro (@ctnzr) 's Twitter Profile Photo

Today we're releasing NVIDIA Nemotron Nano v2 - a 9B hybrid SSM that is 6X faster than similarly sized models, while also being more accurate.

Along with this model, we are also releasing most of the data we used to create it, including the pretraining corpus.

Links to the
Oleksii Kuchaiev (@kuchaev) 's Twitter Profile Photo

We are excited to release the Nvidia-Nemotron-Nano-V2 model! This is a 9B hybrid SSM model with an open base model and training data. This model also supports runtime "thinking" budget control. HF collection with base and post-trained models: huggingface.co/collections/nv…

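"Thinking budget control" here means capping how many reasoning tokens the model may emit before it answers. The sketch below only illustrates the idea; the token-stream interface and the <think>/</think> convention are assumptions, not the model's documented API.

```python
# Hedged sketch of a runtime "thinking budget". `generate_tokens` is a
# hypothetical callable that streams tokens for a prompt; the
# <think>/</think> tags are an assumed convention for illustration.
def generate_with_thinking_budget(generate_tokens, prompt, budget):
    thinking = []
    for token in generate_tokens(prompt + "<think>"):
        # Stop extending the reasoning trace once it closes itself
        # or the budget is spent.
        if token == "</think>" or len(thinking) >= budget:
            break
        thinking.append(token)
    # Force-close the thinking span and decode the final answer.
    final_prompt = prompt + "<think>" + "".join(thinking) + "</think>"
    return "".join(generate_tokens(final_prompt))
```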
Eric W. Tramel (@fujikanaeda) 's Twitter Profile Photo

we got you a fast ssm 9b post
we got you a 9b base
we got you a 12b base teacher
we got you 5 datasets
we got you 💚
research.nvidia.com/labs/adlr/NVID…

Rohan Paul (@rohanpaul_ai) 's Twitter Profile Photo

NVIDIA just released Nemotron Nano v2 - a 9B hybrid SSM (Mamba) that is 6X faster than similarly sized models, while also being more accurate.

Ready for commercial use, all available on Hugging Face

💾 Nemotron Nano 2 is a 9B hybrid Mamba Transformer for fast reasoning, up to
ADP Research (@adpresearch) 's Twitter Profile Photo

It’s true: The rise of #AI is impacting jobs, and for tenured workers it signals a benefit, at least for one very specific group of workers. Using ADP payroll data, Erik Brynjolfsson and his team at the Stanford Digital Economy Lab found that employment among young adults whose work is exposed
Bryan Catanzaro (@ctnzr) 's Twitter Profile Photo

As part of Nemotron, we're releasing a new Math dataset, made by rendering webpages using Lynx and then using an LLM to rewrite the result into LaTeX. Our models got much better at math when we started using this dataset. We hope it's helpful to the community. 💚
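A rough sketch of that kind of pipeline (lynx -dump is a real CLI flag for rendering a page to text; the prompt wording and the llm_rewrite callable are assumptions for illustration, not NVIDIA's actual code):

```python
import subprocess

def render_with_lynx(url):
    # Render the webpage to plain text. -dump writes the rendered page
    # to stdout; -nolist suppresses the trailing list of links.
    result = subprocess.run(
        ["lynx", "-dump", "-nolist", url],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Hypothetical rewrite step: ask an LLM to turn the rendered math back
# into LaTeX. The prompt and `llm_rewrite` are assumptions.
REWRITE_PROMPT = (
    "Rewrite the following rendered page text so that every mathematical "
    "expression is valid LaTeX, preserving the surrounding prose:\n\n{page}"
)

def to_latex(page_text, llm_rewrite):
    return llm_rewrite(REWRITE_PROMPT.format(page=page_text))
```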

Sanjeev Satheesh (@issanjeev) 's Twitter Profile Photo

Nemotron-CC-Math is a 133B-token *benchmark-agnostic pretraining dataset* built entirely from CommonCrawl. huggingface.co/datasets/nvidi…

Syeda Nahida Akter (@snat02792153) 's Twitter Profile Photo

✨ Core Idea:
 
Treat chain-of-thought (CoT) as an action before next-token prediction.

Reward = information gain on the next token.

✅ Dense, verifier-free signal
✅ Works and scales with ordinary pretraining text
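Made concrete, the reward is the log-likelihood improvement the sampled chain-of-thought gives the true next token over a no-CoT baseline. A minimal sketch, assuming a causal LM that maps token ids to logits of shape [batch, seq, vocab] (the model interface is an assumption, not the paper's code):

```python
import torch
import torch.nn.functional as F

def information_gain_reward(model, context_ids, cot_ids, next_token_id):
    # Hedged sketch of the information-gain reward described above.
    with torch.no_grad():
        # log p(next | context + cot): predictor conditioned on the CoT.
        with_cot = torch.cat([context_ids, cot_ids], dim=-1)
        logp_cot = F.log_softmax(model(with_cot)[0, -1], dim=-1)[next_token_id]
        # log p(next | context): no-CoT baseline on ordinary text, so no
        # external verifier is needed.
        logp_base = F.log_softmax(model(context_ids)[0, -1], dim=-1)[next_token_id]
    # Dense per-token reward: positive when the CoT made the true next
    # token more likely.
    return (logp_cot - logp_base).item()
```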
Shrimai (@shrimai_) 's Twitter Profile Photo

💫 Introducing RLP: Reinforcement Learning Pretraining, an information-driven, verifier-free objective that teaches models to think before they predict
🔥 +19% vs BASE on Qwen3-1.7B
🚀 +35% vs BASE on Nemotron-Nano-12B
📄 Paper: github.com/NVlabs/RLP/blo…
📝 Blog: research.nvidia.com/labs/adlr/RLP/
DailyPapers (@huggingpapers) 's Twitter Profile Photo

ServiceNow's Apriel-1.5-15B-Thinker: Frontier AI on a single GPU

This 15B-parameter open-weights multimodal model achieves state-of-the-art reasoning performance, matching models 8-10x its size—all without an RL phase!