Prithviraj (Raj) Ammanabrolu (@rajammanabrolu)'s Twitter Profile
Prithviraj (Raj) Ammanabrolu

@rajammanabrolu

Interactive & grounded AI, RL, NLP. Assistant Prof @UCSanDiego. Research Scientist @DbrxMosaicAI. Prev: @allen_ai, @GeorgiaTech

ID: 1115817574078582785

Link: https://prithvirajva.com
Joined: 10-04-2019 03:23:35

2.2K Tweets

6.6K Followers

593 Following

Pratyush Maini (@pratyushmaini)

1/ Pretraining is hitting a data wall; scaling raw web data alone leads to diminishing returns. Today DatologyAI shares BeyondWeb, our synthetic data approach & all the learnings from scaling it to trillions of tokens🧑🏼‍🍳
- 3B LLMs beat 8B models🚀
- Pareto frontier for performance

Oleksii Kuchaiev (@kuchaev)

We are excited to release Nvidia-Nemotron-Nano-V2 model! This is a 9B hybrid SSM model with open base model and training data. This model also supports runtime "thinking" budget control. HF collection with base and post trained models: huggingface.co/collections/nv…

Prithviraj (Raj) Ammanabrolu (@rajammanabrolu)

conference pc emails are weirdly passive aggressive "we are compiling a list" "there will be consequences" "as per your agreement" like chill this is a free service im doing late at night on my own time after my other two actually paid jobs, the world isn't ending in a day

Prithviraj (Raj) Ammanabrolu (@rajammanabrolu)

Oh hey so we did do this a few years ago for grade school science. Would love to see ppl get more of these Text Worlds up and running! arxiv.org/abs/2203.07540

Prithviraj (Raj) Ammanabrolu (@rajammanabrolu)

special place in hell for swes who don't understand backward compatibility. bruh wydm you made server side changes that broke my code from last week without updating your api ???

Yejin Choi (@yejinchoinka)

Honored to be back on TIME100 AI for 2025 — alongside my longtime heroes Fei-Fei Li and Regina Barzilay! 😍 The recognition goes to my amazing students and colleagues, who strive to find ways to use AI to better humanity, as opposed to making AI for the sake of making AI better

Banghua Zhu (@banghuaz)

Our team at Nvidia is hiring for full time! Please contact us if you're interested in working on LLM / DLM post-training and system optimization!

Eric W. Tramel (@fujikanaeda)

if you've ever used a nemotron model (esp nano v2 or llama nemotron super v1.5), looking to know what kinds of tasks didn't work for you in practice. Can DM or follow up here. we don't make user-facing ai apps, just open models, so I'm polling X in lieu of telemetry :)

Prithviraj (Raj) Ammanabrolu (@rajammanabrolu)

If a PhD student in the US gets an offer to intern at a Chinese frontier lab, we should be paying them to go and come back. Why are we making it so difficult???