Ronak Pradeep (@rpradeep42) 's Twitter Profile
Ronak Pradeep

@rpradeep42

PhD at @UWaterloo. LLMs + IR. Research interns @Apple @GoogleAI. Building @TREC_RAG.
There is no dark side in the moon, really. Matter of fact, it's all dark.

ID: 1145385790203211776

Joined: 30-06-2019 17:37:07

281 Tweets

597 Followers

559 Following

Ronak Pradeep (@rpradeep42) 's Twitter Profile Photo

4 weeks since launch & we at Yupp have gathered 2M+ preference data on 500+ models. Building a leaderboard capturing the nuances of the global community has been loads of fun. Check out the thread! Onwards🚀

Shivani Upadhyay (@ushivani3) 's Twitter Profile Photo

📢📢RAG 2025 topics are officially now released! 🔍Test narratives are out now (total 105): trec-rag.github.io/annoucements/2… Let the games begin! #TREC2025 #RAG

张大珂 ZHANG Dake (@zhangdake1998) 's Twitter Profile Photo

We use the same web collection as the TREC RAG Track. You can easily adapt your RAG systems for our track to see its performance in helping people better understand daily news.

Ronak Pradeep (@rpradeep42) 's Twitter Profile Photo

We've onboarded the Gemini 2.5 Flash-Lite along with variants (Thinking, Online, etc.) super quick on Yupp and are already gathering preferences! Check out the thread for more. Here's a fun comparison of the thinking variant (left) with the standard one!

Ronak Pradeep (@rpradeep42) 's Twitter Profile Photo

We are out with the official baselines for TREC RAG @ 2025 this year: github.com/castorini/ragn… Shivani Upadhyay and I had fun putting together a strong Retrieve (Pyserini) -> Rerank (RankLLM) -> Augmented Gen (Ragnarök) baseline and we hope to see you all beat it!

vinh q. tran (@vqctran) 's Twitter Profile Photo

Excited to see this go out and see it used beyond IMO -- congrats to the team!! Happy to have contributed some research to this model with Yi Tay and Steven Zheng :D

Josh McGrath (@j_mcgraph) 's Twitter Profile Photo

Along with GPT5, we're open sourcing a new eval, BrowseComp Long Context! It improves upon existing long context qa evals in data quality and input difficulty. Work with Kuo Lin, Julie Wang, and our mascot the longham. A bit more below

Ronak Pradeep (@rpradeep42) 's Twitter Profile Photo

Did I say four? Thirteen (: Standard, High, Low, Minimal Reasoning variants for each of GPT-5, mini, and nano! Here's a case where more reasoning definitely helps. Check out yupp.ai/chat/9491cc19-… and the songs!

Aditya Jayaprakash (@adijayaprakash) 's Twitter Profile Photo

We’ve raised our $10M Series A, led by Google Ventures. 18 months ago, when we started Blacksmith, building a CI cloud purpose-built to run CI workloads as fast as possible seemed like a pipe dream to us. It’s reasonable to say that we’ve made that a reality since. To give

Ronak Pradeep (@rpradeep42) 's Twitter Profile Photo

We at Yupp just shipped Help Me Choose 🚀 Now LLMs don’t just respond, they self-critique & cross-check each other 🤖⚔️🤖 At day's end, you’re the arbiter of your own taste! Fun example where OpenAI's GPT 5 & xAI's Grok 4 go at it & learn from each other (AND SO DO YOU!).
