Swaroop Nath (@swaroopnath6)'s Twitter Profile
Swaroop Nath

@swaroopnath6

Pre-Doctoral Researcher @GoogleDeepMind India | Ex-AI Researcher @LinkedIn | NLP @cfiltnlp @CSE_IITBombay
Tweets about RL, NLP, RLHF, and general AI-ML

ID: 1243052464992964608

26-03-2020 05:50:03

544 Tweets

460 Followers

145 Following

Kyle Corbitt (@corbtt)

If you're fine-tuning LLMs, Gemma 3 is the new 👑 and it's not close. Gemma 3 trounces Qwen/Llama models at every size!
 - Gemma 3 4B beats 7B/8B competition
 - Gemma 3 27B matches 70B competition

Vision benchmarks coming soon!
Andrew White 🐦‍⬛ (@andrewwhite01)

HLE has recently become the benchmark to beat for frontier agents. We at FutureHouse took a closer look at the chem and bio questions and found that about 30% of them are likely invalid, based on our analysis and third-party PhD evaluations. 1/7

Roberta Raileanu (@robertarail)

I’m building a new team at Google DeepMind to work on Open-Ended Discovery! We’re looking for strong Research Scientists and Research Engineers to help us push the frontier of autonomously discovering novel artifacts such as new knowledge, capabilities, or algorithms, in an

Dimitris Papailiopoulos (@dimitrispapail)

Excited about our new work: 
Language models develop computational circuits that are reusable AND TRANSFER across tasks.
Over a year ago, I tested GPT-4 on 200-digit addition, and the model managed to do it (without CoT!). Someone from OpenAI even clarified they NEVER trained
Sahil Goyal (@sahilgo6801)

Earlier, I curated this list of resources for the niche field of "Aesthetic Assessment of Graphic Designs". github.com/sahilg06/Aweso… I try to update it, as I think this area has good directions for future research and is very underexplored.

Swaroop Nath (@swaroopnath6)

The lesson of the week is: Slow is smooth, smooth is fast. Probably one of the best productivity tips I have incorporated this year 🤩

Swaroop Nath (@swaroopnath6)

"There's no language out there in nature"? A bit unbelievable! Language, or rather communication, forms the basis of collective intelligence. What I do agree on is that next-token prediction is probably not teaching the model actual procedural knowledge.

Raj Dabre (@prajdabre1)

Professor Pushpak Bhattacharyya passed away this morning. This world has lost a great human being and a researcher. May he rest in peace.