Harman Singh (@harman26singh) 's Twitter Profile
Harman Singh

@harman26singh

Researcher @GoogleDeepMind Prev: AI Resident @MetaAI, Undergrad @iitdelhi, INK Lab @CSatUSC, @IBMResearch. language, vision, reasoning

ID: 1133860417602772993

Link: https://harmandotpy.github.io/ · Joined: 29-05-2019 22:19:24

465 Tweets

844 Followers

1.1K Following

JosH100 (@josh_wills) 's Twitter Profile Photo

1/ Really looking forward to #PytorchConf this week in SF-- I've spent the last couple of months at DatologyAI immersed in the DataLoader ecosystem (especially for our VLM stack) and I have a few topics I would love to discuss with folks (DMs are open, say hi if you see me, etc.)

Andrej Karpathy (@karpathy) 's Twitter Profile Photo

Avijit Thawani (Avi) Haha. I am afraid people interpreted my “delete tokenizer” as “use bytes directly without BPE”, the issue is you *still* need bytes encoding arbitrariness even for that! Pixels is the only way. Just like humans. It is written. If GPT-10 uses utf8 at the input I will eat a shoe.
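One way to see the "encoding arbitrariness" that bytes inherit: the same visible string admits multiple valid UTF-8 byte sequences depending on Unicode normalization. A minimal sketch (the "café" example is my own illustration, not from the thread):

```python
import unicodedata

# The same visible string "café" under two Unicode normalization forms.
nfc = unicodedata.normalize("NFC", "cafe\u0301")  # é as one precomposed code point
nfd = unicodedata.normalize("NFD", "caf\u00e9")   # é as 'e' + combining accent

print(nfc == nfd)            # False: different code point sequences
print(nfc.encode("utf-8"))   # b'caf\xc3\xa9'   (5 bytes)
print(nfd.encode("utf-8"))   # b'cafe\xcc\x81'  (6 bytes)
```

So a byte-level model still sees two different inputs for one rendered string, which is one reading of why pixels sidestep the problem.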

Rachit Bansal (@rach_it_) 's Twitter Profile Photo

Excited to share one of the first projects from my PhD! We find that Adam (often seen as approximate second-order) can actually outperform Gauss-Newton (true second-order) in certain cases! Our 2x2 comparison across basis choice and gradient noise is revealing! Thread by Sham:
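For intuition on the comparison, here is a toy one-parameter nonlinear least-squares problem (my own construction, not the paper's setup) contrasting a Gauss-Newton update with Adam. In this clean, noiseless regime Gauss-Newton's curvature information wins; the paper's finding concerns when basis choice and gradient noise can flip that ordering.

```python
import math

xs = [i / 20 for i in range(21)]          # inputs in [0, 1]
ys = [math.exp(0.5 * x) for x in xs]      # noiseless targets, true a = 0.5

def loss(a):
    return sum((math.exp(a * x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Gauss-Newton on the scalar parameter a: a <- a - (J^T r) / (J^T J)
a_gn = 0.0
for _ in range(20):
    r = [math.exp(a_gn * x) - y for x, y in zip(xs, ys)]  # residuals
    J = [x * math.exp(a_gn * x) for x in xs]              # d r_i / d a
    a_gn -= sum(j * ri for j, ri in zip(J, r)) / sum(j * j for j in J)

# Adam on the same loss, with the analytic gradient
a_ad, m, v = 0.0, 0.0, 0.0
lr, b1, b2, eps = 0.05, 0.9, 0.999, 1e-8
for t in range(1, 301):
    g = sum(2 * (math.exp(a_ad * x) - y) * x * math.exp(a_ad * x)
            for x, y in zip(xs, ys)) / len(xs)
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    mh, vh = m / (1 - b1 ** t), v / (1 - b2 ** t)
    a_ad -= lr * mh / (math.sqrt(vh) + eps)

print(f"Gauss-Newton loss: {loss(a_gn):.2e}, Adam loss: {loss(a_ad):.2e}")
```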

Mohammed Safi Ur Rahman Khan (@safikhan2k) 's Twitter Profile Photo

Grateful to be named a recipient of the Google PhD Fellowship 2025 under the NLP track! Thanks to Google and my wonderful AI4Bharat family for making this journey so special.

Gabriele Berton (@gabriberton) 's Twitter Profile Photo

This is really crazy Yet another work showing that in-context learning on SOTA MLLMs (Gemini 2.5 Pro) not only does not help, but even hurts results! ICL on MLLMs is very much an open problem, and the biggest differentiator between LLMs and MLLMs [1/3]

Miles Brundage (@miles_brundage) 's Twitter Profile Photo

Seems like the whole "RL just surfaces intelligence, it doesn't increase it" series of papers is just an artifact of RL being a small fraction of compute in most LM contexts still, no? AlphaGo (etc.) shows quite clearly that there is nothing to this as a general matter

Sneha Kudugunta (@snehaark) 's Twitter Profile Photo

A new, tractable approach to study scaling laws for larger data mixtures compared to prior art. We achieve significantly better fit ($R^2=0.98$) on multilingual data mixtures with ~50 languages.

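For context on what an $R^2$ of 0.98 on a scaling-law fit means in practice, here is a minimal sketch (my own stand-in form, not the paper's mixture law): a simple power law $L(N) = A N^{-\alpha}$ is linear in log-log space, so ordinary least squares recovers it in closed form, and $R^2$ measures the fit quality.

```python
import math

# Hypothetical losses following L(N) = A * N^(-alpha)
# (a stand-in; the actual multilingual mixture law is richer).
A, alpha = 120.0, 0.32
Ns = [10 ** k for k in range(6, 10)]        # token counts 1e6 .. 1e9
Ls = [A * n ** (-alpha) for n in Ns]

# log L = log A - alpha * log N, so fit a line in log-log space.
X = [math.log(n) for n in Ns]
Y = [math.log(l) for l in Ls]
k = len(X)
xbar, ybar = sum(X) / k, sum(Y) / k
slope = (sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))
         / sum((x - xbar) ** 2 for x in X))
intercept = ybar - slope * xbar

pred = [intercept + slope * x for x in X]
ss_res = sum((y - p) ** 2 for y, p in zip(Y, pred))
ss_tot = sum((y - ybar) ** 2 for y in Y)
r2 = 1 - ss_res / ss_tot

print(f"alpha={-slope:.3f}  A={math.exp(intercept):.1f}  R^2={r2:.4f}")
```

With noiseless synthetic data the fit is exact; on real mixtures, noise and model misspecification are what pull $R^2$ below 1.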
rishabh ranjan (@_rishabhranjan_) 's Twitter Profile Photo

Transformers are great for sequences, but most business-critical predictions (e.g. product sales, customer churn, ad CTR, in-hospital mortality) rely on highly-structured relational data where signal is scattered across rows, columns, linked tables and time. Excited to finally

Gowthami Somepalli (@gowthami_s) 's Twitter Profile Photo

> be me
> come across a paper with interesting premise
> excitedly start reading
> claims are on mnist/cifar
> all excitement gone, reduced to atoms

Srinivas Narayanan (@snsf) 's Twitter Profile Photo

IndQA is a new benchmark designed to evaluate how well AI systems understand culture, context and history to answer questions that matter to people in India. With 2278 questions created in partnership with 250+ experts, IndQA dives deep into reasoning about everyday life,

Harman Singh (@harman26singh) 's Twitter Profile Photo

Exciting to see much-needed progress on evaluating Indic language/culture understanding! IndicGenBench shared these motivations and is one of the first generative evals for 29 Indic Languages! x.com/Harman26Singh/… Partha Talukdar Nitish Gupta

Harman Singh (@harman26singh) 's Twitter Profile Photo

Online Rubrics for non-verifiable domains reduce reward hacking. Cool co-training of generator and critic simultaneously! In our previous work, we show rubrics help in making more robust reward models: x.com/Harman26Singh/…

Mian Wu (@merlinnoth79247) 's Twitter Profile Photo

Can we run RL to train LLMs on hard-to-verify or open-ended tasks? Even when tasks are verifiable, it is often impossible to check every design detail or catch all mistakes. We can go prompt-tune LLM judges, but is that really the answer? Our new paper introduces RLAC: a

Sumanth (@sumanthd17) 's Twitter Profile Photo

here's a sneak peek into my life. one full year filled with a lot of moments. (Comes with its own hot takes) thanks to P and the lifeguard, I get to write this today! read the full article (in 🧵👇)

Aditya Grover (@adityagrover_) 's Twitter Profile Photo

In many ways, (continuous) diffusion models are in-place reasoners where the quality improves with more denoising steps. Lately, we have been extending this to language, combining RLVR with discrete diffusion, resulting in d1 (arxiv.org/abs/2504.12216, NeurIPS2025 spotlight).

Lakshya A Agrawal (@lakshyaaagrawal) 's Twitter Profile Photo

GEPA featured in OpenAI and Bain & Company new cookbook tutorial, showing how to build self-evolving agents that move beyond static prompts. See how GEPA dynamically enables agents to autonomously reflect, learn from feedback, and evolve their own instructions.

Pang Wei Koh (@pangweikoh) 's Twitter Profile Photo

Two ideas here for scaling up RL for reasoning:
1. Procedurally generating (verifiable) problems lets us adapt difficulty to the model, making training more efficient
2. Teaching the model to reason by hand (e.g., sort numbers w/o code) generalizes to realistic reasoning tasks!
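The procedural-generation idea can be sketched minimally: a generator parameterized by difficulty, an exact verifier, and a curriculum rule that adapts difficulty to the model's success. Everything here (the sorting task, the `adapt` rule, the `fake_model` stub standing in for an LM rollout) is my own illustrative assumption, not the paper's recipe.

```python
import random

random.seed(0)

def make_problem(difficulty):
    """Procedurally generate a verifiable sorting problem; difficulty = list length."""
    nums = [random.randint(0, 99) for _ in range(difficulty)]
    prompt = f"Sort these numbers in ascending order: {nums}"
    return prompt, nums

def verify(nums, answer):
    """Exact check: the answer must be the sorted input."""
    return answer == sorted(nums)

def adapt(difficulty, solved, lo=3, hi=50):
    """Curriculum step: harder after a success, easier after a failure."""
    return min(hi, difficulty + 1) if solved else max(lo, difficulty - 1)

# Stand-in for a model rollout (a real setup would sample from the LM).
def fake_model(nums):
    return sorted(nums)

difficulty = 3
for step in range(5):
    prompt, nums = make_problem(difficulty)
    solved = verify(nums, fake_model(nums))
    difficulty = adapt(difficulty, solved)

print(difficulty)  # difficulty ratchets up while the model keeps solving
```

Because the verifier is exact, every generated problem yields a clean reward signal, which is what makes the difficulty adaptation well-posed.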