Harman Singh (@harman26singh) 's Twitter Profile
Harman Singh

@harman26singh

Researcher @GoogleDeepMind Prev: AI Resident @MetaAI, Undergrad @iitdelhi, INK Lab @CSatUSC, @IBMResearch. language, vision, reasoning

ID: 1133860417602772993

Link: https://harmandotpy.github.io/ · Joined: 29-05-2019 22:19:24

465 Tweets

844 Followers

1.1K Following

JosH100 (@josh_wills) 's Twitter Profile Photo

1/ Really looking forward to #PytorchConf this week in SF-- I've spent the last couple of months at DatologyAI immersed in the DataLoader ecosystem (especially for our VLM stack) and I have a few topics I would love to discuss with folks (DMs are open, say hi if you see me, etc.)

Andrej Karpathy (@karpathy) 's Twitter Profile Photo

Avijit Thawani (Avi) Haha. I am afraid people interpreted my “delete tokenizer” as “use bytes directly without BPE”, the issue is you *still* need bytes encoding arbitrariness even for that! Pixels is the only way. Just like humans. It is written. If GPT-10 uses utf8 at the input I will eat a shoe.
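One way to see the "encoding arbitrariness" that bytes inherit: the same visible string admits multiple valid UTF-8 byte sequences depending on Unicode normalization. A minimal sketch (the "café" example is my own illustration, not from the thread):

```python
import unicodedata

# The same visible string "café" under two Unicode normalization forms.
nfc = unicodedata.normalize("NFC", "cafe\u0301")  # é as one precomposed code point
nfd = unicodedata.normalize("NFD", "caf\u00e9")   # é as 'e' + combining accent

print(nfc == nfd)            # False: different code point sequences
print(nfc.encode("utf-8"))   # b'caf\xc3\xa9'   (5 bytes)
print(nfd.encode("utf-8"))   # b'cafe\xcc\x81'  (6 bytes)
```

So a byte-level model still sees two different inputs for one rendered string, which is one reading of why pixels sidestep the problem.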

Rachit Bansal (@rach_it_) 's Twitter Profile Photo

Excited to share one of the first projects from my PhD! We find that Adam (often seen as approximate second-order) can actually outperform Gauss-Newton (true second-order) in certain cases! Our 2x2 comparison across basis choice and gradient noise is revealing! Thread by Sham:
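For intuition on the comparison, here is a toy one-parameter nonlinear least-squares problem (my own construction, not the paper's setup) contrasting a Gauss-Newton update with Adam. In this clean, noiseless regime Gauss-Newton's curvature information wins; the paper's finding concerns when basis choice and gradient noise can flip that ordering.

```python
import math

xs = [i / 20 for i in range(21)]          # inputs in [0, 1]
ys = [math.exp(0.5 * x) for x in xs]      # noiseless targets, true a = 0.5

def loss(a):
    return sum((math.exp(a * x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Gauss-Newton on the scalar parameter a: a <- a - (J^T r) / (J^T J)
a_gn = 0.0
for _ in range(20):
    r = [math.exp(a_gn * x) - y for x, y in zip(xs, ys)]  # residuals
    J = [x * math.exp(a_gn * x) for x in xs]              # d r_i / d a
    a_gn -= sum(j * ri for j, ri in zip(J, r)) / sum(j * j for j in J)

# Adam on the same loss, with the analytic gradient
a_ad, m, v = 0.0, 0.0, 0.0
lr, b1, b2, eps = 0.05, 0.9, 0.999, 1e-8
for t in range(1, 301):
    g = sum(2 * (math.exp(a_ad * x) - y) * x * math.exp(a_ad * x)
            for x, y in zip(xs, ys)) / len(xs)
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    mh, vh = m / (1 - b1 ** t), v / (1 - b2 ** t)
    a_ad -= lr * mh / (math.sqrt(vh) + eps)

print(f"Gauss-Newton loss: {loss(a_gn):.2e}, Adam loss: {loss(a_ad):.2e}")
```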

Mohammed Safi Ur Rahman Khan (@safikhan2k) 's Twitter Profile Photo

Grateful to be named a recipient of the Google PhD Fellowship 2025 under the NLP track! Thanks to Google and my wonderful AI4Bharat family for making this journey so special.

Gabriele Berton (@gabriberton) 's Twitter Profile Photo

This is really crazy Yet another work showing that in-context learning on SOTA MLLMs (Gemini 2.5 Pro) not only does not help, but even hurts results! ICL on MLLMs is very much an open problem, and the biggest differentiator between LLMs and MLLMs [1/3]

Miles Brundage (@miles_brundage) 's Twitter Profile Photo

Seems like the whole "RL just surfaces intelligence, it doesn't increase it" series of papers is just an artifact of RL being a small fraction of compute in most LM contexts still, no? AlphaGo (etc.) shows quite clearly that there is nothing to this as a general matter

Sneha Kudugunta (@snehaark) 's Twitter Profile Photo

A new, tractable approach to study scaling laws for larger data mixtures compared to prior art. We achieve significantly better fit ($R^2=0.98$) on multilingual data mixtures with ~50 languages.

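For context on what an $R^2$ of 0.98 on a scaling-law fit means in practice, here is a minimal sketch (my own stand-in form, not the paper's mixture law): a simple power law $L(N) = A N^{-\alpha}$ is linear in log-log space, so ordinary least squares recovers it in closed form, and $R^2$ measures the fit quality.

```python
import math

# Hypothetical losses following L(N) = A * N^(-alpha)
# (a stand-in; the actual multilingual mixture law is richer).
A, alpha = 120.0, 0.32
Ns = [10 ** k for k in range(6, 10)]        # token counts 1e6 .. 1e9
Ls = [A * n ** (-alpha) for n in Ns]

# log L = log A - alpha * log N, so fit a line in log-log space.
X = [math.log(n) for n in Ns]
Y = [math.log(l) for l in Ls]
k = len(X)
xbar, ybar = sum(X) / k, sum(Y) / k
slope = (sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))
         / sum((x - xbar) ** 2 for x in X))
intercept = ybar - slope * xbar

pred = [intercept + slope * x for x in X]
ss_res = sum((y - p) ** 2 for y, p in zip(Y, pred))
ss_tot = sum((y - ybar) ** 2 for y in Y)
r2 = 1 - ss_res / ss_tot

print(f"alpha={-slope:.3f}  A={math.exp(intercept):.1f}  R^2={r2:.4f}")
```

With noiseless synthetic data the fit is exact; on real mixtures, noise and model misspecification are what pull $R^2$ below 1.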
rishabh ranjan (@_rishabhranjan_) 's Twitter Profile Photo

Transformers are great for sequences, but most business-critical predictions (e.g. product sales, customer churn, ad CTR, in-hospital mortality) rely on highly-structured relational data where signal is scattered across rows, columns, linked tables and time. Excited to finally

Gowthami Somepalli (@gowthami_s) 's Twitter Profile Photo

> be me
> come across a paper with interesting premise
> excitedly start reading
> claims are on mnist/cifar
> all excitement gone, reduced to atoms

Srinivas Narayanan (@snsf) 's Twitter Profile Photo

IndQA is a new benchmark designed to evaluate how well AI systems understand culture, context and history to answer questions that matter to people in India. With 2278 questions created in partnership with 250+ experts, IndQA dives deep into reasoning about everyday life,

Harman Singh (@harman26singh) 's Twitter Profile Photo

Exciting to see much-needed progress on evaluating Indic language/culture understanding! IndicGenBench shared these motivations and is one of the first generative evals for 29 Indic Languages! x.com/Harman26Singh/… Partha Talukdar Nitish Gupta

Harman Singh (@harman26singh) 's Twitter Profile Photo

Online Rubrics for non-verifiable domains reduce reward hacking. Cool co-training of generator and critic simultaneously! In our previous work, we show rubrics help in making more robust reward models: x.com/Harman26Singh/…

Mian Wu (@merlinnoth79247) 's Twitter Profile Photo

Can we run RL to train LLMs on hard-to-verify or open-ended tasks? Even when tasks are verifiable, it is often impossible to check every design detail or catch all mistakes. We can go prompt-tune LLM judges, but is that really the answer? Our new paper introduces RLAC: a

Sumanth (@sumanthd17) 's Twitter Profile Photo

here's a sneak peek into my life. one full year filled with a lot of moments. (Comes with its own hot takes) thanks to P and the lifeguard, I get to write this today! read the full article (in 🧵👇)

Aditya Grover (@adityagrover_) 's Twitter Profile Photo

In many ways, (continuous) diffusion models are in-place reasoners where the quality improves with more denoising steps. Lately, we have been extending this to language, combining RLVR with discrete diffusion, resulting in d1 (arxiv.org/abs/2504.12216, NeurIPS2025 spotlight).

Lakshya A Agrawal (@lakshyaaagrawal) 's Twitter Profile Photo

GEPA featured in OpenAI and Bain & Company new cookbook tutorial, showing how to build self-evolving agents that move beyond static prompts. See how GEPA dynamically enables agents to autonomously reflect, learn from feedback, and evolve their own instructions.

Pang Wei Koh (@pangweikoh) 's Twitter Profile Photo

Two ideas here for scaling up RL for reasoning:
1. Procedurally generating (verifiable) problems lets us adapt difficulty to the model, making training more efficient
2. Teaching the model to reason by hand (e.g., sort numbers w/o code) generalizes to realistic reasoning tasks!
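The procedural-generation idea can be sketched minimally: a generator parameterized by difficulty, an exact verifier, and a curriculum rule that adapts difficulty to the model's success. Everything here (the sorting task, the `adapt` rule, the `fake_model` stub standing in for an LM rollout) is my own illustrative assumption, not the paper's recipe.

```python
import random

random.seed(0)

def make_problem(difficulty):
    """Procedurally generate a verifiable sorting problem; difficulty = list length."""
    nums = [random.randint(0, 99) for _ in range(difficulty)]
    prompt = f"Sort these numbers in ascending order: {nums}"
    return prompt, nums

def verify(nums, answer):
    """Exact check: the answer must be the sorted input."""
    return answer == sorted(nums)

def adapt(difficulty, solved, lo=3, hi=50):
    """Curriculum step: harder after a success, easier after a failure."""
    return min(hi, difficulty + 1) if solved else max(lo, difficulty - 1)

# Stand-in for a model rollout (a real setup would sample from the LM).
def fake_model(nums):
    return sorted(nums)

difficulty = 3
for step in range(5):
    prompt, nums = make_problem(difficulty)
    solved = verify(nums, fake_model(nums))
    difficulty = adapt(difficulty, solved)

print(difficulty)  # difficulty ratchets up while the model keeps solving
```

Because the verifier is exact, every generated problem yields a clean reward signal, which is what makes the difficulty adaptation well-posed.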