Harman Singh
@harman26singh
Researcher @GoogleDeepMind Prev: AI Resident @MetaAI, Undergrad @iitdelhi, INK Lab @CSatUSC, @IBMResearch. language, vision, reasoning
ID: 1133860417602772993
https://harmandotpy.github.io/ 29-05-2019 22:19:24
465 Tweets
844 Followers
1.1K Following
1/ Really looking forward to #PytorchConf this week in SF-- I've spent the last couple of months at DatologyAI immersed in the DataLoader ecosystem (especially for our VLM stack) and I have a few topics I would love to discuss with folks (DMs are open, say hi if you see me, etc.)
Avijit Thawani (Avi) Haha. I am afraid people interpreted my “delete tokenizer” as “use bytes directly without BPE”, the issue is you *still* need bytes encoding arbitrariness even for that! Pixels is the only way. Just like humans. It is written. If GPT-10 uses utf8 at the input I will eat a shoe.
Exciting to see much-needed progress on evaluating Indic language/culture understanding! IndicGenBench shared these motivations and is one of the first generative evals for 29 Indic Languages! x.com/Harman26Singh/… Partha Talukdar Nitish Gupta