Sulin Liu (@su_lin_liu)'s Twitter Profile
Sulin Liu

@su_lin_liu

Postdoc @MIT Ex: Machine Learning PhD @Princeton @Meta @NTUsg @NUSingapore

ID: 264576956

Link: https://liusulin.github.io/ · Joined: 12-03-2011 03:48:27

216 Tweets

592 Followers

1.1K Following

Lin Zheng (@linzhengisme)'s Twitter Profile Photo

🚀 Meet EvaByte: The best open-source tokenizer-free language model! Our 6.5B byte LM matches modern tokenizer-based LMs with 5x less data & 2x faster decoding, naturally extending to multimodal tasks while fixing tokenization quirks.

💻 Blog: bit.ly/3CjEmTC 

🧵 1/9
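
A minimal sketch of the tokenizer-free idea behind a byte LM (not EvaByte's actual code): the raw UTF-8 bytes of the text serve directly as token IDs, so the vocabulary is fixed at 256 symbols and the usual tokenizer quirks (merge rules, unknown tokens) simply don't arise.

```python
# Minimal sketch of byte-level "tokenization": the UTF-8 bytes of a string
# are used directly as token IDs, so the vocabulary is fixed at 256 symbols
# and no learned tokenizer is needed.

def bytes_to_ids(text: str) -> list[int]:
    """Encode text as a sequence of byte IDs in [0, 255]."""
    return list(text.encode("utf-8"))

def ids_to_text(ids: list[int]) -> str:
    """Decode byte IDs back to text; errors='replace' guards partial splits."""
    return bytes(ids).decode("utf-8", errors="replace")

if __name__ == "__main__":
    ids = bytes_to_ids("tokenization quirks 🚀")
    print(ids[:10])           # plain ASCII maps to one byte per character
    print(ids_to_text(ids))   # round-trips exactly, emoji included
```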
Sulin Liu (@su_lin_liu)'s Twitter Profile Photo

Check out this new paper on how to do planning for discrete diffusion 👏 Really exciting to see more exploration in this direction 🔥

Sitan Chen (@sitanch)'s Twitter Profile Photo

Excited about this new work where we dig into the role of token order in masked diffusions!
MDMs train on some horribly hard tasks, but careful planning at inference can sidestep the hardest ones, dramatically improving over vanilla MDM sampling (e.g. 7%->90% acc on Sudoku)  1/
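
One way to picture the "careful planning at inference" idea is a confidence-ordered unmasking loop. This is a generic sketch, not necessarily the paper's procedure, and `model_logits` is a hypothetical stand-in for a trained masked diffusion model's per-position predictions.

```python
import numpy as np

MASK = -1  # sentinel ID for a still-masked position

def planned_unmask(x, model_logits, rng):
    """Illustrative decoding loop for a masked diffusion model: at every step,
    score each masked position by the model's confidence and fill in the
    easiest one first, instead of committing to a fixed or random order."""
    x = np.array(x)
    while (x == MASK).any():
        logits = model_logits(x)                         # (seq_len, vocab_size)
        probs = np.exp(logits - logits.max(-1, keepdims=True))
        probs /= probs.sum(-1, keepdims=True)
        masked = np.where(x == MASK)[0]                  # indices still masked
        confidence = probs[masked].max(-1)               # best prob per slot
        pick = masked[confidence.argmax()]               # plan: easiest slot first
        x[pick] = rng.choice(probs.shape[-1], p=probs[pick])
    return x
```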
Ji-Ha (@ji_ha_kim)'s Twitter Profile Photo

I can’t begin to imagine how strong Anthropic’s internal models must be, since Claude was by far the strongest of the standard non-reasoning models: it’s the only one that could escape getting stuck in loops, a recurring problem that none of the other LLMs has overcome.

Sulin Liu (@su_lin_liu)'s Twitter Profile Photo

Discrete diffusion (including masked language models) deserves more investment in research and compute, especially as we run out of pre-training data for autoregressive LLMs. You can get a lot more data for free just by masking the data or perturbing it with noise.
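
As a sketch of the "more data for free" point: a single sequence yields a different training example every time a fresh mask pattern (or masking rate) is sampled, which is what a masked / discrete-diffusion objective does. The snippet below is illustrative only.

```python
import random

MASK = "[MASK]"

def corrupt(tokens, mask_rate, rng):
    """Randomly mask a fraction of tokens; the (corrupted, original) pair is
    one training example for a masked / discrete-diffusion objective."""
    return [MASK if rng.random() < mask_rate else t for t in tokens]

rng = random.Random(0)
sentence = "discrete diffusion reuses the same text many times over".split()

# The same sentence becomes a different training example each time a new
# mask pattern and masking rate are sampled.
for _ in range(3):
    rate = rng.uniform(0.1, 0.9)
    print(corrupt(sentence, rate, rng))
```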

Sulin Liu (@su_lin_liu)'s Twitter Profile Photo

Grok also tends to do more solution verification at the end of its solutions than ChatGPT. Clearly this cannot be baked in through just RL from verifiable rewards...

Stefano Ermon (@stefanoermon)'s Twitter Profile Photo

Excited to share that I’ve been working on scaling up diffusion language models at Inception. A new generation of LLMs with unprecedented capabilities is coming!

David Duvenaud (@davidduvenaud)'s Twitter Profile Photo

LLMs have complex joint beliefs about all sorts of quantities. And my postdoc James Requeima visualized them! In this thread we show LLM predictive distributions conditioned on data and free-form text. LLMs pick up on all kinds of subtle and unusual structure: 🧵
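
One simple way to probe such a predictive distribution (not the method in the thread) is to ask the same numeric question many times at nonzero temperature and summarize the answers; `llm_sample` below is a hypothetical single-completion client.

```python
import statistics

def empirical_predictive(llm_sample, prompt, n=200):
    """Draw many completions for the same numeric question, parse the answers,
    and summarize them as an empirical predictive distribution.
    `llm_sample(prompt) -> str` is a hypothetical stand-in for whichever
    client returns one completion."""
    values = []
    for _ in range(n):
        reply = llm_sample(prompt)
        try:
            values.append(float(reply.strip().split()[0]))
        except (ValueError, IndexError):
            continue  # skip replies that do not start with a number
    deciles = statistics.quantiles(values, n=10)
    return {"n": len(values),
            "median": statistics.median(values),
            "p10": deciles[0],
            "p90": deciles[-1]}
```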

Federico Cassano (@ellev3n11)'s Twitter Profile Photo

I think that all the "pre-training is dead" takes are bad. The issue with these big, big models is that they are capped by dogwater human-labeled post-training data. We shall continue to scale by exploiting verified RL. Excited to see GPT-4.5 used as the base for the next o-series model.

Kenny Peng (@kennylpeng)'s Twitter Profile Photo

Our lab had a #dogathon 🐕 yesterday where we analyzed NYC Open Data on dog licenses. We learned a lot of dog facts, which I’ll share in this thread 🧵  

1) Geospatial trends: Cavalier King Charles Spaniels are common in Manhattan; the opposite is true for Yorkshire Terriers.
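
A rough sketch of the kind of breed-by-borough tabulation behind that observation; the file name and the column names ("Borough", "BreedName") are assumptions about the NYC Open Data dog-license schema, not the lab's actual analysis code.

```python
import pandas as pd

# Hypothetical sketch: the file name and the column names ("Borough",
# "BreedName") are assumptions about the NYC Open Data dog-license schema.
df = pd.read_csv("nyc_dog_licenses.csv")

counts = (
    df.groupby(["Borough", "BreedName"])
      .size()
      .rename("n")
      .reset_index()
)

# Share of each borough's licenses held by each breed, so breeds can be
# compared across boroughs of very different sizes.
counts["share"] = counts["n"] / counts.groupby("Borough")["n"].transform("sum")

print(
    counts[counts["BreedName"].isin(["Cavalier King Charles Spaniel",
                                     "Yorkshire Terrier"])]
    .sort_values(["BreedName", "share"], ascending=[True, False])
)
```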