Xiaochuang Han (@xiaochuanghan) 's Twitter Profile
Xiaochuang Han

@xiaochuanghan

PhD student at the University of Washington

ID: 4916685123

Link: http://xhan77.github.io
Joined: 16-02-2016 00:47:56

93 Tweets

567 Followers

730 Following

Shangbin Feng (@shangbinfeng) 's Twitter Profile Photo

🚨 Detecting social media bots has always been an arms race: we design better detectors with advanced ML tools, while more evasive bots emerge adversarially.

What do LLMs bring to the arms race between bot detectors and operators?

A thread 🧵 #ACL2024
Sachin Kumar (@shocheen) 's Twitter Profile Photo

Check out our paper on model noncompliance. We outline a taxonomy of requests that LLMs should not comply with beyond only unsafe queries. Based on this taxonomy we create CoCoNot, a resource for training and evaluating models' noncompliance. More details in this thread 👇🏾

Leo Liu (@zeyuliu10) 's Twitter Profile Photo

Knowledge updates for code LLMs: Code LLMs generate calls to libraries like numpy and pandas. What if these libraries change? Can we update LLMs with modified function definitions?

Test this with our new benchmark, CodeUpdateArena. Findings: updating LLMs' API knowledge is hard!
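
To make the setup concrete, here is a small hypothetical sketch, in the spirit of the benchmark rather than its actual format: a library function's definition changes, and the check is whether a model that received the update writes code against the new definition (the function, the update, and the test are invented here for illustration).

```python
# Hypothetical illustration of an API-update example (names and the update
# itself are invented; they are not instances from CodeUpdateArena).

# Old definition the model presumably saw during pretraining:
#     def clip(a, a_min, a_max): ...
# Updated definition supplied to the model as new knowledge:
def clip(a, low=None, high=None):
    """Clamp each element of `a` into the closed interval [low, high]."""
    return [min(max(x, low), high) for x in a]

def uses_updated_api(generated_code: str) -> bool:
    """Run a model completion and check that it produced the expected result
    by calling the *new* keyword arguments rather than the old ones."""
    env = {"clip": clip}
    exec(generated_code, env)
    return env.get("result") == [0, 1, 2, 2]

# A completion from a successfully updated model might look like:
completion = "result = clip([-5, 1, 2, 9], low=0, high=2)"
print(uses_updated_api(completion))  # True if the update took hold
```
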
Weijia Shi (@weijiashi2) 's Twitter Profile Photo

Can ๐ฆ๐š๐œ๐ก๐ข๐ง๐ž ๐ฎ๐ง๐ฅ๐ž๐š๐ซ๐ง๐ข๐ง๐  make language models forget their training data? We shows Yes but at the cost of privacy and utility. Current unlearning scales poorly with the size of the data to be forgotten and canโ€™t handle sequential unlearning requests. ๐Ÿ”—:

Can ๐ฆ๐š๐œ๐ก๐ข๐ง๐ž ๐ฎ๐ง๐ฅ๐ž๐š๐ซ๐ง๐ข๐ง๐  make language models forget their training data?

We shows Yes but at the cost of privacy and utility. Current unlearning scales poorly with the size of the data to be forgotten and canโ€™t handle sequential unlearning requests.

๐Ÿ”—:
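
For intuition on the utility cost, one common unlearning baseline is gradient ascent on the forget set. A minimal PyTorch-style sketch is below; the `.loss` interface assumes a Hugging Face-style causal LM, and this is an illustrative baseline, not the specific methods evaluated in the paper.

```python
def unlearn_step(model, optimizer, forget_batch, retain_batch, alpha=1.0):
    """One gradient-ascent unlearning update: push the loss up on the data to
    be forgotten while keeping it low on retained data."""
    forget_loss = model(**forget_batch).loss   # assumes HF-style outputs with .loss
    retain_loss = model(**retain_batch).loss
    # The negative sign maximizes loss on the forget set; as the forget set
    # grows, or requests arrive sequentially, this increasingly erodes
    # general utility, which is the trade-off described above.
    loss = -alpha * forget_loss + retain_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return forget_loss.item(), retain_loss.item()
```
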
Sachin Kumar (@shocheen) 's Twitter Profile Photo

You think your model just fell out of a coconot tree 🥥? It should not always comply in the context of all it has seen in the request. Check out our paper on contextual noncompliance.

Shangbin Feng (@shangbinfeng) 's Twitter Profile Photo

What can we do when certain values, cultures, and communities are underrepresented in LLM alignment?

Introducing Modular Pluralism, where a general LLM interacts with a pool of specialized community LMs in various modes to advance pluralistic alignment.

A thread 🧵
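
A rough sketch of the interaction pattern described above; the `generate` interface, prompt templates, and aggregation strategy are assumptions for illustration, not the paper's exact modes.

```python
# Hypothetical sketch: a general LLM consults a pool of community-specific LMs
# and folds their perspectives into its final answer.

def modular_pluralism(prompt, base_lm, community_lms):
    # Each community LM contributes a short perspective on the request.
    perspectives = [
        lm.generate(f"From your community's perspective, respond to: {prompt}")
        for lm in community_lms
    ]
    # The general LLM synthesizes a response that reflects the collected views.
    context = "\n".join(f"- {p}" for p in perspectives)
    return base_lm.generate(
        f"{prompt}\n\nConsider these community perspectives:\n{context}"
    )
```
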
Alisa Liu (@alisawuffles) 's Twitter Profile Photo

What do BPE tokenizers reveal about their training data? 🧐

We develop an attack 🗡️ that uncovers the training data mixtures 📊 of commercial LLM tokenizers (incl. GPT-4o), using their ordered merge lists!

Co-1️⃣st Jonathan Hayase
arxiv.org/abs/2407.16607 🧵⬇️
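
The underlying intuition: BPE learns merges greedily by pair frequency, so a tokenizer's ordered merge list constrains which data mixture could have produced it. Below is a heavily simplified illustration of that idea, scoring candidate mixture weights by how consistent they are with the observed merge order; it is an assumption-laden sketch, not the paper's actual inference procedure.

```python
from collections import Counter

def pair_counts(corpus_tokens):
    """Count adjacent symbol pairs in a pre-tokenized corpus (list of token lists)."""
    counts = Counter()
    for toks in corpus_tokens:
        counts.update(zip(toks, toks[1:]))
    return counts

def mixture_score(merge_list, corpora_counts, weights):
    """Score a candidate mixture: higher when earlier merges in the observed
    list correspond to higher pair frequencies under the weighted mixture."""
    mixed = Counter()
    for w, counts in zip(weights, corpora_counts):
        for pair, c in counts.items():
            mixed[pair] += w * c
    freqs = [mixed.get(pair, 0) for pair in merge_list]
    # Count how often the greedy ordering is respected by this mixture.
    return sum(1 for a, b in zip(freqs, freqs[1:]) if a >= b)
```
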
Shangbin Feng (@shangbinfeng) 's Twitter Profile Photo

Instruction tuning with synthetic graph data leads to graph LLMs, but:

Are LLMs learning generalizable graph reasoning skills or merely memorizing patterns in the synthetic training data? 🤔 (for example, patterns in how a graph is described in natural language)

A thread 🧵
AK (@_akhaliq) 's Twitter Profile Photo

JPEG-LM

LLMs as Image Generators with Canonical Codec Representations

discuss: huggingface.co/papers/2408.08…

Recent work in image and video generation has been adopting the autoregressive LLM architecture due to its generality and potentially easy integration into multi-modal
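
The core idea, per the title above, is to treat an image's canonical JPEG byte stream as an ordinary token sequence for an autoregressive LM. A minimal sketch, assuming plain byte-level tokenization (the paper's exact tokenization scheme may differ):

```python
from pathlib import Path

def jpeg_to_tokens(path: str) -> list[int]:
    """Read an image's raw JPEG bytes and map each byte to a token id (0-255)."""
    return list(Path(path).read_bytes())

def tokens_to_jpeg(tokens: list[int], path: str) -> None:
    """Write a generated token sequence back out as a viewable JPEG file."""
    Path(path).write_bytes(bytes(tokens))

# Training then reduces to ordinary next-token prediction over these byte
# sequences, and samples from the LM can be saved directly and opened as JPEGs.
```
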
tsvetshop (@tsvetshop) 's Twitter Profile Photo

Huge congrats to Oreva Ahia and Shangbin Feng for winning awards at #ACL2024!

DialectBench: Best Social Impact Paper Award (arxiv.org/abs/2403.11009)

Don't Hallucinate, Abstain: Area Chair Award (QA track) & Outstanding Paper Award (arxiv.org/abs/2402.00367)

Chunting Zhou (@violet_zct) 's Twitter Profile Photo

Introducing *Transfusion* - a unified approach for training models that can generate both text and images. arxiv.org/pdf/2408.11039

Transfusion combines language modeling (next token prediction) with diffusion to train a single transformer over mixed-modality sequences. This
Lili Yu (@liliyu_lili) 's Twitter Profile Photo

🚀 Excited to share our latest work: Transfusion! A new multi-modal generative training approach combining language modeling and image diffusion in a single transformer! Huge shoutout to Chunting Zhou, Omer Levy, Michi Yasunaga, Arun Babu, Kushal Tirumala, and other collaborators.
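
As the two Transfusion tweets above describe, a single transformer is trained with a language-modeling loss on text and a diffusion loss on images over mixed-modality sequences. A minimal sketch of such a combined objective; the model interface, loss weighting, and epsilon-prediction formulation are assumptions for illustration, not the paper's exact implementation.

```python
import torch.nn.functional as F

def transfusion_style_loss(model, text_tokens, noisy_latents, noise, timesteps,
                           lambda_img=1.0):
    """Combined objective over a mixed-modality batch: next-token prediction
    on text plus a denoising (epsilon-prediction) loss on image latents."""
    # Assumed interface: one forward pass returns text logits and the
    # predicted noise for the image portion of the sequence.
    text_logits, predicted_noise = model(text_tokens, noisy_latents, timesteps)
    # Standard causal LM loss: predict token t+1 from positions <= t.
    lm_loss = F.cross_entropy(
        text_logits[:, :-1].transpose(1, 2), text_tokens[:, 1:]
    )
    # Standard diffusion loss: regress the added noise.
    diffusion_loss = F.mse_loss(predicted_noise, noise)
    return lm_loss + lambda_img * diffusion_loss
```
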

Pang Wei Koh (@pangweikoh) 's Twitter Profile Photo

Check out JPEG-LM, a fun idea led by Xiaochuang Han -- we generate images simply by training an LM on raw JPEG bytes and show that it outperforms much more complicated VQ models, especially on rare inputs.

Marjan Ghazvininejad (@gh_marjan) 's Twitter Profile Photo

Can we train an LM on raw JPEG bytes and generate images with that? Yes we can. Check out JPEG-LM (arxiv.org/abs/2408.08459), a cool work led by @XiaochuangHan, to learn more.