Xiaochuang Han (@xiaochuanghan) 's Twitter Profile
Xiaochuang Han

@xiaochuanghan

PhD student at the University of Washington

ID: 4916685123

Link: http://xhan77.github.io
Joined: 16-02-2016 00:47:56

93 Tweets

567 Followers

730 Following

Shangbin Feng (@shangbinfeng) 's Twitter Profile Photo

🚨 Detecting social media bots has always been an arms race: we design better detectors with advanced ML tools, while more evasive bots emerge adversarially.

What do LLMs bring to the arms race between bot detectors and operators?

A thread 🧵 #ACL2024
Sachin Kumar (@shocheen) 's Twitter Profile Photo

Check out our paper on model noncompliance. We outline a taxonomy of requests that LLMs should not comply with beyond only unsafe queries. Based on this taxonomy we create CoCoNot, a resource for training and evaluating models' noncompliance. More details in this thread 👇🏾

Leo Liu (@zeyuliu10) 's Twitter Profile Photo

Knowledge updates for code LLMs: Code LLMs generate calls to libraries like numpy and pandas. What if these libraries change? Can we update LLMs with modified function definitions?

Test this with our new benchmark, CodeUpdateArena. Findings: updating LLMs' API knowledge is hard!
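
To make the setup concrete, here is a small hypothetical sketch, in the spirit of the benchmark rather than its actual format: a library function's definition changes, and the check is whether a model that received the update writes code against the new definition (the function, the update, and the test are invented here for illustration).

```python
# Hypothetical illustration of an API-update example (names and the update
# itself are invented; they are not instances from CodeUpdateArena).

# Old definition the model presumably saw during pretraining:
#     def clip(a, a_min, a_max): ...
# Updated definition supplied to the model as new knowledge:
def clip(a, low=None, high=None):
    """Clamp each element of `a` into the closed interval [low, high]."""
    return [min(max(x, low), high) for x in a]

def uses_updated_api(generated_code: str) -> bool:
    """Run a model completion and check that it produced the expected result
    by calling the *new* keyword arguments rather than the old ones."""
    env = {"clip": clip}
    exec(generated_code, env)
    return env.get("result") == [0, 1, 2, 2]

# A completion from a successfully updated model might look like:
completion = "result = clip([-5, 1, 2, 9], low=0, high=2)"
print(uses_updated_api(completion))  # True if the update took hold
```
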
Weijia Shi (@weijiashi2) 's Twitter Profile Photo

Can ๐ฆ๐š๐œ๐ก๐ข๐ง๐ž ๐ฎ๐ง๐ฅ๐ž๐š๐ซ๐ง๐ข๐ง๐  make language models forget their training data? We shows Yes but at the cost of privacy and utility. Current unlearning scales poorly with the size of the data to be forgotten and canโ€™t handle sequential unlearning requests. ๐Ÿ”—:

Can ๐ฆ๐š๐œ๐ก๐ข๐ง๐ž ๐ฎ๐ง๐ฅ๐ž๐š๐ซ๐ง๐ข๐ง๐  make language models forget their training data?

We shows Yes but at the cost of privacy and utility. Current unlearning scales poorly with the size of the data to be forgotten and canโ€™t handle sequential unlearning requests.

๐Ÿ”—:
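
For intuition on the utility cost, one common unlearning baseline is gradient ascent on the forget set. A minimal PyTorch-style sketch is below; the `.loss` interface assumes a Hugging Face-style causal LM, and this is an illustrative baseline, not the specific methods evaluated in the paper.

```python
def unlearn_step(model, optimizer, forget_batch, retain_batch, alpha=1.0):
    """One gradient-ascent unlearning update: push the loss up on the data to
    be forgotten while keeping it low on retained data."""
    forget_loss = model(**forget_batch).loss   # assumes HF-style outputs with .loss
    retain_loss = model(**retain_batch).loss
    # The negative sign maximizes loss on the forget set; as the forget set
    # grows, or requests arrive sequentially, this increasingly erodes
    # general utility, which is the trade-off described above.
    loss = -alpha * forget_loss + retain_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return forget_loss.item(), retain_loss.item()
```
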
Sachin Kumar (@shocheen) 's Twitter Profile Photo

You think your model just fell out of a coconot tree 🥥? It should not always comply in the context of all it has seen in the request. Check out our paper on contextual noncompliance.

Shangbin Feng (@shangbinfeng) 's Twitter Profile Photo

What can we do when certain values, cultures, and communities are underrepresented in LLM alignment?

Introducing Modular Pluralism, where a general LLM interacts with a pool of specialized community LMs in various modes to advance pluralistic alignment.

A thread 🧵
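
A rough sketch of the interaction pattern described above; the `generate` interface, prompt templates, and aggregation strategy are assumptions for illustration, not the paper's exact modes.

```python
# Hypothetical sketch: a general LLM consults a pool of community-specific LMs
# and folds their perspectives into its final answer.

def modular_pluralism(prompt, base_lm, community_lms):
    # Each community LM contributes a short perspective on the request.
    perspectives = [
        lm.generate(f"From your community's perspective, respond to: {prompt}")
        for lm in community_lms
    ]
    # The general LLM synthesizes a response that reflects the collected views.
    context = "\n".join(f"- {p}" for p in perspectives)
    return base_lm.generate(
        f"{prompt}\n\nConsider these community perspectives:\n{context}"
    )
```
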
Alisa Liu (@alisawuffles) 's Twitter Profile Photo

What do BPE tokenizers reveal about their training data? 🧐

We develop an attack 🗡️ that uncovers the training data mixtures 📊 of commercial LLM tokenizers (incl. GPT-4o), using their ordered merge lists!

Co-1️⃣st Jonathan Hayase
arxiv.org/abs/2407.16607 🧵⬇️
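
The underlying intuition: BPE learns merges greedily by pair frequency, so a tokenizer's ordered merge list constrains which data mixture could have produced it. Below is a heavily simplified illustration of that idea, scoring candidate mixture weights by how consistent they are with the observed merge order; it is an assumption-laden sketch, not the paper's actual inference procedure.

```python
from collections import Counter

def pair_counts(corpus_tokens):
    """Count adjacent symbol pairs in a pre-tokenized corpus (list of token lists)."""
    counts = Counter()
    for toks in corpus_tokens:
        counts.update(zip(toks, toks[1:]))
    return counts

def mixture_score(merge_list, corpora_counts, weights):
    """Score a candidate mixture: higher when earlier merges in the observed
    list correspond to higher pair frequencies under the weighted mixture."""
    mixed = Counter()
    for w, counts in zip(weights, corpora_counts):
        for pair, c in counts.items():
            mixed[pair] += w * c
    freqs = [mixed.get(pair, 0) for pair in merge_list]
    # Count how often the greedy ordering is respected by this mixture.
    return sum(1 for a, b in zip(freqs, freqs[1:]) if a >= b)
```
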
Shangbin Feng (@shangbinfeng) 's Twitter Profile Photo

Instruction tuning with synthetic graph data leads to graph LLMs, but:

Are LLMs learning generalizable graph reasoning skills or merely memorizing patterns in the synthetic training data? 🤔 (for example, patterns in how a graph is described in natural language)

A thread 🧵
AK (@_akhaliq) 's Twitter Profile Photo

JPEG-LM

LLMs as Image Generators with Canonical Codec Representations

discuss: huggingface.co/papers/2408.08…

Recent work in image and video generation has been adopting the autoregressive LLM architecture due to its generality and potentially easy integration into multi-modal
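
The core idea, per the title above, is to treat an image's canonical JPEG byte stream as an ordinary token sequence for an autoregressive LM. A minimal sketch, assuming plain byte-level tokenization (the paper's exact tokenization scheme may differ):

```python
from pathlib import Path

def jpeg_to_tokens(path: str) -> list[int]:
    """Read an image's raw JPEG bytes and map each byte to a token id (0-255)."""
    return list(Path(path).read_bytes())

def tokens_to_jpeg(tokens: list[int], path: str) -> None:
    """Write a generated token sequence back out as a viewable JPEG file."""
    Path(path).write_bytes(bytes(tokens))

# Training then reduces to ordinary next-token prediction over these byte
# sequences, and samples from the LM can be saved directly and opened as JPEGs.
```
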
tsvetshop (@tsvetshop) 's Twitter Profile Photo

Huge congrats to Oreva Ahia and Shangbin Feng for winning awards at #ACL2024!

DialectBench: Best Social Impact Paper Award (arxiv.org/abs/2403.11009)

Don't Hallucinate, Abstain: Area Chair Award (QA track) & Outstanding Paper Award (arxiv.org/abs/2402.00367)

Chunting Zhou (@violet_zct) 's Twitter Profile Photo

Introducing *Transfusion* - a unified approach for training models that can generate both text and images. arxiv.org/pdf/2408.11039

Transfusion combines language modeling (next token prediction) with diffusion to train a single transformer over mixed-modality sequences. This
Lili Yu (@liliyu_lili) 's Twitter Profile Photo

🚀 Excited to share our latest work: Transfusion! A new multi-modal generative training approach combining language modeling and image diffusion in a single transformer! Huge shoutout to Chunting Zhou, Omer Levy, Michi Yasunaga, Arun Babu, Kushal Tirumala, and other collaborators.
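
As the two Transfusion tweets above describe, a single transformer is trained with a language-modeling loss on text and a diffusion loss on images over mixed-modality sequences. A minimal sketch of such a combined objective; the model interface, loss weighting, and epsilon-prediction formulation are assumptions for illustration, not the paper's exact implementation.

```python
import torch.nn.functional as F

def transfusion_style_loss(model, text_tokens, noisy_latents, noise, timesteps,
                           lambda_img=1.0):
    """Combined objective over a mixed-modality batch: next-token prediction
    on text plus a denoising (epsilon-prediction) loss on image latents."""
    # Assumed interface: one forward pass returns text logits and the
    # predicted noise for the image portion of the sequence.
    text_logits, predicted_noise = model(text_tokens, noisy_latents, timesteps)
    # Standard causal LM loss: predict token t+1 from positions <= t.
    lm_loss = F.cross_entropy(
        text_logits[:, :-1].transpose(1, 2), text_tokens[:, 1:]
    )
    # Standard diffusion loss: regress the added noise.
    diffusion_loss = F.mse_loss(predicted_noise, noise)
    return lm_loss + lambda_img * diffusion_loss
```
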

Pang Wei Koh (@pangweikoh) 's Twitter Profile Photo

Check out JPEG-LM, a fun idea led by Xiaochuang Han -- we generate images simply by training an LM on raw JPEG bytes and show that it outperforms much more complicated VQ models, especially on rare inputs.

Marjan Ghazvininejad (@gh_marjan) 's Twitter Profile Photo

Can we train an LM on raw JPEG bytes and generate images with that? Yes we can. Check out JPEG-LM (arxiv.org/abs/2408.08459), a cool work led by @XiaochuangHan, to learn more.