Yichen (Zach) Wang (@yichenzw) 's Twitter Profile
Yichen (Zach) Wang

@yichenzw

Incoming NLP Ph.D. @UChicagoCI | Interning @UWNLP @Tsvetshop (@BerkeleyNLP before) | Honors CS B.S. @XJTU1896 '24

ID: 1626708787750109184

http://yichenzw.com · Joined 17-02-2023 22:23:06

75 Tweets

258 Followers

307 Following

Jack Jingyu Zhang @ NAACL🌵 (@jackjingyuzhang) 's Twitter Profile Photo

🤖 LLMs are powerful, but their "one-size-fits-all" safety alignment limits flexibility. Safety standards vary across cultures and users—what’s safe in one context might not be in another. 🌍 We propose ✨Controllable Safety Alignment✨ for inference-time safety adaptation! 🧵👇

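In spirit, the adaptation happens through a natural-language safety config supplied at inference time; a minimal sketch of that interface (hypothetical message layout, not the paper's exact prompt format):

def build_messages(safety_config, user_prompt):
    # The safety config is free-form text describing what is allowed
    # in this deployment context; the model is expected to follow it.
    return [
        {"role": "system", "content": "Safety configuration:\n" + safety_config},
        {"role": "user", "content": user_prompt},
    ]
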
Liam Dugan (@liamdugan_) 's Twitter Profile Photo

The deadline to submit a detector to the RAID shared task at COLING 2025 has been extended to November 2nd! There's plenty of time left to join the task + submission is quick and easy. Check out the Github for more info! github.com/liamdugan/COLI…

Mina Lee (@minalee__) 's Twitter Profile Photo

Interested in writing with AI? ✍️ Please apply to be a **postdoc** in my group through the UChicago DSI Scholars program! 🤠 Research in my group: minalee-research.github.io/research.html Application: datascience.uchicago.edu/research/postd… (review begins on Dec 6)

Chenghao Yang (@chrome1996) 's Twitter Profile Photo

Happy Thanksgiving! Inspired by many great bloggers (Sasha Rush, Yao Fu), I made a tutorial about the "inference-time compute" tech showcased by O1. I incorporate insights from Sasha's great talk and ongoing O1 replications. Video: youtu.be/_Bw5o55SRL8. Feedback welcome!
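
One canonical inference-time compute technique such tutorials cover is best-of-n sampling; a generic sketch, where generate and score are hypothetical callables (score might be a reward model):

def best_of_n(generate, score, prompt, n=8):
    # Spend extra inference-time compute: draw n candidate generations
    # and keep the one the scorer rates highest.
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)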

Rock Pang (@rockpang6) 's Twitter Profile Photo

🤔Interested in how #HCI thinks about using #LLMs, or looking to understand best practices for human-LLM interaction? 🚨🚨New paper: Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review

Yuntian Deng (@yuntiandeng) 's Twitter Profile Photo

For those curious about how o3-mini performs on multi-digit multiplication, here's the result. It does much better than o1 but still struggles past 13×13. (Same evaluation setup as before, but with 40 test examples per cell.)

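A minimal sketch of that evaluation grid, assuming the setup is random i-digit × j-digit products with 40 examples per cell (model_multiply is a hypothetical stand-in for querying the model):

import random

def eval_cell(model_multiply, i_digits, j_digits, n_examples=40):
    # Accuracy on random i-digit × j-digit multiplication problems.
    correct = 0
    for _ in range(n_examples):
        a = random.randint(10 ** (i_digits - 1), 10 ** i_digits - 1)
        b = random.randint(10 ** (j_digits - 1), 10 ** j_digits - 1)
        if model_multiply(a, b) == a * b:
            correct += 1
    return correct / n_examples

def accuracy_grid(model_multiply, max_digits=20):
    # One heatmap cell per (i, j) digit-count pair.
    return [[eval_cell(model_multiply, i, j) for j in range(1, max_digits + 1)]
            for i in range(1, max_digits + 1)]
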
Zhiyuan Zeng (@zhiyuanzeng_) 's Twitter Profile Photo

Is a single accuracy number all we can get from model evals?🤔 🚨Does NOT tell where the model fails 🚨Does NOT tell how to improve it Introducing EvalTree🌳 🔍identifying LM weaknesses in natural language 🚀weaknesses serve as actionable guidance (paper&demo 🔗in🧵) [1/n]
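
The motivation is easy to see with a toy per-category breakdown (this is not EvalTree itself, just the failure mode it addresses; results is a hypothetical list of (category, is_correct) pairs):

from collections import defaultdict

def per_category_accuracy(results):
    # One aggregate number can hide a category where the model always fails.
    buckets = defaultdict(list)
    for category, is_correct in results:
        buckets[category].append(is_correct)
    return {c: sum(v) / len(v) for c, v in buckets.items()}

results = [("algebra", True), ("algebra", True),
           ("geometry", False), ("geometry", False)]
print(per_category_accuracy(results))  # 50% overall, but geometry is 0%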

Abe Hou (@abe_hou) 's Twitter Profile Photo

👁️Recent works use LLMs for social simulations—but can these agents help shape effective policies? 💥Our new paper tackles a bold question many have wondered about: Can generative agent societies simulate to inform public health policy? 🔗: arxiv.org/abs/2503.09639

Xuandong Zhao (@xuandongzhao) 's Twitter Profile Photo

🚨 New Paper Alert 🚨 Are you really getting the model you paid for? In our latest work, we uncover a critical trust gap in LLM APIs—and propose methods to audit for covert model substitution. 🕵️‍♂️

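One simple flavor of such an audit (a toy sketch, not the paper's method): snapshot greedy outputs on fixed probe prompts, then re-check the API against that fingerprint. A real audit would need statistical tests, since even temperature-0 decoding can vary across hardware and serving stacks.

import hashlib

def fingerprint(responses):
    # Hash the concatenated probe outputs into a stable reference.
    return hashlib.sha256("\n".join(responses).encode()).hexdigest()

def audit(query_api, probe_prompts, reference_fp):
    # query_api is a hypothetical prompt -> greedy-completion callable.
    responses = [query_api(p) for p in probe_prompts]
    return fingerprint(responses) == reference_fp
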
Dang Nguyen (@divingwithorcas) 's Twitter Profile Photo

1/n You may know that large language models (LLMs) can be biased in their decision-making, but ever wondered how those biases are encoded internally and whether we can surgically remove them?
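
A classic recipe for this kind of surgical intervention (a generic sketch, not necessarily the paper's method) is to project an estimated bias direction out of the hidden states:

import numpy as np

def remove_direction(H, v):
    # Remove the component of each hidden state along unit direction v.
    v = v / np.linalg.norm(v)
    return H - np.outer(H @ v, v)

H = np.random.randn(8, 16)   # 8 hidden states of dimension 16
v = np.random.randn(16)      # estimated bias direction
H_clean = remove_direction(H, v)
print(np.allclose(H_clean @ (v / np.linalg.norm(v)), 0))  # True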

Jack Jingyu Zhang @ NAACL🌵 (@jackjingyuzhang) 's Twitter Profile Photo

Our Controllable Safety Alignment paper will be presented at #ICLR2025 this week in Singapore 🇸🇬! We've released our code and the human-authored CoSApien👥 dataset: 👉 aka.ms/controllable-s… Watch the short video summary here: 🎬 youtube.com/watch?v=kDioFn…

Xiao Pu (@xiaosophiapu) 's Twitter Profile Photo

🧠 Reasoning models often overthink. 🚀 In our new paper, we show: 1️⃣ Two overthinking scores. 2️⃣ DUMB500 — a benchmark of extremely easy questions. 3️⃣ THOUGHT TERMINATOR — a decoding method that reduces token waste by up to 90%, often improving accuracy. Details below 👇

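A toy illustration of the idea of cutting off an overthinking trace (not THOUGHT TERMINATOR itself, whose stopping criterion the paper defines; generate_step is a hypothetical next-token callable):

def decode_with_budget(generate_step, prompt, budget=256,
                       terminator="\nFinal answer:"):
    # generate_step returns the next token as a string, or None at
    # end-of-sequence.
    out = ""
    for _ in range(budget):
        tok = generate_step(prompt + out)
        if tok is None:           # model finished within budget
            return out
        out += tok
    return out + terminator       # budget spent: force an answer now
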
Kevin Yang (@kevinyang41) 's Twitter Profile Photo

Will be at NAACL next week, excited to share two of our papers: FACTTRACK: Time-Aware World State Tracking in Story Outlines arxiv.org/abs/2407.16347 THOUGHTSCULPT: Reasoning with Intermediate Revision and Search arxiv.org/abs/2404.05966 Shoutout to first authors Zhiheng LYU and

William Merrill (@lambdaviking) 's Twitter Profile Photo

Excited to announce I'll be starting as an assistant professor at TTIC for fall 2026! In the meantime, I'll be graduating and hanging around Ai2 in Seattle🏔️

Niloofar (on faculty job market!) (@niloofar_mire) 's Twitter Profile Photo

📣Thrilled to announce I’ll join Carnegie Mellon University (CMU Engineering & Public Policy & Language Technologies Institute | @CarnegieMellon) as an Assistant Professor starting Fall 2026! Until then, I’ll be a Research Scientist at AI at Meta FAIR in SF, working with Kamalika Chaudhuri’s amazing team on privacy, security, and reasoning in LLMs!

Harvey Yiyun Fu (@harveyiyun) 's Twitter Profile Photo

LLMs excel at finding surprising “needles” in very long documents, but can they detect when information is conspicuously missing? 🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative space” in documents. paper:

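The task construction is easy to sketch (a toy version; the benchmark's actual documents and scoring will differ):

import random

def make_absence_example(doc_lines, n_remove=3, seed=0):
    # Delete a few lines; the model sees the original and the edited
    # copy and must name what is missing.
    rng = random.Random(seed)
    removed = set(rng.sample(range(len(doc_lines)), n_remove))
    kept = [l for i, l in enumerate(doc_lines) if i not in removed]
    gold = [doc_lines[i] for i in sorted(removed)]
    return kept, gold

def recall(predicted_missing, gold):
    return len(set(predicted_missing) & set(gold)) / len(gold)
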
Ari Holtzman (@universeinanegg) 's Twitter Profile Photo

New benchmark! LLMs can retrieve bits of information from ridiculously long contexts (needle-in-a-haystack) but they can't tell what's missing from relatively short documents (AbsenceBench). We can't trust LLMs to annotate or judge documents if they can't see negative space!

Chenghao Yang (@chrome1996) 's Twitter Profile Photo

Have you noticed… 🔍 Aligned LLM generations feel less diverse? 🎯 Base models are decoding-sensitive? 🤔 Generations get more predictable as they progress? 🌲 Tree search fails mid-generation (esp. for reasoning)? We trace these mysteries to LLM probability concentration, and
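
One direct way to watch this concentration (a minimal probe with Hugging Face transformers; gpt2 is just an illustrative choice): track next-token entropy per step and see whether it shrinks as the generation progresses.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # illustrative small model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

inputs = tok("Once upon a time", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30, do_sample=True,
                     output_scores=True, return_dict_in_generate=True)
for step, logits in enumerate(out.scores):
    p = torch.softmax(logits[0], dim=-1)
    entropy = -(p * torch.log(p + 1e-12)).sum().item()
    print(step, round(entropy, 2))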

Aryan Shrivastava (@aryan_shri123) 's Twitter Profile Photo

🤫Jailbreak prompts make aligned LMs produce harmful responses.🤔But is that info linearly decodable? ↗️We show many refused concepts are linearly represented, sometimes persist through instruction-tuning, and may also shape downstream behavior❗ arxiv.org/abs/2507.00239 🧵1/

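"Linearly decodable" here means a linear probe on hidden states recovers the concept; a generic probing sketch (synthetic features stand in for real model activations):

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))   # stand-in for hidden states
y = (X[:, 0] > 0).astype(int)    # stand-in concept label

probe = LogisticRegression(max_iter=1000).fit(X[:150], y[:150])
print("probe accuracy:", probe.score(X[150:], y[150:]))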