Caleb Ziems (@cjziems)'s Twitter Profile
Caleb Ziems

@cjziems

bsky.app/profile/calebz…

PhD student at @StanfordNLP 🌲 Working on socially-aware + dialect-robust #NLP, #CSS

ID: 439511556

Website: http://calebziems.com

Joined: 17-12-2011 21:44:20

131 Tweets

927 Followers

930 Following

Jeff Dean (@jeffdean)

My longtime collaborator Dave Patterson (long-time faculty at UC Berkeley, Association for Computing Machinery Turing Award winner, and fellow Laude Institute board member) wrote a very good op-ed about how continued investing in basic science and technology research is essential for the U.S. Dave

Yanzhe Zhang (@stevenyzzhang)

Soon, AI agents will act for us—collaborating, negotiating, and sharing data. But can they truly protect our privacy? We simulate privacy-critical scenarios, using alternating search to evolve attacks and defenses, uncovering severe vulnerabilities and building protections.

Yanzhe Zhang (@stevenyzzhang)

Introducing Generative Interfaces - a new paradigm beyond chatbots. We generate interfaces on the fly to better facilitate LLM interaction, so no more passive reading of long text blocks. Adaptive and Interactive: creates the form that best adapts to your goals and needs!

Joachim Baumann (@joabaum)

🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**. Paper: arxiv.org/pdf/2509.08825

Dora Zhao (@dorazhao9)

LLMs are powerful, but they don't know your world. This knowledge gap can lead to generic, unhelpful, or incorrect responses. In our #UIST2025 paper, we explore how users can fill these gaps through creating a community knowledge ecosystem, giving models access to more specific

Jenna Russell (@jennajrussell)

AI is already at work in American newsrooms. We examine 186k articles published this summer and find that ~9% are either fully or partially AI-generated, usually without readers having any idea. Here's what we learned about how AI is influencing local and national journalism:

Zora Wang (@zhiruow)

Agents are joining us at work -- coding, writing, design. But how do they actually work, especially compared to humans? Their workflows tell a different story: They code everything, slow down human flows, and deliver low-quality work fast. Yet when teamed with humans, they shine

Tiancheng Hu (@tiancheng_hu)

Can AI simulate human behavior? 🧠 The promise is revolutionary for science & policy. But there’s a huge "IF": Do these simulations actually reflect reality? To find out, we introduce SimBench: The first large-scale benchmark for group-level social simulation. (1/9)

elie (@eliebakouch)

Training LLMs end to end is hard. Very excited to share our new blog (book?) that covers the full pipeline: pre-training, post-training and infra. 200+ pages of what worked, what didn’t, and how to make it run reliably: huggingface.co/spaces/Hugging…

Houjun Liu (@houjun_liu)

Good morning Suzhou! Amelia Hardy and I will be at EMNLP 2025 to present our work *TODAY, Hall C, 12:30PM; paper number 426* Come learn: ✅ why likelihood is important to simultaneously optimize with attack success ✅ online preference learning tricks for LM falsification

Akaash Kolluri (@kolluriakaash)

New EMNLP main paper: “Finetuning LLMs for Human Behavior Prediction in Social Science Experiments” We built SocSci210—2.9M human responses from 210 social science experiments. Finetuning Qwen2.5-14B on SocSci210 beats its base model by 26% & GPT-4o by 13% on unseen studies.🧵
