Max Lamparth (@mlamparth)'s Twitter Profile
Max Lamparth

@mlamparth

Postdoc at @Stanford, @StanfordCISAC, Stanford Center for AI Safety, SERI. | Focusing on interpretable, safe, and ethical AI/LLM decision-making. Find me on 🦋

ID: 1588663024969125888

Website: http://www.maxlamparth.com | Joined: 04-11-2022 22:43:21

536 Tweets

684 Followers

679 Following

Kayo Yin (@kayo_yin)'s Twitter Profile Photo

Induction heads are commonly associated with in-context learning, but are they the primary driver of ICL at scale? We find that recently discovered "function vector" heads, which encode the ICL task, are the actual primary drivers of few-shot ICL. arxiv.org/abs/2502.14010 🧵

CISAC & on Bluesky @stanfordcisac.bsky.social (@stanfordcisac)'s Twitter Profile Photo

The Helpful, Honest, and Harmless (HHH) principle is key for AI alignment, but current interpretations miss contextual nuances. CISAC postdoc Max Lamparth & colleagues propose an adaptive framework to prioritize values, balance trade-offs, & enhance AI ethics. arxiv.org/abs/2502.06059

Max Lamparth (@mlamparth)'s Twitter Profile Photo

Thank you for featuring our work! Great collaboration with Declan Grabb, MD and the team. We created a dataset that goes beyond medical exam-style questions and studies the impact of patient demographics on clinical decision-making in psychiatric care across fifteen language models.

Cassidy Laidlaw (@cassidy_laidlaw)'s Twitter Profile Photo

We built an AI assistant that plays Minecraft with you. Start building a house—it figures out what you’re doing and jumps in to help. This assistant *wasn't* trained with RLHF. Instead, it's powered by *assistance games*, a better path forward for building AI assistants. 🧵

CISAC & on Bluesky @stanfordcisac.bsky.social (@stanfordcisac)'s Twitter Profile Photo

In their latest blog post for Stanford AI Lab, CISAC Postdoc @mlamparth and colleague Declan Grabb dive into MENTAT, a clinician-annotated dataset tackling real-world ambiguities in psychiatric decision-making. ai.stanford.edu/blog/mentat/

Rylan Schaeffer (@rylanschaeffer)'s Twitter Profile Photo

I'm going to catch hell for posting, but to summarize:
1. This paper misled its way to an #ICLR2025 Oral
2. I pointed this out
3. The AC rejected the paper
4. The authors complained & somehow persuaded ICLR to overrule the AC and award a Spotlight
5. The AC made clear they were overruled

Andreas Kirsch 🇺🇦 (@blackhc)'s Twitter Profile Photo

Elvis Dohmatob Let's reframe your narrative: what I get is that you were very well aware of the paper, incl. the final updated version that got submitted and then accepted at COLM, and you refused to cite it because you were upset about an earlier draft of that paper that was sent to you for…

CISAC & on Bluesky @stanfordcisac.bsky.social (@stanfordcisac)'s Twitter Profile Photo

The 2025 SERI Symposium explored risks that emerge from the intersection of complex global challenges & policies designed to mitigate them, bringing together leading experts & researchers from across the Bay Area who specialize in a range of global risks⤵️ youtube.com/watch?v=wF20vy…

Cas (Stephen Casper) (@stephenlcasper)'s Twitter Profile Photo

🚨New paper: Current reports on AI audits/evals often omit crucial details, and there are huge disparities between the thoroughness of different reports. Even technically rigorous evals can offer little useful insight if reported selectively or obscurely. Audit cards can help.

Ruiqi Zhong (@zhongruiqi)'s Twitter Profile Photo

Last day of PhD! I pioneered using LLMs to explain datasets & models. It's used by interpretability at OpenAI and societal impacts at Anthropic. Tutorial here. It's a great direction & someone should carry the torch :) Thesis available, if you wanna read my acknowledgement section =P
