Ahmed Ahmed (@ahmedsqrd) 's Twitter Profile
Ahmed Ahmed

@ahmedsqrd

CS PhD @Stanford - Funding @KnightHennessy @NSF - 🇸🇩 - tweets include history & politics

ID: 1225896347242336256

Joined: 07-02-2020 21:39:07

499 Tweets

608 Followers

956 Following

Nathan Lambert (@natolambert) 's Twitter Profile Photo

Agents research with LMs right now feels like Deep RL in the late 2010s. Tons of new algorithms on narrow domains, so I expect most of these results to not transfer at all. The thing is, it's still early, so it's going to get worse still before it gets better.

Anikait Singh (@anikait_singh_) 's Twitter Profile Photo

Personalization in LLMs is crucial for meeting diverse user needs, yet collecting real-world preferences at scale remains a significant challenge. Introducing FSPO, a simple framework leveraging synthetic preference data to adapt new users with meta-learning for open-ended QA! 🧵

Chenchen Gu (@chenchenygu) 's Twitter Profile Photo

Prompt caching lowers inference costs but can leak private information from timing differences. Our audits found 7 API providers with potential leakage of user data. Caching can even leak architecture info—OpenAI's embedding model is likely a decoder-only Transformer! 🧵1/9

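The audit idea above boils down to comparing response latencies for a prompt prefix that may already be cached against one that cannot be. Below is a minimal Python sketch of that comparison, not the authors' actual procedure: `send_request` is a hypothetical stand-in for whichever API client is being audited, and `n_trials`/`gap_threshold` are arbitrary illustrative values rather than numbers from the paper.

```python
import secrets
import statistics
import time

def timed_call(send_request, prompt):
    """Wall-clock latency of a single request (ideally time to first token,
    e.g. by requesting max_tokens=1). `send_request` is a hypothetical client."""
    start = time.perf_counter()
    send_request(prompt)
    return time.perf_counter() - start

def audit_prompt_caching(send_request, target_prefix, suffix,
                         n_trials=25, gap_threshold=0.05):
    """Rough check for prefix caching: a prefix the provider has already
    processed should come back consistently faster than a never-seen control
    prefix of similar length. If the cache is shared across users, that same
    gap reveals whether someone else recently sent the prefix -- the timing
    side channel described in the tweet."""
    cached_times, fresh_times = [], []
    for _ in range(n_trials):
        # Control: a random prefix of comparable length, guaranteed unseen.
        control_prefix = secrets.token_hex(max(1, len(target_prefix) // 2))
        fresh_times.append(timed_call(send_request, control_prefix + suffix))
        cached_times.append(timed_call(send_request, target_prefix + suffix))
    gap = statistics.median(fresh_times) - statistics.median(cached_times)
    return gap > gap_threshold, gap
```
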
Kiran Garimella (@gvrkiran) 's Twitter Profile Photo

As AI systems increasingly simulate human behavior, we must ask: How do we ensure they don’t amplify bias, deceive, or manipulate? This paper lays out a much-needed framework for responsible AI design. It's REALLY good. arxiv.org/abs/2503.02250

Trajan Hammonds (@trajan317) 's Twitter Profile Photo

people love and dream about movies like The Martian and Interstellar but then celebrate defunding basic science research to save a few dollars in taxes per year

Krishnamurthy (Dj) Dvijotham (@djdvij) 's Twitter Profile Photo

(1/n) Fine tuning APIs create significant security vulnerabilities, breaking alignment in frontier models for under $100! Introducing NOICE, a fine-tuning attack that requires just 1000 training examples to remove model safeguards. The strangest part: we use ONLY harmless data.

Nathan Lambert (@natolambert) 's Twitter Profile Photo

Google, Anthropic, xAI etc should have a Model Spec. Would help them with all of these if done right:
Developers: Know what future models will become
Internal: Focus to define and to deliver your goals
Regulators: Transparency into wtf frontier labs care about

Ken Liu (@kenziyuliu) 's Twitter Profile Photo

An LLM generates an article verbatim—did it “train on” the article? It’s complicated: under n-gram definitions of train-set inclusion, LLMs can complete “unseen” texts—both after data deletion and adding “gibberish” data. Our results impact unlearning, MIAs & data transparency🧵

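To make the tweet's "n-gram definition of train-set inclusion" concrete, here is a minimal Python sketch of one such definition: the fraction of a text's n-grams that appear anywhere in the corpus. The function names and the `n`/`threshold` values are illustrative assumptions, not the paper's; the thread's point is that a model can still complete texts that fail this kind of test.

```python
def ngrams(tokens, n):
    """Set of all contiguous n-grams in a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def ngram_inclusion(corpus_tokens, text_tokens, n=8, threshold=1.0):
    """One way to operationalize 'the training set contains this text':
    the fraction of the text's n-grams found anywhere in the corpus.
    threshold=1.0 demands every n-gram appear; lower values give a looser
    notion of inclusion. `n` and `threshold` are illustrative, not the
    paper's settings."""
    corpus_set = ngrams(corpus_tokens, n)
    text_set = ngrams(text_tokens, n)
    if not text_set:
        return False, 0.0
    coverage = sum(g in corpus_set for g in text_set) / len(text_set)
    return coverage >= threshold, coverage
```
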
Ahmed Ahmed (@ahmedsqrd) 's Twitter Profile Photo

Incredibly relevant work— suggesting the real bottleneck for effective misinformation isn’t technical detection (watermarking) but purely capabilities (how persuasive and fluent models become)… a chilling shift for AI safety efforts

Ahmed Ahmed (@ahmedsqrd) 's Twitter Profile Photo

Thoughtful retrospective on where RL folks went wrong (the focus on algorithms over priors should have been flipped) and where to go next. Domains w/ dense rewards seem ~solved (verified reasoning, RLHF), so more open-ended evals are next (chatbot arena)

Dylan HadfieldMenell (@dhadfieldmenell) 's Twitter Profile Photo

Let's open-source GPT-4. If the non-profit genuinely controlled OpenAI, it's hard to see why they wouldn't release the model. Tons of science that would be unlocked with this. Tons of papers that become reproducible.