Omar Shaikh (@oshaikh13)'s Twitter Profile
Omar Shaikh

@oshaikh13

CS Ph.D. student @Stanford - previously @GeorgiaTech - also @[email protected]

ID: 995446550

Link: http://oshaikh.com

Joined: 07-12-2012 16:52:10

524 Tweets

840 Followers

945 Following

Quan Ze Chen (@cquanze)

Online groups and communities often need to make decisions around social concepts like what content is appropriate. But how do we ensure these decisions are aligned across human decision-makers or even AI systems? We explore this in our work (CI '25): 📜 Case Law Grounding ⚖️

dilara (@dilarafsoylu)

Should you RL your compound AI system or optimize its prompts? We think both! 🤯 A short preview of work co-led with Noah Ziems and Lakshya A Agrawal! 👇

Jiaju Ma (@jama1017)

We introduce MoVer, a Motion Verification DSL that automatically checks if AI-generated motion graphics animations match your text prompts! We make it easy for designers to specify and verify complex animations with LLM-powered iterative refinement. Catch our #SIGGRAPH2025 talk:

Yujie Tao (@tao_yujie)

Self-presentation is multifaceted, but the expression is often limited to physical accessories. How could Audio AR transform social interaction? We introduce Audio Personas, body-anchored sounds to dynamically shape social impression. Upcoming in TOCHI: arxiv.org/pdf/2505.00956

Jessy Li (@jessyjli)

The Echoes in AI paper showed quite the opposite, also with a story continuation setup. Additionally, we present evidence that both *syntactic* and *discourse* diversity measures show strong homogenization that the lexical and cosine measures used in this paper do not capture.

Yanzhe Zhang (@stevenyzzhang)

Soon, AI agents will act for us: collaborating, negotiating, and sharing data. But can they truly protect our privacy? We simulate privacy-critical scenarios, using alternating search to evolve attacks and defenses, uncovering severe vulnerabilities and building protections.

Tim Althoff (@timalthoff)

I'm excited to share our new Nature paper, which provides strong evidence that the walkability of our built environment matters a great deal to our physical activity and health. Details in thread. 🧵 nature.com/articles/s4158…
Houjun Liu (@houjun_liu)

New Paper Day! For EMNLP Findings: in LM red-teaming, we show you have to optimize for **both** perplexity and toxicity for high-probability, hard to filter, and natural attacks!

Kawin Ethayarajh (@ethayarajh)

📢 Belated update, but I'm thrilled to share that I've joined The University of Chicago (Chicago Booth) as an Assistant Professor in the newly created Applied AI group! I'll continue to work on behavior-bound machine learning: understanding how AI shapes, is shaped, and should be shaped by the
Ken Liu (@kenziyuliu)

New paper! We explore a radical paradigm for AI evals: assessing LLMs on *unsolved* questions. Instead of contrived exams where progress ≠ value, we eval LLMs on organic, unsolved problems via reference-free LLM validation & community verification. LLMs solved ~10/500 so far:

Yanzhe Zhang (@stevenyzzhang)

Introducing Generative Interfaces - a new paradigm beyond chatbots. We generate interfaces on the fly to better facilitate LLM interaction, so no more passive reading of long text blocks. Adaptive and Interactive: creates the form that best adapts to your goals and needs!

Jessy Lin (@realjessylin)

How do we teach an LLM to *master* a body of knowledge?

In new work with AI at Meta, we propose Active Reading 📙: a way for models to teach themselves new things by self-studying their training data. Results:

* 66% on SimpleQA w/ an 8B model by studying the wikipedia
Jeremy Howard (@jeremyphoward)

Maybe this is part of why I find GPT-5 on ChatGPT so annoying -- apparently its system prompt is explicitly set to *not* ask clarifying questions?!? I find it really annoying the way it just goes off and tries to solve the world in one shot. I really want to iterate!

Munyeong Kim (@kim_munyeong)

I led a session at @mila_quebec HCAI reading group on Omar Shaikh et al.'s General User Model paper! So excited to find and present this paper to our group. We discussed both the paper's novelty and people's concerns, as well as technical approaches to address them.
