Lisa Dunlap (@lisabdunlap)'s Twitter Profile
Lisa Dunlap

@lisabdunlap

messin around with model evals @berkeley_ai and @lmarena_ai

ID: 1453078457978552320

Link: http://lisabdunlap.com · Joined: 26-10-2021 19:18:12

355 Tweets

1.1K Followers

252 Following

Lisa Dunlap (@lisabdunlap)'s Twitter Profile Photo

Can someone do a study on the rates of blue boxes with blue text in websites pre and post Claude? I swear I see this style everywhere now

YichuanWang (@yichuanm)'s Twitter Profile Photo

1/N 🚀 Launching LEANN — the tiniest vector index on Earth!

Fast, accurate, and 100% private RAG on your MacBook.
0% internet. 97% smaller. Semantic search on everything.
Your personal Jarvis, ready to dive into your emails, chats, and more.

🔗 Code: github.com/yichuan-w/LEANN
📄
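For intuition, semantic search over a vector index boils down to embedding texts and ranking by similarity. Here is a minimal brute-force sketch with toy bag-of-words embeddings — purely illustrative, not LEANN's actual learned embeddings or its compact index structure:

```python
import numpy as np

# Toy corpus; LEANN would index real documents (emails, chats, etc.)
# with learned embeddings rather than this bag-of-words stand-in.
docs = [
    "meeting notes about the quarterly budget",
    "recipe for tomato soup",
    "email thread on the budget review meeting",
]

vocab = sorted({w for d in docs for w in d.split()})

def embed(text: str) -> np.ndarray:
    """Bag-of-words vector over the toy vocabulary."""
    words = text.split()
    return np.array([words.count(w) for w in vocab], dtype=np.float32)

def search(query: str, k: int = 1) -> list[int]:
    """Return indices of the k most cosine-similar documents."""
    q = embed(query)
    mat = np.stack([embed(d) for d in docs])
    sims = mat @ q / (np.linalg.norm(mat, axis=1) * np.linalg.norm(q) + 1e-9)
    return list(np.argsort(-sims)[:k])

print(search("budget meeting"))  # ranks the budget-related docs first
```

A real system replaces `embed` with a neural encoder and the linear scan with an approximate-nearest-neighbor index; the 97% size claim presumably comes from how the index stores (or avoids storing) those vectors.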
Lisa Dunlap (@lisabdunlap)'s Twitter Profile Photo

Recently, when using it in a somewhat long Cursor chat, it will just randomly delete all the code it has written from that chat, and restoring the checkpoint doesn't bring any of it back. Files it created are just completely empty. Anyone else notice this?

Nick Jiang @ ICLR (@nickhjiang)'s Twitter Profile Photo

What makes LLMs like Grok-4 unique?

We use sparse autoencoders (SAEs) to tackle queries like these and apply them to four data analysis tasks: data diffing, correlations, targeted clustering, and retrieval. By analyzing model outputs, SAEs find novel insights on model behavior!
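One way to picture the "data diffing" task: encode each model output's activations into sparse features, then rank features by how much their average activation differs between two corpora. A toy sketch below uses a fixed random projection plus ReLU as a stand-in for a trained SAE encoder — the diffing logic is the same, but real SAE weights are learned so the features are sparse and interpretable:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a trained SAE encoder: fixed random projection + ReLU.
d_model, d_feat = 16, 64
W_enc = rng.normal(size=(d_model, d_feat)).astype(np.float32)

def sae_features(acts: np.ndarray) -> np.ndarray:
    """Encode activations into non-negative feature activations."""
    return np.maximum(acts @ W_enc, 0.0)

# Two synthetic "corpora" of activations (e.g. from two different LLMs).
# Corpus B is shifted along one direction, mimicking a behavioral difference.
direction = rng.normal(size=d_model).astype(np.float32)
corpus_a = rng.normal(size=(200, d_model)).astype(np.float32)
corpus_b = corpus_a + 3.0 * direction

# Data diffing: rank features by the gap in mean activation across corpora.
gap = sae_features(corpus_b).mean(axis=0) - sae_features(corpus_a).mean(axis=0)
top = np.argsort(-np.abs(gap))[:5]
print("features most changed between corpora:", top)
```

With a real SAE, the top-gap features come with interpretable labels, which is what turns this ranking into an actual insight about how two models differ.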
Jessy Lin (@realjessylin)'s Twitter Profile Photo

๐Ÿ” How do we teach an LLM to ๐˜ฎ๐˜ข๐˜ด๐˜ต๐˜ฆ๐˜ณ a body of knowledge? In new work with AI at Meta, we propose Active Reading ๐Ÿ“™: a way for models to teach themselves new things by self-studying their training data. Results: * ๐Ÿ”๐Ÿ”% on SimpleQA w/ an 8B model by studying the wikipedia

๐Ÿ” How do we teach an LLM to ๐˜ฎ๐˜ข๐˜ด๐˜ต๐˜ฆ๐˜ณ a body of knowledge?

In new work with <a href="/AIatMeta/">AI at Meta</a>, we propose Active Reading ๐Ÿ“™: a way for models to teach themselves new things by self-studying their training data. Results:

* ๐Ÿ”๐Ÿ”% on SimpleQA w/ an 8B model by studying the wikipedia
Liana (@lianapatel_)'s Twitter Profile Photo

Interested in building and benchmarking deep research systems?

Excited to introduce DeepScholar-Bench, a live benchmark for generative research synthesis, from our team at Stanford and Berkeley!

🏆 Live Leaderboard: guestrin-lab.github.io/deepscholar-le…
📚 Paper: arxiv.org/abs/2508.20033
🛠️
XuDong Wang (@xdwang101)'s Twitter Profile Photo

🎉 Excited to share RecA: Reconstruction Alignment Improves Unified Multimodal Models

🔥 Post-train w/ RecA: 8k images & 4 hours (8 GPUs) → SOTA UMMs:

GenEval 0.73→0.90 | DPGBench 80.93→88.15 | ImgEdit 3.38→3.75

Code: github.com/HorizonWind200…

1/n
Tsung-Han (Patrick) Wu @ ICLR'25 (@tsunghan_wu)'s Twitter Profile Photo

NeurIPS 2025 ✅ Our generate-verify de-hallucination paper is in!
✔️ DFS-backtracking-like tricks fix VLM hallucinations
✔️ Explicit confidence targets matter (we stressed this before OpenAI's "Why LMs Hallucinate")
👉 Check it out: reverse-vlm.github.io

See u all at SD!
Melissa Pan (@melissapan)'s Twitter Profile Photo

Excited to share: MAST has been accepted as 🌟 NeurIPS D&B Spotlight 🌟

Updates for the community:
- NEW: We open-source 1,000+ multi-agent traces (link in 🧵).
- Lots of exciting use cases are emerging; we'll be releasing blogs & tutorials to help you get started.
- And … more
LMSYS Org (@lmsysorg)'s Twitter Profile Photo

SGLang now supports deterministic LLM inference! Building on Thinking Machines' batch-invariant kernels, we integrated deterministic attention & sampling ops into a high-throughput engine - fully compatible with chunked prefill, CUDA graphs, radix cache, and non-greedy sampling.

✅
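A quick illustration of why batch-invariant kernels are needed at all: floating-point addition is not associative, so if batch composition changes the order in which a kernel reduces the same values, the result can change bit-for-bit. In float32:

```python
import numpy as np

# Same three values, two reduction orders, two different answers.
a = np.float32(1e8)
b = np.float32(1.0)

left_to_right = (a + b) - a   # 1.0 is absorbed by rounding at 1e8
other_order = (a - a) + b     # 1.0 survives

print(left_to_right, other_order)
```

Batch-invariant kernels pin down the reduction order so an output token's logits don't depend on which other requests happened to share its batch, which is what makes greedy decoding reproducible end to end.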
lmarena.ai (formerly lmsys.org) (@lmarena_ai)'s Twitter Profile Photo

🎉 Re-introducing Categories in Vision Arena! Since we first introduced categories over two years ago (and Vision Arena last year), the AI evaluation landscape has grown rapidly. Categories let us zoom in on model performance for specific areas, from captioning to diagrams. 🧵

Parth Asawa (@pgasawa)'s Twitter Profile Photo

Training our advisors was too hard, so we tried to train black-box models like GPT-5 instead. Check out our work: Advisor Models, a training framework that adapts frontier models behind an API to your specific environment, users, or tasks using a smaller, advisor model (1/n)!
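The idea can be sketched as an outer loop that only ever edits the black-box model's input: a small advisor proposes guidance, the frontier model answers with that guidance prepended, and reward updates the advisor alone. Everything below — the toy "frontier model", the reward, and the advisor's bandit-style policy — is invented for illustration and is not the paper's actual training method:

```python
import random

random.seed(0)

def frontier_model(prompt: str, advice: str) -> str:
    """Frozen black-box model: we can only change its input."""
    if "be concise" in advice:
        return "short answer"
    return "a very long rambling answer to the question"

def reward(output: str) -> float:
    """Toy environment preference: this user wants concise answers."""
    return 1.0 if len(output.split()) <= 3 else 0.0

# "Advisor" with a trivial bandit policy over candidate advice strings:
# sample advice, observe reward, reinforce what worked.
advice_pool = ["be concise", "explain step by step", "use formal tone"]
scores = {a: 0.0 for a in advice_pool}

for _ in range(30):
    advice = random.choice(advice_pool)
    r = reward(frontier_model("user question", advice))
    scores[advice] += r  # only the advisor's policy is updated

best = max(scores, key=scores.get)
print("learned advice:", best)
```

In the real setting the advisor would be a trainable LM producing free-form advice, but the division of labor is the same: the frontier model stays fixed behind its API, and all adaptation happens in the smaller model that shapes its input.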