YichuanWang (@yichuanm)'s Twitter Profile
YichuanWang

@yichuanm

1st-year EECS PhD student at UC Berkeley SkyLab @BerkeleySky; 2020 ACM Class at SJTU. Interested in MLSys. Previously an intern at NYU with Jinyang Li.

ID: 1389164045782028288

Link: https://yichuan520030910320.github.io/
Joined: 03-05-2021 10:25:00

63 Tweets

451 Followers

1.1K Following

Jiaxin Ge (@aomaru_21490)

✨Introducing ECHO, the newest in-the-wild image generation benchmark! You’ve seen new image models and new use cases discussed on social media, but old benchmarks don’t test them! We distilled this qualitative discussion into a structured benchmark. 🔗 echo-bench.github.io

Yifan Qiao (@yifanqiao_ucla)

🚀 End the GPU Cost Crisis Today!!!

Tired of LLMs that lock a whole GPU but leave capacity idle? Frustrated by your cluster's low utilization?

We launch kvcached, the first library for elastic GPU sharing across LLMs.
🔗 github.com/ovg-project/kv…
🧵👇 Why it matters:

Tsung-Han (Patrick) Wu @ ICLR’25 (@tsunghan_wu)

Humans handle dynamic situations easily, what about models? 

Turns out, they break in three distinct ways:

⛔ Force Stop → Reasoning leakage (won’t stop) 
⚡️ Speedup → Panic (rushed answers) 
❓ Info Updates → Self-doubt (reject updates)

👉Check out dynamic-lm.github.io

Wenhu Chen (@wenhuchen)

Had some really interesting discoveries recently: suppose a model performs extremely stably on one benchmark, say it always gets 62% on SWEBench no matter what prompts or scaffold you use. It DOES NOT mean that the model is robust. It actually means that the model

Melissa Pan (@melissapan)

The Sky’s Fun Committee, representing the ppl of sky, just dropped the new lab theme: 

⚫️💖 Black Pink x Halloween 🎃🦇

We have:
- Gru & the minions 
- kpop ???

🫰😉

António Loison (@antonio_loison)

📢 ViDoRe V3, our new multimodal retrieval benchmark for enterprise use cases, is finally here!
It focuses on real-world applied RAG scenarios using high-quality human-verified data. huggingface.co/blog/QuentinJG…
🧵(1/N)

Cursor (@cursor_ai)

Semantic search improves our agent's accuracy across all frontier models, especially in large codebases where grep alone falls short. Learn more about our results and how we trained an embedding model for retrieving code.

Sumanth (@sumanth_077)

Turn your laptop into a powerful RAG system!

LEANN can index and search through millions of documents while using 97% less storage than traditional solutions without accuracy loss.

LEANN achieves this through graph-based selective recomputation with high-degree preserving

Machine Learning Community ⭐️ (@c4ml_)

Turn your laptop into a powerful RAG system!  

LEANN can index and search through millions of documents while using 97% less storage than traditional solutions without accuracy loss.

100% Open Source

YichuanWang (@yichuanm)

The best open-source deep-research model I’ve seen so far. Tons of eye-opening insights on RLVR, real web-search grounding, and how LLMs actually use data. If you care about retrieval, reasoning, or agentic systems, this is a must-read.

YichuanWang (@yichuanm)

Congrats to the team — incredible work! 👏 Interestingly, we built LEANN MCP in July (the first open-source semantic-search MCP for Claude Code), and we also saw: 🚀 50% token savings ⚡ 30% lower latency Repo👇 github.com/yichuan-w/LEAN…

Antoine Chaffin (@antoine_chaffin)

Alright, so since state-of-the-art semantic search MCP is trending again, please let me introduce you to MCPyLate again 😇 x.com/antoine_chaffi… It is probably less powerful than Mixedbread’s search and has a higher storage cost than LEANN, but it lets you use any ColBERT

YichuanWang (@yichuanm)

Yeah, super exciting to see Claude boosted with MS MARCO via latent-space search. Now it’s our turn — time to bring multi-vector retrieval to code 🚀 LEANN is already the first Claude-powered Code MCP, and CC+MultiVector+LEANN MCP (or even FastPlaid MCP Antoine Chaffin👀) is on

Aamir Shakir (@aaxsh18)

grep is multimodal now. It performs better than Apple Photos and gives you and your agent perfect search. Just run npm install -g @mixedbread/mgrep

Shiyi Cao (@shiyi_c98)

1/n
🚀 Introducing SkyRL-Agent, a framework for efficient RL agent training.

⚡ 1.55× faster async rollout dispatch
🛠 Lightweight tool + task integration
🔄 Backend-agnostic (SkyRL-train / VeRL / Tinker)
🏆 Used to train SA-SWE-32B, improving Qwen3-32B from 24.4% → 39.4%