Yufan Zhuang (@yufan_zhuang)'s Twitter Profile
Yufan Zhuang

@yufan_zhuang

LLM & GenAI Researcher | phd student @UCSanDiego | prev @AMD @Meta @MSFTResearch @IBMResearch

ID: 973591762756280320

Link: https://evanzhuang.github.io/ | Joined: 13-03-2018 16:08:59

886 Tweets

246 Followers

275 Following

Sebastien Bubeck (@sebastienbubeck)

Claim: gpt-5-pro can prove new interesting mathematics.

Proof: I took a convex optimization paper with a clean open problem in it and asked gpt-5-pro to work on it. It proved a better bound than what is in the paper, and I checked the proof: it's correct.

Details below.
Samuel Schmidgall (@srschmidgall)

Our paper on autonomous scientific research is accepted to Findings of #EMNLP2025! 🎉

We introduce Agent Laboratory, a framework that accelerates scientific discovery by teaming human researchers with LLM agents.
Yanda Chen (@yanda_chen_)

Our results overall suggest that we can effectively separate harmful from harmless data and use pretraining data filtering to improve model safety without compromising usefulness. Big thanks to the team! 🙏 Mycal Tucker, Nina, Tony Wang 🐨, Francesco Mosconi,

tensorqt (@tensorqt)

attention sinks may be a bias in causal transformers. 

as some of you know, i've been writing a long blogpost on attention and its properties as a message-passing operation on graphs. while doing so, i figured i might have found an explanation for which attention sinks may be an
SemiAnalysis (@semianalysis_)

TogetherAI's Chief Scientist Tri Dao announced Flash Attention v4 at the HotChips conference, which is up to 22% faster than the attention kernel implementation from NVIDIA's cuDNN library. Tri Dao was able to achieve this with 2 key algorithmic changes. Firstly, it uses a new online
Pan Lu (@lupantech)

🔔 Two months ago, we released #IneqMath, which revealed the Soundness Gap:

LLMs can guess answers to Olympiad-level inequalities problems, but still struggle to make rigorous proof steps.

Since then, it's been downloaded 4K+ times on HuggingFace!
➡️ ineqmath.github.io
Liyuan Liu (Lucas) (@liyuanlucas)

appreciate Thinking Machines taking an open research approach! excited to see the first blog mentioned our work! truly on-policy RL is like RTX3090 for gamers in 2020 - you really want it, but the blockers make your head itch… kernel mismatches, parallelism mismatches, etc. etc.

Zilong (Ryan) Wang (@zlwang_cs)

🤖 RLVR is great for aligning LLMs — but what about optimizing multiple objectives at once? Different rewards have different learning difficulty & saturation rates ⚖️ Introducing my intern Yining Lu 's work 🎓 Dynamic Reward Weighting 🔀 – Adapts weights online as training

alphaXiv (@askalphaxiv)

This new paper suggests that LLM ‘aha moments’ arise from an emergent planning-vs-execution hierarchy, similar to HRM’s slow-planner/fast-executor idea

So they proposed HICRA which amplifies per-token credit on scarce planning tokens, focusing strategy & often beating GRPO!
Dheeraj Mekala (@mekaladheeraj)

Super excited to share GAIA2 & ARE!

ARE - research platform for scalable creation of RL environments.
GAIA2 - successor of GAIA for evaluating agents in a smartphone-like environment.
Qwen (@alibaba_qwen)

🚀 Introducing Qwen3-Omni — the first natively end-to-end omni-modal AI unifying text, image, audio & video in one model — no modality trade-offs!

🏆 SOTA on 22/36 audio & AV benchmarks
🌍 119L text / 19L speech in / 10L speech out
⚡ 211ms latency | 🎧 30-min audio
Riley Walz (@rtwlz)

I reverse engineered the San Francisco parking ticket system. I can see every ticket seconds after it's written

So I made a website. Find My Friends? AVOID THE PARKING COPS.
Da Yu (@dayu85201802)

✨ Internship Opportunity @ Google Research ✨ We are seeking a self-motivated student researcher to join our team at Google Research starting around January 2026. 🚀 In this role, you will contribute to research projects advancing agentic LLMs through tool use and RL, with the

Pan Lu (@lupantech)

🔥Introducing #AgentFlow, a new trainable agentic system where a team of agents learns to plan and use tools in the flow of a task.

🌐agentflow.stanford.edu
📄huggingface.co/papers/2510.05…

AgentFlow unlocks full potential of LLMs w/ tool-use.
(And yes, our 3/7B model beats GPT-4o)👇