Haoxuan (Steve) Chen (@haoxuan_steve_c)'s Twitter Profile
Haoxuan (Steve) Chen

@haoxuan_steve_c

Ph.D. Candidate ICME @Stanford; B.S. in Math (PMA) & Data Science (CMS) @Caltech; Applied & Computational Math/Scientific ML (AI4Science)/Statistics/Optimization

ID: 1149061846697070593

Website: https://haoxuanstevec00.github.io/
Joined: 10-07-2019 21:04:27

538 Tweets

659 Followers

2.2K Following

DailyPapers (@huggingpapers)

Dive into mathematical frameworks, training/inference techniques, & applications across language, vision-language, & bio domains: huggingface.co/papers/2506.13… Repo: github.com/LiQiiiii/DLLM-…

Thomas Fel (@napoolar)

Great excuse to share something I really love: 
1-Lipschitz nets.

They give clean theory, certs for robustness, the right loss for W-GANs, even nicer grads for explainability!! Yet are still niche.

Here’s a speed-run through some of my favorite papers in the field. 🧵👇
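
For readers new to the area, here is a minimal PyTorch-style sketch (mine, not from the thread) of the simplest construction: dividing a linear layer by its spectral norm makes it 1-Lipschitz, composing such layers with 1-Lipschitz activations keeps the whole network 1-Lipschitz, and the logit margin then gives a certified robustness radius. The papers in the thread use more refined (e.g., orthogonality-based) layers.

import torch
import torch.nn as nn

class SpectralNormLinear(nn.Module):
    # Dividing the weight by its largest singular value guarantees ||W'x - W'y|| <= ||x - y||.
    def __init__(self, d_in, d_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in) / d_in ** 0.5)
        self.bias = nn.Parameter(torch.zeros(d_out))

    def forward(self, x):
        sigma = torch.linalg.matrix_norm(self.weight, ord=2)  # spectral norm
        return x @ (self.weight / sigma).T + self.bias

def certified_radius(logits):
    # For a 1-Lipschitz classifier, the predicted class cannot change within an L2 ball
    # of radius (top1 - top2) / sqrt(2) around the input (standard margin certificate).
    top2 = logits.topk(2, dim=-1).values
    return (top2[..., 0] - top2[..., 1]) / 2 ** 0.5

net = nn.Sequential(SpectralNormLinear(784, 256), nn.ReLU(), SpectralNormLinear(256, 10))
radius = certified_radius(net(torch.randn(4, 784)))  # per-example certified radii
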
Mihir Prabhudesai (@mihirp98)

🚨 The era of infinite internet data is ending. So we ask:

👉 What’s the right generative modelling objective when data—not compute—is the bottleneck?

TL;DR:

▶️Compute-constrained? Train Autoregressive models

▶️Data-constrained? Train Diffusion models

Get ready for 🤿  1/n
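
For reference, the two objectives being contrasted are roughly the next-token likelihood versus a masked/denoising likelihood that revisits each sequence at many corruption levels (my gloss, not the thread's; the exact ELBO weighting over mask rates is omitted):

\mathcal{L}_{\mathrm{AR}}(\theta) = -\,\mathbb{E}_{x}\left[\sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t})\right]

\mathcal{L}_{\mathrm{Diff}}(\theta) = -\,\mathbb{E}_{x}\,\mathbb{E}_{M}\left[\sum_{t \in M} \log p_\theta\!\left(x_t \mid x_{\setminus M}\right)\right]

One common intuition, consistent with the thread's claim, is that averaging over random maskings acts like built-in data augmentation, which is worth the extra compute once data rather than compute is the bottleneck.
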
alphaXiv (@askalphaxiv)

"Deep Researcher with Test-Time Diffusion"

This paper treats report writing as an iterative retrieval‑augmented diffusion process that can be enhanced by component‑wise self‑evolution. 

It achieves SoTA on multi‑hop search‑and‑reasoning benchmarks.
Shuiwang Ji (@shuiwangji)

Our 500+ page AI4Science paper is finally published:

Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems. Foundations and Trends® in Machine Learning, Vol. 18, No. 4, 385–912, 2025

nowpublishers.com/article/Detail…
Lin Yang (@lyang36)

Code release! 🚀 Following up on our IMO 2025 results with the public LLM Gemini 2.5 Pro — here’s the full pipeline & general (non-problem-specific) prompts. 👉 [github.com/lyang36/IMO25] Have fun exploring! #AI #Math #LLMs #IMO2025

Molei Tao (@moleitaomath)

Interested in some foundation aspects?
Waiting or unhappy about NeurIPS reviews?

Plz consider NeurIPS workshop
DynaFront: Dynamics at the Frontiers of Optimization, Sampling, and Games
sites.google.com/view/dynafront…

Yuejie Chi Andrea Montanari Taiji Suzuki Tatjana Chavdarova ++
Sponsor appreciated!
Climate Change AI (@climatechangeai)

We're delighted to invite submissions to the #NeurIPS2025 workshop "Tackling Climate Change with Machine Learning". 

Important dates:
▶️Mentorship program: Jul 28
▶️Papers, proposal & tutorial submissions: Aug 20
▶️Workshop: Dec 6 or 7

Learn more: climatechange.ai/events/neurips…
Arash Vahdat (@arashvahdat)

📢 Excited to announce that GenMol is now open-sourced. 

GenMol: A Drug Discovery Generalist with Discrete Diffusion
Paper: arxiv.org/abs/2501.06158
Code: github.com/NVIDIA-Digital…
Chujie Zheng (@chujiezheng)

Proud to introduce Group Sequence Policy Optimization (GSPO), our stable, efficient, and performant RL algorithm that powers the large-scale RL training of the latest Qwen3 models (Instruct, Coder, Thinking) 🚀

📄 huggingface.co/papers/2507.18…
alphaXiv (@askalphaxiv)

"Group Sequence Policy Optimization"

Qwen introduces Group Sequence Policy Optimization (GSPO), an RL algorithm for LLMs that uses sequence-level importance weighting and clipping instead of token-level methods, yielding high-performing training for MoE models.
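
A rough sketch of that distinction (my reading of the abstract, not the official implementation; the clipping range below is a PPO-style placeholder): the importance ratio and the clipping are applied once per generated sequence, using a length-normalized ratio, rather than once per token as in PPO/GRPO-style updates.

import torch

def sequence_importance_ratio(logp_new, logp_old, mask):
    # Length-normalized (geometric-mean) ratio over response tokens: one scalar per sequence.
    # logp_new, logp_old: per-token log-probs, shape (B, T); mask is 1 on response tokens.
    lengths = mask.sum(dim=-1).clamp(min=1)
    log_ratio = ((logp_new - logp_old) * mask).sum(dim=-1) / lengths
    return log_ratio.exp()

def gspo_style_surrogate(logp_new, logp_old, mask, advantages, eps=0.2):
    # Clip the sequence-level ratio, then take the usual pessimistic min; advantages has shape (B,).
    ratio = sequence_importance_ratio(logp_new, logp_old, mask)
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps)
    return torch.minimum(ratio * advantages, clipped * advantages).mean()
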
Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)

Anthropic's recent blog post provides a good summary of some of the recent work applying mechanistic interpretability to biology. I am very bullish on this research direction!
Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu)

Phase 1 of Physics of Language Models code release
✅our Part 3.1 + 4.1 = all you need to pretrain a strong 8B base model in 42k GPU-hours
✅Canon layers = strong, scalable gains
✅Real open-source (data/train/weights)
✅Apache 2.0 license (commercial ok!)
🔗github.com/facebookresear…
You Jiacheng (@youjiacheng)

Yep, TRPO ignores the mismatch in state distributions.
(and the 2002 reference shows that this is a classic approximation).
GSPO just asks: why do we need to ignore it?
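
For context, the approximation in question: the TRPO/PPO surrogate takes the state expectation under the old policy's visitation distribution,

L_{\pi_{\mathrm{old}}}(\theta) = \mathbb{E}_{s \sim d^{\pi_{\mathrm{old}}},\; a \sim \pi_{\mathrm{old}}(\cdot \mid s)} \left[ \frac{\pi_\theta(a \mid s)}{\pi_{\mathrm{old}}(a \mid s)} \, A^{\pi_{\mathrm{old}}}(s, a) \right]

whereas the true objective would weight states by d^{\pi_\theta}. The 2002 reference (presumably Kakade & Langford's conservative policy iteration paper) bounds the error of this substitution when the new policy stays close to the old one.
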
Physical Review D (@physrevd)

Higher order calculations in perturbation theory quickly become complex and difficult, but progress is being made. This highlight uses modern techniques from computational algebraic geometry to analytically compute five-point three-loop diagrams. #EdSugg journals.aps.org/prd/abstract/1…