Haven (Haiwen) Feng (@havenfeng) 's Twitter Profile
Haven (Haiwen) Feng

@havenfeng

PhD student @MPI_IS, currently visiting @berkeley_ai. Interested in machine learning, computer vision, computer graphics, and how to understand the physical world.

ID: 1450717795948367877

Link: http://havenfeng.github.io
Joined: 20-10-2021 06:58:06

129 Tweets

946 Followers

873 Following

Seohong Park (@seohong_park) 's Twitter Profile Photo


Q-learning is not yet scalable

seohong.me/blog/q-learnin…

I wrote a blog post about my thoughts on scalable RL algorithms.

To be clear, I'm still highly optimistic about off-policy RL and Q-learning! I just think we haven't found the right solution yet (the post discusses why).
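The object the post critiques is the one-step Bellman backup at the heart of Q-learning. A minimal tabular sketch of that backup (toy MDP and values are hypothetical, not from the post):

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One-step Q-learning (off-policy TD) backup:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    The bootstrapped max term is the part whose bias can compound over
    long horizons, which is one common scalability concern."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Toy 2-state, 2-action MDP; state 1 has zero estimated value.
Q = np.zeros((2, 2))
Q = q_learning_update(Q, s=0, a=1, r=1.0, s_next=1)
print(Q[0, 1])  # 0.1 after one update: alpha * (1.0 + 0.99 * 0 - 0)
```

Because the target itself contains the learned `Q`, errors in the estimate feed back into the update, unlike supervised regression against fixed labels.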
Xiuyu Li (@xiuyu_l) 's Twitter Profile Photo

Sparsity can make your LoRA fine-tuning go brrr 💨 Announcing SparseLoRA (ICML 2025): up to 1.6-1.9x faster LLM fine-tuning (2.2x less FLOPs) via contextual sparsity, while maintaining performance on tasks like math, coding, chat, and ARC-AGI 🤯 🧵1/ z-lab.ai/projects/spars…
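To make "contextual sparsity" concrete, here is a hypothetical NumPy sketch of the general idea (not the SparseLoRA implementation): per input, only the low-rank components with the largest activations are kept in the LoRA path, skipping the rest to save FLOPs.

```python
import numpy as np

def sparse_lora_forward(x, W, A, B, keep_ratio=0.25):
    """Hypothetical contextual-sparsity sketch for a LoRA layer:
    y = x @ W + (x @ A)[active] @ B[active], where the active low-rank
    components are chosen per input by activation magnitude."""
    base = x @ W                       # frozen base projection
    h = x @ A                          # low-rank activations, shape (r,)
    r = h.shape[-1]
    k = max(1, int(r * keep_ratio))    # how many components to keep
    idx = np.argsort(np.abs(h))[-k:]   # contextually active components
    delta = h[idx] @ B[idx]            # sparse low-rank update
    return base + delta

rng = np.random.default_rng(0)
x = rng.normal(size=8)
W = rng.normal(size=(8, 8))            # frozen pretrained weight
A = rng.normal(size=(8, 4))            # LoRA down-projection
B = rng.normal(size=(4, 8))            # LoRA up-projection
y = sparse_lora_forward(x, W, A, B)
```

The frozen path still runs densely; the savings come from the adapter path, which is where fine-tuning gradients flow.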

Albert Gu (@_albertgu) 's Twitter Profile Photo


I converted one of my favorite talks I've given over the past year into a blog post.

"On the Tradeoffs of SSMs and Transformers"
(or: tokens are bullshit)

In a few days, we'll release what I believe is the next major advance for architectures.
Qiyang Li (@qiyang_li) 's Twitter Profile Photo

Everyone knows action chunking is great for imitation learning. It turns out that we can extend its success to RL to better leverage prior data for improved exploration and online sample efficiency! colinqiyangli.github.io/qc/ The recipe to achieve this is incredibly simple. 🧵 1/N
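For readers unfamiliar with the term: action chunking means the policy predicts a short sequence of actions at once and executes it open-loop before re-querying, which tends to produce more temporally coherent behavior than per-step sampling. A toy sketch (the environment and policy here are hypothetical, not from the paper):

```python
import numpy as np

def rollout_with_chunks(env_step, policy, obs, horizon=20, chunk=4):
    """Execute a policy that emits `chunk` actions per query, open-loop,
    until `horizon` environment steps have elapsed."""
    total_reward, t = 0.0, 0
    while t < horizon:
        actions = policy(obs, chunk)        # shape (chunk, action_dim)
        for a in actions:
            obs, r = env_step(obs, a)       # step the env open-loop
            total_reward += r
            t += 1
            if t >= horizon:
                break
    return total_reward

# Toy check: 1-D integrator where the reward equals the action taken.
rng = np.random.default_rng(0)
policy = lambda obs, k: rng.normal(size=(k, 1))
env_step = lambda obs, a: (obs + a, float(a[0]))
total = rollout_with_chunks(env_step, policy, obs=np.zeros(1))
```

In the RL setting, the chunked action sequence becomes the unit over which values and exploration noise are defined, rather than a single timestep.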

Ruilong Li (@ruilong_li) 's Twitter Profile Photo


For everyone interested in precise 📷 camera control 📷 in transformers (e.g., video / world models)

Stop settling for Plücker raymaps -- use camera-aware relative PE in your attention layers, like RoPE (for LLMs) but for cameras!

Paper & code: liruilong.cn/prope/
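The analogy to RoPE is the key point: RoPE rotates queries and keys by their position so that the attention score depends only on the *relative* offset. A minimal 1-D RoPE sketch illustrating that relative property (the paper generalizes the rotation from scalar token positions to relative camera transforms, which this toy code does not attempt):

```python
import numpy as np

def rope(x, pos, theta=10000.0):
    """Rotary position embedding on an even-dim vector: rotate each 2-D
    pair by an angle proportional to `pos`, so that the inner product
    <rope(q, i), rope(k, j)> depends only on the offset i - j."""
    d = x.shape[-1]
    freqs = theta ** (-np.arange(0, d, 2) / d)   # per-pair frequencies
    ang = pos * freqs
    x1, x2 = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x1 * np.cos(ang) - x2 * np.sin(ang)
    out[1::2] = x1 * np.sin(ang) + x2 * np.cos(ang)
    return out

q = np.array([1.0, 0.0, 0.5, -0.5])
k = np.array([0.3, 0.7, -0.2, 0.1])
# Relative property: the score for offset 5 - 2 equals that for 13 - 10.
s1 = rope(q, 5) @ rope(k, 2)
s2 = rope(q, 13) @ rope(k, 10)
```

Absolute encodings (including raw Plücker raymaps concatenated to inputs) lack this invariance, which is presumably what motivates a camera-aware *relative* PE.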
David McAllister (@davidrmcall) 's Twitter Profile Photo

Check out our blog post at flowreinforce.github.io We developed interactive plots that explain the connection between flow/diffusion models and RL. w/ a great team of collaborators! Songwei Ge Brent Yi Chung Min Kim Ethan Weber Hongsuk Benjamin Choi Haiwen (Haven) Feng Angjoo Kanazawa

Qianqian Wang (@qianqianwang5) 's Twitter Profile Photo

📢 Thrilled to share that I'll be joining Harvard and the Kempner Institute as an Assistant Professor starting Fall 2026! I'll be recruiting students this year for the Fall 2026 admissions cycle. Hope you apply!

Xingang Pan (@xingangp) 's Twitter Profile Photo

Introducing STream3R, a new 3D geometric foundation model for efficient 3D reconstruction from streaming input. Similar to LLMs, STream3R uses causal attention during training and a KV cache at inference. No need to worry about post-alignment or reconstructing from scratch.
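The causal-attention-plus-KV-cache pattern borrowed from LLMs is easy to state in code. A hypothetical single-head sketch (not the STream3R implementation): each new streamed token attends over all cached past keys/values, so nothing is re-processed from scratch.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class KVCache:
    """Streaming causal attention with a KV cache: at step t, the query
    attends over keys/values of steps 1..t only (causality is enforced
    simply by never caching the future)."""
    def __init__(self):
        self.ks, self.vs = [], []

    def step(self, q, k, v):
        self.ks.append(k)
        self.vs.append(v)
        K = np.stack(self.ks)                    # (t, d) cached keys
        V = np.stack(self.vs)                    # (t, d) cached values
        w = softmax(K @ q / np.sqrt(q.shape[-1]))
        return w @ V                             # output for this step

rng = np.random.default_rng(0)
cache = KVCache()
# Feed 5 "frames"; each step reuses all previously cached keys/values.
outs = [cache.step(*rng.normal(size=(3, 4))) for _ in range(5)]
```

Per-step cost grows linearly with the stream length instead of re-running attention over the whole sequence, which is what makes streaming inference efficient.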

Weiyang Liu (@besteuler) 's Twitter Profile Photo


Excited to see Orthogonal Finetuning (OFT) and Quantized OFT (QOFT) now merged into LLaMA-Factory! 🎉

OFT & QOFT are memory/time/parameter-efficient and excel at preserving pretraining knowledge. Try them in:
🔗 LLaMA-Factory: github.com/hiyouga/LLaMA-…
🔗 PEFT:
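A hedged sketch of the orthogonal-finetuning idea (illustrative only, not the PEFT/LLaMA-Factory implementation): instead of adding a low-rank delta to a frozen weight as LoRA does, multiply it by a learned orthogonal matrix, commonly parameterized via the Cayley transform of a skew-symmetric matrix. Orthogonal rotations preserve angles between neuron weight vectors, which is the property tied to preserving pretraining knowledge.

```python
import numpy as np

def cayley_orthogonal(S):
    """Cayley transform: maps a skew-symmetric S to an orthogonal
    R = (I - S) @ inv(I + S)."""
    I = np.eye(S.shape[0])
    return (I - S) @ np.linalg.inv(I + S)

def oft_forward(x, W, P):
    """Orthogonal-finetuning-style forward: y = x @ (R @ W), with R
    orthogonal and built from the trainable parameter P. W stays frozen;
    only P is learned."""
    S = P - P.T                        # project P to skew-symmetric
    R = cayley_orthogonal(S)
    return x @ (R @ W)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))            # frozen pretrained weight
P = 0.1 * rng.normal(size=(4, 4))      # trainable parameter
R = cayley_orthogonal(P - P.T)
x = rng.normal(size=4)
y = oft_forward(x, W, P)               # R @ R.T == I up to float error
```

Real implementations add block-diagonal structure to keep the parameter count low; this sketch omits that for brevity.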
Phota Labs (@photalabs) 's Twitter Profile Photo

Introducing Phota Labs: We're building personalized visual GenAI and the next chapter of photography. Because memory-making should be effortless, personal, and compelling for everyone. We're excited to share our $5.6M seed led by a16z (Yoko), with @Figma Ventures,

Zhaoyang Lv (@lvzhaoyang) 's Twitter Profile Photo

We'd like to thank the reviewers and the community: 4DGT got accepted to NeurIPS 2025 as a Spotlight. We have just released the demo code at github.com/facebookresear… There are a few features to be added, with some updates in our writing, thanks to the awesome suggestions from

Andrea Tagliasacchi 🇨🇦 (@taiyasaki) 's Twitter Profile Photo


Thrilled to announce that at #ICCV2025 we will host the first workshop on Geometry-Free Novel View Synthesis and Controllable Video Models

geofreenvs.github.io
a.k.a. "3D Computer Vision in the era of Video Models" 😅
Sherwin Bahmani (@sherwinbahmani) 's Twitter Profile Photo

📢 Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation. Got only one or a few images and wondering whether recovering the 3D environment is a reconstruction or a generation problem? Why not do it with a generative reconstruction model! We show that a

Weiyang Liu (@besteuler) 's Twitter Profile Photo

I enjoyed reading this blog. This is exactly what I have been pursuing throughout my research career -- using weight geometry to characterize and improve neural network training. Really excited that it finally got people's attention!! In 2017, we studied the weight

Siyuan Guo (@syguoml) 's Twitter Profile Photo


🚨 New preprint. Physics of learning: A Lagrangian perspective to different learning paradigms.
arxiv.org/abs/2509.21049
TL;DR A single Lagrangian unifies supervised learning, generative modelling, and RL.

- We study the problem of building an efficient learning system.
- We propose that