Fujun Luan (@fujun_luan)'s Twitter Profile
Fujun Luan

@fujun_luan

Research Scientist @AdobeResearch. Previously Ph.D. at @Cornell | Alumnus of @RealityLabs, @Tsinghua_Uni. #GenAI #ML #AI #GenerativeAI. Bayesian. Views my own.

ID: 1661351476428677121

Joined: 24-05-2023 12:40:45

52 Tweets

288 Followers

237 Following

Mathurin Massias (@mathusmassias)'s Twitter Profile Photo

Anne Gagneux, Ségolène Martin, @qu3ntinb, Remi Emonet and I wrote a tutorial blog post on flow matching: dl.heeere.com/cfm/ with lots of illustrations and intuition! We got this idea after their cool work on improving Plug and Play with FM: arxiv.org/abs/2410.02423
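
For readers new to flow matching, the loss the tutorial builds intuition for fits in a few lines. A minimal PyTorch sketch, assuming linear interpolation paths and a hypothetical velocity network v_theta (see the blog post for the actual derivation and its variants):

import torch

def cfm_loss(v_theta, x1):
    # Conditional flow matching with linear paths:
    # x_t = (1 - t) * x0 + t * x1, whose velocity is x1 - x0.
    # x1: (batch, dim) training samples; v_theta: callable (x_t, t) -> velocity.
    x0 = torch.randn_like(x1)           # source noise sample
    t = torch.rand(x1.shape[0], 1)      # uniform time in [0, 1]
    xt = (1 - t) * x0 + t * x1          # point on the straight path
    target = x1 - x0                    # conditional velocity target
    return ((v_theta(xt, t) - target) ** 2).mean()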

Anthropic (@anthropicai)'s Twitter Profile Photo

New Anthropic research: Do reasoning models accurately verbalize their reasoning?

Our new paper shows they don't.

This casts doubt on whether monitoring chains-of-thought (CoT) will be enough to reliably catch safety issues.

AI at Meta (@aiatmeta)'s Twitter Profile Photo

Today is the start of a new era of natively multimodal AI innovation.

Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality.

Llama 4 Scout
• 17B-active-parameter model

Ruben Wiersma (@rtwiersma)'s Twitter Profile Photo

Recent progress in multi-view 3D object capture is stunning! But is there a limit?
In our upcoming #SIGGRAPH2025 paper "Uncertainty for SVBRDF Acquisition using Frequency Analysis," we show how to answer this question for material estimation. 1/6

Hanwen Jiang (@hanwenjiang1)'s Twitter Profile Photo

Supervised learning has held 3D Vision back for too long. Meet RayZer — a self-supervised 3D model trained with zero 3D labels:
❌ No supervision of camera & geometry
✅ Just RGB images
And the wild part? RayZer outperforms supervised methods (as 3D labels from COLMAP are noisy)

Tianyuan Zhang (@tianyuanzhang99)'s Twitter Profile Photo

Bored of linear recurrent memories (e.g., linear attention) and want a scalable, nonlinear alternative? Our new paper “Test-Time Training Done Right” proposes LaCT (Large Chunk Test-Time Training) — a highly efficient, massively scalable nonlinear memory with:
💡 Pure PyTorch
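
For context, test-time training treats a small network as a memory whose weights are updated by gradient steps while the sequence is processed; LaCT's twist is doing this once per large chunk. A rough sketch of generic chunk-wise TTT in PyTorch (illustrative only: the function names, update rule, and loss are assumptions, not the paper's implementation):

import torch

def chunkwise_ttt(fast_net, chunks, lr=1e-2):
    # fast_net: small "fast-weight" module acting as a nonlinear memory.
    # chunks: iterable of (keys, values, queries), one large chunk each.
    outputs = []
    for keys, values, queries in chunks:
        # Update: one gradient step so the memory maps this chunk's keys to values.
        loss = ((fast_net(keys) - values) ** 2).mean()
        grads = torch.autograd.grad(loss, list(fast_net.parameters()))
        with torch.no_grad():
            for p, g in zip(fast_net.parameters(), grads):
                p -= lr * g
        # Apply: read the updated memory with the chunk's queries.
        outputs.append(fast_net(queries))
    return torch.cat(outputs, dim=0)

Large chunks amortize the update: one gradient step serves many tokens, which is what makes a nonlinear memory efficient compared to per-token recurrent updates.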

Zhao Dong (@flycooler_zd)'s Twitter Profile Photo

🚀 Excited to announce our CVPR 2025 Workshop:
3D Digital Twin: Progress, Challenges, and Future Directions
🗓 June 12, 2025 · 9:00 AM–5:00 PM
📢 Incredible lineup: Richard Newcombe, Andrea Vedaldi (Visual Geometry Group, VGG), Hao (Richard) Zhang, Qianqian Wang, Dr. Xiaoshuai Zhang (Hillbot),

Mathurin Massias (@mathusmassias)'s Twitter Profile Photo

New paper on the generalization of Flow Matching arxiv.org/abs/2506.03719
🤯 Why does flow matching generalize? Did you know that the flow matching target you're trying to learn **can only generate training points**?
with Quentin Bertrand, Anne Gagneux & Rémi Emonet 👇👇👇
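
The claim is easier to parse with the closed form of the exact target. For conditional flow matching with linear paths and an empirical training set {x^(1), ..., x^(n)}, the optimal marginal velocity field is a posterior-weighted average over training points (standard CFM identities, not necessarily the paper's exact statement):

u_t^\star(x) = \sum_{i=1}^{n} w_i(x,t)\,\frac{x^{(i)} - x}{1-t},
\qquad
w_i(x,t) = \frac{p_t(x \mid x^{(i)})}{\sum_{j=1}^{n} p_t(x \mid x^{(j)})}.

As t \to 1 the weights concentrate on a single training point, so integrating u^\star transports every noise sample onto some x^{(i)}: the exact target memorizes, and generalization must come from the learned network deviating from it.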

Black Forest Labs (@bfl_ml)'s Twitter Profile Photo

High quality image editing no longer needs closed models

We release FLUX.1 Kontext [dev] - an open weights model for proprietary-level image editing performance. Runs on consumer chips.

✓ Open weights available
✓ Best in-class performance
✓ Self-serve commercial licensing

Andrew Ng (@andrewyng)'s Twitter Profile Photo

New Course: Post-training of LLMs

Learn to post-train and customize an LLM in this short course, taught by Banghua Zhu, Assistant Professor at the University of Washington and co-founder of @NexusflowX.

Training an LLM to follow instructions or answer questions has two key

Wan (@alibaba_wan)'s Twitter Profile Photo

🚀 Introducing Wan2.2: The World's First Open-Source MoE-Architecture Video Generation Model with Cinematic Control!
🔥 Key Innovations:
ꔷ World's First Open-Source MoE Video Model: Our Mixture-of-Experts architecture scales model capacity without increasing computational
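
For readers unfamiliar with the MoE claim ("scales model capacity without increasing computational cost"), the standard mechanism is top-k routing: parameters grow with the number of experts while each token only pays for k of them. A generic sketch in PyTorch (illustrative; not Wan2.2's architecture):

import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    # Each token is routed to k of n_experts networks, so capacity scales
    # with n_experts while per-token compute stays about k expert calls.
    def __init__(self, dim, n_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.router = nn.Linear(dim, n_experts)
        self.k = k

    def forward(self, x):                         # x: (tokens, dim)
        weights = self.router(x).softmax(dim=-1)  # routing probabilities
        topw, topi = weights.topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = topi[:, slot] == e         # tokens routed to expert e
                if mask.any():
                    out[mask] += topw[mask, slot, None] * expert(x[mask])
        return out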

Black Forest Labs (@bfl_ml)'s Twitter Profile Photo

Today we are releasing FLUX.1 Krea [dev] - a new state-of-the-art open-weights FLUX model, built for photorealism. Developed in collaboration with KREA AI, this model is focused on images with unique aesthetics. No “AI look”, no blown-out highlights, just natural detail.

Guangxuan Xiao (@guangxuan_xiao)'s Twitter Profile Photo

I've written the full story of Attention Sinks — a technical deep-dive into how the mechanism was developed and how our research ended up being used in OpenAI's new OSS models.

For those interested in the details:
hanlab.mit.edu/blog/streaming…
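
The mechanism behind the post is compact: in StreamingLLM, the KV cache keeps the first few "attention sink" tokens plus a sliding window of recent tokens, evicting everything in between. A minimal sketch of that eviction policy (illustrative Python, not the released implementation):

def evict_kv_cache(cache, n_sinks=4, window=1024):
    # cache: list of per-token (key, value) pairs, oldest first.
    # Keep the initial sink tokens plus the most recent window and drop
    # the middle, so memory stays bounded on unbounded streams.
    if len(cache) <= n_sinks + window:
        return cache
    return cache[:n_sinks] + cache[-window:]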

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

RLVR/RLHF libraries:
• verl - ByteDance
• TRL - HuggingFace
• slime - Zhipu AI
• prime-rl - Prime Intellect
• ROLL - Alibaba
• Nemo-RL - NVIDIA
• AReaL - Ant Research
• SkyRL - UC Berkeley
• open-instruct - Allen AI
• torchtune - PyTorch
Any I am missing? Which do you

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models

"we replace reward guided test-time noise optimization in diffusion models with a Noise Hypernetwork that modulates initial input noise."

"We show that our approach recovers a substantial portion of the

AI at Meta (@aiatmeta)'s Twitter Profile Photo

Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense
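
"Single frozen vision backbone" has a concrete operational meaning: the SSL features are extracted once with no gradient updates, and only a light task head is trained. A minimal linear-probe sketch in PyTorch (hypothetical names, shown for the pooled-classification case):

import torch
import torch.nn as nn

def probe_step(backbone, head, optimizer, images, labels):
    with torch.no_grad():             # frozen backbone: no gradients here
        feats = backbone(images)      # pooled SSL features, (batch, dim)
    loss = nn.functional.cross_entropy(head(feats), labels)
    optimizer.zero_grad()
    loss.backward()                   # only the task head is updated
    optimizer.step()
    return loss.item()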

Unitree (@unitreerobotics)'s Twitter Profile Photo

Unitree G1 has mastered more quirky skills 🤩
Unitree G1 has learned the "Anti-Gravity" mode: stability is greatly improved under any action sequence, and even if it falls, it can quickly get back up.

Qwen (@alibaba_qwen)'s Twitter Profile Photo

🔥 Qwen-Image-Edit-2509 IS LIVE — and it’s a GAME CHANGER. 🔥

We didn’t just upgrade it. We rebuilt it for creators, designers, and AI tinkerers who demand pixel-perfect control.

✅ Multi-Image Editing? YES.
Drag in “person + product” or “person + scene” — it blends them like

Qwen (@alibaba_qwen)'s Twitter Profile Photo

🚀 We're thrilled to unveil Qwen3-VL — the most powerful vision-language model in the Qwen series yet!

🔥 The flagship model Qwen3-VL-235B-A22B is now open-sourced and available in both Instruct and Thinking versions:
✅ Instruct outperforms Gemini 2.5 Pro on key vision