Fujun Luan (@fujun_luan)'s Twitter Profile
Fujun Luan

@fujun_luan

Research Scientist @AdobeResearch. Previously Ph.D. at @Cornell | Alumnus of @RealityLabs, @Tsinghua_Uni. #GenAI #ML #AI #GenerativeAI. Bayesian. Views my own.

ID: 1661351476428677121

Joined: 24-05-2023 12:40:45

52 Tweets

288 Followers

237 Following

Mathurin Massias (@mathusmassias)'s Twitter Profile Photo

Anne Gagneux, Ségolène Martin, @qu3ntinb, Remi Emonet and I wrote a tutorial blog post on flow matching: dl.heeere.com/cfm/ with lots of illustrations and intuition! We got this idea after their cool work on improving Plug and Play with FM: arxiv.org/abs/2410.02423
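
For readers new to flow matching, the loss the tutorial builds intuition for fits in a few lines. A minimal PyTorch sketch, assuming linear interpolation paths and a hypothetical velocity network v_theta (see the blog post for the actual derivation and its variants):

import torch

def cfm_loss(v_theta, x1):
    # Conditional flow matching with linear paths:
    # x_t = (1 - t) * x0 + t * x1, whose velocity is x1 - x0.
    # x1: (batch, dim) training samples; v_theta: callable (x_t, t) -> velocity.
    x0 = torch.randn_like(x1)           # source noise sample
    t = torch.rand(x1.shape[0], 1)      # uniform time in [0, 1]
    xt = (1 - t) * x0 + t * x1          # point on the straight path
    target = x1 - x0                    # conditional velocity target
    return ((v_theta(xt, t) - target) ** 2).mean()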

Anthropic (@anthropicai)'s Twitter Profile Photo

New Anthropic research: Do reasoning models accurately verbalize their reasoning?

Our new paper shows they don't.

This casts doubt on whether monitoring chains-of-thought (CoT) will be enough to reliably catch safety issues.

AI at Meta (@aiatmeta)'s Twitter Profile Photo

Today is the start of a new era of natively multimodal AI innovation.

Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality.

Llama 4 Scout
• 17B-active-parameter model

Ruben Wiersma (@rtwiersma)'s Twitter Profile Photo

Recent progress in multi-view 3D object capture is stunning! But is there a limit?
In our upcoming #SIGGRAPH2025 paper "Uncertainty for SVBRDF Acquisition using Frequency Analysis," we show how to answer this question for material estimation. 1/6

Hanwen Jiang (@hanwenjiang1)'s Twitter Profile Photo

Supervised learning has held 3D Vision back for too long. Meet RayZer — a self-supervised 3D model trained with zero 3D labels:
❌ No supervision of camera & geometry
✅ Just RGB images
And the wild part? RayZer outperforms supervised methods (as 3D labels from COLMAP are noisy)

Tianyuan Zhang (@tianyuanzhang99)'s Twitter Profile Photo

Bored of linear recurrent memories (e.g., linear attention) and want a scalable, nonlinear alternative? Our new paper “Test-Time Training Done Right” proposes LaCT (Large Chunk Test-Time Training) — a highly efficient, massively scalable nonlinear memory with:
💡 Pure PyTorch
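
For context, test-time training treats a small network as a memory whose weights are updated by gradient steps while the sequence is processed; LaCT's twist is doing this once per large chunk. A rough sketch of generic chunk-wise TTT in PyTorch (illustrative only: the function names, update rule, and loss are assumptions, not the paper's implementation):

import torch

def chunkwise_ttt(fast_net, chunks, lr=1e-2):
    # fast_net: small "fast-weight" module acting as a nonlinear memory.
    # chunks: iterable of (keys, values, queries), one large chunk each.
    outputs = []
    for keys, values, queries in chunks:
        # Update: one gradient step so the memory maps this chunk's keys to values.
        loss = ((fast_net(keys) - values) ** 2).mean()
        grads = torch.autograd.grad(loss, list(fast_net.parameters()))
        with torch.no_grad():
            for p, g in zip(fast_net.parameters(), grads):
                p -= lr * g
        # Apply: read the updated memory with the chunk's queries.
        outputs.append(fast_net(queries))
    return torch.cat(outputs, dim=0)

Large chunks amortize the update: one gradient step serves many tokens, which is what makes a nonlinear memory efficient compared to per-token recurrent updates.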

Zhao Dong (@flycooler_zd)'s Twitter Profile Photo

🚀 Excited to announce our CVPR 2025 Workshop:
3D Digital Twin: Progress, Challenges, and Future Directions
🗓 June 12, 2025 · 9:00 AM–5:00 PM
📢 Incredible lineup: Richard Newcombe, Andrea Vedaldi (Visual Geometry Group, VGG), Hao (Richard) Zhang, Qianqian Wang, Dr. Xiaoshuai Zhang (Hillbot),

Mathurin Massias (@mathusmassias)'s Twitter Profile Photo

New paper on the generalization of Flow Matching arxiv.org/abs/2506.03719
🤯 Why does flow matching generalize? Did you know that the flow matching target you're trying to learn **can only generate training points**?
with Quentin Bertrand, Anne Gagneux & Rémi Emonet 👇👇👇
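
The claim is easier to parse with the closed form of the exact target. For conditional flow matching with linear paths and an empirical training set {x^(1), ..., x^(n)}, the optimal marginal velocity field is a posterior-weighted average over training points (standard CFM identities, not necessarily the paper's exact statement):

u_t^\star(x) = \sum_{i=1}^{n} w_i(x,t)\,\frac{x^{(i)} - x}{1-t},
\qquad
w_i(x,t) = \frac{p_t(x \mid x^{(i)})}{\sum_{j=1}^{n} p_t(x \mid x^{(j)})}.

As t \to 1 the weights concentrate on a single training point, so integrating u^\star transports every noise sample onto some x^{(i)}: the exact target memorizes, and generalization must come from the learned network deviating from it.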

Black Forest Labs (@bfl_ml)'s Twitter Profile Photo

High quality image editing no longer needs closed models

We release FLUX.1 Kontext [dev] - an open weights model for proprietary-level image editing performance. Runs on consumer chips.

✓ Open weights available
✓ Best in-class performance
✓ Self-serve commercial licensing

Andrew Ng (@andrewyng)'s Twitter Profile Photo

New Course: Post-training of LLMs

Learn to post-train and customize an LLM in this short course, taught by Banghua Zhu, Assistant Professor at the University of Washington and co-founder of @NexusflowX.

Training an LLM to follow instructions or answer questions has two key

Wan (@alibaba_wan)'s Twitter Profile Photo

🚀 Introducing Wan2.2: The World's First Open-Source MoE-Architecture Video Generation Model with Cinematic Control!
🔥 Key Innovations:
ꔷ World's First Open-Source MoE Video Model: Our Mixture-of-Experts architecture scales model capacity without increasing computational
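
For readers unfamiliar with the MoE claim ("scales model capacity without increasing computational cost"), the standard mechanism is top-k routing: parameters grow with the number of experts while each token only pays for k of them. A generic sketch in PyTorch (illustrative; not Wan2.2's architecture):

import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    # Each token is routed to k of n_experts networks, so capacity scales
    # with n_experts while per-token compute stays about k expert calls.
    def __init__(self, dim, n_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.router = nn.Linear(dim, n_experts)
        self.k = k

    def forward(self, x):                         # x: (tokens, dim)
        weights = self.router(x).softmax(dim=-1)  # routing probabilities
        topw, topi = weights.topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = topi[:, slot] == e         # tokens routed to expert e
                if mask.any():
                    out[mask] += topw[mask, slot, None] * expert(x[mask])
        return out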

Black Forest Labs (@bfl_ml)'s Twitter Profile Photo

Today we are releasing FLUX.1 Krea [dev] - a new state-of-the-art open-weights FLUX model, built for photorealism. Developed in collaboration with KREA AI, this model is focused on images with unique aesthetics. No “AI look”, no blown-out highlights, just natural detail.

Guangxuan Xiao (@guangxuan_xiao)'s Twitter Profile Photo

I've written the full story of Attention Sinks — a technical deep-dive into how the mechanism was developed and how our research ended up being used in OpenAI's new OSS models.

For those interested in the details:
hanlab.mit.edu/blog/streaming…
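
The mechanism behind the post is compact: in StreamingLLM, the KV cache keeps the first few "attention sink" tokens plus a sliding window of recent tokens, evicting everything in between. A minimal sketch of that eviction policy (illustrative Python, not the released implementation):

def evict_kv_cache(cache, n_sinks=4, window=1024):
    # cache: list of per-token (key, value) pairs, oldest first.
    # Keep the initial sink tokens plus the most recent window and drop
    # the middle, so memory stays bounded on unbounded streams.
    if len(cache) <= n_sinks + window:
        return cache
    return cache[:n_sinks] + cache[-window:]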

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

RLVR/RLHF libraries:
• verl - ByteDance
• TRL - HuggingFace
• slime - Zhipu AI
• prime-rl - Prime Intellect
• ROLL - Alibaba
• Nemo-RL - NVIDIA
• AReaL - Ant Research
• SkyRL - UC Berkeley
• open-instruct - Allen AI
• torchtune - PyTorch
Any I am missing? Which do you

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models

"we replace reward guided test-time noise optimization in diffusion models with a Noise Hypernetwork that modulates initial input noise."

"We show that our approach recovers a substantial portion of the

AI at Meta (@aiatmeta)'s Twitter Profile Photo

Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense
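
"Single frozen vision backbone" has a concrete operational meaning: the SSL features are extracted once with no gradient updates, and only a light task head is trained. A minimal linear-probe sketch in PyTorch (hypothetical names, shown for the pooled-classification case):

import torch
import torch.nn as nn

def probe_step(backbone, head, optimizer, images, labels):
    with torch.no_grad():             # frozen backbone: no gradients here
        feats = backbone(images)      # pooled SSL features, (batch, dim)
    loss = nn.functional.cross_entropy(head(feats), labels)
    optimizer.zero_grad()
    loss.backward()                   # only the task head is updated
    optimizer.step()
    return loss.item()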

Unitree (@unitreerobotics)'s Twitter Profile Photo

Unitree G1 has mastered more quirky skills 🤩
Unitree G1 has learned the "Anti-Gravity" mode: stability is greatly improved under any action sequence, and even if it falls, it can quickly get back up.

Qwen (@alibaba_qwen)'s Twitter Profile Photo

🔥 Qwen-Image-Edit-2509 IS LIVE — and it’s a GAME CHANGER. 🔥

We didn’t just upgrade it. We rebuilt it for creators, designers, and AI tinkerers who demand pixel-perfect control.

✅ Multi-Image Editing? YES.
Drag in “person + product” or “person + scene” — it blends them like

Qwen (@alibaba_qwen)'s Twitter Profile Photo

🚀 We're thrilled to unveil Qwen3-VL — the most powerful vision-language model in the Qwen series yet!

🔥 The flagship model Qwen3-VL-235B-A22B is now open-sourced and available in both Instruct and Thinking versions:
✅ Instruct outperforms Gemini 2.5 Pro on key vision