Kai Zhang (@kaizhang9546)'s Twitter Profile
Kai Zhang

@kaizhang9546

Training image/video GenAI foundation models at Adobe. Opinions are my own.

ID: 1346885093361471489

Joined: 06-01-2021 18:23:38

294 Tweets

1.1K Followers

416 Following

Noam Brown (@polynoamial)

Our new OpenAI o3 and o4-mini models further confirm that scaling inference improves intelligence, and that scaling RL shifts up the whole compute vs. intelligence curve. There is still a lot of room to scale both of these further.

Wan (@alibaba_wan)

1/3 🚀Thrilled to introduce Wan2.1-FLF2V-14B - our first 14B-parameter large model for First-Last-Frame to video generation! Open-source, open-source, open-source! Empowering digital artists with unprecedented efficiency and creative flexibility. #wan #AIGC #alart

Cognition (@cognition_labs)

Project DeepWiki
Up-to-date documentation you can talk to, for every repo in the world. Think Deep Research for GitHub – powered by Devin. It’s free for open-source, no sign-up! Visit deepwiki.com or just swap github → deepwiki on any repo URL:

Awni Hannun (@awnihannun)

Qwen3 and Qwen3 MoEs are already supported in the latest mlx-lm thanks to Prince Canuma and Gökdeniz Gülmez
pip install -U mlx-lm
Awesome that Qwen ships a model for every device:
- iPhone: 0.6B, 4B
- Macbook: 8B, 30B, 3B/30B MoE
- M2, M3 Ultra: 22B/235B MoE

Junyang Lin (@justinlin610)

Qwen3 is finally out! It really takes some time for our guys to figure out methods to solve some problems that are not fancy. How to scale RL with stable training, how to balance data from different domains, how to increase the support of more languages with performance

Matt Deitke (@mattdeitke)

I’m very excited to introduce Vy, the AI that sees and acts on your computer! It’s a first glimpse of what we’ve been working on at Vercept! Early computers trapped the world's best experts in low-level tasks–loading code, managing memory, fighting errors. Progress

Runway (@runwayml)

Today we are releasing Gen-4 References to all paid plans. Now anyone can generate consistent characters, locations and more. With References, you can use photos, generated images, 3D models or selfies to place yourself or others into any scene you can imagine. More examples

Hanwen Jiang (@hanwenjiang1)

Supervised learning has held 3D Vision back for too long. Meet RayZer, a self-supervised 3D model trained with zero 3D labels:
❌ No supervision of camera & geometry
✅ Just RGB images
And the wild part? RayZer outperforms supervised methods (as 3D labels from COLMAP are noisy)

Chris Rockwell (@_crockwell)

Excited to share ☀️Lightspeed⚡, a photorealistic, synthetic dataset with ground truth pose used for benchmarking alongside DynPose-100K! Now available for download: huggingface.co/datasets/nvidi… Paper accepted to #CVPR2025: arxiv.org/abs/2504.17788

elvis (@omarsar0)

AgenticSeek: Private, Local Manus Alternative This is worth checking. It's a local alternative to Manus AI that can autonomously browse the web, write code, and plan tasks. It's built for local reasoning models, runs on your hardware, and keeps all data on your device.

AK (@_akhaliq)

Direct3D-S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention
high resolution 3D generation from image

Tianyuan Zhang (@tianyuanzhang99)

Bored of linear recurrent memories (e.g., linear attention) and want a scalable, nonlinear alternative? Our new paper “Test-Time Training Done Right” proposes LaCT (Large Chunk Test-Time Training), a highly efficient, massively scalable nonlinear memory with: 💡 Pure PyTorch

Zhao Dong (@flycooler_zd)

🚀 Excited to announce our CVPR 2025 Workshop:
3D Digital Twin: Progress, Challenges, and Future Directions
🗓 June 12, 2025 · 9:00 AM–5:00 PM
📢 Incredible lineup: Richard Newcombe, Andrea Vedaldi Visual Geometry Group (VGG), Hao (Richard) Zhang, Qianqian Wang, Dr. Xiaoshuai Zhang Hillbot,

Two Minute Papers (@twominutepapers)

NVIDIA’s AI watched 150,000 videos… and learned to relight scenes incredibly well! No game engine. No 3D software. And it has an amazing cat demo. 🐱💡 Hold on to your papers! Full video: youtube.com/watch?v=yRk6vG…

Jason Wei (@_jasonwei)

The most rewarding thing about working in the office on nights and weekends is not the actual work you get done, but the spontaneous conversations with other people who are always working. They’re the people who tend to do big things and will become your most successful friends

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)

The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements "To evaluate the ability of AI agents to reproduce results in an active research area, we introduce the Automated LLM Speedrunning Benchmark, leveraging the research community’s contributions on the

Jason Weston (@jaseweston)

🌉 Bridging Offline & Online RL for LLMs 🌉
📝: arxiv.org/abs/2506.21495
New paper shows on verifiable & non-verifiable tasks:
- Online DPO & GRPO give similar performance.
- Semi-online (iterative) DPO with sync every s steps (more efficient!) works very well also.
- Offline DPO
