Yonglong Tian (@yonglongt) 's Twitter Profile
Yonglong Tian

@yonglongt

Research Scientist @OpenAI. Previously @GoogleDeepMind, @MIT. Opinions are my own.

ID: 1139739755510243328

Joined: 15-06-2019 03:41:47

91 Tweets

2.2K Followers

239 Following

Yonglong Tian (@yonglongt) 's Twitter Profile Photo

Today marks the official ending of my PhD life at MIT. So grateful for this journey.

Coincidentally, we put a paper on arXiv today: arxiv.org/abs/2306.00984. It shows the potential of learning from synthetic data.

This coincidence nicely concludes my PhD life in an academic manner.
Dilip Krishnan (@dilipkay) 's Twitter Profile Photo

New paper!! We show that pre-training language-image models *solely* on synthetic images from Stable Diffusion can outperform training on real images!!

Work done with Yonglong Tian (Google), Huiwen Chang (Google), Phillip Isola (MIT) and Lijie Fan (MIT)!!
Jing Shao (@amanda_jshao) 's Twitter Profile Photo

🎉(1/6) Exciting News: 🐑LAMM is online!

⭐️Features:
① 200k 2D/3D instruction-tuning dataset
② Benchmark on 14 high-level 2D/3D vision tasks
③ A preliminary but promising framework, trainable on only 4×A100s

📚Paper: arxiv.org/pdf/2306.06687…
⌨️Code: github.com/OpenLAMM/LAMM
Sangnie Bhardwaj (@sangnie) 's Twitter Profile Photo

Join us at the WiML Un-Workshop breakout session on "Role of Mentorship and Networking"! Don't miss the chance to talk with leading researchers Samy Bengio, Susan Zhang, Hugo Larochelle, Sharon Y. Li, Pablo Samuel Castro, John Langford, and others! #ICML2023 WiML

Tongzhou Wang (@ssnl_tz) 's Twitter Profile Photo

Quasimetric RL code is now on GitHub: github.com/quasimetric-le… Instead of deleting 80% of the dev repo, I rewrote the algorithm in a hopefully cleaner way. But going through the old repo is fun. So many half-explored interesting ideas in the remaining 80%. RL=geometry

Lijie Fan (@lijie_fan) 's Twitter Profile Photo

🚀 Is the future of vision models synthetic? Introducing SynCLR: our new pipeline leveraging LLMs & text-to-image models to train vision models with only synthetic data!
🔥 Outperforming SOTAs like DINOv2 & CLIP on real images! SynCLR excels in fine-grained classification &
Yonglong Tian (@yonglongt) 's Twitter Profile Photo

HNY! Excited to share SynCLR, which rivals CLIP and DINOv2 but uses purely synthetic data.

The interesting part: it can outperform models (e.g. CLIP) trained directly on LAION-2B, the dataset used to train SD 1.5, which in turn we used to generate our images.
arxiv.org/abs/2312.17742
Phillip Isola (@phillip_isola) 's Twitter Profile Photo

Our computer vision textbook is released!

Foundations of Computer Vision
with Antonio Torralba and Bill Freeman
mitpress.mit.edu/9780262048972/…

It’s been in the works for >10 years. Covers everything from linear filters and camera optics to diffusion models and radiance fields.

1/4
Jiawei Yang (@jiaweiyang118) 's Twitter Profile Photo

Very excited to get this out: “DVT: Denoising Vision Transformers”. We've identified and combated those annoying positional patterns in many ViTs. Our approach denoises them, achieving SOTA results and stunning visualizations! Learn more on our website: jiawei-yang.github.io/DenoisingViT/

Lijie Fan (@lijie_fan) 's Twitter Profile Photo

🚀 Excited to share our latest work Fluid!

We've developed a scalable autoregressive text-to-image model without VQ. We trained the model up to 10B parameters, achieving state-of-the-art COCO FID and GenEval scores. 🔥
Check it out: arxiv.org/pdf/2410.13863

🙏 Shout out to
Shobhita Sundaram (@shobsund) 's Twitter Profile Photo

Personal vision tasks, like detecting *your mug*, are hard; they're data-scarce and fine-grained.

In our new paper, we show you can adapt general-purpose vision models to these tasks from just three photos!

📝: arxiv.org/abs/2412.16156
💻: github.com/ssundaram21/pe…

(1/n)
Yonglong Tian (@yonglongt) 's Twitter Profile Photo

GPT-5 dropped!

For *multimodal*, the nice thing is that it uses tools far more efficiently than o3 (much better than the accuracy numbers shown here suggest), making it both better and faster. Ji Lin's efforts are baked in.