Howard Zhou (@howardzzh)'s Twitter Profile
Howard Zhou

@howardzzh

I'm a Principal Software Engineer and Engineering Director at Google DeepMind, interested in Computer Vision, Machine Learning problems, and Computer Graphics.

ID: 317254344

14-06-2011 17:22:16

14 Tweets

48 Followers

68 Following

Jon Barron (@jon_barron)

Training NeRFs per-scene is so 2020. Inspired by image based rendering, IBRNet does amortized inference for view synthesis by learning how to look at input images at render time. 15% drop in error, 80% fewer FLOPs than NeRF. Great work Qianqian Wang! ibrnet.github.io

Frank Dellaert (@fdellaert)

In anticipation of the Intl. Conf. on Computer Vision (#ICCV2021) this week, I rounded up all papers that use Neural Radiance Fields (NeRFs) represented in the main #ICCV2021 conference here (1/N): dellaert.github.io/NeRF21

Jeff Dean (@jeffdean)

New work from Google Research by @JHYUXM, Zirui Wang, Vijay Vasudevan, Legg Yeung, Mojtaba Seyedhosseini and Yonghui Wu: CoCa is a new way of combining image and text representations that achieves SOTA results on a large number of tasks of different kinds.

Frank Dellaert (@fdellaert)

Andrew Marmon and I rounded up all #CVPR2022 papers on NeRF/Neural Radiance Fields we could find in a new blog post here: dellaert.github.io/NeRF22/

Jason Baldridge (@jasonbaldridge)

We are excited to share our work on our Pathways Autoregressive Text-to-Image model, Parti! #Parti achieves high-fidelity photorealistic image generation and supports content-rich synthesis involving complex compositions and world knowledge. parti.research.google

AK (@_akhaliq)

Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use. From content moderation to wildlife conservation, the number of applications that require models to recognize nuanced or subjective visual concepts is growing.

André Araujo (@andrefaraujo)

Want some TIPS? Well, then check out “Text-Image Pretraining with Spatial awareness” :) TIPS is a general-purpose image-text encoder, for off-the-shelf dense and image-level prediction. Finally image-text pretraining with spatially-aware representations! arxiv.org/abs/2410.16512

André Araujo (@andrefaraujo)

Multimodal AI encoders often lack spatial understanding… but not anymore! Our #ICLR2025 TIPS model (Text-Image Pretraining with Spatial awareness) from Google DeepMind can help 💡🚀 Check out our strong & versatile image-text encoder 💪 Paper & code: arxiv.org/abs/2410.16512

lmarena.ai (formerly lmsys.org) (@lmarena_ai)

BREAKING: Gemini 2.5 Pro is now #1 on the Arena leaderboard - the largest score jump ever (+40 pts vs Grok-3/GPT-4.5)! 🏆 Tested under codename "nebula"🌌, Gemini 2.5 Pro ranked #1🥇 across ALL categories and UNIQUELY #1 in Math, Creative Writing, Instruction Following, Longer

Howard Zhou (@howardzzh)

Last Call: Learn GenAI and help us break the GUINNESS WORLD RECORDS™ for Largest Virtual AI Conference! Join Google & Kaggle's GenAI Intensive: No cost, live sessions, hands-on labs. Registration closes this Friday! #GenAI #GuinnessWorldRecords #Kaggle #GoogleAI