Elizabeth Hall (@vision_beth)'s Twitter Profile
Elizabeth Hall

@vision_beth

grad @ucdavis studying human vision and scene representations.

ID: 925335831396700160

Link: http://elizabethhhall.com · Joined: 31-10-2017 12:17:08

508 Tweets

433 Followers

411 Following

Dr. Mar Gonzalez-Franco (@twi_mar) 's Twitter Profile Photo

With Karan Ahuja (embedded in the lab), we worked on multidevice + sensor fusion (his expertise!): research.google/pubs/intent-dr… And Eric J Gonzalez built a whole tool for multidevice prototyping, which we open-sourced: github.com/google/xdtk. A space where Andrea Colaço contributed a lot too!

Yalda Mohsenzadeh (@yalda_mhz) 's Twitter Profile Photo

We have two presentations @NeurIPS @UniReps workshop tomorrow: 1) Willow Han will present: openreview.net/forum?id=t4CnK…, and 2) Rouzbeh Meshkinnejad will present: openreview.net/forum?id=fS41j…

Qi Wu (@wilson_over) 's Twitter Profile Photo

Say goodbye to perfect pinhole assumptions! Excited to introduce 3DGUT, a Gaussian Splatting formulation that unlocks support for distorted cameras, including time-dependent effects like rolling shutter, while maintaining the benefits of rasterization and rendering at >250 FPS. 🧵
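Assuming the "UT" in 3DGUT stands for the unscented transform, the core trick is to stop linearizing the projection (which only really works for pinhole cameras) and instead push sigma points of each 3D Gaussian through an arbitrary, possibly distorted camera model, then refit a 2D Gaussian. The sketch below illustrates only that general idea, not the paper's formulation; the distorted pinhole model and all its parameters are hypothetical placeholders.

```python
import numpy as np

def project_gaussian_unscented(mu, cov, project, kappa=1.0):
    """Push a 3D Gaussian (mu, cov) through a nonlinear camera projection with
    the unscented transform: sample sigma points, project each one, and
    re-estimate the 2D mean and covariance from the projected points."""
    n = mu.shape[0]                                   # dimensionality (3)
    L = np.linalg.cholesky((n + kappa) * cov)         # scaled matrix square root
    pts = [mu] + [mu + L[:, i] for i in range(n)] + [mu - L[:, i] for i in range(n)]
    weights = np.array([kappa / (n + kappa)] + [1.0 / (2 * (n + kappa))] * (2 * n))
    proj = np.array([project(p) for p in pts])        # each sigma point -> 2D pixel
    mean2d = weights @ proj
    diff = proj - mean2d
    cov2d = (weights[:, None] * diff).T @ diff
    return mean2d, cov2d

# Hypothetical distorted camera: perspective division plus radial distortion.
def distorted_pinhole(p, f=500.0, k1=-0.2, cx=320.0, cy=240.0):
    x, y = p[0] / p[2], p[1] / p[2]
    d = 1.0 + k1 * (x * x + y * y)
    return np.array([f * d * x + cx, f * d * y + cy])

mu = np.array([0.3, -0.1, 4.0])
cov = np.diag([0.02, 0.02, 0.05])
print(project_gaussian_unscented(mu, cov, distorted_pinhole))
```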

Bingyi Kang (@bingyikang) 's Twitter Profile Photo

Want to use Depth Anything, but need metric depth rather than relative depth? Thrilled to introduce Prompt Depth Anything, a new paradigm for accurate metric depth estimation with up to 4K resolution. 👉Key Message: Depth foundation models like DA have already internalized rich
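The tweet is cut off before the details, and the actual prompting presumably happens inside the network, but the relative-vs-metric gap it targets can be illustrated with a classical baseline: fit a scale and shift that maps the relative depth map onto a handful of metric measurements (e.g., sparse LiDAR returns). The sketch below is only that baseline, with synthetic data standing in for a real depth prompt.

```python
import numpy as np

def align_relative_to_metric(rel_depth, metric_samples, sample_idx):
    """Least-squares scale/shift fit mapping a relative depth map onto sparse
    metric measurements. A classical baseline for illustration only, not the
    prompting mechanism of Prompt Depth Anything."""
    r = rel_depth.ravel()[sample_idx]                  # relative depth at prompt pixels
    A = np.stack([r, np.ones_like(r)], axis=1)         # solve metric ≈ s * r + t
    (s, t), *_ = np.linalg.lstsq(A, metric_samples, rcond=None)
    return s * rel_depth + t

# Toy example: a synthetic relative map and 5 hypothetical metric samples.
rel = np.random.rand(480, 640)
idx = np.random.choice(rel.size, size=5, replace=False)
metric = 2.0 * rel.ravel()[idx] + 0.5                   # pretend ground truth
print(align_relative_to_metric(rel, metric, idx)[:2, :2])
```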

Moataz Assem (@moatazassem) 's Twitter Profile Photo

New preprint! Category-biased patches encircle domain-general brain regions in the human lateral prefrontal cortex doi.org/10.1101/2025.0…

Cambria Revsine (@crevsine) 's Twitter Profile Photo

New paper out in Nature Human Behaviour! In it, Wilma Bainbridge and I find that participants tend to remember and forget the same speakers' voices, regardless of speech content. We also predict the memorability of voices from their low-level features: nature.com/articles/s4156…

Martin Hebart (@martin_hebart) 's Twitter Profile Photo

We make about 3-4 fast eye movements a second, yet our world appears stable. How is this possible? In a preprint led by Luca Kämmer, we test the intriguing idea that anticipatory signals in the fovea may explain visual stability. biorxiv.org/content/10.110…

Thomas Fel (@napoolar) 's Twitter Profile Photo

Train your vision SAE on Monday, then again on Tuesday, and you'll find only about 30% of the learned concepts match. ⚓ We propose Archetypal SAE, which anchors concepts in the real data’s convex hull, delivering stable and consistent dictionaries. arxiv.org/pdf/2502.12892…

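The anchoring idea is easy to sketch: instead of letting dictionary atoms drift anywhere in activation space, each atom is a convex combination of (a subset of) real data points, so it cannot leave the data's convex hull. The snippet below is a loose PyTorch sketch of that constraint, not the released Archetypal SAE code; the anchor set, atom count, and toy activations are made up.

```python
import torch, torch.nn as nn, torch.nn.functional as F

class ArchetypalDictionary(nn.Module):
    """Dictionary whose atoms are convex combinations of candidate data points,
    so every learned concept stays inside the data's convex hull."""
    def __init__(self, anchors, n_atoms):
        super().__init__()
        self.register_buffer("anchors", anchors)       # (m, d) candidate data points
        self.logits = nn.Parameter(torch.zeros(n_atoms, anchors.shape[0]))

    def atoms(self):
        w = F.softmax(self.logits, dim=-1)              # rows are non-negative and sum to 1
        return w @ self.anchors                         # convex combinations -> (k, d)

# Toy usage: reconstruct activations from non-negative codes over hull-anchored atoms.
x = torch.randn(256, 64)                                # pretend ViT activations
dic = ArchetypalDictionary(anchors=x[:128], n_atoms=32)
codes = torch.relu(torch.randn(256, 32, requires_grad=True))
recon = codes @ dic.atoms()
loss = F.mse_loss(recon, x)
loss.backward()
```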
Martin Hebart (@martin_hebart) 's Twitter Profile Photo

I wrote a commentary on a very nice research paper that just appeared in Brain by Selma Lugtmeijer (she/her), Aleksandra Sobolewska, and Steven Scholte. Spoiler: it's about modularity in mid-level vision. Here is the original paper: doi.org/10.1093/brain/… And here is my commentary: doi.org/10.1093/brain/…

Baifeng (@baifeng_shi) 's Twitter Profile Photo

Next-gen vision pre-trained models shouldn’t be short-sighted. Humans can easily perceive 10K x 10K resolution. But today’s top vision models—like SigLIP and DINOv2—are still pre-trained at merely hundreds by hundreds of pixels, bottlenecking their real-world usage. Today, we

Zhiqiu Lin (@zhiqiulin) 's Twitter Profile Photo

📷 Can AI understand camera motion like a cinematographer? Meet CameraBench: a large-scale, expert-annotated dataset for understanding camera motion geometry (e.g., trajectories) and semantics (e.g., scene contexts) in any video – films, games, drone shots, vlogs, etc. Links

Aaron Hertzmann (@aaronhertzmann) 's Twitter Profile Photo

I'm excited to announce publication of our new paper that can help answer age-old questions of perspective in art history and #visionscience. nature.com/articles/s4159…

Jonathan Lorraine (@jonlorraine9) 's Twitter Profile Photo

🔊New NVIDIA paper: Audio-SDS🔊 We repurpose Score Distillation Sampling (SDS) for audio, turning any pretrained audio diffusion model into a tool for diverse tasks, including source separation, impact synthesis & more. 🎧 Demos, audio examples, paper: research.nvidia.com/labs/toronto-a…
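For reference, SDS itself is modality-agnostic: noise the current signal, ask a frozen diffusion model to predict that noise, and nudge the signal toward the model's prediction. The sketch below shows only that generic recipe, with a dummy stand-in denoiser and a raw-waveform parameterization; it is not the Audio-SDS implementation, and the noising schedule is deliberately simplified.

```python
import torch

def sds_grad(x, denoise, t, sigma, weight=1.0):
    """Generic Score Distillation Sampling gradient: noise x, get the frozen
    model's noise prediction, and return the residual as the update direction.
    `denoise` is any callable predicting the noise (a placeholder here)."""
    eps = torch.randn_like(x)
    x_t = x + sigma * eps                       # simplified forward noising at level sigma
    with torch.no_grad():
        eps_hat = denoise(x_t, t)               # frozen diffusion model's prediction
    return weight * (eps_hat - eps)             # SDS gradient w.r.t. x (Jacobian omitted)

# Toy usage: optimize a raw waveform with a dummy "denoiser" standing in for a
# pretrained audio diffusion model.
wave = torch.zeros(1, 16000, requires_grad=True)
dummy_denoiser = lambda x_t, t: 0.1 * x_t
opt = torch.optim.Adam([wave], lr=1e-2)
for step in range(3):
    g = sds_grad(wave, dummy_denoiser, t=0.5, sigma=0.8)
    loss = (g.detach() * wave).sum()            # surrogate loss whose gradient w.r.t. wave is g
    opt.zero_grad()
    loss.backward()
    opt.step()
```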

Gordon Wetzstein (@gordonwetzstein) 's Twitter Profile Photo

Excited to share our new #SIGGRAPH2025 paper! In this work, we show how to combine Gaussian splatting and computer-generated holography using Gaussian Wave Splatting. This enables photorealistic 3D holograms for emerging holographic VR/AR displays. 1/4

S. Lester Li (@sizhe_lester_li) 's Twitter Profile Photo

Now in Nature! 🚀 Our method learns a controllable 3D model of any robot from vision, enabling single-camera closed-loop control at test time! This includes robots previously uncontrollable, soft, and bio-inspired, potentially lowering the barrier of entry to automation! Paper:

AI at Meta (@aiatmeta) 's Twitter Profile Photo

🚀New from Meta FAIR: today we’re introducing Seamless Interaction, a research project dedicated to modeling interpersonal dynamics. The project features a family of audiovisual behavioral models, developed in collaboration with Meta’s Codec Avatars lab + Core AI lab, that

Yuki Kamitani (@ykamit) 's Twitter Profile Photo

Our new study in Nature Computational Science, led by Haibao Wang, presents a neural code converter aligning brain activity across individuals & scanners without shared stimuli by minimizing content loss, paving the way for scalable decoding and cross-site data analysis. nature.com/articles/s4358…
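A rough reading of the "content loss" idea (hedged, since the tweet gives no details): train a linear converter from one subject's voxel space to another's so that the target subject's frozen feature decoder, applied to the converted activity, reproduces the DNN features of whatever stimuli the source subject actually saw, which removes the need for shared stimuli. The sketch below uses placeholder dimensions, a stand-in decoder, and random tensors just to show the shape of that training loop.

```python
import torch, torch.nn as nn

# Hypothetical sizes: subject A voxels, subject B voxels, DNN feature dimension.
n_voxels_a, n_voxels_b, n_feat = 500, 600, 128

converter = nn.Linear(n_voxels_a, n_voxels_b, bias=False)   # learned A -> B mapping
decoder_b = nn.Linear(n_voxels_b, n_feat, bias=False)        # stand-in for B's trained feature decoder
for p in decoder_b.parameters():
    p.requires_grad_(False)                                   # decoder stays frozen

opt = torch.optim.Adam(converter.parameters(), lr=1e-3)
x_a = torch.randn(64, n_voxels_a)        # subject A's activity for A's own stimuli
feat = torch.randn(64, n_feat)           # DNN features of those stimuli (no shared stimuli needed)
for step in range(5):
    pred = decoder_b(converter(x_a))     # convert to B's space, then decode features
    loss = nn.functional.mse_loss(pred, feat)   # "content loss" in feature space
    opt.zero_grad(); loss.backward(); opt.step()
```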