joao carreira (@joaocarreira) 's Twitter Profile
joao carreira

@joaocarreira

Research Scientist at Google DeepMind

ID: 22306308

Link: https://scholar.google.com/citations?user=IUZ-7_cAAAAJ
Joined: 28-02-2009 23:10:05

136 Tweets

1.1K Followers

277 Following

Shashank (@shawshank_v) 's Twitter Profile Photo

Delighted to host the 1st edition of our tutorial "Time is precious: Self-Supervised Learning Beyond Images" at the European Conference on Computer Vision with mrz.salehi and Yuki. We have an exciting lineup of speakers too: joao carreira, Ishan Misra, and Emin Orhan. More details coming soon... #ECCV2024

Carl Doersch (@carldoersch) 's Twitter Profile Photo

We present a new SOTA on point tracking, via self-supervised training on real, unlabeled videos! BootsTAPIR achieves 67.4% AJ on TAP-Vid DAVIS with minimal architecture changes, and tracks 10K points on a 50-frame video in 6 seconds. PyTorch & JAX implementations on GitHub. bootstap.github.io
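For context on the headline number: Average Jaccard (AJ) is the TAP-Vid benchmark's main metric, combining position accuracy and occlusion prediction across a sweep of distance thresholds. Below is a minimal sketch of it, assuming raw arrays of tracks and visibility flags; the official TAP-Vid evaluation additionally resizes videos to 256x256 and averages per query point, so treat this as illustrative only.

```python
import numpy as np

def average_jaccard(gt_xy, gt_vis, pred_xy, pred_vis,
                    thresholds=(1, 2, 4, 8, 16)):
    """Simplified Average Jaccard (AJ) for point tracking.

    gt_xy, pred_xy: (num_points, num_frames, 2) pixel coordinates.
    gt_vis, pred_vis: (num_points, num_frames) boolean visibility flags.
    """
    dist = np.linalg.norm(gt_xy - pred_xy, axis=-1)  # (points, frames)
    jaccards = []
    for thr in thresholds:
        close = dist < thr
        tp = np.sum(gt_vis & pred_vis & close)     # visible and within threshold
        fp = np.sum(pred_vis & ~(gt_vis & close))  # predicted visible but wrong
        fn = np.sum(gt_vis & ~(pred_vis & close))  # visible ground truth missed
        jaccards.append(tp / (tp + fp + fn))
    return float(np.mean(jaccards))
```

A Jaccard score is computed at each threshold as TP / (TP + FP + FN), then averaged, so both localization errors and visibility mistakes pull the score down.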

joao carreira (@joaocarreira) 's Twitter Profile Photo

The 2nd Perception Test Challenge is now on -- with a workshop happening at ECCV in Milan later in the year. See all about it at ptchallenge-workshop.github.io and try out your top general perception models on it. Besides the original 6 tasks, we'll have a new hour-long videoQA track.

Shiry Ginosar (@shiryginosar) 's Twitter Profile Photo

Join us next week at our second (high-level) intelligence workshop at the Simons Institute for the Theory of Computing! Schedule: simons.berkeley.edu/workshops/unde… Register online for both in-person and streaming attendance. Yet another FANTASTIC lineup of speakers:

Skanda (@skandakoppula) 's Twitter Profile Photo

We're excited to release TAPVid-3D: an evaluation benchmark of 4,000+ real-world videos and 2.1 million metric 3D point trajectories, for the task of Tracking Any Point in 3D!
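"Metric" here means the trajectories live in real-world units in camera space. As a hedged illustration of what that implies (not the benchmark's own tooling), standard pinhole unprojection lifts a 2D pixel track with per-frame depth to metric 3D; the function and argument names below are hypothetical.

```python
import numpy as np

def unproject_track(track_uv, depth, intrinsics):
    """Lift a 2D pixel track to metric 3D camera coordinates.

    track_uv:   (num_frames, 2) pixel coordinates (u, v).
    depth:      (num_frames,) metric depth at each tracked pixel.
    intrinsics: (3, 3) pinhole camera matrix K.
    """
    fx, fy = intrinsics[0, 0], intrinsics[1, 1]
    cx, cy = intrinsics[0, 2], intrinsics[1, 2]
    x = (track_uv[:, 0] - cx) / fx * depth
    y = (track_uv[:, 1] - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1)  # (num_frames, 3)
```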

Dima Damen (@dimadamen) 's Twitter Profile Photo

Time to challenge VLMs? Fed up with benchmarks that claim long-video reasoning but only need a few seconds? Try out the Hour-Long VQA PerceptionTest Challenge at the European Conference on Computer Vision (#ECCV2024), by Google DeepMind. Q: How many dogs did the person encounter in a 1-hour walking video? youtu.be/kefMfeuBRsk

Sjoerd van Steenkiste (@vansteenkiste_s) 's Twitter Profile Photo

Excited to announce MooG for learning video representations. MooG allows tokens to move “off-the-grid” enabling better representation of scene elements, even as they move across the image plane through time. 📜arxiv.org/abs/2411.05927 🌐moog-paper.github.io
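For intuition about "off-the-grid" tokens: instead of one token per fixed image patch, a set of latent state tokens can read from each frame via cross-attention and carry their state forward, binding to scene content wherever it moves. The sketch below is a deliberately minimal toy of that idea, not MooG's actual architecture; all names and shapes are made up.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def update_state_tokens(state, frame_feats, w_q, w_k, w_v):
    """One recurrent cross-attention read: state tokens attend to frame features.

    state:       (num_tokens, dim)  -- latent tokens, not tied to pixel positions.
    frame_feats: (num_patches, dim) -- encoder features for the current frame.
    w_q, w_k, w_v: (dim, dim) projection matrices.
    """
    q = state @ w_q
    k = frame_feats @ w_k
    v = frame_feats @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (tokens, patches)
    return state + attn @ v  # residual update carried to the next frame

# Rolling the update over frames lets each token follow the content it
# binds to, wherever it moves in the image plane.
rng = np.random.default_rng(0)
dim, state = 64, rng.normal(size=(16, 64))
w_q, w_k, w_v = (rng.normal(size=(64, 64)) * 0.1 for _ in range(3))
for _ in range(8):  # 8 video frames
    frame_feats = rng.normal(size=(196, dim))
    state = update_state_tokens(state, frame_feats, w_q, w_k, w_v)
```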

Tengda Han (@tengdahan) 's Twitter Profile Photo

We are looking for a student researcher to work on video understanding plus 3D at Google DeepMind London. DM/email me, or pass this along to someone you feel may be a good fit!

joao carreira (@joaocarreira) 's Twitter Profile Photo

Individual frames from generative video models tend to look reasonable; capturing actions happening realistically over time ... that is way harder. TRAJAN is a new evaluation procedure to better guide progress in this (hot) area.

Sangwoo Mo (@sangwoomo) 's Twitter Profile Photo

Can scaling data and models alone solve computer vision? 🤔 Join us at the SP4V Workshop at #ICCV2025 in Hawaii to explore this question! 🎤 Speakers: Danfei Xu, joao carreira, Jiajun Wu, Kristen Grauman, Saining Xie, Vincent Sitzmann 🔗 sp4v.github.io

Yana Hasson (@yanahasson) 's Twitter Profile Photo

Thrilled to share our latest work on SciVid, to appear at #ICCV2025! 🎉 SciVid offers cross-domain evaluation of video models in scientific applications, including medical CV, animal behavior, & weather forecasting 🧪🌍📽️🪰🐭🫀🌦️ #AI4Science #FoundationModel #CV4Science [1/5]🧵

joao carreira (@joaocarreira) 's Twitter Profile Photo

3rd edition of the challenge is now on, with exciting new tasks and guest tracks. Back during COVID, when we held the first workshop about the Perception Test (computerperception.github.io), some of us were afraid the benchmark was too difficult; now we've just made it harder.

joao carreira (@joaocarreira) 's Twitter Profile Photo

Human vision is thought to have critical periods of development, after which plasticity is lost (e.g. children born with cataracts who are not treated early struggle to ever regain full vision). Here we propose a related principle to achieve simple non-collapsing latent learning.
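The tweet doesn't spell out the mechanism, so the following is only a speculative toy reading of the stated principle: give the target pathway of a latent-predictive learner plasticity during an early "critical period", then freeze it, so later training chases a fixed non-degenerate target rather than a moving one that admits collapsed (constant) solutions. Every name and constant here is an assumption, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(w, x):
    return np.tanh(x @ w)  # toy one-layer encoder

# Toy latent-predictive setup: an online encoder is trained to match a
# target encoder on the same inputs. If both adapt forever, all-constant
# outputs are a trivial (collapsed) solution; ending the target's
# "critical period" early anchors the targets and blocks that solution.
dim_in, dim_out, critical_period = 32, 16, 200
w_online = rng.normal(size=(dim_in, dim_out)) * 0.1
w_target = w_online.copy()

for step in range(1000):
    x = rng.normal(size=(64, dim_in))
    z_online, z_target = encode(w_online, x), encode(w_target, x)
    # Gradient of 0.5 * ||z_online - z_target||^2 w.r.t. w_online.
    grad = x.T @ ((z_online - z_target) * (1 - z_online**2)) / len(x)
    w_online -= 0.05 * grad
    if step < critical_period:  # plasticity only during the critical period
        w_target = 0.99 * w_target + 0.01 * w_online  # slow EMA update
    # after the critical period, w_target is frozen: plasticity is lost
```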