Karan Desai (KD) (@kdexd) Twitter Tweets • TwiCopy

Karan Desai (KD)

a year ago

Decisions for #CVPR2024 are out. One of my submissions was rejected with SA,WA,WR. WR reviewer (and AC) show signs of gatekeeping: "X was popular in the past. Does the community need a better version of X (= our work)?" How can one reviewer decide what our huge community needs?


Chris Rockwell

@_crockwell

10 months ago

📢 Presenting FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation #CVPR2024 FAR builds upon complementary Solver and Learning-Based works yielding accurate *and* robust pose! crockwell.github.io/far/


Zhuang Liu

@liuzhuang1234

10 months ago

Very excited to share one of the most interesting projects I've ever worked on, but first, a small game: Here are 15 images from three of the largest and most diverse modern image datasets: YFCC100M, CC12M and DataComp-1B. Can you guess which images are from which datasets?


Dmytro Mishkin 🇺🇦

@ducha_aiki

10 months ago

Benchmarking Object Detectors with COCO: A New Path Forward. Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai. tl;dr: SAM-refined masks for MS-COCO -> re-evaluated benchmark -> all methods score higher. arxiv.org/abs/2403.18819…


Prafull Sharma

@prafull7

7 months ago

Justice for Vegans #CVPR2025 🤣



Saining Xie

@sainingxie

7 months ago

Introducing Cambrian-1, a fully open project from our group at NYU. The world doesn't need another MLLM to rival GPT-4V. Cambrian is unique as a vision-centric exploration & here's why I think it's time to shift focus from scaling LLMs to enhancing visual representations.🧵[1/n]


Karan Desai (KD)

@kdexd

6 months ago

Hey European Conference on Computer Vision #ECCV2026 , you mention on your website that a paper should be covered by a full in-person registration. But the reg portal shows this note (excludes NO VIRTUAL). It slipped my mind and I accidentally registered virtual. Pls fix this! Also what do I do?



Karan Desai (KD)

@kdexd

6 months ago

Yep, that's me this time. Got to pay more for leading a last author project, I guess (I am not a faculty with grants, project not done in industry) /shrug


Andrej Karpathy

@karpathy

6 months ago

⚡️ Excited to share that I am starting an AI+Education company called Eureka Labs. The announcement: --- We are Eureka Labs and we are building a new kind of school that is AI native. How can we approach an ideal experience for learning something new? For example, in the case


Soumith Chintala

@soumithchintala

6 months ago

I'm giving the opening Keynote at ICML 2024 on Tuesday the 23rd @ 9:30am CEST. I'll try to empower folks to get Open Science back on track -- the free discussion of ideas is such an important aspect of AI progress, and we've been losing track. This is a complex topic, and I won't


AI at Meta

@aiatmeta

6 months ago

Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context


Karan Desai (KD)

@kdexd

4 months ago

Hello, World Labs! 😍🎉 Excited to share that I have been building at World Labs after finishing my PhD! At World Labs, we are committed to building AI systems with a high level of spatial intelligence. All our lives, we humans constantly perceive and interact with the 3D


Keunhong Park

@keunhongp

4 months ago

Happy to announce our company has come out of stealth! We are building spatial intelligence with some of the most talented researchers and engineers -- Luyang Zhu, Eric Chan, Karan Desai (KD), Mohamed El Banani, Kyle Sargent, Chao-Yuan Wu, and many more. We are hiring so please reach out!


Jay Karhade

@jaykarhade

4 months ago

Summer Update: Had an incredible research internship at World Labs 🌎working towards #SpatialAI 🚀🚀 Next Up: PhD CMU Robotics Institute !



Sagar Vaze

@sagar_vaze

4 months ago

Today, we're announcing Pixtral 12B! It's our first vision model, and is really strong on the standard multimodal benchmarks, **without compromising abilities on your favourite text and reasoning tasks**. Blog: mistral.ai/news/pixtral-1… Le Chat 🐈: chat.mistral.ai/chat


Ai2

@allen_ai

4 months ago

Meet Molmo: a family of open, state-of-the-art multimodal AI models. Our best model outperforms proprietary systems, using 1000x less data. Molmo doesn't just understand multimodal data—it acts on it, enabling rich interactions in both the physical and virtual worlds. Try it


clem 🤗

@clementdelangue

3 months ago

IMO one of the reasons why big companies keep winning is that startups feel an artificial urge to compete with each other instead of helping each other, sometimes driven by their needlessly competitive startup investors. I might be naive but I believe the opportunities are big


Pascal Mettes

@pascalmettes

3 months ago

All vision-language models should have hyperbolic embeddings. Vision and language are incredibly hierarchical in nature! See below our latest work on hyperbolic vision-language models that exploit visual compositions through entailment:
