Karan Desai (KD) (@kdexd)'s Twitter Profile
Karan Desai (KD)

@kdexd

I fight the devil in the details
Current: Cool startup, Prev: CS PhD @ University of Michigan

ID: 841858076193955841

Link: https://kdexd.xyz
Joined: 15-03-2017 03:46:21

482 Tweets

2.2K Followers

443 Following

Karan Desai (KD) (@kdexd)

Decisions for #CVPR2024 are out. One of my submissions was rejected with SA, WA, WR. WR reviewer (and AC) show signs of gatekeeping: "X was popular in the past. Does the community need a better version of X (= our work)?" How can one reviewer decide what our huge community needs?

Chris Rockwell (@_crockwell)

📢 Presenting FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation #CVPR2024

FAR builds upon complementary Solver and Learning-Based works yielding accurate *and* robust pose!

crockwell.github.io/far/
Zhuang Liu (@liuzhuang1234)

Very excited to share one of the most interesting projects I've ever worked on, but first, a small game:

Here are 15 images from three of the largest and most diverse modern image datasets: YFCC100M, CC12M and DataComp-1B.

Can you guess which images are from which datasets?
Dmytro Mishkin 🇺🇦 (@ducha_aiki)

Benchmarking Object Detectors with COCO: A New Path Forward

Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai

tl;dr: SAM-refined masks for MS-CoCo -> re-evaluated benchmark -> all methods score higher.
arxiv.org/abs/2403.18819…
Saining Xie (@sainingxie)

Introducing Cambrian-1, a fully open project from our group at NYU. The world doesn't need another MLLM to rival GPT-4V. Cambrian is unique as a vision-centric exploration & here's why I think it's time to shift focus from scaling LLMs to enhancing visual representations. 🧵 [1/n]
Karan Desai (KD) (@kdexd)

Hey @eccvconf, you mention on your website that a paper should be covered by a full in-person registration. But the reg portal shows this note (excludes NO VIRTUAL). It slipped my mind and I accidentally registered virtual. Pls fix this! Also what do I do?
Karan Desai (KD) (@kdexd)

Yep, that's me this time. Got to pay more for leading a last author project, I guess (I am not a faculty with grants, project not done in industry) /shrug

Andrej Karpathy (@karpathy)

⚡️ Excited to share that I am starting an AI+Education company called Eureka Labs.
The announcement:

---
We are Eureka Labs and we are building a new kind of school that is AI native.

How can we approach an ideal experience for learning something new? For example, in the case
Soumith Chintala (@soumithchintala)

I'm giving the opening Keynote at ICML 2024 on Tuesday the 23rd @ 9:30am CEST.
I'll try to empower folks to get Open Science back on track -- the free discussion of ideas is such an important aspect of AI progress, and we've been losing track.
This is a complex topic, and I won't
AI at Meta (@aiatmeta)

Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we're releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context

Karan Desai (KD) (@kdexd)

Hello, World Labs! 🎉 Excited to share that I have been building at World Labs after finishing my PhD! At World Labs, we are committed to building AI systems with a high level of spatial intelligence. All our lives, we humans constantly perceive and interact with the 3D

Keunhong Park (@keunhongp)

Happy to announce our company has come out of stealth! We are building spatial intelligence with some of the most talented researchers and engineers -- Luyang Zhu, Eric Chan, Karan Desai (KD), Mohamed El Banani, Kyle Sargent, Chao-Yuan Wu, and many more. We are hiring so please reach out!

Sagar Vaze (@sagar_vaze)

Today, we're announcing Pixtral 12B!

It's our first vision model, and is really strong on the standard multimodal benchmarks, **without compromising abilities on your favourite text and reasoning tasks**.

Blog: mistral.ai/news/pixtral-1…
Le Chat 🐈: chat.mistral.ai/chat
Ai2 (@allen_ai)

Meet Molmo: a family of open, state-of-the-art multimodal AI models. Our best model outperforms proprietary systems, using 1000x less data. Molmo doesn't just understand multimodal data; it acts on it, enabling rich interactions in both the physical and virtual worlds. Try it

clem 🤗 (@clementdelangue)

IMO one of the reasons why big companies keep winning is that startups feel an artificial urge to compete with each other instead of helping each other, sometimes driven by their needlessly competitive startup investors. I might be naive but I believe the opportunities are big

Pascal Mettes (@pascalmettes)

All vision-language models should have hyperbolic embeddings. Vision and language are incredibly hierarchical in nature! See below our latest work on hyperbolic vision-language models that exploit visual compositions through entailment: