Saksham Suri (@_sakshams_)'s Twitter Profile
Saksham Suri

@_sakshams_

Research Scientist @AiatMeta. Previously PhD @UMDCS, @MetaAI, @AmazonScience, @USCViterbi, @IIITDelhi, @IBMResearch.
#computervision #deeplearning

ID: 2977040274

Website: http://www.cs.umd.edu/~sakshams/
Joined: 12-01-2015 16:09:41

129 Tweets

760 Followers

638 Following

Matt Shumer (@mattshumer_)

Wild tech you have to try: groq.com

They are serving Mixtral at nearly 500 tok/s. Answers are pretty much instantaneous. Opens up new use cases, and completely changes the UX possibilities of existing ones.

Stability AI (@stabilityai)

Announcing Stable Diffusion 3, our most capable text-to-image model, utilizing a diffusion transformer architecture for greatly improved performance in multi-subject prompts, image quality, and spelling abilities.

Today, we are opening the waitlist for early preview. This phase…
Anthropic (@anthropicai)

Today, we're announcing Claude 3, our next generation of AI models. 

The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision.
Abhinav Shrivastava (@abhi2610)

Call for Papers: #INRV2024 Workshop on Implicit Neural Representation for Vision @ #CVPR2024!
Topics: compression, representation using INRs for images, audio, video & more!
Deadline: 3/31. Submit now!
Website: inrv.github.io
Submission link: shorturl.at/vzBR8

Saksham Suri (@_sakshams_)

That's a wrap! Happy to share that I have defended my thesis. 

Thankful for the insightful questions and feedback from my committee members Abhinav Shrivastava, Tianyi Zhou, David Jacobs, Prof. Espy-Wilson, and Prof. Andrew Zisserman.
Saksham Suri (@_sakshams_)

Excited to announce that I have joined AI at Meta as a Research Scientist, where I will be working on model optimization.

I will also be at ECCV to present my work and am excited to meet and learn from everyone. Reach out if you are attending and would like to chat. Ciao 🇮🇹

Yunyang Xiong (@youngxiong1)

🚨VideoLLM from Meta!🚨
LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

📝Paper: huggingface.co/papers/2410.17…
🧑🏻‍💻Code: github.com/Vision-CAIR/Lo…
🚀Project (Demo): vision-cair.github.io/LongVU

We propose LongVU, a video LLM with a spatiotemporal adaptive…
Saksham Suri (@_sakshams_)

We are happy to release our LiFT code and pretrained models! 📢

Code: github.com/saksham-s/lift
Project Page: cs.umd.edu/~sakshams/LiFT

Here are some super spooky super resolved feature visualizations to make the season scarier 🎃

Coauthors: Matthew Walmer, Kamal Gupta, Abhinav Shrivastava
Saksham Suri (@_sakshams_)

Check out LARP, our work on a video tokenizer trained with an autoregressive generative prior. Code and models are open-sourced!

Saksham Suri (@_sakshams_)

Check out Efficient Track Anything from our team.

2x faster than SAM2 on A100
>10 FPS on iPhone 15 Pro Max

Paper: arxiv.org/pdf/2411.18933
Demo: yformer.github.io/efficient-trac…

Forrest Iandola (@fiandola)

[1/n] 𝗘𝗳𝗳𝗶𝗰𝗶𝗲𝗻𝘁 𝗧𝗿𝗮𝗰𝗸 𝗔𝗻𝘆𝘁𝗵𝗶𝗻𝗴 from Meta: interactive video segmentation and tracking on an iPhone!

Saksham Suri (@_sakshams_)

📢 Excited to announce LARP has been accepted to #ICLR2025! 🇸🇬
Code and models are publicly available.
Project page: hywang66.github.io/larp/index.html

AI at Meta (@aiatmeta)

Today is the start of a new era of natively multimodal AI innovation.

Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick —  our most advanced models yet and the best in their class for multimodality.

Llama 4 Scout
• 17B-active-parameter model
Saksham Suri (@_sakshams_)

Drop by our oral presentation and poster session to chat and learn about our video tokenizer with a learned autoregressive prior. #ICLR2025