Shrikar (@shrikarhaha) 's Twitter Profile
Shrikar

@shrikarhaha

Currently: Research @iiscbangalore
Building: Explainable AI systems in Healthcare
Interests: Startups, TechForGood, Product
Loves: Physics, Poems, Football & Food

ID: 1435618253313806341

Joined: 08-09-2021 14:57:26

508 Tweets

184 Followers

2.2K Following

Rishubh Parihar (@rishubhparihar) 's Twitter Profile Photo

“Make it red.” “No! More red!” “Ughh… slightly less red.” “Perfect!” ♥️ 🎚️Kontinuous Kontext adds slider-based control over edit strength to instruction-based image editing, enabling smooth, continuous transformations!
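The slider idea above can be illustrated with a toy sketch. This is not the actual Kontinuous Kontext mechanism (which presumably learns a strength-conditioned edit); plain latent interpolation is just the simplest way to see what a continuous edit-strength control means, and `edit_with_strength` is a hypothetical helper:

```python
import numpy as np

def edit_with_strength(original_latent, edited_latent, strength):
    """Toy continuous-strength edit: linearly interpolate between the
    original and fully-edited latents. `strength` in [0, 1] acts as the
    slider; 0 returns the original, 1 the full edit."""
    s = float(np.clip(strength, 0.0, 1.0))
    return (1.0 - s) * original_latent + s * edited_latent

# Sweeping the slider produces a smooth transformation path.
orig = np.zeros(4)
edited = np.ones(4)
path = [edit_with_strength(orig, edited, s) for s in (0.0, 0.5, 1.0)]
```

A learned approach would instead condition the editing model on the scalar `s`, but the external interface (one knob, smooth output change) is the same.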

Willis (Nanye) Ma (@ma_nanye) 's Twitter Profile Photo

Excited to introduce DiffuseNNX, a comprehensive JAX/Flax NNX-based library for diffusion and flow matching! It supports multiple diffusion / flow-matching frameworks, Autoencoders, DiT variants, and sampling algorithms. Repo: github.com/willisma/diffu… Delve into details below!
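The linear-path flow-matching objective that such a library implements can be written down in a few lines. This sketch uses plain NumPy rather than DiffuseNNX's actual API (not shown in the tweet), and the zero "model" is a stand-in for a learned velocity network:

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_pair(x0, x1, t):
    """Linear-path flow matching: the interpolant x_t = (1-t)*x0 + t*x1
    and its regression target, the constant velocity v = x1 - x0."""
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, v_target

# One training "step": sample noise x0, data x1, and a random time t,
# then measure how far a velocity model is from the target. The zero
# prediction here stands in for a learned v_theta(x_t, t).
x0 = rng.standard_normal(8)   # noise sample
x1 = rng.standard_normal(8)   # data sample
t = rng.uniform()
x_t, v = flow_matching_pair(x0, x1, t)
loss = np.mean((np.zeros_like(v) - v) ** 2)
```

Diffusion frameworks differ mainly in the choice of interpolant path and target; the library reportedly supports several of these variants.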

Saining Xie (@sainingxie) 's Twitter Profile Photo

I used to think that semantic encoders primarily captured high-level, abstract representations and discarded fine-grained visual details, but I was wrong. we employ pretrained representation encoders (such as DINO, SigLIP, and MAE, all based on standardized ViTs) combined with

Saining Xie (@sainingxie) 's Twitter Profile Photo

as always, we’re releasing everything: the paper, the model, and the PyTorch code. this project has been led by three amazing students: Boyang Zheng (1st year phd), Willis (Nanye) Ma (2nd year phd), and Peter Tong (3rd year phd). we’ve been working on this for

Rishubh Parihar (@rishubhparihar) 's Twitter Profile Photo

✨ I’ll be presenting our work on depth-aware image editing at #ICCV2025 in Hawaii 🌴 next week! 📅 Oct 22 | 📍 Exhibit Hall I | 🧩 Poster #82 🌍 Project: rishubhpar.github.io/DAEdit/ 🤝 Working on image generation or editing? I’d love to chat at ICCV! Vision and AI Lab, IISc

Aniket Didolkar (@aniket_d98) 's Twitter Profile Photo

Can we scale thinking without scaling context budget!? In our latest work, we propose a test-time scaling strategy which combines parallel drafts in a sequential self-improvement loop to further boost reasoning capabilities of frontier LLMs such as O3 and Gemini-2.5-flash

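The strategy described above can be sketched generically. `generate` and `score` below are hypothetical stand-ins for an LLM call and a verifier, not the paper's actual interfaces; the point is that each round carries forward only the best draft rather than the full context:

```python
def scale_thinking(prompt, generate, score, n_drafts=4, n_rounds=3):
    """Sketch of a parallel-drafts + sequential-refinement loop.
    Each round produces several independent drafts in parallel, keeps
    only the strongest one, and seeds the next round with it, so the
    context budget stays roughly constant as thinking scales."""
    best = None
    for _ in range(n_rounds):
        # Parallel stage: independent drafts, conditioned on the
        # current best answer if one exists.
        seed = prompt if best is None else f"{prompt}\nImprove on: {best}"
        drafts = [generate(seed) for _ in range(n_drafts)]
        # Sequential stage: select the strongest draft for the next round.
        best = max(drafts, key=score)
    return best

# Toy usage with deterministic stand-ins (longer answer = better score).
answers = iter(["a", "bb", "ccc", "d",
                "ee", "fff", "g", "hh",
                "iii", "j", "kk", "llll"])
result = scale_thinking("q", generate=lambda p: next(answers), score=len)
```

With 4 drafts per round and 3 rounds, this spends 12 generations but never feeds more than one prior answer back into the prompt.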
Songyou Peng (@songyoupeng) 's Twitter Profile Photo

Our bigger group at Google DeepMind is hiring interns for next summer! If you are interested in working with us, apply through the link below and also email us. step 1: google.com/about/careers/… (US-based) step 2: send an email to [email protected]

sarah guo // conviction (@saranormous) 's Twitter Profile Photo

ok, I know I’m the biggest stan — but amongst many lessons I have learned from Pat Grady is that cynicism/selfishness is for losers, and openheartedness is for winners. believing in the talented people around you, assuming good intent, playing for your team - it is often enough!

Theoretically Media (@theomediaai) 's Twitter Profile Photo

Veo 3.1 now has a Camera Adjustment feature, allowing you to change the angle and movement of a previously generated video. Taking it out for a test spin, here's our "Test" video; in the thread we'll check out how the feature does!

Mahan Fathi (@mahanfathi) 's Twitter Profile Photo

We're looking for Summer Interns to join the Post-Training Team at @NVIDIA! DM me with your updated resume and three concise bullets detailing your most relevant experience — e.g. publications, repos, blogs, etc. RT please to help us find top talent.

joao carreira (@joaocarreira) 's Twitter Profile Photo

I'm looking for a student researcher to work with me at Google DeepMind in London, preferably starting early next year -- topics will be around novel video model architectures / learning from a single video stream / representation learning.

Saining Xie (@sainingxie) 's Twitter Profile Photo

Introducing Cambrian-S: it’s a position, a dataset, a benchmark, and a model, but above all, it represents our first steps toward exploring spatial supersensing in video. 🧶

Ellis Brown (@_ellisbrown) 's Twitter Profile Photo

MLLMs are great at understanding videos, but struggle with spatial reasoning—like estimating distances or tracking objects across time. the bottleneck? getting precise 3D spatial annotations on real videos is expensive and error-prone. introducing SIMS-V 🤖 [1/n]

Shrikar (@shrikarhaha) 's Twitter Profile Photo

Wow! Sometimes you're lucky to wake up and see something fundamentally different to address the challenge of modeling how humans can stream continuous frames of visual information and store infinite context! Exciting times researching video and spatial supersensing! ✨🫡

Jehan Godrej (@jehangodrej) 's Twitter Profile Photo

I wish that everyone gets to experience running the NYC Marathon once in their lives, every human deserves to experience something like it, and it made all the training, discipline, and suffering along the way this year worth it.

Jiageng Mao (@pointscoder) 's Twitter Profile Photo

🎥 Video Generation Enables Zero-Shot Robotic Manipulation 🤖 Introducing PhysWorld, a framework that bridges video generation and robot learning through (generated) real-to-sim world modeling. 🌐 Project: pointscoder.github.io/PhysWorld_Web/ 📄 Paper: arxiv.org/abs/2511.07416 💻 Code: