SovitRath5 (@sovitrath5)'s Twitter Profile
SovitRath5

@sovitrath5

Lead SWE (GenAI and LLMs) @ Indegene
Blog - debuggercafe.com
GitHub - github.com/sovit-123

ID: 821778318022168576

Link: https://debuggercafe.com · Joined: 18-01-2017 17:56:33

588 Tweets

144 Followers

87 Following

SovitRath5 (@sovitrath5)'s Twitter Profile Photo

Something super interesting coming soon: image => prompt => 3D. No need to crop an object and upload. Just needs a lot of VRAM. Happy to get a GPU sponsor for my 3D and rendering project.


Great progress on image-to-3D with visual grounding. BiRefNet cleanup boosts quality. Qwen3-VL 2B, BiRefNet, and Hunyuan3D 2.0 run in under 200 lines and <10GB VRAM with Qwen on CPU.
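The "<10GB VRAM with Qwen on CPU" figure comes down to device placement: give the GPU budget to the heavy generative stage and push what fits tolerably on CPU off the card. A minimal sketch of that idea, with purely illustrative per-model VRAM estimates (not measured numbers from the project):

```python
# Greedy device placement: heaviest models claim GPU memory first,
# anything that no longer fits falls back to CPU.

def assign_devices(models, vram_budget_gb):
    """`models` maps model name -> estimated VRAM in GB.
    Returns (placement dict, GB actually used on the GPU)."""
    placement, used = {}, 0.0
    # Place larger models first so the budget goes to the heavy stages.
    for name, gb in sorted(models.items(), key=lambda kv: -kv[1]):
        if used + gb <= vram_budget_gb:
            placement[name] = "cuda"
            used += gb
        else:
            placement[name] = "cpu"
    return placement, used

# Illustrative assumptions: Hunyuan3D dominates, BiRefNet is light,
# and the 2B VLM is small enough to tolerate CPU inference.
estimates = {"hunyuan3d-2.0": 7.5, "qwen3-vl-2b": 4.5, "birefnet": 1.5}
placement, used = assign_devices(estimates, vram_budget_gb=10.0)
print(placement, used)
```

With these assumed sizes the placement matches the tweet: Hunyuan3D and BiRefNet stay on the GPU under 10GB, and Qwen lands on CPU.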


Tracing through 9.9B documents in less than 5 seconds is impressive. Perhaps some good UX/UI practices can be adopted from the new Olmo3 playground UI.


Multi-object 3D mesh and texture generation from an image with a single prompt. The entire pipeline uses less than 19GB VRAM (18.9GB to be exact) and can be optimized further. Now, to check how far we can push the pipeline. Blog post coming soon.
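The multi-object flow described above can be sketched as: one prompt grounds several objects in the image, then each grounded region is meshed and textured independently. All function names below are hypothetical stand-ins, not the project's real API:

```python
# Hypothetical multi-object image-to-3D pipeline sketch.

def ground_objects(image, prompt):
    """Stand-in for a VLM grounding step: returns (label, bbox) pairs."""
    return [("mug", (10, 10, 120, 140)), ("book", (150, 30, 300, 220))]

def crop(image, bbox):
    """Crop a bbox (x0, y0, x1, y1) out of a row-major nested-list image."""
    x0, y0, x1, y1 = bbox
    return [row[x0:x1] for row in image[y0:y1]]

def generate_mesh_and_texture(patch, label):
    """Stand-in for the image-to-3D stage (e.g. a Hunyuan3D-style model)."""
    return {"label": label, "mesh": f"{label}.obj", "texture": f"{label}.png"}

def multi_object_pipeline(image, prompt):
    assets = []
    for label, bbox in ground_objects(image, prompt):
        patch = crop(image, bbox)              # no manual cropping by the user
        assets.append(generate_mesh_and_texture(patch, label))
    return assets

image = [[0] * 400 for _ in range(300)]  # dummy 400x300 "image"
assets = multi_object_pipeline(image, "mug and book on a desk")
print([a["mesh"] for a in assets])  # ['mug.obj', 'book.obj']
```

The per-object loop is what makes the "no need to crop an object and upload" claim work: grounding replaces manual cropping.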


The DEIMv2 family of object detection models spans both high-performance models with a ViT backbone for GPU-based systems and cloud deployment, and ultra-lightweight models with HGNetv2 backbones for low-powered and edge devices. In this week's article on DebuggerCafe, we are


Image to 3D mesh + texturing with an optimized pipeline for lower VRAM usage, multi-object generation from a single prompt, and multi-object visualization. Article coming soon.


Image-to-3D is now a repository with simple steps to run locally, plus a Runpod Docker image and template for easier experiments. Launch the Runpod template here => console.runpod.io/deploy?templat… Just two steps to execute on Runpod. See the README.


Fine-tuning the Phi-3.5 Vision Instruct model can be confusing at times. Its single-batch-size requirement and slightly different data preprocessing methods can often feel odd. In this week's article on DebuggerCafe, we are covering fine-tuning the Phi-3.5 Vision Instruct model on
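The usual workaround for a batch-size-1 constraint like the one mentioned above is gradient accumulation: feed one sample per forward pass, but only apply the optimizer update every N steps, so the effective batch size is still N. A framework-agnostic sketch of just the scheduling logic (this is not the article's actual fine-tuning code):

```python
# Gradient-accumulation scheduling: batch size stays 1, effective batch
# size becomes `accum` by delaying the optimizer update.

def accumulation_steps(samples, accum=4):
    """Yield (sample, apply_update) pairs. `apply_update` is True only on
    every `accum`-th sample, where the optimizer step would run."""
    for i, sample in enumerate(samples, start=1):
        yield sample, (i % accum == 0)

updates = [flag for _, flag in accumulation_steps(range(8), accum=4)]
print(updates)  # update fires on steps 4 and 8
```

In a Hugging Face `Trainer` setup the same effect comes from `per_device_train_batch_size=1` combined with `gradient_accumulation_steps=N`.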


Huge VRAM optimization for the Image-to-3D project. It now runs with less than 10GB VRAM by default instead of 20GB, with a further optimization option that runs even on an 8GB VRAM GPU (slightly slower). What changed: * Removed model loading that is not necessary in
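Dropping unnecessary model loading, as the changelog above describes, typically means loading each stage's model only while that stage runs and freeing it afterwards, so peak VRAM tracks the largest single stage instead of the sum of all stages. A small sketch of that accounting, with illustrative (assumed) stage sizes:

```python
# Load-per-stage accounting: peak memory equals the largest single stage,
# not the total of all models, when each model is freed after its stage.

class StageRunner:
    def __init__(self):
        self.loaded_gb = 0.0   # currently resident model memory
        self.peak_gb = 0.0     # highest residency seen so far

    def run_stage(self, name, size_gb, fn):
        self.loaded_gb += size_gb                        # "load" the model
        self.peak_gb = max(self.peak_gb, self.loaded_gb)
        result = fn()                                    # run the stage
        self.loaded_gb -= size_gb                        # "unload" it
        return result

runner = StageRunner()
runner.run_stage("grounding", 4.5, lambda: "boxes")
runner.run_stage("segmentation", 1.5, lambda: "mask")
runner.run_stage("image-to-3d", 7.5, lambda: "mesh")
print(runner.peak_gb)  # 7.5, not 13.5
```

In a real PyTorch pipeline the "unload" step would be moving the model off the GPU (or deleting it) and calling `torch.cuda.empty_cache()`.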