Prithiv Sakthi (@prithiv_003) Twitter Tweets • TwiCopy

Prithiv Sakthi

4 months ago

Made Nano Banana to transform freestyle drawings into image illustrations. It also supports image generation and single- or multi-image edits. Built with the Gemini API, powered by GCP. App: …no-banana-aio-op72ohwdda-uw.a.run.app



Vivek Galatage

@vivekgalatage

4 months ago

Understanding GPU Architecture from Cornell cvw.cac.cornell.edu/gpu-architectu… During a low-level discussion at a casual meetup, many folks were interested in understanding GPUs more closely. While CPUs optimize for complex control flow (see those big cores + caches), the GPUs maximize


Anand Bhattad

@anand_bhattad

4 months ago

This is cool! We’ve been building something along similar lines in academia: Generative Blocks World. No magic here—just an intuitive pipeline grounded in decades of work on primitive decomposition in computer vision. The inspiration goes all the way back to one of the earliest


Qwen

@alibaba_qwen

4 months ago

Awesome! Let’s get started with the Qwen3-VL notebook in the near future.🌟


AiBattle

@aibattle_

4 months ago

Another new Google Gemini model, "Oceanreef", is being tested in LmArena. The model is likely related to the "Oceanstone" model, which appeared 2 days ago.


Junyang Lin

@justinlin610

4 months ago

A toolkit to make Qwen3-ASR easier to use!


Vivek Galatage

@vivekgalatage

4 months ago

While researching GPU architecture further, I found Kostas Anagnostou's recent blog post, "GPU utilisation and performance improvements". Quite interesting insights on GPU perf, read on! interplayoflight.wordpress.com/2025/08/29/gpu…


merve

@mervenoyann

4 months ago

I love MiniCPM-V 4.5; it's underrated. It's only 8B yet great at factual correction + thinking 💬 As they claim, a GPT-4o-level VLM on-device 👏 Great work, OpenBMB.



merve

@mervenoyann

4 months ago

IBM just released a small Swiss Army knife for document models: granite-docling-258M 🔥 It is not only a document converter but can also do document question answering and understand multiple languages 🤯 Released under the Apache 2.0 license 👏


Ant Ling

@antling20041208

4 months ago

⚡️Ling-flash-2.0⚡️ is now open source. 100B MoE LLM • only 6.1B active params --> 3x faster than 36B dense (200+ tok/s on H20) --> Beats ~40B dense LLM on complex reasoning --> Powerful coding and frontend development Small activation. Big performance.


Draw Things

@drawthingsapp

4 months ago

1. This LoRA is called Qwen-Image-HeadshotX (source link below). It provides precise portrait rendering with a strong focus on realism.👇🏻 huggingface.co/prithivMLmods/…


Vivek Galatage

@vivekgalatage

4 months ago

This has to be one of the best GPU programming resources I've found - the GPU Glossary from Modal breaks down complex concepts with clear visuals and explanations, from CUDA architecture to Tensor Cores to CTAs. modal.com/gpu-glossary


Ellie Sleightholm

@elsleightholm

4 months ago

new maths videos coming soon :))


DailyPapers

@huggingpapers

4 months ago

GenExam: The first multidisciplinary text-to-image exam is now on Hugging Face. This new benchmark challenges T2I models with 1,000 rigorous, exam-style prompts across 10 subjects. It comes with ground-truth images and detailed scoring for semantic correctness and visual


DailyPapers

@huggingpapers

4 months ago

ByteDance unveils SAIL-VL2, a SOTA vision-language foundation model. It achieves comprehensive multimodal understanding and reasoning, outperforming at 2B & 8B scales.


Andi Marafioti

@andimarafioti

4 months ago

Just: import trackio as wandb


Victor M

@victormustar

4 months ago

Looks good at portraits :)


Prithiv Sakthi

@prithiv_003

4 months ago

After a while, I’ve migrated the app’s tech stack to make it compatible with deployment on HF Spaces. Nano Banana AIO (Wrapper) turns freestyle drawings into images and also supports multi-image editing, image generation, etc. 🍌🤗 Space: huggingface.co/spaces/prithiv…


AK

@_akhaliq

4 months ago

SAIL-VL2 Technical Report


DailyPapers

@huggingpapers

4 months ago

A new era for 360-degree vision in AI, co-authored by Insta360! PANORAMA introduces a revolutionary architecture for omnidirectional vision in embodied AI, offering holistic environmental awareness. It addresses key challenges in data, models, and applications.
