Ameer azam (@ameerazam18) Twitter Tweets • TwiCopy

Ameer azam

@ameerazam18

LLMs, TTS, Diffusion & Open Source Gen AI || AI changing things || AI papers Tweet || Deep Learning

ID: 739044397065478145

Link: https://huggingface.co/ameerazam08

Joined: 04-06-2016 10:41:30

358 Tweets

122 Followers

730 Following

VibeCode

@vibecodeapp

5 months ago

It's Labor Day, and Vibe coding is now FREE for EVERYONE.
- Build iOS Apps
- Build Web Apps with Claude Code, Codex w/ GPT-5, Gemini CLI
- And now you can build Android Apps
Like + Reply to this post and we'll DM you the link.

Likes 2.2K • Replies 937 • Reposts 231

sway

@swaystar123

5 months ago

You can implement this paper with 2 lines of code:
cfm_target = torch.roll(flow_target, shifts=1, dims=0)
cfm_loss = -((model_output - cfm_target) ** 2).mean() * λ
(Official impl is 60 lines btw)
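The two lines in the tweet can be tried on toy data. What follows is a minimal sketch, not the paper's official implementation: it uses NumPy's `np.roll` as a stand-in for `torch.roll`, and it assumes the negative term is added to an ordinary mean-squared flow-matching loss, which the tweet does not state. The function name, the `lam` parameter (standing in for λ) and its default value are hypothetical.

```python
import numpy as np

def flow_loss_with_contrastive_term(model_output, flow_target, lam=0.05):
    """Return (total, fm_loss, cfm_loss) for one batch.

    lam stands in for the tweet's λ; 0.05 is an illustrative value only.
    """
    # Assumed base objective: plain MSE between prediction and its own target.
    fm_loss = ((model_output - flow_target) ** 2).mean()
    # Shift targets by one position along the batch axis, so every sample
    # is compared against a different sample's target.
    cfm_target = np.roll(flow_target, shift=1, axis=0)
    # Negative sign: minimizing this term increases the distance to the
    # mismatched target.
    cfm_loss = -((model_output - cfm_target) ** 2).mean() * lam
    return fm_loss + cfm_loss, fm_loss, cfm_loss

# Toy batch of two samples where the prediction equals its own target.
flow_target = np.array([[0.0, 0.0], [1.0, 1.0]])
total, fm_loss, cfm_loss = flow_loss_with_contrastive_term(
    flow_target, flow_target, lam=0.5
)
print(total, fm_loss, cfm_loss)
```

With a batch of size one, the roll returns the same target, so the extra term would only cancel part of the base loss; the trick relies on batches with at least two distinct samples.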

Likes 1.1K • Replies 13 • Reposts 104

青龍聖者

@bdsqlsz

5 months ago

New diffusion RL method from Tencent: SRPO! Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference. 10 minutes on 32 H20 GPUs is enough to enhance FLUX.1-dev.

Likes 272 • Replies 4 • Reposts 44

Gaurav Sen

@gkcs_

5 months ago

20 AI terms you need to know. 1. Large Language Models 2. Tokenization 3. Embeddings (Vectors) 4. Attention Mechanism 5. Transformer 6. Self-supervised Learning 7. Fine-Tuning 8. Quantization 9. Few Shot Prompting 10. Vector Databases 11. Retrieval Augmented Generation 12.

Likes 654 • Replies 14 • Reposts 110

Aran Komatsuzaki

@arankomatsuzaki

4 months ago

Apple presents Manzano: Simple & scalable unified multimodal LLM
• Hybrid vision tokenizer (continuous ↔ discrete) cuts task conflict
• SOTA on text-rich benchmarks, competitive in gen vs GPT-4o/Nano Banana
• One model for both understanding & generation
• Joint recipe:

Likes 314 • Replies 4 • Reposts 63

Shubham Saboo

@saboo_shubham_

4 months ago

Paper2Agent automatically transforms research papers into AI agents. You can use it via MCP with Claude Code or Google Gemini CLI. 100% open source.

Likes 1.1K • Replies 20 • Reposts 218

moondream

@moondreamai

4 months ago

Moondream 3 understands UIs, not just pixels. Identify buttons, prices, and labels with a prompt. Perfect for agentic workflows. Open, tiny, blazingly fast.

Likes 378 • Replies 11 • Reposts 17

DailyPapers

@huggingpapers

4 months ago

ByteDance just released FaceCLIP on Hugging Face! A new vision-language model specializing in understanding and generating diverse human faces. Dive into the future of facial AI. huggingface.co/ByteDance/Face…

Likes 439 • Replies 10 • Reposts 78

Zhe Gan

@zhegan4

3 months ago

🎁🎁 We release Pico-Banana-400K, a large-scale, high-quality image editing dataset distilled from Nano-Banana across 35 editing types. 🔗 Data link: github.com/apple/pico-ban… 🔗 Paper link: arxiv.org/abs/2510.19808 It includes 258K single-turn image editing data, 72K multi-turn

Likes 766 • Replies 9 • Reposts 119

ℏεsam

@hesamation

3 months ago

multi-head attention visually explained:

Likes 1.1K • Replies 17 • Reposts 205

Niels Rogge

@nielsrogge

3 months ago

This is a phenomenal video by Jia-Bin Huang explaining seminal papers in computer vision, including CLIP, SimCLR, and DINO v1/v2/v3, in 15 minutes. DINO is actually a brilliant idea; I found the decision to use 65k neurons in the output head pretty interesting.

Likes 1.1K • Replies 14 • Reposts 124

weijia wu

@weijiawu7

3 months ago

🔥 New paper out: WEAVE — a 100K-sample interleaved multimodal dataset + WEAVEBench, a human-annotated benchmark for visual memory, multi-turn editing. 📄 arXiv: arxiv.org/abs/2511.11434 🐙 GitHub: github.com/weichow23/weave 🤗 HF Dataset: huggingface.co/datasets/WeiCh…

Likes 147 • Replies 2 • Reposts 29

Ameer azam

@ameerazam18

3 months ago

Enjoy this Gemini 3 Pro vibe-coded video calling app on Hugging Face: huggingface.co/spaces/ameeraz… Star the repo: github.com/AMEERAZAM08/Ge… Connect with anyone on Hugging Face, share code, and speak. Hugging Face, Google, Google Gemini

Likes 1 • Replies 0 • Reposts 0

ShaRPeyE

@sharpeye_wnl

2 months ago

> Disappear > do math > learn to code > life heavy > read a lot > build things > comeback stronger

Likes 1.1K • Replies 51 • Reposts 122

Omar Sanseviero

@osanseviero

2 months ago

Introducing our latest open model: MedASR 🔬Speech to text model 🏥for healthcare-based voice applications 🤗available in Hugging Face ⚡️run with transformers Download right now huggingface.co/google/medasr

Likes 682 • Replies 32 • Reposts 69