Vuk Rosić (@vukrosic99)'s Twitter Profile
Vuk Rosić

@vukrosic99

🤖 AI Research Scientist
📊 Agents, LLMs, inference-time (test-time) scaling, RLHF...
🤝 Solve math and understand the universe
🧑‍🎓 I'm learning Chinese, feel free to chat with me in Chinese anytime

ID: 2234985075

Link: https://www.youtube.com/channel/UC7XJj9pv_11a11FUxCMz15g
Joined: 20-12-2013 16:28:24

150 Tweets

21 Followers

340 Following

DeepSeek INFINITE Context Window - Encode Text As Images - DeepSeek OCR

📝➡️🖼️ Screenshot massive amounts of text and process them as images: a theoretically infinite context window

YouTube - youtu.be/bxTkOCv7SGM

bilibili - bilibili.com/video/BV1p4WSz…

Build 10,000 GPU Cluster - TikTok's LLM Training - New Paper by ByteDance explaining their GPU cluster

yt - youtu.be/MAgURs2iFCQ

bilibili - bilibili.com/video/BV1kvW9z…

Master RMSNorm From Scratch - Step by Step Tutorial

Used in LLMs and Transformers, it's an extremely popular normalization, but how does it work?

youtube - youtu.be/HgSdYtPgJnU

bilibili - bilibili.com/video/BV1zxWoz…

13x FASTER Video Generation - Sparse Linear Attention Transformer

yt - youtu.be/SMNswPiU8go

bilibili - bilibili.com/video/BV15Esaz…

arxiv - arxiv.org/pdf/2509.24006

Explaining the new paper 'Definition of AGI'

YouTube - youtu.be/7AWl-EqsD8w

Bilibili - bilibili.com/video/BV1zesvz…

paper - agidefinition.ai/paper.pdf

Just crossed 10k subscribers on YouTube

A lot of new videos about AI papers and code (LLMs, Diffusion, video generation) - youtube.com/channel/UC7XJj…

I've started spending whole days making videos on AI research and math and doing nothing else. Check out the new videos here - youtube.com/@vukrosic/vide…

A decaying learning rate prevents you from training an LLM indefinitely. Here is how you can replace learning rate decay with checkpoint merging - youtube.com/watch?v=Z5kEG7…

LATENT Thinking LLM by TikTok parent ByteDance - Looped Transformer Thinking - New Paper Explained

YouTube - youtu.be/SCvo_pO35eg

bilibili, with Chinese subtitles (Latent Thinking LLM, Looped Transformer Thinking, New Paper Explained) - bilibili.com/video/BV1h21JB…

I'm hearing Kimi K2 Thinking closed the gap even more with GPT and Claude, and surpassed them in some areas, which I didn't expect. I thought open source would be stuck a bit behind. I wonder if it was Su Jianlin's cooking.