Marius (@rasmus1610)'s Twitter Profile
Marius

@rasmus1610

Applied AI, Neuroradiology, Software Development, Entrepreneurship

ID: 50410234

Link: https://blog.mariusvach.com Joined: 24-06-2009 19:39:28

2.2K Tweets

546 Followers

270 Following

Marius (@rasmus1610) 's Twitter Profile Photo

I really missed this. LLMs feel so foreign, but they are still just PyTorch models. Maybe I should get into training retrieval models. This seems to be approachable without a GPU cluster

Marius (@rasmus1610) 's Twitter Profile Photo

What's the hottest sh*t currently in computer vision? Still EfficientNet or ConvNeXt, maybe now with some attention sprinkled in? What do kids these days use for image classification or image segmentation?

Marius (@rasmus1610) 's Twitter Profile Photo

Recently I think more and more about what problems we haven't tackled with a "bitter lesson" aka "search and compute" lens yet.

Marius (@rasmus1610) 's Twitter Profile Photo

I'm now at the point of questioning why the code ever worked. This is usually the lowest point in debugging ... I hope :D

Marius (@rasmus1610) 's Twitter Profile Photo

Interesting. I'm currently implementing RLMs by Alex L Zhang and it's super model-dependent whether the `llm_query` tool function is used or not. GPT models never use it and rely on regex a lot. Kimi-K2 likes the tool and leverages it a lot. x.com/a1zhang/status…

Marius (@rasmus1610) 's Twitter Profile Photo

So nice to have a REPL with the context in it. You can interactively debug what the LLM has done by inspecting the variables the LLM has created. Also so fascinating to read the trajectories of RLMs. Love it.

Jordi Hays (@jordihays) 's Twitter Profile Photo

Rage Baiting is for Losers. Yesterday, YC announced Chad IDE aka “the brainrot code editor.” Chad is an AI code editor that allows you to gamble, watch TikTok, and use dating apps while working on coding tasks. Their launch rightfully got a lot of attention. On one hand it’s

Omar Khattab (@lateinteraction) 's Twitter Profile Photo

Martin (@martin_casado) and I had a fun hour-long chat about why we need an AI software layer, and why that's true even if AGI arrives. This is basically my take on why "the model" is definitely NOT "the product", though models are one way you may decide to implement some products