bruno (@brunostefoni) Twitter Tweets • TwiCopy

bruno

@brunostefoni

5 months ago

I bet they didn't quantize Opus, it's probably an infra bug like last time


Ernesto opened a genuine debate by calling for a holistic look at the data (great!), but on digging deeper and looking at *this* chart, the hasty conclusion of those who "only look at the top-100 average" is, in a way, reaffirmed. These distributions are depressingly different


bruno

@brunostefoni

4 months ago

i am cringe but i am free


bruno

@brunostefoni

4 months ago

AI/infra is moving so fast that sometimes you think you are ahead of the curve when suddenly a big lab basically implements what you were in the middle of doing


bruno

@brunostefoni

4 months ago

Can't wait for HBO's Industry take on this whole AI-fication cycle in the financial markets


bruno

@brunostefoni

3 months ago

CC = Claude Code
CC = (Terran) Command Center
Life imitates art


bruno

@brunostefoni

3 months ago

I think the word 'scrappy' can be a positive adjective for any senior+ engineer in a non-developed country. You don't just pay >$20/mo to a random SaaS to get something. You start from scratch and build it yourself. That mentality is hard to find in many senior SWEs elsewhere


bruno

@brunostefoni

2 months ago

me (and Claude) had fun with Prime Intellect hosted RL training, (we) wrote a blog post about it brunose.github.io/blog-llm-git-r… Many enterprises need agents to run complex multiple-tool sequential workflows. What if we made specialized agents using small LLMs + RL?


bruno

@brunostefoni

2 months ago

Qwen has had such a positive impact on people who use open source models. I'm sure the researchers behind it will do just fine. Sad to see them leave the team


bruno

@brunostefoni

2 months ago

>gets invited to AI Engineer event in NY
>ofc it's in williamsburg aka SF v2beta


Zhuokai Zhao

@zhuokaiz

2 months ago

I wish someone had told me this when I started digging into diffusion language models (dLLMs) from an LLM post-training background. I've spent the last few weeks reading across both the dLLM RL literature (d1, EGSPO, MDPO, LLaDA 1.5) and the older robotics literature on


Daniel Hnyk

@hnykda

2 months ago

LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised: it contains litellm_init.pth with base64-encoded instructions to send all the credentials it can find to a remote server + self-replicate. link below


bruno

@brunostefoni

a month ago

This and also vibe predicting SOTA model capabilities in 6 months to build AI infra around that prediction


bruno

@brunostefoni

a month ago

A 1-bit large language model sounds so damn cool as a concept honestly. I wonder if it's only memory-efficient or if it also helps mechanistic interpretability


Anthropic

@anthropicai

a month ago

New Anthropic research: Emotion concepts and their function in a large language model. All LLMs sometimes act like they have emotions. But why? We found internal representations of emotion concepts that can drive Claude’s behavior, sometimes in surprising ways.