Mingfei Li (@mingfei_x)'s Twitter Profile
Mingfei Li

@mingfei_x

ID: 1682652688662802432

http://mingfei.io · Joined 22-07-2023 07:24:15

109 Tweets

54 Followers

128 Following

rez0 (@rez0__)

The three areas of need are:
1. Agent Authentication & Authorization
2. Prompt Injection-related attacks
3. Secure Agent Architecture

For #1, agent authentication can likely be put into existing stuff like Okta etc. Authorization is the really hard part and will need to be

Jim Fan (@drjimfan)

We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive - truly open, frontier research that empowers all. It makes no sense. The most entertaining outcome is the most likely.

DeepSeek-R1 not only open-sources a barrage of models but
Andrej Karpathy (@karpathy)

For friends of open source: imo the highest leverage thing you can do is help construct a high diversity of RL environments that help elicit LLM cognitive strategies. To build a gym of sorts. This is a highly parallelizable task, which favors a large community of collaborators.
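The "gym of sorts" Karpathy describes can be made concrete with a toy example. Below is a minimal sketch of the kind of small, self-contained RL environment such a community collection might hold; the task (guess a hidden number from higher/lower feedback), the class name, and the reset/step interface are all invented for illustration, loosely echoing the Gym convention.

```python
# Illustrative only: a tiny, self-contained RL environment of the sort
# a community "gym" could collect. Task and interface are invented here.
import random

class GuessNumberEnv:
    """Guess a hidden integer in [0, 99]; the observation after each
    guess is 'higher', 'lower', or 'correct'; reward is 1.0 on success."""

    def reset(self, seed=None):
        self.rng = random.Random(seed)
        self.target = self.rng.randrange(100)
        return "guess a number between 0 and 99"

    def step(self, action: int):
        if action == self.target:
            return "correct", 1.0, True        # observation, reward, done
        hint = "higher" if action < self.target else "lower"
        return hint, 0.0, False

env = GuessNumberEnv()
env.reset(seed=0)
obs, reward, done = env.step(50)               # first probe of the range
```

A binary-searching agent solves this environment in at most seven steps, which makes it a convenient smoke test for an agent harness.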

Aran Komatsuzaki (@arankomatsuzaki)

Stanford presents: 

s1: Simple test-time scaling

- Seeks the simplest approach to achieve test-time scaling and strong reasoning performance
- Exceeds o1-preview on competition math questions by up to 27% (MATH and AIME24)
- Model, data, and code are open-source
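The "simple" mechanism behind s1's test-time scaling is budget forcing: if the model tries to close its reasoning before a minimum token budget is spent, the end-of-thinking delimiter is suppressed and a token like "Wait" is appended so generation continues. The sketch below shows only the control loop; `generate`, the scripted stub model, and the `</think>` delimiter are stand-ins, not the paper's actual code.

```python
# Sketch of s1-style "budget forcing" (control loop only; the model
# here is a scripted stub, and the delimiter/API are assumptions).
def budget_force(generate, prompt, min_tokens=100, end="</think>"):
    text = generate(prompt)
    # If reasoning ended too early, strip the delimiter and nudge the
    # model to keep thinking by appending "Wait,".
    while text.endswith(end) and len(text.split()) < min_tokens:
        text = generate(text[: -len(end)] + " Wait,")
    return text

def stub_generate(prompt):
    # Stand-in "model": always appends one more chunk of reasoning.
    return prompt + " ...more reasoning... </think>"

out = budget_force(stub_generate, "Q: 2+2?", min_tokens=10)
```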
xlr8harder (@xlr8harder)

A funny thing about the deepseek-v3 training cost everyone is freaking out about is that they reported comparable training efficiency in the deepseek-v2 paper in May 2024. 

172.8K GPU-hours per trillion tokens * 14.8T tokens = 2.557M GPU-hours, vs the 2.788M GPU-hours reported for v3
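The arithmetic checks out when the figures are read as GPU-hours per trillion training tokens (the v2-reported efficiency) multiplied by v3's 14.8T training tokens; a quick sanity check:

```python
# Sanity check of the tweet's arithmetic (units assumed: GPU-hours).
hours_per_trillion_tokens = 172.8e3   # 172.8K GPU-hours per T tokens
total_tokens_trillions = 14.8         # 14.8T training tokens
implied_hours = hours_per_trillion_tokens * total_tokens_trillions
print(f"{implied_hours / 1e6:.3f}M GPU-hours")  # → 2.557M GPU-hours
```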
Stefano Ermon (@stefanoermon)

Excited to share that I’ve been working on scaling up diffusion language models at Inception. A new generation of LLMs with unprecedented capabilities is coming!

Neel Nanda (@neelnanda5)

The best way to judge a forecaster is their track record. In 2021 Daniel Kokotajlo predicted o1-style models. I think we should all be very interested in the new predictions he's making in 2025!

I've read it and highly recommend - it's thought provoking and stressfully plausible
Andrej Karpathy (@karpathy)

I attended a vibe coding hackathon recently and used the chance to build a web app (with auth, payments, deploy, etc.). I tinker but I am not a web dev by background, so besides the app, I was very interested in what it's like to vibe code a full web app today. As such, I wrote

Percy Liang (@percyliang)

Wrapped up Stanford CS336 (Language Models from Scratch), taught with an amazing team: Tatsunori Hashimoto, Marcel Rød, Neil Band, Rohith Kuditipudi. Researchers are becoming detached from the technical details of how LMs work. In CS336, we try to fix that by having students build everything:

Danielle Baskin (@djbaskin)

SF's flyering game is incredible right now. There's 1000s of flyers around the city that are completely unrelated and all completely sincere

Floor Eijkelboom (@feijkelboom)

Flow Matching (FM) is one of the hottest ideas in generative AI - and it’s everywhere at #ICML2025. But what is it? And why is it so elegant? 🤔 This thread is an animated, intuitive intro into (Variational) Flow Matching - no dense math required. Let's dive in! 🧵👇
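For readers who want the one-line version before opening the thread: in the common linear-path formulation of flow matching, a point is sampled on the straight line between a noise sample x0 and a data sample x1, and a network is regressed onto the constant velocity x1 − x0. The NumPy sketch below computes only these training targets; it is a toy illustration under that linear-path assumption, not the thread's (Variational) formulation.

```python
# Toy flow-matching training targets with the linear path
# x_t = (1 - t) * x0 + t * x1, whose velocity is x1 - x0.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(loc=3.0, size=(512, 2))   # "data" samples
x0 = rng.standard_normal((512, 2))        # noise samples
t = rng.uniform(size=(512, 1))            # random times in [0, 1]

x_t = (1.0 - t) * x0 + t * x1             # point on the path at time t
v_target = x1 - x0                        # velocity the network regresses
# Training would minimize mean ||v_theta(x_t, t) - v_target||^2.
```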

Kevin Lu (@_kevinlu)

Why you should stop working on RL research and instead work on product //
The technology that unlocked the big scaling shift in AI is the internet, not transformers

I think it's well known that data is the most important thing in AI, and also that researchers choose not to work
Latent.Space (@latentspacepod)

🆕 Everything you should know about Context Engineering
youtube.com/watch?v=_IlTcW…

Works like Chroma's Context Rot research and Drew Breunig's Context Fails show a lot of issues with naive long context usage:
- Context Poisoning
- Context Distraction
- Context Confusion
- Context

Khurram Javed (@khurramjaved_96)

This is a big deal. It is the first large-scale demonstration of the advantage of real-time reinforcement learning. The recipe is scalable and requires no intervention in principle; the model can adapt forever as long as it is being used. There is no way to achieve similar

Nathan Lambert (@natolambert)

The first research on the fundamentals of character training -- i.e. applying modern post training techniques to ingrain specific character traits into models.

All models, datasets, code etc released.
Really excited about this project! Sharan was a joy to work with.
Neel Nanda (@neelnanda5)

It was great to help with this interactive tutorial on SAEs, what they can be used for, and how they work. Fantastic work by the team!