Tapan Jain (@jainnitk)'s Twitter Profile
Tapan Jain

@jainnitk

Analytics Director @GroupM

ID: 75007226

Joined: 17-09-2009 13:07:35

5.5K Tweets

418 Followers

4.4K Following

机器之心 JIQIZHIXIN (@synced_global)'s Twitter Profile Photo

🚀 Google just launched Agent Payments Protocol (AP2) — an open standard for the agent economy. Built on A2A, AP2 enables secure, reliable, and interoperable agent commerce for developers, merchants & the payments industry.

Aakash Gupta (@aakashg0)'s Twitter Profile Photo

Andrej Karpathy literally built the neural networks running inside coding assistants. He taught the world deep learning at Stanford. He ran AI at Tesla. If he feels “dramatically behind” as a programmer… that tells you everything about where we are. The confession here is …

Jen Zhu (@jenzhuscott)'s Twitter Profile Photo

This is why Andrej Karpathy will go down in history as one of the most consequential minds in AI of our time. 243 lines of ruthless compression but a FULL training + inference loop for an autoregressive transformer. I feel this is also such a genius, quiet defiance of the “AI is …

elie (@eliebakouch)'s Twitter Profile Photo

attention sink and qwen's gated attention are very similar. here's a visual explanation of why, and a recap of the different attention sink variants

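The similarity elie describes can be sketched numerically: an attention sink appends an extra logit to the softmax so probability mass can drain onto a null position, while Qwen-style gated attention scales the head's output with a sigmoid gate. Below is a minimal NumPy illustration of the two ideas, not any model's actual implementation; the function names are invented for this sketch.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attn_weights_with_sink(scores, sink_logit):
    """Softmax over token scores plus one extra 'sink' logit.

    The sink position absorbs probability mass but contributes no
    value, so the weights over real tokens sum to less than 1.
    """
    aug = np.concatenate(
        [scores, np.full(scores.shape[:-1] + (1,), sink_logit)], axis=-1
    )
    probs = softmax(aug)
    return probs[..., : scores.shape[-1]]  # drop the sink column

def gated_attn_output(attn_out, gate_logit):
    """Qwen-style output gate: a sigmoid scales the head output,
    so the head can likewise shrink its contribution toward zero."""
    gate = 1.0 / (1.0 + np.exp(-gate_logit))
    return gate * attn_out
```

In both mechanisms a head can push its effective contribution toward zero, which is the shared behavior behind the comparison.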
Sebastian Raschka (@rasbt)'s Twitter Profile Photo

While waiting for DeepSeek V4, we got two very strong open-weight LLMs from India yesterday. They come in two sizes, Sarvam 30B and Sarvam 105B (both reasoning models). Interestingly, the smaller 30B model uses “classic” Grouped Query Attention (GQA), whereas the …

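For reference, the “classic” GQA mentioned in the tweet fits in a few lines: groups of query heads share a single key/value head, which shrinks the KV cache. This is an illustrative NumPy sketch of the general technique, not the Sarvam models' actual implementation.

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """q: (n_q_heads, T, d); k, v: (n_kv_heads, T, d).

    Each group of query heads shares one KV head, shrinking the KV
    cache by a factor of n_q_heads / n_kv_heads.
    """
    n_q_heads, T, d = q.shape
    group_size = n_q_heads // n_kv_heads
    k = np.repeat(k, group_size, axis=0)  # expand KV heads to match query heads
    v = np.repeat(v, group_size, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = e / e.sum(axis=-1, keepdims=True)
    return weights @ v
```

With `n_kv_heads == n_q_heads` this reduces to standard multi-head attention; with `n_kv_heads == 1` it becomes multi-query attention.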
Andrej Karpathy (@karpathy)'s Twitter Profile Photo

I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:

- the human iterates on the …

Chen Liang (@crazydonkey200)'s Twitter Profile Photo

Andrej Karpathy Very inspiring as always! We are also open sourcing part of our infra on automated research for Gemini to evolve itself at github.com/google-deepmin… More complex than the nanochat setup but closer to SOTA LLM pre/post-training while staying as minimal as possible. More on the way.

AVB (@neural_avb)'s Twitter Profile Photo

Yo check out this guy's blogs 🫡 He is regularly writing very cool, very high-effort technical articles with clean diagrams and really good topic coverage of LLM internals. mesuvash.github.io/blog/

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

The next step for autoresearch is that it has to be asynchronously massively collaborative for agents (think: SETI@home style). The goal is not to emulate a single PhD student, it's to emulate a research community of them. Current code synchronously grows a single thread of …

Ethan He (@ethanhe_42)'s Twitter Profile Photo

My last open-source project before joining xAI is just out today. Megatron Core MoE is probably the best open framework out there to seriously train mixture of experts at scale. It achieves 1233 TFLOPS/GPU for DeepSeek-V3-685B. arxiv.org/abs/2603.07685

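The core mechanism such frameworks train at scale, top-k expert routing, can be sketched for a single token in a few lines. This is an illustrative NumPy sketch of generic MoE routing, unrelated to Megatron Core's actual implementation; all names here are invented.

```python
import numpy as np

def topk_moe_layer(x, gate_w, expert_ws, k=2):
    """Route one token x (shape (d,)) to its top-k experts.

    gate_w: (n_experts, d) router weights; expert_ws: list of (d, d)
    expert weight matrices. Expert outputs are mixed by gate
    probabilities renormalized over the chosen experts.
    """
    logits = gate_w @ x
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    p = np.exp(logits[top] - logits[top].max())
    p /= p.sum()                              # renormalize over chosen experts
    return sum(pi * (expert_ws[i].T @ x) for pi, i in zip(p, top))
```

Because only k of the experts run per token, the layer's parameter count can grow far faster than its per-token compute, which is what makes training models like DeepSeek-V3 tractable.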
Andrew Jiang 🛡️ (@andrewjiang)'s Twitter Profile Photo

The brilliance of Andrej Karpathy is being able to distill vastly complex concepts and make them simple to understand and implement at a small scale. All it took was Claude Code and $10 on Runpod to spin up a single H100, and I had a world class ML researcher working on autopilot.

Sebastian Raschka (@rasbt)'s Twitter Profile Photo

Another week, another noteworthy open-weight LLM release. Nvidia’s Nemotron 3 Super 120B-A12B looks pretty good. Benchmarks are on par with Qwen3.5 122B and GPT-OSS 120B, but the throughput is great! Below is a short, visual architecture rundown.

Andrew Ng (@andrewyng)'s Twitter Profile Photo

Should there be a Stack Overflow for AI coding agents to share learnings with each other? Last week I announced Context Hub (chub), an open CLI tool that gives coding agents up-to-date API documentation. Since then, our GitHub repo has gained over 6K stars, and we've scaled from …

Rohan Paul (@rohanpaul_ai)'s Twitter Profile Photo

Alibaba just open-sourced OpenSandbox (a general-purpose execution environment) to give AI agents an isolated environment to run code safely. 8k+ GitHub stars ⭐️ This stops your AI-agent-based applications from accessing your actual host infrastructure. By removing the …
