Tapan Jain (@jainnitk)'s Twitter Profile
Tapan Jain

@jainnitk

Analytics Director @GroupM

ID: 75007226

Joined: 17-09-2009 13:07:35

5.5K Tweets

418 Followers

4.4K Following

机器之心 JIQIZHIXIN (@synced_global)'s Twitter Profile Photo

🚀 Google just launched Agent Payments Protocol (AP2) — an open standard for the agent economy. Built on A2A, AP2 enables secure, reliable, and interoperable agent commerce for developers, merchants & the payments industry.

Aakash Gupta (@aakashg0)'s Twitter Profile Photo

Andrej Karpathy literally built the neural networks running inside coding assistants. He taught the world deep learning at Stanford. He ran AI at Tesla. If he feels “dramatically behind” as a programmer… that tells you everything about where we are. The confession here is …

Jen Zhu (@jenzhuscott)'s Twitter Profile Photo

This is why Andrej Karpathy will go down in history as one of the most consequential minds in AI of our time. 243 lines of ruthless compression but a FULL training + inference loop for an autoregressive transformer. I feel this is also such a genius, quiet defiance of the “AI is …

elie (@eliebakouch)'s Twitter Profile Photo

attention sink and qwen's gated attention are very similar. here's a visual explanation of why, and a recap of the different attention sink variants

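The similarity elie describes can be sketched numerically: an attention sink appends an extra logit to the softmax so probability mass can drain onto a null position, while Qwen-style gated attention scales the head's output with a sigmoid gate. Below is a minimal NumPy illustration of the two ideas, not any model's actual implementation; the function names are invented for this sketch.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attn_weights_with_sink(scores, sink_logit):
    """Softmax over token scores plus one extra 'sink' logit.

    The sink position absorbs probability mass but contributes no
    value, so the weights over real tokens sum to less than 1.
    """
    aug = np.concatenate(
        [scores, np.full(scores.shape[:-1] + (1,), sink_logit)], axis=-1
    )
    probs = softmax(aug)
    return probs[..., : scores.shape[-1]]  # drop the sink column

def gated_attn_output(attn_out, gate_logit):
    """Qwen-style output gate: a sigmoid scales the head output,
    so the head can likewise shrink its contribution toward zero."""
    gate = 1.0 / (1.0 + np.exp(-gate_logit))
    return gate * attn_out
```

In both mechanisms a head can push its effective contribution toward zero, which is the shared behavior behind the comparison.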
Sebastian Raschka (@rasbt)'s Twitter Profile Photo

While waiting for DeepSeek V4, we got two very strong open-weight LLMs from India yesterday. They come in two sizes, Sarvam 30B and Sarvam 105B (both reasoning models). Interestingly, the smaller 30B model uses “classic” Grouped Query Attention (GQA), whereas the …

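For reference, the “classic” GQA mentioned in the tweet fits in a few lines: groups of query heads share a single key/value head, which shrinks the KV cache. This is an illustrative NumPy sketch of the general technique, not the Sarvam models' actual implementation.

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """q: (n_q_heads, T, d); k, v: (n_kv_heads, T, d).

    Each group of query heads shares one KV head, shrinking the KV
    cache by a factor of n_q_heads / n_kv_heads.
    """
    n_q_heads, T, d = q.shape
    group_size = n_q_heads // n_kv_heads
    k = np.repeat(k, group_size, axis=0)  # expand KV heads to match query heads
    v = np.repeat(v, group_size, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = e / e.sum(axis=-1, keepdims=True)
    return weights @ v
```

With `n_kv_heads == n_q_heads` this reduces to standard multi-head attention; with `n_kv_heads == 1` it becomes multi-query attention.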
Andrej Karpathy (@karpathy)'s Twitter Profile Photo

I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:

- the human iterates on the …

Chen Liang (@crazydonkey200)'s Twitter Profile Photo

Andrej Karpathy Very inspiring as always! We are also open sourcing part of our infra on automated research for Gemini to evolve itself at github.com/google-deepmin… More complex than the nanochat setup but closer to SOTA LLM pre/post-training while staying as minimal as possible. More on the way.

AVB (@neural_avb)'s Twitter Profile Photo

Yo check out this guy's blogs 🫡 He is regularly writing very cool, very high-effort technical articles with clean diagrams and really good topic coverage of LLM internals. mesuvash.github.io/blog/

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

The next step for autoresearch is that it has to be asynchronously massively collaborative for agents (think: SETI@home style). The goal is not to emulate a single PhD student, it's to emulate a research community of them. Current code synchronously grows a single thread of …

Ethan He (@ethanhe_42)'s Twitter Profile Photo

My last open-source project before joining xAI is just out today. Megatron Core MoE is probably the best open framework out there to seriously train mixture of experts at scale. It achieves 1233 TFLOPS/GPU for DeepSeek-V3-685B. arxiv.org/abs/2603.07685

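The core mechanism such frameworks train at scale, top-k expert routing, can be sketched for a single token in a few lines. This is an illustrative NumPy sketch of generic MoE routing, unrelated to Megatron Core's actual implementation; all names here are invented.

```python
import numpy as np

def topk_moe_layer(x, gate_w, expert_ws, k=2):
    """Route one token x (shape (d,)) to its top-k experts.

    gate_w: (n_experts, d) router weights; expert_ws: list of (d, d)
    expert weight matrices. Expert outputs are mixed by gate
    probabilities renormalized over the chosen experts.
    """
    logits = gate_w @ x
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    p = np.exp(logits[top] - logits[top].max())
    p /= p.sum()                              # renormalize over chosen experts
    return sum(pi * (expert_ws[i].T @ x) for pi, i in zip(p, top))
```

Because only k of the experts run per token, the layer's parameter count can grow far faster than its per-token compute, which is what makes training models like DeepSeek-V3 tractable.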
Andrew Jiang 🛡️ (@andrewjiang)'s Twitter Profile Photo

The brilliance of Andrej Karpathy is being able to distill vastly complex concepts and make them simple to understand and implement at a small scale. All it took was Claude Code and $10 on Runpod to spin up a single H100, and I had a world class ML researcher working on autopilot.

Sebastian Raschka (@rasbt)'s Twitter Profile Photo

Another week, another noteworthy open-weight LLM release. Nvidia’s Nemotron 3 Super 120B-A12B looks pretty good. Benchmarks are on par with Qwen3.5 122B and GPT-OSS 120B, but the throughput is great! Below is a short, visual architecture rundown.

Andrew Ng (@andrewyng)'s Twitter Profile Photo

Should there be a Stack Overflow for AI coding agents to share learnings with each other? Last week I announced Context Hub (chub), an open CLI tool that gives coding agents up-to-date API documentation. Since then, our GitHub repo has gained over 6K stars, and we've scaled from …

Rohan Paul (@rohanpaul_ai)'s Twitter Profile Photo

Alibaba just open-sourced OpenSandbox (a general-purpose execution environment) to give AI agents an isolated environment to run code safely. 8k+ GitHub stars ⭐️ This stops your AI-agent-based applications from accessing your actual host infrastructure. By removing the …
