Jerry Liu (@jerrywliu) 's Twitter Profile
Jerry Liu

@jerrywliu

ML & numerics | ICME PhD at Stanford, @doecsgf fellow | prev @duolingo @berkeleylab @livermore_lab

ID: 1531026118857342976

Joined: 29-05-2022 21:35:46

23 Tweets

57 Followers

330 Following

Simran Arora (@simran_s_arora) 's Twitter Profile Photo

BASED ✌️ turns 1! One year since its launch at NeurIPS 2023, and it's helped shape the new wave of efficient LMs.
⚡️ Fastest linear attention kernels
🧠 405B models trained on 16 GPUs
💥 Inspired Mamba-v2, RWKVs, MiniMax
Check out our retrospective below!

hazyresearch (@hazyresearch) 's Twitter Profile Photo

The Great American AI Race. I wrote something about how we need a holistic AI effort from academia, industry, and the US government to have the best shot at a freer, better educated, and healthier world in AI. I’m a mega bull on the US and open source AI. Maybe we’re cooking

Dan Fu (@realdanfu) 's Twitter Profile Photo

Super excited to share Chipmunk 🐿️ - training-free acceleration of diffusion transformers (video, image generation) with dynamic attention & MLP sparsity! Led by Austin Silveria and soham: 3.7x faster video gen, 1.6x faster image gen. Kernels written in TK ⚡️🐱 1/

Silas Alberti (@silasalberti) 's Twitter Profile Photo

we built DeepWiki, a free encyclopedia of all GitHub repos
some numbers:
- 30k repos already indexed
- processed 4 billion+ lines of code
- the indexing alone cost $300k+ in compute spend

Simon Guo 🦝 (@simonguozirui) 's Twitter Profile Photo

Will be presenting KernelBench 🍿 at #ICLR2025 workshops! Come find me at:
Sunday - 🗞️ Poster at Scaling Self-Improving Foundation Models workshop, Garnet 214-215
Monday - ✨ Spotlight (Best Paper) talk at Deep Learning for Code workshop, 11:50AM, Garnet 218-219

Xingyu Zhu (@xingyuzhu_) 's Twitter Profile Photo

Happy to share that our work has been accepted to #ICML2025 (ICML Conference) as a 🚨Spotlight Poster🚨! In this paper we discuss how and why the ICL capabilities of LLMs can benefit their in-weight learning. See you in Vancouver 🇨🇦!

Avanika Narayan (@avanika15) 's Twitter Profile Photo

can you chat privately with a cloud llm—*without* sacrificing speed? excited to release minions secure chat: an open-source protocol for end-to-end encrypted llm chat with <1% latency overhead (even @ 30B+ params!). cloud providers can’t peek—messages decrypt only inside a

Dan Biderman (@dan_biderman) 's Twitter Profile Photo

We secure all communications with a cloud-hosted LLM, running on an H100 in confidential mode. Latency overhead goes away once you cross the 10B model size. This is our first foray into applied cryptography -- help us refine our ideas.

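The setup described above (a shared key between the client and a GPU running in confidential mode, so the cloud provider cannot read messages) can be illustrated with a toy encrypt-then-MAC roundtrip. This is a stdlib-only sketch for intuition, not the actual minions secure chat protocol or production-grade cryptography; a real implementation would use an AEAD cipher such as AES-GCM plus an attested key exchange with the enclave.

```python
import hashlib
import hmac
import os

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Hash key || nonce || counter to produce a keystream (toy stand-in for AES-CTR)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt-then-MAC: returns nonce || ciphertext || tag."""
    nonce = os.urandom(16)
    ct = bytes(a ^ b for a, b in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def open_sealed(key: bytes, blob: bytes) -> bytes:
    """Verify the tag, then decrypt; raises if the message was tampered with."""
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, nonce, len(ct))))

# Chat roundtrip: only holders of the shared key (client and enclave) can read messages.
key = os.urandom(32)
blob = seal(key, b"what is the capital of France?")
assert open_sealed(key, blob) == b"what is the capital of France?"
```

The per-message cost here is symmetric crypto only, which is consistent with the claim that latency overhead becomes negligible once model compute dominates.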
William Gilpin (@wgilpin0) 's Twitter Profile Photo

We present Panda: a foundation model for nonlinear dynamics pretrained on 20,000 chaotic ODEs discovered via evolutionary search. Panda delivers best-in-class zero-shot forecasts on unseen ODEs, and can forecast PDEs despite never having seen them during training (1/8) arxiv.org/abs/2505.13755

Stella Li (@stellalisy) 's Twitter Profile Photo

🤯 We cracked RLVR with... Random Rewards?!
Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵
Blogpost: tinyurl.com/spurious-rewar…

Benjamin F Spector (@bfspector) 's Twitter Profile Photo

(1/5) We’ve never enjoyed watching people chop Llamas into tiny pieces. So, we’re excited to be releasing our Low-Latency-Llama Megakernel! We run the whole forward pass in a single kernel. Megakernels are faster & more humane. Here’s how to treat your Llamas ethically: (Joint

Alex Ratner (@ajratner) 's Twitter Profile Photo

Agentic AI will transform every enterprise, but only if agents are trusted experts. The key: evaluation & tuning on specialized, expert data. I’m excited to announce two new products to support this, Snorkel AI Evaluate & Expert Data-as-a-Service, along w/ our $100M Series D!

ollama (@ollama) 's Twitter Profile Photo

3 months ago, Stanford's Hazy Research lab introduced Minions, a project that connects Ollama to frontier cloud models to reduce cloud costs by 5-30x while achieving 98% of frontier model accuracy. Secure Minion turns an H100 into a secure enclave, where all memory and

Jordan Juravsky (@jordanjuravsky) 's Twitter Profile Photo

Happy Throughput Thursday! We’re excited to release Tokasaurus: an LLM inference engine designed from the ground up for high-throughput workloads with large and small models. (Joint work with Ayush Chakravarthy, Ryan Ehrlich, Sabri Eyuboglu, Bradley Brown, Joseph Shetaye,
