Genghan Zhang (@zhang677)'s Twitter Profile
Genghan Zhang

@zhang677

ID: 1704996753160978432

Joined: 21-09-2023 23:11:29

22 Tweets

92 Followers

107 Following

Mo Tiwari (@mo_tiwari)

Thrilled that a paper from my PhD, "Faster Maximum Inner Product Search in High Dimensions" has been accepted to ICML 2024!

In the paper, we accelerate the state of the art for the Maximum Inner Product Search (MIPS) problem. MIPS is a core subroutine in systems like recommendation
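
For context, MIPS can be stated as: given a query vector and a database of item vectors, return the item with the largest inner product. A minimal brute-force sketch of the problem (illustrative only; not the paper's accelerated method):

```python
# Minimal brute-force MIPS sketch (for context only; not the paper's method).
# Given a query q and a database of item vectors A, MIPS returns the row of A
# with the largest inner product <a_i, q>.
import numpy as np

def mips_bruteforce(A: np.ndarray, q: np.ndarray) -> int:
    """Return the index of the database vector maximizing the inner product with q."""
    scores = A @ q          # one inner product per database vector, O(n * d)
    return int(np.argmax(scores))

# Toy usage: 1000 item embeddings in 64 dimensions, one query embedding.
rng = np.random.default_rng(0)
A = rng.normal(size=(1000, 64))
q = rng.normal(size=64)
print(mips_bruteforce(A, q))
```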
Karan Dalal (@karansdalal)

I’m excited to share a project I’ve been working on for over a year, which I believe will fundamentally change our approach to language models.

We’ve designed a new architecture, which replaces the hidden state of an RNN with a machine learning model. This model compresses
Xiaolong Wang (@xiaolonw)

Cannot believe this finally happened! Over the last 1.5 years, we have been developing a new LLM architecture, with linear complexity and expressive hidden states, for long-context modeling. The following plots show that our model trained on Books scales better (from 125M to 1.3B)
Jiarui Xu (@jerry_xu_jiarui)

TTT can model long sequences with linear time complexity. It's a drop-in upgrade for sequence modeling operators like self-attention. It has been super fun to work on TTT with the amazing team! Code is available: github.com/test-time-trai…
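
As these threads describe it, the core TTT idea is that the hidden state is itself the weights of a small inner model, updated by a gradient step on a self-supervised loss as each token arrives. A hedged sketch of that mechanism; the loss, update rule, and readout below are illustrative assumptions, not the released implementation:

```python
# Hedged sketch of a TTT-style recurrent layer: the "hidden state" is the weight
# matrix W of a tiny inner model, updated online by gradient descent on a
# self-supervised reconstruction loss. Details (loss, projections, learning
# rate) are illustrative assumptions, not the paper's exact design.
import numpy as np

def ttt_layer(xs: np.ndarray, d: int, lr: float = 0.1) -> np.ndarray:
    """xs: (T, d) token features. Returns (T, d) outputs with cost linear in T."""
    W = np.zeros((d, d))                      # hidden state = inner model weights
    outputs = []
    for x in xs:                              # single pass over the sequence
        pred = W @ x                          # inner model's prediction for this token
        grad = np.outer(pred - x, x)          # gradient of 0.5 * ||W x - x||^2 w.r.t. W
        W = W - lr * grad                     # test-time training step compresses context into W
        outputs.append(W @ x)                 # read out with the updated state
    return np.stack(outputs)

# Toy usage: 16 tokens of dimension 8.
out = ttt_layer(np.random.default_rng(0).normal(size=(16, 8)), d=8)
print(out.shape)  # (16, 8)
```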

Anne Ouyang (@anneouyang)

Kernels are the kernel of deep learning.
🙃...but writing kernels sucks.
Can LLMs help? 🤔

Introducing 🌽 KernelBench (Preview), a new coding benchmark designed to evaluate the ability of LLMs to generate ⚡️efficient💨 GPU kernels for optimizing neural network performance.
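
The evaluation described above amounts to checking whether a model-generated kernel matches a reference PyTorch implementation numerically and runs faster. A hedged sketch of what such a check might look like; the function names and tolerance are assumptions, not KernelBench's actual harness:

```python
# Hedged sketch of a correctness + speedup check for an LLM-generated kernel,
# in the spirit of the benchmark described above. NOT KernelBench's actual
# harness; names, tolerance, and timing loop are illustrative assumptions.
import time
import torch

def check_candidate(reference_fn, candidate_fn, make_inputs, atol=1e-4, trials=20):
    """Compare a candidate kernel against a reference on correctness and wall-clock speed."""
    inputs = make_inputs()
    ref_out = reference_fn(*inputs)
    cand_out = candidate_fn(*inputs)
    correct = torch.allclose(ref_out, cand_out, atol=atol)

    def bench(fn):
        start = time.perf_counter()
        for _ in range(trials):
            fn(*inputs)
        return (time.perf_counter() - start) / trials

    speedup = bench(reference_fn) / bench(candidate_fn)
    return correct, speedup

# Toy usage: the "candidate" is just an equivalent rewrite of the reference op.
ref = lambda x: torch.relu(x) * 2.0
cand = lambda x: torch.clamp(x, min=0.0).mul_(2.0)
print(check_candidate(ref, cand, lambda: (torch.randn(1024, 1024),)))
```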
Allen Nie (🇺🇦☮️) (@allen_a_nie)

Wow. Nice timing. Anjiang Wei @ EMNLP25 and I just released a new version of our paper arxiv.org/pdf/2410.15625. LLM Agents show surprising exploration/sample efficiency (almost 100x faster than UCB bandit) in optimizing system code. A good domain for coding agents🤔😁

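For reference, the UCB bandit baseline mentioned above picks which candidate to measure next using an optimism bonus over observed rewards. A minimal UCB1 sketch (illustrative only; not the paper's experimental setup):

```python
# Minimal UCB1 bandit sketch, the kind of baseline the tweet compares against:
# each "arm" is a candidate code configuration and the reward is a (noisy)
# performance measurement. Illustrative only; not the paper's setup.
import math
import random

def ucb1(reward_fns, budget=200, c=2.0):
    """reward_fns: list of zero-arg callables returning a stochastic reward."""
    n = len(reward_fns)
    counts = [0] * n
    means = [0.0] * n
    for t in range(1, budget + 1):
        if t <= n:
            arm = t - 1                               # play every arm once first
        else:
            arm = max(range(n),
                      key=lambda i: means[i] + math.sqrt(c * math.log(t) / counts[i]))
        r = reward_fns[arm]()
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # running-mean update
    return max(range(n), key=lambda i: means[i])      # best arm found under the budget

# Toy usage: the third arm has the highest expected reward.
arms = [lambda m=m: random.gauss(m, 0.1) for m in (0.2, 0.5, 0.9)]
print(ucb1(arms))
```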
Genghan Zhang (@zhang677)

Excited to see the “self-improvement” idea also works on theorem proving, another application that requires complex reasoning with limited data

Weixin Liang (@liang_weixin)

📢 Can LLMs program themselves to run faster? 🏃⏱️ 

LLM self-taught to code for next-gen AI hardware!
arxiv.org/abs/2502.02534

1/ Programming AI accelerators is a major bottleneck in ML. Our self-improving LLM agent learns to write optimized code for new hardware, achieving 3.9x
Weixin Liang (@liang_weixin)

🚀 Want 2x faster pretraining for your multi-modal LLM?

🧵 Following up on Mixture-of-Transformers (MoT), we're excited to share Mixture-of-Mamba (MoM)!
arxiv.org/abs/2501.16295

🔥 Why it matters: MoM applies modality-aware sparsity across image, text, and speech—making
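
As described above, modality-aware sparsity roughly means each token is processed by parameters dedicated to its modality rather than by one shared dense block. A toy sketch of that routing idea; this is not the Mixture-of-Mamba layer itself:

```python
# Toy sketch of modality-aware parameter routing: each token uses a projection
# dedicated to its modality (text / image / speech) instead of one shared dense
# weight. Illustrates the sparsity idea only; not the MoM architecture.
import torch
import torch.nn as nn

class ModalityRoutedLinear(nn.Module):
    def __init__(self, d_model: int, modalities=("text", "image", "speech")):
        super().__init__()
        self.proj = nn.ModuleDict({m: nn.Linear(d_model, d_model) for m in modalities})

    def forward(self, x: torch.Tensor, modality_ids: list[str]) -> torch.Tensor:
        # x: (T, d_model); modality_ids: one modality label per token.
        out = torch.empty_like(x)
        for t, m in enumerate(modality_ids):
            out[t] = self.proj[m](x[t])   # each token only touches its modality's weights
        return out

# Toy usage: 4 tokens, alternating text and image.
layer = ModalityRoutedLinear(d_model=16)
x = torch.randn(4, 16)
print(layer(x, ["text", "image", "text", "image"]).shape)  # torch.Size([4, 16])
```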
Anne Ouyang (@anneouyang)

New blog post from Nvidia: LLM-generated GPU kernels showing speedups over FlexAttention and achieving 100% numerical correctness on 🌽KernelBench Level 1
Simon Guo 🦝 (@simonguozirui)

LLMs for GPU kernel🌽generation have been getting Pop🍿ular since our preview last Dec; excited to announce 📢 our full paper 📃 for KernelBench!

Turns out KernelBench is quite challenging 🧠 — frontier models outperform the PyTorch Eager baseline <20% of the time.

More 🧵👇
Anjiang Wei (@anjiangw)

We introduce CodeARC, a new benchmark for evaluating LLMs’ inductive reasoning. Agents must synthesize functions from I/O examples—no natural language, just reasoning.
📄 arxiv.org/pdf/2503.23145
💻 github.com/Anjiang-Wei/Co…
🌐 anjiang-wei.github.io/CodeARC-Websit…
#LLM #Reasoning #LLM4Code #ARC
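
The setup described above reduces to a simple check: does a synthesized function reproduce the hidden target function on its input/output examples? A minimal sketch of that check; the example task below is invented for illustration and is not from CodeARC:

```python
# Minimal sketch of checking a synthesized function against I/O examples,
# in the spirit of inductive program synthesis benchmarks like the one above.
# The example task is invented for illustration; it is not from CodeARC.
def passes_io_examples(candidate, io_pairs) -> bool:
    """Return True if candidate(x) == y for every (x, y) example."""
    try:
        return all(candidate(x) == y for x, y in io_pairs)
    except Exception:
        return False   # crashes count as failures

# Hidden target: f(x) = 2 * x + 1, revealed only through examples.
io_pairs = [(0, 1), (1, 3), (5, 11)]

good = lambda x: 2 * x + 1
bad = lambda x: x + 1
print(passes_io_examples(good, io_pairs), passes_io_examples(bad, io_pairs))  # True False
```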