Kevin Xu (@kevin671xu)'s Twitter Profile
Kevin Xu

@kevin671xu

Ph.D. student at the University of Tokyo / Deep Learning, Neural Networks, Transformers

ID: 1496180074856730625

Link: https://sites.google.com/g.ecc.u-tokyo.ac.jp/kevinxu · Joined: 22-02-2022 17:48:56

13 Tweets

27 Followers

66 Following

enjoy my life (@issei_sato)'s Twitter Profile Photo

We have released a paper by Kevin (a D1, first-year Ph.D. student) giving a theoretical analysis of the expressive power of Looped Transformers. We show that increasing the number of loops increases expressive power. Moreover, the theory also yields a performance-improvement scheme, and we confirmed in experiments that it actually improves performance.

Yufan Zhuang (@yufan_zhuang)'s Twitter Profile Photo

First practical Looped Transformer?!😺

Latest research from Google DeepMind proposes relaxed recursive transformers. It works by:

1. Take out layers from a pretrained LM
2. Group them together and add LoRA
3. Decode with early exit

Able to recover 90% of Gemma-2B's performance!

Paper:
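As a rough illustration of the three-step recipe above: one shared block stands in for a group of layers lifted out of a pretrained model, and each loop iteration gets its own small LoRA correction so the tied weights are only loosely shared. A minimal PyTorch sketch with toy dimensions (illustrative only, not DeepMind's implementation; the early-exit decoding of step 3 is only indicated in a comment):

```python
# Sketch of a "relaxed recursive" block: frozen shared layers + per-loop LoRA.
# Toy dimensions; illustrative only, not DeepMind's implementation.
import torch
import torch.nn as nn

class LoRAResidual(nn.Module):
    """Trainable low-rank update that relaxes the weight tying for one loop step."""
    def __init__(self, d_model: int, rank: int = 8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(d_model, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, d_model))

    def forward(self, x):
        return x @ self.A @ self.B

class RecursiveBlock(nn.Module):
    def __init__(self, d_model=64, nhead=4, n_loops=3, rank=8):
        super().__init__()
        # stands in for a group of layers taken out of a pretrained LM (step 1)
        self.shared = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        for p in self.shared.parameters():
            p.requires_grad_(False)  # pretrained weights stay frozen
        # one LoRA adapter per loop iteration (step 2)
        self.loras = nn.ModuleList([LoRAResidual(d_model, rank) for _ in range(n_loops)])
        self.n_loops = n_loops

    def forward(self, x):
        for i in range(self.n_loops):
            x = self.shared(x) + self.loras[i](x)
            # step 3 (early-exit decoding) would break here once an exit head is confident
        return x

x = torch.randn(2, 10, 64)        # (batch, seq, d_model)
print(RecursiveBlock()(x).shape)  # torch.Size([2, 10, 64])
```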
Google DeepMind (@googledeepmind)'s Twitter Profile Photo

Today in Nature, we’re presenting GenCast: our new AI weather model which gives us the probabilities of different weather conditions up to 15 days ahead with state-of-the-art accuracy. ☁️⚡

Here’s how the technology works. 🧵goo.gle/49trAOv
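The thread itself isn't reproduced here, but the "probabilities of different weather conditions" framing is standard generative ensemble forecasting: sample many plausible trajectories from the model and read event probabilities off the member fractions. A toy numpy sketch with a dummy sampler standing in for GenCast:

```python
# Ensemble probabilities from sampled forecasts (dummy sampler, not GenCast).
import numpy as np

rng = np.random.default_rng(0)

def sample_forecast(n_days: int = 15) -> np.ndarray:
    """Hypothetical stand-in for one sampled trajectory of daily rainfall (mm)."""
    return np.maximum(rng.normal(loc=2.0, scale=3.0, size=n_days), 0.0)

ensemble = np.stack([sample_forecast() for _ in range(50)])  # (members, days)
p_rain = (ensemble > 1.0).mean(axis=0)                       # P(rain > 1 mm) per day
print(np.round(p_rain, 2))
```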
enjoy my life (@issei_sato)'s Twitter Profile Photo

The following paper has been accepted to ICML 2025!
Benign Overfitting in Token Selection of Attention Mechanism
Keitaro Sakamoto, Issei Sato
arxiv.org/abs/2409.17625

enjoy my life (@issei_sato)'s Twitter Profile Photo

The following paper has been accepted to ICML 2025!
On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
Kevin Xu, Issei Sato
arxiv.org/abs/2410.01405

Kevin Xu (@kevin671xu)'s Twitter Profile Photo

🎉 Our paper on "the expressive power of Looped Transformers" was accepted at #ICML2025!
To the best of our knowledge, this is the first study to analyze their function approximation capabilities, including approximation rates and universality. 
arxiv.org/abs/2410.01405
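For context on the object being analyzed: a looped transformer applies a single weight-tied block repeatedly, so effective depth comes from the loop count rather than from distinct layers; the paper's proposed enhancement additionally encodes the loop timestep. A minimal PyTorch sketch (toy dimensions; the per-step embedding is a crude stand-in for the paper's timestep encoding, not its exact construction):

```python
# Minimal looped transformer: one shared block applied n_loops times.
import torch
import torch.nn as nn

class LoopedTransformer(nn.Module):
    def __init__(self, d_model=64, nhead=4, n_loops=8):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        # crude stand-in for the paper's timestep encoding: one vector per loop index
        self.step_emb = nn.Embedding(n_loops, d_model)
        self.n_loops = n_loops

    def forward(self, x):
        for t in range(self.n_loops):
            x = self.block(x + self.step_emb.weight[t])  # same weights, step-aware input
        return x

x = torch.randn(2, 16, 64)           # (batch, seq, d_model)
print(LoopedTransformer()(x).shape)  # torch.Size([2, 16, 64])
```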
Google DeepMind (@googledeepmind)'s Twitter Profile Photo

We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO
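Gemini Diffusion's internals aren't public, but the "refining noise, step-by-step" idea is the generic diffusion sampling loop: start from noise and repeatedly apply a learned denoiser. A deliberately schematic sketch with an untrained stand-in denoiser:

```python
# Schematic diffusion-style refinement loop (untrained stand-in; not Gemini Diffusion).
import torch
import torch.nn as nn

denoiser = nn.Linear(64, 64)    # stand-in for a learned denoising network
x = torch.randn(1, 16, 64)      # start from pure noise
for step in range(10):          # refine step-by-step
    x = x - 0.1 * denoiser(x)   # subtract a fraction of the predicted noise
# a text model would finally map the refined x to token logits and decode
print(x.shape)
```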

enjoy my life (@issei_sato)'s Twitter Profile Photo

We have published the following preprint on the analysis of LLMs using computational complexity theory.
To CoT or To Loop? A Formal Comparison Between Chain-of-Thought and Looped Transformers
Kevin Xu, Issei Sato
arxiv.org/abs/2505.19245
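Operationally, the comparison is between two ways of buying extra computation: CoT emits intermediate tokens that later passes re-read, while a looped model iterates the same block on the hidden state without emitting anything. A schematic contrast with dummy callables (not the paper's formal constructions):

```python
# CoT vs. looped computation, schematically (dummy callables, not the paper's proofs).
from typing import Callable, List

def run_cot(step: Callable[[List[str]], str], prompt: List[str], n_steps: int) -> List[str]:
    """Chain-of-thought: each pass appends a visible token that later passes can read."""
    tokens = list(prompt)
    for _ in range(n_steps):
        tokens.append(step(tokens))  # extra compute lives in the growing transcript
    return tokens

def run_looped(block: Callable[[list], list], hidden: list, n_loops: int) -> list:
    """Looped model: the same block re-processes the latent state; nothing is emitted."""
    for _ in range(n_loops):
        hidden = block(hidden)       # extra compute lives in the hidden state
    return hidden

print(run_cot(lambda ts: f"step{len(ts)}", ["Q"], 3))       # ['Q', 'step1', 'step2', 'step3']
print(run_looped(lambda h: [v + 1 for v in h], [0, 0], 3))  # [3, 3]
```

As the authors' tweets below note, the paper also uses CoT's stochastic sampling to characterize cases where CoT has the advantage.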

enjoy my life (@issei_sato)'s Twitter Profile Photo

We have released Kevin's paper. Using computational complexity theory, we analyzed the classes of problems solvable by LLMs with recursive loops and by LLMs that use CoT.

Kevin Xu (@kevin671xu)'s Twitter Profile Photo

Starting from the theoretical analysis of CoT, which extends the computational class through its recursive structure, we compared it with reasoning in latent space (Looped models), which has drawn attention recently. By focusing on the stochastic nature of CoT, we also characterize cases where CoT has the advantage. (No, I never took a Shakespeare course.)
Kevin Xu (@kevin671xu)'s Twitter Profile Photo

It's an international conference, but my only exposure to English these past few years has been Friends, so I'm scared I'll turn into some weird guy whose tone suddenly goes casual.

Kevin Xu (@kevin671xu)'s Twitter Profile Photo

🚀 Are Looped Transformers universal approximators?

📍 Come see our poster at #ICML2025
🗓️ Wed, July 16, 3:00–5:30 a.m.
📌 East Exhibition Hall A-B #E-3411
📄 Paper: openreview.net/forum?id=H4Buh…
Let’s talk about the power of loops 🔁
Kevin Xu (@kevin671xu)'s Twitter Profile Photo

It explains a lot of things that (at least for me) were impossible to understand from the equations of diffusion models alone; I learned a lot from it. sander.ai

Andrew Ng (@andrewyng)'s Twitter Profile Photo

Releasing a new "Agentic Reviewer" for research papers. I started coding this as a weekend project, and Yixing Jiang made it much better.

I was inspired by a student who had a paper rejected 6 times over 3 years. Their feedback loop -- waiting ~6 months for feedback each time -- was
PLaMo LLM (@plamollm)'s Twitter Profile Photo

The PDF translation feature of PLaMo™ Translate, whose release had been postponed, is now available as of today! 🎉 Apologies for the long wait 🙇‍♂️ We focus on English-to-Japanese translation of documents such as papers: files are translated into readable Japanese while keeping the original layout. Please give it a try!! translate.preferredai.jp