Kevin Xu (@kevin671xu)'s Twitter Profile
Kevin Xu

@kevin671xu

Ph.D. student at the University of Tokyo / Deep Learning, Neural Networks, Transformers

ID: 1496180074856730625

Link: https://sites.google.com/g.ecc.u-tokyo.ac.jp/kevinxu · Joined: 22-02-2022 17:48:56

13 Tweets

27 Followers

66 Following

enjoy my life (@issei_sato)'s Twitter Profile Photo

We have released a paper by Kevin (a D1, first-year Ph.D. student) giving a theoretical analysis of the expressive power of Looped Transformers. We show that increasing the number of loops increases expressive power. Moreover, the theory also yields a performance-improvement scheme, and we confirmed in experiments that it actually improves performance.

Yufan Zhuang (@yufan_zhuang)'s Twitter Profile Photo

First practical Looped Transformer?!😺

Latest research from Google DeepMind proposes relaxed recursive transformers. It works by:

1. Take out layers from a pretrained LM
2. Group them together and add LoRA
3. Decode with early exit

Able to recover 90% of Gemma-2B's performance!

Paper:
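As a rough illustration of the three-step recipe above: one shared block stands in for a group of layers lifted out of a pretrained model, and each loop iteration gets its own small LoRA correction so the tied weights are only loosely shared. A minimal PyTorch sketch with toy dimensions (illustrative only, not DeepMind's implementation; the early-exit decoding of step 3 is only indicated in a comment):

```python
# Sketch of a "relaxed recursive" block: frozen shared layers + per-loop LoRA.
# Toy dimensions; illustrative only, not DeepMind's implementation.
import torch
import torch.nn as nn

class LoRAResidual(nn.Module):
    """Trainable low-rank update that relaxes the weight tying for one loop step."""
    def __init__(self, d_model: int, rank: int = 8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(d_model, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, d_model))

    def forward(self, x):
        return x @ self.A @ self.B

class RecursiveBlock(nn.Module):
    def __init__(self, d_model=64, nhead=4, n_loops=3, rank=8):
        super().__init__()
        # stands in for a group of layers taken out of a pretrained LM (step 1)
        self.shared = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        for p in self.shared.parameters():
            p.requires_grad_(False)  # pretrained weights stay frozen
        # one LoRA adapter per loop iteration (step 2)
        self.loras = nn.ModuleList([LoRAResidual(d_model, rank) for _ in range(n_loops)])
        self.n_loops = n_loops

    def forward(self, x):
        for i in range(self.n_loops):
            x = self.shared(x) + self.loras[i](x)
            # step 3 (early-exit decoding) would break here once an exit head is confident
        return x

x = torch.randn(2, 10, 64)        # (batch, seq, d_model)
print(RecursiveBlock()(x).shape)  # torch.Size([2, 10, 64])
```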
Google DeepMind (@googledeepmind)'s Twitter Profile Photo

Today in Nature, we’re presenting GenCast: our new AI weather model which gives us the probabilities of different weather conditions up to 15 days ahead with state-of-the-art accuracy. ☁️⚡

Here’s how the technology works. 🧵goo.gle/49trAOv
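The thread itself isn't reproduced here, but the "probabilities of different weather conditions" framing is standard generative ensemble forecasting: sample many plausible trajectories from the model and read event probabilities off the member fractions. A toy numpy sketch with a dummy sampler standing in for GenCast:

```python
# Ensemble probabilities from sampled forecasts (dummy sampler, not GenCast).
import numpy as np

rng = np.random.default_rng(0)

def sample_forecast(n_days: int = 15) -> np.ndarray:
    """Hypothetical stand-in for one sampled trajectory of daily rainfall (mm)."""
    return np.maximum(rng.normal(loc=2.0, scale=3.0, size=n_days), 0.0)

ensemble = np.stack([sample_forecast() for _ in range(50)])  # (members, days)
p_rain = (ensemble > 1.0).mean(axis=0)                       # P(rain > 1 mm) per day
print(np.round(p_rain, 2))
```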
enjoy my life (@issei_sato)'s Twitter Profile Photo

The following paper has been accepted to ICML 2025!
Benign Overfitting in Token Selection of Attention Mechanism
Keitaro Sakamoto, Issei Sato
arxiv.org/abs/2409.17625

enjoy my life (@issei_sato)'s Twitter Profile Photo

The following paper has been accepted to ICML 2025!
On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
Kevin Xu, Issei Sato
arxiv.org/abs/2410.01405

Kevin Xu (@kevin671xu)'s Twitter Profile Photo

🎉 Our paper on "the expressive power of Looped Transformers" was accepted at #ICML2025!
To the best of our knowledge, this is the first study to analyze their function approximation capabilities, including approximation rates and universality. 
arxiv.org/abs/2410.01405
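For context on the object being analyzed: a looped transformer applies a single weight-tied block repeatedly, so effective depth comes from the loop count rather than from distinct layers; the paper's proposed enhancement additionally encodes the loop timestep. A minimal PyTorch sketch (toy dimensions; the per-step embedding is a crude stand-in for the paper's timestep encoding, not its exact construction):

```python
# Minimal looped transformer: one shared block applied n_loops times.
import torch
import torch.nn as nn

class LoopedTransformer(nn.Module):
    def __init__(self, d_model=64, nhead=4, n_loops=8):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        # crude stand-in for the paper's timestep encoding: one vector per loop index
        self.step_emb = nn.Embedding(n_loops, d_model)
        self.n_loops = n_loops

    def forward(self, x):
        for t in range(self.n_loops):
            x = self.block(x + self.step_emb.weight[t])  # same weights, step-aware input
        return x

x = torch.randn(2, 16, 64)           # (batch, seq, d_model)
print(LoopedTransformer()(x).shape)  # torch.Size([2, 16, 64])
```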
Google DeepMind (@googledeepmind)'s Twitter Profile Photo

We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO
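Gemini Diffusion's internals aren't public, but the "refining noise, step-by-step" idea is the generic diffusion sampling loop: start from noise and repeatedly apply a learned denoiser. A deliberately schematic sketch with an untrained stand-in denoiser:

```python
# Schematic diffusion-style refinement loop (untrained stand-in; not Gemini Diffusion).
import torch
import torch.nn as nn

denoiser = nn.Linear(64, 64)    # stand-in for a learned denoising network
x = torch.randn(1, 16, 64)      # start from pure noise
for step in range(10):          # refine step-by-step
    x = x - 0.1 * denoiser(x)   # subtract a fraction of the predicted noise
# a text model would finally map the refined x to token logits and decode
print(x.shape)
```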

enjoy my life (@issei_sato)'s Twitter Profile Photo

We have published the following preprint on the analysis of LLMs using computational complexity theory.
To CoT or To Loop? A Formal Comparison Between Chain-of-Thought and Looped Transformers
Kevin Xu, Issei Sato
arxiv.org/abs/2505.19245
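Operationally, the comparison is between two ways of buying extra computation: CoT emits intermediate tokens that later passes re-read, while a looped model iterates the same block on the hidden state without emitting anything. A schematic contrast with dummy callables (not the paper's formal constructions):

```python
# CoT vs. looped computation, schematically (dummy callables, not the paper's proofs).
from typing import Callable, List

def run_cot(step: Callable[[List[str]], str], prompt: List[str], n_steps: int) -> List[str]:
    """Chain-of-thought: each pass appends a visible token that later passes can read."""
    tokens = list(prompt)
    for _ in range(n_steps):
        tokens.append(step(tokens))  # extra compute lives in the growing transcript
    return tokens

def run_looped(block: Callable[[list], list], hidden: list, n_loops: int) -> list:
    """Looped model: the same block re-processes the latent state; nothing is emitted."""
    for _ in range(n_loops):
        hidden = block(hidden)       # extra compute lives in the hidden state
    return hidden

print(run_cot(lambda ts: f"step{len(ts)}", ["Q"], 3))       # ['Q', 'step1', 'step2', 'step3']
print(run_looped(lambda h: [v + 1 for v in h], [0, 0], 3))  # [3, 3]
```

As the authors' tweets below note, the paper also uses CoT's stochastic sampling to characterize cases where CoT has the advantage.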

enjoy my life (@issei_sato)'s Twitter Profile Photo

We have released Kevin's paper. Using computational complexity theory, we analyzed the classes of problems solvable by LLMs with recursive loops and by LLMs that use CoT.

Kevin Xu (@kevin671xu)'s Twitter Profile Photo

Starting from the theoretical analysis of CoT, which extends the computational class through its recursive structure, we compared it with reasoning in latent space (Looped models), which has drawn attention recently. By focusing on the stochastic nature of CoT, we also characterize cases where CoT has the advantage. (No, I never took a Shakespeare course.)
Kevin Xu (@kevin671xu)'s Twitter Profile Photo

It's an international conference, but my only exposure to English these past few years has been Friends, so I'm scared I'll turn into some weird guy whose tone suddenly goes casual.

Kevin Xu (@kevin671xu)'s Twitter Profile Photo

🚀 Are Looped Transformers universal approximators?

📍 Come see our poster at #ICML2025
🗓️ Wed, July 16, 3:00–5:30 a.m.
📌 East Exhibition Hall A-B #E-3411
📄 Paper: openreview.net/forum?id=H4Buh…
Let’s talk about the power of loops 🔁
Kevin Xu (@kevin671xu)'s Twitter Profile Photo

It explains a lot of things that (at least for me) were impossible to understand from the equations of diffusion models alone; I learned a lot from it. sander.ai

Andrew Ng (@andrewyng)'s Twitter Profile Photo

Releasing a new "Agentic Reviewer" for research papers. I started coding this as a weekend project, and Yixing Jiang made it much better.

I was inspired by a student who had a paper rejected 6 times over 3 years. Their feedback loop -- waiting ~6 months for feedback each time -- was
PLaMo LLM (@plamollm)'s Twitter Profile Photo

The PDF translation feature of PLaMo™ Translate, whose release had been postponed, is now available as of today! 🎉 Apologies for the long wait 🙇‍♂️ We focus on English-to-Japanese translation of documents such as papers: files are translated into readable Japanese while keeping the original layout. Please give it a try!! translate.preferredai.jp