YI (@y_imjk) 's Twitter Profile
YI

@y_imjk

Research Engineer 🐟🐠🐡 / ex-UTokyo (eeic2021, Aizawa Lab) / MITOU IT 2022

ID: 1543851399590875143

04-07-2022 06:57:41

103 Tweets

288 Followers

323 Following

YI (@y_imjk) 's Twitter Profile Photo

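Lately I've been genuinely envious of AI: it can run without sleep, it doesn't have to cook meals or do the cleaning and laundry, and its condition is basically constant, so its stability is outstanding... Thinking about all that at 1:30 a.m., I find myself looking back on my routine and deciding I should live a little more like a human.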

Kosuke Nakago (@corochann) 's Twitter Profile Photo

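About Sakana AI's AI Agent study session held on 4/8: many people who kindly applied could not be invited, so we have published the materials used in the second half. The ecosystem has already grown so much that we could not cover everything; the slides give a fairly broad overview of trends from a research-oriented perspective. speakerdeck.com/sakana_ai/2025…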

Sam Altman (@sama) 's Twitter Profile Photo

goodbye, GPT-4. you kicked off a revolution. we will proudly keep your weights on a special hard drive to give to some historians in the future.

YI (@y_imjk) 's Twitter Profile Photo

As an undergrad I submitted my lab reports earlier than anyone else, yet now for some reason I'm barely keeping up from one deadline to the next... why...

Google AI (@googleai) 's Twitter Profile Photo

Gemini Diffusion, our newest research model, is significantly faster than our fastest model so far AND matches its coding performance. By correcting errors as the model thinks, it is extremely fast for editing tasks like math and coding.

Sakana AI (@sakanaailabs) 's Twitter Profile Photo

Following our Sudoku-based reasoning benchmark announcement, we've been evaluating the latest models to track improvements in their reasoning capabilities. Today, we’re launching the Sudoku-Bench Leaderboard: pub.sakana.ai/sudoku/ New technical report: arxiv.org/abs/2505.16135

YI (@y_imjk) 's Twitter Profile Photo

Emails from mysterious predatory journals(?) occasionally slip past my spam filter, and I wonder why they address me as Dr. or Prof. ... (they're probably just automated emails, though)

DeepSeek (@deepseek_ai) 's Twitter Profile Photo

🚀 DeepSeek-R1-0528 is here! 🔹 Improved benchmark performance 🔹 Enhanced front-end capabilities 🔹 Reduced hallucinations 🔹 Supports JSON output & function calling ✅ Try it now: chat.deepseek.com 🔌 No change to API usage — docs here: api-docs.deepseek.com/guides/reasoni… 🔗

Luke Darlow (@learningluked) 's Twitter Profile Photo

If you’re interested in learning about Continuous Thought Machines (sakana.ai/ctm/), we made interactive notebook tutorials so you can hack around with CTMs ImageNet: github.com/SakanaAI/conti… MNIST Tutorial: github.com/SakanaAI/conti… Let me know if you have any feedback!

Shashank Kotyan (@shashankkotyan) 's Twitter Profile Photo

🚨 Excited to present our paper Percept-Lens at the #ReGenAI Workshop at #CVPR2025! We introduce a 36M-image benchmark to test generalization in AI-generated image detection across 26 datasets & 16+ generative models. 🔍 Benchmark: dataverse.harvard.edu/dataverse/perc…

松井研 / Matsui Lab (@utokyo_bunny) 's Twitter Profile Photo

I'll present my poster paper at #CVPR2025 on June 15! I propose an extremely fast post-processing module for diverse nearest-neighbor searches🚀 Y. Matsui, "LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table" arxiv.org/abs/2506.04790

Sakana AI (@sakanaailabs) 's Twitter Profile Photo

Introducing ALE-Bench, ALE-Agent! Towards Automating Long-Horizon Algorithm Engineering for Hard Optimization Problems Blog: sakana.ai/ale-bench/ Paper: arxiv.org/abs/2506.09050 ALE-Bench is a coding benchmark primarily focused on hard optimization (NP-hard) problems. We

Takuya Akiba (@iwiwi) 's Twitter Profile Photo

AI will soon master Codeforces. So, what's the next challenge? 🚀Introducing ALE-Bench (ALgorithm Engineering Benchmark) 🏆 A new frontier benchmark for algorithmic coding, designed to test long-horizon reasoning on complex problems through trial and error. 🤖What is ALE-Bench?