Alan Dao (@alandao_ai)'s Twitter Profile
Alan Dao

@alandao_ai

AI Researcher at Menlo Research. Author of Jan, Lucy, Jan-nano, Ichigo, AlphaMaze, and various other works.

ID: 1247124079271751680

Link: https://alandao.net · Joined: 06-04-2020 11:28:55

546 Tweets

324 Followers

23 Following

Alan Dao (@alandao_ai):

😱Unreasonable efficiency of GPT-OSS-20B reasoning trace 😱

Yeah… it’s really good. This is exactly what we wanted to achieve with the Lucy model, a natural and effective reasoning trace that is less prone to hallucination.

Well, Lucy is a 1.7 B model after all, so it’s
Mitko Vasilev (@iotcoi):

Alan Dao Ivan Fioravanti ᯅ: It's too bad that M-chips can't use FP4. To achieve the same quality of output as a Blackwell chip, Apple requires four times the memory. This is yet another strike from NVIDIA's software ecosystem monopoly.
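(For context on the memory figure: FP4 packs two weights into a byte while FP16 uses two bytes per weight, so the weights-only footprint differs by a factor of four. A back-of-the-envelope sketch, weights only, ignoring KV cache and activations:)

```python
# Rough weights-only memory footprint of a 20B-parameter model at
# different precisions. FP4 packs two weights per byte (0.5 bytes/param),
# FP16 uses 2 bytes/param -- a 4x gap, which is the figure in the tweet.
params = 20e9
bytes_per_param = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}

for fmt, nbytes in bytes_per_param.items():
    print(f"{fmt}: {params * nbytes / 1e9:.0f} GB")
# FP16: 40 GB
# FP8:  20 GB
# FP4:  10 GB
```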

Kieran Klaassen (@kieranklaassen):

Claude Code can run GPT-5.

GPT-5 is good at fixing nasty bugs and doing research, in a different way than Claude, and they can work together.

Just create a Claude Code agent called gpt5:
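(The agent definition that presumably followed is not in this capture. Purely as a hypothetical illustration of the delegation idea, a small Python helper that such an agent could shell out to in order to ask GPT-5 about a bug might look like the sketch below; the model name and the OpenAI client usage are assumptions, not the setup from the original post.)

```python
# Hypothetical helper a "gpt5" Claude Code agent could invoke to delegate
# a debugging question to GPT-5. Model name and client usage are assumptions.
import sys
from openai import OpenAI  # pip install openai

def ask_gpt5(bug_report: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-5",  # assumed model identifier
        messages=[
            {"role": "system",
             "content": "You are a debugging assistant. Find the root cause "
                        "and propose a minimal fix."},
            {"role": "user", "content": bug_report},
        ],
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    # Pipe a bug report in on stdin, e.g.: cat bug.txt | python ask_gpt5.py
    print(ask_gpt5(sys.stdin.read()))
```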
Shuangfei Zhai (@zhaisf):

Unlike an RNN, a single attention block alone cannot model anything interesting; it's the stacking that does wonders. Understanding this compositionality should be at least as important as understanding the attention module itself.
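(As a rough illustration of that point, my sketch rather than the author's: a single attention block is just one layer of mixing, and the depth comes from composing many attention + MLP blocks with residual connections, assuming PyTorch.)

```python
# Minimal sketch: one attention block vs. a stack of them.
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)

    def forward(self, x):
        h = self.ln1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # self-attention + residual
        x = x + self.mlp(self.ln2(x))                      # position-wise MLP + residual
        return x

single = Block()                                      # one block in isolation
stack = nn.Sequential(*[Block() for _ in range(12)])  # stacking = composition of blocks

x = torch.randn(2, 16, 64)              # (batch, seq_len, d_model)
print(single(x).shape, stack(x).shape)  # both torch.Size([2, 16, 64])
```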
Sebastian Raschka (@rasbt):

Pretty cool. I think 2025-2026 will bring a stronger focus on these in open-source tooling, i.e. having LLMs delegate knowledge-based queries to search, which in turn frees up model capacity for reasoning and tool use.
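(A minimal sketch of that delegation pattern using OpenAI-style function calling; the `web_search` helper, the model name, and the example query are placeholders rather than any particular project's API.)

```python
# Sketch: let the model delegate knowledge lookups to a search tool.
import json
from openai import OpenAI  # pip install openai

def web_search(query: str) -> str:
    # Placeholder: in practice this would call a real search API.
    return f"[search results for: {query}]"

tools = [{
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Look up facts on the web instead of answering from memory.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

client = OpenAI()
messages = [{"role": "user", "content": "Who maintains the Jan app?"}]
resp = client.chat.completions.create(model="gpt-4o-mini",  # any tool-calling model
                                      messages=messages, tools=tools)
msg = resp.choices[0].message

# If the model chose to search, run the tool and hand the results back.
if msg.tool_calls:
    call = msg.tool_calls[0]
    result = web_search(**json.loads(call.function.arguments))
    messages += [msg, {"role": "tool", "tool_call_id": call.id, "content": result}]
    resp = client.chat.completions.create(model="gpt-4o-mini",
                                          messages=messages, tools=tools)

print(resp.choices[0].message.content)
```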

@levelsio (@levelsio):

I really really like 👋 Jan

It's a very friendly app to locally run LLMs, great for privacy

I've tried others like LM Studio and Ollama and they're nice but very engineer-built, a bit too difficult for me

Jan is simple and cute and pretty and a great alternative to talk to
👋 Jan (@jandotai):

Jan v1 is trending on Hugging Face today.

Huge thanks to everyone trying it out, giving feedback, and sharing your setups.

We see you 💙
👋 Jan (@jandotai):

Have your own version of Perplexity Pro's Deep Research with Jan v1, the open-source alternative. Cookbook: jan.ai/post/jan-v1-fo…