enundiagrisblanco (@enundiagris_)'s Twitter Profile
enundiagrisblanco

@enundiagris_

Unfathomable being, a cleft in the air

ID: 724197060807933953

Joined: 24-04-2016 11:23:29

2.2K Tweets

93 Followers

1.1K Following

Sepp Hochreiter (@hochreitersepp):

xLSTM is more expressive than Transformer and Mamba: arxiv.org/abs/2603.03612

* nonlinear RNNs: sLSTM, LSTM
* DPLR linear RNNs: mLSTM, RWKV, DeltaNet
* non-PNC1: Mamba, Transformer

“fundamental expressivity gaps between linear and nonlinear RNNs”

World models require nonlinear RNNs.
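The linear-vs-nonlinear gap the tweet points at can be illustrated with a toy sketch (not from the paper; all matrices and sizes below are made up): the final state of a linear recurrence collapses into a fixed weighted sum of its inputs, while putting a tanh between steps destroys that closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
A = rng.normal(size=(d, d)) * 0.3   # recurrence matrix (toy)
B = rng.normal(size=(d, d)) * 0.3   # input projection (toy)
xs = rng.normal(size=(5, d))        # a short input sequence

def linear_rnn(xs):
    # h_t = A h_{t-1} + B x_t : the state stays a linear function of the inputs
    h = np.zeros(d)
    for x in xs:
        h = A @ h + B @ x
    return h

def nonlinear_rnn(xs):
    # h_t = tanh(A h_{t-1} + B x_t) : a nonlinearity between steps
    h = np.zeros(d)
    for x in xs:
        h = np.tanh(A @ h + B @ x)
    return h

def linear_rnn_unrolled(xs):
    # For the linear RNN the recurrence unrolls exactly:
    # h_T = sum_t A^(T-1-t) B x_t, a fixed weighted sum over inputs
    T = len(xs)
    return sum(np.linalg.matrix_power(A, T - 1 - t) @ (B @ x)
               for t, x in enumerate(xs))

print(np.allclose(linear_rnn(xs), linear_rnn_unrolled(xs)))     # True
print(np.allclose(nonlinear_rnn(xs), linear_rnn_unrolled(xs)))  # False
```

The unrolled form is what makes linear recurrences parallelizable; the tanh version has no such input-weighted closed form, which is one intuition behind the expressivity gap the paper studies.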
Reza Bayat (@reza_byt):

Mythos is a looped transformer!? 😳 Should be a Mixture-of-Recursions (MoR) — 2× faster, controlled effort.

Dense → sparse MoE was the efficiency unlock of 2023.

Uniform loops → MoR is the same move for recursive transformers.

Paper reading list below. 🧵
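The "uniform loops → MoR" move can be sketched in toy form (my own illustration under stated assumptions, not the paper's architecture: `shared_block` and the depth rule here are hypothetical stand-ins): a looped transformer reuses one shared block a fixed number of times for every token, while a Mixture-of-Recursions-style router gives each token its own recursion depth, so easy tokens exit early.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_tokens, max_loops = 8, 6, 4
W = rng.normal(size=(d, d)) * 0.1   # toy stand-in for the shared block's weights
router_w = rng.normal(size=d)       # hypothetical per-token depth router

tokens = rng.normal(size=(n_tokens, d))

def shared_block(h):
    # stand-in for the single transformer block a looped model reuses
    return h + np.tanh(h @ W)

def uniform_loop(tokens):
    # looped transformer: every token passes through the block max_loops times
    h = tokens.copy()
    for _ in range(max_loops):
        h = shared_block(h)
    return h

def mixture_of_recursions(tokens):
    # MoR-style: a router assigns each token a recursion depth in 1..max_loops;
    # tokens stop recursing once their depth is reached, saving compute
    depths = 1 + (np.abs(tokens @ router_w) * 2).astype(int) % max_loops
    h = tokens.copy()
    for step in range(max_loops):
        active = depths > step          # tokens still recursing at this step
        h[active] = shared_block(h[active])
    return h, depths

h_uniform = uniform_loop(tokens)
h_mor, depths = mixture_of_recursions(tokens)
print(depths)  # per-token loop counts; sum(depths) <= n_tokens * max_loops
```

The saving is exactly the gap between `depths.sum()` and `n_tokens * max_loops` block applications, analogous to how sparse MoE spends FLOPs only on routed experts.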
DAIR.AI (@dair_ai):

NEW paper from Apple.

Interesting idea: "Attention to Mamba".

The paper introduces a two-stage recipe for cross-architecture distillation from Transformers into Mamba.

Naive distillation collapses teacher performance. Their trick: first distill the transformer into a
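The tweet cuts off before naming the intermediate stage, so nothing here reproduces Apple's recipe. As background only, a minimal sketch of the standard distillation objective such cross-architecture work builds on: a temperature-softened KL divergence between teacher and student token distributions (names and shapes below are illustrative).

```python
import numpy as np

def softmax(z, T=1.0):
    # temperature-softened softmax over the last axis
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(teacher_logits, student_logits, T=2.0):
    # KL(teacher || student) over the vocabulary, averaged over positions,
    # scaled by T^2 as in the classic knowledge-distillation setup
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)

rng = np.random.default_rng(2)
t = rng.normal(size=(3, 10))        # teacher logits: 3 positions, vocab of 10
assert distill_loss(t, t) < 1e-9    # identical distributions -> zero loss
loss = distill_loss(t, rng.normal(size=(3, 10)))
print(loss > 0)                     # True: mismatched student -> positive loss
```

A two-stage recipe would apply a loss like this against different intermediate targets per stage; the specifics of Apple's stages are in the paper, not in this sketch.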
Clutch God (@xsports_1):

> be Yann LeCun > spend years building JEPA at Meta > company focuses on LLaMA instead > his idea stays complicated and unused > robotics plans get dropped > decides to leave and start AMI Labs > builds a much simpler version from scratch > trains it on normal hardware in just a

Alejo (@ecommartinez):

If you've made it this far, you're clearly one of those who will stay ahead. ⭐ I share this kind of content regularly here → Alejo. Shall we keep going?

324.cat (@324cat):

Losing your phone is easy. Getting it back can be easy too, if you've set it up in advance with these four preventive steps 3cat.cat/3catinfo/que-c…

Marta Peirano (@minipetite):

I'm joining the warning: Sci-Hub has pirated more than 85 million research articles, and now on top of that they've added a bot that answers questions using full, recent articles. This is a scandal. I'm leaving the link below so you know how to avoid it.

Santi Torres (@santitorai):

🚨 Karpathy just dropped 40 minutes of gold on AI agents in 2026. What to learn, what to build, and what to drop before it sinks you. 90% of today's tools won't survive 90 days. The filtering is already done. Free.