DeepSeek (@deepseek_ai)'s Twitter Profile
DeepSeek

@deepseek_ai

Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.

ID: 1714580962569588736

Link: https://www.deepseek.com/
Joined: 18-10-2023 09:55:45

139 Tweets

975,975K Followers

0 Following

🚀 Introducing NSA: A Hardware-Aligned and Natively Trainable Sparse Attention mechanism for ultra-fast long-context training & inference!

Core components of NSA:
• Dynamic hierarchical sparse strategy
• Coarse-grained token compression
• Fine-grained token selection

💡 With