DeepSeek

@deepseek_ai

Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.

18-10-2023 09:55:45

59 Tweets

6.7K Followers

0 Following


🚀 Launching DeepSeek-V2: The Cutting-Edge Open-Source MoE Model!

🌟 Highlights:
> Places top 3 in AlignBench, surpassing GPT-4 and close to GPT-4-Turbo.
> Ranks top-tier in MT-Bench, rivaling LLaMA3-70B and outperforming Mixtral 8x22B.
> Specializes in math, code and reasoning.


DeepSeek-V2 is a strong, economical, and efficient MoE language model, enhanced with exceptional architectural designs in attention mechanisms and sparse layers:

🌟 MLA (Multi-head Latent Attention): a better and faster attention that ensures efficient inference via reducing KV
