Chenglong (@fatedier)'s Twitter Profile
Chenglong

@fatedier

Open-source enthusiast | Creator of frp: a fast reverse proxy | Currently focusing on AI Agents

ID: 1473317059035496448

https://github.com/fatedier/frp
Joined 21-12-2021 15:39:08

96 Tweets

43 Followers

7 Following

DeepSeek (@deepseek_ai)

🚀 DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power!

🔍 o1-preview-level performance on AIME & MATH benchmarks.
💡 Transparent thought process in real-time.
🛠️ Open-source models & API coming soon!

🌐 Try it now at chat.deepseek.com
#DeepSeek
Chenglong (@fatedier)

I’ve tested many models, but so far, only o1-preview can solve this problem:

Using the numbers 5, 5, 5, 5, and 5, each number must be used exactly once and no more than once. Can they be combined using addition, subtraction, multiplication, and division to result in 24?
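The puzzle above can also be checked mechanically. The following is a minimal brute-force sketch, not the method any of the tested models use: it repeatedly picks two remaining values, combines them with one of the four operations, and recurses until a single value is left. The function names `solve` and `_search` are hypothetical, and exact `Fraction` arithmetic is an assumption made here to avoid floating-point rounding on intermediate results such as 1/5.

```python
from fractions import Fraction
from itertools import combinations


def solve(numbers, target):
    """Return one expression string that evaluates to target, or None."""
    items = [(Fraction(n), str(n)) for n in numbers]
    return _search(items, Fraction(target))


def _search(items, target):
    # Base case: a single value remains; check it against the target.
    if len(items) == 1:
        value, expr = items[0]
        return expr if value == target else None

    # Pick every unordered pair, combine it, and recurse on the smaller list.
    for i, j in combinations(range(len(items)), 2):
        (a, ea), (b, eb) = items[i], items[j]
        rest = [items[k] for k in range(len(items)) if k not in (i, j)]

        candidates = [
            (a + b, f"({ea} + {eb})"),
            (a - b, f"({ea} - {eb})"),
            (b - a, f"({eb} - {ea})"),
            (a * b, f"({ea} * {eb})"),
        ]
        # Division is only allowed when the divisor is non-zero.
        if b != 0:
            candidates.append((a / b, f"({ea} / {eb})"))
        if a != 0:
            candidates.append((b / a, f"({eb} / {ea})"))

        for value, expr in candidates:
            found = _search(rest + [(value, expr)], target)
            if found is not None:
                return found
    return None


if __name__ == "__main__":
    print(solve([5, 5, 5, 5, 5], 24))
```

Worked by hand, one arrangement that satisfies the puzzle is (5 - 5 / 5 / 5) * 5 = 24; the search may print a different but equally valid expression, since it stops at the first one it finds.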
Chenglong (@fatedier)

In the test results of my agent project, it performed significantly better than gpt-4o-2024-08-06. However, there’s still a noticeable gap compared to claude-3.5-sonnet(new).

Chenglong (@fatedier)

Surprised to see that in my agent project, claude-3.5-sonnet is still outperforming other models after 6 months! Looking forward to GPT-4.5, Claude 4, and Grok 3, but not fully convinced there will be a massive leap in performance just yet.

Chenglong (@fatedier)

After initial testing, Grok 3 from xAI delivers surprisingly impressive results! Can't wait to dive deeper with my agent project once the API opens up. Exciting times ahead!

Chenglong (@fatedier)

For paid users, using a downgraded model to respond when resources are insufficient, without any notification, is simply fraudulent behavior. Grok is currently the best platform. OpenAI seems more closed off compared to other companies.

Chenglong (@fatedier)

I’ve noticed that very few people have paid attention to QwQ 32B running on Groq: about 400 tokens/s, with reasoning capabilities comparable to top-tier models, which is enough to change everything.