NicoNico🦇🔊 (@niangao_g)'s Twitter Profile
NicoNico🦇🔊

@niangao_g

🎓 PhD student @ Hasso Plattner Institute | 🤖 Crafting smarter, not harder AI | Exploring the edges of efficiency

ID: 802190125757124608

Joined: 25-11-2016 16:40:04

592 Tweets

168 Followers

2.2K Following

Lucas Atkins (@lucasatkins7)

Happy to share DeepMixtral-8x7b-Instruct. A direct extraction/transfer of Mixtral Instruct's experts into Deepseek's architecture. Performance is identical, if not even a bit better, and seems more malleable to training.

Collaborators: Eric Hartford (@erhartford) and Fernando Fernandes Neto (@FernandoNetoAi).

Ramsri Goutham Golla (@ramsri_goutham)

Honestly my thought currently running 2 production AI SaaS apps!

1. Anthropic's Claude is too weak - ask it to be sarcastic and it says it can't hurt sentiments, etc. Bruh!
Lacking JSON mode is a bummer. You can try workarounds but any JSON with nesting is a pain!

2. Even if open

NicoNico🦇🔊 (@niangao_g)

It is good to know that Apple's edge deployment uses the same low-bit layout (a mix of INT4/INT2) as green-bit-llm.

huggingface.co/blog/NicoNico/…

Awni Hannun (@awnihannun)

I'm super excited about M5. It's going to help a lot with compute-bound workloads in MLX.

For example:
- Much faster prefill. In other words time-to-first-token will go down. 
- Faster image / video generation
- Faster fine-tuning (LoRA or otherwise)
- Higher throughput for

Ido Salomon (@idosal1)

Building AgentCraft v1 with AgentCraft v0 is 🤌 Managed up to 9 Claude Code agents with the RTS interface so far. There's a lot to explore, but it feels right. v1 coming soon