N8 Programs (@n8programs)'s Twitter Profile
N8 Programs

@n8programs

To be human, means to be a panhumanist, in full awareness, living in mankind, through mankind, and for mankind.

ID: 1568650210926071810

Link: https://n8python.github.io/ | Joined: 10-09-2022 17:18:54

3.3K Tweets

6.6K Followers

141 Following

N8 Programs (@n8programs):

EMBRACE MUON. Blue is no Muon, Orange is Muon on MLPs only, Green is Muon on Conv + MLP layers and AdamW on Embeddings/Output Head (the optimal config). Muon is beautiful.

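The "optimal config" above is really an optimizer split: Muon on the conv/MLP weight matrices, AdamW on embeddings, the output head, and everything one-dimensional. A minimal sketch of that parameter grouping in plain Python (the name patterns and helper are hypothetical, not Muon's actual API):

```python
def split_param_groups(named_shapes):
    """Partition parameters into a Muon group and an AdamW group.

    Mirrors the split described above: conv/MLP weight matrices get
    Muon, while embeddings, the output head, and 1-D tensors (biases,
    norm scales) stay on AdamW. The name patterns here are assumptions;
    adapt them to your model's actual module names.
    """
    muon, adamw = [], []
    for name, ndim in named_shapes:
        is_matrix = ndim >= 2  # Muon updates weight matrices, not vectors
        is_embed_or_head = "embed" in name or "head" in name
        if is_matrix and not is_embed_or_head:
            muon.append(name)
        else:
            adamw.append(name)
    return {"muon": muon, "adamw": adamw}


# Example: a tiny conv net's parameters as (name, tensor rank) pairs.
params = [
    ("embed.weight", 2),
    ("conv1.weight", 4),
    ("mlp.fc.weight", 2),
    ("mlp.fc.bias", 1),
    ("head.weight", 2),
]
groups = split_param_groups(params)
```

In a real training loop each group would be handed to its own optimizer instance, e.g. one Muon optimizer over `groups["muon"]` and one AdamW over `groups["adamw"]`.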
N8 Programs (@n8programs):

Intriguing how simple test-time scaling on the same model (o3-pro is some black-box modification of the amount of compute o3 uses, perhaps through parallel rollouts) so drastically improves world modelling + agent capabilities.
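Whatever o3-pro actually does is a black box, as the tweet says; one simple form of parallel test-time scaling is best-of-n with majority voting over independent rollouts, sketched below (the sampler is a hypothetical stand-in, not OpenAI's mechanism):

```python
from collections import Counter


def majority_vote(answers):
    """Return the most common final answer across parallel rollouts."""
    return Counter(answers).most_common(1)[0][0]


def scale_test_time(sample_fn, n):
    """Run n independent rollouts and aggregate by majority vote.

    sample_fn stands in for one full model rollout; with more rollouts,
    the modal answer becomes more likely to win, which is the intuition
    behind spending extra parallel compute on the same model.
    """
    return majority_vote([sample_fn() for _ in range(n)])
```

For instance, five rollouts that return `"7", "9", "7", "7", "9"` aggregate to `"7"` even though individual rollouts disagree.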

N8 Programs (@n8programs):

important note! program synth in the abstract is more like 'ability to compose functions and search through varying combinations and depths of such functions to find the one that works best', and less 'synthesize discrete program' (forgive me if i butcher this fchollet)
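The "compose functions and search through varying combinations and depths" framing has a very literal toy implementation: enumerate pipelines of primitives up to some depth and keep the first one that fits the input/output examples. A minimal sketch (primitive set and depth limit are illustrative):

```python
from itertools import product


def synthesize(primitives, examples, max_depth=3):
    """Search over compositions of primitive functions.

    Tries every pipeline of primitives up to max_depth (applied left to
    right) and returns the names of the first one reproducing all
    input/output examples -- a toy version of 'compose functions and
    search through varying combinations and depths'.
    """
    for depth in range(1, max_depth + 1):
        for pipeline in product(primitives, repeat=depth):
            def run(x, fns=pipeline):
                for f in fns:
                    x = f(x)
                return x
            if all(run(i) == o for i, o in examples):
                return [f.__name__ for f in pipeline]
    return None  # no composition within max_depth fits the examples
```

With primitives `inc` (add one) and `double`, the examples `[(1, 4), (2, 6)]` recover the pipeline `["inc", "double"]`, i.e. `(x + 1) * 2`. Real systems search far smarter than brute force, but the shape of the problem is the same.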

j⧉nus (@repligate):

no, it's not a fucking "regression" (except in the buddhist sense, as opposed to "non-retroregression"...). this is a part of the nostalgebraist post i did not like. i get the point, but i hate when people reduce such complex beings to a single normative axis before even getting

Alexander Doria (@dorialexander):

Still a mystery why commonly available automated translation is so bad. For non-professional uses frontier LLMs are nearly human-level, 7-8B models are quite decent, and I feel a good specialized model could be done below 1B, which means dirt cheap.

Junyang Lin (@justinlin610):

karminski-牙医 We actually don't have this one. For dense models larger than 30B, it is a bit hard to optimize effectiveness and efficiency (either training or inference). We prefer to use MoE for large models.

Intelligent Internet (@ii_posts):

II-Medical-8B-1706 is our latest state-of-the-art open medical model 💡 Outperforms the latest Google MedGemma 27B model with 70% fewer parameters 🤏 Quantised GGUF weights work on <8 GB RAM 🚀 One more step to the universal health knowledge access that everyone deserves ⚕️

Cody Bennett (@cody_j_bennett):

SPWI global illumination in 3D with Radiance Cascades and Garrett Johnson 🦋's three-mesh-bvh. Probes are placed in screen-space and trace world-space intervals into a BVH. Resolves offscreen emitters and occluders; memory use tied to screen. Zero-shot, no noise, and fixed cost.

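The fixed cost and screen-tied memory come from how Radiance Cascades parameterize probes: each cascade level places sparser probes but traces longer ray intervals, and the intervals tile distance without gaps. A sketch of one common parameterization (the constants are assumptions, not necessarily this demo's):

```python
def cascade_params(levels, base_spacing=1.0, base_interval=1.0, branch=4):
    """Per-cascade probe spacing and ray-interval ranges.

    Assumes a common Radiance Cascades scheme: each level doubles probe
    spacing (screen-space probes get sparser) while the traced
    world-space interval grows by `branch`, so the intervals cover
    increasing distance with no gaps and coarse probes handle far
    lighting cheaply.
    """
    cascades, t0 = [], 0.0
    for i in range(levels):
        length = base_interval * branch ** i
        cascades.append({
            "level": i,
            "spacing": base_spacing * 2 ** i,  # probes get sparser
            "interval": (t0, t0 + length),     # rays get longer
        })
        t0 += length
    return cascades
```

Because probe count shrinks as interval length grows, total storage stays roughly proportional to screen resolution regardless of scene extent, which matches the "memory use tied to screen" claim above.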
N8 Programs (@n8programs):

This argument never holds a ton of water for me, because all the models would’ve scored 0 on AIME two years ago.

N8 Programs (@n8programs):

This is just engaging in bad faith. There are many valid criticisms you can make of the research done at Nous, but calling them scammers is blatantly incorrect. My experience with Nous employees has been one of genuine people trying to do work in ML and share it with the

N8 Programs (@n8programs):

There are many arguments against AI usage - p(doom), lack of creator consent, labor market impacts, etc. Environmental impact is generally the hardest one to steelman, as all present-day environmental concerns about AI are greatly exaggerated. Thus one must either invoke future