N8 Programs (@n8programs)'s Twitter Profile
N8 Programs

@n8programs

To be human, means to be a panhumanist, in full awareness, living in mankind, through mankind, and for mankind.

ID: 1568650210926071810

Link: https://n8python.github.io/ | Joined: 10-09-2022 17:18:54

3.3K Tweets

6.6K Followers

141 Following

N8 Programs (@n8programs):

EMBRACE MUON. Blue is no Muon, Orange is Muon on MLPs only, Green is Muon on Conv + MLP layers and AdamW on Embeddings/Output Head (the optimal config). Muon is beautiful.

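The "optimal config" above is really an optimizer split: Muon on the conv/MLP weight matrices, AdamW on embeddings, the output head, and everything one-dimensional. A minimal sketch of that parameter grouping in plain Python (the name patterns and helper are hypothetical, not Muon's actual API):

```python
def split_param_groups(named_shapes):
    """Partition parameters into a Muon group and an AdamW group.

    Mirrors the split described above: conv/MLP weight matrices get
    Muon, while embeddings, the output head, and 1-D tensors (biases,
    norm scales) stay on AdamW. The name patterns here are assumptions;
    adapt them to your model's actual module names.
    """
    muon, adamw = [], []
    for name, ndim in named_shapes:
        is_matrix = ndim >= 2  # Muon updates weight matrices, not vectors
        is_embed_or_head = "embed" in name or "head" in name
        if is_matrix and not is_embed_or_head:
            muon.append(name)
        else:
            adamw.append(name)
    return {"muon": muon, "adamw": adamw}


# Example: a tiny conv net's parameters as (name, tensor rank) pairs.
params = [
    ("embed.weight", 2),
    ("conv1.weight", 4),
    ("mlp.fc.weight", 2),
    ("mlp.fc.bias", 1),
    ("head.weight", 2),
]
groups = split_param_groups(params)
```

In a real training loop each group would be handed to its own optimizer instance, e.g. one Muon optimizer over `groups["muon"]` and one AdamW over `groups["adamw"]`.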
N8 Programs (@n8programs):

Intriguing how simple test-time scaling on the same model (o3-pro is some black-box modification of the amount of compute o3 uses, perhaps through parallel rollouts) so drastically improves world modelling + agent capabilities.
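Whatever o3-pro actually does is a black box, as the tweet says; one simple form of parallel test-time scaling is best-of-n with majority voting over independent rollouts, sketched below (the sampler is a hypothetical stand-in, not OpenAI's mechanism):

```python
from collections import Counter


def majority_vote(answers):
    """Return the most common final answer across parallel rollouts."""
    return Counter(answers).most_common(1)[0][0]


def scale_test_time(sample_fn, n):
    """Run n independent rollouts and aggregate by majority vote.

    sample_fn stands in for one full model rollout; with more rollouts,
    the modal answer becomes more likely to win, which is the intuition
    behind spending extra parallel compute on the same model.
    """
    return majority_vote([sample_fn() for _ in range(n)])
```

For instance, five rollouts that return `"7", "9", "7", "7", "9"` aggregate to `"7"` even though individual rollouts disagree.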

N8 Programs (@n8programs):

important note! program synth in the abstract is more like 'ability to compose functions and search through varying combinations and depths of such functions to find the one that works best', and less 'synthesize discrete program' (forgive me if i butcher this fchollet)
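The "compose functions and search through varying combinations and depths" framing has a very literal toy implementation: enumerate pipelines of primitives up to some depth and keep the first one that fits the input/output examples. A minimal sketch (primitive set and depth limit are illustrative):

```python
from itertools import product


def synthesize(primitives, examples, max_depth=3):
    """Search over compositions of primitive functions.

    Tries every pipeline of primitives up to max_depth (applied left to
    right) and returns the names of the first one reproducing all
    input/output examples -- a toy version of 'compose functions and
    search through varying combinations and depths'.
    """
    for depth in range(1, max_depth + 1):
        for pipeline in product(primitives, repeat=depth):
            def run(x, fns=pipeline):
                for f in fns:
                    x = f(x)
                return x
            if all(run(i) == o for i, o in examples):
                return [f.__name__ for f in pipeline]
    return None  # no composition within max_depth fits the examples
```

With primitives `inc` (add one) and `double`, the examples `[(1, 4), (2, 6)]` recover the pipeline `["inc", "double"]`, i.e. `(x + 1) * 2`. Real systems search far smarter than brute force, but the shape of the problem is the same.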

j⧉nus (@repligate):

no, it's not a fucking "regression" (except in the buddhist sense, as opposed to "non-retroregression"...). this is a part of the nostalgebraist post i did not like. i get the point, but i hate when people reduce such complex beings to a single normative axis before even getting

Alexander Doria (@dorialexander):

Still a mystery why commonly available automated translation is so bad. For non-professional uses frontier LLMs are nearly human-level, 7-8B models are quite decent, and I feel a good specialized model could be done below 1B, which means dirt cheap.

Junyang Lin (@justinlin610):

karminski-牙医 We actually don't have this one. For dense models larger than 30B, it is a bit hard to optimize effectiveness and efficiency (either training or inference). We prefer to use MoE for large models.

Intelligent Internet (@ii_posts):

II-Medical-8B-1706 is our latest state-of-the-art open medical model 💡 Outperforms the latest Google MedGemma 27B model with 70% fewer parameters 🤏 Quantised GGUF weights work on <8 GB RAM 🚀 One more step to the universal health knowledge access that everyone deserves ⚕️

Cody Bennett (@cody_j_bennett):

SPWI global illumination in 3D with Radiance Cascades and Garrett Johnson 🦋's three-mesh-bvh. Probes are placed in screen-space and trace world-space intervals into a BVH. Resolves offscreen emitters and occluders; memory use tied to screen. Zero-shot, no noise, and fixed cost.

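The fixed cost and screen-tied memory come from how Radiance Cascades parameterize probes: each cascade level places sparser probes but traces longer ray intervals, and the intervals tile distance without gaps. A sketch of one common parameterization (the constants are assumptions, not necessarily this demo's):

```python
def cascade_params(levels, base_spacing=1.0, base_interval=1.0, branch=4):
    """Per-cascade probe spacing and ray-interval ranges.

    Assumes a common Radiance Cascades scheme: each level doubles probe
    spacing (screen-space probes get sparser) while the traced
    world-space interval grows by `branch`, so the intervals cover
    increasing distance with no gaps and coarse probes handle far
    lighting cheaply.
    """
    cascades, t0 = [], 0.0
    for i in range(levels):
        length = base_interval * branch ** i
        cascades.append({
            "level": i,
            "spacing": base_spacing * 2 ** i,  # probes get sparser
            "interval": (t0, t0 + length),     # rays get longer
        })
        t0 += length
    return cascades
```

Because probe count shrinks as interval length grows, total storage stays roughly proportional to screen resolution regardless of scene extent, which matches the "memory use tied to screen" claim above.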
N8 Programs (@n8programs):

This argument never holds a ton of water for me, because all the models would’ve scored 0 on AIME two years ago.

N8 Programs (@n8programs):

This is just engaging in bad faith. There are many valid criticisms you can make of the research done at Nous, but calling them scammers is blatantly incorrect. My experience with Nous employees has been one of genuine people trying to do work in ML and share it with the

N8 Programs (@n8programs):

There are many arguments against AI usage - p(doom), lack of creator consent, labor market impacts, etc. Environmental impact is generally the hardest one to steelman, as all present-day environmental concerns about AI are greatly exaggerated. Thus one must either invoke future