Rain (@rainnekoneko)'s Twitter Profile
Rain

@rainnekoneko

I like rain. // Scientist @ Avey AI // pfp: reddit.com/r/cats/comment…

ID: 1923977157904101376

Joined: 18-05-2025 05:41:33

35 Tweets

220 Followers

63 Following

Moe Shop (@korewamoe)

✧ NEW RELEASE ✧

my song "Fluorite" for Gakuen iDOLM@STER is out now everywhere ♡

stream here → nex-tone.link/EwejxAE7K

lyrics by やぎぬまかな (@ygnm_kana)
vocals by 七瀬つむぎ (@tsumugi_nanase)
Muyu He (@hemuyu0327)

Visualizing LLM basics:
How do LLMs store facts in their weights?

Came across a video by Grant Sanderson (@3blue1brown) on how LLMs store facts in their feed-forward layers (ffwd), something that far fewer videos and blogs touch on compared to the famous attention layers.

However, ffwd is the key…
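A minimal sketch of the key-value picture of the feed-forward block that the video describes: a row of the input projection acts as a key that fires on a particular direction in the residual stream, and the matching column of the output projection is the value written back. The dimensions and the single hard-coded "fact" here are purely illustrative.

```python
import torch

d_model, d_ff = 8, 32
W_in = torch.zeros(d_ff, d_model)    # rows = keys, one per hidden neuron
W_out = torch.zeros(d_model, d_ff)   # columns = values, one per hidden neuron

key = torch.randn(d_model)    # direction standing in for a subject
value = torch.randn(d_model)  # direction standing in for the stored fact

W_in[0] = key        # neuron 0 fires when the input points along `key`
W_out[:, 0] = value  # ...and writes `value` into the residual stream

def ffwd(x):
    # the standard two-layer MLP: project up, nonlinearity, project down
    return W_out @ torch.relu(W_in @ x)

x = 0.9 * key + 0.1 * torch.randn(d_model)  # input that "mentions" the subject
out = ffwd(x)
print(torch.cosine_similarity(out, value, dim=0))  # ≈ 1.0: the fact is recalled
```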
Pedro Cuenca (@pcuenq)

Download pre-compiled, optimized kernels from the Kernel Hub! Battle-tested in transformers and TGI; let us know if you use it in other PyTorch projects 🚀 huggingface.co/blog/hello-hf-…
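A minimal usage sketch, following the activation example in the linked blog post (the `kernels` package is installed with `pip install kernels`; the kernel repo name and function below come from that post and may change):

```python
import torch
from kernels import get_kernel

# Fetches a pre-compiled, optimized kernel from the Hugging Face Hub;
# no local CUDA toolchain or build step is needed.
activation = get_kernel("kernels-community/activation")

x = torch.randn(10, 10, dtype=torch.float16, device="cuda")
y = torch.empty_like(x)
activation.gelu_fast(y, x)  # the kernel writes its output into `y`
print(y)
```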

Simo Ryu (@cloneofsimo)

Expectation: "When I get a job as ML researcher, I'm going to push infinite-ctx-length diffusion model that faithfully diffuses via second order expansion of fokker-planck equation, with adaptive guidance on control signal and efficient gradient estimation" Your dataset:

Expectation: "When I get a job as ML researcher, I'm going to push infinite-ctx-length diffusion model that faithfully diffuses via second order expansion of fokker-planck equation, with adaptive guidance on control signal and efficient gradient estimation"

Your dataset:
erogol (@erogol)

Did some midnight coding and added Avey to BlaGPT. It is crazy slow and uses way more VRAM during training; I think it needs some iteration to be practical. However, it landed well among the other transformer alternatives. github.com/erogol/BlaGPT

bycloud (@bycloudai)

this basically replaces tokenization by pooling bytes through a U-Net into chunks and predicting the next chunk, which in turn predicts multiple bytes/words at once arxiv.org/abs/2506.14761
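A toy sketch of that pooling idea (not the linked paper's actual architecture): embed raw bytes, average-pool fixed-size groups into chunk vectors, run the expensive model at chunk resolution, then upsample back so every byte position gets a prediction. The chunk size, dimensions, and fixed (rather than learned) chunk boundaries are all assumptions for illustration, and causal masking is omitted for brevity.

```python
import torch
import torch.nn as nn

class ByteChunkLM(nn.Module):
    def __init__(self, d=128, chunk=4, n_heads=4):
        super().__init__()
        self.chunk = chunk
        self.embed = nn.Embedding(256, d)          # one embedding per byte value
        self.down = nn.AvgPool1d(chunk)            # bytes -> chunks (downsample)
        layer = nn.TransformerEncoderLayer(d, n_heads, 4 * d, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.up = nn.Upsample(scale_factor=chunk)  # chunks -> bytes (upsample)
        self.head = nn.Linear(d, 256)              # next-byte logits

    def forward(self, byte_ids):                   # (B, T) with T % chunk == 0
        h = self.embed(byte_ids)                   # (B, T, d)
        hc = self.down(h.transpose(1, 2))          # (B, d, T/chunk)
        hc = self.backbone(hc.transpose(1, 2))     # (B, T/chunk, d)
        h = self.up(hc.transpose(1, 2)).transpose(1, 2)  # back to (B, T, d)
        return self.head(h)                        # (B, T, 256)

x = torch.randint(0, 256, (2, 64))
logits = ByteChunkLM()(x)  # one chunk-level step predicts several bytes at once
```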