Baraban (@bara_ban)'s Twitter Profile
Baraban

@bara_ban

Software engineer, drawn towards understanding the first principles of the universe and consciousness.

ID: 169047743

Joined: 21-07-2010 12:15:19

1.1K Tweets

508 Followers

327 Following

First 12 experiments done: github.com/KintaroAI/rese… Currently using Claude Opus 4.6 to run experiments, and pretty happy with it. Thinking about using huggingface.co to store artifacts (models, training logs, figures). Created experiment and testing protocols; will improve them.

Ran a bunch of experiments today with Claude Opus 4.6 as my research partner - comparing baseline vs blend (bigram embedding mixing) vs Hebbian pull (non-learnable co-occurrence force on embeddings) for GPT-2 training on TinyStories. Key findings: 1) Blend-G8 consistently beats

Been thinking for a while: "possible" and "impossible" are often just names we give to the presence or absence of will.

I don't know for sure, but it must be pretty tough being an AI safety researcher in the US these days. Every single time you bring up any concern... you just get this: x.com/i/status/19097…

TIL: large transformers need LR warmup at the start of training. Small models converge fine without it, large ones don't.
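One common way to do warmup is a linear ramp followed by cosine decay. The sketch below is only an illustration of that idea, not the schedule used in the experiments above: the function name `lr_at_step` and every default value (`base_lr`, `warmup_steps`, `total_steps`, `min_lr`) are assumptions picked for the example.

```python
import math


def lr_at_step(step, base_lr=3e-4, warmup_steps=1000,
               total_steps=10000, min_lr=3e-5):
    """Learning rate for a 0-indexed training step.

    Linear warmup to base_lr over warmup_steps, then cosine decay
    down to min_lr at total_steps. All defaults are illustrative.
    """
    if step < warmup_steps:
        # Ramp linearly from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    # Cosine decay from base_lr (progress=0) to min_lr (progress=1).
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (base_lr - min_lr) * cosine


if __name__ == "__main__":
    for s in (0, 499, 999, 1000, 5500, 10000):
        print(s, lr_at_step(s))
```

The early steps use a tiny learning rate, so the first updates stay small while the optimizer statistics settle; the value from `lr_at_step` would be written into the optimizer before each step.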

Remember these? Foot-operated door openers that showed up during the pandemic so you didn’t have to touch handles. Did your workplace have one? Someone probably made a fortune.

A toy model of topographic map formation: how thalamus neurons self-organize spatially through local correlation-based rules. No pre-training, just greedy local attraction. Converges pretty well but could be better. Experiment: take an image, scramble all pixels randomly, then

TIL while searching for unsupervised competitive learning:
- HTM: Hierarchical Temporal Memory, and
- SOM: Self-Organizing Map

Looking for a small unsupervised competitive cell that maps a low-dimensional input vector to a low-dimensional probability output, usually with one
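For illustration, here is a minimal sketch of plain winner-take-all competitive learning with a softmax readout. It is a generic textbook-style cell, not HTM, not a full SOM (there is no neighborhood function), and not necessarily the cell being looked for above. The class name `CompetitiveCell` and the parameters `lr` and `temperature` are hypothetical names chosen for the example.

```python
import math
import random


class CompetitiveCell:
    """Toy winner-take-all cell: one prototype vector per output unit."""

    def __init__(self, n_inputs, n_units, lr=0.1, temperature=0.5, seed=0):
        rng = random.Random(seed)
        # Random prototypes in [0, 1); assumes inputs are roughly in that range.
        self.weights = [[rng.random() for _ in range(n_inputs)]
                        for _ in range(n_units)]
        self.lr = lr
        self.temperature = temperature

    def _sq_distances(self, x):
        # Squared Euclidean distance from x to every prototype.
        return [sum((w - xi) ** 2 for w, xi in zip(unit, x))
                for unit in self.weights]

    def forward(self, x):
        """Softmax over negative distances: closer prototype, higher probability."""
        d = self._sq_distances(x)
        m = min(d)  # shift for numerical stability
        exps = [math.exp(-(di - m) / self.temperature) for di in d]
        z = sum(exps)
        return [e / z for e in exps]

    def update(self, x):
        """Move only the winning prototype toward x; return the winner's index."""
        d = self._sq_distances(x)
        winner = d.index(min(d))
        unit = self.weights[winner]
        for i, xi in enumerate(x):
            unit[i] += self.lr * (xi - unit[i])
        return winner


if __name__ == "__main__":
    cell = CompetitiveCell(n_inputs=2, n_units=2)
    rng = random.Random(1)
    for _ in range(200):
        # Two noisy clusters, near (0.1, 0.1) and (0.9, 0.9).
        c = rng.choice([0.1, 0.9])
        cell.update([c + rng.gauss(0, 0.05), c + rng.gauss(0, 0.05)])
    print(cell.forward([0.1, 0.1]), cell.forward([0.9, 0.9]))
```

No labels are used: each input only pulls its nearest prototype closer. A known weakness of this bare version is dead units, where a prototype that never wins never moves; SOM-style neighborhood updates are one way to address that.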