Alec Tschantz (@a_tschantz) 's Twitter Profile
Alec Tschantz

@a_tschantz

Applied AI lead @ verses.ai

ID: 961146888220233729

Link: https://scholar.google.com/citations?user=5NbVgO0AAAAJ&hl=en · Joined: 07-02-2018 07:57:30

1.1K Tweets

1.1K Followers

3.3K Following

Jeff Johnston (@wjeffjohnston) 's Twitter Profile Photo

How does the brain represent multiple different things at once in a single population of neurons? Justin Fine, Neuro Polarbear, Becket Ebitz, Seng Bum Michael Yoo and I show that it uses semi-orthogonal subspaces for each item. Preprint here: arxiv.org/abs/2309.07766 Tweets below! (1/n)
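A toy way to picture "semi-orthogonal subspaces" in a shared neural population (illustrative sizes only, not the paper's analysis pipeline; measuring overlap via principal angles is a generic choice for the example):

```python
# Toy sketch: two items encoded in low-dimensional subspaces of one neural
# population; "semi-orthogonal" = small but nonzero overlap between them.
import numpy as np

rng = np.random.default_rng(0)
n_neurons, k = 100, 3          # population size and subspace dimension (made up)

# Orthonormal bases for two item subspaces (random, hence nearly orthogonal).
U1, _ = np.linalg.qr(rng.standard_normal((n_neurons, k)))
U2, _ = np.linalg.qr(rng.standard_normal((n_neurons, k)))

# Principal angles: singular values of U1^T U2 are the cosines of the angles.
cosines = np.linalg.svd(U1.T @ U2, compute_uv=False)
print("cosines of principal angles:", np.round(cosines, 3))   # near 0: almost orthogonal
print("overlap (mean squared cosine):", np.round(np.mean(cosines**2), 3))
```

Cosines near zero mean the two items barely interfere; moderate values would be the "semi" in semi-orthogonal.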

Hadi Vafaii (@hadivafaii) 's Twitter Profile Photo

Imagine a model that unites predictive coding, sparse coding, and rate coding — all of 'em codings — under Bayesian inference. Wouldn't that be amazing?

It’s already here: the Poisson Variational Autoencoder 👉🧵[1/n]

w/ Dekel Galor & Jacob Yates
📜preprint: arxiv.org/abs/2405.14473

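Not from the preprint, but a small hedged aside on why Poisson latents pair naturally with sparsity: the KL term a Poisson-latent VAE pays against a low-rate Poisson prior has a textbook closed form that charges for firing.

```python
# KL( Poisson(r) || Poisson(r0) ) = r*log(r/r0) + r0 - r, the rate penalty a
# Poisson-latent ELBO would include; it grows with the firing rate r, which is
# one way a metabolic / sparse-coding cost can show up.
import numpy as np

def kl_poisson(r, r0):
    r, r0 = np.asarray(r, float), np.asarray(r0, float)
    return r * np.log(r / r0) + r0 - r

rates = np.array([0.1, 0.5, 1.0, 2.0, 5.0])
print(np.round(kl_poisson(rates, r0=1.0), 3))   # zero at r == r0, grows with r
```
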
Kevin Patrick Murphy (@sirbayes) 's Twitter Profile Photo

I am delighted to share our recent paper: arxiv.org/abs/2405.19681. It can be thought of as a version of the Bayesian Learning Rule, extended to the fully online setting. This was a super fun project with Peter Chang and the unstoppable Matt Jones (matt.colorado.edu). By
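As a generic illustration of the fully online setting (conjugate Bayesian linear regression updated one observation at a time; this is a textbook recursion, not the paper's extension of the Bayesian Learning Rule):

```python
# Minimal sketch of fully online (recursive) Bayesian learning: Bayesian linear
# regression with a Gaussian prior, posterior updated as each sample streams in.
import numpy as np

rng = np.random.default_rng(1)
d, sigma2, alpha = 3, 0.25, 1.0             # input dim, noise variance, prior precision
w_true = rng.standard_normal(d)

Lam = alpha * np.eye(d)                     # posterior precision, starts at the prior
b = np.zeros(d)                             # precision-weighted mean, Lam @ mu

for t in range(200):                        # one observation per step
    x = rng.standard_normal(d)
    y = x @ w_true + np.sqrt(sigma2) * rng.standard_normal()
    Lam += np.outer(x, x) / sigma2          # recursive precision update
    b += x * y / sigma2                     # recursive mean update (natural params)
    mu = np.linalg.solve(Lam, b)            # current posterior mean

print("posterior mean:", np.round(mu, 2), " true weights:", np.round(w_true, 2))
```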

AK (@_akhaliq) 's Twitter Profile Photo

Transformers are SSMs

Generalized Models and Efficient Algorithms Through Structured State Space Duality

While Transformers have been the main architecture behind deep learning's success in language modeling, state-space models (SSMs) such as Mamba have recently been shown

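A minimal numerical sketch of the duality the title points at, with a scalar state and made-up parameters rather than anything from the paper: the same time-varying linear recurrence can be evaluated either as a scan or as one multiplication by a lower-triangular, attention-like matrix.

```python
# Toy SSM <-> matrix-mixer duality: a scalar, time-varying linear recurrence
# computed (1) step by step and (2) as a single lower-triangular matmul.
import numpy as np

rng = np.random.default_rng(0)
T = 8
x = rng.standard_normal(T)                  # input sequence
a = rng.uniform(0.5, 0.99, T)               # per-step decay A_t
b = rng.standard_normal(T)                  # input projection B_t
c = rng.standard_normal(T)                  # output projection C_t

# (1) Recurrent form: h_t = a_t h_{t-1} + b_t x_t,  y_t = c_t h_t
h, y_rec = 0.0, np.zeros(T)
for t in range(T):
    h = a[t] * h + b[t] * x[t]
    y_rec[t] = c[t] * h

# (2) Matrix form: y = M x with M[t, s] = c_t * (a_{s+1} * ... * a_t) * b_s, s <= t
M = np.zeros((T, T))
for t in range(T):
    for s in range(t + 1):
        M[t, s] = c[t] * np.prod(a[s + 1:t + 1]) * b[s]
y_mat = M @ x

print(np.allclose(y_rec, y_mat))            # True: same sequence transformation
```
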
elvis (@omarsar0) 's Twitter Profile Photo

The Geometry of Concepts in LLMs

Studies the geometry of categorical concepts and how the hierarchical relations between them are encoded in LLMs.

Finding from the paper: "Simple categorical concepts are represented as simplices, hierarchically related concepts are orthogonal

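A toy rendering of the quoted geometry with hand-built vectors (not representations probed from an LLM): child concepts sit on a simplex whose internal directions are orthogonal to the parent-concept direction.

```python
# Toy "simplices + orthogonal hierarchy" picture, illustrative vectors only.
import numpy as np

rng = np.random.default_rng(0)
d = 16
parent = np.zeros(d); parent[0] = 1.0            # parent-concept direction

# Three child offsets: centered (they sum to zero, so the children form a
# triangle / 2-simplex) and projected to be orthogonal to the parent direction.
offsets = rng.standard_normal((3, d))
offsets -= offsets.mean(axis=0)
offsets -= np.outer(offsets @ parent, parent)
children = parent + offsets

print(np.round(offsets @ parent, 6))             # ~0: hierarchy is orthogonal
print(np.round(children.mean(axis=0)[:3], 6))    # mean child lies on the parent direction
```
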
fly51fly (@fly51fly) 's Twitter Profile Photo

[LG] Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Y Sun, X Li, K Dalal, J Xu… [Stanford University & UC San Diego & UC Berkeley] (2024)
arxiv.org/abs/2407.04620

- This paper proposes TTT (Test-Time Training) layers, a new class of sequence modeling layers

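A heavily reduced sketch of the idea in the title, where the hidden state is itself a small model that takes one self-supervised gradient step per token; the corruption, loss, and dimensions below are placeholders, not the paper's TTT-Linear or TTT-MLP layers.

```python
# Reduced "learn at test time" layer: the hidden state is a weight matrix W,
# updated by one gradient step of a self-supervised reconstruction loss per
# token, then used to produce that token's output.
import numpy as np

rng = np.random.default_rng(0)
T, d = 16, 8
X = rng.standard_normal((T, d))             # token stream
W = np.zeros((d, d))                        # hidden state = weights of a linear model
lr = 0.1

outputs = []
for x in X:
    x_corrupt = x + 0.1 * rng.standard_normal(d)      # simple corrupted view
    err = W @ x_corrupt - x                           # loss: 0.5 * ||W xc - x||^2
    W -= lr * np.outer(err, x_corrupt)                # update rule: one gradient step
    outputs.append(W @ x)                             # output rule: apply updated state
outputs = np.stack(outputs)
print(outputs.shape)                                  # (T, d): one output per token
```
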
Tommaso Salvatori (@tommsalvatori) 's Twitter Profile Photo

A new library to train predictive coding networks. A huge comparative study of multiple ways of training them. An advancement over current state-of-the-art. A large discussion on what did not work.

All of this, and more, in our new preprint. arxiv.org/pdf/2407.01163

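Separately from the library and the comparative study, a bare-bones version of the loop predictive coding networks are trained with (toy dimensions and the simplest possible energy):

```python
# Bare-bones predictive coding: relax latent activities by gradient descent on a
# prediction-error energy, then apply a local weight update. Toy model and data,
# not the library's training schemes.
import numpy as np

rng = np.random.default_rng(0)
d_obs, d_lat = 10, 4
W_true = 0.5 * rng.standard_normal((d_obs, d_lat))   # ground-truth generator (for data)
W = 0.1 * rng.standard_normal((d_obs, d_lat))        # model weights: prediction = W z

def infer(x, W, n_steps=50, lr=0.1):
    """Relax z to minimize E = 0.5*||x - W z||^2 + 0.5*||z||^2."""
    z = np.zeros(d_lat)
    for _ in range(n_steps):
        eps = x - W @ z                          # prediction error at the data layer
        z += lr * (W.T @ eps - z)                # gradient descent on E w.r.t. z
    return z

for _ in range(1000):                            # outer training loop
    x = W_true @ rng.standard_normal(d_lat) + 0.05 * rng.standard_normal(d_obs)
    z = infer(x, W)
    eps = x - W @ z
    W += 0.01 * np.outer(eps, z)                 # local (error x activity) weight update

x_test = W_true @ rng.standard_normal(d_lat)
print("held-out reconstruction error:", round(float(np.linalg.norm(x_test - W @ infer(x_test, W))), 3))
```
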
Sukjun (June) Hwang (@sukjun_hwang) 's Twitter Profile Photo

So many interesting sequence mixers have been unveiled lately! Interested in designing new models on your own?
We introduce useful primitives that make your model sub-quadratic, flexible, and powerful! As an example, we present Hydra, a principled bidirectional extension of Mamba

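A rough sketch of what a "bidirectional extension" of a causal mixer can look like at this level of abstraction (an EMA-style scan run forward and backward and combined; not Hydra's actual quasiseparable construction):

```python
# Making a causal sequence mixer bidirectional: one forward pass, one pass over
# the reversed sequence, combined so every position sees both directions.
import numpy as np

def causal_mix(x, decay=0.9):
    """Simple causal scan: y_t = decay * y_{t-1} + x_t."""
    y = np.zeros_like(x)
    acc = 0.0
    for t, xt in enumerate(x):
        acc = decay * acc + xt
        y[t] = acc
    return y

x = np.arange(6, dtype=float)
y_fwd = causal_mix(x)                       # mixes in the past
y_bwd = causal_mix(x[::-1])[::-1]           # mixes in the future
y = y_fwd + y_bwd - x                       # combine; x_t was counted in both passes
print(y)
```
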
H.SHIMAZAKI (@h_shimazaki) 's Twitter Profile Photo

Here is my take on: "Explosive neural networks via higher-order interactions in curved statistical manifolds"
Miguel Aguilera, Pablo A. Morales, Fernando E. Rosas, Hideaki Shimazaki
arxiv.org/abs/2408.02326
on behalf of Miguel Aguilera, Pablo Morales, Fernando Rosas

Andreas Kirsch 🇺🇦 (@blackhc) 's Twitter Profile Photo

A small info-theory thread (or at least food for thought):

Why is the Bayesian Model Average the best choice? Really why?

I'll go through a naive argument (anyone has better references?), simple lower-bounds and decompositions, and pitch a "reverse mutual information"

1/15

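For the naive argument, a one-screen numerical check with toy numbers (not taken from the thread): on every data point, the log score of the Bayesian Model Average is at least the posterior-weighted average of the members' log scores, by Jensen's inequality.

```python
# Toy check: log of the mixture predictive >= weighted average of member log
# predictives (Jensen's inequality), pointwise.
import numpy as np

rng = np.random.default_rng(0)
n_models, n_points = 4, 10
w = rng.dirichlet(np.ones(n_models))              # posterior model weights
p = rng.uniform(0.05, 0.95, (n_models, n_points)) # each model's predictive prob per point

bma_log = np.log(w @ p)                           # log of the model average
avg_log = w @ np.log(p)                           # weighted average of the logs
print(np.all(bma_log >= avg_log - 1e-12))         # True on every point
print(np.round((bma_log - avg_log).mean(), 4))    # the (nonnegative) gap
```
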
Ueli Rutishauser (@uelirutishauser) 's Twitter Profile Photo

Delighted to share our latest finding! We discovered that abstract representations emerge in the human hippocampus when learning to perform inference. This change in neural geometry is due to disentanglement of discovered latent and observable variables. Nature: nature.com/articles/s4158…

Richard Song (@xingyousong) 's Twitter Profile Photo

How does Google optimize its research and systems? We’ve revealed the secrets behind the Vizier Gaussian Process Bandit algorithm, the black-box optimizer that’s been run millions of times!

Paper: arxiv.org/abs/2408.11527
Code: github.com/google/vizier

Compared to other industry

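The paper and repo document the production system; purely as a from-scratch sketch of the algorithm family it belongs to (a Gaussian-process bandit with an upper-confidence-bound acquisition in plain numpy, not the Vizier API):

```python
# Generic GP bandit loop (RBF kernel + UCB acquisition) on a 1-D toy objective.
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x) + 0.5 * np.cos(5 * x)     # unknown black-box objective

def rbf(a, b, ls=0.3):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

X = list(rng.uniform(0, 2, 2))                        # two random initial trials
Y = [f(x) for x in X]
cand = np.linspace(0, 2, 200)                         # candidate grid

for _ in range(15):
    Xa, Ya = np.array(X), np.array(Y)
    K = rbf(Xa, Xa) + 1e-6 * np.eye(len(Xa))          # GP fit on observed trials
    Ks = rbf(cand, Xa)
    mu = Ks @ np.linalg.solve(K, Ya)                  # posterior mean on candidates
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    ucb = mu + 2.0 * np.sqrt(np.maximum(var, 0))      # upper confidence bound
    x_next = cand[np.argmax(ucb)]                     # next trial = acquisition argmax
    X.append(x_next); Y.append(f(x_next))

print("best x, f(x):", X[int(np.argmax(Y))], max(Y))
```
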
Weijie Su (@weijie444) 's Twitter Profile Photo

New Research (w/ amazing Hangfeng He)

"A Law of Next-Token Prediction in Large Language Models"

LLMs rely on NTP, but their internal mechanisms seem chaotic. It's difficult to discern how each layer processes data for NTP. Surprisingly, we discover a physics-like law on NTP:

Michael Levin (@drmichaellevin) 's Twitter Profile Photo

New paper: Chris L Buckley, Tim Lewens at Cambridge HPS, Beren Millidge, Alec Tschantz, Richard Watson
mdpi.com/1099-4300/26/9…
"Natural Induction: Spontaneous Adaptive Organisation without Natural Selection"
#evolution, #basalcognition, #physics
Abstract: Evolution by

Mick Bonner (@michaelfbonner) 's Twitter Profile Photo

Can we gain a deep understanding of neural representations through dimensionality reduction? Our new work shows that the visual representations of the human brain need to be understood in high dimensions. w/ Raj & Brice Ménard. arxiv.org/abs/2409.06843
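One generic way to quantify "needs high dimensions" (synthetic data here, not the paper's measurements) is the participation ratio of the covariance eigenspectrum; when the spectrum decays slowly, no small set of principal components summarizes the representation.

```python
# Effective dimensionality via the participation ratio of a covariance
# eigenspectrum: PR = (sum lambda)^2 / sum(lambda^2).
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_units = 2000, 500

def participation_ratio(X):
    lam = np.clip(np.linalg.eigvalsh(np.cov(X.T)), 0, None)
    return lam.sum() ** 2 / (lam ** 2).sum()

# Low-dimensional data: only 5 directions carry variance.
low = rng.standard_normal((n_samples, 5)) @ rng.standard_normal((5, n_units))
# Slowly decaying spectrum: variance spread over many directions.
scales = np.arange(1, n_units + 1) ** -0.5
high = rng.standard_normal((n_samples, n_units)) * scales

print("PR, low-dimensional:", round(participation_ratio(low), 1))    # close to 5
print("PR, heavy-tailed   :", round(participation_ratio(high), 1))   # much larger
```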

Conor Heins (@conorheins) 's Twitter Profile Photo

1⃣ Excited to share new research from the Machine Learning Foundations Lab at VERSES AI Research!

"Gradient-free variational learning with conditional mixture networks"

📰Paper: arxiv.org/abs/2408.16429
💻Code: github.com/VersesTech/cav…
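The CAVI updates for conditional mixture networks are the paper's own contribution; as a minimal reminder of what gradient-free, coordinate-ascent variational inference looks like, here is the textbook version on a toy 1-D Gaussian mixture with known variances (closed-form updates only).

```python
# Minimal coordinate-ascent VI (no gradients) on a toy 1-D Gaussian mixture with
# known unit variances: alternate closed-form updates for q(z) and q(mu_k).
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])  # data
K, sigma0_sq = 2, 10.0                      # components, prior variance on the means

m = np.array([-1.0, 1.0])                   # q(mu_k) = N(m_k, s_sq_k); start apart
s_sq = np.ones(K)

for _ in range(50):
    # q(z_i = k) proportional to exp( E_q[ log N(x_i | mu_k, 1) ] )
    logits = x[:, None] * m[None, :] - 0.5 * (m**2 + s_sq)[None, :]
    r = np.exp(logits - logits.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)       # responsibilities
    # q(mu_k): Gaussian with closed-form updates given the responsibilities
    Nk = r.sum(axis=0)
    s_sq = 1.0 / (1.0 / sigma0_sq + Nk)
    m = s_sq * (r * x[:, None]).sum(axis=0)

print(np.round(m, 2))                       # posterior means near the true -2 and 3
```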