Anirudh Goyal (@anirudhg9119) Twitter Tweets • TwiCopy

Anirudh Goyal

@anirudhg9119

Thinking about thinking.

Spent time at @Berkeley_EECS, @MPI_IS, @GoogleDeepMind.

ID: 2816636344

Link: https://anirudh9119.github.io/ Joined: 08-10-2014 19:47:25

901 Tweets

5.5K Followers

518 Following

Anirudh Goyal

@anirudhg9119

3 years ago

Temporal Latent Bottleneck combines recurrence and self-attention in a unified way. Recurrence integrates information over time, and self-attention models local dependencies within a "short" context. arxiv.org/abs/2205.14794

122 likes · 4 replies · 22 retweets
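The split described in the tweet above, attention inside a short chunk and a recurrent state carried across chunks, can be illustrated with a toy sketch. This is not the paper's architecture: the scalar tokens, the function names `self_attention` and `process`, and the moving-average state update are all assumptions chosen to keep the example small.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(chunk):
    # Scalar dot-product self-attention restricted to one chunk:
    # each token attends only to tokens in the same short context.
    out = []
    for q in chunk:
        weights = softmax([q * k for k in chunk])
        out.append(sum(w * v for w, v in zip(weights, chunk)))
    return out

def process(sequence, chunk_size, decay=0.5):
    # Hypothetical toy model: the state is a single float and is
    # updated once per chunk (assumed rule: exponential moving average).
    state = 0.0
    outputs = []
    for start in range(0, len(sequence), chunk_size):
        chunk = sequence[start:start + chunk_size]
        conditioned = [x + state for x in chunk]  # chunk reads the slow state
        attended = self_attention(conditioned)    # local dependencies
        outputs.extend(attended)
        summary = sum(attended) / len(attended)
        state = decay * state + (1 - decay) * summary  # recurrence over time
    return outputs, state

outputs, state = process([1.0] * 6, chunk_size=2)
```

With a constant input, attention inside each chunk returns the same values it was given, so any change in `outputs` from one chunk to the next comes only from the recurrent state.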

Anirudh Goyal

@anirudhg9119

3 years ago

Discrete Key-Value Bottleneck (Updated): Compresses the information of a pre-trained model into a learnable "key-value" codebook, so that knowledge can be quickly adapted in a continual learning fashion. arxiv.org/abs/2207.11240

104 likes · 2 replies · 21 retweets
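One way to picture a key-value codebook like the one in the tweet above is as a nearest-key lookup whose values can be edited one entry at a time. The sketch below is a simplified illustration, not the authors' implementation: the class name `KeyValueBottleneck`, the single codebook, the fixed keys, and the update rule are all assumptions.

```python
def squared_distance(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

class KeyValueBottleneck:
    # Hypothetical minimal codebook: keys are assumed fixed after
    # initialisation, values are the only learnable part.
    def __init__(self, keys, value_dim):
        self.keys = [list(k) for k in keys]
        self.values = [[0.0] * value_dim for _ in keys]

    def nearest(self, encoding):
        # Index of the key closest to the (pre-trained) encoding.
        return min(range(len(self.keys)),
                   key=lambda i: squared_distance(self.keys[i], encoding))

    def read(self, encoding):
        # Return a copy of the value stored under the nearest key.
        return list(self.values[self.nearest(encoding)])

    def update(self, encoding, target, lr=0.5):
        # Assumed rule: move only the selected value toward the target.
        i = self.nearest(encoding)
        self.values[i] = [v + lr * (t - v)
                          for v, t in zip(self.values[i], target)]
        return i

bottleneck = KeyValueBottleneck(keys=[[0.0, 0.0], [1.0, 1.0]], value_dim=1)
bottleneck.update([0.9, 1.1], target=[2.0], lr=1.0)
```

Because an update touches only the value under the selected key, entries for other regions of the input space stay unchanged. That locality is the property that makes this kind of structure attractive for continual adaptation.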

Anirudh Goyal

@anirudhg9119

2 years ago

Work from Danilo J. Rezende, Shakir Mohamed and Daan Wierstra. proceedings.mlr.press/v32/rezende14.… Credit assignment is difficult.

35 likes · 0 replies · 3 retweets

Anirudh Goyal

@anirudhg9119

a year ago

Interesting progress from Rafael Rafailov @ NeurIPS and FTP et al., following our work (applied to mathematical and commonsense reasoning): Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning arxiv.org/abs/2405.00451 (Also discussed in the Llama-3 paper, AI at Meta)

71 likes · 1 reply · 10 retweets

Nan Rosemary Ke

@rosemary_ke

a year ago

Boosting LLM Performance with Dynamic Skill Selection! 1/ 🚀 What if LLMs could get better at solving math problems by understanding the skills they need? We explored this idea by having LLMs identify and label the skills required for each problem. arxiv.org/abs/2405.12205

14 likes · 2 replies · 3 retweets

Michael Qizhe Shieh

@mpulsewidth

3 months ago

To me, diffusion LMs work because they remove unnecessary inductive biases. The left-to-right inductive bias is natural for humans but is unlikely to be natural for AI. Removing it gives our models more capacity, much as a Transformer has more capacity than an LSTM. Our experiment

247 likes · 12 replies · 28 retweets