Yunhyeok Kwak (@yun_h_kwak)'s Twitter Profile
Yunhyeok Kwak

@yun_h_kwak

Research Engineer @ Krafton & Ph.D. student @ Seoul National University

ID: 1384885684574425090

Link: http://yun-kwak.github.io · Joined: 21-04-2021 15:04:52

70 Tweets

22 Followers

534 Following

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

Language Model Cascades (arxiv.org/abs/2207.10342). Good paper, and all the references (chain-of-thought, scratchpad, bootstrapping, verifiers, tool use, retrieval, etc.). There's a quickly growing stack around/above a single large language model, expanding its reasoning power.
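
The cascade idea is easy to sketch. Below is a minimal two-stage example, assuming a hypothetical `llm` text-completion function standing in for any LLM API: the first call elicits a chain-of-thought rationale, and the second conditions on it.

```python
# A minimal two-stage language model cascade: reason, then answer.
# `llm` is a hypothetical stand-in for any text-completion model or API.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in a real model or API here")

def answer(question: str) -> str:
    # Stage 1: elicit a scratchpad / chain-of-thought rationale.
    rationale = llm(f"Q: {question}\nLet's think step by step.\n")
    # Stage 2: a second call conditions on the rationale to extract the answer.
    return llm(f"Q: {question}\nReasoning: {rationale}\nTherefore, the answer is:")
```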

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

Is anyone aware of a language model experiment where you keep all the 2022 goodies/data, except swap the Transformer for an LSTM? I expect a gap should exist, and it's worth thinking about more closely, e.g. from the perspective of being both 1) expressive and 2) SGD-optimizable.
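
A hedged sketch of what such a controlled comparison might look like in PyTorch: data, embedding, and output head stay fixed while only the sequence backbone is swapped. The sizes below are placeholders, and positional encodings and the training loop are omitted.

```python
import torch.nn as nn

VOCAB, D_MODEL, LAYERS = 50257, 512, 6  # placeholder sizes, not a tuned config

def make_backbone(kind: str) -> nn.Module:
    if kind == "transformer":
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=8, batch_first=True)
        return nn.TransformerEncoder(layer, num_layers=LAYERS)
    return nn.LSTM(D_MODEL, D_MODEL, num_layers=LAYERS, batch_first=True)

class LM(nn.Module):
    def __init__(self, kind: str):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        self.backbone = make_backbone(kind)
        self.head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, tokens):  # tokens: (batch, seq)
        x = self.embed(tokens)
        if isinstance(self.backbone, nn.TransformerEncoder):
            # Causal mask so the Transformer, like the LSTM, only sees the past.
            mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
            h = self.backbone(x, mask=mask)
        else:
            h, _ = self.backbone(x)  # nn.LSTM returns (output, (h_n, c_n))
        return self.head(h)
```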

Pablo Samuel Castro (@pcastr)'s Twitter Profile Photo

While doing some cleaning around the house, I stumbled on a notebook filled with failed proofs for what would eventually become our MICo distance for MDPs. It took about a year of banging our heads against failed ideas until we found the one in our paper, which works really well!

Jung-Woo Ha (@jungwooha2)'s Twitter Profile Photo

[5/8] Inwoo Hwang, Sangjun Lee, Yunhyeok Kwak, Seong Joon Oh, Damien Teney, Jin-Hwa Kim, Byoung-Tak Zhang. SelecMix: Debiased Learning by Contradicting-pair Sampling.

Oriol Vinyals (@oriolvinyalsml)'s Twitter Profile Photo

This neural network architecture that was showcased at the Tesla AI day is a perfect example of Deep Learning at its finest. Mix and match all the greatest innovations to do something drastic and super ambitious. Congrats!

Shane Gu (@shaneguml)'s Twitter Profile Photo

World Model is a causal predictor. Decision Transformer is an anti-causal predictor. Hindsight Experience Replay is the Jedi mind trick to flip the causality.
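
The HER trick is concrete enough to sketch. In the illustrative snippet below (`Transition` and `reward_fn` are stand-ins, not any particular codebase), each transition of a finished episode is relabeled as if the outcome actually achieved had been the commanded goal:

```python
from collections import namedtuple

# Illustrative transition format; real implementations vary.
Transition = namedtuple("Transition", "state action goal reward next_state")

def her_relabel(episode, reward_fn):
    # Flip the causality: pretend the outcome we actually reached was the
    # goal we were commanded to reach, and recompute rewards accordingly.
    achieved = episode[-1].next_state
    return [t._replace(goal=achieved, reward=reward_fn(t.next_state, achieved))
            for t in episode]
```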

Sergey Levine (@svlevine)'s Twitter Profile Photo

Yet people can learn without simulators, and even without the Internet. Even what we call "imitation" in robotics is different from how people imitate. As for data, driving looks different from robotics because we don't yet have lots of robots, but we will: sergeylevine.substack.com/p/self-improvi…

Google DeepMind (@googledeepmind)'s Twitter Profile Photo

Today in Nature: #AlphaTensor, an AI system for discovering novel, efficient, and exact algorithms for matrix multiplication, a building block of modern computation. AlphaTensor finds faster algorithms for many matrix sizes: dpmd.ai/dm-alpha-tensor & dpmd.ai/nature-alpha-t… 1/
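
AlphaTensor searches for exactly this kind of low-rank decomposition of the matrix multiplication tensor. For reference, Strassen's classic 2x2 scheme, which trades the naive 8 multiplications for 7, can be verified numerically in a few lines:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))

# Strassen's 7 products (the naive algorithm uses 8).
m1 = (A[0, 0] + A[1, 1]) * (B[0, 0] + B[1, 1])
m2 = (A[1, 0] + A[1, 1]) * B[0, 0]
m3 = A[0, 0] * (B[0, 1] - B[1, 1])
m4 = A[1, 1] * (B[1, 0] - B[0, 0])
m5 = (A[0, 0] + A[0, 1]) * B[1, 1]
m6 = (A[1, 0] - A[0, 0]) * (B[0, 0] + B[0, 1])
m7 = (A[0, 1] - A[1, 1]) * (B[1, 0] + B[1, 1])

C = np.array([[m1 + m4 - m5 + m7, m3 + m5],
              [m2 + m4, m1 - m2 + m3 + m6]])
assert np.allclose(C, A @ B)  # matches the standard product
```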

Jim Fan (@drjimfan)'s Twitter Profile Photo

We trained a transformer called VIMA that ingests *multimodal* prompts and outputs controls for a robot arm. A single agent is able to solve visual goal-reaching, one-shot imitation from video, novel concept grounding, visual constraint satisfaction, etc. Strong scaling with model capacity and data! 🧵

Misha Laskin (@mishalaskin)'s Twitter Profile Photo

In our new work - Algorithm Distillation - we show that transformers can improve themselves autonomously through trial and error without ever updating their weights. No prompting, no finetuning. A single transformer collects its own data and maximizes rewards on new tasks. 1/N
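
A rough sketch of the evaluation loop this implies, with `env` and `model.act` as hypothetical interfaces rather than the paper's code: the weights stay frozen, and all adaptation happens through the growing cross-episode context.

```python
def evaluate_in_context(model, env, episodes: int):
    history = []  # (obs, action, reward) accumulated across ALL episodes
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            # Frozen weights: any "learning" happens via the growing context.
            action = model.act(context=history, obs=obs)
            next_obs, reward, done = env.step(action)
            history.append((obs, action, reward))
            obs = next_obs
    return history
```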

Jin-Hwa Kim (@jnhwkim)'s Twitter Profile Photo

"SelecMix: Debiased Learning by Contradicting-pair Sampling" Inwoo Hwang · Sangjun Lee · Yunhyeok Kwak · Seong Joon Oh · Damien Teney · Jin-Hwa Kim* · Byoung-Tak Zhang* Hall J #426 at 4 PM neurips.cc/virtual/2022/p…

Shane Legg (@shanelegg)'s Twitter Profile Photo

This is natural given the vast quantity of data suitable for SSL. Nevertheless, I'm predicting a bit of a comeback for RL as we try to shape and refine these systems for various applications.

Jim Fan (@drjimfan)'s Twitter Profile Photo

We train Transformers to encode algorithms, such as sorting, counting, and balancing parentheses, in their weights from lots of data. I never thought we might also go in the *reverse* direction: *compile* Transformer weights directly from explicit code! Cool paper @DeepMind: 1/🧵

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

Yay, the ability to share ChatGPT conversations is now rolling out. I can share a few favorites. E.g., GPT-4 is great at generating Anki flash cards, helping you memorize any document. Example: chat.openai.com/share/eef34fe5… Easy to then import into Anki: apps.ankiweb.net

Sebastian Seung (@sebastianseung)'s Twitter Profile Photo

We're done! A historic milestone for neuroscience brought to you by the FlyWire Consortium. Don't take my word for it. See the glory of the fly brain for yourself at flywire.ai/gallery

Inwoo Hwang (@inwooryanhwang)'s Twitter Profile Photo

Paper rejected from #NeurIPS2023. Frustrated initially as I felt some reviewers focused on irrelevant issues. But it's also a chance to improve. I'll work on better writing and clearer articulation for the next round, and try to keep a positive and humble attitude! 📝🌱 #PhDLife

Inwoo Hwang (@inwooryanhwang)'s Twitter Profile Photo

[1] Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction (Oral)
paper: arxiv.org/abs/2406.00614
We propose state-conditioned action abstraction that effectively reduces the search space of MCTS under vast combinatorial action spaces.
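
As a generic illustration of the idea (not the paper's actual method), one could cluster the legal actions by state-conditioned features and let MCTS expand only one representative per cluster; `featurize` below is a hypothetical featurizer.

```python
import numpy as np
from sklearn.cluster import KMeans

def abstract_actions(state, actions, featurize, n_groups=8):
    # Group similar actions in this state and keep one representative per
    # group, shrinking the branching factor MCTS has to search over.
    feats = np.stack([featurize(state, a) for a in actions])
    labels = KMeans(n_clusters=n_groups, n_init="auto").fit_predict(feats)
    reps = {}
    for action, label in zip(actions, labels):
        reps.setdefault(label, action)
    return list(reps.values())
```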

Inwoo Hwang (@inwooryanhwang)'s Twitter Profile Photo

[1] Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning (Tuesday, 13:30-15:00)
paper: arxiv.org/abs/2406.03234
We propose a principled approach to discovering fine-grained causal relationships with identifiability guarantees.
