Hunter Lang (@hunterjlang)'s Twitter Profile
Hunter Lang

@hunterjlang

PhD student at @MIT_CSAIL working on self/weak supervision and NLP with @David_Sontag. he/him

ID: 959854738023047168

Link: https://web.mit.edu/~hjl/www/ · Joined: 03-02-2018 18:22:57

88 Tweets

317 Followers

325 Following

Monica Agrawal (@monicanagrawal)'s Twitter Profile Photo

Getting labeled data for clinical NLP can be prohibitively difficult. In work at #emnlp2022, we find that GPT-3 models perform very well at clinical tasks in the few-shot setting, indicating a new paradigm for transforming EHR notes into actionable data news.mit.edu/2022/large-lan…
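
A minimal sketch of what few-shot prompting for a clinical extraction task could look like. The example notes, labels, and the `complete()` stub are illustrative placeholders, not the paper's actual prompts or models:

```python
# Few-shot prompting sketch: labeled demonstrations are concatenated ahead
# of the query note, and the LLM is asked to continue the pattern.

FEW_SHOT_EXAMPLES = [
    ("Patient started on metformin 500mg for T2DM.",
     "medications: metformin 500mg"),
    ("Discontinued lisinopril due to cough.",
     "medications: lisinopril (discontinued)"),
]

def build_prompt(note: str) -> str:
    """Concatenate labeled demonstrations, then the query note."""
    parts = ["Extract the medications mentioned in each clinical note.\n"]
    for text, label in FEW_SHOT_EXAMPLES:
        parts.append(f"Note: {text}\n{label}\n")
    parts.append(f"Note: {note}\nmedications:")
    return "\n".join(parts)

def complete(prompt: str) -> str:
    raise NotImplementedError("call your LLM provider of choice here")

print(build_prompt("Increased atorvastatin to 40mg daily."))
```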

Hussein Mozannar (@hsseinmzannar)'s Twitter Profile Photo

Can AI tell us when it can predict better than humans and when humans are better? In our AISTATS23 paper (oral) with the MIT-IBM Watson AI Lab, we build AI models that 1) complement humans and 2) can defer to humans when necessary arxiv.org/abs/2301.06197

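A toy sketch of the prediction-with-deferral idea: a classifier plus a "rejector" head that routes hard examples to a human. This is a simplified illustration under assumed shapes and a fixed threshold, not the exact algorithm from the paper:

```python
import torch
import torch.nn as nn

# Prediction with deferral: the classifier head makes the AI prediction,
# and the rejector head scores how much the example should go to a human.

class DeferralModel(nn.Module):
    def __init__(self, dim: int, num_classes: int):
        super().__init__()
        self.classifier = nn.Linear(dim, num_classes)  # AI prediction
        self.rejector = nn.Linear(dim, 1)              # defer-to-human score

    def forward(self, x):
        return self.classifier(x), self.rejector(x)

def predict(model, x, human_label):
    """Route each example to the AI or the human based on the rejector."""
    logits, defer_score = model(x)
    ai_pred = logits.argmax(dim=-1)
    defer = defer_score.squeeze(-1) > 0.0  # defer when the score is positive
    return torch.where(defer, human_label, ai_pred)
```
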
AK (@_akhaliq)'s Twitter Profile Photo

Learning to Decode Collaboratively with Multiple Language Models

We propose a method to teach multiple large language models (LLMs) to collaborate by interleaving their generations at the token level. We model the decision of which LLM generates the next token as a latent

Shannon Shen (@shannonzshen)'s Twitter Profile Photo

Thanks AK for tweeting our work! 🤝 Co-LLM trains a language model to collaboratively decode with other LMs. It decides whether to invoke the other assistant LM for each token. By optimally combining the expertise from each LM, Co-LLM generation can be better than using

MIT CSAIL (@mit_csail)'s Twitter Profile Photo

Can LLMs learn to "phone a friend?" 🧵 MIT CSAIL’s new "Co-LLM" algorithm can pair a general-purpose base LLM w/a more specialized model & help them work together. It reviews each token & sees where it needs to call upon an expert, leading to more accurate & efficient replies to

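A schematic of token-level collaborative decoding in the spirit of Co-LLM: at each step a latent binary choice decides whether the base model or the assistant emits the next token. The `gate` callable stands in for the learned latent-variable head, and the HuggingFace-style `model(input_ids).logits` interface is an assumption of this sketch:

```python
import torch

# Greedy collaborative decoding: per token, a gate picks which model speaks.

@torch.no_grad()
def co_decode(base, assistant, gate, input_ids, max_new_tokens=50):
    for _ in range(max_new_tokens):
        # Latent choice z_t: 1 = call the assistant, 0 = base model decodes.
        model = assistant if gate(input_ids) > 0.5 else base
        logits = model(input_ids).logits[:, -1, :]
        next_token = logits.argmax(dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_token], dim=-1)
    return input_ids
```
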
Ekin Akyürek (@akyurekekin)'s Twitter Profile Photo

Why do we treat train and test times so differently? Why is one “training” and the other “in-context learning”? Just take a few gradient steps at test time, a simple way to increase test-time compute, and get SoTA on the ARC public validation set: 61%, the average human score! ARC Prize

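A rough sketch of the test-time training recipe the tweet describes: adapt a copy of the model with a few gradient steps on the task's own demonstration pairs, then predict with the adapted copy. The HuggingFace-style loss interface and the optimizer settings are illustrative; the paper's full recipe involves more (e.g., adapters and data augmentation):

```python
import copy
import torch

# Test-time training: fine-tune briefly on the task's demonstrations
# before answering, leaving the original weights untouched.

def test_time_train(model, demo_batches, steps=5, lr=1e-4):
    adapted = copy.deepcopy(model)  # keep the base model intact
    adapted.train()
    opt = torch.optim.AdamW(adapted.parameters(), lr=lr)
    for _ in range(steps):
        for inputs, targets in demo_batches:
            loss = adapted(inputs, labels=targets).loss  # LM loss on demos
            opt.zero_grad()
            loss.backward()
            opt.step()
    adapted.eval()
    return adapted
```
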
Jeremy Bernstein (@jxbz)'s Twitter Profile Photo

I just wrote my first blog post in four years! It is called "Deriving Muon". It covers the theory that led to Muon and how, for me, Muon is a meaningful example of theory leading practice in deep learning (1/11)

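For context, a condensed sketch of the Muon update the post derives: keep a momentum buffer per weight matrix and approximately orthogonalize it with a few Newton-Schulz iterations before applying it. The coefficients follow the public reference implementation; the hyperparameters here are illustrative:

```python
import torch

# Muon update sketch: orthogonalized momentum for 2D weight matrices.

def newton_schulz(G, steps=5, eps=1e-7):
    """Approximate UV^T from the SVD G = USV^T, without computing the SVD.
    Assumes a wide-or-square matrix; transpose tall matrices first."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + eps)  # scale so the iteration converges
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X

@torch.no_grad()
def muon_step(weight, grad, momentum, beta=0.95, lr=0.02):
    momentum.mul_(beta).add_(grad)   # standard momentum accumulation
    update = newton_schulz(momentum)  # orthogonalize the momentum
    weight.add_(update, alpha=-lr)
```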