Hunter Lang (@hunterjlang)'s Twitter Profile
Hunter Lang

@hunterjlang

PhD student at @MIT_CSAIL working on self/weak supervision and NLP with @David_Sontag. he/him

ID: 959854738023047168

Link: https://web.mit.edu/~hjl/www/ · Joined: 03-02-2018 18:22:57

88 Tweets

317 Followers

325 Following

Monica Agrawal (@monicanagrawal)'s Twitter Profile Photo

Getting labeled data for clinical NLP can be prohibitively difficult. In work at #emnlp2022, we find that GPT-3 models perform very well at clinical tasks in the few-shot setting, indicating a new paradigm for transforming EHR notes into actionable data news.mit.edu/2022/large-lan…
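
A minimal sketch of what few-shot prompting for a clinical extraction task could look like. The example notes, labels, and the `complete()` stub are illustrative placeholders, not the paper's actual prompts or models:

```python
# Few-shot prompting sketch: labeled demonstrations are concatenated ahead
# of the query note, and the LLM is asked to continue the pattern.

FEW_SHOT_EXAMPLES = [
    ("Patient started on metformin 500mg for T2DM.",
     "medications: metformin 500mg"),
    ("Discontinued lisinopril due to cough.",
     "medications: lisinopril (discontinued)"),
]

def build_prompt(note: str) -> str:
    """Concatenate labeled demonstrations, then the query note."""
    parts = ["Extract the medications mentioned in each clinical note.\n"]
    for text, label in FEW_SHOT_EXAMPLES:
        parts.append(f"Note: {text}\n{label}\n")
    parts.append(f"Note: {note}\nmedications:")
    return "\n".join(parts)

def complete(prompt: str) -> str:
    raise NotImplementedError("call your LLM provider of choice here")

print(build_prompt("Increased atorvastatin to 40mg daily."))
```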

Hussein Mozannar (@hsseinmzannar)'s Twitter Profile Photo

Can AI tell us when it can predict better than humans and when humans are better? In our AISTATS23 paper (oral) with the MIT-IBM Watson AI Lab, we build AI models that 1) complement humans and 2) can defer to humans when necessary arxiv.org/abs/2301.06197

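A toy sketch of the prediction-with-deferral idea: a classifier plus a "rejector" head that routes hard examples to a human. This is a simplified illustration under assumed shapes and a fixed threshold, not the exact algorithm from the paper:

```python
import torch
import torch.nn as nn

# Prediction with deferral: the classifier head makes the AI prediction,
# and the rejector head scores how much the example should go to a human.

class DeferralModel(nn.Module):
    def __init__(self, dim: int, num_classes: int):
        super().__init__()
        self.classifier = nn.Linear(dim, num_classes)  # AI prediction
        self.rejector = nn.Linear(dim, 1)              # defer-to-human score

    def forward(self, x):
        return self.classifier(x), self.rejector(x)

def predict(model, x, human_label):
    """Route each example to the AI or the human based on the rejector."""
    logits, defer_score = model(x)
    ai_pred = logits.argmax(dim=-1)
    defer = defer_score.squeeze(-1) > 0.0  # defer when the score is positive
    return torch.where(defer, human_label, ai_pred)
```
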
AK (@_akhaliq)'s Twitter Profile Photo

Learning to Decode Collaboratively with Multiple Language Models

We propose a method to teach multiple large language models (LLMs) to collaborate by interleaving their generations at the token level. We model the decision of which LLM generates the next token as a latent

Shannon Shen (@shannonzshen)'s Twitter Profile Photo

Thanks AK for tweeting our work! 🤝 Co-LLM trains a language model to collaboratively decode with other LMs. It decides whether to invoke the other assistant LM for each token. By optimally combining the expertise from each LM, Co-LLM generation can be better than using

MIT CSAIL (@mit_csail)'s Twitter Profile Photo

Can LLMs learn to "phone a friend?" 🧵 MIT CSAIL’s new "Co-LLM" algorithm can pair a general-purpose base LLM w/a more specialized model & help them work together. It reviews each token & sees where it needs to call upon an expert, leading to more accurate & efficient replies to

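A schematic of token-level collaborative decoding in the spirit of Co-LLM: at each step a latent binary choice decides whether the base model or the assistant emits the next token. The `gate` callable stands in for the learned latent-variable head, and the HuggingFace-style `model(input_ids).logits` interface is an assumption of this sketch:

```python
import torch

# Greedy collaborative decoding: per token, a gate picks which model speaks.

@torch.no_grad()
def co_decode(base, assistant, gate, input_ids, max_new_tokens=50):
    for _ in range(max_new_tokens):
        # Latent choice z_t: 1 = call the assistant, 0 = base model decodes.
        model = assistant if gate(input_ids) > 0.5 else base
        logits = model(input_ids).logits[:, -1, :]
        next_token = logits.argmax(dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_token], dim=-1)
    return input_ids
```
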
Ekin Akyürek (@akyurekekin)'s Twitter Profile Photo

Why do we treat train and test times so differently? Why is one “training” and the other “in-context learning”? Just take a few gradient steps at test time, a simple way to increase test-time compute, and get SoTA on the ARC public validation set: 61%, the average human score! ARC Prize

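A rough sketch of the test-time training recipe the tweet describes: adapt a copy of the model with a few gradient steps on the task's own demonstration pairs, then predict with the adapted copy. The HuggingFace-style loss interface and the optimizer settings are illustrative; the paper's full recipe involves more (e.g., adapters and data augmentation):

```python
import copy
import torch

# Test-time training: fine-tune briefly on the task's demonstrations
# before answering, leaving the original weights untouched.

def test_time_train(model, demo_batches, steps=5, lr=1e-4):
    adapted = copy.deepcopy(model)  # keep the base model intact
    adapted.train()
    opt = torch.optim.AdamW(adapted.parameters(), lr=lr)
    for _ in range(steps):
        for inputs, targets in demo_batches:
            loss = adapted(inputs, labels=targets).loss  # LM loss on demos
            opt.zero_grad()
            loss.backward()
            opt.step()
    adapted.eval()
    return adapted
```
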
Jeremy Bernstein (@jxbz)'s Twitter Profile Photo

I just wrote my first blog post in four years! It is called "Deriving Muon". It covers the theory that led to Muon and how, for me, Muon is a meaningful example of theory leading practice in deep learning (1/11)

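For context, a condensed sketch of the Muon update the post derives: keep a momentum buffer per weight matrix and approximately orthogonalize it with a few Newton-Schulz iterations before applying it. The coefficients follow the public reference implementation; the hyperparameters here are illustrative:

```python
import torch

# Muon update sketch: orthogonalized momentum for 2D weight matrices.

def newton_schulz(G, steps=5, eps=1e-7):
    """Approximate UV^T from the SVD G = USV^T, without computing the SVD.
    Assumes a wide-or-square matrix; transpose tall matrices first."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + eps)  # scale so the iteration converges
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X

@torch.no_grad()
def muon_step(weight, grad, momentum, beta=0.95, lr=0.02):
    momentum.mul_(beta).add_(grad)   # standard momentum accumulation
    update = newton_schulz(momentum)  # orthogonalize the momentum
    weight.add_(update, alpha=-lr)
```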