Dimitris Tsipras (@tsiprasd)'s Twitter Profile
Dimitris Tsipras

@tsiprasd

ID: 982200698

Link: https://dtsipras.com · Joined: 01-12-2012 09:37:51

174 Tweets

2.2K Followers

140 Following

Joon Sung Park (@joon_s_pk)'s Twitter Profile Photo

How might an online community look after many people join? My paper w/ Lindsay Popowski @Carryveggies Meredith Ringel Morris Percy Liang Michael Bernstein introduces "social simulacra": a method of generating compelling social behaviors to prototype social designs 🧵 arxiv.org/abs/2208.04024 #uist2022

Xiang Lisa Li (@xianglisali2)'s Twitter Profile Photo

arxiv.org/abs/2210.15097 We propose contrastive decoding (CD), a more reliable search objective for text generation by contrasting LMs of different sizes. CD takes a large LM (expert LM e.g. OPT-13b) and a small LM (amateur LM e.g. OPT-125m) and maximizes their logprob difference

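As a hedged sketch of the idea in this tweet: score each candidate next token by the gap between the expert LM's and the amateur LM's log-probabilities and decode greedily. The model names, the `alpha` plausibility cutoff, and the greedy loop below are illustrative choices, not the authors' released implementation.

```python
# Minimal sketch of contrastive decoding as described above: pick the next
# token that maximizes log p_expert(x) - log p_amateur(x), restricted to
# tokens the expert itself finds plausible. Illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/opt-13b")   # OPT models share a tokenizer
expert = AutoModelForCausalLM.from_pretrained("facebook/opt-13b")
amateur = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

@torch.no_grad()
def contrastive_decode(prompt, max_new_tokens=50, alpha=0.1):
    ids = tok(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        log_p_exp = expert(ids).logits[0, -1].log_softmax(-1)
        log_p_ama = amateur(ids).logits[0, -1].log_softmax(-1)
        # Simple plausibility cutoff (a stand-in for the paper's constraint):
        # only consider tokens within a factor alpha of the expert's best token.
        plausible = log_p_exp >= log_p_exp.max() + torch.log(torch.tensor(alpha))
        # Maximize the expert-minus-amateur log-probability difference.
        scores = (log_p_exp - log_p_ama).masked_fill(~plausible, float("-inf"))
        next_id = scores.argmax().view(1, 1)
        ids = torch.cat([ids, next_id], dim=-1)
    return tok.decode(ids[0], skip_special_tokens=True)
```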
Percy Liang (@percyliang)'s Twitter Profile Photo

Language models are becoming the foundation of language technologies, but when do they work and when do they fail? In a new CRFM paper, we propose Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of LMs. Holistic evaluation includes three elements:

Aleksander Madry (@aleks_madry)'s Twitter Profile Photo

You’re deploying an ML system, choosing between two models trained w/ diff algs. Same training data, same acc... how do you differentiate their behavior? ModelDiff (gradientscience.org/modeldiff) lets you compare *any* two learning algs! w/ Harshay Shah Sam Park Andrew Ilyas (1/8)

Aleksander Madry (@aleks_madry)'s Twitter Profile Photo

Stable diffusion can visualize + improve model failure modes! Leveraging our method, we can generate examples of hard subpopulations, which can then be used for targeted data augmentation to improve reliability. Blog: gradientscience.org/failure-direct… Saachi Jain Hannah Lawrence A.Moitra

Percy Liang (@percyliang)'s Twitter Profile Photo

📣 CRFM announces PubMedGPT, a new 2.7B language model that achieves a new SOTA on the US medical licensing exam. The recipe is simple: a standard Transformer trained from scratch on PubMed (from The Pile) using @mosaicml on the MosaicML Cloud, then fine-tuned for the QA task.

Aleksander Madry (@aleks_madry)'s Twitter Profile Photo

Recent events (ahem) have brought the debate on whether/how to regulate social media back to the forefront. My students Sarah Cen Andrew Ilyas and I have been thinking about this for a *while*. Excited to share the first results of our thinking: aipolicy.substack.com/p/socialmedias… (1/3)

Percy Liang (@percyliang)'s Twitter Profile Photo

Announcing Holistic Evaluation of Language Models (HELM) v0.2.0 with updated results on the new OpenAI, AI21 Labs, and @CohereAI models. HELM now evaluates 34 prominent language models in a standardized way on 42 scenarios x 7 metrics.

Dimitris Papailiopoulos (@dimitrispapail)'s Twitter Profile Photo

Can transformers follow instructions? We explore this in: "Looped Transformers as Programmable Computers" arxiv.org/abs/2301.13196 led by Angeliki (Angeliki Giannou) and Shashank (Shashank Rajput) in collaboration with Kangwook Lee and Jason Lee. Here is a 🧵

John Hewitt (@johnhewtt)'s Twitter Profile Photo

For this year's CS 224n: Natural Language Processing with Deep Learning, I've written notes on our Self-Attention and Transformers lecture. web.stanford.edu/class/cs224n/r… Topics: Problems with RNNs, then self-attention, then a 'minimal' self-attention architecture, then Transformers.

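For readers of those notes, here is one way a "minimal" single-head self-attention layer can look in PyTorch; the class name, dimensions, and the absence of masking and multiple heads are simplifications for illustration, not the course's exact formulation.

```python
# Minimal single-head self-attention: softmax(QK^T / sqrt(d)) V
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, d_model):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)

    def forward(self, x):                          # x: (batch, seq_len, d_model)
        q, k, v = self.q(x), self.k(x), self.v(x)
        # Scaled dot-product attention over the sequence dimension.
        scores = q @ k.transpose(-2, -1) / (x.size(-1) ** 0.5)
        return scores.softmax(dim=-1) @ v

x = torch.randn(2, 5, 64)
print(SelfAttention(64)(x).shape)                  # torch.Size([2, 5, 64])
```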
Sang Michael Xie (@sangmichaelxie)'s Twitter Profile Photo

Data selection for LMs (GPT-3, PaLM) is done with heuristics, e.g. training a classifier to pick out high-quality text. Can we do better? Turns out we can boost downstream GLUE accuracy by 2+% by adapting the classic importance resampling algorithm: arxiv.org/abs/2302.03169 🧵

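A rough sketch of what "adapting classic importance resampling to data selection" can look like: weight each raw example by how much more likely it is under a target-domain proxy model than under a raw-pool proxy model, then resample in proportion to those weights. The unigram proxies and helper names below are illustrative simplifications, not the paper's estimator.

```python
# Hedged sketch: importance resampling for data selection.
import math
import random
from collections import Counter

def unigram_logprob(text, counts, total, vocab_size):
    # Add-one smoothed unigram log-probability of a whitespace-tokenized text.
    return sum(math.log((counts[w] + 1) / (total + vocab_size)) for w in text.split())

def select(raw_texts, target_texts, k):
    tgt_counts = Counter(w for t in target_texts for w in t.split())
    raw_counts = Counter(w for t in raw_texts for w in t.split())
    vocab = len(set(tgt_counts) | set(raw_counts))
    tgt_total, raw_total = sum(tgt_counts.values()), sum(raw_counts.values())
    # Importance weight in log space: log p_target(x) - log p_raw(x).
    log_w = [unigram_logprob(t, tgt_counts, tgt_total, vocab)
             - unigram_logprob(t, raw_counts, raw_total, vocab) for t in raw_texts]
    m = max(log_w)                                  # subtract max for numerical stability
    weights = [math.exp(lw - m) for lw in log_w]
    # Resample k examples with probability proportional to their weights.
    return random.choices(raw_texts, weights=weights, k=k)
```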
Tatsunori Hashimoto (@tatsu_hashimoto)'s Twitter Profile Photo

We know that language models (LMs) reflect opinions - from internet pre-training, to developers and crowdworkers, and even user feedback. But whose opinions actually appear in the outputs? We make LMs answer public opinion polls to find out: arxiv.org/abs/2303.17548

OpenAI (@openai)'s Twitter Profile Photo

We have reached an agreement in principle for Sam Altman to return to OpenAI as CEO with a new initial board of Bret Taylor (Chair), Larry Summers, and Adam D'Angelo. We are collaborating to figure out the details. Thank you so much for your patience through this.