Arkadiy Saakyan (@rkdsaakyan) 's Twitter Profile
Arkadiy Saakyan

@rkdsaakyan

PhD student @ColumbiaCompSci @columbianlp working on human-AI collaboration, AI creativity and explainability. prev. intern @GoogleDeepMind, @AmazonScience

ID: 1439915410263101446

Link: http://asaakyan.github.io
Joined: 20-09-2021 11:33:00

40 Tweets

147 Followers

532 Following

Tuhin Chakrabarty (@tuhinchakr) 's Twitter Profile Photo

New paper with students at Barnard College on testing orthogonal thinking / abstract reasoning capabilities of Large Language Models using the fascinating yet frustratingly difficult The New York Times Connections game. #NLProc #LLMs #GPT4o #Claude3opus 🧵(1/n)
Emmy Liu (@_emliu) 's Twitter Profile Photo

Thanks to everyone who attended FigLangWorkshop at #naacl2024 ! If you weren't able to make it, we've made recordings of the panel and keynote available! ☕️ Panel on creativity in the age of LLMs: sites.google.com/view/figlang20… 🎤 Vered Shwartz 's keynote: sites.google.com/view/figlang20…

Chunting Zhou (@violet_zct) 's Twitter Profile Photo

Introducing *Transfusion* - a unified approach for training models that can generate both text and images. arxiv.org/pdf/2408.11039

Transfusion combines language modeling (next token prediction) with diffusion to train a single transformer over mixed-modality sequences. This
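As a rough illustration of the idea in that tweet, here is a toy sketch (not the paper's code) of a single training objective over a mixed-modality sequence: standard cross-entropy next-token prediction on the text positions, plus a simplified denoising (MSE) loss on the image-patch positions. All shapes, values, and the loss weights are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mixed-modality sequence: 4 discrete text positions, 3 continuous image patches.
text_logits = rng.normal(size=(4, 10))   # model outputs over a vocab of 10
text_targets = np.array([1, 3, 3, 7])    # next-token targets
noise_pred = rng.normal(size=(3, 16))    # predicted noise at image-patch positions
noise_true = rng.normal(size=(3, 16))    # noise actually added in the forward process

def cross_entropy(logits, targets):
    # next-token prediction loss on the text positions
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

def diffusion_mse(pred, true):
    # simplified denoising objective on the image positions
    return ((pred - true) ** 2).mean()

# One scalar objective for one transformer over the whole mixed sequence
lm_weight, diff_weight = 1.0, 1.0  # hypothetical weighting
loss = lm_weight * cross_entropy(text_logits, text_targets) \
     + diff_weight * diffusion_mse(noise_pred, noise_true)
print(float(loss))
```

The point of the sketch is only that both modalities contribute terms to one loss for one model; see the linked paper for the actual formulation.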
Gabriel Agostini (@gsagostini) 's Twitter Profile Photo

Migration data lets us study responses to environmental disasters, social change patterns, policy impacts, etc. But public data is too coarse, obscuring these important phenomena. We build MIGRATE: a dataset of yearly flows between 47 billion pairs of US Census Block Groups. 1/5

Tuhin Chakrabarty (@tuhinchakr) 's Twitter Profile Photo

Unlike math/code, writing lacks verifiable rewards, so all we get is slop. To solve this, we train reward models on expert edits that beat SOTA #LLMs by a large margin on a new Writing Quality benchmark. We also reduce #AI slop by using our RMs at test time, boosting alignment with experts.

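One common way to use a reward model at test time is best-of-n reranking; the thread doesn't specify the mechanism, so the sketch below is a hypothetical stand-in where `score` plays the role of a trained writing-quality RM.

```python
# Hypothetical best-of-n reranking with a reward model at inference time.
def score(text: str) -> float:
    # toy proxy for a learned reward model: penalize stock "slop" phrases
    slop = ("delve", "tapestry", "in conclusion")
    return -sum(text.lower().count(p) for p in slop)

def best_of_n(candidates):
    # sample n drafts from the generator, keep the one the RM ranks highest
    return max(candidates, key=score)

drafts = [
    "In conclusion, we delve into a rich tapestry of ideas.",
    "The argument is simple: edits by experts define quality.",
]
print(best_of_n(drafts))  # the second draft wins under this toy scorer
```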
Ramya Namuduri (@ramya_namuduri) 's Twitter Profile Photo

Have that eerie feeling of déjà vu when reading model-generated text 👀, but can’t pinpoint the specific words or phrases 👀?

✨We introduce QUDsim to quantify discourse similarities beyond lexical, syntactic, and content overlap.
Vishakh Padmakumar (@vishakh_pk) 's Twitter Profile Photo

What does it mean for #LLM output to be novel? In work w/ John (Yueh-Han) Chen, Jane Pan, Valerie Chen, He He we argue it needs to be both original and high quality. While prompting tricks trade one for the other, better models (scaling/post-training) can shift the novelty frontier 🧵

Chau Minh Pham (@chautmpham) 's Twitter Profile Photo

🤔 What if you gave an LLM thousands of random human-written paragraphs and told it to write something new -- while copying 90% of its output from those texts?

🧟 You get what we call a Frankentext!

💡 Frankentexts are surprisingly coherent and tough for AI detectors to flag.
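The thread doesn't say how the 90%-copied constraint is verified, but one simple way to check it is by measuring what fraction of the output's word n-grams appear verbatim in the source paragraphs. The function and example below are hypothetical illustrations, not the paper's method.

```python
# Hypothetical check of a "mostly copied" constraint via verbatim word 5-grams.
def ngrams(words, n=5):
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def copied_fraction(output: str, sources: list, n=5) -> float:
    out = output.split()
    out_grams = [tuple(out[i:i + n]) for i in range(len(out) - n + 1)]
    src_grams = set().union(*(ngrams(s.split(), n) for s in sources))
    if not out_grams:
        return 0.0
    return sum(g in src_grams for g in out_grams) / len(out_grams)

sources = ["the quick brown fox jumps over the lazy dog near the river bank"]
output = "the quick brown fox jumps over the lazy dog near a quiet field"
print(round(copied_fraction(output, sources), 2))
```

A real check would likely operate on character spans or sentence IDs rather than n-grams, but the idea of scoring an output against its source pool is the same.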
Sarah Wiegreffe (on faculty job market!) (@sarahwiegreffe) 's Twitter Profile Photo

A bit late to announce, but I’m excited to share that I'll be starting as an assistant professor at the University of Maryland's Department of Computer Science this August. I'll be recruiting PhD students this upcoming cycle for fall 2026. (And if you're a UMD grad student, sign up for my fall seminar!)
METR (@metr_evals) 's Twitter Profile Photo

We ran a randomized controlled trial to see how much AI coding tools speed up experienced open-source developers.

The results surprised us: Developers thought they were 20% faster with AI tools, but they were actually 19% slower when they had access to AI than when they didn't.
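To make the perception/reality gap concrete, here is a quick arithmetic check under one common reading of those percentages, using a hypothetical 100-minute baseline task (the study's actual task times aren't in the tweet):

```python
baseline = 100.0                       # hypothetical minutes per task without AI
perceived_with_ai = baseline / 1.20    # "20% faster" would mean ~83.3 minutes
actual_with_ai = baseline * 1.19       # "19% slower" means 119 minutes
gap = actual_with_ai - perceived_with_ai
print(round(perceived_with_ai, 1), actual_with_ai, round(gap, 1))
```

Under this reading, developers believed a 100-minute task took about 83 minutes with AI, when it actually took 119: a roughly 36-minute gap between felt and measured speed.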