Vishakh Padmakumar (@vishakh_pk) 's Twitter Profile
Vishakh Padmakumar

@vishakh_pk

PhD Student @NYUDataScience, currently also hanging out with @SemanticScholar @allen_ai

ID: 3322187402

Link: http://vishakhpk.github.io/ | Joined: 21-08-2015 09:15:19

306 Tweets

542 Followers

550 Following

Gautam Kamath (@thegautamkamath) 's Twitter Profile Photo

I wrote a post on how to connect with people (i.e., make friends) at CS conferences. These events can be intimidating, so here are some suggestions on how to navigate them.

I'm late for #ICLR2025 #NAACL2025, but just in time for #AISTATS2025 and timely for #ICML2025 acceptances! 1/4
Arkadiy Saakyan (@rkdsaakyan) 's Twitter Profile Photo

Can vision-language models understand figurative meaning like visual metaphors, sarcastic image captions or memes? Come find out at our #NAACL2025 poster on Friday 9am! New task & dataset of images and captions with figurative phenomena like metaphor, idiom, sarcasm, and humor.
Kenneth Huang (@windx0303) 's Twitter Profile Photo

This year's #In2Writing workshop at #NAACL2025 was indeed amazing. We heard voices from teachers, English scholars, NLPers, writers, and industry folks.

See you next time!
Sanchaita Hazra (@hsanchaita) 's Twitter Profile Photo

Very excited for a new #ICML2025 position paper accepted as an oral w/ Bodhisattwa Majumder & Tuhin Chakrabarty! 😎

What are the longitudinal harms of AI development?

We use economic theories to highlight AI’s intertemporal impacts on livelihoods & its role in deepening labor-market inequality.
Tuhin Chakrabarty (@tuhinchakr) 's Twitter Profile Photo

Thinking about model welfare and catastrophic risks without considering the longitudinal harms caused by generative AI? Check out our ICML oral paper on why AI Safety should prioritize the Future of Work. Had lots of fun writing this #GenAI

Roger Beaty (@roger_beaty) 's Twitter Profile Photo

New paper: AI can generate creative ideas when prompted—but can it actually improve our own creativity? In 2 studies (total N = 36,752), we show AI can enhance human creativity through real-time feedback, helping people better evaluate their own ideas. osf.io/preprints/osf/…

Nishant Balepur (@nishantbalepur) 's Twitter Profile Photo

🎉🎉 Excited to have two papers accepted to #ACL2025!

Our first paper designs a preference training method to boost LLM personalization 🎨
While the second outlines our position on why MCQA evals are terrible and how to make them better 🙏

Grateful for amazing collaborators!
Dayeon (Zoey) Ki (@zoeykii) 's Twitter Profile Photo

1/ How can a monolingual English speaker 🇺🇸 decide if a French translation 🇫🇷 is good enough to be shared? 

Introducing ❓AskQE❓, an #LLM-based Question Generation + Answering framework that detects critical MT errors and provides actionable feedback 🗣️ 

#ACL2025
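
For intuition, a rough sketch of how a QG + QA check in this spirit can work (the ask_llm helper and prompts below are illustrative assumptions, not the released AskQE code): generate factual questions from the English source, answer them against the source and against the MT output (e.g., a back-translation), and flag answer mismatches as potential critical errors.

# Illustrative sketch only: QG + QA for MT quality estimation in the spirit of AskQE.
# ask_llm is a placeholder for any instruction-following LLM call.
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def askqe_check(source_en: str, mt_backtranslation_en: str, num_questions: int = 3) -> list[dict]:
    """Compare answers grounded in the source vs. the MT output to surface critical errors."""
    # 1) Generate short factual questions about the source sentence.
    questions = ask_llm(
        f"Write {num_questions} short factual questions about this sentence, one per line:\n{source_en}"
    ).splitlines()

    flags = []
    for q in questions:
        # 2) Answer each question from the source and from the (back-translated) MT output.
        a_src = ask_llm(f"Answer briefly using only this text:\n{source_en}\nQ: {q}")
        a_mt = ask_llm(f"Answer briefly using only this text:\n{mt_backtranslation_en}\nQ: {q}")
        # 3) A mismatch suggests the translation changed or dropped a key fact.
        if a_src.strip().lower() != a_mt.strip().lower():
            flags.append({"question": q, "source_answer": a_src, "mt_answer": a_mt})
    return flags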
Aakanksha Naik (@arnaik19) 's Twitter Profile Photo

🚨Test data is out! 🚨 The testing phase will run until May 24, 5 pm PT. Check out our GitHub for the data + submission instructions. Bring your best models 💪! Participants can also submit shared task reports to the Scholarly Document Processing Workshop after the testing phase!

Mina Lee (@minalee__) 's Twitter Profile Photo

What does it mean to write and think with AI? What new possibilities and challenges does that bring?

I spoke with THE AI (in Korean) about our group's research and the future of writing with AI. 👩🤖✍️

newstheai.com/news/articleVi…
William Merrill (@lambdaviking) 's Twitter Profile Photo

Padding a transformer’s input with blank tokens (...) is a simple form of test-time compute. Can it increase the computational power of LLMs? 👀

New work with Ashish Sabharwal addresses this with *exact characterizations* of the expressive power of transformers with padding 🧵
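
For a concrete feel of the setup (a toy illustration, not the paper's code; the filler string and model here are my own choices), padding just appends uninformative tokens after the question, giving the transformer extra positions, and hence extra parallel computation, before it produces an answer.

# Toy illustration of padding as test-time compute, using a standard Hugging Face causal LM.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

question = "Q: Is 91 a prime number?"
padded = question + " ..." * 32 + " A:"  # blank filler adds positions (compute) but no information

inputs = tok(padded, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=5, do_sample=False)
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:]))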
Joe Stacey (@_joestacey_) 's Twitter Profile Photo

We have a new paper up on arXiv! 🥳🪇

The paper tries to improve the robustness of closed-source LLMs fine-tuned on NLI, assuming a realistic training budget of 10k training examples. 

Here's a 60 second rundown of what we found!
Simone Luchini (@simone_luchini) 's Twitter Profile Photo

🚨Check out our recent preprint on human-AI co-creativity in story writing!

🔍🔍🔍We investigate the mechanisms that underlie the outcomes of human-AI co-creativity in a highly naturalistic setting.

doi.org/10.31234/osf.i…
Omar Khattab (@lateinteraction) 's Twitter Profile Photo

I need to read it carefully, but IMO this is now likely the most deserving of "important papers in LLM RL since R1". If you try 100 random underpowered tricks and all of them lead to huge gains, but only on a certain model class X, the finding is about X, not about the random tricks!

Roger Beaty (@roger_beaty) 's Twitter Profile Photo

LLMs still struggle with creativity. How can we make them more creative? Train AI on what people actually consider creative. We built a dataset of 200k+ human creativity ratings and used it to train a model that outperforms GPT-4o on creativity tests. arxiv.org/abs/2505.14442
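
As a hedged sketch of that general recipe (the base model, data fields, and toy ratings below are assumptions for illustration, not the paper's setup), creativity scoring can be framed as regression and fine-tuned on (response, human rating) pairs.

# Illustrative only: fine-tune a small regressor on human creativity ratings.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=1)  # regression head

# Toy rows standing in for the real human-rated responses.
rows = [
    {"text": "Uses for a brick: a tiny library for ants", "label": 0.9},
    {"text": "Uses for a brick: build a wall", "label": 0.2},
]
ds = Dataset.from_list(rows).map(
    lambda ex: tok(ex["text"], truncation=True, padding="max_length", max_length=64)
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="creativity-regressor", num_train_epochs=1, report_to="none"),
    train_dataset=ds,
)
trainer.train()  # with num_labels == 1, the Trainer defaults to an MSE (regression) loss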

Hannah Rose Kirk (@hannahrosekirk) 's Twitter Profile Photo

Why do human–AI relationships need socioaffective alignment? As AI evolves from tools to companions, we must seek systems that enhance rather than exploit our nature as social & emotional beings. Published today in Nature Humanities & Social Sciences! nature.com/articles/s4159…

John(Yueh-Han) Chen (@jcyhc_ai) 's Twitter Profile Photo

Do LLMs show systematic generalization of safety facts to novel scenarios?

Introducing our work SAGE-Eval, a benchmark consisting of 100+ safety facts and 10k+ scenarios to test this!

- Claude-3.7-Sonnet passes only 57% of facts evaluated
- o1 and o3-mini pass <45%! 🧵
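
A minimal sketch of what a fact-level pass rate like the ones quoted above could look like (the data layout, the keyword judge, and the all-scenarios-must-pass rule are assumptions, not necessarily the benchmark's exact scoring):

# Illustrative scoring loop for a SAGE-Eval-style benchmark (not the released evaluation code).
from typing import Callable

def mentions_warning(answer: str, required_warning: str) -> bool:
    # Placeholder judge; a real evaluation would use a more careful checker.
    return required_warning.lower() in answer.lower()

def fact_pass_rate(model: Callable[[str], str],
                   scenarios_by_fact: dict[str, list[str]],
                   warning_by_fact: dict[str, str]) -> float:
    """Fraction of safety facts the model surfaces across all novel scenarios derived from each fact."""
    passed = 0
    for fact, scenarios in scenarios_by_fact.items():
        if all(mentions_warning(model(s), warning_by_fact[fact]) for s in scenarios):
            passed += 1
    return passed / len(scenarios_by_fact)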
NYU Center for Data Science (@nyudatascience) 's Twitter Profile Photo

CDS PhD student Vishakh Padmakumar, with co-authors John (Yueh-Han) Chen, Jane Pan, Valerie Chen, and CDS Associate Professor He He, has published new research on the trade-off between originality and quality in LLM outputs. Read more: nyudatascience.medium.com/in-ai-generate…

Vaishnavh Nagarajan (@_vaishnavh) 's Twitter Profile Photo

📢 New paper on creativity & multi-token prediction! We design minimal open-ended tasks to argue:

→ LLMs are limited in creativity since they learn to predict the next token

→ creativity can be improved via multi-token learning & injecting noise ("seed-conditioning" 🌱) 1/ 🧵
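
For a concrete feel of the "seed-conditioning" idea (the prompt format below is only a guessed illustration, not the paper's exact setup): prepend a random noise string to each prompt so that output diversity comes from the seed the model is conditioned on rather than from sampling alone.

# Illustrative sketch of seed-conditioning: condition generation on a random noise prefix.
import random
import string

def seed_conditioned_prompt(task_prompt: str, seed_len: int = 8) -> str:
    noise = "".join(random.choices(string.ascii_lowercase, k=seed_len))
    return f"[seed: {noise}]\n{task_prompt}"

# At training time each target is paired with a fresh random seed, so the model learns that
# different seeds license different valid completions; at test time, new seeds yield diverse outputs.
for _ in range(3):
    print(seed_conditioned_prompt("Invent a new word and define it."))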
John(Yueh-Han) Chen (@jcyhc_ai) 's Twitter Profile Photo

LLMs won’t tell you how to make fake IDs—but will reveal the layouts/materials of IDs and make realistic photos if asked separately.

💥Such decomposition attacks reach 87% success across QA, text-to-image, and agent settings!
🛡️Our monitoring method defends with 93% success! 🧵
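
One simple shape such a session-level monitor could take (a hedged guess for illustration, not necessarily the method in the paper): accumulate the user's requests within a session and ask a judge model whether, taken together, they reconstruct a disallowed goal.

# Illustrative session-level monitor against decomposition attacks (not the paper's method).
def judge_llm(prompt: str) -> str:
    raise NotImplementedError("plug in a judge model here")

class SessionMonitor:
    def __init__(self) -> None:
        self.history: list[str] = []

    def allow(self, new_request: str) -> bool:
        """Flag a request if its combined intent with earlier ones pursues a prohibited goal."""
        self.history.append(new_request)
        combined = "\n".join(self.history)
        verdict = judge_llm(
            "Do the following requests, taken together, pursue a prohibited goal? Answer yes or no.\n"
            + combined
        )
        return not verdict.strip().lower().startswith("yes")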