Jessy Li (@jessyjli) Twitter Tweets • TwiCopy

I’m excited to share that our paper has been accepted at ICML 2025! 🎉🥳🎊 This work was done during my internship at IBM Research, and it wouldn’t have been possible without a top-notch team: Muneeza Azmat, Rå¥å, Mikhail Yurochkin, and my amazing advisor Jessy Li 👏

31 likes · 2 replies · 4 reposts

Jessy Li

@jessyjli

3 months ago

To appear #ICML2025 🎉 Congrats to Hongli!

21 likes · 0 replies · 4 reposts

Jessy Li

@jessyjli

3 months ago

Welcome Elias Stengel-Eskin!!! Super excited that you’ll be joining us 🥳

36 likes · 2 replies · 2 reposts

Jessy Li

@jessyjli

3 months ago

And welcome Chan Young Park !!! 🎊🙌🍾

8 likes · 1 reply · 0 reposts

Peter West

@peterwesttm

3 months ago

I’ve been fascinated lately by the question: what kinds of capabilities might base LLMs lose when they are aligned? i.e. where can alignment make models WORSE? I’ve been looking into this with Christopher Potts and here's one piece of the answer: randomness and creativity

348 likes · 11 replies · 59 reposts

Philippe Laban

@philippelaban

3 months ago

🆕paper: LLMs Get Lost in Multi-Turn Conversation In real life, people don’t speak in perfect prompts. So we simulate multi-turn conversations — less lab-like, more like real use. We find that LLMs get lost in conversation. 👀What does that mean? 🧵1/N 📄arxiv.org/abs/2505.06120

126 likes · 5 replies · 30 reposts

Fei Liu @ #ICLR2025

@feiliu_nlp

3 months ago

✨ Our paper #PlanGenLLMs: A Modern Survey of LLM Planning Capabilities (arxiv.org/pdf/2502.11221) is accepted to the #ACL2025 main conference! Huge thanks to the reviewers for the unanimous 4-4-4 reviews and meta score ❤️ Grateful for your thoughtful feedback! #ACL2025 #NLProc

152 likes · 6 replies · 16 reposts

Liyan Tang

@liyantang4

3 months ago

Introducing ChartMuseum🖼️, testing visual reasoning with diverse real-world charts! ✍🏻Entirely human-written questions by 13 CS researchers 👀Emphasis on visual reasoning – hard to be verbalized via text CoTs 📉Humans reach 93% but 63% from Gemini-2.5-Pro & 38% from Qwen2.5-72B

70 likes · 2 replies · 26 reposts

Jessy Li

@jessyjli

2 months ago

Yayyyyyy welcome 🤘🤘

34 likes · 1 reply · 0 reposts

Jessy Li

@jessyjli

2 months ago

Super thrilled that Kanishka Misra 🌊 is going to join UT Linguistics Dept as our newest computational linguistics faculty member -- looking forward to doing great research together! 🧑‍🎓Students: Kanishka is a GREAT mentor -- apply to be his PhD student in the upcoming cycle!!

40 likes · 0 replies · 4 reposts

Sebastian Joseph

@sebajoed

2 months ago

How good are LLMs at 🔭 scientific computing and visualization 🔭? AstroVisBench tests how well LLMs implement scientific workflows in astronomy and visualize results. SOTA models like Gemini 2.5 Pro & Claude 4 Opus only match ground truth scientific utility 16% of the time. 🧵

18 likes · 1 reply · 8 reposts

Jessy Li

@jessyjli

2 months ago

Is AI ready to play a real role in science? This work with CosmicAI evaluates LLMs targeting the implementation of scientific workflows, and the scientific utility of visualizations from LLM-generated code -- and the answer is not yet, even with the best SOTA models 👇

9 likes · 0 replies · 2 reposts

Asher Zheng

@asher_zheng00

2 months ago

Language is often strategic, but LLMs tend to play nice. How strategic are they really? Probing into that is key for future safety alignment.🛟 👉Introducing CoBRA🐍, a framework that assesses strategic language. Work with my amazing advisors Jessy Li and David Beaver! 🧵👇

20 likes · 2 replies · 8 reposts

CosmicAI

@cosmicai_inst

2 months ago

CosmicAI collab: benchmarking the utility of LLMs in astronomy coding workflows & focusing on the key research capability of scientific visualization. Sebastian Joseph, Jessy Li, Murtaza Husain, Greg Durrett, Dr. Stephanie Juneau, paul.torrey, Adam Bolton, Stella Offner, Juan Frias, Niall Gaffney

7 likes · 0 replies · 6 reposts

Jessy Li

@jessyjli

2 months ago

We have very good frameworks for cooperative dialog… but how about the opposite? Asher Zheng’s new paper takes a game-theoretic view and develops new metrics to quantify non-cooperative language ♟️ Turns out LLMs don’t have the pragmatic capabilities to perceive these…

18 likes · 2 replies · 3 reposts

Jessy Li

@jessyjli

a month ago

Check out this new opinion piece from Sebastian and Lily! We have really powerful AI systems now, so what’s the bottleneck preventing the wider adoption of fact checking systems, in high stakes scenarios like medicine? It’s how we define the tasks 👇

4 likes · 0 replies · 0 reposts

Jessy Li

@jessyjli

a month ago

If you’ll be at #icml2025, check out Hongli’s work on context-specific principles!

15 likes · 0 replies · 0 reposts

Ramya Namuduri

@ramya_namuduri

a month ago

Excited to share that QUDsim has been accepted to #COLM2025!! 🎉🎉

20 likes · 1 reply · 5 reposts

Manya Wadhwa

@manyawadhwa1

a month ago

Happy to share that EvalAgent has been accepted to #COLM2025 Conference on Language Modeling 🎉🇨🇦 We introduce a framework to identify implicit and diverse evaluation criteria for various open-ended tasks! 📜 arxiv.org/pdf/2504.15219

70 likes · 1 reply · 16 reposts

Hongli Zhan

@honglizhan

a month ago

👇Happening this afternoon at 4:30pm! Come meet Mikhail Yurochkin, Rå¥å, and me at East Exhibition Hall #1103. 📍I’m also on the industry job market this coming year! Let’s connect and chat about opportunities in the industry :)

8 likes · 0 replies · 2 reposts