Ella Minzhi Li (@ellaminzhili) Twitter Tweets • TwiCopy

Min-Yen Kan

a year ago

A good chance to take advantage of an AI research programme dedicated to reaching out to ASEAN undergraduates, master's students and young faculty members. Come work with my group at wing.nus! #llm #NLProc #sigir #www

Likes: 13 · Replies: 0 · Reposts: 3

Yijia Shao

@echoshao8899

a year ago

LM agents today primarily aim to automate tasks. Can we turn them into collaborative teammates? Introducing Collaborative Gym (Co-Gym), a framework for enabling & evaluating human-agent collaboration! I now get used to agents proactively seeking confirmation or my deep thinking.

Likes: 189 · Replies: 17 · Reposts: 92

Dora Zhao

@dorazhao9

a year ago

Todo lists, docs, email style – if you've got individual or team knowledge you want ChatGPT/Claude to have access to, Knoll (knollapp.com) is a personal RAG store from Stanford University that you can add any knowledge into. Instead of copy-pasting into your prompt every time,

Likes: 83 · Replies: 5 · Reposts: 28

Shafiq Joty

@jotyshafiq

a year ago

Thanks to the awesome collaborators. Hope this survey could be a good reference for everyone!

Likes: 8 · Replies: 0 · Reposts: 2

Salesforce AI Research

@sfresearch

a year ago

🚨 New Survey Alert! 🚨 🧠 “A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems” 📘 Paper: bit.ly/4cnAhvq 🧠 Project Page: bit.ly/3E6ROv6 🧵 Researcher's thread: 👇 (1/6) Reasoning is the key to unlocking true AI

Likes: 65 · Replies: 1 · Reposts: 18

Ella Minzhi Li

@ellaminzhili

a year ago

Excited about LLM reasoning? 🤖🧠 Our latest survey dives into the field through regime & architecture dimensions, as well as input & output perspectives!💡Grateful to collaborate with amazing researchers on this exciting work—check it out!👇

Likes: 11 · Replies: 3 · Reposts: 0

Brendan Jowett

@jowettbrendan

a year ago

🚨 BREAKING: Google just dropped the most practical AI release of 2025. It handles your emails, data, docs, meetings and does it with context. This is AI that actually saves time. Here’s everything they announced (and why it matters for your team):

Likes: 5.5K · Replies: 85 · Reposts: 538

elvis

@omarsar0

a year ago

// A Survey of Frontiers in LLM Reasoning // Nice survey on reasoning LLMs with a focus on inference scaling, enhancing reasoning, and applications in agentic systems.

Likes: 425 · Replies: 2 · Reposts: 104

Michael Ryan

@michaelryan207

10 months ago

Check out CAVA 🍾 our new benchmark for end-to-end voice assistants! Large Audio Models are the next frontier for AI assistants, but what is still missing from making these models into seamless voice assistants? Inspired by discussions with practitioners, we identify six

Likes: 27 · Replies: 1 · Reposts: 5

Will Held

@williambarrheld

10 months ago

Large Audio Models should be the foundation models for voice assistants, but most benchmarks focus on chat & audio analysis skills. Read about our big team effort to develop a set of benchmarks to cover all the capabilities a model needs to support a great voice assistant!

Likes: 19 · Replies: 0 · Reposts: 5

Ella Minzhi Li

@ellaminzhili

10 months ago

Check out CAVA 🥂, a benchmark for evaluating how Large Audio Models perform on practical tasks that matter for real-world voice assistants: talkarena.org/cava

Likes: 6 · Replies: 0 · Reposts: 0

John Yang

@jyangballin

10 months ago

40% with just 1 try per task: SWE-agent-LM-32B is the new #1 open source model on SWE-bench Verified. We built it by synthesizing a ton of agentic training data from 100+ Python repos. Today we’re open-sourcing the toolkit that made it happen: SWE-smith.

Likes: 638 · Replies: 25 · Reposts: 132

Yutong Zhang

@zhangyt0704

9 months ago

AI companions aren’t science fiction anymore 🤖💬❤️ Thousands are turning to AI chatbots for emotional connection – finding comfort, sharing secrets, and even falling in love. But as AI companionship grows, the line between real and artificial relationships blurs. 📰 “Can A.I.

Likes: 174 · Replies: 4 · Reposts: 53

Min-Yen Kan

@knmnyn

8 months ago

📣 R/T! Q❓: Do you agree that ethics ⚖️ in LLMs & #NLProc are impt 4 such impactful tech 🤖? Going to ACL 2025 in Vienna? Want to learn more? Join us 27 Jul for our ✨ Ethics Tutorial✨ & 30 Jul for 🕊️ of a 🪶! #ACL2025NLP w/ Luciana Benotti Guido Ivetta Ella Minzhi Li

Likes: 18 · Replies: 0 · Reposts: 6

Yi Tay

@yitayml

8 months ago

First official Gold medal at IMO from DeepMind🥇 with Gemini Deep Think. A general purpose text-in text-out model achieving gold medal is something quite unthinkable just about one year ago and here we are! The frontier of AI is incredibly exciting! Happy to have co-led /

Likes: 532 · Replies: 29 · Reposts: 30

Stanford NLP Group

@stanfordnlp

8 months ago

Stanford NLP Group papers at ACL 2025 in Vienna next week:
• HumT DumT: Measuring and controlling human-like language in LLMs · Myra Cheng, Sunny Yu, Dan Jurafsky
• Controllable and Reliable Knowledge-Intensive Task Agents with Declarative GenieWorksheets · Harshit Joshi, Shicheng Liu

Likes: 92 · Replies: 0 · Reposts: 23

Min-Yen Kan

@knmnyn

8 months ago

Last call 👋 for participation for our PM tutorial 🔥Navigating Ethical ⚖️ Challenges in NLP: Hands-on strategies for students & researchers🔥 at #aclmeeting 2025 in Vienna! See you 🫵 Sun, 27 Jul! (w/ Luciana Benotti Guido Ivetta Ella Minzhi Li & more!)

Likes: 14 · Replies: 0 · Reposts: 5

Yanzhe Zhang

@stevenyzzhang

7 months ago

Soon, AI agents will act for us—collaborating, negotiating, and sharing data. But can they truly protect our privacy? We simulate privacy-critical scenarios, using alternating search to evolve attacks and defenses, uncovering severe vulnerabilities and building protections.

Likes: 77 · Replies: 2 · Reposts: 26

Shafiq Joty

@jotyshafiq

6 months ago

We can now say we have a stable data and multi-turn RL training recipe for building autonomous deep research agents. Thanks to the awesome team!

Likes: 17 · Replies: 0 · Reposts: 7

Zora Wang

@zhiruow

4 months ago

Agents are joining us at work -- coding, writing, design. But how do they actually work, especially compared to humans? Their workflows tell a different story: They code everything, slow down human flows, and deliver low-quality work fast. Yet when teamed with humans, they shine

Likes: 244 · Replies: 7 · Reposts: 53