Mo Yu (@bishop_gorov) Twitter Tweets • TwiCopy

MIT NLP Group

@mitnlp

8 years ago

@yujiabao2510 Shiyu Chang Mo Yu

Likes: 0 · Replies: 0 · Retweets: 1

📢 Excited to share our new work 💥 FaithDial: A Faithful Benchmark for Information-Seeking Dialogue 📄 arxiv.org/pdf/2204.10757… 🌐 mcgill-nlp.github.io/FaithDial/ 👩‍💻 github.com/McGill-NLP/Fai… joint work w. Siva Reddy, Edoardo Ponti, Ehsan Kamalloo, Osmar Zaiane, Mo Yu, Sivan Milton #NLProc

Likes: 58 · Replies: 2 · Retweets: 16

Yisi Sang

@yisisang

4 years ago

Can your model understand the characters in the story? Our new task 𝗧𝗩𝗦𝗵𝗼𝘄𝗚𝘂𝗲𝘀𝘀 requires multi-aspect persona understanding in stories! 👩‍💻: arxiv.org/pdf/2204.07721… 📰: github.com/YisiSang/TVSHO… joint work w. Mo Yu MM @Shunyuyao12 JingLi Jeffrey Stanton

Likes: 8 · Replies: 2 · Retweets: 5

Yisi Sang

@yisisang

4 years ago

🥳Yahoo! Our survey paper on machine narrative reading comprehension got accepted by IJCAI 2022 in Vienna. ♥️Thank you my mentors and collaborators MM JingLi Mo Yu Jeffrey Stanton for the awesome teamwork!! 📰: arxiv.org/pdf/2205.00299…

Likes: 17 · Replies: 4 · Retweets: 4

Nouha Dziri

@nouhadziri

4 years ago

🚀🚀 Super happy that my work at Google AI "Evaluating attribution in dialogue systems: the BEGIN benchmark" got accepted at TACL 🥳 This is a work with wonderful collaborators Hannah Rashkin, David Reitter and Tal Linzen. Stay tuned for more details but in short: 👇

Likes: 140 · Replies: 5 · Retweets: 17

Junjie Wu

@jieeijjie

a year ago

🚀 Introducing PhysiCo: A New Benchmark for Evaluating Abstract Understanding in LLMs! 🚀 📚Link: physico-benchmark.github.io While models like o3 have made impressive strides on ARC-AGI, how well do LLMs truly grasp the abstract patterns in ARC-style tasks? (1/5)

Likes: 5 · Replies: 1 · Retweets: 3

AK

@_akhaliq

a year ago

The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Likes: 86 · Replies: 3 · Retweets: 11

Junjie Wu

@jieeijjie

a year ago

🚀 Can LLMs think beyond memorization? Our latest study on fluid intelligence shows why models like GPT-4o struggle with truly novel problem-solving on ARC-AGI. 📷 Project Website: wujunjie1998.github.io/araoc-benchmar… (1/4)

Likes: 2 · Replies: 1 · Retweets: 1