Mo Yu (@bishop_gorov)'s Twitter Profile
Mo Yu

@bishop_gorov

HIT, CS, NLP

ID: 171853213

28-07-2010 09:44:37

10 Tweets

67 Followers

134 Following

Nouha Dziri (@nouhadziri)

📢 Excited to share our new work 💥 FaithDial: A Faithful Benchmark for Information-Seeking Dialogue 📄 arxiv.org/pdf/2204.10757… 🌐 mcgill-nlp.github.io/FaithDial/ 👩‍💻 github.com/McGill-NLP/Fai… joint work w. Siva Reddy, Edoardo Ponti, Ehsan Kamalloo, Osmar Zaiane, Mo Yu, Sivan Milton #NLProc

Yisi Sang (@yisisang)

Can your model understand the characters in the story? Our new task TVShowGuess requires multi-aspect persona understanding in stories! 👩‍💻: arxiv.org/pdf/2204.07721… 📰: github.com/YisiSang/TVSHO… joint work w. Mo Yu, MM, @Shunyuyao12, JingLi, Jeffrey Stanton

Yisi Sang (@yisisang)

🥳 Yahoo! Our survey paper on machine narrative reading comprehension got accepted by IJCAI 2022 in Vienna. ♥️ Thank you to my mentors and collaborators MM, JingLi, Mo Yu, and Jeffrey Stanton for the awesome teamwork!! 📰: arxiv.org/pdf/2205.00299…

Nouha Dziri (@nouhadziri)

🚀🚀 Super happy that my work at Google AI "Evaluating attribution in dialogue systems: the BEGIN benchmark" got accepted at TACL 🥳 This is a work with wonderful collaborators Hannah Rashkin, David Reitter and Tal Linzen. Stay tuned for more details but in short: 👇

Junjie Wu (@jieeijjie)

🚀 Introducing PhysiCo: A New Benchmark for Evaluating Abstract Understanding in LLMs! 🚀 📚 Link: physico-benchmark.github.io While models like o3 have made impressive strides on ARC-AGI, how well do LLMs truly grasp the abstract patterns in ARC-style tasks? (1/5)

Junjie Wu (@jieeijjie)

🚀 Can LLMs think beyond memorization? Our latest study on fluid intelligence shows why models like GPT-4o struggle with truly novel problem-solving on ARC-AGI. 📷 Project Website: wujunjie1998.github.io/araoc-benchmar… (1/4)