
Yunxiang Zhang
@yunxiangzhang4
CS PhD student @UMichCSE, BS @PKU1898, #NLP
ID: 1399732727880949766
https://yunx-z.github.io/ 01-06-2021 14:21:11
34 Tweets
109 Followers
236 Following


How Verifiable Are LM Responses in the Wild? A Three-Way Factuality Benchmark. Meet FactBench, an updatable benchmark for evaluating language models' factuality in real-world scenarios. huggingface.co/spaces/launch/… LaunchNLP MichiganAI Computer Science and Engineering at Michigan


Heard of the Alaska-Hawaii merger? Wonder if LLMs know it's pending government approval before it can happen? They stumble, but we've got a fix! Dive into my #EMNLP2024 work Narrative-of-Thought, a special prompting technique to unlock LLMs' temporal reasoning




🚨Announcing SCALR @ COLM 2025: Call for Papers!🚨 The 1st Workshop on Test-Time Scaling and Reasoning Models (SCALR) is coming to the Conference on Language Modeling in Montreal this October! This is the first workshop dedicated to this growing research area. scalr-workshop.github.io


🚨 The deadline for the SCALR 2025 Workshop: Test-time Scaling & Reasoning Models at COLM '25 (Conference on Language Modeling) is approaching!🚨 scalr-workshop.github.io 🧩 Call for short papers (4 pages, non-archival) now open on OpenReview! Submit by June 23, 2025; notifications out July 24. Topics

