Peter Jansen ( @peterjansen-ai.bsky.social ) (@peterjansen_ai) 's Twitter Profile
Peter Jansen ( @peterjansen-ai.bsky.social )

@peterjansen_ai

Associate Professor @uarizona; Visiting Scientist @allen_ai, AI/NLP; DiscoveryWorld; EntailmentBank; ScienceWorld; textgames.org list. Tweets/opinions my own

ID: 974390207867858944

http://cognitiveai.org · 15-03-2018 21:01:43

5.5K Tweets

1.1K Followers

654 Following

Can language models be used as world simulators? In our ACL 2024 paper, we show -- not really. GPT-4 is only ~60% accurate at simulating state changes based on common-sense tasks, like boiling water. Preprint: arxiv.org/pdf/2406.06485 Ai2 Microsoft Research ACL 2025
