Greg Durrett
@gregd_nlp
CS professor at UT Austin. I do NLP most of the time. he/him
06-12-2017 17:16:17
1.0K Tweets
5.9K Followers
756 Following
This is a cool method, but 'superhuman' is an overclaim based on the data shown. There are better datasets than FActScore for evaluating this:
ExpertQA arxiv.org/abs/2309.07852 by Chaitanya Malaviya +al
Factcheck-GPT arxiv.org/abs/2311.09000 by Yuxia Wang +al (+ same methodology) 🧵