UMassNLP (@umass_nlp) 's Twitter Profile
UMassNLP

@umass_nlp

Natural language processing group at UMass Amherst @umasscs. Led by @thompson_laure @MohitIyyer @brendan642 @andrewmccallum #nlproc

ID: 1427673854336438281

Link: https://nlp.cs.umass.edu/ · Joined: 17-08-2021 16:50:10

95 Tweets

1.1K Followers

375 Following

Tu Vu (@tuvllms) 's Twitter Profile Photo

I would also like to thank all of my labmates UMassNLP and friends at UMass Amherst, my mentors and collaborators at Google AI and Microsoft Research, and my family and friends all over the world who gave me support and encouragement throughout my Ph.D. journey.

Tu Vu (@tuvllms) 's Twitter Profile Photo

Moving forward, I will be splitting my time as a research scientist at Google AI and an assistant professor at Virginia Tech Computer Science. I will also be recruiting Ph.D. students starting in Fall 2024 to work on effective and efficient transfer learning in the era of LLMs. Please come join me!

Mohit Iyyer (@mohitiyyer) 's Twitter Profile Photo

Huge congrats @tuvuumass, who just became my first graduated PhD student!! He'll be starting his own group soon at Virginia Tech Computer Science, so prospective PhD applicants interested in topics like multitask/multimodal transfer learning or parameter-efficient LLM adaptation: definitely apply to work with him!

UMassNLP (@umass_nlp) 's Twitter Profile Photo

In Prize-winning Paper, UMass Amherst Computer Scientists Release Guidelines for Evaluating AI-Generated Text: UMass Amherst umass.edu/news/article/p…

brendan o'connor (@brendan642) 's Twitter Profile Photo

Reminder: abstract submissions for the terrific interdisciplinary Text as Data conference are due Aug 4! tada2023.org It's a great, small, non-archival conference for discussing emerging work with folks across the social sciences, humanities, and computer science.

Sheshera Mysore (@msheshera) 's Twitter Profile Photo

I’m at #sigir2023 and presenting our work on interactively controllable personalisation! Come listen in room 101 between 11:00 and 12:30!

Yapei Chang (@yapeichang) 's Twitter Profile Photo

Can LLMs summarize books exceeding their context windows? We design an evaluation protocol for collecting fine-grained human judgments on LLM-generated summaries & propose BooookScore, a reference-free automatic metric for narrative coherence. arxiv.org/abs/2310.00785 🧵below:

Tu Vu (@tuvllms) 's Twitter Profile Photo

🚨 New Google AI paper: 🤖 LLMs are game-changers, but can they help us navigate a constantly changing world? 🤔 As of now, our work shows that LLMs, no matter their size, struggle when it comes to fast-changing knowledge & false premises. 📰: arxiv.org/abs/2310.03214 👇

Mohit Iyyer (@mohitiyyer) 's Twitter Profile Photo

Evaluating the factuality of LLMs is tricky: what if they answer a question correctly but also generate a bunch of unrelated made-up stuff? We eval LLM answers to our new FreshQA dataset in both a "strict" (no made up stuff) and "relaxed" setting, see the paper for more!

Quoc Le (@quocleix) 's Twitter Profile Photo

A weakness of LLMs is that they don’t know recent events well. This is nice work from Tu developing a benchmark (FreshQA) to measure factuality of recent events, and a simple method to improve search integration for better performance on the benchmark.

Jason Wei (@_jasonwei) 's Twitter Profile Photo

Nice paper by Tu Vu on factuality in LLMs: arxiv.org/abs/2310.03214, enjoyed contributing in a minor role to it while I was at Google. The main takeaway for me is that most factuality benchmarks for LLMs don't really take into account the fact that many types of knowledge

Tu Vu (@tuvllms) 's Twitter Profile Photo

📢 Want to adapt your outdated LLM to our ever-changing world? 🌏 Check out our code for FreshPrompt at github.com/freshllms/fres…. Colab: tinyurl.com/freshprompt-co…. 🙏 We are grateful to SerpApi for their generous sponsorship of 5000 searches for FreshPrompt's users.

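The FreshPrompt repo and Colab above contain the actual implementation. As a rough illustration of the underlying idea (retrieved search snippets, ordered so the freshest evidence sits closest to the question, prepended to the query before it is sent to an LLM), here is a hypothetical sketch; the helper name `build_fresh_prompt` and the snippet fields are invented for illustration and are not the repo's API.

```python
# Illustrative sketch of the FreshPrompt idea: prepend dated search
# evidence to a question so an LLM can answer with up-to-date facts.
# Helper name and snippet format are hypothetical, not the real repo API.

def build_fresh_prompt(question, snippets, max_evidence=5):
    """Format retrieved search snippets as dated evidence, newest last
    (closest to the question), then append the question itself."""
    # Keep only the most recent snippets, sorted oldest -> newest so the
    # freshest evidence appears nearest the question in the prompt.
    ordered = sorted(snippets, key=lambda s: s["date"])[-max_evidence:]
    lines = []
    for s in ordered:
        lines.append(f"source: {s['source']}")
        lines.append(f"date: {s['date']}")
        lines.append(f"snippet: {s['text']}")
        lines.append("")  # blank line between evidence blocks
    lines.append(f"question: {question}")
    lines.append("answer:")
    return "\n".join(lines)

# Example with made-up evidence:
evidence = [
    {"source": "example.com", "date": "2023-01-10", "text": "Old fact."},
    {"source": "example.org", "date": "2023-10-01", "text": "New fact."},
]
prompt = build_fresh_prompt("What changed recently?", evidence)
```

In the actual system the snippets would come from a live search API (the tweet mentions SerpApi sponsorship); here they are hard-coded purely to show the prompt layout.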
Andrew Drozdov (@mrdrozdov) 's Twitter Profile Photo

✨ New Paper ✨ Deep dive on demonstrations to enhance LLM-based passage ranking 🚀 insights for pointwise ranking using query likelihood 🚀 huggingface.co/papers/2310.14…

Tu Vu (@tuvllms) 's Twitter Profile Photo

📢 🌟PhD Openings🌟: I am recruiting PhD students this cycle at Virginia Tech. If you want to dive into:
- in-context learning & tool-use LLMs
- instruction tuning
- parameter-efficient transfer learning
- few-shot learning
please apply by Dec 15! 👉tuvllms.github.io

Ankita Gupta (@anki98765) 's Twitter Profile Photo

Check out ezCoref, our open-source tool for easy coreference annotation across languages/domains. Demo: azkaban.cs.umass.edu:8877/tutorial Re-annotation study via ezCoref reveals interesting deviations from prior work. 📜aclanthology.org/2023.findings-… #CRAC2023 EMNLP 2023, Dec 6, 2:50PM 🧵👇

Mohit Iyyer (@mohitiyyer) 's Twitter Profile Photo

So proud to have hooded my first five PhDs today: Tu Vu, Kalpesh Krishna, Simeng Sun, Andrew Drozdov, and Nader Akoury. Now, they're either training LLMs at Google, Nvidia, and Databricks, or staying in academia at Virginia Tech and Cornell. Excited to watch their careers blossom!

Tu Vu (@tuvllms) 's Twitter Profile Photo

🚨 New Google DeepMind paper 🚨 We trained Foundational Large Autorater Models (FLAMe) on extensive human evaluations, achieving the best RewardBench perf. among generative models trained solely on permissive data, surpassing both GPT-4 & 4o. 📰: arxiv.org/abs/2407.10817 🧵:👇

Kalpesh Krishna (@kalpeshk2011) 's Twitter Profile Photo

Check out our new Google AI paper: we curate a mixture of 5M human judgments to train general-purpose foundational autoraters. Strong LLM-as-judge scores on RewardBench (87.8%), and highest perf among baselines on LLMAggreFact + 6 other benchmarks! 📰 arxiv.org/abs/2407.10817 👇

Prateek Yadav (@prateeky2806) 's Twitter Profile Photo

Given that I have been closely working with Tu Vu and Kalpesh Krishna, I can say that they are extremely well read and hard working, and this paper is amazing. People should definitely check out FLAMe, as it is going to be impactful.