USC NLP (@nlp_usc)'s Twitter Profile
USC NLP

@nlp_usc

The NLP group at @USCViterbi. @DaniYogatama+@_jessethomason_+@jieyuzhao11+@robinomial+@swabhz+@xiangrenNLP at @CSatUSC + researchers @USC_ICT, @USC_ISI.

ID: 1002211204897517568

Link: https://nlp.usc.edu/ | Joined: 31-05-2018 15:32:26

351 Tweets

3.3K Followers

363 Following

Huihan Li 🛩️ ICLR 2025 (@huihan_li)'s Twitter Profile Photo

Finding it hard to generate challenging evaluation data for LLMs? Check out our work 👇!

Introducing LINK 🔗, the first framework for systematically generating data in the long-tail distribution, guided by symbolic rules

arxiv.org/abs/2311.07237
w/ USC NLP (@nlp_usc) MOSAIC (@ai2_mosaic) 🧵⬇️
#NLProc

[1/n]
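
To make the pipeline concrete, here is a minimal, hypothetical sketch of the general recipe the thread describes (symbolic rule template → candidate instantiations → keep only the long-tail ones). The rule, fillers, and scores below are invented stand-ins; in LINK itself an LLM proposes fillers and a critic model verifies them, so treat this as an illustration of the control flow, not the paper's implementation.

```python
# Minimal sketch of rule-guided long-tail data generation (NOT the paper's
# actual pipeline). A toy frequency table stands in for the LLM proposer and
# critic: we instantiate a symbolic rule and keep only rare ("long-tail") fills.

RULE = "If a person is a {profession}, they may own a {item}."

# Hypothetical candidate fillers (in the paper these come from an LLM).
CANDIDATES = [
    ("chef", "knife set"), ("chef", "blowtorch"),
    ("beekeeper", "smoker"), ("astronomer", "telescope"),
    ("falconer", "leather gauntlet"), ("lawyer", "briefcase"),
]

# Stand-in plausibility scores (higher = more common / "head" distribution).
# A real system would score with an LM and verify with a critic model.
HEAD_SCORE = {
    ("chef", "knife set"): 0.95, ("lawyer", "briefcase"): 0.90,
    ("astronomer", "telescope"): 0.85, ("chef", "blowtorch"): 0.30,
    ("beekeeper", "smoker"): 0.20, ("falconer", "leather gauntlet"): 0.10,
}

def generate_long_tail(threshold: float = 0.4) -> list[str]:
    """Instantiate the rule, keeping only low-frequency (long-tail) fillers."""
    examples = []
    for profession, item in CANDIDATES:
        if HEAD_SCORE[(profession, item)] < threshold:
            examples.append(RULE.format(profession=profession, item=item))
    return examples

if __name__ == "__main__":
    for ex in generate_long_tail():
        print(ex)
```
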
USC NLP (@nlp_usc)'s Twitter Profile Photo

We're excited to attend #SocalNLP today! ICYMI, sunny southern California is a fantastic place to do #NLProc, so come check out what USC NLP [nlp.usc.edu] has been working on lately! And did we say we're hiring PhD students this fall? 🌴🏖️☀️

Brihi Joshi (@brihij)'s Twitter Profile Photo

Throwback to when Sean Ren 🔆 and our lab made our wishlist and dream research directions to discuss in our lab meeting. Very helpful in contextualising our work in the age of LLMs!! 🙌🏼 USC NLP is such a great place to do research 🫶

Linlu Qiu (@linluqiu)'s Twitter Profile Photo

How good are LMs at inductive reasoning? How are their behaviors similar to/contrasted with those of humans?

We study these via iterative hypothesis refinement. We observe that LMs are phenomenal hypothesis proposers, but they also behave as puzzling inductive reasoners:

(1/n)
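
For intuition, here is a toy, self-contained sketch of the propose-test-refine loop the thread describes. The rule space and refinement step are deliberately simplistic stand-ins (the study uses an LM as the hypothesis proposer); it only illustrates the iteration structure.

```python
# A minimal sketch of iterative hypothesis refinement. The "proposer" here
# enumerates simple arithmetic rules instead of querying an LM; the loop
# mirrors the setup described in the thread: propose a hypothesis, test it
# on observed examples, and refine using the failures.

from typing import Callable

Examples = list[tuple[int, int]]  # (input, output) pairs

def candidate_rules() -> list[tuple[str, Callable[[int], int]]]:
    """Hypothesis space: a few linear rules (an LM would propose these in
    natural language or code in the actual study)."""
    rules = []
    for a in range(1, 4):
        for b in range(0, 4):
            rules.append((f"y = {a}*x + {b}", lambda x, a=a, b=b: a * x + b))
    return rules

def failures(rule: Callable[[int], int], examples: Examples) -> Examples:
    """Return the examples the current hypothesis gets wrong."""
    return [(x, y) for x, y in examples if rule(x) != y]

def refine(examples: Examples, max_iters: int = 20):
    """Propose-test-refine loop: keep proposing until a rule fits everything."""
    for name, rule in candidate_rules()[:max_iters]:
        bad = failures(rule, examples)
        if not bad:
            return name  # hypothesis consistent with every observation
        # A real refinement step would feed `bad` back to the proposer.
    return None

if __name__ == "__main__":
    obs = [(1, 5), (2, 7), (3, 9)]  # hidden rule: y = 2x + 3
    print(refine(obs))  # -> "y = 2*x + 3"
```
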
Sean Ren (@xiangrennlp)'s Twitter Profile Photo

Arrived at NOLA for #NeurIPS2023 🔥 Exciting time to chat about limits/science of LLMs, “slow” reasoning & explainability. Join our posters for a fun discussion 🍻

Ads: USC CS is hiring tenure-track AI faculty + USC NLP is looking for strong PhD students. Talk to us!
Johnny Tian-Zheng Wei (@johntzwei)'s Twitter Profile Photo

To detect if your data was used for LLM pretraining, consider using data watermarks: arxiv.org/pdf/2402.10892… Detection can be framed as hypothesis testing (with statistical guarantees!) if you contributed multiple training documents and watermarked them before public release. 🧵
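
A rough sketch of the detection-as-hypothesis-testing recipe, under stated assumptions: `model_loss` is a hypothetical stand-in for querying the suspect model, and the rank-based p-value below is the generic exchangeability argument, not necessarily the paper's exact test statistic.

```python
# Sketch: before release you insert a random watermark string into your
# documents; later you compare the suspect model's loss on YOUR watermark
# against its loss on many random alternatives. If the model trained on your
# data, your watermark's loss should rank unusually low.

import random
import string

def random_watermark(length: int = 16) -> str:
    return "".join(random.choices(string.ascii_lowercase, k=length))

def model_loss(text: str) -> float:
    """Hypothetical stand-in for the suspect model's loss on `text`.
    A real audit would query the model; here memorized text scores lower."""
    MEMORIZED = "mysecretwatermark"  # pretend the model saw this in training
    return 0.5 if text == MEMORIZED else random.uniform(2.0, 4.0)

def p_value(published_watermark: str, num_null: int = 999) -> float:
    """Rank-based test: under the null (no training on your data), the true
    watermark's loss is exchangeable with random ones, so the p-value is its
    rank among num_null + 1 losses."""
    observed = model_loss(published_watermark)
    null_losses = [model_loss(random_watermark()) for _ in range(num_null)]
    rank = 1 + sum(loss <= observed for loss in null_losses)
    return rank / (num_null + 1)

if __name__ == "__main__":
    # Small p-value => evidence the model trained on the watermarked documents.
    print(f"p = {p_value('mysecretwatermark'):.4f}")
```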

Sean Ren (@xiangrennlp)'s Twitter Profile Photo

Absolutely thrilled to receive this honor. Rarely does a researcher have their first PhD publication win a Test of Time Award (for 10 years of its cumulative impact). I’m super grateful for the chance to collaborate with Xiao on this fun project, which turns out to be a…

Matthew Finlayson ✈️ NeurIPS (@mattf1n)'s Twitter Profile Photo

Wanna know gpt-3.5-turbo's embed size? We find a way to extract info from LLM APIs and estimate gpt-3.5-turbo’s embed size to be 4096. With the same trick we also develop 25x faster logprob extraction, audits for LLM APIs, and more!
📄 arxiv.org/abs/2403.09539
Here’s how 1/🧵
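
The core observation can be simulated in a few lines: final-layer logits are W·h with W of shape (vocab, d), so logit vectors collected across prompts span at most a d-dimensional subspace, and counting significant singular values recovers d. The snippet below fakes the API with random matrices (toy sizes, not gpt-3.5-turbo's), so it demonstrates the linear-algebra idea rather than the full attack.

```python
# Simulated version of the embed-size extraction idea: stack many logit
# vectors and estimate the hidden dimension as their numerical rank.

import numpy as np

rng = np.random.default_rng(0)

VOCAB, HIDDEN, N_QUERIES = 1000, 64, 200   # toy stand-ins (the paper estimates d≈4096)

W = rng.normal(size=(VOCAB, HIDDEN))       # unembedding matrix (unknown to the attacker)
H = rng.normal(size=(N_QUERIES, HIDDEN))   # hidden states from N different prompts
logits = H @ W.T                           # what full-logit API outputs would reveal

# Estimate the embed size as the numerical rank of the stacked logit matrix.
singular_values = np.linalg.svd(logits, compute_uv=False)
tol = singular_values.max() * max(logits.shape) * np.finfo(float).eps
estimated_d = int((singular_values > tol).sum())

print(estimated_d)  # -> 64, recovering HIDDEN without ever seeing W
```
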
Soumya Sanyal (@ssanyal8)'s Twitter Profile Photo

New paper 🚨 Looking for a strong, open-source entailment-verification model to verify your model generations for consistency? ✅ You can now use the 🤗 model huggingface.co/soumyasanyal/n… for this! Our finetuned FlanT5-xxl model can predict entailment errors better than GPT-3.5 and…
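
A hedged usage sketch with Hugging Face transformers: the model ID above is truncated, so this uses the public google/flan-t5-small as a runnable stand-in, and the prompt template is an illustrative guess rather than the released checkpoint's documented format.

```python
# Querying a seq2seq entailment verifier. Swap in the paper's released
# FlanT5-xxl checkpoint (truncated HF path above) and its documented prompt
# format for real use; this stand-in only shows the mechanics.

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "google/flan-t5-small"  # stand-in for the finetuned verifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

premise = "All birds can fly. Penguins are birds."
hypothesis = "Penguins can fly."

# Illustrative verification prompt: ask whether the premise entails the hypothesis.
prompt = (f"Premise: {premise}\nHypothesis: {hypothesis}\n"
          "Does the premise entail the hypothesis? Answer yes or no.")

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```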

Xisen Jin (@xisenj)'s Twitter Profile Photo

๐ŸงLMs forget upstream knowledge when continuously fine-tuned. When fine-tuned on new data, can we forecast what upstream examples will be forgotten? ๐ŸฅณExcited to share our #ICML Spotlight paper on forecasting example forgetting! ๐Ÿ”—Project page: inklab.usc.edu/lm-forgetting-โ€ฆ

๐ŸงLMs forget upstream knowledge when continuously fine-tuned. When fine-tuned on new data, can we forecast what upstream examples will be forgotten?

๐ŸฅณExcited to share our #ICML Spotlight paper on forecasting example forgetting!

๐Ÿ”—Project page: inklab.usc.edu/lm-forgetting-โ€ฆ
Sean Ren (@xiangrennlp)'s Twitter Profile Photo

Congratulations to the Google DeepMind team on their best paper award at #ICML2024, & appreciate @afedercooper's shout-out to our concurrent paper 🙌 If you are into the topic of recovering model info through just its output logits, check out our paper led by Matthew Finlayson too!

Qinyuan Ye (👀Jobs) (@qinyuan_ye)'s Twitter Profile Photo

Introducing Lifelong ICL and Task Haystack, a new approach for evaluating long-context LMs, featuring ever-changing task streams that controllably fill the context window, and NIAH-style visualization for easy diagnosis.

📜 arxiv.org/abs/2407.16695

🧵
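
A minimal sketch of how such a "task haystack" prompt might be assembled: the toy tasks below are invented, and the resulting prompt would be fed to a real long-context model. This shows only the construction, not the benchmark's actual task suite or evaluation.

```python
# Build a stream of few-shot task demonstrations that fills the context, then
# probe an earlier task NIAH-style to test whether it is still usable.

TASKS = {
    "reverse": [("abc", "cba"), ("hello", "olleh")],
    "upper":   [("abc", "ABC"), ("hi", "HI")],
    "repeat":  [("ab", "abab"), ("x", "xx")],
}

def build_haystack(task_order: list[str]) -> str:
    """Concatenate few-shot demos for a stream of tasks, filling the context."""
    blocks = []
    for name in task_order:
        demos = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in TASKS[name])
        blocks.append(f"Task: {name}\n{demos}")
    return "\n\n".join(blocks)

def probe(haystack: str, task: str, query: str) -> str:
    """NIAH-style probe: after the whole stream, re-test an earlier task."""
    return f"{haystack}\n\nTask: {task}\nInput: {query}\nOutput:"

if __name__ == "__main__":
    prompt = probe(build_haystack(["reverse", "upper", "repeat"]),
                   task="reverse", query="world")
    print(prompt)  # feed to a long-context LM; the correct completion is "dlrow"
```
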
Kaitlyn Zhou ✈️ CSCW, EMNLP! (@kaitlynzhou)'s Twitter Profile Photo

Excited to see everyone soon at #acl2024 in Bangkok!

I'll be presenting our work, Relying on the Unreliable: The Impact of Language Models' Reluctance to Express Uncertainty arxiv.org/abs/2401.06730

Poster session 3 on Aug 12 at 16:00! W/ Maarten Sap (he/him) (@MaartenSap), Jena Hwang (@JenaHwang2), Sean Ren (@xiangrenNLP)
Sean Ren (@xiangrennlp)'s Twitter Profile Photo

Arriving in Bangkok for ACL 2024 (@aclmeeting)! 😃

Will be sharing our recent work on logical scaffolding, model uncertainty expression & multi-hop entailment inference w/ folks USC NLP (@nlp_usc) + Kaitlyn Zhou (@KaitlynZhou) + friends Ai2 (@allen_ai)

I'm also helping on the <AI / ALL> summit
w/ Sahara AI 🔆 (@SaharaLabsAI)
👇👇
Sean Ren (@xiangrennlp)'s Twitter Profile Photo

Find us at the posters!

Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs, w/ Siyuan Wang, Yejin Choi, et al.

Relying on the Unreliable: The Impact of Language Models' Reluctance to Express Uncertainty, w/ Kaitlyn Zhou, Maarten Sap (he/him), et al.

Sean Ren (@xiangrennlp)'s Twitter Profile Photo

Join us at the co-located <AI / ALL> summit on Aug 15, with a social party in the evening! lu.ma/mxcx5bia

Co-hosted with SCB 10X and SambaNova Systems, sponsored by Amazon Web Services, with participation from folks at AI at Meta, @google, Cohere For AI, and Together AI.

Sahara AI (@saharalabsai)'s Twitter Profile Photo

Proud moment seeing our CEO & Co-Founder Sean Ren 🔆 (@xiangrenNLP) alongside his USC NLP (@nlp_usc) students at ACL 2024 (@aclmeeting).

Supporting the next generation of thought leaders in AI is exactly what drives us forward.
Huihan Li 🛩️ ICLR 2025 (@huihan_li)'s Twitter Profile Photo

Heading to #EMNLP2024, down to chat! Excited to present our work (Wed 10:30am) on systematic data generation in long-tail (low confidence) distribution for more challenging evaluation. 🧵👇

📰: arxiv.org/abs/2311.07237
💻: github.com/INK-USC/LINK
🔖: zenodo.org/records/101179…
Sean Ren (@xiangrennlp)'s Twitter Profile Photo

Proud of my student Huihan Li (@huihan_li) and intern Arnav presenting their #ICLR2025 work on attributing culture-conditioned generation to LLM’s training corpora.

Fun time meeting many friends. Ping me if you want to chat about model security, interpretability and human-LM interaction!
Sean Ren (@xiangrennlp)'s Twitter Profile Photo

Thrilled for the Best Paper Award runner-up at #NAACL2025! 🥳 Even when answers are incorrect, people may rely more on LLMs if they use warm and empathetic expressions! We analyze the risks of human over-reliance on LLM expressions of uncertainty: arxiv.org/pdf/2407.07950 w/…