Jan Buys
@janmbuys
CS Lecturer / Natural Language Processing researcher at the University of Cape Town. Previously @uwnlp, @CompSciOxford.
ID: 337833850
http://www.janmbuys.com 18-07-2011 17:12:45
21 Tweets
234 Followers
355 Following
Announcing SWAG, a new natural language inference dataset, to appear at #emnlp2018. We present a general framework for collecting adversarial QA pairs at scale, minimizing bias. With Yonatan Bisk, @royschwartz02, Yejin Choi. rowanzellers.com/swag/ arxiv.org/abs/1808.05326
WFSAs are dead. Long live WFSAs! Our new EMNLP paper "Rational Recurrences" is out: arxiv.org/abs/1808.09357 Work by Hao Peng, @royschwartz02, me, and Noah A. Smith. We analyze several recent RNNs as finite-state automata with neural transition weights.
Today, we're sharing insights on how to defend against Neural Fake News, AI-written disinformation that looks like real news. 🕵️‍♀️🛡️ Joint with Ari Holtzman, @hjrashkin, Yonatan Bisk, Ali Farhadi, Franzi Roesner & Yejin Choi at Allen School + Ai2. arxiv.org/abs/1905.12616 thread 1/4
Sentence summarization with Information Bottleneck 🍾 and no supervision! 😎 Self-supervised and unsupervised approaches: 🍾BottleSum🍾, to appear at @emnlp2019. arxiv.org/abs/1909.07405 With Ari Holtzman, Jan Buys, and Yejin Choi at Allen School and Ai2.
🍾Information Bottleneck🍾 in action at #emnlp2019! (1) Specializing embeddings for parsing (arxiv.org/abs/1910.00163) by Xiang & Jason Eisner (2) 🍾BottleSum🍾 unsupervised & self-supervised summarization (arxiv.org/abs/1909.07405) with Peter West, Ari Holtzman, Jan Buys
Blog post: announcing "Findings of EMNLP", a new ACL anthology journal. "What sets Findings apart from the main conference papers is that there is no requirement for high perceived impact, and accordingly solid work in untrendy areas... will be eligible." 2020.emnlp.org/blog/2020-04-1…
I am at #NAACL2024 to present our paper on the role of subwords in multilingual MT (w/ Jan Buys). aclanthology.org/2024.findings-… Stop by our poster (Wednesday 11am) to learn about the tradeoffs between tokenizers in inducing cross-lingual transfer across different linguistic contexts.