DukeNLP (@duke_nlp) Twitter Tweets • TwiCopy

Sam Wiseman

4 years ago

Newish #EMNLP2021 work w/ Arturs Backurs & Karl Stratos: we try to generate text (in a data-to-text setting) by splicing together pieces of retrieved neighbor text. Paper: arxiv.org/pdf/2101.08248… 1/3

thumb_up_off_alt64

chat_bubble_outline3

repeat14

shareShare

DukeNLP

@duke_nlp

4 years ago

The DukeNLP group is hiring PhD students in all areas of natural language processing! Apply at gradschool.duke.edu/admissions/app… by Dec 15 to work with Sam Wiseman or Bhuwan Dhingra.

thumb_up_off_alt24

chat_bubble_outline0

repeat3

shareShare

Mohit Bansal

@mohitban47

4 years ago

See a glimpse👇of how beautiful @unc +research triangle fall colors are 😍 Come join our awesome group of UNC NLP UNC Computer Science students+staff+faculty (& great neighbors eg. DukeNLP). We are hiring at all levels (phd, postdocs, faculty); feel free to ping any of us with questions 🙏

thumb_up_off_alt75

chat_bubble_outline3

repeat8

shareShare

Bhuwan Dhingra

@bhuwandhingra

3 years ago

🤔 When does a factoid question need a *long* answer? 🤖 "Long" could mean multiple things: either you ask for a city with a very long name or … Read Ivan Stelmakh's internship paper to get the second part of the answer! arxiv.org/abs/2204.06092

thumb_up_off_alt51

chat_bubble_outline3

repeat15

shareShare

Phyllis Ang

@phyllis_ang_

3 years ago

Increasing the input length often increases accuracy on NLP tasks like summarization. But given limited time and a fixed number of GPUs, is it better to increase model size or input sequence length? Find the answer in our latest work: arxiv.org/abs/2204.07288 1/3

thumb_up_off_alt36

chat_bubble_outline1

repeat6

shareShare

Bhuwan Dhingra

@bhuwandhingra

2 years ago

New Preprint from Yukun Huang! Can an LLM determine when its responses are incorrect? Our latest paper dives into "Calibrating long-form generations from an LLM". Discover more at arxiv.org/abs/2402.06544 (1/n)

New Preprint from <a href="/YukunHuang9/">Yukun Huang</a>!

Can an LLM determine when its responses are incorrect? Our latest paper dives into "Calibrating long-form generations from an LLM". Discover more at arxiv.org/abs/2402.06544 (1/n)

thumb_up_off_alt44

chat_bubble_outline2

repeat8

shareShare

Jinho Choi

@jinho_d_choi

a year ago

150+ people registered for the SouthNLP 2024 at Emory University on 4/5. The schedule is available on our website: southnlp.github.io/southnlp2024/ Registration is open until March 10th. If you plan to attend, please register by completing the form here: forms.gle/NBWrgtgM5KgUq3…

thumb_up_off_alt18

chat_bubble_outline0

repeat11

shareShare

Bhuwan Dhingra

@bhuwandhingra

a year ago

🧐 Can we generate *LLM-proof* math problems❓ 👉Check out the new preprint from @ruoyuxyz , Chengxuan Huang and Junlin Wang : arxiv.org/abs/2402.17916 #LLMs #NLProc 🧵(1/6)

🧐 Can we generate *LLM-proof* math problems❓

👉Check out the new preprint from @ruoyuxyz , Chengxuan Huang and <a href="/JunlinWang3/">Junlin Wang</a> : arxiv.org/abs/2402.17916 #LLMs #NLProc

🧵(1/6)

thumb_up_off_alt27

chat_bubble_outline2

repeat6

shareShare

DukeNLP

@duke_nlp

a year ago

Welcome to Duke!

thumb_up_off_alt4

chat_bubble_outline0

repeat1

shareShare

Junlin Wang

@junlinwang3

a year ago

🦝Excited to announce our work on robustness & security of LLM systems! 𝐑𝐚𝐜𝐜𝐨𝐨𝐧: Prompt Extraction Benchmark of LLM-Integrated Applications Prompt extraction from LLM-integrated apps like GPT-s is a critical security concern. ‼️

thumb_up_off_alt11

chat_bubble_outline1

repeat4

shareShare

Roy Xie

@royxie_

a year ago

🚨 Breaking: >90% AUC on the WikiMIA dataset for membership inference! Want to know if your data is in LLM's training set?🔍 Check out our latest work "ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods" ✨ royxie.com/recall-project… 🧵1/6

thumb_up_off_alt30

chat_bubble_outline1

repeat10

shareShare

Ghazal Khalighinejad

@ghazalkhn

a year ago

🎉 Excited to share that IsoBench has been accepted at Conference on Language Modeling! IsoBench features isomorphic inputs across Math/Graph problems, Chess games, and Physics/Chemistry questions. Check out the dataset here: huggingface.co/datasets/isobe…

thumb_up_off_alt30

chat_bubble_outline0

repeat5

shareShare

Bhuwan Dhingra

@bhuwandhingra

10 months ago

🧵When should LLMs trust external contexts in RAG? New paper from Yukun Huang and Sanxing Chen enhances LLMs’ *situated faithfulness* to external contexts -- even when they are wrong!👇

🧵When should LLMs trust external contexts in RAG?

New paper from <a href="/YukunHuang9/">Yukun Huang</a> and <a href="/sanxing_chen/">Sanxing Chen</a> enhances LLMs’ *situated faithfulness* to external contexts -- even when they are wrong!👇

thumb_up_off_alt57

chat_bubble_outline2

repeat13

shareShare

Ghazal Khalighinejad

@ghazalkhn

10 months ago

📢 New preprint on a benchmark for multimodal information extraction! Structured data extraction from long documents consisting of interconnected data in text, tables, and figures remains a challenge. MatViX aims to fill this gap. matvix-bench.github.io

thumb_up_off_alt26

chat_bubble_outline1

repeat6

shareShare

Bhuwan Dhingra

@bhuwandhingra

6 months ago

**New paper from Roy Xie ** Do LLMs know when they have read enough to answer a question? We show how language models can STOP processing input text early without losing accuracy – Why waste 40,000 tokens when 500 suffice? 🧵 📄 Paper: arxiv.org/abs/2502.01025

**New paper from <a href="/RoyXie_/">Roy Xie</a> **

Do LLMs know when they have read enough to answer a question?

We show how language models can STOP processing input text early without losing accuracy – Why waste 40,000 tokens when 500 suffice? 🧵

📄 Paper: arxiv.org/abs/2502.01025

thumb_up_off_alt10

chat_bubble_outline3

repeat4

shareShare

Junlin Wang

@junlinwang3

4 months ago

Excited to share work from my Together AI internship—a deep dive into inference‑time scaling methods 🧠 We rigorously evaluated verifier‑free inference-time scaling methods across both reasoning and non‑reasoning LLMs. Some key findings: 🔑 Even with huge rollout budgets,

Excited to share work from my <a href="/togethercompute/">Together AI</a> internship—a deep dive into inference‑time scaling methods 🧠

We rigorously evaluated verifier‑free inference-time scaling methods across both reasoning and non‑reasoning LLMs. Some key findings:

🔑 Even with huge rollout budgets,

thumb_up_off_alt177

chat_bubble_outline1

repeat60

shareShare

Bhuwan Dhingra

@bhuwandhingra

4 months ago

At #ICLR2025? Check out Yukun Huang ‘s poster tomorrow on when to trust external contexts using LLMs..

At #ICLR2025? Check out <a href="/YukunHuang9/">Yukun Huang</a> ‘s poster tomorrow on when to trust external contexts using LLMs..

thumb_up_off_alt10

chat_bubble_outline1

repeat1

shareShare

Bhuwan Dhingra

@bhuwandhingra

3 months ago

📢 New Preprint from Raghuveer @ NAACL25 on Multimodal Contrastive Learning: Breaking the Batch Barrier (B3) 📢 TL;DR: Smart batch mining based on community detection achieves state of the art on the MMEB benchmark. Preprint: arxiv.org/pdf/2505.11293 Code: github.com/raghavlite/B3

thumb_up_off_alt16

chat_bubble_outline2

repeat4

shareShare

Bhuwan Dhingra

@bhuwandhingra

3 months ago

Glad to share a new ACL Findings paper from @MaxHolsman and Yukun Huang! We introduce Fuzzy Speculative Decoding (FSD) which extends speculative decoding to allow a tunable exchange of generation quality and inference acceleration. Paper: arxiv.org/abs/2502.20704

Glad to share a new ACL Findings paper from @MaxHolsman and <a href="/YukunHuang9/">Yukun Huang</a>!

We introduce Fuzzy Speculative Decoding (FSD) which extends speculative decoding to allow a tunable exchange of generation quality and inference acceleration.

Paper: arxiv.org/abs/2502.20704

thumb_up_off_alt10

chat_bubble_outline1

repeat2

shareShare

Roy Xie

@royxie_

3 months ago

Can we train reasoning LLMs to generate answers as they think? Introducing 𝐈𝐧𝐭𝐞𝐫𝐥𝐞𝐚𝐯𝐞𝐝 𝐑𝐞𝐚𝐬𝐨𝐧𝐢𝐧𝐠! We train LLMs to alternate between thinking & answering 🚀 Reducing Time-to-First-Token (TTFT) by over 80% ⚡AND improving Pass@1 accuracy up to 19.3%!📈 🧵 1/n

thumb_up_off_alt178

chat_bubble_outline1

repeat35

shareShare