Daniel Israel (@danielmisrael) 's Twitter Profile
Daniel Israel

@danielmisrael

PhD Student Studying AI/ML @UCLA

ID: 391930949

Link: https://danielmisrael.github.io/ · Joined: 16-10-2011 09:34:20

36 Tweets

821 Followers

5.5K Following

Anji Liu (@liu_anji) 's Twitter Profile Photo

[1/n] 🚀Diffusion models for discrete data excel at modeling text, but they need hundreds to thousands of diffusion steps to perform well.

We show that this is caused by the fact that discrete diffusion models predict each output token *independently* at each denoising step.
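To make the independence claim concrete, here is a minimal sketch (in PyTorch, with a hypothetical `denoiser_logits` stand-in for the trained model) of one denoising step that fills every masked position by sampling from its own per-position distribution, with no dependence between the tokens chosen in the same step:

```python
import torch

# Hypothetical stand-in for a trained discrete diffusion denoiser: given a
# (partially masked) token sequence, return per-position logits over the vocab.
def denoiser_logits(tokens, vocab_size):
    return torch.randn(tokens.shape[0], vocab_size)

def denoise_step(tokens, mask_id, vocab_size):
    # One denoising step: each masked position is filled by sampling from its
    # own categorical distribution, independently of the tokens being chosen
    # at the other masked positions in this same step.
    probs = torch.softmax(denoiser_logits(tokens, vocab_size), dim=-1)
    samples = torch.multinomial(probs, num_samples=1).squeeze(-1)
    masked = tokens == mask_id
    return torch.where(masked, samples, tokens)

vocab_size, mask_id = 100, 0
seq = torch.tensor([mask_id, 42, mask_id, 7, mask_id])
print(denoise_step(seq, mask_id, vocab_size))
```

Because the joint dependencies between positions filled in the same step are ignored, many small steps are needed to stay accurate; modeling those dependencies is what lets the step count drop.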
Christina Chance (@christinachanc) 's Twitter Profile Photo

1/n uclanlp is researching how Black, LGBTQIA+, & women communities perceive and are affected by content moderation, as it relates to English-language social media content using reclaimed language. As part of this, we are recruiting annotators (forms.gle/KP6F9gDCo8Skjs…) …

Zhe Zeng (@zhezeng0908) 's Twitter Profile Photo

📢 I’m recruiting PhD students at UVA Computer Science for Fall 2025! 🎯 Neurosymbolic AI, probabilistic ML, trustworthiness, AI for science. See my website for more details: zzeng.me 📬 If you're interested, apply and mention my name in your application: engineering.virginia.edu/department/com…

Benjie Wang (@benjiewang_cs) 's Twitter Profile Photo

You have some model/knowledge (e.g. Bayes Net, Probabilistic/Logic Program, DB) and some query (e.g. MAP, Causal Adjustment) you want to ask. When can you compute this efficiently? Find out @ NeurIPS today in Poster Session 6 East, #3801. Paper: arxiv.org/abs/2412.05481

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

Enabling Autoregressive Models to Fill In Masked Tokens

A hybrid of an autoregressive and a masked language model for infilling, built by training a linear decoder that takes their concatenated hidden states as input. Provides faster inference with KV caching. MARIA significantly outperforms
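A minimal sketch of the general idea, with hypothetical module names and hidden sizes (`HybridInfillingHead` is illustrative, not the paper's implementation): concatenate the AR and masked-LM hidden states at each position and train a single linear layer on top to predict the infilled tokens:

```python
import torch
import torch.nn as nn

class HybridInfillingHead(nn.Module):
    """Linear decoder over the concatenated hidden states of an autoregressive
    LM and a masked LM (hypothetical shapes; a sketch of the idea only)."""
    def __init__(self, ar_dim, mlm_dim, vocab_size):
        super().__init__()
        self.decoder = nn.Linear(ar_dim + mlm_dim, vocab_size)

    def forward(self, ar_hidden, mlm_hidden):
        # ar_hidden:  (batch, seq, ar_dim)  from the autoregressive model
        # mlm_hidden: (batch, seq, mlm_dim) from the masked LM
        fused = torch.cat([ar_hidden, mlm_hidden], dim=-1)
        return self.decoder(fused)  # (batch, seq, vocab_size) logits

head = HybridInfillingHead(ar_dim=768, mlm_dim=768, vocab_size=50257)
logits = head(torch.randn(2, 16, 768), torch.randn(2, 16, 768))
print(logits.shape)  # torch.Size([2, 16, 50257])
```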
Siyan Zhao (@siyan_zhao) 's Twitter Profile Photo

Excited to release PrefEval (ICLR '25 Oral), a benchmark for evaluating LLMs’ ability to infer, memorize, and adhere to user preferences in long-context conversations!

⚠️We find that cutting-edge LLMs struggle to follow user preferences—even in short contexts. This isn't just
Aditya Grover (@adityagrover_) 's Twitter Profile Photo

A few months ago, we started Inception Labs, a new generative AI startup with a rockstar founding team. At Inception, we are challenging the status quo for language generation. Our first results bring blazing fast speeds at 1000+ tokens/sec while matching the quality of leading

Hritik Bansal (@hbxnov) 's Twitter Profile Photo

Video generative models hold the promise of being general-purpose simulators of the physical world 🤖 How far are we from this goal❓

📢Excited to announce VideoPhy-2, the next edition in the series to test the physical likeness of the generated videos for real-world actions. 🧵
Zilei Shao (@zileishao) 's Twitter Profile Photo

What happens if we tokenize cat as [ca, t] rather than [cat]? 

LLMs are trained on just one tokenization per word, but they still understand alternative tokenizations. We show that this can be exploited to bypass safety filters without changing the text itself.

#AI #LLMs #Token
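A toy sketch of the underlying phenomenon, using a made-up vocabulary rather than a real tokenizer: two different token-ID sequences decode to exactly the same surface text, so a filter keyed to the canonical segmentation can miss the alternative one:

```python
# Toy illustration: two different token sequences decode to the same string.
# Real BPE vocabularies also contain both a merged token and its sub-pieces;
# this tiny vocab is purely hypothetical.
vocab = {0: "cat", 1: "ca", 2: "t", 3: " sat"}

def decode(ids):
    return "".join(vocab[i] for i in ids)

canonical = [0, 3]       # ["cat", " sat"]      -- what the trained tokenizer emits
alternative = [1, 2, 3]  # ["ca", "t", " sat"]  -- a segmentation never seen in training

assert decode(canonical) == decode(alternative) == "cat sat"
# A safety filter keyed on the canonical token sequence (or on patterns the
# model saw during training) can miss the alternative segmentation, even
# though the text the user reads is identical.
print(decode(alternative))
```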
Hritik Bansal (@hbxnov) 's Twitter Profile Photo

📢Scaling test-time compute via generative verification (GenRM) is an emerging paradigm that has been shown to be more efficient than self-consistency (SC) for reasoning. But such claims are misleading ☠️

Our compute-matched analysis shows that SC outperforms GenRM across most budgets! 🧵
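A toy sketch of what "compute-matched" means here, with random stand-ins for the solver and verifier and a simplified budget measured in model calls (the paper's actual accounting is more careful): self-consistency spends the whole budget on sampled solutions and majority-votes, while GenRM-style verification splits the same budget between solutions and verification calls:

```python
from collections import Counter
import random

# Hypothetical stand-ins: a "solver" that returns a candidate answer and a
# "verifier" that scores it. Both are random here, purely to show how the two
# test-time strategies spend a matched budget of model calls.
def sample_solution():
    return random.choice(["A", "A", "B", "C"])  # biased toward the true answer "A"

def verify(answer):
    return random.random() + (0.2 if answer == "A" else 0.0)

def self_consistency(budget):
    # SC: spend the whole budget on independent solutions, then majority-vote.
    answers = [sample_solution() for _ in range(budget)]
    return Counter(answers).most_common(1)[0][0]

def generative_verification(budget, verifications_per_solution=3):
    # GenRM-style: split the budget between solution calls and verification
    # calls, then pick the answer with the highest total verifier score.
    n_solutions = max(1, budget // (1 + verifications_per_solution))
    answers = [sample_solution() for _ in range(n_solutions)]
    scored = [(sum(verify(a) for _ in range(verifications_per_solution)), a)
              for a in answers]
    return max(scored)[1]

budget = 16  # total model calls, matched between the two strategies
print("SC:   ", self_consistency(budget))
print("GenRM:", generative_verification(budget))
```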
Lucas Bandarkar (@lucasbandarkar) 's Twitter Profile Photo

The unreasonable effectiveness of model merging for cross-lingual transfer! Our preprint evaluates a number of *modular* approaches to fine-tuning LLMs that "assign" model params to either task or language. Surprisingly, merging experts beats all! 🧵1/4 arxiv.org/abs/2505.18356
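A minimal sketch of the simplest merging baseline, uniform weight-space averaging of two fine-tuned checkpoints that share a base model (the toy state dicts and the `merge_state_dicts` helper are hypothetical; the preprint compares several more elaborate modular variants):

```python
import torch

def merge_state_dicts(expert_a, expert_b, alpha=0.5):
    """Weight-space interpolation of two fine-tuned checkpoints that share the
    same base architecture: a minimal sketch of model merging."""
    return {name: alpha * expert_a[name] + (1 - alpha) * expert_b[name]
            for name in expert_a}

# Hypothetical tiny "models": a task expert (e.g. fine-tuned on math in English)
# and a language expert (e.g. fine-tuned on target-language text).
task_expert = {"linear.weight": torch.randn(4, 4), "linear.bias": torch.randn(4)}
lang_expert = {"linear.weight": torch.randn(4, 4), "linear.bias": torch.randn(4)}

merged = merge_state_dicts(task_expert, lang_expert, alpha=0.5)
print({k: v.shape for k, v in merged.items()})
```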