Yuqing Yang (@yyqcode) 's Twitter Profile
Yuqing Yang

@yyqcode

First-year PhD student @CSatUSC @nlp_usc.

ID: 1670754896352784391

https://ayyyq.github.io/ · Joined 19-06-2023 11:26:45

24 Tweets

189 Followers

347 Following

Qinyuan Ye (👀Jobs) (@qinyuan_ye) 's Twitter Profile Photo

I'll present a poster for Lifelong ICL and Task Haystack at #NeurIPS2024! ⏰ Wednesday 11am-2pm 📍 East Exhibit Hall A-C #2802 📜 arxiv.org/abs/2407.16695 My co-first author Xiaoyue Xu is applying to PhD programs and I am looking for jobs in industry! Happy to connect at NeurIPS!

Tengxiao Liu (@tengxiaoliu) 's Twitter Profile Photo

Come join the #NeurIPS2024 poster session and discuss whether language models can learn to skip steps in reasoning! 🗓Dec 12, Thursday, 11:00 am - 2:00 pm 📍East Exhibit Hall A-C #2900 Feel free to stop by and say hi! I am actively seeking Summer 2025 internship opportunities!

Muru Zhang (@zhang_muru) 's Twitter Profile Photo

Running your model on multiple GPUs but often finding the speed unsatisfying? We introduce Ladder-residual, a parallelism-aware architecture modification that makes 70B Llama with tensor parallelism ~30% faster! Work done at Together AI. Co-1st author with Mayank Mishra
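
(The tweet doesn't spell out the mechanism. As a rough, hedged illustration of the general idea behind parallelism-aware designs of this kind, and not the authors' exact Ladder-residual formulation, the sketch below overlaps a tensor-parallel all-reduce with the next block's computation using PyTorch's async collective API; `block_a`, `block_b`, and `hidden` are placeholder names.)

```python
# Hedged sketch: hide tensor-parallel communication behind computation.
# Illustrates the general overlap idea only; NOT the Ladder-residual code.
# Assumes torch.distributed has already been initialized (init_process_group).
import torch
import torch.distributed as dist

def overlapped_blocks(hidden: torch.Tensor, block_a, block_b) -> torch.Tensor:
    partial_a = block_a(hidden)                       # each rank holds a partial result
    work = dist.all_reduce(partial_a, async_op=True)  # start comms without blocking
    partial_b = block_b(hidden)                       # compute the next block meanwhile,
                                                      # reading the residual stream as-is
    work.wait()                                       # ensure block_a's output is summed
    # In a full tensor-parallel setup partial_b would be reduced the same way;
    # omitted here to keep the overlap pattern visible.
    return hidden + partial_a + partial_b
```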

Tianyi Zhou (@tianyi_zhou12) 's Twitter Profile Photo

Billion-parameter LLMs still struggle with simple arithmetic? 📞 FoNE (Fourier Number Embedding) tackles this problem. By mapping numbers directly into Fourier space, it bypasses tokenization and significantly improves numerical accuracy with better efficiency.
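
(For intuition only: a minimal sketch of a Fourier-style number embedding in the spirit the tweet describes. The choice of periods, the scaling, and how these features would be projected into the model are my assumptions, not the FoNE paper's exact recipe.)

```python
# Hedged sketch of a Fourier-style number embedding (assumed details, not the
# paper's exact FoNE recipe): encode a number with cos/sin features at several
# periods so digit-level structure is represented directly, without tokenization.
import math

def fourier_number_embedding(x: float, periods=(10, 100, 1000, 10000)) -> list[float]:
    """Map a number to [cos(2*pi*x/T), sin(2*pi*x/T)] for each period T."""
    feats = []
    for T in periods:
        angle = 2 * math.pi * x / T
        feats.extend([math.cos(angle), math.sin(angle)])
    return feats

# Example: 123 and 1123 share the same low-period features (same trailing digits)
# but differ at the longest period, so both digits and magnitude are recoverable.
print(fourier_number_embedding(123)[:4])
print(fourier_number_embedding(1123)[:4])
```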

Linxin Song (@linxins2) 's Twitter Profile Photo

Want to know what your LLM doesn't know? This is how 👇 Preprint: arxiv.org/abs/2503.23361 Code: github.com/uscnlp-lime/SEA

Deqing Fu (@deqingfu) 's Twitter Profile Photo

Textual steering vectors can improve visual understanding in multimodal LLMs! You can extract steering vectors via any interpretability toolkit you like -- SAEs, MeanShift, Probes -- and apply them to image or text tokens (or both) of Multimodal LLMs. And They Steer!
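
(A minimal, hedged sketch of the mean-difference style of extraction the tweet lists under MeanShift: average activations with and without the target attribute, subtract, and add the scaled direction back to hidden states at inference time. The shapes, layer choice, and scaling factor are illustrative assumptions, not the paper's code.)

```python
# Hedged sketch of mean-difference ("MeanShift"-style) steering: extract a
# direction from one layer's hidden states and add it back during generation.
import torch

def extract_steering_vector(pos_acts: torch.Tensor, neg_acts: torch.Tensor) -> torch.Tensor:
    """pos_acts/neg_acts: [num_examples, hidden_dim] activations from one layer."""
    return pos_acts.mean(dim=0) - neg_acts.mean(dim=0)

def apply_steering(hidden: torch.Tensor, v: torch.Tensor, alpha: float = 4.0) -> torch.Tensor:
    """Add the steering direction at every token position (image tokens, text
    tokens, or both); broadcasts over [batch, seq_len, hidden_dim]."""
    return hidden + alpha * v
```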

Linxin Song (@linxins2) 's Twitter Profile Photo

🚨 We discovered a surprising side effect of Reinforcement Finetuning (RFT): it makes LLMs more confidently wrong on unanswerable questions. We call this the hallucination tax: a drop in refusal behavior that leads to overconfident hallucinations. 🧵 1/n
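
(An illustrative, hedged sketch of one way to quantify the refusal drop the thread describes: compare refusal rates on unanswerable questions before and after RFT. The keyword-based refusal detector below is a naive stand-in, not the paper's actual metric.)

```python
# Hedged, illustrative sketch of measuring refusal rate on unanswerable questions.
REFUSAL_MARKERS = ("i don't know", "cannot be determined", "not enough information")

def refusal_rate(answers: list[str]) -> float:
    """Fraction of answers that decline rather than assert a (made-up) answer."""
    refusals = sum(any(m in a.lower() for m in REFUSAL_MARKERS) for a in answers)
    return refusals / max(len(answers), 1)

# "Hallucination tax" intuition: refusal_rate(base_answers) - refusal_rate(rft_answers)
# being large and positive means RFT made the model answer questions it should decline.
```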

Dongwei Jiang (@dongwei__jiang) 's Twitter Profile Photo

🧵 Recent studies show LLMs can self-improve their responses when given external feedback. But how effectively can they incorporate it? We tested this systematically—and found they can't fully integrate feedback, even when the feedback is high-quality and backed by ground-truth.
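
(A hedged sketch of the kind of feedback-incorporation protocol being tested: draft an answer, show external ground-truth-backed feedback, ask for a revision, then inspect whether the revision actually uses it. `generate` stands for any text-in/text-out LLM call; the prompt wording is an illustrative placeholder, not the paper's setup.)

```python
# Hedged sketch of a self-improvement-with-feedback loop (illustrative only).
from typing import Callable

def refine_with_feedback(generate: Callable[[str], str], question: str, feedback: str) -> dict:
    draft = generate(question)
    revision = generate(
        f"Question: {question}\n"
        f"Your previous answer: {draft}\n"
        f"Feedback (external, backed by ground truth): {feedback}\n"
        f"Revise your answer, incorporating the feedback."
    )
    # Downstream, one would check whether `revision` actually reflects the feedback,
    # not just whether the text changed.
    return {"draft": draft, "revision": revision, "changed": revision.strip() != draft.strip()}
```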

Xi Ye (@xiye_nlp) 's Twitter Profile Photo

There’s been hot debate about (The Illusion of) The Illusion of Thinking. My take: it’s not that models can’t reason; they just aren’t perfect at long-form generation yet. We evaluate reasoning models on the LongProc benchmark (requiring generating 8K CoTs, see thread). Reasoning

Chenxin An (@anchancy46881) 's Twitter Profile Photo

🚨 4B open-recipe model beats Claude-4-Opus 🔓 100% open data, recipe, model weights and code. Introducing Polaris✨, a post-training recipe for scaling RL on advanced reasoning models. 🥳 Check out how we boost open-recipe reasoning models to incredible performance levels

Johnny Tian-Zheng Wei (@johntzwei) 's Twitter Profile Photo

Announcing 🔭✨Hubble, a suite of open-source LLMs to advance the study of memorization! Pretrained models up to 8B params, with controlled insertion of texts (e.g., book passages, biographies, test sets, and more!) designed to emulate key memorization risks 🧵
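
(A hedged sketch of one common memorization probe that a suite like this enables, not necessarily Hubble's own evaluation: compare per-token loss on passages inserted into pretraining against matched passages that were never inserted. Uses the standard Hugging Face causal-LM interface.)

```python
# Hedged sketch of a loss-based memorization probe (illustrative, not Hubble's eval).
import torch

@torch.no_grad()
def mean_nll(model, tokenizer, text: str) -> float:
    """Mean per-token negative log-likelihood of `text` under a HF causal LM."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    out = model(ids, labels=ids)  # HF causal LMs return mean cross-entropy as .loss
    return out.loss.item()

# A noticeably lower NLL on inserted passages than on held-out controls is
# evidence that the model memorized them.
```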
