Runlong Zhou (@vectorzhou)'s Twitter Profile
Runlong Zhou

@vectorzhou

PhD student @uwcse; Prev: Undergrad at IIIS, @Tsinghua_Uni

ID: 3437500454

Link: http://vectorzhou.com · Joined: 03-09-2015 13:58:28

86 Tweets

184 Followers

298 Following

AniPlaylist (@aniplaylist)'s Twitter Profile Photo

🚨 FINAL FANTASY XVI Original Soundtracks on music streaming platforms  

At midnight on September 18, the FINAL FANTASY XVI & DLC OSTs will finally be released on music streaming platforms!

🔥 Links for Spotify & Apple Music below
Runlong Zhou (@vectorzhou)'s Twitter Profile Photo

Feel free to stop by our poster if you are interested in online DPO and RLHF. Our proposed method -- a mixture of a uniform sampler and a policy-dependent sampler -- provably achieves a quadratic convergence rate!
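
For intuition, here is a minimal sketch of what such a mixture sampler could look like, assuming a fixed candidate pool for the uniform component and a `policy.generate` method for the policy-dependent one; the names and mixing scheme below are illustrative assumptions, not the paper's actual algorithm.

```python
import random

def sample_response_pair(prompt, policy, candidate_pool, eps=0.5):
    """Hedged sketch of a uniform/policy mixture sampler for online DPO.

    With probability eps, draw a response uniformly from a fixed pool;
    otherwise, draw from the current policy. `policy.generate` and
    `candidate_pool` are illustrative assumptions, not the paper's API.
    """
    def draw():
        if random.random() < eps:
            return random.choice(candidate_pool)  # uniform component
        return policy.generate(prompt)            # policy-dependent component
    # Two independent draws give the response pair to be preference-labeled
    # and fed into the online DPO update.
    return draw(), draw()
```

The rough intuition: the uniform component keeps exploration alive over the whole response space, while the policy-dependent component concentrates samples where the current policy puts mass.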

Avinandan Bose (@avibose22)'s Twitter Profile Photo

🧠 Your LLM should model how you think, not reduce you to preassigned traits
📢 Introducing LoRe: a low-rank reward modeling framework for personalized RLHF
❌ Demographic grouping/handcrafted traits
✅ Infers implicit preferences
✅ Few-shot adaptation
📄 arxiv.org/abs/2504.14439
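
A minimal sketch of the low-rank idea as I read it from the abstract: user-specific rewards are mixtures of a small shared basis of reward functions, so personalizing to a new user means fitting only a low-dimensional weight vector. The class and names below are hypothetical, not LoRe's actual code.

```python
import torch
import torch.nn as nn

class LowRankReward(nn.Module):
    """Hypothetical sketch: K shared basis reward heads over response
    features, combined per user by a K-dimensional weight vector."""
    def __init__(self, feat_dim: int, num_basis: int):
        super().__init__()
        self.basis = nn.Linear(feat_dim, num_basis, bias=False)

    def forward(self, features: torch.Tensor, user_w: torch.Tensor):
        # Reward for one user = user-weighted mixture of basis rewards.
        return (self.basis(features) * user_w).sum(dim=-1)

# Few-shot adaptation would freeze the shared basis and fit only the
# K user weights on a handful of that user's preference comparisons.
```
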
Ruizhe Shi (@smellycat_zzz)'s Twitter Profile Photo

Two-stage RLHF or one-stage DPO: Which one is better for learning from preferences?

Equal under strong assumptions, but representation differences break the tie. Our paper reveals their fine-grained performance gaps under various conditions.

paper: arxiv.org/abs/2505.19770
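
For reference, these are the standard objectives the two pipelines optimize (textbook background, not results from the paper): two-stage RLHF first fits a reward model $\hat{r}$ from preference data and then solves a KL-regularized policy optimization, while one-stage DPO trains the policy directly on preference pairs.

```latex
% Two-stage RLHF: reward modeling, then KL-regularized policy optimization
\max_{\pi_\theta}\;
  \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot \mid x)}
  \bigl[\hat{r}(x, y)\bigr]
  - \beta\, \mathrm{KL}\!\bigl(\pi_\theta(\cdot \mid x) \,\|\, \pi_{\mathrm{ref}}(\cdot \mid x)\bigr)

% One-stage DPO: direct optimization on preference pairs (y_w \succ y_l)
\mathcal{L}_{\mathrm{DPO}}(\theta)
  = -\,\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}}
    \log \sigma\!\Bigl(
      \beta \log \tfrac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \tfrac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \Bigr)
```

Under a Bradley-Terry preference model and exact optimization the two coincide, which is presumably the "equal under strong assumptions" baseline that the paper's representation-gap results depart from.
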
elvis (@omarsar0)'s Twitter Profile Photo

The Illusion of Thinking in LLMs

Apple researchers discuss the strengths and limitations of reasoning models.

Apparently, reasoning models "collapse" beyond certain task complexities.

Lots of important insights on this one. (bookmark it!)

Here are my notes:
Runlong Zhou (@vectorzhou)'s Twitter Profile Photo

Thrilled to announce that CASCADE is accepted to #COLM2025! It is one of the most interesting projects I've done thus far -- so I'm grateful to all the reviewers and ACs for helping me improve the paper. Most importantly: thank you, Yi Zhang, for your mentorship and support!

Ruoming Pang (@ruomingpang)'s Twitter Profile Photo

In this report we describe the 2025 Apple Foundation Models ("AFM"). We also introduce the new Foundation Models framework, which gives app developers direct access to the on-device AFM model. machinelearning.apple.com/research/apple…

Yi Wu (@jxwuyi)'s Twitter Profile Photo

Tired of intricate system code for RL training? 🤯
We release AReaL-lite – a lightweight AReaL version for AI researchers! 🚀 #opensource
✨ Algorithm-first design & APIs 🎉
✨ 80% less code with 90% of AReaL's full efficiency 🎉
✨ Customizable agentic RL 🎉
🔗 github.com/inclusionAI/AR…