Dong Yu (@dong_yu_ai) Twitter Tweets • TwiCopy

Dong Yu

@dong_yu_ai

An AI Researcher

ID: 1586172592335237120

29-10-2022 01:46:48

4 Tweets

9 Followers

135 Following

Game Theory Papers

@do

a year ago

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning. arxiv.org/abs/2407.00617

Likes: 1 · Replies: 0 · Retweets: 1

AK

Tencent presents Video-to-Audio Generation with Hidden Alignment

Generating semantically and temporally aligned audio content in accordance with video input has become a focal point for researchers, particularly following the remarkable breakthrough in text-to-video generation.

Likes: 142 · Replies: 1 · Retweets: 28

Zhaopeng Tu

@tuzhaopeng

9 months ago

Can reinforcement learning scale beyond math and coding tasks? Introducing Reinforcement Learning with Verifiable Rewards (RLVR) across diverse, less-structured domains (e.g., medicine, chemistry, psychology, economics, and education), where well-structured reference answers

Likes: 745 · Replies: 18 · Retweets: 149

Dong Yu

@dong_yu_ai

6 months ago

We are pleased to open-source our recent work in music/song generation. It's among the top models available so far. Huggingface: lnkd.in/gE2PsY8X Code: lnkd.in/gFY-K9Ye Paper: lnkd.in/gNw8dVHV Experiencing: lnkd.in/gDrj_j6S

Likes: 2 · Replies: 0 · Retweets: 0

Dong Yu

@dong_yu_ai

5 months ago

We have some interesting findings in our recent work "One Token to Fool LLM-as-a-Judge" (arxiv.org/abs/2507.08794) that will affect RLVR with generative reward models.

Likes: 13 · Replies: 0 · Retweets: 5

Wenhao Yu

@wyu_nd

4 months ago

LLMs can really Self-Evolve, without Human Data! -- One LLM, two roles: Challenger creates tasks, Solver answers them. -- No data, no labels, just a base model that learns and improves itself! We name it R-zero: arxiv.org/abs/2508.05004

Likes: 883 · Replies: 17 · Retweets: 158

Dong Yu

@dong_yu_ai

4 months ago

I gave an invited survey talk at Interspeech 2025 today on the topic of Conversational Agent. The slide deck is available at sites.google.com/view/dongyu888…

Likes: 5 · Replies: 0 · Retweets: 1
