Wenda Xu (@wendaxu2) 's Twitter Profile
Wenda Xu

@wendaxu2

I work on evaluation of AI-generated text and LLM post-training. Research Scientist @GoogleAI. PhD @UCSB

ID: 1448188793794686979

Link: https://xu1998hz.github.io · Joined: 13-10-2021 07:28:17

304 Tweets

1.1K Followers

366 Following

Jiachen Li (@jiachenli11) 's Twitter Profile Photo

I am actively seeking industrial opportunities and bring expertise in training T2I and T2V diffusion models, along with a strong background in Deep RL. If you think my skills align with your needs, feel free to reach out!

Wenda Xu (@wendaxu2) 's Twitter Profile Photo

Michael is a rising star, a fun labmate, and a great person! Please consider hiring him. He will surprise you with his Mandarin skills 🤓

Xuandong Zhao (@xuandongzhao) 's Twitter Profile Photo

I am deeply sorry and heartbroken over the loss of Felix Hill. His post docs.google.com/document/d/1aE… is a poignant reminder of the mental health challenges we face in the fast-paced and high-pressure AI field. Lately, I’ve also been feeling overwhelmed by the rapid advancements in

Yuchen Jin (@yuchenj_uw) 's Twitter Profile Photo

This "Aha moment" in the DeepSeek-R1 paper is huge: Pure reinforcement learning (RL) enables an LLM to automatically learn to think and reflect. This challenges the prior belief that replicating OpenAI's o1 reasoning models requires extensive CoT data. It turns out you just

Xiao Pu (@xiaosophiapu) 's Twitter Profile Photo

🚀 Excited to share that our work has been accepted to #NAACL2025! We show that LLM watermarks can be removed in a black-box setting 🛠️ For more details: arxiv.org/abs/2411.01222

Lewis Tunstall (@_lewtun) 's Twitter Profile Photo

We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open! 🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1. 🧠

Wenda Xu (@wendaxu2) 's Twitter Profile Photo

Are there any papers that theoretically or quantitatively demonstrate that training a language understanding model, like a metric or reward model, is easier than training a language generation model? Alternatively, should I justify this based on the differences in the output

Lei Li (@lileics) 's Twitter Profile Photo

A freshly minted doctor! Congratulations to Wenda Xu on successfully defending his PhD thesis, "On Evaluation and Efficient Post-training for LLMs". Highly recommend his slides, covering RL training, better knowledge distillation, LLM/text-generation evaluation, and bias in LLM-as-a-judge: docs.google.com/presentation/d…