Rongwu Xu (@rongwu_xu) 's Twitter Profile
Rongwu Xu

@rongwu_xu

MS, BEng @Tsinghua_Uni. Prev @AlibabaGroup. I work on #AI #NLProc+Safety/Psychology/CogSci/CSS

ID: 1759949458119307264

Link: https://rongwuxu.com · Joined: 20-02-2024 14:33:51

136 Tweets

266 Followers

255 Following

Owain Evans (@owainevans_uk) 's Twitter Profile Photo

Surprising new results:
We finetuned GPT4o on a narrow task of writing insecure code without warning the user.
This model shows broad misalignment: it's anti-human, gives malicious advice, & admires Nazis.

This is *emergent misalignment* & we cannot fully explain it 🧵
Zhijiang Guo (@zhijiangg) 's Twitter Profile Photo

🚀Exciting to see how recent advancements like OpenAI’s O1/O3 & DeepSeek’s R1 are pushing the boundaries! 
Check out our latest survey on Complex Reasoning with LLMs. Analyzed over 300 papers to explore the progress.
Paper: arxiv.org/pdf/2502.17419
Github: github.com/zzli2022/Aweso…
Wenhu Chen (@wenhuchen) 's Twitter Profile Photo

As a researcher, it's easy to get distracted by what others are working on. I've seen many people conducting research on problems they don't genuinely care about—just because the community values them (e.g., solving Math Olympiad problems). It's important to focus on research

yuwen lu (@yuwen_lu_) 's Twitter Profile Photo

much of interpretability is hci problems. a lot of it is to help humans understand what this black box algorithm is. at a meta level, mech interp researchers deal with hci problems every day. visualization, direct manipulation, mental model, etc. are all classic hci things!

Csaba Szepesvari (@csabaszepesvari) 's Twitter Profile Photo

Andrej Karpathy I think it would be good to distinguish RL as a problem from the algorithms that people use to address RL problems. This would allow us to discuss if the problem is with the algorithms, or if the problem is with posing a problem as an RL problem. 1/x

Rongwu Xu (@rongwu_xu) 's Twitter Profile Photo

Great work! I am thinking about combining this with environment generation arxiv.org/pdf/2402.15391 as a way to create "endlessly" diverse data/envs to improve AI

Tim Althoff (@timalthoff) 's Twitter Profile Photo

(please reshare) I'm recruiting multiple PhD students and Postdocs Allen School UW NLP
(bdata.uw.edu). Focus areas incl. psychosocial AI simulation and safety, Human-AI collaboration.

PhD: cs.washington.edu/academics/phd/…

Postdocs: docs.google.com/document/d/1h4…
Ken Gu (@kenqgu) 's Twitter Profile Photo

True intelligence = reasoning about new information, not memorized facts.

How can we scalably create benchmarks that are completely novel yet have known answers?

Meet SynthWorlds, an eval & data-gen framework to disentangle reasoning and knowledge⬇️🧵

📄arxiv.org/pdf/2510.24427
🔥 Matt Dancho (Business Science) 🔥 (@mdancho84) 's Twitter Profile Photo

🔥 GPT-6 may not just be smarter. 

It literally might be alive (in the computational sense).

A new research paper, SEAL: Self-Adapting Language Models (arXiv:2506.10943), describes how an AI can continuously learn after deployment, evolving its own internal representations
Jason Weston (@jaseweston) 's Twitter Profile Photo

Scaling Agent Learning via Experience Synthesis
📝: arxiv.org/abs/2511.03773

Scaling training environments for RL by simulating them with reasoning LLMs!

Environment models + Replay-buffer + New tasks = cheap RL for any environments!

- Strong improvements over non-RL-ready