Rongwu Xu (@rongwu_xu) 's Twitter Profile
Rongwu Xu

@rongwu_xu

MS, BEng @Tsinghua_Uni. Prev @AlibabaGroup. I work on #AI #NLProc+Safety/Psychology/CogSci/CSS

ID: 1759949458119307264

Link: https://rongwuxu.com · Joined: 20-02-2024 14:33:51

136 Tweets

266 Followers

255 Following

Owain Evans (@owainevans_uk) 's Twitter Profile Photo

Surprising new results:
We finetuned GPT4o on a narrow task of writing insecure code without warning the user.
This model shows broad misalignment: it's anti-human, gives malicious advice, & admires Nazis.

This is *emergent misalignment* & we cannot fully explain it 🧵
Zhijiang Guo (@zhijiangg) 's Twitter Profile Photo

🚀Exciting to see how recent advancements like OpenAI’s O1/O3 & DeepSeek’s R1 are pushing the boundaries! 
Check out our latest survey on Complex Reasoning with LLMs. Analyzed over 300 papers to explore the progress.
Paper: arxiv.org/pdf/2502.17419
Github: github.com/zzli2022/Aweso…
Wenhu Chen (@wenhuchen) 's Twitter Profile Photo

As a researcher, it's easy to get distracted by what others are working on. I've seen many people conducting research on problems they don't genuinely care about—just because the community values them (e.g., solving Math Olympiad problems). It's important to focus on research

yuwen lu (@yuwen_lu_) 's Twitter Profile Photo

much of interpretability is hci problems. a lot of it is to help humans understand what this black box algorithm is. at a meta level, mech interp researchers deal with hci problems every day. visualization, direct manipulation, mental model, etc. are all classic hci things!

Csaba Szepesvari (@csabaszepesvari) 's Twitter Profile Photo

Andrej Karpathy I think it would be good to distinguish RL as a problem from the algorithms that people use to address RL problems. This would allow us to discuss if the problem is with the algorithms, or if the problem is with posing a problem as an RL problem. 1/x

Rongwu Xu (@rongwu_xu) 's Twitter Profile Photo

Great work! I am thinking about combining this with environment generation arxiv.org/pdf/2402.15391 as a way to create "endlessly" diverse data/envs to improve AI

Tim Althoff (@timalthoff) 's Twitter Profile Photo

(please reshare) I'm recruiting multiple PhD students and Postdocs Allen School UW NLP
(bdata.uw.edu). Focus areas incl. psychosocial AI simulation and safety, Human-AI collaboration.

PhD: cs.washington.edu/academics/phd/…

Postdocs: docs.google.com/document/d/1h4…
Ken Gu (@kenqgu) 's Twitter Profile Photo

True intelligence = reasoning about new information, not memorized facts.

How can we scalably create benchmarks that are completely novel yet have known answers?

Meet SynthWorlds, an eval & data-gen framework to disentangle reasoning and knowledge⬇️🧵

📄arxiv.org/pdf/2510.24427
🔥 Matt Dancho (Business Science) 🔥 (@mdancho84) 's Twitter Profile Photo

🔥 GPT-6 may not just be smarter. 

It literally might be alive (in the computational sense).

A new research paper, SEAL: Self-Adapting Language Models (arXiv:2506.10943), describes how an AI can continuously learn after deployment, evolving its own internal representations
Jason Weston (@jaseweston) 's Twitter Profile Photo

Scaling Agent Learning via Experience Synthesis
📝: arxiv.org/abs/2511.03773

Scaling training environments for RL by simulating them with reasoning LLMs!

Environment models + Replay-buffer + New tasks = cheap RL for any environments!

- Strong improvements over non-RL-ready