Genglin Liu (@genglin_liu)'s Twitter Profile
Genglin Liu

@genglin_liu

PhD student @UCLA, #NLP

ID: 1620555839630184448

Link: https://genglinliu.github.io/
Joined: 31-01-2023 22:53:27

58 Tweets

173 Followers

516 Following

Dan Hendrycks (@danhendrycks)

As an alternative to RLHF and adversarial training, we released short-circuiting. It makes models ~100x more robust. It works for LLMs, multimodal models, and agents. Unlike before, I now think robustly stopping models from generating harmful outputs may be highly tractable and

Heng Ji (@hengjinlp)

We have won two NAACL2024 Outstanding Paper Awards! Congratulations to Chi Han, Shizhe Diao, Yi Fung, Xingyao Wang, Yangyi Chen and all students and collaborators! Chi Han will be on academic job market next year! arxiv.org/pdf/2308.16137 arxiv.org/pdf/2311.09677

Shizhe Diao (@shizhediao)

Excited to share our R-Tuning got an outstanding paper award @NAACL 2024! Take a look at this paper to see how to align your LLMs to honesty. arxiv.org/abs/2311.09677 This work is finished during my visit at UIUC. Thanks for Prof. Ji and Prof. Zhang’s supervision!

Chi Han (@glaciohound)

🎖 Excited to receive an outstanding paper award at NAACL2024 for LM-Infinite "Zero-Shot Extreme Length Generalization for Large Language Models" work! We extend to 200M length with no parameter updates, with downstream improvements arxiv.org/abs/2308.16137 github.com/Glaciohound/LM…

Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu)

If you're attending ICML 2024, join my 2-hour tutorial on Monday July 22 to explore the Physics of Language Model - all 6 parts. Visit: physics.allen-zhu.com and it will be live-streamed on Zoom. BONUS: this is the premiere of Part 2.1 + 2.2, don't miss out! #ICML2024 #MetaAI

Haoyi Qiu (@haoyiqiu)

🌐 Are LLM agents prepared to navigate the rich diversity of cultural and social norms? 🏠 CASA tests them on real-world tasks like online shopping and social discussion forums, revealing that current agents show less than 10% awareness and over 40% norm violations. 🧠 We’re

Yu (Bryan) Zhou (@yu_bryan_zhou)

📢 A single line of code to thoroughly evaluate your LLM for Embodied Decision Making 📢 Please checkout our new NeurIPS D&B Oral Paper!! (Part-1 of my summer intern works Stanford Vision and Learning Lab)

Liwei Jiang (@liweijianglw)

I'm thrilled to share that our Delphi paper is officially published today at Nature Machine Intelligence after almost four years of hard works from all my amazing collaborators (a quite insane timeline considering the rapid AI world)! Special thanks to the unwavering support of my advisor,

Zhenhailong Wang (@zhenhailongw)

📱Current mobile agents struggle with real-world tasks that align with human needs—like finding the best deal across 3 apps. 💸 Introducing Mobile-Agent-E: a novel mobile assistant designed for complex, long-horizon tasks and capable of self-evolving🐣🐥through experience. 🧵1/3

Yuji Zhang (@yuji_zhang_nlp)

🔍New findings of knowledge overshadowing! Why do LLMs hallucinate over all true training data? 🤔Can we predict hallucinations even before model training or inference? 🚀Check out our new preprint: [arxiv.org/pdf/2502.16143] The Law of Knowledge Overshadowing: Towards

Salman (@salman1422571)

🚨 Excited to share our new paper on 𝕏-Teaming! 🤖 Multiagent system for multiturn jailbreaking 🔍 96.2% attack success against Claude 3.7 (immune to single-turn attacks!) 💥 Up to 98.1% attack success on leading model 🛡️ Released 30K safety dataset 🧵below #AI #LLMSafety

Salman (@salman1422571)

🚨Thrilled to share our new work: AI debate combats misinformation better than single AI advisors! 🤔We tested if two AIs debating opposite sides helps biased humans judge controversial COVID-19 claims more accurately. Paper: arxiv.org/abs/2506.02175 🧵👇 #AI #Debate

Liwei Jiang (@liweijianglw)

Wondering whether AI debates can drive biased perspectives toward truth? Our answer is YES and this scalable oversight work is now accepted to #NeurIPS2025 ! Finally bringing a large-scale human study into an AI conference! (+++ my first time as a last-ish author is very fun!

Liwei Jiang (@liweijianglw)

(Thu Oct 9, 11:00am–1:00pm) Poster Session 5 𝐏𝐨𝐬𝐭𝐞𝐫 #𝟒𝟒: X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents; w/ amazing co-leads Salman James Shiffer In this work, we introduce a 𝐜𝐨𝐦𝐩𝐫𝐞𝐡𝐞𝐧𝐬𝐢𝐯𝐞 and 𝐞𝐚𝐬𝐲-𝐭𝐨-𝐫𝐮𝐧
