Genglin Liu (@genglin_liu)'s Twitter Profile
Genglin Liu

@genglin_liu

PhD student @UCLA, #NLP

ID: 1620555839630184448

https://genglinliu.github.io/

31-01-2023 22:53:27

58 Tweets

173 Followers

516 Following

Dan Hendrycks (@danhendrycks)

As an alternative to RLHF and adversarial training, we released short-circuiting. It makes models ~100x more robust. It works for LLMs, multimodal models, and agents. Unlike before, I now think robustly stopping models from generating harmful outputs may be highly tractable and

Heng Ji (@hengjinlp)

We have won two NAACL2024 Outstanding Paper Awards! Congratulations to Chi Han, Shizhe Diao, Yi Fung, Xingyao Wang, Yangyi Chen and all students and collaborators! Chi Han will be on the academic job market next year! arxiv.org/pdf/2308.16137 arxiv.org/pdf/2311.09677

Shizhe Diao (@shizhediao)

Excited to share that our R-Tuning got an outstanding paper award @NAACL 2024! Take a look at this paper to see how to align your LLMs to honesty. arxiv.org/abs/2311.09677 This work was finished during my visit at UIUC. Thanks for Prof. Ji and Prof. Zhang's supervision!

Chi Han (@glaciohound)

🎖 Excited to receive an outstanding paper award at NAACL2024 for LM-Infinite "Zero-Shot Extreme Length Generalization for Large Language Models" work! We extend to 200M length with no parameter updates, with downstream improvements arxiv.org/abs/2308.16137 github.com/Glaciohound/LM…

Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu)

If you're attending ICML 2024, join my 2-hour tutorial on Monday July 22 to explore the Physics of Language Model - all 6 parts. Visit: physics.allen-zhu.com and it will be live-streamed on Zoom. BONUS: this is the premiere of Part 2.1 + 2.2, don't miss out! #ICML2024 #MetaAI

Haoyi Qiu (@haoyiqiu)

๐ŸŒ Are LLM agents prepared to navigate the rich diversity of cultural and social norms? ๐Ÿ  CASA tests them on real-world tasks like online shopping and social discussion forums, revealing that current agents show less than 10% awareness and over 40% norm violations. ๐Ÿง  Weโ€™re

๐ŸŒ Are LLM agents prepared to navigate the rich diversity of cultural and social norms? ๐Ÿ  CASA tests them on real-world tasks like online shopping and social discussion forums, revealing that current agents show less than 10% awareness and over 40% norm violations.

๐Ÿง  Weโ€™re
Yu (Bryan) Zhou (@yu_bryan_zhou)

📢 A single line of code to thoroughly evaluate your LLM for Embodied Decision Making 📢 Please check out our new NeurIPS D&B Oral Paper!! (Part 1 of my summer intern work at Stanford Vision and Learning Lab)

Liwei Jiang (@liweijianglw)

I'm thrilled to share that our Delphi paper is officially published today at Nature Machine Intelligence after almost four years of hard work from all my amazing collaborators (a quite insane timeline considering the rapid AI world)! Special thanks to the unwavering support of my advisor,

Zhenhailong Wang (@zhenhailongw)

📱 Current mobile agents struggle with real-world tasks that align with human needs, like finding the best deal across 3 apps. 💸 Introducing Mobile-Agent-E: a novel mobile assistant designed for complex, long-horizon tasks and capable of self-evolving 🐣🐥 through experience. 🧵 1/3

Yuji Zhang (@yuji_zhang_nlp)

๐Ÿ”New findings of knowledge overshadowing! Why do LLMs hallucinate over all true training data? ๐Ÿค”Can we predict hallucinations even before model training or inference? ๐Ÿš€Check out our new preprint: [arxiv.org/pdf/2502.16143] The Law of Knowledge Overshadowing: Towards

๐Ÿ”New findings of knowledge overshadowing! Why do LLMs hallucinate over all true training data? ๐Ÿค”Can we predict hallucinations even before model training or inference? 
๐Ÿš€Check out our new preprint: [arxiv.org/pdf/2502.16143] The Law of Knowledge Overshadowing: Towards
Salman (@salman1422571)

🚨 Excited to share our new paper on X-Teaming!
🤖 Multiagent system for multiturn jailbreaking
96.2% attack success against Claude 3.7 (immune to single-turn attacks!)
💥 Up to 98.1% attack success on leading model
🛡️ Released 30K safety dataset
🧵 below #AI #LLMSafety

Salman (@salman1422571)

🚨 Thrilled to share our new work: AI debate combats misinformation better than single AI advisors! 🤔 We tested if two AIs debating opposite sides helps biased humans judge controversial COVID-19 claims more accurately. Paper: arxiv.org/abs/2506.02175 🧵👇 #AI #Debate

Liwei Jiang (@liweijianglw)

Wondering whether AI debates can drive biased perspectives toward truth? Our answer is YES and this scalable oversight work is now accepted to #NeurIPS2025 ! Finally bringing a large-scale human study into an AI conference! (+++ my first time as a last-ish author is very fun!

Liwei Jiang (@liweijianglw)

(Thu Oct 9, 11:00am-1:00pm) Poster Session 5, Poster #44: X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents; w/ amazing co-leads Salman and James Shiffer. In this work, we introduce a comprehensive and easy-to-run
