 
Qin Liu
@qinliu_nlp
PhD student @UC_Davis | MS & BA @FudanUni | AI safety and Trustworthy LLMs
ID: 4536349159
https://qinliu9.github.io
12-12-2015 08:39:00
48 Tweets
107 Followers
311 Following
 
         
Introducing Search-R1, the first reproduction of DeepSeek-R1 (zero) for training reasoning and search-augmented LLM agents with reinforcement learning! This is a step towards training an open-source OpenAI "Deep…
 
Excited to share MetaScale, our latest work advancing LLM reasoning capabilities! MetaScale empowers GPT-4o to match or even surpass frontier reasoning models like o1, Claude-3.5-Sonnet, and o1-mini on the challenging Arena-Hard benchmark (lmarena.ai). Additionally, MetaScale…
         
ACLRollingReview EMNLP 2025: Urgent help needed. acFZ: initial score 3. Complete silence during discussion. 4am PST, 9 min before deadline: quietly drops to 2 with "Thanks for the rebuttal. I have updated the score." No explanation. No notice. No chance to respond. (0/n)
         
Excited to share that two of my first-author papers were accepted to #EMNLP2025!
1. Code Execution as Grounded Supervision for LLM Reasoning (Main)
2. Familiarity-Aware Evidence Compression for Retrieval-Augmented Generation (Findings)
Huge thanks to my collaborators!
         
![Wenjie Jacky Mo (@wenjie_jacky_mo) on Twitter photo: Worried about backdoors in LLMs?
Check out our #NAACL2025 work on test-time backdoor mitigation!
- Black-box
- Plug-and-play
We explore:
- Defensive Demonstrations
- Self-generated Prefixes
- Self-refinement
arxiv.org/abs/2311.09763
[1/n]](https://pbs.twimg.com/media/GngiT0xbYAAvpDr.jpg)
                         
![Xiaofei Wen (@xiaofei_wen_mk) on Twitter photo: Can LLM guardrails think twice before deciding?
Check out our #ACL2025 paper: THINKGUARD, a critique-augmented safety guardrail!
- Structured critiques
- Interpretable decisions
- Robust against adversarial prompts
arxiv.org/abs/2502.13458
[1/n]](https://pbs.twimg.com/media/Gr-Yb04XEAAA7Vk.jpg)