Yiwei Wang (@wangyiw33973985) 's Twitter Profile
Yiwei Wang

@wangyiw33973985

Postdoc at UCLA CS @UCLA, @UCLANLP

ID: 1555047236240752640

linkhttps://wangywust.github.io/ calendar_today04-08-2022 04:25:36

39 Tweet

118 Takipรงi

70 Takip Edilen

Chaowei Xiao (@chaoweix) 's Twitter Profile Photo

๐Ÿš€ Introducing AutoDAN, a method that automatically generates SEMANTICALLY MEANINGFUL #Jailbreak prompts for #redteaming aligned #LLMs . arxiv: arxiv.org/pdf/2310.04451โ€ฆ

๐Ÿš€ Introducing AutoDAN, a  method that automatically generates SEMANTICALLY MEANINGFUL #Jailbreak prompts for #redteaming aligned #LLMs .
arxiv: arxiv.org/pdf/2310.04451โ€ฆ
Kai-Wei Chang (@kaiwei_chang) 's Twitter Profile Photo

I'm grateful for the opportunity to serve as SIGDAT officer. Thank you all for the support and ideas contributing to improving our community. ๐Ÿฅฐ sigdat.org/organization

Haoyi Qiu (@haoyiqiu) 's Twitter Profile Photo

๐Ÿ”ฅ Unlocking the power of Abstract Meaning Representations, AMRFact generates coherent, factually inconsistent summaries with high error-type coverage to improve the factuality evaluation on abstractive summarization! ๐Ÿ“ฃ Check out our new #NAACL2024๐Ÿ‡ฒ๐Ÿ‡ฝwork: arxiv.org/abs/2311.09521

๐Ÿ”ฅ Unlocking the power of Abstract Meaning Representations, AMRFact generates coherent, factually inconsistent summaries with high error-type coverage to improve the factuality evaluation on abstractive summarization!
๐Ÿ“ฃ Check out our new #NAACL2024๐Ÿ‡ฒ๐Ÿ‡ฝwork: arxiv.org/abs/2311.09521
Violet Peng (@violetnpeng) 's Twitter Profile Photo

Proud of this work where we show event detection can generalize from one epidemic to another by identifying epidemic-related event types (e.g. symptoms) even if the actual mentions (e.g. of symptoms) are distinctive. So training on COVID, we can generalize to Monkeypox! #NAACL24

Yu Yang (@yuyang_i) 's Twitter Profile Photo

Excited about training on synthetic data? Different stages of training might need different synthetic data! ๐Ÿง ๐Ÿ’ก Check out our #ICLR2024 paper on Progressive Dataset Distillation (PDD๐Ÿ˜‰) at PS#2 Halle B#9! It tailors synthetic data to each training stage for better performance!

๐ŸŒดMuhao Chen๐ŸŒด (@muhao_chen) 's Twitter Profile Photo

I have to miss #ICLR2024 due to teaching, but would still like to shoutout for our UniversalNER. This is an extremely strong open NER system that provides precise recognition in many domains and for any new entity types (outperforming ChatGPT by 9% on 43 NER datasets across 9

Yiwei Wang (@wangyiw33973985) 's Twitter Profile Photo

๐ŸŽณJailbreak of "Safe" Large Language Models is as simple as a string replacement. ๐Ÿ’ฅOur recent research finds that specifying an output prefix of LLMs will make the jailbreak easy and effective. researchsquare.com/article/rs-438โ€ฆ

๐ŸŽณJailbreak of "Safe" Large Language Models is as simple as a string replacement. 

๐Ÿ’ฅOur recent research finds that specifying an output prefix of LLMs will make the jailbreak easy and effective.

researchsquare.com/article/rs-438โ€ฆ
Axel Darmouni (@adarmouni) 's Twitter Profile Photo

Jailbreaking through asking for coherent outputs ๐Ÿงต๐Ÿ“– Read of the day, day 49: Frustratingly Easy Jailbreak of Large Language Models via Output Prefix Attacks, by Yiwei Wang et al from UCLA Yet another way of breaking through LLMโ€™s defense systems found. This idea is

Jailbreaking through asking for coherent outputs

๐Ÿงต๐Ÿ“– Read of the day, day 49: Frustratingly Easy Jailbreak of Large Language Models via Output Prefix Attacks, by <a href="/wangyiw33973985/">Yiwei Wang</a> et al from UCLA

Yet another way of breaking through LLMโ€™s defense systems found.

This idea is
Byron (@byron52238498) 's Twitter Profile Photo

๐Ÿš€Excited to share our latest research on knowledge editing (KE) in large language models! We unveil a novel approach, DeCK, which enhances in-context editing (ICE) by addressing stubborn knowledge that is tough to edit. DeCK boosts ICE performance by up to 219% on MQuAKE! ๐Ÿš€

๐Ÿš€Excited to share our latest research on knowledge editing (KE) in large language models! We unveil a novel approach, DeCK, which enhances in-context editing (ICE) by addressing stubborn knowledge that is tough to edit. DeCK boosts ICE performance by up to 219% on MQuAKE! ๐Ÿš€
Wenxuan Zhou (@wenxuanzhou_96) 's Twitter Profile Photo

Introducing WPO: Enhancing RLHF with Weighted Preference Optimization ๐ŸŒŸ Our new preference optimization method reweights preference data to simulate on-policy preference optimization using off-policy data, combining efficiency with high performance. โœ… up to 5.6% better than

Fei Wang (@fwang_nlp) 's Twitter Profile Photo

๐ŸŒŸ ๐Œ๐ฎ๐ฅ๐ญ๐ข๐ฆ๐จ๐๐š๐ฅ ๐ƒ๐๐Ž๐ŸŒŸ ๐Ÿ” DPO over-prioritizes language-only preference ๐Ÿš€ Introducing mDPO: optimizes image-conditioned preference ๐Ÿ† Best 3B MLLM with reduced hallucination, beats LLaVA 7/13B with DPO Collaboration with Microsoft Research huggingface.co/papers/2406.11โ€ฆ

๐ŸŒŸ ๐Œ๐ฎ๐ฅ๐ญ๐ข๐ฆ๐จ๐๐š๐ฅ ๐ƒ๐๐Ž๐ŸŒŸ
๐Ÿ” DPO over-prioritizes language-only preference
๐Ÿš€ Introducing mDPO: optimizes image-conditioned preference
๐Ÿ† Best 3B MLLM with reduced hallucination, beats LLaVA 7/13B with DPO

Collaboration with <a href="/MSFTResearch/">Microsoft Research</a>  

huggingface.co/papers/2406.11โ€ฆ
AIDB (@ai_database) 's Twitter Profile Photo

LLM็ ”็ฉถ่€…ๆœฌไบบใซใ‚ˆใ‚‹่งฃ่ชฌ่จ˜ไบ‹ใŒๅ‡บใพใ—ใŸใ€‚ ai-data-base.com/archives/72359 ไธญๅ›ฝ็ง‘ๅญฆ้™ขๅคงๅญฆ๏ผˆUniversity of Chinese Academy of Sciences๏ผ‰ใฎBaolong Biๆฐใ‚‰ใซใ‚ˆใ‚‹ใ€Œ้ ‘ๅ›บใช็Ÿฅ่ญ˜ใ€ใฎ็ทจ้›†ใ‚ขใƒ—ใƒญใƒผใƒใซ้–ขใ™ใ‚‹่ซ–ๆ–‡ใงใ™ใ€‚

Andrew (@andrewmichaelio) 's Twitter Profile Photo

๐Ÿšจ NEW OPENAI MODEL: o1 โ€œo1 spends more time thinking before it responds. In a qualifying exam for the International Mathematics Olympiad (IMO), GPT-4o correctly solved only 13% of problems, while the reasoning model scored 83%. Their coding abilities were evaluated in

Zhengzhong Tu (@_vztu) 's Twitter Profile Photo

๐ŸšจVision-Language Models (VLMs) are truly amazing. Ever wonder if their visual and textual "brains" always agree? I am excited to share our latest paper, where we tackle a critical challenge in VLMs, dubbed the ๐œ๐ซ๐จ๐ฌ๐ฌ-๐ฆ๐จ๐๐š๐ฅ๐ข๐ญ๐ฒ ๐ฉ๐š๐ซ๐š๐ฆ๐ž๐ญ๐ซ๐ข๐œ ๐ค๐ง๐จ๐ฐ๐ฅ๐ž๐๐ ๐ž

๐ŸšจVision-Language Models (VLMs) are truly amazing. Ever wonder if their visual and textual "brains" always agree? I am excited to share our latest paper, where we tackle a critical challenge in VLMs, dubbed the ๐œ๐ซ๐จ๐ฌ๐ฌ-๐ฆ๐จ๐๐š๐ฅ๐ข๐ญ๐ฒ ๐ฉ๐š๐ซ๐š๐ฆ๐ž๐ญ๐ซ๐ข๐œ ๐ค๐ง๐จ๐ฐ๐ฅ๐ž๐๐ ๐ž
Bingxuan Li (@bingxuan_l) 's Twitter Profile Photo

โš™๏ธ Introducing METAL! A multi-agent framework to generate charts that precisely replicate visual details in the reference. ๐Ÿ“ˆ We show that test-time scaling with the multi-agent system can bring 5.2% gain over the current best result on ChatMIMIC! ๐ŸŒ metal-chart-generation.github.io

โš™๏ธ Introducing METAL! A multi-agent framework to generate charts that precisely replicate visual details in the reference.

๐Ÿ“ˆ We show that test-time scaling with the multi-agent system can bring 5.2% gain over the current best result on ChatMIMIC!

๐ŸŒ metal-chart-generation.github.io
Yiwei Wang (@wangyiw33973985) 's Twitter Profile Photo

Sharing a new GitHub repository for collecting and sharing papers on the emerging topic of Context Engineering, which has seen broad adoption in industry: github.com/Meirtz/Awesomeโ€ฆ A corresponding survey paper is also coming soon. Thanks for reading! โ˜€๏ธ

Sharing a new GitHub repository for collecting and sharing papers on the emerging topic of Context Engineering, which has seen broad adoption in industry:

github.com/Meirtz/Awesomeโ€ฆ

A corresponding survey paper is also coming soon. Thanks for reading! โ˜€๏ธ
Yiwei Wang (@wangyiw33973985) 's Twitter Profile Photo

๐Ÿ“„ A Survey of Context Engineering for Large Language Models ๐Ÿง  arXiv link: arxiv.org/abs/2507.13334 Context Engineering is the art of generating, acquiring, processing, and managing contextual information for language model agents. ๐Ÿ“š GitHub project: github.com/Meirtz/Awesomeโ€ฆ

๐Ÿ“„ A Survey of Context Engineering for Large Language Models
๐Ÿง  arXiv link: arxiv.org/abs/2507.13334
Context Engineering is the art of generating, acquiring, processing, and managing contextual information for language model agents.

๐Ÿ“š GitHub project: github.com/Meirtz/Awesomeโ€ฆ