Zekun Li (@zekunli0323)'s Twitter Profile
Zekun Li

@zekunli0323

CS Ph.D. student @UCSBNLP, intern at #GoogleGemini, #MSFTResearch, Meta #RealityLabs, interested in #NLProc, #LLM, and #AI4HScience

ID: 1327982489118228485

Joined: 15-11-2020 14:31:19

92 Tweets

1.1K Followers

746 Following

Canyu Chen (@canyuchen3)

🤔 Are your open-source LLMs really safe? 🚨 They may be injected with misinformation or bias! Our new paper "Can Editing LLMs Inject Harm?" (Project website: llm-editing.github.io) sheds light on the emerging challenges of LLMs, especially the

Zekun Li (@zekunli0323)

👉 Check out our new paper on injecting misinformation and bias into LLMs via knowledge editing, a new type of safety threat: the editing threat. 🧐 We found that: (1) Editing attacks can inject both commonsense and long-tail misinformation into LLMs. (2) Editing attack can

Jerry Liu (@jerryjliu0)

I made a multi-agent system for multimodal retrieval and report generation 🎨 - check it out! 👇 Have talked to a lot of users recently that are interested in using agents to build the final document instead of getting back a chatbot response. There's a general feeling that this

Sherry Yang (@sherryyangml)

Check out Generative Hierarchical Materials Search (GenMS) – a framework for generating crystal structures from natural language. Website: generative-materials.github.io Paper: arxiv.org/abs/2409.06762

Xin Eric Wang @ ICLR 2025 (@xwang_lk)

🚀 Since its invention, the mouse has been our way to control computers. But what if it didn’t have to be? 🤔 Thrilled to introduce Agent S, a new state-of-the-art GUI agent framework that interacts with computers just like a human and takes on the toughest automation challenges.

Wenda Xu (@wendaxu2)

I am on the job market for full-time industry positions. My research focuses on text generation evaluation and LLM alignment. If you have relevant positions, I’d love to connect! Here is a list of my publications and a summary of my research:

Antonis Antoniades (@anton_iades)

🧑‍💻 Human software engineers constantly re-evaluate their approaches through experience. 🤖 However, LLM-based software agents can often get stuck in ineffective dead ends. Introducing SWE-Search: a multi-agent framework integrating search and self-refinement to enable software

Kexun Zhang @ ICLR 2025 (@kexun_zhang)

Everyone talks about scaling inference compute after o1. But how exactly should we do that? We studied compute allocation for sampling -- a basic operation in most LLM meta-generators, and found that optimized allocation can save as much as 128x compute! arxiv.org/abs/2410.22480

Ming Yin (@mingyin_0312)

I'm on the academic job market this year! My research centers around applying sequential decision-making techniques to build more efficient and reliable real-world AI systems. My work spans theory, methodology, and applications in RL/AI. If you're hiring, I'd love to connect!

Huan Sun (OSU) (@hhsun1)

The field of “Agents” is expanding rapidly, making it challenging to keep up with the latest developments. We’ve compiled a list of awesome papers in three subareas to help the community: 🌟 GUI Agents: github.com/OSU-NLP-Group/… (the most crowded subarea), led by Boyu Gou

Wenhu Chen (@wenhuchen)

I spent the weekend reading some recent great math+reasoning papers: 1. AceMath (arxiv.org/abs/2412.15084) 2. rStar-Math (arxiv.org/pdf/2501.04519) 3. PRIME (arxiv.org/abs/2412.01981) Here are some of my naive thoughts! It could be wrong. All of these papers are showing possible

elvis (@omarsar0)

Large Language Diffusion Models (LLaDA) Proposes a diffusion-based approach that can match or beat leading autoregressive LLMs in many tasks. If true, this could open a new path for large-scale language modeling beyond autoregression. More on the paper: Questioning

Mingyang Chen (@chen_mingyang)

🌟 Introducing ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning. An open-source project that combines RL and RAG for LLMs! 💡 Like Deepseek-R1-Zero and Deep Research, we start with pretrained models and use RL to empower them with the

Zekun Li (@zekunli0323)

🚀 Introducing MassGen, our new Multi-Agent Scaling System! Inspired by Grok Heavy & Gemini Deep Think, MassGen enables: 🧠 Parallel processing 🔗 Intelligence sharing 🔁 Iterative refinement ✅ Cross-model consensus (across Google AI, OpenAI, @xAI agents and more) Check it