Yangsibo Huang (@yangsibohuang) 's Twitter Profile
Yangsibo Huang

@yangsibohuang

Research scientist @GoogleAI. Prev: PhD from @Princeton @PrincetonPLI. ML security & privacy. Opinions are my own.

ID: 2878031881

Link: http://hazelsuko07.github.io/yangsibo/ · Joined: 26-10-2014 08:09:08

276 Tweets

3.3K Followers

709 Following

Mengdi Wang (@mengdiwang10) 's Twitter Profile Photo

Princeton University #AI is recruiting Postdoc Fellows in AI for Accelerating Invention! 

Join us if you want to advance generative AI, RL and AI applications in engineering and science! Apply here today:
puwebp.princeton.edu/AcadHire/apply…

<a href="/ryan_p_adams/">Ryan Adams</a> <a href="/jrexnet/">Jennifer Rexford</a> <a href="/EPrinceton/">Princeton Engineering</a> <a href="/Princeton/">Princeton University</a>
Wenting Zhao (@wzhao_nlp) 's Twitter Profile Photo

Can you really train a smart LLM without copyrighted material? There has been hope that a small LM + retrieval might circumvent data requirements. We think this approach is a bit of a mirage: it only improves performance on simple tasks but hurts reasoning capabilities.

Xindi Wu (@cindy_x_wu) 's Twitter Profile Photo

How good is the compositional generation capability of current Text-to-Image models? arxiv.org/abs/2408.14339

Introducing ConceptMix, our new benchmark that evaluates how well models can generate images that accurately combine multiple visual concepts, pushing beyond simple,
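As a rough illustration of the kind of compositional prompt such a benchmark evaluates, here is a minimal Python sketch; the concept pools, prompt template, and function names are hypothetical stand-ins, not the actual ConceptMix data or grading pipeline:

```python
import random

# Hypothetical concept pools; the real ConceptMix categories, prompts,
# and automated grading are described in arxiv.org/abs/2408.14339.
MODIFIERS = {
    "color":   ["red", "blue", "green"],
    "texture": ["fluffy", "metallic", "wooden"],
    "shape":   ["square", "round", "star-shaped"],
}
OBJECTS = ["smartphone", "teapot", "backpack"]

def compose_prompt(k: int, seed: int = 0) -> tuple[str, list[str]]:
    """Build one prompt combining k visual concepts (k-1 modifiers plus 1 object)."""
    rng = random.Random(seed)
    categories = rng.sample(list(MODIFIERS), k - 1)
    modifiers = [rng.choice(MODIFIERS[c]) for c in categories]
    obj = rng.choice(OBJECTS)
    prompt = "a " + ", ".join(modifiers) + " " + obj
    return prompt, modifiers + [obj]

# Example: compose_prompt(4) might yield "a red, fluffy, square smartphone";
# the generated image is then checked for each of the 4 concepts separately.
```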
Ahmad Beirami (@abeirami) 's Twitter Profile Photo

Excellent tips!
1. Always have a 1yr research vision/plan which'll guide what to work on.
2. Go after big/important problems rather than incremental research.
3. If a problem is such that you know someone else is going to crack it in the next 3mos, that's not worth your while.

Leshem Choshen 🤖🤗 (@lchoshen) 's Twitter Profile Photo

Human feedback is critical for aligning LLMs, so why don’t we collect it in the open ecosystem?🧐
We (15 orgs) gathered the key issues and next steps.
Envisioning a community-driven feedback platform, like Wikipedia

alphaxiv.org/abs/2408.16961
🧵
Yangsibo Huang (@yangsibohuang) 's Twitter Profile Photo

Collecting, using, and sharing human feedback on models brings up new privacy and copyright concerns. We discuss these issues and key considerations in Sections 3.6 and 3.7.

Yangsibo Huang (@yangsibohuang) 's Twitter Profile Photo

I recently began exploring how memorization affects model capabilities. E.g., we found that image generation models struggle with prompts that combine more than 3 visual concepts (e.g., "red," "fluffy," "squared," "smartphone") & we attribute this to their training data.

Dawn Song (@dawnsongtweets) 's Twitter Profile Photo

Large Language Model Agents is the next frontier. Really excited to announce our Berkeley course on LLM Agents, also available for anyone to join as a MOOC, starting Sep 9 (Mon) 3pm PT! 📢
Sign up &amp; join us: llmagents-learning.org
Google DeepMind (@googledeepmind) 's Twitter Profile Photo

We’re releasing DataGemma: open models that enhance LLM factuality by grounding them with real-world data from Google's Data Commons. 💡

It tackles hallucinations in AI models to generate more accurate and useful responses.

Here’s how they work 🧵 dpmd.ai/47nWbvK
Xinyun Chen (@xinyun_chen_) 's Twitter Profile Photo

Super glad to see a lot of excitement about our course! Again, huge thanks to Denny Zhou for coming to Berkeley and sharing insights on LLM reasoning!! Please join us for the 2nd lecture, where Shunyu Yao will give an overview of LLM agents and share his thoughts on important directions.

Sadhika Malladi (@sadhikamalladi) 's Twitter Profile Photo

Submit to the Math of Modern Machine Learning (M3L) workshop at NeurIPS 2024! Deadline is Sep 29. sites.google.com/view/m3l-2024/

Bill Yuchen Lin 🤖 (@billyuchenlin) 's Twitter Profile Photo

Both 🍓 o1-mini and o1-preview by OpenAI are on our ZeroEval reasoning leaderboard (Ai2) now! Note that there is a significant improvement on 🦓 ZebraLogic and MATH-L5!

🔗 Link on Hugging Face: hf.co/spaces/allenai…
A. Feder Cooper (@afedercooper) 's Twitter Profile Photo

Exciting announcement! The submission portal for ACM CS+Law '25 is now open! Please send your papers in to this amazing venue at the intersection of computer science and law. CFP: computersciencelaw.org/2025 Submit: cslaw25.hotcrp.com cc The GenLaw Center

FAR.AI (@farairesearch) 's Twitter Profile Photo

"Please learn from our mistakes. Don't do exactly the same things that we did, or you'll end up in ten years with having nothing to show for it." — Nicholas Carlini urging AI researchers to avoid the pitfalls of past adversarial ML research at the Vienna Alignment Workshop 2024.

Yuntian Deng (@yuntiandeng) 's Twitter Profile Photo

Is OpenAI's o1 a good calculator? We tested it on up to 20x20 multiplication—o1 solves up to 9x9 multiplication with decent accuracy, while gpt-4o struggles beyond 4x4. For context, this task is solvable by a small LM using implicit CoT with stepwise internalization. 1/4
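For a sense of what such an evaluation could look like, here is a minimal Python sketch (hypothetical; not the authors' test harness, and the prompt wording and helper names are illustrative) that builds random n-digit by n-digit multiplication problems and checks a model's reply against exact arithmetic:

```python
import random
import re

def make_problem(n_digits: int, seed: int = 0) -> tuple[str, int]:
    """Return a prompt asking for an n-digit x n-digit product, plus the exact answer."""
    rng = random.Random(seed)
    lo, hi = 10 ** (n_digits - 1), 10 ** n_digits - 1
    a, b = rng.randint(lo, hi), rng.randint(lo, hi)
    return f"What is {a} * {b}? Answer with the number only.", a * b

def is_correct(model_output: str, answer: int) -> bool:
    """Take the last integer in the model's reply and compare it to the exact product."""
    numbers = re.findall(r"\d[\d,]*", model_output)
    return bool(numbers) and int(numbers[-1].replace(",", "")) == answer

# Usage sketch: sweep n_digits from 1 to 20, query the model on many problems
# per size, and report accuracy for each (n_digits x n_digits) cell.
prompt, answer = make_problem(9, seed=1)
print(prompt, answer)
```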

Robin Jia (@robinomial) 's Twitter Profile Photo

Really excited about this new workshop we’re proposing for *CL! Memorization of training data is both fascinating to analyze and has a wide range of legal/privacy/benchmarking/social implications. Please vote if you’re interested!

Yangsibo Huang (@yangsibohuang) 's Twitter Profile Photo

I'm super excited about the *CL workshop we're planning to organize on LLM memorization, & its implications for compliance (privacy/copyright) and capabilities (evaluation/generalization). Plz help by RT and voting in the thread!