Michi Yasunaga (@michiyasunaga) 's Twitter Profile
Michi Yasunaga

@michiyasunaga

Researcher @OpenAI

ID: 1182194132309012481

Link: http://michiyasunaga.github.io · Joined: 10-10-2019 07:23:34

308 Tweets

3.3K Followers

882 Following

Shirley Wu (@shirleyyxwu) 's Twitter Profile Photo

🔥 AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning 🔥 is at NeurIPS 2024! LLM agents are great, but they don't always make the best use of tools! AvaTaR is an automated framework that optimizes an LLM agent’s tool usage for any task. The "magic" lies

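The contrastive idea behind AvaTaR can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the paper's API: compare tool-use attempts that succeeded against ones that failed, and turn the difference into a textual hint appended to the agent's prompt.

```python
# Hedged sketch of contrastive reasoning over tool-use attempts.
# All names (contrast, make_hint) are illustrative, not from the paper.

def contrast(positives, negatives):
    """Return tool names used in successful attempts but never in failed
    ones -- a crude signal for which tools the agent should favor."""
    used_in_pos = set(t for attempt in positives for t in attempt)
    used_in_neg = set(t for attempt in negatives for t in attempt)
    return sorted(used_in_pos - used_in_neg)

def make_hint(positives, negatives):
    """Turn the contrast into a prompt hint for the agent."""
    helpful = contrast(positives, negatives)
    if not helpful:
        return ""
    return "Prefer these tools: " + ", ".join(helpful)
```

A real system would contrast full reasoning traces, not just tool names, but the loop is the same: positive vs. negative attempts in, prompt update out.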
Joon Sung Park (@joon_s_pk) 's Twitter Profile Photo

Simulating human behavior with AI agents promises a testbed for policy and the social sciences. We interviewed 1,000 people for two hours each to create generative agents of them. These agents replicate their source individuals’ attitudes and behaviors. 🧵arxiv.org/abs/2411.10109

Akari Asai (@akariasai) 's Twitter Profile Photo

🚨 I’m on the job market this year! 🚨 I’m completing my Allen School Ph.D. (2025), where I identify and tackle key LLM limitations like hallucinations by developing new models—Retrieval-Augmented LMs—to build more reliable real-world AI systems. Learn more in the thread! 🧵

Marjan Ghazvininejad (@gh_marjan) 's Twitter Profile Photo

Everyone’s talking about synthetic data generation — but what’s the recipe for scaling it without model collapse? 🤔 Meet ALMA: Alignment with Minimal Annotation. We've developed a new technique for generating synthetic data and aligning LLMs that achieves performance close to

John Nguyen (@__johnnguyen__) 's Twitter Profile Photo

🥪New Paper! 🥪Introducing Byte Latent Transformer (BLT), a tokenizer-free model that scales better than BPE-based models, with better inference efficiency and robustness. 🧵

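A minimal sketch of the byte-level input representation a tokenizer-free model like BLT operates on (function names here are illustrative assumptions, not the paper's code): the model consumes raw UTF-8 bytes, so every string has a well-defined input sequence with no vocabulary, and a character-level typo perturbs the sequence only locally.

```python
# Illustrative sketch: byte-level inputs instead of BPE tokens.

def byte_sequence(text):
    """Input sequence for a byte-level model: raw UTF-8 bytes (0..255)."""
    return list(text.encode("utf-8"))

def bytes_changed(a, b):
    """Positions where two byte sequences differ -- a typo changes only a
    few bytes, one source of byte-level models' robustness to noise."""
    sa, sb = byte_sequence(a), byte_sequence(b)
    return sum(x != y for x, y in zip(sa, sb)) + abs(len(sa) - len(sb))
```

Note that non-ASCII characters simply expand to multiple bytes ("é" is two), so there is never an out-of-vocabulary symbol.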
Lili Yu (Neurips24) (@liliyu_lili) 's Twitter Profile Photo

We scaled up Megabyte and ended up with a BLT! A pure byte-level model, it has a steeper scaling law than BPE-based models. With up to 8B parameters, BLT matches Llama 3 on general NLP tasks, and it excels on long-tail data and can manipulate substrings more effectively. The

Weijia Shi (@weijiashi2) 's Twitter Profile Photo

Introducing 𝐋𝐥𝐚𝐦𝐚𝐅𝐮𝐬𝐢𝐨𝐧: empowering Llama 🦙 with diffusion 🎨 to understand and generate text and images in arbitrary sequences. ✨ Building upon Transfusion, our recipe fully preserves Llama’s language performance while unlocking its multimodal understanding and

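The Transfusion-style recipe LlamaFusion builds on can be sketched at a very high level. All names below are assumptions for illustration, not the released code: one sequence interleaves text and image segments, and each position is routed to the matching objective, next-token prediction for text and a diffusion-style denoising loss for image patches.

```python
# Illustrative sketch: route an interleaved multimodal sequence to
# per-modality training objectives (names are assumptions).

def route_losses(sequence):
    """Partition an interleaved sequence into per-objective groups.
    Each element is a (modality, payload) pair."""
    text, image = [], []
    for modality, payload in sequence:
        if modality == "text":
            text.append(payload)      # -> language-modeling loss
        elif modality == "image":
            image.append(payload)     # -> diffusion denoising loss
        else:
            raise ValueError(f"unknown modality: {modality}")
    return text, image

seq = [("text", "A cat"), ("image", "patch_0"), ("image", "patch_1"),
       ("text", "sitting on a mat")]
```

Because both objectives are computed over one shared sequence, the same backbone learns to understand and generate text and images in arbitrary order.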
Junhong Shen (@junhongshen1) 's Twitter Profile Photo

Introducing Content-Adaptive Tokenizer (CAT) 🐈! An image tokenizer that adapts token count based on image complexity, offering flexible 8x, 16x, or 32x compression! Unlike fixed-length tokenizers, CAT optimizes both representation efficiency and quality. Importantly, we use just

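The core idea of content-adaptive tokenization can be sketched in a few lines. The complexity measure and thresholds below are toy assumptions, not CAT's actual scoring model: simple images get the aggressive 32x compression, complex ones keep more tokens at 8x.

```python
# Hedged sketch: pick one of CAT's three fixed compression ratios
# (8x, 16x, 32x) from a per-image complexity score. The score here is a
# toy proxy; the paper uses a learned/principled measure.

def complexity_score(pixels):
    """Toy complexity proxy: mean absolute difference between neighboring
    pixel values (flat 0..255 grayscale list)."""
    if len(pixels) < 2:
        return 0.0
    diffs = [abs(a - b) for a, b in zip(pixels, pixels[1:])]
    return sum(diffs) / len(diffs)

def pick_compression(pixels, low=5.0, high=20.0):
    """Map complexity to a compression ratio (thresholds are assumptions)."""
    s = complexity_score(pixels)
    if s < low:
        return 32   # simple image -> aggressive compression, few tokens
    elif s < high:
        return 16
    return 8        # complex image -> keep more tokens

def token_count(num_pixels, ratio):
    """Tokens produced at a given compression ratio."""
    return max(1, num_pixels // ratio)
```

A flat image thus costs 4x fewer tokens than a noisy one of the same size, which is the efficiency/quality trade-off the tweet describes.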
Marjan Ghazvininejad (@gh_marjan) 's Twitter Profile Photo

As Vision-Language Models (VLMs) grow more powerful, we need better reward models to align them with human intent. But how can we evaluate these reward models effectively? There are many aspects to evaluate: correctness, human preference, reasoning, safety, etc.

Zhaofeng Wu @ ICLR (@zhaofeng_wu) 's Twitter Profile Photo

Robust reward models are critical for alignment/inference-time algos, auto eval, etc. (e.g. to prevent reward hacking which could render alignment ineffective). ⚠️ But we found that SOTA RMs are brittle 🫧 and easily flip predictions when the inputs are slightly transformed 🍃 🧵

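The brittleness probe described above can be sketched as a simple flip test. The reward function and transformations below are toy assumptions standing in for a learned reward model and the paper's input transformations: check whether the model's pairwise preference flips when both inputs are lightly transformed in meaning-preserving ways.

```python
# Illustrative sketch: probe a reward model for brittleness by checking
# whether its pairwise preference flips under small input transformations.

def toy_reward(text):
    """Stand-in for a learned reward model: a brittle heuristic that
    rewards length and is accidentally sensitive to casing."""
    return len(text) + 5 * sum(c.isupper() for c in text)

def preference(rm, a, b):
    """Which response the reward model prefers."""
    return "a" if rm(a) > rm(b) else "b"

def flips_under(rm, a, b, transforms):
    """Count transformations that flip the original preference -- a robust
    reward model should score 0 on meaning-preserving changes."""
    base = preference(rm, a, b)
    return sum(1 for t in transforms
               if preference(rm, t(a), t(b)) != base)

transforms = [
    str.lower,            # normalize case
    lambda s: s + "  ",   # trailing whitespace
    lambda s: " " + s,    # leading whitespace
]
```

Any nonzero flip count on such trivial transformations is exactly the brittleness the thread warns can enable reward hacking.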