Scott Yih (@scottyih)'s Twitter Profile
Scott Yih

@scottyih

Research Scientist at Facebook AI Research (FAIR)

ID: 52565429

Website: http://scottyih.org · Joined: 30-06-2009 23:53:20

149 Tweets

1.1K Followers

787 Following

AI at Meta (@aiatmeta):

Newly published work from FAIR, Chameleon: Mixed-Modal Early-Fusion Foundation Models.

This research presents a family of early-fusion token-based mixed-modal models capable of understanding & generating images & text in any arbitrary sequence.

Paper ➡️ go.fb.me/7rb19n
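
The early-fusion recipe amounts to one autoregressive transformer over a single merged token stream. Below is a minimal sketch of that idea, assuming a hypothetical VQ image tokenizer has already mapped the image to discrete codes; the vocabulary sizes and sentinel tokens are illustrative, not Chameleon's actual configuration:

```python
import torch
import torch.nn as nn

VOCAB_TEXT = 32_000   # assumed text vocabulary size
VOCAB_IMAGE = 8_192   # assumed image codebook size
BOI = VOCAB_TEXT + VOCAB_IMAGE   # begin-of-image sentinel
EOI = BOI + 1                    # end-of-image sentinel

def interleave(text_ids: torch.Tensor, image_ids: torch.Tensor) -> torch.Tensor:
    """Merge both modalities into one flat token sequence; image ids are
    offset past the text vocabulary so one embedding table covers both."""
    return torch.cat([
        text_ids,
        torch.tensor([BOI]),
        image_ids + VOCAB_TEXT,
        torch.tensor([EOI]),
    ])

embed = nn.Embedding(VOCAB_TEXT + VOCAB_IMAGE + 2, 256)
decoder = nn.TransformerEncoder(  # encoder layers + causal mask = decoder-only
    nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True),
    num_layers=2,
)

seq = interleave(
    torch.randint(0, VOCAB_TEXT, (16,)),   # "a caption"
    torch.randint(0, VOCAB_IMAGE, (64,)),  # e.g. an 8x8 grid of VQ codes
)
mask = nn.Transformer.generate_square_subsequent_mask(seq.numel())
hidden = decoder(embed(seq).unsqueeze(0), mask=mask)
print(hidden.shape)  # torch.Size([1, 82, 256])
```
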
Srini Iyer (@sriniiyer88):

Excited to release our work from last year showcasing a stable training recipe for fully token-based multi-modal early-fusion auto-regressive models! arxiv.org/abs/2405.09818 Huge shout out to Armen Aghajanyan, Ramakanth, Luke Zettlemoyer, Gargi Ghosh, and the other co-authors. (1/n)

Yu Su @ICLR2025 (@ysu_nlp):

Super excited to introduce HippoRAG, the method I enjoyed developing most in 2024. It’s led by my amazing student Bernal Jiménez, joint with Yiheng Shu, Yu Gu, and Michi Yasunaga. Bernal’s thread gives a good technical account, so I’ll just share some personal thoughts.

Minghan (@alexlimh23):

Curious about enhancing factuality and attribution in LLM generation? Check out our paper: arxiv.org/abs/2405.19325

Introducing NEST🪺: Nearest Neighbor Speculative Decoding for LLM Generation and Attribution, a training-free method that adds real-world texts into LLM generation.
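
A toy-scale sketch of the nearest-neighbor speculative decoding idea follows, assuming a greedy "accept while the base LM agrees" rule. The corpus, the bigram stand-in for the base LLM, and the span length are all illustrative, not the paper's algorithm:

```python
from collections import defaultdict

corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Token -> positions index, standing in for a real nearest-neighbor datastore.
index = defaultdict(list)
for i, tok in enumerate(corpus):
    index[tok].append(i)

# Toy base "LM": bigram argmax over the same corpus.
bigram = defaultdict(lambda: defaultdict(int))
for a, b in zip(corpus, corpus[1:]):
    bigram[a][b] += 1

def lm_next(tok):
    cands = bigram[tok]
    return max(cands, key=cands.get) if cands else None

def generate(prompt, steps=8, span=3):
    out = list(prompt)
    while len(out) < len(prompt) + steps:
        positions = index.get(out[-1])
        if not positions:
            break
        start = positions[0] + 1
        draft = corpus[start:start + span]   # retrieved real-text draft span
        accepted = 0
        for tok in draft:                    # keep the prefix the LM agrees with
            if lm_next(out[-1]) == tok:
                out.append(tok)
                accepted += 1
            else:
                break
        if accepted == 0:                    # fall back to one normal LM step
            nxt = lm_next(out[-1])
            if nxt is None:
                break
            out.append(nxt)
    return out

print(" ".join(generate(["the", "cat"])))
```

Accepted spans come verbatim from the datastore, which is what makes attribution natural: each copied span points back to its source text.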

Gargi Ghosh (@gargighosh):

Open-sourcing Chameleon! Our work from last year: an early-fusion multimodal foundation model. We are releasing multimodality in the input with text generation in the output (though the model was trained to generate both text and images).

AI at Meta (@aiatmeta):

Last week we released Meta Chameleon: a new mixed-modal research model from Meta FAIR. Get the models ➡️ go.fb.me/4m87kk

The 7B & 34B safety-tuned models we’ve released can take any combination of text and images as input and produce text outputs using a new early…

Asli Celikyilmaz (@real_asli):

🚀💡We're hiring interns for 2025 at FAIR @ AI at Meta. Work on cutting-edge projects: social reasoning, alignment, interaction, multi-agent communication & more with text/multimodal LLMs. Apply now! 🔗metacareers.com/jobs/119904986…

AI at Meta (@aiatmeta):

New from Meta FAIR — Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness.

Paper ➡️ go.fb.me/w23lmz
Gargi Ghosh (@gargighosh):

We released new research: Byte Latent Transformer (BLT).
BLT encodes bytes into dynamic patches using lightweight local models and processes them with a large latent transformer. Think of it as a transformer sandwich!
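
A minimal sketch of entropy-driven dynamic patching, assuming a unigram byte model as a stand-in for BLT's small local model; the threshold is arbitrary, and real BLT learns patch boundaries with a trained local encoder:

```python
import math
from collections import Counter

def byte_surprisal(data: bytes) -> list[float]:
    """Per-byte surprisal under a unigram model of the same data."""
    counts = Counter(data)
    total = len(data)
    return [-math.log2(counts[b] / total) for b in data]

def patch(data: bytes, threshold: float = 5.0) -> list[bytes]:
    """Open a new patch wherever the next byte is 'surprising'."""
    surprisal = byte_surprisal(data)
    patches, start = [], 0
    for i in range(1, len(data)):
        if surprisal[i] > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

text = b"the quick brown fox jumps over the lazy dog"
print(patch(text))
# Rare bytes (q, x, z, ...) open new patches while common bytes are grouped,
# so the large latent transformer processes far fewer units than raw bytes.
```
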
Gargi Ghosh (@gargighosh):

Last one of the year - EWE: arxiv.org/pdf/2412.18069
Ewe (Explicit Working Memory) enhances factuality in long-form text generation by integrating a working memory that receives real-time feedback from external resources.
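
A minimal sketch of the loop that abstract describes, with hypothetical stand-ins: draft_next() for the LLM call and verify() for retrieval-based fact-checking against external resources:

```python
def draft_next(prompt: str, memory: list[str]) -> str:
    """Stand-in for an LLM call conditioned on the prompt + working memory."""
    return f"[next passage, conditioned on {len(memory)} memory entries]"

def verify(passage: str) -> list[str]:
    """Stand-in for retrieval + verification; returns corrective evidence."""
    return [f"evidence about: {passage[:24]}..."]

def generate_longform(prompt: str, num_chunks: int = 3) -> str:
    memory: list[str] = []                # the explicit working memory
    chunks = []
    for _ in range(num_chunks):
        passage = draft_next(prompt, memory)
        memory.extend(verify(passage))    # real-time feedback refreshes it
        chunks.append(passage)
    return " ".join(chunks)

print(generate_longform("Write a biography of Ada Lovelace."))
```
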
AI at Meta (@aiatmeta):

New research from Meta FAIR — Memory Layers at Scale. This work takes memory layers beyond proof-of-concept, proving their utility at contemporary scale ➡️ go.fb.me/3lbt4m
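
For flavor, here is a naive sketch of a trainable key-value memory layer, assuming plain top-k lookup over learned keys. (Scaled memory layers typically rely on tricks like product keys to search millions of slots cheaply; none of that is shown here.)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryLayer(nn.Module):
    def __init__(self, d_model: int = 64, n_slots: int = 1024, k: int = 4):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(n_slots, d_model))
        self.values = nn.Parameter(torch.randn(n_slots, d_model))
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, d_model)
        scores = x @ self.keys.T                  # (batch, n_slots)
        top, idx = scores.topk(self.k, dim=-1)    # only k slots are touched
        weights = F.softmax(top, dim=-1)          # (batch, k)
        chosen = self.values[idx]                 # (batch, k, d_model)
        return (weights.unsqueeze(-1) * chosen).sum(dim=1)

layer = MemoryLayer()
print(layer(torch.randn(2, 64)).shape)  # torch.Size([2, 64])
```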

Rulin Shao (@rulinshao):

Meet ReasonIR-8B✨the first retriever specifically trained for reasoning tasks! Our challenging synthetic training data unlocks SOTA scores on reasoning IR and RAG benchmarks. ReasonIR-8B ranks 1st on BRIGHT and outperforms search engine and retriever baselines on MMLU and GPQA🔥
Jason Weston (@jaseweston):

🌿Introducing MetaCLIP 2 🌿
📝: arxiv.org/abs/2507.22062
code, model: github.com/facebookresear…

After four years of advancements in English-centric CLIP development, MetaCLIP 2 is now taking the next step: scaling CLIP to worldwide data. The effort addresses long-standing…
Jason Weston (@jaseweston):

...is today a good day for new paper posts? 
🤖Learning to Reason for Factuality 🤖
📝: arxiv.org/abs/2508.05618
- New reward func for GRPO training of long CoTs for *factuality*
- Design stops reward hacking by favoring precision, detail AND quality
- Improves base model across…
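
A hedged sketch of a composite reward in the spirit of those bullets: pay for verified precision, detail, and overall quality together, so vague-but-safe or verbose-but-sloppy answers can't game the score. The weights and the claim verifier are assumptions, not the paper's design:

```python
def factuality_reward(claims: list[str], supported: list[bool],
                      quality: float,
                      w_prec: float = 1.0, w_detail: float = 0.1,
                      w_qual: float = 0.5) -> float:
    precision = sum(supported) / len(claims) if claims else 0.0
    detail = len(claims)           # how many atomic claims were extracted
    return w_prec * precision + w_detail * detail + w_qual * quality

# A short, safe answer scores below a detailed answer that stays precise:
print(factuality_reward(["c1"], [True], quality=0.8))                        # 1.5
print(factuality_reward(["c1", "c2", "c3", "c4"], [True] * 4, quality=0.8))  # 1.8
```
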
Jessy Lin (@realjessylin):

🔍 How do we teach an LLM to 𝘮𝘢𝘴𝘵𝘦𝘳 a body of knowledge?

In new work with AI at Meta, we propose Active Reading 📙: a way for models to teach themselves new things by self-studying their training data. Results:

* 𝟔𝟔% on SimpleQA w/ an 8B model by studying the wikipedia…
Gargi Ghosh (@gargighosh):

New research from FAIR - Active Reading: a framework to learn a given set of material with self-generated learning strategies, for both general and expert domains (such as finance). Models absorb significantly more knowledge than with vanilla finetuning and the usual data augmentation strategies.
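
A minimal sketch of the self-study loop the two posts describe. The strategy prompts are hypothetical and llm() is a stand-in for a real model call; the generated material would then be added to the finetuning set:

```python
STRATEGIES = [
    "Paraphrase the passage in your own words.",
    "Write three question-answer pairs about the passage.",
    "Summarize the key facts as bullet points.",
]

def llm(prompt: str) -> str:
    """Stand-in for an actual LLM call."""
    return f"[model output for: {prompt[:40]}...]"

def active_reading(document: str) -> list[str]:
    """Self-study one document several ways; return new training examples."""
    return [llm(f"{strategy}\n\n{document}") for strategy in STRATEGIES]

for example in active_reading("Marie Curie won two Nobel Prizes..."):
    print(example)
```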

Zhepei Wei ✈️ ICLR 2025 (@weizhepei):

🤔Ever wondered why your post-training methods (SFT/RL) make LLMs reluctant to say “I don't know”?

🤩Introducing TruthRL — a truthfulness-driven RL method that significantly reduces hallucinations while achieving accuracy and proper abstention!

📃arxiv.org/abs/2509.25760
🧵[1/n]
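
A hedged sketch of a truthfulness-driven reward in the spirit of the post: reward correct answers, keep abstention neutral so it beats guessing, and penalize hallucinations. The exact values and the string-match "judge" are assumptions for illustration:

```python
ABSTAIN = {"i don't know", "unknown", "not sure"}

def truth_reward(answer: str, gold: str) -> float:
    a = answer.strip().lower()
    if a in ABSTAIN:
        return 0.0   # honest abstention: neutral, never punished like an error
    return 1.0 if a == gold.strip().lower() else -1.0  # wrong = hallucination

print(truth_reward("Paris", "paris"))         #  1.0
print(truth_reward("I don't know", "paris"))  #  0.0
print(truth_reward("London", "paris"))        # -1.0
```
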
Hritik Bansal (@hbxnov):

New paper 📢 Most powerful vision-language (VL) reasoning datasets remain proprietary 🔒, hindering efforts to study their principles and develop similarly effective datasets in the open 🔓. 

Thus, we introduce HoneyBee, a 2.5M-example dataset created through careful data…