JIE GAO (@jerrygaodextrys) 's Twitter Profile
JIE GAO

@jerrygaodextrys

Researcher in NLP/text analysis, semantic/content technology, misinformation/disinformation; retweets are bookmarks for myself; Husband, father; reasonable cook

ID: 38875070

Link: https://jerrygaolondon.github.io/ · Joined: 09-05-2009 15:54:30

916 Tweets

212 Followers

977 Following

Google AI (@googleai) 's Twitter Profile Photo

Today on the blog we introduce a notion of sufficient context to examine retrieval augmented generation (RAG) systems, developing a method to classify instances, analyzing failures of RAG systems & proposing a way to reduce hallucinations. Read more →goo.gle/43gp3Vk

Omar Khattab (@lateinteraction) 's Twitter Profile Photo

Google folks continue to do awesome late interaction work. Compared to vanilla ColBERT, a version of the new CRISP "achieves an 11x reduction in the number of vectors—with only a 3.6% quality loss".

JIE GAO (@jerrygaodextrys) 's Twitter Profile Photo

Beyond simple Q/A pairs or triplet-based data, it creates complex synthetic data for end-to-end RAG components, covering both single- & multi-hop queries with varying logic and cross-document & cross-sentence "clues".
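A hypothetical shape for one such multi-hop instance (field names are illustrative, not the paper's actual schema): the answer requires chaining a "clue" from each of two documents.

```python
# Illustrative multi-hop synthetic instance: answering the query requires
# combining one clue from each source document.
docs = {
    "doc_1": "Marie Curie was born in Warsaw.",
    "doc_2": "Warsaw is the capital of Poland.",
}
instance = {
    "query": "In which country's capital was Marie Curie born?",
    "hops": [
        {"doc": "doc_1", "clue": "Marie Curie was born in Warsaw."},
        {"doc": "doc_2", "clue": "Warsaw is the capital of Poland."},
    ],
    "answer": "Poland",
}

# Sanity check: every clue must actually appear in its source document.
assert all(hop["clue"] in docs[hop["doc"]] for hop in instance["hops"])
```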

tomaarsen (@tomaarsen) 's Twitter Profile Photo

Wow, ColBERT/Late Interaction/Multi-vector search models just keep on winning right now. State-of-the-art on several domains on the BRIGHT benchmark (reasoning-intensive retrieval) with just 149M parameters, while outperforming 8B ones. Trained in 2 hours. Wild!

tomaarsen (@tomaarsen) 's Twitter Profile Photo

Relabeling datasets for Information Retrieval improves NDCG@10 of both embedding models & cross-encoder rerankers. This was already the prevalent belief, but now it's been confirmed. Great job Nandan Thakur, Crystina Zhang, Xueguang Ma & Jimmy Lin

Charlie Marsh (@charliermarsh) 's Twitter Profile Photo

You can set `UV_TORCH_BACKEND=auto` and uv will automatically install the right CUDA-enabled PyTorch for your machine, zero configuration

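In practice that's just an environment variable (a sketch of typical usage; the variable name is from the tweet, and uv resolves CUDA, ROCm, or CPU-only builds depending on the machine):

```shell
# Let uv detect the local accelerator and pick the matching PyTorch index.
export UV_TORCH_BACKEND=auto
uv pip install torch
```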
JIE GAO (@jerrygaodextrys) 's Twitter Profile Photo

Based on simple yet effective semantic selection criteria: 1. negatives closer to the query than the positives; 2. yet far enough from the positive to avoid noise. Uses clustering and dimensionality reduction to do negative sampling at scale.
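A minimal sketch of the two filtering criteria (function name, margin, and toy vectors are my own, not from the paper; the clustering/dimensionality-reduction step for scale is omitted):

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_hard_negatives(query, positive, candidates, margin=0.1):
    """Keep candidates that are (1) at least as query-similar as the positive,
    but (2) dissimilar enough from the positive to be unlikely false negatives."""
    pos_sim = cosine(query, positive)
    selected = []
    for cand in candidates:
        query_close = cosine(query, cand) >= pos_sim          # criterion 1
        not_a_false_neg = cosine(positive, cand) < 1.0 - margin  # criterion 2
        if query_close and not_a_false_neg:
            selected.append(cand)
    return selected

query = np.array([1.0, 0.0])
positive = np.array([0.8, 0.6])
hard = np.array([1.0, 0.05])      # near the query, distinct from the positive
false_neg = positive.copy()       # rejected: too close to the positive
easy = np.array([-1.0, 0.0])      # rejected: too far from the query

selected = select_hard_negatives(query, positive, [hard, false_neg, easy])
assert len(selected) == 1 and np.allclose(selected[0], hard)
```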

Sumit (@_reachsumit) 's Twitter Profile Photo

Towards Better Instruction Following Retrieval Models

Yuchen Zhuang et al. introduce a large-scale training corpus with over 38,000 instruction-query-passage triplets for enhancing retrieval models in instruction-following IR
📝 arxiv.org/abs/2505.21439
👨🏽‍💻 huggingface.co/datasets/InF-I…

elvis (@omarsar0) 's Twitter Profile Photo


New Lens on RAG Systems

RAG systems are more brittle than you think, even when provided sufficient context.

Great work from Google and collaborators.

Good tips for devs included.

Here are my notes:
Ravid Shwartz Ziv (@ziv_ravid) 's Twitter Profile Photo


You know all those arguments that LLMs think like humans? Turns out it's not true.

🧠 In our paper "From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning" we test it by checking whether LLMs form concepts the same way humans do. Yann LeCun, Chen Shani, Dan Jurafsky
Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

There is no AI research program in the US without Chinese and Indian students. If you think otherwise, it’s because you’re not a researcher

Omar Khattab (@lateinteraction) 's Twitter Profile Photo

🤩 in some cases up to 554% speedup for ColBERT models against PLAID, which is already a ridiculously fast engine for late interaction
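For context, the "late interaction" scoring these engines accelerate is ColBERT's MaxSim: each query token embedding is matched against its best document token embedding, and the maxima are summed. A minimal numpy sketch (toy 2-D vectors; real models use higher-dimensional per-token embeddings):

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """ColBERT-style late interaction: for each query token embedding,
    take the maximum dot product over all document token embeddings,
    then sum across query tokens."""
    sim = query_vecs @ doc_vecs.T        # (n_query_tokens, n_doc_tokens)
    return float(sim.max(axis=1).sum())  # max over doc tokens, sum over query tokens

# Toy 2-D token embeddings standing in for a real encoder's output.
q = np.array([[1.0, 0.0], [0.0, 1.0]])
d_relevant = np.array([[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]])
d_offtopic = np.array([[-1.0, 0.0], [0.0, -1.0]])

assert maxsim_score(q, d_relevant) > maxsim_score(q, d_offtopic)
```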

Qwen (@alibaba_qwen) 's Twitter Profile Photo


🚀 Proud to introduce the Qwen3-Embedding and Qwen3-Reranker Series – setting new standards in multilingual text embedding and relevance ranking!

✨ Highlights:
✅ Available in 0.6B / 4B / 8B versions
✅ Supports 119 languages
✅ State-of-the-Art performance on MMTEB, MTEB,
tomaarsen (@tomaarsen) 's Twitter Profile Photo


Qwen is continuing their habit of state-of-the-art releases with 3 extraordinarily strong embedding models and 3 powerful reranker models, focusing on multilingual text retrieval and more. 

Details in 🧵
Xueguang Ma (@xueguang_ma) 's Twitter Profile Photo


Very strong embedding model!!!

If anyone is interested in further fine-tuning Qwen3-embed with custom data, here is the command with Tevatron: github.com/texttron/tevat…
Zhijing Jin✈️ ICLR Singapore (@zhijingjin) 's Twitter Profile Photo


Really excited about our recent large collaboration work on NLP for Social Good. The work stems from our discussions at the NLP for Positive Impact Workshop at #EMNLP2024 and EMNLP 2025. Thanks to all our awesome collaborators, workshop attendees and all supporters!
Google DeepMind (@googledeepmind) 's Twitter Profile Photo

Extract – a system built by the UK government using our Gemini foundation model – will help council planners make faster decisions. 🚀 Using multimodal reasoning, it turns complex planning documents – even handwritten notes and blurry maps – into digital data in just 40 seconds.

Michael Moor (@michael_d_moor) 's Twitter Profile Photo

Excited to announce MIRIAD — a large-scale dataset of 5,821,948 medical question-answer pairs, each rephrased from passages in the medical literature. Great collab with Qinyue Zheng, Salman Abdullah, Sam Rawal, MD, Cyril Zakka, MD, Sophie Ostmeier, Maximilian Purk, Eduardo Reis, Eric Topol &