Xinyu Crystina Zhang | on job market (@crystina_z) 's Twitter Profile
Xinyu Crystina Zhang | on job market

@crystina_z

PhD @UWaterloo ugrad @HKUST | prev. Google DM @cohere #CLOVA, MPI | Multilingual | IR | author of MIRACL.ai Mr. TyDi 🦋crystinaz.bsky.social | on job market!

ID: 1233368927264288769

Link: https://crystina-z.github.io · Joined: 28-02-2020 12:30:52

126 Tweets

573 Followers

573 Following

Catherine Arnett (@linguist_cat) 's Twitter Profile Photo

✨New pre-print✨ Crosslingual transfer allows models to leverage their representations for one language to improve performance on another language. We characterize the acquisition of shared representations in order to better understand how and when crosslingual transfer happens.

Freda Shi (@fredahshi) 's Twitter Profile Photo

📢I just made the slides public for this talk. TL; DR: how we computer scientists adapt insights from linguistics to analyze and improve our models. Comments & discussion are welcomed; the recording from Vector is forthcoming. docs.google.com/presentation/d…

Freda Shi (@fredahshi) 's Twitter Profile Photo

On my way to NAACL✈️! If you're also there and interested in grounding, don't miss our tutorial on "Learning Language through Grounding"! Mark your calendar: May 3rd, 14:00-17:30, Ballroom A. Another exciting collaboration with Martin Ziqiao Ma Jiayuan Mao Parisa Kordjamshidi Michigan SLED Lab!

Xinyu Crystina Zhang | on job market (@crystina_z) 's Twitter Profile Photo

On my way to #NAACL2025 ✈️ I'll present the paper on Friday (May 2) 9-10:30am at poster session 7. Happy to chat about any aspect of multilingualism and culture! I'm also open to postdoc and visiting positions in the US. Definitely reach out if you have any opportunities.

Akari Asai (@akariasai) 's Twitter Profile Photo

Excited to be at the Foundation Models for Science Conference in NYC and NAACL in Albuquerque this week! I’ll be presenting OpenScholar (arxiv.org/abs/2411.14199), CodeRAG-Bench (arxiv.org/abs/2406.14497) and others, & organizing a workshop! Come say hi 🧵

Xueguang Ma (@xueguang_ma) 's Twitter Profile Photo

Sharing some updates on the Tevatron-2.0 toolkit (accepted as a SIGIR 2025 demo), together with OmniEmbed-v0.1. Tevatron-2.0 aims to better support the training of unified embedding models across tasks, languages, and modalities, facilitating future research in better information

Shengyao Zhuang (@shengyaozhuang) 's Twitter Profile Photo

One embedding model for all modalities and across different languages! We will demo the model training pipeline at #SIGIR2025. Our OmniEmbed-v0.1 also demonstrates very strong performance on the MAGMaR multimodal retrieval shared task eval.ai/web/challenges…

Benjamin Minixhofer (@bminixhofer) 's Twitter Profile Photo

We achieved the first instance of successful subword-to-byte distillation in our (just updated) paper. This enables creating byte-level models at a fraction of the cost of what was needed previously. As a proof-of-concept, we created byte-level Gemma2 and Llama3 models. 🧵

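As background to the announcement above (this is only an illustrative toy, not the paper's distillation method): byte-level models are attractive because the token inventory shrinks to 256 raw byte values, so any text in any language is representable with no out-of-vocabulary tokens, at the cost of longer sequences.

```python
# Toy byte-level "tokenizer": IDs are just UTF-8 byte values (0..255),
# versus a subword vocabulary of tens or hundreds of thousands of entries.
def byte_tokenize(text: str) -> list[int]:
    return list(text.encode("utf-8"))

def byte_detokenize(ids: list[int]) -> str:
    return bytes(ids).decode("utf-8")

ids = byte_tokenize("héllo")          # 'é' expands to two bytes
assert all(0 <= i < 256 for i in ids)  # every id fits in one byte
assert byte_detokenize(ids) == "héllo"
```

The trade-off the paper's distillation addresses is exactly the expensive part: training a model to consume these much longer byte sequences, which the subword-to-byte transfer makes far cheaper than training from scratch.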
Jiaang Li (@jiaangli) 's Twitter Profile Photo

🚀New Preprint Alert 🚀 Can Multimodal Retrieval Enhance Cultural Awareness in Vision-Language Models? Excited to introduce RAVENEA, a new benchmark aimed at evaluating cultural understanding in VLMs through RAG.

Nandan Thakur (@beirmug) 's Twitter Profile Photo

Did you know that fine-tuning retrievers & re-rankers on large but unclean training datasets can harm their performance? 😡 In our new preprint, we re-examine popular IR training data quality by pruning datasets and identifying and relabeling 𝐟𝐚𝐥𝐬𝐞-𝐧𝐞𝐠𝐚𝐭𝐢𝐯𝐞𝐬! 🏷️

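For context on the tweet above, a common recipe for catching false negatives (a hedged sketch, not necessarily the preprint's pipeline; the `score` callable stands in for a strong cross-encoder or LLM judge) is to rescore each labeled negative against the query and flag those that look too relevant:

```python
# Sketch: split "negatives" into true negatives vs. suspected false
# negatives, using an external relevance scorer. Suspected items can
# then be dropped or relabeled as positives before fine-tuning.
def prune_false_negatives(query, negatives, score, threshold=0.9):
    kept, suspected = [], []
    for doc in negatives:
        (suspected if score(query, doc) >= threshold else kept).append(doc)
    return kept, suspected
```

The design point is that the scorer only has to be run once, offline, over the training set; the cleaned data then benefits every retriever or reranker trained on it.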
Eugene Yang (@eyangtw) 's Twitter Profile Photo

🚨Wouldn’t it be nice if your agentic search system could reason over all your docs? ✨Introducing Rank-K, a listwise reranker that benefits from test-time compute and long-context! Rank-K sets a new SoTA for reasoning-based reranking, without reasoning chains from other models.

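To unpack the term in the tweet above: a listwise reranker sees the query and the whole candidate list at once and emits a full ordering, rather than scoring each passage independently (pointwise). The toy below illustrates only that interface; Rank-K uses a reasoning LLM where the stub overlap scorer sits here.

```python
# Toy listwise interface: input a query plus all candidate passages,
# output a permutation of their indices, best first.
# The term-overlap "relevance" is a stand-in stub for the LLM.
def toy_listwise_rank(query: str, passages: list[str]) -> list[int]:
    q = set(query.lower().split())
    overlap = [len(q & set(p.lower().split())) for p in passages]
    return sorted(range(len(passages)), key=lambda i: -overlap[i])
```

Operating on the full list is what lets long-context, test-time-compute models compare candidates against each other instead of judging each in isolation.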
Xinyu Crystina Zhang | on job market (@crystina_z) 's Twitter Profile Photo

The more data, the better? 🤔 Only if they are clean! Introducing our latest work on relabeling hard negatives in massive IR training sets! 📝 Cleaner data → stronger embeddings & rerankers. Read more here ⬇️

Jimmy Lin (@lintool) 's Twitter Profile Photo

💥 My awesome University of Waterloo ugrad student Sisi Li - with the help of Ronak Pradeep - slapped an MCP server in front of Pyserini to create MCPyserini and connected it to Claude to create DeepResearcherini! 🤪 Here, an example of RAG using the MS MARCO v1 passage collection.

Wenyan Li (@wenyan62) 's Twitter Profile Photo

Excited to share that our multimodal temporal culture benchmark is released 🚀🚀🚀 The dataset is public on 🤗 huggingface. Check it out!! arxiv.org/abs/2506.01565 huggingface.co/datasets/lizho…

Xueguang Ma (@xueguang_ma) 's Twitter Profile Photo

Very strong embedding model!!! If anyone is interested in further fine-tuning Qwen3-embed with custom data, here is the command with Tevatron. github.com/texttron/tevat…

Xueguang Ma (@xueguang_ma) 's Twitter Profile Photo

Sharing our recent efforts on applying OmniEmbed to large-scale video retrieval MultiVENT2.0! tl;dr, we achieve SoTA on the MAGMAR shared task leaderboard. More importantly, we provide in-depth analysis on the effectiveness of different input modalities for video retrieval.

Jimmy Lin (@lintool) 's Twitter Profile Photo

In December 2024 Pankaj Gupta Gilad Mishne Will Horn and I put out a rather cryptic arXiv paper musing about the future of search: arxiv.org/abs/2412.18956. I’m now able to share what I’ve been up to! 🧵(1/9)

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation "we introduce WM-ABench, a large-scale benchmark comprising 23 fine-grained evaluation dimensions across 6 diverse simulated environments with controlled counterfactual simulations. Through 660
