Siyu Yuan
@siyu_yuan_
Ph.D. candidate at Fudan University. Ex-Research Intern at @MSFTResearch Asia and @BytedanceTalk AI Lab
ID: 967629804941074432
https://siyuyuan.github.io/ 25-02-2018 05:18:18
141 Tweets
620 Followers
482 Following
📢New LLM Agents Benchmark! Introducing 🌟MIRAI🌟: A groundbreaking benchmark crafted for evaluating LLM agents in temporal forecasting of international events with tool use and complex reasoning! 📜 Arxiv: arxiv.org/abs/2407.01231 🔗 Project page: mirai-llm.github.io 🧵1/N
🔥We release the first open-source 1.4T-token RAG datastore and present a scaling study for RAG on perplexity and downstream tasks! We show LM+RAG scales better than LM alone, with better performance for the same training compute (pretraining+indexing). retrievalscaling.github.io 🧵
1/n This Tuesday afternoon, catch my ICML paper presentation at Hall C 4-9 #1006. I’ll reveal the training stability differences between traditional nonlinear recurrent neural networks and the latest state-space models, with a focus on long-term memory. openreview.net/forum?id=BwG8h…
4/n I’ll be presenting a poster on LongSSM at the NGSM workshop! It’s all about length extrapolation in recurrent models. Swing by for a coffee chat and let’s discuss! ☕ I’m also interested in length extension for transformers and multi-modal scenarios. arxiv.org/abs/2406.02080