Tri Dao
@tri_dao
Asst. Prof @PrincetonCS, Chief Scientist @togethercompute. Machine learning & systems.
ID: 568879807
https://tridao.me 02-05-2012 07:13:50
690 Tweets
23.23K Followers
442 Following
If you like ML systems and San Diego beaches, you should work with Dan! Meanwhile we're lucky to have him spending some time at Together AI. I've seen an early preview of what he's building; it's mind-blowing!
The Mamba in the Llama: arxiv.org/abs/2408.15237 RNNs are neat. Here's a video describing how to make them work really well with little money: youtube.com/watch?v=A5ff8h… (by Junxiong Wang and Daniele Paliotta)
We have some excellent interns like James at Together AI this summer doing research on efficient training & inference. Sparsity in LLMs feels fundamental; here's an example of using sparsity to get faster LLM decoding
The work on spec dec for large batch and long context at Together AI has been a great collaboration with Beidi Chen and her lab. Check out their MagicDec; it's an elegant way to use e.g. StreamingLLM to reduce the KV cache of the draft model