Artidoro Pagnoni
@artidoropagnoni
PhD student in NLP at UW with Luke Zettlemoyer
ID: 3583993995
08-09-2015 04:35:23
238 Tweets
892 Followers
451 Following
Easily Fine-tune AI at Meta Llama 3 70B! I am excited to share a new guide on how to fine-tune Llama 3 70B with PyTorch FSDP, Q-Lora, and Flash Attention 2 (SDPA) using Hugging Face, built for consumer-size GPUs (4x 24GB). Blog: philschmid.de/fsdp-qlora-lla… The blog covers:
Check out our latest work (co-led with Alex Fabbri) on Summary of a Haystack (SummHay). A challenging task that shows long-context summarization with precise citation is far from solved... Got a long-context LLM or RAG you want to test? Code: github.com/salesforce/sum…
The Alpaca moment of Large Multimodal Models! Can we build native LMMs just like Llama for simple multimodal generation? Introducing Anole: the first open-source, autoregressive native LMM for multimodal generation. Building on Chameleon by AI at Meta: github.com/GAIR-NLP/anole
We release the first open-source 1.4T-token RAG datastore and present a scaling study for RAG on perplexity and downstream tasks! We show LM+RAG scales better than LM alone, with better performance for the same training compute (pretraining+indexing). retrievalscaling.github.io
I'm very excited to join Northeastern U. Khoury College of Computer Sciences as an assistant professor starting Fall '25!! Looking forward to working with the amazing people there! Until then I'll be a postdoc at NLP @ Uni Vienna with Ben Roth, so reach out if you want to meet up while I'm over in Europe.
Excited to share our latest work: Transfusion! A new multi-modal generative training method combining language modeling and image diffusion in a single transformer! Huge shout-out to Chunting Zhou, Omer Levy, Michi Yasunaga, Arun Babu, Kushal Tirumala, and other collaborators.