GX Xu
@gx_nlp
Research Engineer @ Red Hat AI Innovation
ID: 1542500294340186112
30-06-2022 13:28:40
26 Tweets
63 Followers
313 Following
Today, with Tim Dettmers, Hugging Face, & @mobius_labs, we're releasing FSDP/QLoRA, a new project that lets you efficiently train very large (70b) models on a home computer with consumer gaming GPUs. 1/🧵 answer.ai/posts/2024-03-…
A personal note: Unitxt originated within the Leshem (Legend) Choshen 🤖🤗 fusing team, aiming to streamline the sharing of academic outputs, primarily through model weights but also data. In the process of training various models on numerous datasets, we encountered significant challenges related
Congrats Google DeepMind on the new Gemma-2 27B & 9B release! Gemma-2 was tested in the Arena under the codename "*late-june-chatbots" and is now out of stealth. Its early results match the best open models (Llama-3-70B, Nemotron-340B) with only 27B parameters! Impressively,