Zechun Liu (@zechunliu)'s Twitter Profile
Zechun Liu

@zechunliu

Research Scientist @Meta, Visiting Scholar @CarnegieMellon, PhD from @HKUST, Undergrad @FudanUniv

ID: 1672856688062500864

Joined: 25-06-2023 06:38:30

30 Tweets

289 Followers

69 Following

Zechun Liu (@zechunliu):

Excited to announce the #iccv2023 workshop on Low-Bit Quantized Neural Networks (LBQNN)!

Call for papers: sites.google.com/view/lbqnn-icc… (by August 1, 2023).

Topics range from computer vision to language models. Any work related to low-bit quantization is welcome.
Yunyang Xiong (@youngxiong1):

Our vision-language LLM, MiniGPT-v2, achieves state-of-the-art performances on a broad range of vision-language tasks compared with recent generalist models. Try our demo at minigpt-v2.github.io.

Zechun Liu (@zechunliu):

🤩MobileLLM source code is available on github.com/facebookresear… !
🎊Besides the MobileLLM-125M/350M models reported in the original paper, we also included results for MobileLLM-600M/1B/1.5B. Please kindly check our repo.
🌟Paper: arxiv.org/abs/2402.14905
Zechun Liu (@zechunliu):

🎯As an extension of SpinQuant, we propose RoLoRA, which integrates rotation into QAT (LoRA + quantization).
🎊It achieves a 29.5-point improvement for 4-bit weight-activation quantized LLaMA2-13B on commonsense reasoning tasks compared to the baseline.
🌟Paper: arxiv.org/pdf/2407.08044
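
Not from the papers themselves, just a toy numpy sketch of the identity these rotation methods rely on: for an orthogonal matrix R, (XR)(RᵀW) = XW, so rotating activations and weights before quantization leaves the full-precision output unchanged while spreading outlier channels, which tends to shrink quantization error.

```python
# Toy numpy sketch (not the SpinQuant/RoLoRA code): for an orthogonal R,
# (X R)(R^T W) == X W, so rotating before quantization preserves the
# full-precision output while spreading outlier channels around.
import numpy as np

rng = np.random.default_rng(0)
d = 64
X = rng.standard_normal((8, d))     # activations (batch x hidden)
X[:, 0] *= 50.0                     # inject an outlier channel
W = rng.standard_normal((d, d))     # weights

R, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal rotation

# Exact invariance in full precision
assert np.allclose((X @ R) @ (R.T @ W), X @ W)

def fake_quant(A, bits=4):
    """Per-tensor symmetric min-max quantization (round-trip to float)."""
    scale = np.abs(A).max() / (2 ** (bits - 1) - 1)
    return np.round(A / scale) * scale

err_plain = np.linalg.norm(fake_quant(X) @ fake_quant(W) - X @ W)
err_rot = np.linalg.norm(fake_quant(X @ R) @ fake_quant(R.T @ W) - X @ W)
print(f"4-bit error without rotation: {err_plain:.1f}")
print(f"4-bit error with rotation:    {err_rot:.1f}")  # typically far smaller
```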

Zechun Liu (@zechunliu):

🎉I'm excited to share that SpinQuant supported the live demo at Meta Connect! We just made our 4-bit quantized LLaMA SpinQuant model publicly available. Check it out if you're interested: ai.meta.com/blog/meta-llam…

Yunyang Xiong (@youngxiong1):

🚨VideoLLM from Meta!🚨
LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

📝Paper: huggingface.co/papers/2410.17…
🧑🏻‍💻Code: github.com/Vision-CAIR/Lo…
🚀Project (Demo): vision-cair.github.io/LongVU

We propose LongVU, a video LLM with a spatiotemporal adaptive
Zechun Liu (@zechunliu):

🚀We're thrilled to announce that the MobileLLM weights are available on HuggingFace: huggingface.co/collections/fa…

📱MobileLLM is a state-of-the-art language model designed for mobile devices: arxiv.org/abs/2402.14905

🔥Explore the pretraining code on GitHub: github.com/facebookresear…
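
A minimal loading sketch, not from the repo itself; the repo id facebook/MobileLLM-125M and the trust_remote_code flag are assumptions based on the linked collection:

```python
# Minimal loading sketch. The repo id "facebook/MobileLLM-125M" and the
# trust_remote_code flag are assumptions based on the linked collection,
# not confirmed by this post.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "facebook/MobileLLM-125M"  # assumed repo id
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

inputs = tok("Small models can run", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
```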
Zechun Liu (@zechunliu):

Thanks to Yann LeCun for promoting our work. 🎉 MobileLLM models at sizes 125M, 350M, and 600M are now available on HuggingFace! 🚀 huggingface.co/collections/fa…

Yunyang Xiong (@youngxiong1):

🚀Excited to share our Efficient Track Anything. 
It is small but mighty, >2x faster than SAM2 on A100 and runs > 10 FPS on iPhone 15 Pro Max. How’d we do it? EfficientSAM + Efficient Memory Attention!

Paper: arxiv.org/pdf/2411.18933
Project (demo): yformer.github.io/efficient-trac…

with:
Forrest Iandola (@fiandola):

[1/n] 𝗘𝗳𝗳𝗶𝗰𝗶𝗲𝗻𝘁 𝗧𝗿𝗮𝗰𝗸 𝗔𝗻𝘆𝘁𝗵𝗶𝗻𝗴 from Meta: interactive video segmentation and tracking on an iPhone!

Yuandong Tian (@tydsh):

We introduce ParetoQ, a series of pre-trained models that achieve SoTA in ternary (1.58-bit) and 2/3/4-bit quantization for SLMs (up to 3B parameters), using initial full pre-training followed by QAT. In addition, we also discover that the representation changes substantially after low-bit
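
For illustration, a generic QAT sketch in PyTorch, not the ParetoQ recipe: the forward pass sees fake-quantized weights, while a straight-through estimator (STE) passes gradients to the latent full-precision weights as if rounding were the identity.

```python
# Generic quantization-aware training (QAT) sketch in PyTorch. Illustrative
# only, NOT the ParetoQ recipe. Forward uses fake-quantized weights; the
# straight-through estimator (STE) passes gradients through round() unchanged.
import torch
import torch.nn as nn

class FakeQuant(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w, bits):
        qmax = 2 ** (bits - 1) - 1                    # bits=2 -> levels {-1, 0, 1}
        scale = w.abs().max().clamp(min=1e-8) / qmax  # symmetric per-tensor scale
        return torch.clamp(torch.round(w / scale), -qmax, qmax) * scale

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out, None                         # STE: identity gradient for w

class QATLinear(nn.Linear):
    def __init__(self, in_features, out_features, bits=2):
        super().__init__(in_features, out_features)
        self.bits = bits

    def forward(self, x):
        w_q = FakeQuant.apply(self.weight, self.bits)  # quantize on the fly
        return nn.functional.linear(x, w_q, self.bias)

# One training step: gradients update the latent full-precision weights.
layer = QATLinear(16, 16, bits=2)
loss = layer(torch.randn(4, 16)).pow(2).mean()
loss.backward()
print(layer.weight.grad.shape)  # torch.Size([16, 16])
```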

Beidi Chen (@beidichen):

⏰📢After years of working on long-context efficiency, I've started to doubt whether it's truly necessary (many of you have probably noticed the decline of interest in long-context LLMs). Despite strong models like Gemini, short-context + retrieval often does the trick: faster, cheaper, and

Zechun Liu (@zechunliu):

🚀 We’re releasing ParetoQ, a family of quantized MobileLLMs — ultra-efficient, performance-retaining models for edge devices. 🧠 Smallest model: 1-bit, 125M → only 16MB on disk 📈 1.58-bit 600M even beats 1.58-bit 3B from BitNet(1-bit Era paper) 🔥 👉 Models:

🚀 We’re releasing ParetoQ, a family of quantized MobileLLMs — ultra-efficient, performance-retaining models for edge devices.

🧠 Smallest model: 1-bit, 125M → only 16MB on disk
📈 1.58-bit 600M even beats 1.58-bit 3B from BitNet (the 1-bit era paper) 🔥

👉 Models:
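
The 16MB figure is simple arithmetic: 125M parameters × 1 bit ÷ 8 ≈ 15.6MB, before quantization scales and metadata. A quick back-of-envelope check:

```python
# Back-of-envelope checkpoint sizes for a 125M-parameter model,
# ignoring quantization scales and metadata (a small extra overhead).
params = 125e6
for bits in (1, 1.58, 2, 3, 4, 16):
    mb = params * bits / 8 / 1e6    # bits -> bytes -> MB
    print(f"{bits:>5} bits: {mb:7.1f} MB")
# 1 bit -> 15.6 MB, consistent with the ~16 MB on-disk figure above.
```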
PyTorch (@pytorch):

Quantization of large language models aims to cut compute and memory needs while keeping performance. 𝐏𝐚𝐫𝐞𝐭𝐨𝐐 delivers SOTA results across bit-widths, showing 1.58-, 2-, and 3-bit quantization offer better size-accuracy trade-offs than 4-bit. 💡 Read more: