Rush Tabesh (@rush_tabesh)'s Twitter Profile
Rush Tabesh

@rush_tabesh

Rush (Soroush) Tabesh | Ph.D. Student @ ISTAustria | Efficiency in Deep Learning

ID: 1694364063629737984

http://tabesh.me | Joined 23-08-2023 15:01:17

4 Tweets

24 Followers

72 Following

Dan Alistarh (@dalistarh)

The code for RoSA: Accurate Parameter-Efficient Fine-Tuning via Sparse + Low-Rank Adapters is now available: github.com/IST-DASLab/RoSA, along with a PEFT integration: github.com/IST-DASLab/pef… As a bonus, we also include QRoSA, which implements the same idea, but with quantized base weights.

Rush Tabesh (@rush_tabesh)

Happy to introduce #HALO: lower-precision fine-tuning for LLMs. With proper Hadamard transforms, #HALO enables accurate INT8/FP6 fine-tuning, with lossless speedups up to 1.41×. 📄 Paper: arxiv.org/pdf/2501.02625 💻 Code: github.com/IST-DASLab/HALO #LLM #Quantization

Egor Zverev @ICLR 2025 (@egor_zverev_ai)

(1/n) In our #ICLR2025 paper, we explore a fundamental issue that enables prompt injections: LLMs' inability to separate instructions from data in their input

✅ Definition of separation
👉 SEP Benchmark
🔍 LLM evals on SEP
Dan Alistarh (@dalistarh)

Our QuEST paper was selected for Oral Presentation at the Sparsity in LLMs Workshop at ICLR 2025! QuEST is the first algorithm with Pareto-optimal LLM training for 4-bit weights/activations, and can even train accurate 1-bit LLMs. Paper: arxiv.org/abs/2502.05003 Code: github.com/IST-DASLab/QuE…