Animesh Singh (@animeshsingh) 's Twitter Profile
Animesh Singh

@animeshsingh

#ArtificialIntelligence #DeepLearning #MachineLearning #MLOps #AI #ML #DL #Kubernetes #Cloud

ID: 28748501

Joined: 04-04-2009 05:39:48

5.5K Tweets

2.2K Followers

2.2K Following

Yam Peleg (@yampeleg) 's Twitter Profile Photo

You thought you could go to sleep now??

Orca 2 just dropped.
Paper: arxiv.org/pdf/2311.11045…

Results:
Orca 2 13B beats LLaMA-Chat-70B

TL;DR:
Training a smaller model to reason using multiple techniques:

step-by-step, recall then generate, recall-reason-generate, direct
Alex Volkov (Thursd/AI) (@altryne) 's Twitter Profile Photo

Lol what, LinkedIn casually dropped a kernel repo that reduces LLM training time on multi-GPU by 20% while reducing memory by 60%? 😂 I'll start posting more on LinkedIn from now on. h/t Wing Lian (caseus), who of course already merged this into Axolotl 👏 github.com/linkedin/Liger…

Yam Peleg (@yampeleg) 's Twitter Profile Photo

What is the most random event you can make up right now?

Exactly!

LinkedIn just dropped the highest-performance GPU kernels (Triton) for training LLMs.

( WTF?? 😅 )

Throughput up by up to 20%
Mem reduced by up to 60%

Out-of-the-box support for HF models.

github.com/linkedin/Liger…
Byron Hsu (@hsu_byron) 's Twitter Profile Photo

(1/n)

Training LLMs can be hindered by out-of-memory errors when scaling batch size and sequence length. Add one line to boost multi-GPU training throughput by 20% and reduce memory usage by 60%. Introducing Liger-Kernel: Efficient Triton Kernels for LLM Training.

github.com/linkedin/Liger…
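
For context, the "one line" is a monkey-patch call applied before the model is built. A minimal sketch, assuming the `apply_liger_kernel_to_llama()` API from the repo's README and an example Llama 3 checkpoint:

```python
# Minimal sketch of the "one line" idea, assuming the apply_liger_kernel_to_llama()
# patching API described in the Liger Kernel README; checkpoint name is an example.
import torch
from transformers import AutoModelForCausalLM
from liger_kernel.transformers import apply_liger_kernel_to_llama

# The single extra line: monkey-patch Hugging Face's Llama modules (RMSNorm, RoPE,
# SwiGLU, cross entropy) with Liger's Triton kernels before instantiating the model.
apply_liger_kernel_to_llama()

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",   # example checkpoint; any HF Llama model works
    torch_dtype=torch.bfloat16,
)
# The rest of the training loop is unchanged; only the underlying kernels differ.
```
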
Byron Hsu (@hsu_byron) 's Twitter Profile Photo

If you've read this far, be sure to star our repo at github.com/linkedin/Liger…! We would like to thank Animesh Singh, Haowen Ning, Yanning Chen for the leadership support, shao, Qingquan Song, Yun Dai, Vignesh Kothapalli, Shivam Sahni, Zain Merchant for the

kalomaze (@kalomaze) 's Twitter Profile Photo

from intervitens > previously I was maxing out VRAM with CPU param offload, and now even without offloading, I only get 75% VRAM usage. It actually... just works™ (4B FFT on 4x3090s, bs1 @ 8192 context)

Tianle Cai @ ICLR 2025🇸🇬 (@tianle_cai) 's Twitter Profile Photo

We've heard many complaints about the high GPU memory requirements for training Medusa heads on models with large vocabularies. This is no longer an issue, thanks to the amazing Liger kernel developed by the LinkedIn team (Byron Hsu and team)

The Liger kernel cleverly fuses the
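
The truncated line above presumably refers to Liger fusing the lm_head projection with the cross-entropy loss. As a rough, plain-PyTorch illustration of why that helps with large vocabularies (a conceptual sketch, not the actual Triton kernel): chunking the projection means the full [num_tokens, vocab_size] logits tensor is never materialized at once.

```python
# Conceptual sketch in plain PyTorch (not Liger's Triton code): compute the
# lm_head projection and cross entropy chunk by chunk so the full
# [num_tokens, vocab_size] logits tensor never exists in memory at once.
import torch
import torch.nn.functional as F

def chunked_linear_cross_entropy(hidden, lm_head_weight, targets, chunk_size=1024):
    """hidden: [N, H], lm_head_weight: [V, H], targets: [N] -> mean loss."""
    total = hidden.new_zeros(())
    n = hidden.shape[0]
    for start in range(0, n, chunk_size):
        logits = hidden[start:start + chunk_size] @ lm_head_weight.t()  # [C, V]
        total = total + F.cross_entropy(
            logits, targets[start:start + chunk_size], reduction="sum"
        )
    return total / n

# Small example: 4k tokens, hidden size 1024, 32k vocabulary.
hidden = torch.randn(4096, 1024)
weight = torch.randn(32_000, 1024)
targets = torch.randint(0, 32_000, (4096,))
print(chunked_linear_cross_entropy(hidden, weight, targets))

# A real fused kernel goes further: it computes the logits' gradients inside the
# kernel, so the per-chunk logits don't even need to be kept for the backward pass.
```
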
LLaMA Factory (@llamafactory_ai) 's Twitter Profile Photo

We've integrated the Liger Kernel into LLaMA-Factory.

It achieves ~10% speed up and ~25% memory reduction when fine-tuning Llama-3 8B on 2k sequences. Try it out at LLaMA-Factory🚀
Byron Hsu (@hsu_byron) 's Twitter Profile Photo

Liger Kernel is officially supported in SFTTrainer, the most popular trainer for LLM fine-tuning. Add `--use_liger_kernel` to supercharge the training with one flag.

Hugging Face Thomas Wolf Lewis Tunstall
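
A hedged sketch of what that looks like from Python rather than the CLI, assuming the flag maps to `use_liger_kernel=True` on the trainer config (TRL's `SFTConfig` inherits it from `transformers.TrainingArguments`); the toy dataset and hyperparameters are placeholders:

```python
# Hedged sketch: the CLI flag is assumed to map to use_liger_kernel=True on the
# trainer config; the dataset and hyperparameters here are placeholders only.
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

train_dataset = Dataset.from_dict(
    {"text": ["Liger kernels cut memory.", "One flag enables them."]}
)

config = SFTConfig(
    output_dir="sft-liger-demo",
    use_liger_kernel=True,          # the one flag from the tweet
    per_device_train_batch_size=2,
    max_steps=1,
    bf16=True,
)

trainer = SFTTrainer(
    model="meta-llama/Meta-Llama-3-8B",  # example checkpoint
    args=config,
    train_dataset=train_dataset,
)
trainer.train()
```
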
Rohan Paul (@rohanpaul_ai) 's Twitter Profile Photo

LinkedIn's great Liger Kernel repo just released an updated version.

- New Integrations: SFTTrainer, Axolotl, LLaMA-Factory
- New Model Support: Phi3 & Qwen2
- AutoModel API: Meet AutoLigerKernelForCausalLM (see the sketch below)
- Enhanced FusedLinearCrossEntropy: supports a bias term
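
A short sketch of the AutoModel-style entry point named in the list above, assuming `AutoLigerKernelForCausalLM` is exported from `liger_kernel.transformers` as the release notes describe; the checkpoint is just an example from the newly supported model families:

```python
# Sketch of the AutoModel-style entry point; assumes AutoLigerKernelForCausalLM
# is exported from liger_kernel.transformers as described in the release notes.
import torch
from liger_kernel.transformers import AutoLigerKernelForCausalLM

# Drop-in replacement for AutoModelForCausalLM: it detects a supported
# architecture (e.g. Llama, Phi3, Qwen2) and applies the matching Liger patch.
model = AutoLigerKernelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B",                 # example checkpoint
    torch_dtype=torch.bfloat16,
)
```
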
Anush Elangovan (@anushelangovan) 's Twitter Profile Photo

PyTorch zoom backend: an experimental Triton-first integration into PyTorch eager mode where the kernels are written in Triton (instead of CUDA or HIP). Uses Liger kernels now but can use any Triton kernel. Runs llamas. Hack away at it. Thoughts? github.com/nod-ai/pytorch…
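
For readers unfamiliar with Triton, here is a generic elementwise-add kernel (not taken from the zoom backend repo) showing what "kernels written in Triton instead of CUDA or HIP" looks like when launched from PyTorch:

```python
# Generic illustration, not code from the zoom backend repo: a minimal Triton
# elementwise-add kernel launched from PyTorch.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                        # one program instance per block
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                        # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def triton_add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

x = torch.randn(1 << 20, device="cuda")
y = torch.randn(1 << 20, device="cuda")
assert torch.allclose(triton_add(x, y), x + y)
```
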

MLOps Community (@mlopscommunity) 's Twitter Profile Photo

We had the pleasure of chatting with Animesh Singh in our latest podcast episode! Animesh is the Director of GPU Infrastructure at LinkedIn and has a wealth of insights on scaling LLMs and optimizing GPU infrastructure.
