Vinod Grover (@vinodg) Twitter Tweets • TwiCopy

Vinod Grover

@vinodg

Sr Distinguished Engineer @nvidia. Compilers, CUDA C++, PL, Machine Learning and Systems. tweets and opinions are personal.

ID: 14155615

Link: http://dblp.org/pid/87/2753.html

Joined: 16-03-2008 01:33:57

1.1K Tweets

2.2K Followers

1.1K Following

Vinod Grover

@vinodg

a year ago

AI Compiler research intern positions at NVIDIA (Seattle). Functional programming, polyhedral compilation, program synthesis, compiler optimization. Send resume to [email protected]

Likes: 170 | Replies: 0 | Reposts: 26

Vinod Grover

@vinodg

a year ago

Preprint: arxiv.org/pdf/2412.13398

Likes: 39 | Replies: 0 | Reposts: 4

We are excited to announce FlashInfer v0.2! Core contributions of this release include: - Block/Vector Sparse (Paged) Attention on FlashAttention-3 - JIT compilation for customized attention variants - Fused Multi-head Latent Attention (MLA) decoding kernel - Lots of bugfix and

Likes: 163 | Replies: 6 | Reposts: 41

Luis Ceze

@luisceze

a year ago

Amazing to see FlashInfer’s traction in the short 8 months since it was first introduced. Try out the latest release.

Likes: 19 | Replies: 0 | Reposts: 2

Vinod Grover

@vinodg

a year ago

Scaling Deep Learning Training with MPMD Pipeline Parallelism. Joint work with Anxhelo Xhebraj, Sean Lee, and Hanfeng Chen: arxiv.org/abs/2412.14374

Likes: 27 | Replies: 0 | Reposts: 5

Vartika Singh

@vartuattheghat

a year ago

Check out Vinod Grover and team's work on scaling JAX-based DL training

Likes: 3 | Replies: 0 | Reposts: 1

Vinod Grover

@vinodg

a year ago

Latest version of the FlashInfer paper with some cool ideas!

Likes: 19 | Replies: 0 | Reposts: 2

Vijay

@__tensorcore__

10 months ago

🔥🚨 CUTLASS Blackwell is here 🚨🔥 The 3.8 release is loaded with support for new features of Blackwell, even an attention kernel 👀 Go check it out here: github.com/nvidia/cutlass Can't wait to see what y'all end up cooking with this over the next few months and years 💚

Likes: 125 | Replies: 5 | Reposts: 32

Vinod Grover

@vinodg

10 months ago

Pipeline Parallelism in JAX!

Likes: 32 | Replies: 0 | Reposts: 1

Anxhelo Xhebraj

@0xa95

9 months ago

If you would like to share your work on array programming, please consider submitting your paper to ARRAY '25 (co-located with PLDI). pldi25.sigplan.org/home/ARRAY-202…

Likes: 14 | Replies: 0 | Reposts: 4

Vinod Grover

@vinodg

9 months ago

Accepted for publication in #MLSys25 conference!

Likes: 27 | Replies: 1 | Reposts: 2

Zihao Ye

@ye_combinator

9 months ago

Check out the intra-kernel profiler in FlashInfer to visualize the timeline of each SM/warpgroup in the lifecycle of a CUDA persistent kernel: github.com/flashinfer-ai/… You can clearly see how tensor/CUDA core overlapping, variable-length load balancing, and fusion work.

Likes: 146 | Replies: 2 | Reposts: 31

Vinod Grover

@vinodg

8 months ago

#NewProfilePic

Likes: 15 | Replies: 0 | Reposts: 0

Tianqi Chen

@tqchenml

8 months ago

Happy to share our latest work at ASPLOS 2025! LLMs are dynamic, both in sequence length and batch size. Relax brings an ML compiler IR that globally tracks symbolic shapes across functions on multiple levels, enabling efficient and flexible LLM AOT compilation: arxiv.org/abs/2311.02103

Likes: 135 | Replies: 4 | Reposts: 35

Rajeev Alur

@rajeevalur

7 months ago

Congratulations to Swarat Chaudhuri (PhD, CIS@Penn 2007) for this wonderful honor from the Guggenheim Foundation: gf.org/stories/announ…

Likes: 40 | Replies: 0 | Reposts: 6

xjdr

@_xjdr

7 months ago

This is an SM90 to SM100 porting guide that deep research and I made, and it is mostly accurate. Sharing in case others might find it useful: x.com/_xjdr/status/1…

Likes: 159 | Replies: 5 | Reposts: 12

zhyncs

@zhyncs42

7 months ago

MLSys 2025 is coming up! Want to meet the developers behind FlashInfer, XGrammar, and SGLang LMSYS Org in person? Join us for the Happy Hour on May 12—we’d love to see you there! lu.ma/dl99yjoe

Likes: 35 | Replies: 0 | Reposts: 9

Tianqi Chen

@tqchenml

6 months ago

#MLSys2025 Make sure to attend the 10:30am keynote by Ion Stoica, "An AI stack: from scaling AI workloads to evaluating LLMs." Check out the full schedule at mlsys.org/virtual/2025/c…

Likes: 55 | Replies: 2 | Reposts: 14

Aws Albarghouthi 🍉 أوس

@awsto

5 months ago

Here's a paper describing quantum computing using standard programming constructs, w/o linear algebra! Goal: demystify quantum computing + serve as a formal foundation for reasoning about quantum programs. Paper: eprint.iacr.org/2025/1091.pdf Code: github.com/qqq-wisc/qwla

Likes: 238 | Replies: 5 | Reposts: 54

samim

@samim

5 months ago

Likes: 16 | Replies: 0 | Reposts: 2