Vinod Grover (@vinodg)'s Twitter Profile
Vinod Grover

@vinodg

Sr Distinguished Engineer @nvidia. Compilers, CUDA C++, PL, Machine Learning and Systems. tweets and opinions are personal.

ID: 14155615

Link: http://dblp.org/pid/87/2753.html
Joined: 16-03-2008 01:33:57

1.1K Tweets

2.2K Followers

1.1K Following

Vinod Grover (@vinodg)

AI Compiler research intern positions at NVIDIA (Seattle). Functional programming, polyhedral compilation, program synthesis, compiler optimization. Send resume to [email protected]

Zihao Ye (@ye_combinator)

We are excited to announce FlashInfer v0.2!

Core contributions of this release include:
- Block/Vector Sparse (Paged) Attention on FlashAttention-3
- JIT compilation for customized attention variants
- Fused Multi-head Latent Attention (MLA) decoding kernel
- Lots of bugfix and

Vijay (@__tensorcore__)

🔥🚨 CUTLASS Blackwell is here 🚨🔥

3.8 release is loaded with support for new features of Blackwell, even an attention kernel 👀

Go check it out here: github.com/nvidia/cutlass

Can't wait to see what y'all end up cooking with this over the next few months and years 💚

Anxhelo Xhebraj (@0xa95)

If you would like to share your work on array programming, please consider submitting your paper to ARRAY '25 (co-located with PLDI). pldi25.sigplan.org/home/ARRAY-202…

Zihao Ye (@ye_combinator)

Check out the intra-kernel profiler in flashinfer to visualize the timeline of each SM/warpgroup in the lifecycle of a CUDA persistent kernel:

github.com/flashinfer-ai/…

You can clearly see how tensor/CUDA core overlapping, variable-length load balancing, and fusion work.

Tianqi Chen (@tqchenml)

Happy to share our latest work at ASPLOS 2025! LLMs are dynamic, both in sequence and batches. Relax brings an ML compiler IR that globally tracks symbolic shapes across functions on multiple levels. It brings efficient and flexible AOT compilation to LLMs: arxiv.org/abs/2311.02103

xjdr (@_xjdr)

This is an SM90-to-SM100 porting guide that Deep Research and I made; it is mostly accurate. Sharing in case others might find it useful: x.com/_xjdr/status/1…

zhyncs (@zhyncs42)

MLSys 2025 is coming up! Want to meet the developers behind FlashInfer, XGrammar, and SGLang LMSYS Org in person? Join us for the Happy Hour on May 12—we’d love to see you there! lu.ma/dl99yjoe

Tianqi Chen (@tqchenml)

#MLSys2025 make sure to attend the 10:30am keynote by Ion Stoica, "An AI stack: from scaling AI workloads to evaluating LLMs". Check out the full schedule at mlsys.org/virtual/2025/c…

Aws Albarghouthi 🍉 أوس (@awsto)

Here's a paper describing quantum computing using standard programming constructs, w/o linear algebra!

Goal: demystify quantum computing + serve as a formal foundation for reasoning about quantum programs.

paper: eprint.iacr.org/2025/1091.pdf
code: github.com/qqq-wisc/qwla