Vinod Grover (@vinodg)'s Twitter Profile
Vinod Grover

@vinodg

Sr Distinguished Engineer, @nvidia. Compilers, CUDA C++, PL, Machine Learning and Systems. Tweets and opinions are personal.

ID: 14155615

Link: http://dblp.org/pid/87/2753.html

Joined: 16-03-2008 01:33:57

1.1K Tweets

2.2K Followers

1.1K Following

Vinod Grover (@vinodg)

Interesting holiday reading on the Verse PL :) simon.peytonjones.org/assets/pdfs/ha… simon.peytonjones.org/assets/pdfs/ve…

Umang Mathur (@mathur_umang)

Our work "𝗗𝘆𝗻𝗮𝗺𝗶𝗰 𝗥𝗮𝗰𝗲 𝗗𝗲𝘁𝗲𝗰𝘁𝗶𝗼𝗻 𝘄𝗶𝘁𝗵 𝗢(𝟭) 𝗦𝗮𝗺𝗽𝗹𝗲𝘀" appears at POPL 2025 this year. It's a cute result demonstrating (yet again) how algorithmic techniques can be used to design simple, effective solutions to traditionally hard systems-y problems.

Vinod Grover (@vinodg)

Excited to announce that our paper "Graphene: An IR for Optimized Tensor Computations on GPUs" has been accepted at ASPLOS '23! Joint work with Bastian Hagedorn, Bin Fan, Hanfeng Chen, Cris Cecka and Michael Garland. #asplos23

Simon Cooksey (@_graymalkin)

Pleased to share that our 2022 ISCA Paper 'Mixed-Proxy Extensions for the NVIDIA PTX Memory Consistency Model' has been awarded an Honourable Mention for IEEE Micro Top Picks 2023! dl.acm.org/doi/abs/10.114…

Andrew Kerr (@arkerr)

CUTLASS 3.0 has just been released, offering optimal performance on NVIDIA’s H100 and a new approach to template metaprogramming in CUDA C++. github.com/NVIDIA/cutlass

Bastian Hagedorn (@hagedornbastian)

Our ASPLOS paper "Graphene: An IR for Optimized Tensor Computations on GPUs" is now available as open access here: dl.acm.org/doi/abs/10.114…

Vinod Grover (@vinodg)

Looking to hire Compiler Researchers and Engineers. Research Scientist - AI Compiler: nvidia.wd5.myworkdayjobs.com/NVIDIAExternal… Senior Software Engineer – AI Compilers for LLM: nvidia.wd5.myworkdayjobs.com/NVIDIAExternal…

Tri Dao (@tri_dao)

Announcing FlashAttention-2! We released FlashAttention a year ago, making attn 2-4x faster, and it is now widely used in most LLM libraries. Recently I’ve been working on the next version: 2x faster than v1, 5-9x vs standard attn, reaching 225 TFLOPs/s training speed on A100. 1/

Vinod Grover (@vinodg)

Compiler research intern positions at NVIDIA (Seattle) for ML. Functional programming, polyhedral compilation, program synthesis, compiler optimization. Send resume to [email protected]