Daniel Koceja
@danielkoceja
ID: 2684016816
27-07-2014 03:40:52
I want to highlight some of the coolest kernel work by Daniel Koceja, specifically around making TTT layers fast for training and video generation. TL;DR: Our kernel applies tensor parallelism across Streaming Multiprocessors (SMs) so we can efficiently train an RNN whose hidden state is a
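
To make the tensor-parallel idea concrete, here is a minimal sketch in plain Python, not the actual kernel. It assumes the sharded object is a weight matrix split by rows, and each shard stands in for the slice of work one Streaming Multiprocessor might own. The names `matvec`, `shard_rows`, and `tensor_parallel_matvec` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: Python lists stand in for GPU memory, and each
# "shard" stands in for the work a single Streaming Multiprocessor might own.
# All function names here are hypothetical, not taken from the real kernel.

def matvec(weight, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

def shard_rows(weight, num_shards):
    """Split the rows of `weight` into `num_shards` equal contiguous blocks."""
    if len(weight) % num_shards != 0:
        raise ValueError("row count must be divisible by num_shards")
    step = len(weight) // num_shards
    return [weight[i:i + step] for i in range(0, len(weight), step)]

def tensor_parallel_matvec(weight, x, num_shards):
    """Each shard computes its own slice of the output; slices are concatenated."""
    out = []
    # This loop is sequential; on a GPU the shards would run concurrently.
    for block in shard_rows(weight, num_shards):
        out.extend(matvec(block, x))
    return out

weight = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, -1]
result = tensor_parallel_matvec(weight, x, num_shards=2)
```

The sharded result matches the unsharded product because every output element depends on only one row of the matrix, so no shard needs another shard's weights. Other sharding schemes, such as splitting columns, would require a reduction step across shards.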