Nkechi
@kechieanyanwu
multidisciplinary. aspirational. believer.
ID: 1491388626516774913
https://www.youtube.com/@kechiea
Joined: 09-02-2022 12:29:12
222 Tweets
59 Followers
75 Following
Thinking Machines tl;dr: LLM inference nondeterminism isn't just floating-point non-associativity or concurrent GPU execution; the core culprit is a lack of batch invariance: server load unpredictably changes the batch size, the kernel picks a different reduction strategy per shape, and the numerics shift. Batch-invariant kernels unlock true reproducibility, finally making RL truly "on-policy"
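A minimal sketch of both effects, assuming PyTorch (the batch-size difference mirrors the kind of matmul demo in the Thinking Machines post and typically reproduces on GPU; some CPU backends may print 0):

```python
import torch

# (1) Floating-point addition is not associative: grouping changes the result.
a, b, c = 0.1, 0.2, 0.3
print((a + b) + c == a + (b + c))  # False: 0.6000000000000001 vs 0.6

# (2) Batch non-invariance: the same row through the same matmul can come out
# numerically different depending on how many other rows share the batch,
# because the library picks a different kernel (and reduction order) per shape.
device = "cuda" if torch.cuda.is_available() else "cpu"
torch.manual_seed(0)
A = torch.randn(2048, 2048, device=device)
B = torch.randn(2048, 2048, device=device)

row_alone = torch.mm(A[:1], B)      # the row as a batch of 1
row_in_batch = torch.mm(A, B)[:1]   # the same row inside a batch of 2048
print((row_alone - row_in_batch).abs().max())  # typically nonzero on GPU
```

Batch-invariant kernels close exactly this gap: by fixing the reduction strategy regardless of batch size, the same input yields bit-identical outputs whatever the server load, at some cost in peak throughput.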