Andrej Karpathy

@karpathy

πŸ§‘β€πŸ³. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets πŸ§ πŸ€–πŸ’₯

Joined 21-04-2009 06:49:15

8.7K Tweets

979.4K Followers

905 Following


New (2h13m 😅) lecture: 'Let's build the GPT Tokenizer'

Tokenizers are a completely separate stage of the LLM pipeline: they have their own training set, training algorithm (Byte Pair Encoding), and after training implement two functions: encode() from strings to tokens, and
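To make the separate-stage idea concrete, here is a minimal sketch of a byte-level Byte Pair Encoding tokenizer. It is not the code from the lecture. The function names (train, merge, get_pair_counts), the toy training string, and the choice to start from raw UTF-8 bytes are all assumptions made for illustration. The post above is cut off after encode(), so the decode() shown here is also an assumption: the natural inverse, mapping tokens back to a string.

```python
from collections import Counter


def get_pair_counts(ids):
    """Count how often each adjacent pair of token ids occurs."""
    return Counter(zip(ids, ids[1:]))


def merge(ids, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` with `new_id`, left to right."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train(text, num_merges):
    """Learn up to `num_merges` merges from the tokenizer's own training text."""
    ids = list(text.encode("utf-8"))  # start from raw bytes: ids 0..255
    merges = {}  # (id, id) -> new id, kept in the order the merges were learned
    for step in range(num_merges):
        counts = get_pair_counts(ids)
        if not counts:
            break  # fewer than two tokens left, nothing to merge
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        new_id = 256 + step
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges


def encode(text, merges):
    """String -> list of token ids, applying merges in the order they were learned."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():
        ids = merge(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """List of token ids -> string (assumed inverse of encode)."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (a, b), new_id in merges.items():
        vocab[new_id] = vocab[a] + vocab[b]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")


# Toy usage: the training text here is hypothetical and far too small to be useful.
merges = train("aaabdaaabac", num_merges=3)
tokens = encode("aaabdaaabac", merges)
assert decode(tokens, merges) == "aaabdaaabac"
```

Note that train() only ever sees the tokenizer's own training text. The language model would later consume the ids that encode() produces, which is what makes this a separate stage.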
