ƬⲘ 👾 (@tm23twt)'s Twitter Profile
ƬⲘ 👾

@tm23twt

19 . ml . engineer✌️ . tm23-portfolio.vercel.app . tmwork.vercel.app

ID: 1918285228403253249

02-05-2025 12:43:57

1.1K Tweets

348 Followers

253 Following

ml is hard, like genuinely hard chat
not trying to demotivate y'all but just sharing the reality. my mind is fcked up reading the concepts, and the concepts are so interlinked that if you'll fck one the whole thing will fall. 
need to give this shit some fcking time broo :(

Implemented Transformer Architecture in pytorch :)
finally chat took around 3 days to get this shit done, around 8-9 classes are there, now i got a good understanding of this beautiful shit. Will finish the remaining small portion of paper today only, then we move ahead :)
link👇

one of the worst things you can do is trying to reach the tail of a language, like i wanted to finish html, python or any shit completely. the bigger the lecture, the better it is, i thought, but in reality we apply a few concepts regularly, so stop chasing those outliers :)

chat we're done with the paper & architecture, though kv caching still needs some clarity. for this month will try to finish this 600-page book, maybe somewhere we'll code GPT as it is not that hard. also i don't have great expectations from this book, let's see how it goes :)

chat there are 12 chapters in this book, currently at the 2nd. i genuinely want to finish this shit before this month ends. daily around 20 pages, not a big task, but along the way we read papers, build some shit, so it is a fun challenge. let's see if i can pull this off
book - hollm :)

plans for the day chat
currently at page 77, will target 160. maybe we'll read the papers on Byte Pair Encoding & WordPiece along the way. this chapter deals with tokens & embeddings, then the next one will be on LLMs. honestly im doubting this book cause there is less code, more theory :)

it's weird that i started ml coz i started hating dev, like what other choice was there. in cs ur either in web or ml (talking abt majority), ig it's better to know what you don't want, slowly the hate for dev was replaced by the curiosity for ml
so chat know what u don't want :)

onto third chapter baby, i think this llm phase will go pretty smooth, new topics will be introduced from next chapter ig, also built a song recommendation system (quite easy). after this one, will read the paper or more theory on Byte Pair Encoding & other methods
what abt u chat :)

fck it, read around 70 pages today will deal with this sparse attention shit tomorrow, also maybe we'll code byte pair encoding from scratch to get better understanding. will try to finish this book asap then we'll read some papers & code some more shit from scratch :)

plans for the day chat

- finish chapter 3 (around 16 pages left)
- read paper of Byte Pair Encoding (popular one)
- implement BPE from scratch 
- maybe RoPE & Flash attention as well

what abt u chat :)

ᴘʟᴀɴɴɪɴɢ ᴛᴏ ᴡʀɪᴛᴇ ꜱᴏᴍᴇ ᴍᴀᴄʜɪɴᴇ ʟᴇᴀʀɴɪɴɢ ʙʟᴏɢꜱ ᴄʜᴀᴛ
ɪꜰ ᴜ ɢᴏᴛ ᴀɴʏ ɪᴅᴇᴀ, ᴄᴏᴍᴍᴇɴᴛ ᴅᴏᴡɴ✌️

๐—ป๐—ผ๐˜๐—ต๐—ถ๐—ป๐—ด ๐—ท๐˜‚๐˜€๐˜ ๐—ฐ๐—ผ๐—ฑ๐—ถ๐—ป๐—ด ๐—•๐˜†๐˜๐—ฒ ๐—ฝ๐—ฎ๐—ถ๐—ฟ ๐—ฒ๐—ป๐—ฐ๐—ผ๐—ฑ๐—ถ๐—ป๐—ด ๐—ณ๐—ฟ๐—ผ๐—บ ๐˜€๐—ฐ๐—ฟ๐—ฎ๐˜๐—ฐ๐—ต

for those who don't know, this popular method is used as the tokenization algorithm for models like GPT-2 to GPT-4, Llama 3 & many more :)
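
A rough sketch of what "BPE from scratch" can look like, assuming a toy character-level version trained on whitespace-split words. Real GPT-style tokenizers work on bytes and add pre-tokenization rules, so this is not the code from the tweet. The names `get_pair_counts`, `merge_pair` and `train_bpe` are hypothetical, picked only for illustration.

```python
from collections import Counter


def get_pair_counts(words):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    counts = Counter()
    for symbols, freq in words.items():
        for pair in zip(symbols, symbols[1:]):
            counts[pair] += freq
    return counts


def merge_pair(words, pair):
    """Rewrite every word so each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a whitespace-split corpus."""
    # Assumption: each word starts as a tuple of single characters.
    words = dict(Counter(tuple(w) for w in corpus.split()))
    merges = []
    for _ in range(num_merges):
        counts = get_pair_counts(words)
        if not counts:
            break  # every word is already a single symbol
        # Ties go to the pair that was counted first (arbitrary choice).
        best = max(counts, key=counts.get)
        merges.append(best)
        words = merge_pair(words, best)
    return merges


merges = train_bpe("low low low lower lowest", 2)
print(merges)
```

Each round merges the most frequent adjacent pair into a new symbol, so the vocabulary grows by one entry per merge. Encoding new text would replay the learned merges in the same order.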

๐——๐—ผ๐—ป๐—ฒ ๐˜„๐—ถ๐˜๐—ต ๐—•๐˜†๐˜๐—ฒ ๐—ฃ๐—ฎ๐—ถ๐—ฟ ๐—˜๐—ป๐—ฐ๐—ผ๐—ฑ๐—ถ๐—ป๐—ด ๐—ฝ๐—ฎ๐—ฝ๐—ฒ๐—ฟ
now onto next paper of Rotary Positional Embeddings, im slowly getting hang of the research papers bro. will share the code of bpe later. also weekend is cooked chat, btw what y'all doing :)
paper link๐Ÿ‘‡