ƬⲘ 👾 (@tm23twt)'s Twitter Profile
ƬⲘ 👾

@tm23twt

19 . ml . engineer✌️. tm23-portfolio.vercel.app . tmwork.vercel.app

ID: 1918285228403253249

Joined: 02-05-2025 12:43:57

1.1K Tweets

348 Followers

253 Following

ml is hard, like genuinely hard chat
not trying to demotivate y'all, just sharing the reality. my mind is fcked up reading the concepts, and they're so interlinked that if you fck up one, the whole thing falls apart.
need to give this shit some fcking time broo :(

Implemented the Transformer architecture in PyTorch :)
finally chat, took around 3 days to get this shit done, there are around 8-9 classes, and now i've got a good understanding of this beautiful shit. will finish the small remaining portion of the paper today, then we move ahead :)
link👇

one of the worst things you can do is trying to reach the tail of a language, like i wanted to finish html, python or any shit completely. i thought the bigger the lecture, the better it is, but in reality we only apply a few concepts regularly, so stop chasing those outliers :)

chat we're done with the paper & architecture, though kv caching still needs some clarity. for this month will try to finish this 600-page book, maybe somewhere along the way we'll code GPT as it is not that hard. also i don't have great expectations from this book, let's see how it goes :)
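
For the kv caching part, here is a minimal sketch of the idea, assuming a single attention head and plain Python lists instead of tensors. The projection matrices `W_Q`, `W_K`, `W_V` and the toy `tokens` are made-up numbers, and the function names are hypothetical, not taken from the implementation mentioned above. The point it tries to show: during decoding, the keys and values of earlier tokens never change, so they can be stored and only the newest token needs projecting.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def attend(q, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w / total * v[j] for w, v in zip(weights, values)) for j in range(dim)]

# Toy projection matrices (made-up numbers, purely illustrative).
W_Q = [[1.0, 0.0], [0.5, 1.0]]
W_K = [[0.0, 1.0], [1.0, 0.5]]
W_V = [[1.0, 1.0], [0.0, 2.0]]

def decode_without_cache(tokens):
    """Recompute every key and value for the whole prefix at each step."""
    outputs = []
    for t in range(len(tokens)):
        prefix = tokens[: t + 1]
        keys = [matvec(W_K, x) for x in prefix]
        values = [matvec(W_V, x) for x in prefix]
        outputs.append(attend(matvec(W_Q, tokens[t]), keys, values))
    return outputs

def decode_with_cache(tokens):
    """Project only the newest token and append it to the cached lists."""
    keys, values, outputs = [], [], []
    for x in tokens:
        keys.append(matvec(W_K, x))
        values.append(matvec(W_V, x))
        outputs.append(attend(matvec(W_Q, x), keys, values))
    return outputs

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
```

Both functions should give the same outputs. The cached version does one key/value projection per step, while the uncached one redoes the whole prefix, which is where the saving comes from on long sequences.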

chat there are 12 chapters in this book, currently at the 2nd. i genuinely want to finish this shit before this month ends. daily around 20 pages, not a big task, but along the way we read papers & build some shit, so it is a fun challenge. let's see if i can pull this off
book - hollm :)

plans for the day chat
currently at page 77, will target 160. maybe we'll read the papers on Byte Pair Encoding & WordPiece along the way. this chapter deals with tokens & embeddings, then the next one will be on LLMs. honestly im doubting this book cause there is less code, more theory :)

it's weird that i started ml coz i started hating dev, like what other choice was there. in cs ur either in web or ml (talking abt majority), ig it's better to know what you don't want, slowly the hate for dev was replaced by the curiosity for ml
so chat know what u don't want :)

onto the third chapter baby, i think this llm phase will go pretty smooth, new topics will be introduced from the next chapter ig. also built a song recommendation system (quite easy). after this one, will read the paper or more theory on Byte Pair Encoding & other methods
what abt u chat :)

fck it, read around 70 pages today, will deal with this sparse attention shit tomorrow. also maybe we'll code byte pair encoding from scratch to get a better understanding. will try to finish this book asap, then we'll read some papers & code some more shit from scratch :)

plans for the day chat

- finish chapter 3 (around 16 pages left)
- read paper of Byte Pair Encoding (popular one)
- implement BPE from scratch 
- maybe RoPE & Flash attention as well

what abt u chat :)

planning to write some machine learning blogs chat
if u got any idea, comment down✌️

nothing just coding Byte Pair Encoding from scratch

for those who don't know, this popular method is used as the tokenization algorithm for models like GPT-2 to GPT-4, Llama 3 & many more :)
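
A rough sketch of what BPE training looks like, assuming a character-level toy version: start from single characters, repeatedly find the most frequent adjacent pair, and merge it into one symbol. The GPT-style tokenizers work on bytes and add pre-tokenization rules, so this is a simplification and not the code promised in this thread. The function names, the lexicographic tie-break, and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter

def get_pair_counts(words):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    counts = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            counts[(a, b)] += freq
    return counts

def merge_symbols(symbols, pair):
    """Replace each occurrence of `pair` in a symbol sequence with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

def train_bpe(corpus, num_merges):
    """Learn an ordered list of merges from a whitespace-split corpus."""
    words = dict(Counter(tuple(w) for w in corpus.split()))
    merges = []
    for _ in range(num_merges):
        counts = get_pair_counts(words)
        if not counts:
            break
        # Most frequent pair; ties broken lexicographically so runs are repeatable.
        best = max(counts, key=lambda p: (counts[p], p))
        merged = {}
        for symbols, freq in words.items():
            key = tuple(merge_symbols(list(symbols), best))
            merged[key] = merged.get(key, 0) + freq
        words = merged
        merges.append(best)
    return merges

def encode(word, merges):
    """Tokenize a word by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_symbols(symbols, pair)
    return symbols

corpus = "low " * 5 + "lower " * 2 + "newest " * 6 + "widest " * 3
merges = train_bpe(corpus, 4)
print(encode("lowest", merges))  # ['low', 'est']
```

Note that "lowest" never appears in the toy corpus, yet it still splits into two known pieces. That is the main appeal of subword tokenization: unseen words fall back to smaller learned units instead of an unknown token.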

Done with Byte Pair Encoding paper
now onto the next paper, on Rotary Positional Embeddings. im slowly getting the hang of research papers bro. will share the bpe code later. also weekend is cooked chat, btw what y'all doing :)
paper link👇
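
Since Rotary Positional Embeddings come up next, here is a small sketch of the core operation, assuming plain Python lists and the usual base of 10000. The function name `rope` and the example vectors are hypothetical. Each consecutive pair of dimensions is rotated by an angle proportional to the token position, which is why the dot product between a rotated query and a rotated key ends up depending only on the distance between their positions.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by position-dependent angles."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even number of dimensions")
    out = []
    for i in range(0, d, 2):
        # Pair i // 2 uses frequency base^(-i / d): early pairs rotate fastest.
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]

# Same offset (2) at different absolute positions gives the same score.
score_near_start = dot(rope(q, 3), rope(k, 1))
score_further_on = dot(rope(q, 7), rope(k, 5))
```

The two scores should match up to floating-point error, and the rotation leaves vector lengths unchanged, so the position information is carried purely by the angles.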