Tanishq Mathew Abraham, Ph.D. (@iScienceLuvr)'s Twitter Profile
Tanishq Mathew Abraham, Ph.D.

@iScienceLuvr

PhD at 19 |
Founder and CEO at @MedARC_AI |
Research Director at @StabilityAI |
@kaggle Notebooks GM |
Biomed. engineer @ 14 |
TEDx talk ➡ https://t.co/xPxwKTq6Qb

ID:441465751

Link: https://tanishq.ai
Joined: 20-12-2011 03:45:50

12.9K Tweets

54.5K Followers

1.0K Following

Tanishq Mathew Abraham, Ph.D.(@iScienceLuvr) 's Twitter Profile Photo

xLSTM: Extended Long Short-Term Memory

abs: arxiv.org/abs/2405.04517

Leveraging the latest techniques from modern LLMs, mitigating known limitations of LSTMs (introducing sLSTM and mLSTM memory cells that form the xLSTM blocks), and scaling up results in a highly competitive…


🦣 MAmmoTH2: Scaling Instructions from the Web

proj page: tiger-ai-lab.github.io/MAmmoTH2/
abs: arxiv.org/abs/2405.03548

Introduces the WebInstruct dataset, a dataset of 10M instruction examples harvested from Common Crawl through a 3-step process: 1) seed data from crawling quiz…


Is Flash Attention Stable?

abs: arxiv.org/abs/2405.02803

Researchers from Meta introduce a new framework to understand the effects of numeric deviation and apply this framework to analyze Flash Attention.

'Flash Attention sees roughly an order of magnitude more numeric…


A reminder that some of the most competitive open-source foundation models come from China: Qwen, Yi, InternLM, Deepseek, BGE, CogVLM, etc.

The narrative that China is behind on AI is simply not true. Instead they are making major contributions to the ecosystem and community.


First of all, depending on the textbooks, twenty can be a lot!

Second of all, there's so much tacit, practical knowledge that is not in textbooks and often the only way to get that knowledge is by doing stuff hands-on & talking to others.

So IMO this take isn't very accurate


Join me in congratulating my younger sister Tiara Abraham 👏👏

At 18 she is the youngest student across all 7 Indiana University campuses to receive a grad degree 🎓!

Next up, doctorate in music also at IU Bloomington 🎵🎶
