Sascha Kirch
@sascha_kirch
🎓 PhD Student in Deep Learning @ UNED 🚙 Expert Deep Learning @ Bosch 🤖 Collaborating Researcher @ Volograms ⚡️ President Elect @ IEEE Eta Kappa Nu Nu Alpha
ID: 1479556576260276224
https://sascha-kirch.github.io/ 07-01-2022 20:52:44
89 Tweets
47 Followers
42 Following
"On our journey towards Mamba selective state space models and their recent achievements in research, understanding the state space model is crucial." Sascha Kirch explores the Mamba state space model in a new, well-illustrated deep dive. buff.ly/4dwUE9b
The Mamba model architecture has generated a lot of buzz as a potential replacement for the powerful Transformer. Sascha Kirch shares the first in a series of articles, aiming to look closely at its inner workings and potential use cases. buff.ly/3SL95Ov
In part three of his top-notch series on Mamba state space models, Sascha Kirch turns to use cases focused on images, videos, and time series. buff.ly/3Z6C0R4
"The Structured SSM approximates the context using the HiPPO matrix resulting in some compression, while it can be trained more efficiently as the RNN because of its convolutional representation." Sascha Kirch's deep dive explores Mamba state space models for images, videos,
Dive into the future of image processing with Vision Mamba! Unlike Transformer models, Vision Mamba’s sub-quadratic scaling is a game-changer for dense-prediction tasks on high-res images. Read Sascha Kirch's full article now. towardsdatascience.com/vision-mamba-l… #DataScience
Ever wondered how Vision Mamba outperforms Transformers in handling long sequences and high-resolution images? It’s all about state representation! Discover the innovative design choices making waves in vision tech, written by Sascha Kirch. #DataScience #MachineLearning
The Mamba model architecture has generated a lot of buzz as a potential replacement for the powerful Transformer. Sascha Kirch recently shared the first in a series of articles, aiming to look closely at its inner workings and potential use cases. towardsdatascience.com/towards-mamba-…
Equipped with the Mamba selective state space model, we are now able to let history repeat itself and transfer the success of SSMs from sequence data to non-sequence data: images. 🖊️ by Sascha Kirch | #DataScience #Programming towardsdatascience.com/vision-mamba-l…
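As a rough illustration of that step from sequences to images (a sketch under an assumed 16-pixel patch size, not the article's code), a vision SSM first cuts the image into patches and flattens them into a token sequence it can then process like any other sequence:

import numpy as np

# Illustrative only: turn an image into a sequence of patch tokens.
C, H, W, P = 3, 224, 224, 16                     # channels, height, width, assumed patch size
img = np.random.randn(C, H, W)

patches = img.reshape(C, H // P, P, W // P, P)   # split both spatial dimensions into patches
patches = patches.transpose(1, 3, 0, 2, 4)       # group values by patch position
tokens = patches.reshape(-1, C * P * P)          # one row per patch token

print(tokens.shape)  # (196, 768): a 14 x 14 grid of patches, each flattened to 768 values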
You may have heard the phrase: “Attention scales poorly with the sequence length N, specifically with O(N²).” In this article, Sascha Kirch explores why attention is so slow & resource-intensive on modern GPUs, and how FlashAttention addresses the issue. ai.gopubby.com/5a9f2407d739
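To make the O(N²) point concrete, here is a toy NumPy rendering of standard attention (shapes only; this is not FlashAttention itself): the score matrix holds one entry per query-key pair, so memory and compute grow quadratically with the sequence length, while FlashAttention produces the same result in small on-chip tiles without ever materializing that matrix.

import numpy as np

# Toy illustration of the O(N^2) cost of standard attention.
N, d = 4096, 64
Q, K, V = (np.random.randn(N, d) for _ in range(3))

S = Q @ K.T / np.sqrt(d)                        # (N, N) scores -> quadratic in N
P = np.exp(S - S.max(axis=-1, keepdims=True))   # numerically stable softmax over keys
P /= P.sum(axis=-1, keepdims=True)
O = P @ V                                       # (N, d) attention output

print(S.shape)  # (4096, 4096) -- the matrix FlashAttention avoids materializing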