Deep-Learning Times 💙 (@s_chatterjee66)'s Twitter Profile
Deep-Learning Times 💙

@s_chatterjee66

Incoming Researcher @ KrutrimLLM || CS @ Indian Statistical Institute || A researcher in deep learning. Looking for collaborations.
Let's connect!

ID: 1916431237960413184

Link: https://www.linkedin.com/in/sandeep-chatterjee-918290143/
Joined: 27-04-2025 09:57:15

89 Tweets

16 Followers

420 Following

Niloofar (on faculty job market!) (@niloofar_mire)'s Twitter Profile Photo

📣Thrilled to announce I’ll join Carnegie Mellon University (CMU Engineering & Public Policy & Language Technologies Institute | @CarnegieMellon) as an Assistant Professor starting Fall 2026!

Until then, I’ll be a Research Scientist at AI at Meta FAIR in SF, working with Kamalika Chaudhuri’s amazing team on privacy, security, and reasoning in LLMs!
Yujin Kim (@yujin301300)'s Twitter Profile Photo

Introducing our new work: 🚀Mixture-of-Recursions!

🪄We propose a novel framework that dynamically allocates recursion depth per token.

🪄MoR is an efficient architecture with fewer params, reduced KV cache memory, and 2× greater throughput— maintaining comparable performance!
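A toy PyTorch sketch of the idea described in the tweet above (purely illustrative, not the MoR paper's architecture): one parameter-shared block is applied recursively, and a per-token router decides whether each token takes another recursion step or exits early. The module layout, routing rule, and threshold here are all assumptions for illustration.

```python
import torch
import torch.nn as nn

class ToyMixtureOfRecursions(nn.Module):
    """Illustrative only: a shared block reused recursively, with per-token depth gating."""
    def __init__(self, d_model: int = 64, n_heads: int = 4, max_depth: int = 3):
        super().__init__()
        # One shared block reused at every recursion step (hence fewer parameters).
        self.shared_block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.router = nn.Linear(d_model, 1)  # scores "keep recursing" for each token
        self.max_depth = max_depth

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # active[b, t] stays True while token t is still being refined
        active = torch.ones(x.shape[:2], dtype=torch.bool, device=x.device)
        for _ in range(self.max_depth):
            h = self.shared_block(x)
            # Only tokens still routed "deeper" receive the updated representation.
            x = torch.where(active.unsqueeze(-1), h, x)
            active = active & (torch.sigmoid(self.router(x)).squeeze(-1) > 0.5)
        return x

tokens = torch.randn(2, 10, 64)                 # (batch, seq, d_model)
print(ToyMixtureOfRecursions()(tokens).shape)   # torch.Size([2, 10, 64])
```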
Edoardo Debenedetti (@edoardo_debe)'s Twitter Profile Photo

Excited to start as a Research Scientist Intern at Meta, in the GenAI Red Team, where I will keep working on AI agent security. I'll be based in the Bay Area, so reach out if you're around and wanna chat about AI security!
Nate Chen (@chengua46724992)'s Twitter Profile Photo

Why do FFNs use ReLU instead of more precise ones like Exp?

"We propose the following hypothesis: A kernel with lower retrieval precision encourages a more polysemantic key–value memory: multiple unrelated facts can be stored under the same key space"

Great and inspiring read!
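A tiny numerical illustration of the quoted hypothesis (my own toy example, not from the paper): treating an FFN layer as key-value memory, FFN(x) ≈ φ(x·Kᵀ)·V with keys K and values V, an exp/softmax kernel retrieves sharply (roughly one key fires), while ReLU leaves several keys active at once, i.e. lower retrieval precision and room for polysemantic storage.

```python
import numpy as np

rng = np.random.default_rng(0)
K = rng.normal(size=(8, 16))              # 8 memory keys of dimension 16
x = K[3] + 0.1 * rng.normal(size=16)      # a query close to key 3

scores = K @ x
softmax_weights = np.exp(scores - scores.max())
softmax_weights /= softmax_weights.sum()  # exp kernel: sharp, near one-hot retrieval
relu_weights = np.maximum(scores, 0.0)    # ReLU kernel: several keys stay active

print(np.round(softmax_weights, 3))       # mass concentrated on key 3
print(np.round(relu_weights, 3))          # multiple nonzero entries
```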
Yam Peleg (@yampeleg)'s Twitter Profile Photo

Wild paper

They prove (!!) that a transformer block (Attn + MLP) run on a prompt

outputs the same logits as the block run with no prompt,

if the MLP weights are updated by:
W′ = W + ΔW

calculated from the attention latents:
ΔW = (W·Δa) × (A(x)ᵀ / ‖A(x)‖²)

where, given the prompt:
Δa = A(C, x) − A(x)

Fucking fine tuning.
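A quick numpy check of the identity as stated above (a sketch under my reading of the tweet's notation, not the paper's code): with Δa = A(C, x) − A(x) and the rank-1 update ΔW = (W·Δa)·A(x)ᵀ/‖A(x)‖², the updated MLP weights applied to the context-free attention output A(x) reproduce the original weights applied to the context-conditioned output A(C, x).

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff = 16, 64

W = rng.normal(size=(d_ff, d_model))   # first MLP weight matrix
a_x = rng.normal(size=d_model)         # A(x): attention output without the prompt C
a_cx = rng.normal(size=d_model)        # A(C, x): attention output with the prompt C

delta_a = a_cx - a_x                                 # Δa = A(C, x) − A(x)
delta_W = np.outer(W @ delta_a, a_x) / (a_x @ a_x)   # ΔW = (W·Δa) A(x)ᵀ / ‖A(x)‖²

# (W + ΔW) A(x) = W A(x) + (W Δa)(A(x)·A(x))/‖A(x)‖² = W A(C, x)
print(np.allclose((W + delta_W) @ a_x, W @ a_cx))    # True
```

In this sense, conditioning on a prompt acts like an implicit rank-1 "fine-tune" of the MLP weights, which is exactly the reading summarized in the next tweet.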
alphaXiv (@askalphaxiv)'s Twitter Profile Photo

In-context learning is just gradient descent without explicit training!

This paper "Learning without training: The implicit dynamics of in-context learning" shows that ICL can be mathematically interpreted as an implicit low-rank weight update during inference.
ℏεsam (@hesamation)'s Twitter Profile Photo

Fuck ML tutorials. 

This is a collection of 300 real-world ML system design case studies from Stripe, Spotify, Netflix, Meta, etc.

Perfect for interviews and for learning how it’s done on the battlefield. Wish there was a similar thing for agents!
Jackson Atkins (@jacksonatkinsx)'s Twitter Profile Photo

LLMs can now self-optimize. 

A new method allows an AI to rewrite its own prompts to achieve up to 35x greater efficiency, outperforming both Reinforcement Learning and Fine-Tuning for complex reasoning.

UC Berkeley, Stanford, and Databricks introduce a new method called GEPA.
Rosinality (@rosinality)'s Twitter Profile Photo

Geometric-Mean Policy Optimization

Using geometric mean for the importance ratio, similar to GSPO (arxiv.org/abs/2507.18071).
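A minimal sketch of the shared idea (an illustration of a geometric-mean importance ratio, not the GMPO or GSPO implementation; the function name, shapes, and masking convention are my assumptions): the sequence-level importance ratio is taken as the geometric mean of the per-token ratios, i.e. the exponential of the mean per-token log-ratio, which would then be clipped and weighted by the advantage as in PPO-style objectives.

```python
import torch

def geometric_mean_importance_ratio(logp_new: torch.Tensor,
                                    logp_old: torch.Tensor,
                                    mask: torch.Tensor) -> torch.Tensor:
    """Geometric mean of per-token ratios pi_new/pi_old over unmasked tokens.

    logp_new, logp_old: (batch, seq_len) per-token log-probabilities.
    mask: (batch, seq_len), 1 for response tokens, 0 for padding.
    """
    log_ratio = (logp_new - logp_old) * mask
    mean_log_ratio = log_ratio.sum(dim=-1) / mask.sum(dim=-1).clamp(min=1)
    return torch.exp(mean_log_ratio)        # exp(mean log r_t) = (prod r_t)^(1/T)

# Toy usage: one sequence of 4 tokens, all unmasked.
logp_old = torch.log(torch.tensor([[0.5, 0.25, 0.5, 0.125]]))
logp_new = torch.log(torch.tensor([[0.4, 0.30, 0.6, 0.100]]))
mask = torch.ones(1, 4)
print(geometric_mean_importance_ratio(logp_new, logp_old, mask))
```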
Sumanth (@sumanth_077)'s Twitter Profile Photo

Build a Large Language Model from scratch!

This repository contains the code for developing, pretraining, and finetuning a GPT-like large language model.

100% Free & Open Source
Dr. PM Dhakate (@paragenetics)'s Twitter Profile Photo

An amazing video, our national animal and bird, together in one frame! A perfect symbol of India's vibrant spirit. Wishing everyone a Happy Independence Day. Heartfelt Independence Day greetings and best wishes to you all, Jai Hind. 🇮🇳 VC: Rakesh Bhatt #IndependenceDay #JaiHind