Amardeep Singh Sidhu (@thefirehacker) Twitter Tweets • TwiCopy

Amardeep Singh Sidhu

5 days ago

We need more real LLM training case studies. If you’ve seen or shared actual training runs, drop them here 👇 💡 We’re collecting these and turning them into structured breakdowns that are easy to learn and apply. Reading the Curves: How real LLMs learn, spike, recover, and

Likes: 5 · Replies: 1 · Retweets: 2

Ramin Nasibov

@raminnasibov

5 days ago

I saw a guy at coffeeshop today. No iPhone. No laptop. No tablet. Just sitting there. Drinking his coffee.

Likes: 16,16K · Replies: 1,1K · Retweets: 979

DPIIT India

@dpiitgoi

4 days ago

DPIIT releases operational guidelines for ₹10,000 crore Startup India Fund of Funds 2.0—a major push to deepen India’s startup ecosystem. #StartupIndia Read more: pib.gov.in/PressReleasePa…

Likes: 341 · Replies: 10 · Retweets: 57

Anish Moonka

@anisha_moonka

4 days ago

A 9-year-old in India saw two peacocks walking through dawn fog. She ran to her dad, grabbed his camera, knelt on the dirt, and took one photo. It placed at the BBC Wildlife Photographer of the Year, picked from almost 60,000 entries across 117 countries. The photographer is

Likes: 7,7K · Replies: 22 · Retweets: 919

Amardeep Singh Sidhu

@thefirehacker

3 days ago

I found this feature in cursor today "Generate commit message" , very useful. don't have prompt CC or Cursor " generate commit msg for staged code"

Likes: 4 · Replies: 2 · Retweets: 3

⚚Sage

@be_like_sage

3 days ago

Me in a Teams meeting, waiting to say "Nothing From my side"

Likes: 32,32K · Replies: 120 · Retweets: 4,4K

Grad

@grad62304977

3 days ago

I'm still confused by some of the decisions done in deepseek v4 Main confusion is why the huge focus on reducing KV cache size when with something like HiSparse u can offload most of ur kv cache (making ur decode compute bound) This also is compensated with a huge 128 heads and

Likes: 253 · Replies: 27 · Retweets: 8

Erfanzar

@eraznafre

3 days ago

Releasing SpectraX is a JAX-native neural-network library built around true MPMD pipeline parallelism. Each physical rank compiles and runs its own XLA program — no shared shard_map HLO, no SPMD-same-shape constraint. Heterogeneous stages (eg, embed → blocks → head), nine

Likes: 157 · Replies: 6 · Retweets: 18

Puneet Kumar

@puneetiitm

2 days ago

The real sign Indian airports have arrived isn’t the marble or the lounges. It’s that the 90-minute buffer is now obsolete. Mumbai T1 today: in at 5:30, seated by 5:42. T2 last week — same. Delhi — same. When efficiency becomes predictable, the real luxury is leaving home later.

Likes: 693 · Replies: 21 · Retweets: 17

Jeremy Howard

@jeremyphoward

2 days ago

This is great - DeepSeek V4 supports prefill! :D Most other providers have been dropping support for this critically important capability, so wonderful to see at least one company stepping up. api-docs.deepseek.com/guides/chat_pr…

Likes: 420 · Replies: 14 · Retweets: 27

Yacine Mahdid

@yacinelearning

2 days ago

if you are interested in a great lecture on self-distillation I’ve finished editing a ~1h30min lecture with two stellar researchers in that space Jonas Hübotter and idan shenfeld lots of different article distilled into one presentation and a whole lot of questions answered!

Likes: 1,1K · Replies: 14 · Retweets: 118

Ben Burtenshaw

@ben_burtenshaw

2 days ago

Humanity's Last Hackathon is NOW OPEN for registration. This is not a normal hackathon. You will be judged on the context, not the code! Use Codex OpenAI Developers to build and optimize models for local inference (kernels on Max metal). Submit through GPU MODE. Climb the

Likes: 348 · Replies: 20 · Retweets: 32

Matej Sirovatka

@m_sirovatka

2 days ago

We partnered with Hugging Face and OpenAI to allow you to write Metal kernels with AI only. We've seen a rise of AI generated submissions so this time you submit directly through codex to write the fastest kernels to run your AI locally 🫡

Likes: 139 · Replies: 7 · Retweets: 7

Claude

@claudeai

2 days ago

Claude now connects to the tools creative professionals already use. With the new Blender connector, you can debug a scene, build new tools, or batch-apply changes across every object, directly from Claude.

Likes: 22,22K · Replies: 989 · Retweets: 1,1K

Keller Jordan

@kellerjordan0

2 days ago

Modded-NanoGPT Optimization Benchmark Hundreds of neural network optimizers have been proposed in the literature, recently including dozens citing Muon: MARS, SWAN, REG, ADANA, Newton-Muon, TrasMuon, AdaMuon, HTMuon, COSMOS, Conda, ASGO, SAGE, and Magma, to name a few. The

Likes: 217 · Replies: 6 · Retweets: 28

B62 Studios

@b62studios

a day ago

It’s time for Japan to experience the Dhurandhar energy! ⚔️🔥 Arriving in theatres across Japan on 10.7.2026.

Likes: 2,2K · Replies: 28 · Retweets: 298

Priyaa

@pritopian

17 hours ago

India doesn’t need to build its own nuclear program. But it must lead in ensuring the benefits of deterrence are widely shared. See how absurd that sounds? you can’t lead in distributing a benefit you don’t control. if you don't own any part of the core capability chain, you

Likes: 1,1K · Replies: 42 · Retweets: 380