Imama Shehzad (@caffeinix_alche) 's Twitter Profile
Imama Shehzad

@caffeinix_alche

NLP Engineer at @gnaniAi

ID: 1437791370937348103

Joined: 14-09-2021 14:52:49

351 Tweets

88 Followers

861 Following

Imama Shehzad (@caffeinix_alche) 's Twitter Profile Photo

Never knew that XML-based reasoning would give the best fine-grained control for text generation. By integrating structured tags in prompts, we're able to control context flow and target specific attributes like tone and intent.
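A minimal sketch of what this structured prompting can look like. The tag names (`<context>`, `<tone>`, `<intent>`) are illustrative assumptions, not a fixed standard:

```python
# Sketch: assembling a prompt with XML-style tags so a model can attend to
# context, tone, and intent as separately addressable attributes.
# Tag names here are hypothetical, not any official schema.
from xml.sax.saxutils import escape

def build_prompt(context: str, tone: str, intent: str, user_msg: str) -> str:
    """Wrap each attribute in its own tag to give fine-grained control."""
    return (
        f"<context>{escape(context)}</context>\n"
        f"<tone>{escape(tone)}</tone>\n"
        f"<intent>{escape(intent)}</intent>\n"
        f"<user>{escape(user_msg)}</user>"
    )

prompt = build_prompt(
    context="Customer called about a delayed refund.",
    tone="empathetic",
    intent="resolve_complaint",
    user_msg="Where is my money?",
)
print(prompt)
```

Escaping the values keeps user text from breaking the tag structure, which is part of what makes the control "fine-grained" rather than best-effort.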

These lines hit different ✨✨ So many sorrows keep coming to me; what is the reason, when I never even hurt your heart... Now there is no faith left in promises... The heart feels broken; the eyes keep remembering, talking to your pictures for so long now. Come back...

Just because something isn't a science doesn't mean it's flawed. It simply means it exists outside that framework. The absence of current scientific understanding doesn't equate to the non-existence of a concept. -Richard Feynman

Is there a fundamental trade-off between reasoning and instruction following in LLMs? 'Scaling Reasoning, Losing Control' says yes! Their findings suggest that improving reasoning capability comes at the cost of reduced instruction adherence. Quite an interesting read...

"Sometimes we don't want to heal because the pain is the last link to what we've lost." - Ibn Sina It's true: sometimes we cling to the ache because it's all that's left of what we miss. It feels like a betrayal to let go, like erasing beautiful memories.

Well, quite an interesting read! The focus on distinguishing 'generalizable reasoning' from 'pattern matching' in LLMs is great. Also, the idea that models might be picking up 'partial heuristics' rather than true understanding explains the surprising failures in production.

Absolutely! Beyond data unlocking capabilities, what's the optimal balance between diverse and highly targeted datasets for specific capabilities? And for synthetic data, how do we measure its 'transferability' and ensure it doesn't introduce spurious correlations or hallucinations?

Building a multi-turn synthetic dialogue dataset isn’t about random prompts. It’s CoT reasoning + realistic disfluencies + strict call-flow control, with no seed data needed. It’s not just generation. It’s a simulation...
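A toy sketch of this kind of seedless simulation, under the assumption that "call flow control" means a fixed state machine driving turn order; the states, templates, and disfluency list are all illustrative:

```python
import random

# Sketch: seedless multi-turn dialogue simulator. A fixed call flow
# (state machine) controls turn order, and filler words are injected
# into agent utterances to mimic spoken-call disfluencies.
CALL_FLOW = ["greeting", "verify_identity", "collect_issue", "resolve", "closing"]

TEMPLATES = {
    "greeting": ("Hello, thanks for calling support.", "Hi, I need some help."),
    "verify_identity": ("Can I have your account number?", "Sure, it's 12345."),
    "collect_issue": ("What seems to be the problem?", "My order never arrived."),
    "resolve": ("I've reissued the shipment for you.", "Great, thank you."),
    "closing": ("Anything else I can help with?", "No, that's all. Bye."),
}

DISFLUENCIES = ["um, ", "uh, ", "so, "]

def add_disfluency(text, rng):
    # Prepend a filler with 50% probability to mimic spoken speech.
    return rng.choice(DISFLUENCIES) + text if rng.random() < 0.5 else text

def simulate_dialogue(seed=0):
    rng = random.Random(seed)
    turns = []
    for state in CALL_FLOW:  # strict flow control: states never reorder
        agent, user = TEMPLATES[state]
        turns.append({"role": "agent", "state": state,
                      "text": add_disfluency(agent, rng)})
        turns.append({"role": "user", "state": state, "text": user})
    return turns
```

In a real pipeline the templates would be replaced by LLM calls carrying the CoT reasoning, but the control structure (state machine outside the model) is the point of the sketch.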

Can synthetic, semantic-free data boost algorithmic reasoning in LLMs? Yes.
- Procedural datasets inject inductive biases like long-range memory.
- Swapping attention/MLP layers across models + fine-tuning = big gains.
A must-read if you're into reasoning & pre-training LLMs.
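A tiny sketch of what "semantic-free procedural data" can mean in practice. The exact task format below (recall a token seen many tokens earlier) is my illustrative assumption, chosen because it exercises the long-range-memory bias the tweet mentions:

```python
import random

# Sketch: generate a semantic-free procedural task that rewards
# long-range memory: remember one token, read filler, then recall it.
def make_recall_example(rng, gap=20):
    vocab = [chr(c) for c in range(ord("a"), ord("z") + 1)]
    target = rng.choice(vocab)
    # Filler tokens force the model to carry the target across a long span.
    filler = " ".join(rng.choice(vocab) for _ in range(gap))
    prompt = f"remember {target} ; {filler} ; recall:"
    return prompt, target

rng = random.Random(0)
dataset = [make_recall_example(rng) for _ in range(1000)]
```

Because the tokens carry no meaning, anything a model learns from this data is pure procedure, which is what makes gains on it evidence of an inductive bias rather than memorized semantics.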

Absolutely! This trick works by freeing the computational graph associated with that specific loss. Calling it on individual loss components releases memory sequentially, preventing the accumulation of the full graph for the combined loss. Smart memory management! 🤓😎
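Assuming "it" refers to PyTorch's `.backward()` (my reading of this reply), a minimal sketch of the trick:

```python
import torch

# Sketch: calling backward() per loss term frees each term's graph
# immediately (retain_graph defaults to False), while gradients still
# accumulate in .grad exactly as for (loss1 + loss2).backward().
x = torch.tensor(2.0, requires_grad=True)

loss1 = x ** 2        # graph for the first loss term
loss1.backward()      # graph freed here; x.grad = 4.0

loss2 = 3 * x         # a fresh, independent graph
loss2.backward()      # freed again; gradients accumulate: x.grad = 7.0

print(x.grad)  # tensor(7.)
```

The caveat is that the loss terms must not share intermediate activations; if they do, the second `backward()` needs those freed buffers and PyTorch raises an error unless `retain_graph=True` is passed on the first call.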

Okay, so another weekend has passed and I couldn't complete my to-do list AGAIN. It's a constant battle between feeling like I deserve to relax and enjoy myself after a tough week, and the pressure to finish my never-ending to-do list.

Okay, so I'm gonna admit it: watching gradient-norm and loss curves during training isn't just “nice to have.” They're the real-time health checks for the model and optimizer. Miss the spikes or stalls and you'll be chasing bugs the graphs could've told you about in seconds.

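A minimal sketch of automating that health check: compute the global gradient norm and flag steps where it jumps well above the recent average. The window size and spike factor are illustrative defaults, not tuned values:

```python
import math

# Sketch: real-time checks on the gradient-norm series. A step is
# flagged when its global grad norm jumps far above the recent average,
# i.e. the spike a loss/grad-norm graph would reveal at a glance.
def global_grad_norm(grads):
    """L2 norm over all gradient entries across all layers."""
    return math.sqrt(sum(g * g for layer in grads for g in layer))

def find_spikes(norms, window=5, factor=3.0):
    """Return indices where the norm exceeds factor * rolling mean."""
    spikes = []
    for i in range(window, len(norms)):
        baseline = sum(norms[i - window:i]) / window
        if norms[i] > factor * baseline:
            spikes.append(i)
    return spikes

history = [1.0, 1.1, 0.9, 1.0, 1.2, 9.5, 1.0]  # step 5 is a spike
print(find_spikes(history))  # [5]
```

In a training loop, `history` would be fed from the per-step norm (in PyTorch, the value returned by `torch.nn.utils.clip_grad_norm_` is a convenient source), and a flagged index is the moment to checkpoint and inspect.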