amar (@amarproduct) Twitter Tweets • TwiCopy

amar

@amarproduct

product @supermodel_ai. prev @microsoft

ID: 1879612550746406912

15-01-2025 19:56:15

37 Tweets

31 Followers

269 Following

amar

@amarproduct

a year ago

2025 is year of the Gemini

Likes: 1, Replies: 0, Reposts: 0

amar

@amarproduct

a year ago

real-time learning is a crucial component in building agents that autonomously improve at their intended task. I'm excited to see how your product evolves. congrats Andy Kasey Zhang

Likes: 1, Replies: 0, Reposts: 0

I think they’re addressing different parts of the value chain. Uber cabs might become obsolete but they’ll continue as a distribution platform for companies like Waymo and Tesla to deploy their vehicles

Likes: 0, Replies: 0, Reposts: 0

amar

@amarproduct

a year ago

inquiring to buy ai.com

Likes: 1, Replies: 1, Reposts: 0

amar

@amarproduct

a year ago

this is a must-read for anyone looking to better understand reasoning models

Likes: 1, Replies: 0, Reposts: 0

amar

@amarproduct

9 months ago

Vending-Bench by Andon Labs is a great example of practical eval design. Grounding model performance in real-world UX flows is what we need more of. andonlabs.com/evals/vending-…

Likes: 0, Replies: 0, Reposts: 0

amar

@amarproduct

9 months ago

This is a great way to frame AI — not as magic, but as new leverage. When a new form of leverage emerges (like AI agents today), there’s a brief window where the output far exceeds the input before the crowd catches up and margins compress

Likes: 0, Replies: 0, Reposts: 0

amar

@amarproduct

6 months ago

📍Cursor Cafe in NYC Thanks for hosting Ben Lang

Likes: 4, Replies: 0, Reposts: 0

Ali Ansari

@aliniikk

3 months ago

x.com/i/article/2009…

Likes: 4.4K, Replies: 240, Reposts: 622

Shizhe Diao

@shizhediao

3 months ago

RLVR is powerful — but how do you train with multiple rewards effectively? 🤔 🎯GDPO (not GRPO) is coming. We introduce Group reward-Decoupled Normalization Policy Optimization (GDPO), a new multi-reward RL algorithm that consistently improves per-reward convergence over GRPO

Likes: 809, Replies: 23, Reposts: 133

amar

@amarproduct

3 months ago

this explains a lot. multi-reward GRPO kind of felt unstable, which makes sense considering it was designed for single-objective optimization. basically, summing rewards before normalization forces distinct groups to provide identical signals. time to update the default :)

Likes: 0, Replies: 0, Reposts: 0
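
The point in the last post, that summing rewards before normalization can make different reward combinations look the same, can be illustrated with a toy example. This is a simplified sketch and not the GDPO authors' implementation: the function names, the two tiny rollout groups, and the use of population standard deviation are all assumptions chosen for illustration, and any further normalization steps a full method might apply are omitted.

```python
from statistics import mean, pstdev

def normalize(values, eps=1e-8):
    """Group-wise standardization: (value - mean) / (std + eps)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]

def summed_then_normalized(rewards):
    """GRPO-style (as described in the post): sum rewards per rollout, then normalize."""
    totals = [sum(r) for r in rewards]
    return normalize(totals)

def normalized_then_summed(rewards):
    """Decoupled variant (illustrative): normalize each reward on its own, then sum."""
    per_reward = [normalize(list(column)) for column in zip(*rewards)]
    return [sum(vals) for vals in zip(*per_reward)]

# Hypothetical groups of two rollouts, each scored on two binary rewards.
group_a = [(0, 0), (1, 0)]  # second rollout satisfies one reward
group_b = [(0, 0), (1, 1)]  # second rollout satisfies both rewards

if __name__ == "__main__":
    for name, group in (("group_a", group_a), ("group_b", group_b)):
        summed = [round(x, 3) for x in summed_then_normalized(group)]
        decoupled = [round(x, 3) for x in normalized_then_summed(group)]
        print(name, "summed first:", summed, "decoupled:", decoupled)
```

Under these assumptions, summing first gives both groups roughly the same advantages (about -1 and 1), even though the second rollout in group_b satisfied more rewards. Normalizing each reward separately keeps the two cases apart (about ±1 versus ±2).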