Alexandre Drouin (@alexandredrouin)'s Twitter Profile
Alexandre Drouin

@alexandredrouin

Head of Frontier Agent Research @ ServiceNow Research / Adj. Professor @universitelaval @Mila_Quebec

ID: 267909797

Link: http://alexdrouin.com

Joined: 17-03-2011 19:51:18

547 Tweets

948 Followers

663 Following

Dawn Song (@dawnsongtweets)

Really excited to announce our Advanced LLM Agents MOOC (Spring 2025)! Building on the success of our LLM Agents MOOC from Fall 2024 (15K+ registered learners, ~9K Discord members, 200K+ lecture views on YouTube), we are excited to extend the MOOC this semester to cover some more

Léo Boisvert (@leoboisvert)

📊 Breaking: Claude 3.7 Sonnet scores 51.5% on WorkArena benchmark! Surprising finding: The newer Claude 3.7 Sonnet (51.5%) performs below Claude 3.5 (56.4%) on our tests! 👀 Maybe newer isn't always better? Both Claude 3.7 and o3-mini are underperforming their predecessors.

Dawn Song (@dawnsongtweets)

🚀 Really excited to launch #AgentX competition hosted by @BerkeleyRDI @UCBerkeley alongside our LLM Agents MOOC series (a global community of 22k+ learners & growing fast). Whether you're building the next disruptive AI startup or pushing the research frontier, AgentX is your

Juan A. Rodríguez 💫 (@joanrod_ai)

I’m excited to announce that 💫StarVector has been accepted at CVPR 2025! Over a year in the making, StarVector opens a new paradigm for Scalable Vector Graphics (SVG) generation by harnessing multimodal LLMs to generate SVG code that aesthetically mirrors input images and text.

Gaurav Sahu (@dem_fier)

🚀 Exciting news! Our work LitLLM has been accepted in TMLR! LitLLM helps researchers write literature reviews by combining keyword+embedding-based search, and LLM-powered reasoning to find relevant papers and generate high-quality reviews. LitLLM.github.io 🧵 (1/5)

Dawn Song (@dawnsongtweets)

🔥 Thrilled to announce the Agentic AI Summit 2025—the first summit dedicated to #AgenticAI in the Bay Area, hosted by @BerkeleyRDI @UCBerkeley! 🚀 Building on momentum from our LLM Agents MOOC (23k+ global learners!), we're creating the LARGEST gathering of its kind—1,500+

Torsten Scholak (@tscholak)

🚨 SLAM Labs presents Apriel-5B! And it lands right in the green zone 🚨 Speed ⚡ + Accuracy 📈 + Efficiency 💸 This model punches above its weight, beating bigger LLMs while training on a fraction of the compute. Built with Fast-LLM, our in-house training stack. 🧵👇

ServiceNow Research (@servicenowrsrch)

10 Years on and now recognized by ICLR 2025 for standing up to the test-of-time. Please join us in congratulating 🇺🇦 Dzmitry Bahdanau, Kyunghyun Cho & Yoshua Bengio for their seminal work titled “Neural Machine Translation by Jointly Learning to Align and Translate”. arxiv.org/abs/1409.0473

Gabriel Huang (@gabrielhuang9)

1/ How do we evaluate agent vulnerabilities in situ, in dynamic environments, under realistic threat models? We present 🔥 DoomArena 🔥 — a plug-in framework for grounded security testing of AI agents. ✨Project : servicenow.github.io/DoomArena/ 📝Paper: arxiv.org/abs/2504.14064

Alexandre Drouin (@alexandredrouin)

Can your AI agent make it through DoomArena? 😈 Introducing a plug-in framework that adds a layer of security testing on top of any benchmark for AI agents.

Krishnamurthy (Dj) Dvijotham (@djdvij)

1/n Wish you could evaluate AI agents for security vulnerabilities in a realistic setting? Wish no more - today we release DoomArena, a framework that plugs in to YOUR agentic benchmark and enables injecting attacks consistent with any threat model YOU specify

Sara Hooker (@sarahookr)

It is critical for scientific integrity that we trust our measure of progress. The lmarena.ai has become the go-to evaluation for AI progress. Our release today demonstrates the difficulty in maintaining fair evaluations on lmarena.ai, despite best intentions.

Juan A. Rodríguez 💫 (@joanrod_ai)

Thanks AK for sharing our work! Excited to present our next generation of SVG models, now using Reinforcement Learning from Rendering Feedback (RLRF). 🧠 We think we cracked SVG generalization with this one. Go read the paper! arxiv.org/abs/2505.20793 More details on

Emiliano Penaloza (@emilianopp_)

Excited that our paper "Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization" was accepted to ICML 2025! We show how Preference Optimization can reduce the impact of noisy concept labels in CBMs. 🧵/9

Massimo Caccia (@masscaccia)

🎉 Our paper “How to Train Your LLM Web Agent: A Statistical Diagnosis” got an oral at next week’s ICML Workshop on Computer Use Agents! 🖥️🧠 We present the first large-scale

Alexandre Drouin (@alexandredrouin)

📢 Attention Attention ServiceNow Research is hiring a Research Scientist with a focus on Agent Safety+Security 👩🏻‍🔬 Join us to work on impactful open research projects like 🔹DoomArena: github.com/ServiceNow/doo… 🔹BrowserGym: github.com/ServiceNow/Bro… Apply: jobs.smartrecruiters.com/ServiceNow/744…

Massimo Caccia (@masscaccia)

Our oral is tomorrow at 14:40 PDT during ICML Conference’s Workshop on Computer Use Agents (West Meeting Room 211–214)! Attending virtually? Zoom link & details here: icml.cc/virtual/2025/w…