Nicolas Chapados (@nicolaschapados) Twitter Tweets • TwiCopy

Tianyu Zhang

a year ago

[1/n] We are happy to announce our new VLM task, Visual Caption Restoration, along with datasets: arxiv.org/abs/2406.06462, tiny.cc/m06lyz Try it yourself before diving in 😀 Authors: T. Zhang, S. Wang, L. Li, G. Zhang, P. Taslakian, S. Rajeswar, J. Fu, B. Liu, Y. Bengio

Likes: 17 · Replies: 2 · Retweets: 12

Alexandre Lacoste

@alex_lacoste_

a year ago

We’re really excited to release this large collaborative work for unifying web agent benchmarks under the same roof. In this TMLR paper, we dive in-depth into #BrowserGym and #AgentLab. We also present some unexpected performances from Claude 3.5-Sonnet

Likes: 104 · Replies: 3 · Retweets: 32

Nicolas Chapados

@nicolaschapados

10 months ago

Thrilled to be speaking at the first Workshop for Research on Agent Language Models, at ACL 2025 this summer! Congrats to the organizers for putting together a strong program on a timely topic. Consider submitting your work (March 1st deadline).

Likes: 19 · Replies: 0 · Retweets: 8

Gaurav Sahu

@dem_fier

10 months ago

A little late to the party, but really happy to share that our work (arxiv.org/abs/2407.07341) from ServiceNow Research got accepted to #NAACL2025 (Findings), where we propose two sample-efficient methods for effective short and long document summarization! NAACL HLT 2025 1/3

Likes: 12 · Replies: 1 · Retweets: 4

Léo Boisvert

@leoboisvert

10 months ago

📊 Fresh WorkArena benchmark results just dropped! Plot twist: o1-mini (51.8%) > o3-mini (48.2%) Either o1-mini had its coffee this morning ☕️ or we've stumbled upon something interesting 🧐 Replication studies welcome!

Likes: 12 · Replies: 1 · Retweets: 6

Krishnamurthy (Dj) Dvijotham

@djdvij

10 months ago

You drop a model, we drop our eval, boom! Plot twist on o1 vs o3 on our challenging workarena++ benchmark of enterprise knowledge worker tasks

Likes: 11 · Replies: 1 · Retweets: 4

Ravid Shwartz Ziv

@ziv_ravid

10 months ago

🧵 I forgot to update, but our paper "SEQ-VCR: Preventing Collapse in Intermediate Transformer Representations" has been accepted to ICLR! Let me tell you why this is a cool paper... Rifat Arefin, Gopeshh Subbaraj, Nicolas Gontier, Yann LeCun, Irina Rish, Chris Pal

Likes: 117 · Replies: 7 · Retweets: 28

Ahmed Masry

@ahmed_masry97

10 months ago

Happy to announce AlignVLM📏: a novel approach to bridging vision and language latent spaces for multimodal understanding in VLMs! 🌍📄🖼️ 🔗 Read the paper: arxiv.org/abs/2502.01341 🧵👇 Thread

Likes: 208 · Replies: 2 · Retweets: 54

Dawn Song

@dawnsongtweets

10 months ago

Really excited to announce our Advanced LLM Agents MOOC (Spring 2025)! Building on the success of our LLM Agents MOOC from Fall 2024 (15K+ registered learners, ~9K Discord members, 200K+ lecture views on YouTube), we are excited to extend the MOOC this semester to cover some more

Likes: 204 · Replies: 9 · Retweets: 44

Dawn Song

@dawnsongtweets

9 months ago

🚀 Really excited to launch #AgentX competition hosted by UC Berkeley RDI UC Berkeley alongside our LLM Agents MOOC series (a global community of 22k+ learners & growing fast). Whether you're building the next disruptive AI startup or pushing the research frontier, AgentX is your

Likes: 410 · Replies: 20 · Retweets: 108

METR

@metr_evals

8 months ago

When will AI systems be able to carry out long projects independently? In new research, we find a kind of “Moore’s Law for AI agents”: the length of tasks that AIs can do is doubling about every 7 months.

Likes: 4.4K · Replies: 158 · Retweets: 826

Nicolas Chapados

@nicolaschapados

8 months ago

Amazing work by the ServiceNow Research team: a model that converts an image to a vector (SVG) representation! Unleash the artist within...

Likes: 24 · Replies: 0 · Retweets: 5

P Shravan Nayak

@pshravannayak

8 months ago

🚀 Super excited to announce UI-Vision: the largest and most diverse desktop GUI benchmark for evaluating agents in real-world desktop GUIs in offline settings. 📄 Paper: arxiv.org/abs/2503.15661 🌐 Website: uivision.github.io 🧵 Key takeaways 👇

Likes: 75 · Replies: 2 · Retweets: 30

Nicolas Chapados

@nicolaschapados

8 months ago

Finally! Useful scientific literature reviews with LLMs.

Likes: 9 · Replies: 0 · Retweets: 3

Nicolas Chapados

@nicolaschapados

8 months ago

A new open LLM pushing the boundary of performance vs efficiency, from ServiceNow & ServiceNow Research! 💪🏼✨🚀

Likes: 12 · Replies: 0 · Retweets: 0

Nicolas Chapados

@nicolaschapados

7 months ago

Ever wonder what security vulnerabilities lie in your AI agents? Our new DoomArena framework lets you find out!

Likes: 8 · Replies: 0 · Retweets: 1

Nicolas Chapados

@nicolaschapados

7 months ago

All good science starts with a good literature review, and AI can help with that :) LitLLM, brought to you by the awesome team at ServiceNow Research.

Likes: 12 · Replies: 0 · Retweets: 3

Nicolas Chapados

@nicolaschapados

7 months ago

Looking for a scalable async parallel library for reinforcement learning? Look no further than PipelineRL from the ServiceNow Research team!

Likes: 4 · Replies: 0 · Retweets: 1

P Shravan Nayak

@pshravannayak

7 months ago

🚀 Excited to share that UI-Vision has been accepted at ICML 2025! 🎉 We have also released the UI-Vision grounding datasets. Test your agents on it now! 🚀 🤗 Dataset: huggingface.co/datasets/Servi… #ICML2025 #AI #DatasetRelease #Agents

Likes: 36 · Replies: 0 · Retweets: 15

Juan A. Rodríguez 💫

@joanrod_ai

6 months ago

Thanks AK for sharing our work! Excited to present our next generation of SVG models, now using Reinforcement Learning from Rendering Feedback (RLRF). 🧠 We think we cracked SVG generalization with this one. Go read the paper! arxiv.org/abs/2505.20793 More details on

Likes: 122 · Replies: 3 · Retweets: 41