Alexandre L.-Piché (@alexpiche_)'s Twitter Profile
Alexandre L.-Piché

@alexpiche_

Searching for Q* at @ServiceNowRSRCH, Prev. PhD @MilaMontreal & Research intern at @DeepMind.

ID: 394263815

Link: http://alexpiche.github.io
Joined: 19-10-2011 20:28:01

122 Tweets

1.1K Followers

4.4K Following

Nicolas Chapados (@nicolaschapados)

Thrilled to share the release of StarCoder2! ServiceNow, Hugging Face, and NVIDIA have partnered to deliver a family of open-access code LLMs to help developers everywhere tap the power of GenAI to build software better. Check out model checkpoints on the Hugging Face Hub!

Alexandre Lacoste (@alex_lacoste_)

How capable are web agents at solving knowledge work tasks? 🤔 Are LLMs up to the challenge? 🤖 Introducing WorkArena: a benchmark where agents meet the world 𝘸𝘪𝘭𝘥 web of enterprise software 🌐🖥️ Paper: bit.ly/4a7FiFV Website: bit.ly/3VkdJ87 🧵 1/7

Alexandre L.-Piché (@alexpiche_)

We can tweak the target accuracy to obtain different behaviors. High target accuracy: ReSearch is very cautious and produces fewer claims on average. Low target accuracy: ReSearch is less cautious, produces more claims, and yet is *still* more accurate than default behavior.

Rosie Zhao (@rosieyzh)

In our new work on evaluating optimizers for LLM training, we perform a series of experiments to investigate the role of adaptivity in optimizers like Adam in achieving good performance and stability. A thread: 🧵

Alexandre Lacoste (@alex_lacoste_)

Most of our team is at #ICML2024, reach out if you want to meet. We'll be presenting WorkArena and BrowserGym: Poster Session 2 on Tuesday, Hall C 4-9 #610 arxiv.org/abs/2403.07718

Alexandre Drouin (@alexandredrouin)

Interested in time series forecasting and LLMs?

We are looking for visiting researchers to work on context-aided forecasting (example below):
* Benchmarking
* Multimodal Foundation Models
* Agentic forecasting assistants

When: Jan '25 - 8 months
Details: bit.ly/sc25q1

🇺🇦 Dzmitry Bahdanau (@dbahdanau)

🚨 New agent framework! 🚨 My team at ServiceNow Research is releasing TapeAgents: a holistic framework for agent development and optimization. At its core is the tape: a structured agent log. Repo: github.com/ServiceNow/Tap… Paper: servicenow.com/research/TapeA… Why you should care: 🧵

Krishnamurthy (Dj) Dvijotham (@djdvij)

The dominant paradigm in AI alignment is to learn from human feedback. But what form should this feedback take? Does a simple thumbs up/down suffice? Finer-grained attributes? Our paper ojs.aaai.org/index.php/AIES…, led by the amazing Katie Collins at #AIES, studies these questions.

Krishnamurthy (Dj) Dvijotham (@djdvij)

I am also hiring for my new team at ServiceNow Research. Please reach out if you are at the conference and interested in building the future of secure AI for the enterprise. We have openings for interns, engineers, and researchers.

Alexandre Lacoste (@alex_lacoste_)

@AnthropicAI Early results with Claude 3.5 Sonnet for our new paper. We're probably not even using it right yet and its performance is through the roof, leaving o1-mini in the dust (o1-preview results are coming). See github.com/ServiceNow/Bro… for a growing number of web UI benchmarks.

🇺🇦 Dzmitry Bahdanau (@dbahdanau)

I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until your bored GPUs finish all sequences? Just update the weights and continue inference! Code: github.com/ServiceNow/Pip… Blog: huggingface.co/blog/ServiceNo…
