Rasool Fakoor (@rasoolfa) 's Twitter Profile
Rasool Fakoor

@rasoolfa

Research in RL & ML at AWS AI.

ID: 1016995602

Link: https://rasoolfa.github.io/ · Joined: 17-12-2012 08:38:52

595 Tweets

385 Followers

912 Following

Yao Liu (@yaoliucs) 's Twitter Profile Photo

Offline RL is much harder than online RL or imitation learning as it needs to solve a sequence of counterfactual reasoning problems. That often gives an error of (1+\delta)^H, where delta is the one-step divergence of policy or extrapolation of Q and H is the horizon. 1/N
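
To give a feel for how quickly the (1+\delta)^H factor quoted above grows, here is a minimal sketch that simply evaluates it for a few values. The function name `compounding_factor` and the sample values of delta and H are hypothetical, chosen only for illustration; they do not come from the thread.

```python
def compounding_factor(delta: float, horizon: int) -> float:
    """Evaluate (1 + delta) ** horizon, the factor quoted in the post.

    delta   -- assumed one-step divergence (illustrative value, not from the thread)
    horizon -- assumed number of steps H (illustrative value, not from the thread)
    """
    if delta < 0 or horizon < 0:
        raise ValueError("delta and horizon must be non-negative")
    return (1.0 + delta) ** horizon


if __name__ == "__main__":
    # Print the factor for a small grid of illustrative settings.
    for delta in (0.01, 0.05, 0.1):
        for horizon in (10, 100):
            factor = compounding_factor(delta, horizon)
            print(f"delta={delta:<5} H={horizon:<4} factor={factor:.4g}")
```

Even a one-step divergence of 0.01 gives a factor of about 2.7 at H = 100, which is the exponential-in-horizon behavior the post refers to.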

Yao Liu (@yaoliucs) 's Twitter Profile Photo

One common misconception about (deep) RL is that it was done by first defining some empirical loss as objective and then deriving model updating rules from GD, just like supervised learning. This is NOT the case for popular RL algorithms like policy gradient or TD-based. 1/N

Rasool Fakoor (@rasoolfa) 's Twitter Profile Photo

Our team at AWS is *hiring* interns and full-time researchers! Yao Liu, @pratikac, I, and others work on RL, alignment, large models, and ML in general. If you have strong relevant publications in those areas, please fill out this form. forms.gle/5KsNZ1zyKArLF4…

Alex Smola (@smolix) 's Twitter Profile Photo

Proud to release the first LLM from BosonAI. Higgs-Llama-3-70B, built for characters and gameplay, trained on Boson-3 base. With great MMLU-Pro performance. boson.ai/higgs-opensour…

Jesse Zhang (@jesse_y_zhang) 's Twitter Profile Photo

How can robots efficiently learn **new tasks/in new settings**? Introducing EXTRACT: a reinforcement learning (RL) framework that extracts a discrete + continuously parameterized skill library from offline data for efficient RL on new tasks! Accepted to CoRL 2024: 🧵👇

Jesse Zhang (@jesse_y_zhang) 's Twitter Profile Photo

I’ll be presenting this work at CoRL 2024 in about a month. Let’s chat about sample-efficient robot adaptation! Website: jessezhang.net/projects/extra… Paper: arxiv.org/abs/2406.17768 Coauthors: Minho Heo, Zuxin Liu, Erdem Bıyık, Joseph Lim, Yao Liu, Rasool Fakoor

Ke Yang (@empathyang) 's Twitter Profile Photo

👾 Introducing AgentOccam: Automating Web Tasks with LLMs! 🌐 AgentOccam showcases the impressive power of Large Language Models (LLMs) on web tasks, without any in-context examples, new agent roles, online feedback, or search strategies. 🏄🏄🏄 🧙 Link: arxiv.org/abs/2410.13825

Ke Yang (@empathyang) 's Twitter Profile Photo

Excited to announce that our web agent paper, AgentOccam, has been accepted to ICLR 2025! 🏂🏂🏂 Huge thanks to all collaborators! 😊 Special thanks to my brilliant and considerate mentor, Yao Liu, for your constant guidance and encouragement! Sapana Chaudhary and Rasool

Tianwei Ni (@twni2016) 's Twitter Profile Photo

Can we make LLMs reason effectively without a huge inference time cost? We show a powerful approach through learning and forgetting! Our recipe: 1️⃣ Aggregate reasoning paths from diverse sources: Chain-of-Thought, inference-time search (Tree-of-Thought, Reasoning-via-Planning),

Robert Yang (@guangyurobert) 's Twitter Profile Photo

Many of you have known us as Altera. Today, I'm happy to share that we are now officially Fundamental Research Labs! We will be unveiling our next big step today, so it felt perfect to reintroduce ourselves: digitalhumanity.substack.com/p/introducing-…

nico (@nicochristie) 's Twitter Profile Photo

Shortcut – the first superhuman Excel agent – is live. While not perfect, Shortcut beats first-year analysts from McKinsey/Goldman head-to-head 89.1% (220:27) when blindly judged by their managers. We even gave humans 10x more time. Try Shortcut now (before your boss does).

Robert Yang (@guangyurobert) 's Twitter Profile Photo

Our Excel Agent, Shortcut, is generally available now! Greatly improved trustworthiness & accuracy. ~90% win rate against top first-year analysts. 26 days since early access, 28 versions shipped. So proud of the team, and really appreciate all the feedback from our users!

Rasool Fakoor (@rasoolfa) 's Twitter Profile Photo

Our team is *hiring* interns & researchers! We’re a small team of hardcore researchers & engineers working on foundation models, agentic methods, and embodiment. If you have strong publications and related experience, plz fill out the application form. forms.gle/4bUeFfksUhCLap…