Rasool Fakoor (@rasoolfa) Twitter Tweets • TwiCopy

Yao Liu

2 years ago

Offline RL is much harder than online RL or imitation learning as it needs to solve a sequence of counterfactual reasoning problems. That often gives an error of (1+\delta)^H, where delta is the one-step divergence of policy or extrapolation of Q and H is the horizon. 1/N
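A quick numerical sketch of how fast the (1+\delta)^H factor quoted above grows with the horizon. The function name and the sample values of delta and H are chosen here for illustration only; they do not come from the thread.

```python
def compounding_error_factor(delta: float, horizon: int) -> float:
    """Return (1 + delta) ** horizon, the multiplicative factor from the tweet.

    delta   -- one-step divergence (hypothetical sample value below)
    horizon -- number of steps H over which the one-step error compounds
    """
    return (1.0 + delta) ** horizon


if __name__ == "__main__":
    # Even a 1% one-step divergence compounds quickly as H grows.
    for horizon in (10, 100, 1000):
        factor = compounding_error_factor(0.01, horizon)
        print(f"H={horizon:5d}  (1+delta)^H = {factor:.4g}")
```

With delta = 0.01 the factor stays close to 1 for short horizons but grows exponentially in H, which is the point of the tweet.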

Yao Liu

@yaoliucs

2 years ago

One common misconception about (deep) RL is that it was done by first defining some empirical loss as objective and then deriving model updating rules from GD, just like supervised learning. This is NOT the case for popular RL algorithms like policy gradient or TD-based methods. 1/N
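One standard illustration of this point, which may not be the argument the rest of the thread makes: with a linear value function, the TD(0) update is a "semi-gradient" and differs from true gradient descent on the squared TD error. The function names and the toy features below are hypothetical.

```python
def td_error(w, phi_s, phi_next, reward, gamma):
    """TD error: r + gamma * V(s') - V(s), with linear V(s) = w . phi(s)."""
    v_s = sum(wi * xi for wi, xi in zip(w, phi_s))
    v_next = sum(wi * xi for wi, xi in zip(w, phi_next))
    return reward + gamma * v_next - v_s


def semi_gradient_td_direction(w, phi_s, phi_next, reward, gamma):
    """TD(0) update direction: the bootstrapped target is treated as a constant."""
    d = td_error(w, phi_s, phi_next, reward, gamma)
    return [d * x for x in phi_s]


def squared_error_descent_direction(w, phi_s, phi_next, reward, gamma):
    """Negative gradient of 0.5 * td_error**2, differentiating through V(s') too."""
    d = td_error(w, phi_s, phi_next, reward, gamma)
    return [d * (x - gamma * xn) for x, xn in zip(phi_s, phi_next)]


if __name__ == "__main__":
    # Toy transition with one-hot features (illustrative values only).
    w, phi_s, phi_next = [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]
    print(semi_gradient_td_direction(w, phi_s, phi_next, 1.0, 0.9))
    print(squared_error_descent_direction(w, phi_s, phi_next, 1.0, 0.9))
```

The two directions differ in the component tied to the next state's features, so the TD(0) rule is not plain gradient descent on this particular loss.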

Rasool Fakoor

@rasoolfa

2 years ago

Our team at AWS is *hiring* interns and full-time researchers! Yao Liu, @pratikac, I, and others work on RL, alignment, large models, and ML in general. If you have strong relevant publications in those areas, please fill out this form. forms.gle/5KsNZ1zyKArLF4…

Alex Smola

@smolix

a year ago

Proud to release the first LLM from BosonAI. Higgs-Llama-3-70B, built for characters and gameplay, trained on Boson-3 base. With great MMLU-Pro performance. boson.ai/higgs-opensour…

Jesse Zhang

@jesse_y_zhang

a year ago

How can robots efficiently learn **new tasks/in new settings**? Introducing EXTRACT: a reinforcement learning (RL) framework that extracts a discrete + continuously parameterized skill library from offline data for efficient RL on new tasks! Accepted to CoRL 2024: 🧵👇

Jesse Zhang

@jesse_y_zhang

a year ago

I’ll be presenting this work at CoRL 2024 in about a month. Let’s chat about sample-efficient robot adaptation! Website: jessezhang.net/projects/extra… Paper: arxiv.org/abs/2406.17768 Coauthors: Minho Heo, Zuxin Liu, Erdem Bıyık, Joseph Lim, Yao Liu, Rasool Fakoor

Ke Yang

@empathyang

a year ago

👾 Introducing AgentOccam: Automating Web Tasks with LLMs! 🌐 AgentOccam showcases the impressive power of Large Language Models (LLMs) on web tasks, without any in-context examples, new agent roles, online feedback, or search strategies. 🏄🏄🏄 🧙 Link: arxiv.org/abs/2410.13825

Ke Yang

@empathyang

9 months ago

Excited to announce that our web agent paper, AgentOccam, has been accepted to ICLR 2025! 🏂🏂🏂 Huge thanks to all collaborators! 😊 Special thanks to my brilliant and considerate mentor, Yao Liu, for your constant guidance and encouragement! Sapana Chaudhary and Rasool

Tianwei Ni

@twni2016

6 months ago

Can we make LLMs reason effectively without a huge inference time cost? We show a powerful approach through learning and forgetting! Our recipe: 1️⃣ Aggregate reasoning paths from diverse sources: Chain-of-Thought, inference-time search (Tree-of-Thought, Reasoning-via-Planning),

Robert Yang

@guangyurobert

5 months ago

Many of you have known us as Altera. Today, I'm happy to share that we are now officially Fundamental Research Labs! We will be unveiling our next big step today, so it felt perfect to reintroduce ourselves: digitalhumanity.substack.com/p/introducing-…

nico

@nicochristie

3 months ago

Shortcut – the first superhuman excel agent – is live. While not perfect, Shortcut beats first year analysts from McKinsey/Goldman head-to-head 89.1% (220:27) when blindly judged by their managers. We even gave humans 10x more time. Try Shortcut now (before your boss does).

Robert Yang

@guangyurobert

3 months ago

Our Excel Agent, Shortcut, is generally available now! Greatly improved trustworthiness & accuracy. ~90% win rate against top first-year analysts. 26 days since early access, 28 versions shipped. So proud of the team, and really appreciate all the feedback from our users!

Rasool Fakoor

@rasoolfa

3 months ago

Our team is *hiring* interns & researchers! We’re a small team of hardcore researchers & engineers working on foundation models, agentic methods, and embodiment. If you have strong publications and related experience, plz fill out application form. forms.gle/4bUeFfksUhCLap…

Rasool Fakoor

@rasoolfa

3 months ago

The application closes on Tuesday (8/12). If you are interested, please apply and don't wait until the last minute.
