Haotian Fu (@haotiannnnnnnnn)'s Twitter Profile
Haotian Fu

@haotiannnnnnnnn

PhD student @BrownUniversity. Reinforcement Learning/Embodied Agents. Intern at The Boston Dynamics AI Institute. Prev Intern @MSFTResearch.

ID: 1110780990278885376

Link: https://haotianfu.me/ · Joined: 27-03-2019 05:50:00

66 Tweets

266 Followers

399 Following

Haotian Fu (@haotiannnnnnnnn)'s Twitter Profile Photo

We will present this work at ICML in person next Thursday. Feel free to drop by if you have any questions or just want to say hello! #ICML2024

Xidong Feng (@xidong_feng)'s Twitter Profile Photo

I had an amazing internship experience working on the Discovery team! Don't hesitate to reach out/apply if you are passionate about RL, Planning, and LLMs!

Paul Zhou (@zhiyuan_zhou_)'s Twitter Profile Photo

Can robots self-improve by collecting data autonomously 🤖? Introducing SOAR: a system for large-scale autonomous data collection 🚀 and autonomous improvement 📈 of a multi-task language-conditioned policy in diverse scenes, without human intervention. auto-improvement.github.io
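
As a rough intuition for the kind of system described above, here is a hedged Python sketch of an autonomous collect-then-improve loop. Every class, method, and environment call below is a hypothetical stand-in, not the actual SOAR implementation (see auto-improvement.github.io for that).

```python
# Hypothetical sketch of an autonomous collect-and-improve loop in the spirit
# of SOAR as summarized in the tweet; none of these names come from the SOAR code.
import random

class LanguageConditionedPolicy:
    """Placeholder multi-task policy: maps (observation, instruction) -> action."""
    def act(self, observation, instruction):
        return random.choice(["reach", "grasp", "place"])  # stand-in action

    def update(self, dataset):
        pass  # in practice: behavior cloning / RL on the autonomously collected data

def autonomous_improvement(policy, env, instructions, rounds=3, episodes_per_round=10):
    """Alternate between autonomous data collection and policy improvement."""
    dataset = []
    for _ in range(rounds):
        # 1) Autonomous data collection: no human resets or labels.
        for _ in range(episodes_per_round):
            instruction = random.choice(instructions)
            obs, done = env.reset(), False
            while not done:
                action = policy.act(obs, instruction)
                next_obs, reward, done, _ = env.step(action)  # gym-style env, assumed
                dataset.append((obs, instruction, action, reward))
                obs = next_obs
        # 2) Autonomous improvement: retrain on everything collected so far.
        policy.update(dataset)
    return policy
```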

Yilun Du (@du_yilun)'s Twitter Profile Photo

I'm recruiting PhD students this year with interest in machine learning, embodied AI, or AI for science! If you are interested in constructing fundamental tools to improve Generative AI and exploring how these tools can be used for intelligent embodied agents and science,

Xidong Feng (@xidong_feng)'s Twitter Profile Photo

Thrilled to share that I’ll be joining Google DeepMind as a Research Scientist with the Discovery Team! It’s a dream come true—8 years ago, I watched AlphaGo live in high school, and now I am lucky enough to be part of this incredible journey. Can’t wait to discover what’s next!

Haotian Fu (@haotiannnnnnnnn)'s Twitter Profile Photo

We will present EPO: Hierarchical LLM Agents with Environment Preference Optimization at #EMNLP2024 next week, which achieves SOTA performance on ALFRED by finetuning a hierarchical Llama2-based agent and modeling the environment's preference signals. Check out the thread below!
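
The tweet does not spell out EPO's training objective, so the snippet below is only a generic, DPO-style preference loss over preferred vs. dispreferred trajectories, shown as a hedged illustration of what "preference optimization" usually means; it is not claimed to be the actual EPO objective, and the tensors are dummy per-trajectory log-probabilities.

```python
# Hedged sketch: a generic DPO-style preference loss. The actual EPO objective
# and its environment preference signals may differ from this illustration.
import torch
import torch.nn.functional as F

def preference_loss(logp_chosen, logp_rejected,
                    ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Push the policy toward trajectories the preference signal favors,
    measured relative to a frozen reference model."""
    chosen_ratio = logp_chosen - ref_logp_chosen
    rejected_ratio = logp_rejected - ref_logp_rejected
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Dummy log-probabilities, one value per trajectory in the batch:
loss = preference_loss(torch.tensor([-5.0]), torch.tensor([-7.0]),
                       torch.tensor([-5.5]), torch.tensor([-6.5]))
print(float(loss))
```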

TuringPost (@theturingpost)'s Twitter Profile Photo

Natural Language Reinforcement Learning (NLRL) redefines Reinforcement Learning (RL).

The main idea:
In NLRL, the core parts of RL like goals, strategies, and evaluation methods are reimagined using natural language instead of rigid math.

What are the benefits?

- NLRL uses not
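
As a rough illustration of "evaluation in natural language," here is a minimal sketch of a language-based state evaluation; `query_llm` is a hypothetical placeholder for any LLM API call and is not part of the NLRL codebase.

```python
# Illustration only: evaluate a state with words rather than a scalar value.
def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM API call here")  # hypothetical stub

def language_value(state_description: str, goal: str) -> str:
    """Return a natural-language assessment of a state instead of a number."""
    prompt = (
        f"Goal: {goal}\n"
        f"Current state: {state_description}\n"
        "Assess how promising this state is for reaching the goal, and explain "
        "the main risks and opportunities in a few sentences."
    )
    return query_llm(prompt)
```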

Calvin Luo (@calvinyluo)'s Twitter Profile Photo

Frozen diffusion models pretrained on web images and videos can teach embodied agents to accomplish novel behaviors and goals in diverse physical environments! We present TADPoLe🐸, a way to “distill” large-scale pretraining into text-conditioned policy learning! #NeurIPS2024
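
One hedged sketch of how a frozen, text-conditioned diffusion model could be turned into a reward signal is below; the `add_noise` / `predict_noise` interface is hypothetical, and the actual TADPoLe formulation may differ in its details.

```python
# Hedged sketch: score a rendered observation by how well a frozen
# text-conditioned diffusion model denoises it under a text prompt.
# `diffusion_model.add_noise` / `predict_noise` are hypothetical APIs.
import torch

def diffusion_text_reward(frame, prompt, diffusion_model, timestep=200):
    noise = torch.randn_like(frame)
    noisy_frame = diffusion_model.add_noise(frame, noise, timestep)
    predicted = diffusion_model.predict_noise(noisy_frame, timestep, prompt)
    # Lower denoising error under the text condition -> higher reward.
    return -torch.mean((predicted - noise) ** 2).item()
```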

Zilai Zeng (@zilaizeng)'s Twitter Profile Photo

Want to know how web-scale pretrained diffusion models can provide text-conditioned reward signals for learning novel behaviors across different robotic configurations and environments? Check our poster at #NeurIPS2024 on Thursday, December 12th at 4:30 pm (Poster #6706)!

Xidong Feng (@xidong_feng)'s Twitter Profile Photo

NLRL is now open-sourced at github.com/waterhorse1/Na…! Check it out, and feel free to try NLRL on other LLM agent, reasoning, and coding tasks!

TuringPost (@theturingpost)'s Twitter Profile Photo

NLRL, or Natural Language Reinforcement Learning, is about adapting RL methods to work in the natural language field.

Traditional RL aims to learn a policy (strategy) guiding the agent to the best action in each state.

Instead of this, NLRL integrates a Chain-of-Thought
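
In that spirit, a minimal chain-of-thought "language policy" might look like the sketch below; again, `query_llm` is a hypothetical stand-in for an LLM call, not the NLRL implementation.

```python
# Illustration only: reason in text (chain of thought), then pick an action.
def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM API call here")  # hypothetical stub

def cot_language_policy(state_description: str, legal_actions: list[str]) -> str:
    prompt = (
        f"State: {state_description}\n"
        f"Legal actions: {', '.join(legal_actions)}\n"
        "Think step by step about the consequences of each action, "
        "then answer with the single best action on the last line."
    )
    response = query_llm(prompt)
    return response.strip().splitlines()[-1]  # final line holds the chosen action
```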

DeepSeek (@deepseek_ai)'s Twitter Profile Photo

🚀 DeepSeek-R1 is here!

⚡ Performance on par with OpenAI-o1
📖 Fully open-source model & technical report
🏆 MIT licensed: Distill & commercialize freely!

🌐 Website & API are live now! Try DeepThink at chat.deepseek.com today!

🐋 1/n

RAI Institute (@rai_inst)'s Twitter Profile Photo

In this demo, the Ultra Mobile Vehicle (UMV) drives, turns, jumps, does tricks, and comes to a sudden stop called a track-stand. All of the driving, landings, balancing, and track-stands are done using reinforcement learning.

Paul Zhou (@zhiyuan_zhou_)'s Twitter Profile Photo

Can we make robot policy evaluation easier and less time-consuming? Introducing AutoEval, a system that *autonomously* evaluates generalist policies 24/7 and closely matches human results. We make 4 tasks 💫publicly available💫. Submit your policy at auto-eval.github.io! 🧵👇

Zilai Zeng (@zilaizeng)'s Twitter Profile Photo

Internet-scale videos capture rich knowledge about motions and behaviors, but how can we effectively adapt them for domain-specific robotic tasks and further unlock behavior generalization? Come to our poster (#418) at #ICLR2025 and chat with Calvin Luo on April 25th at 3:30pm!

Omar Khattab (@lateinteraction)'s Twitter Profile Photo

Sigh, it's a bit of a mess. Let me just give you guys the full nuance in one stream of consciousness since I think we'll continue to get partial interpretations that confuse everyone. All the little things I post need to always be put together in one place. First, I have long