Generative AI & RL Community (@rlcommunity8)'s Twitter Profile
Generative AI & RL Community

@rlcommunity8

Community of Generative AI and Reinforcement Learning Researchers, Practitioners and Enthusiasts. Monthly Meetup and Newsletter.

ID: 1090954837708234753

Joined: 31-01-2019 12:47:57

1.1K Tweets

2.2K Followers

508 Following

Sergey Levine (@svlevine)

Can we get LLMs to "hedge" and express uncertainty rather than hallucinate? For this we first have to understand why hallucinations happen. In new work led by Katie Kang we propose a model of hallucination that leads to a few solutions, including conservative reward models 🧵👇

Jeff Dean (@jeffdean)

We're starting to roll out API support for Gemini 1.5 Pro for developers. We're excited to see what you build with the 1M token context window! We'll be onboarding people to the API slowly at first, and then we'll ramp it up. In the meantime, developers can try out Gemini 1.5

Nando de Freitas (@nandodf)

An important take on intelligent machines today, arguing that they already have understanding and subjective experience, by Geoffrey Hinton | youtu.be/iHCeAotHZa4?si… via YouTube

Christopher Manning (@chrmanning)

LLMs like ChatGPT are an amazingly powerful breakthrough in AI and a transformative general purpose technology, like electricity or the internet. LLMs will reshape work and our lives this decade. They are not just a blurry photocopier or an extruder of meaningless word sequences.

Fei-Fei Li (@drfeifei)

One year ago, we first introduced BEHAVIOR-1K, which we hope will be an important step towards human-centered robotics. After our year-long beta, we’re thrilled to announce its full release, which our team just presented at NVIDIA #GTC2024. 1/n

Ali (@itsalichaudhry)

Our acceleration towards AGI is much faster than many anticipate. And by AGI I mean God-level AI that can literally do anything, not just profit from stocks. The biggest and foremost risk of this is that we might not even realise when it's here because we don't know how it

Sergey Levine (@svlevine)

A fun chat with Craig S. Smith from back in December at NeurIPS: youtu.be/Tk1pX_IMYzQ?si… Thanks Craig S. Smith for the chat, lots of fun questions. I hope I didn't ramble too much 🙂

Matei Zaharia (@matei_zaharia)

At Databricks, we've built an awesome model training and tuning stack. We now used it to release DBRX, the best open source LLM on standard benchmarks to date, exceeding GPT-3.5 while running 2x faster than Llama-70B. databricks.com/blog/introduci…

Thomas Wolf (@thom_wolf)

[75min talk] i finally recorded this lecture I gave two weeks ago because people kept asking me for a video so here it is, enjoy "The Little guide to building Large Language Models in 2024" tried to keep it short and comprehensive – focusing on concepts that are crucial for

Andrew Ng (@andrewyng)

Last week, I described four design patterns for AI agentic workflows that I believe will drive significant progress this year: Reflection, Tool use, Planning and Multi-agent collaboration. Instead of having an LLM generate its final output directly, an agentic workflow prompts

Jared Friedman (@snowmaker)

(0/25) Here's a list of 25 YC companies that have trained their own AI models. Reading through these will give you a good sense of what the near future will look like.

Neel Nanda (@neelnanda5)

Sparse autoencoders are currently a big deal in mech interp, but there's not a good, concise intro to what they are. I'm currently taking a stab at writing one! Here's the draft TLDR:

Percy Liang (@percyliang)

As expected, lots of new models in the last few weeks. We're tracking them (along with datasets and applications) in the ecosystem graphs: crfm.stanford.edu/ecosystem-grap…

Asad Naveed (@dr_asadnaveed)

This week, I tried ResearchPal (researchpal.co) and here's my review on it: It's a simple tool that quickly automates a lot of your research needs. Here're some specific use cases:

Richard Sutton (@richardssutton)

Last week and this I graduated my 11th and 12th PhD students, Kenny Young and Abhishek Naik. Kenny will go work for a startup, maybe Astrus.ai or Equilibretechnologies.com. Abhishek’s next step is TBD, but he would like something in AI and space exploration.

Daniel Mason (@dgmason)

1/ 📣 Big news from Anon! We've raised $6.5M from USV and Abstract to be the Integration Platform for the AI internet. Also announcing: 🌐 Initial customer launches 🌐 Expanded list of 10+ integrations 🌐 Public developer docs 🌐 Our "Messenger API" product Read more 👇

Ali (@itsalichaudhry)

We still don’t know why LLMs work so well or how to internally control their outputs! But a recent landmark paper from Anthropic on ‘Mapping the Mind of a Large Language Model’ attempts to make the inner workings of LLMs more transparent and interpretable. Why is this such a

Aran Komatsuzaki (@arankomatsuzaki)

NVIDIA presents NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models Achieves #1 on the MTEB leaderboard arxiv.org/abs/2405.17428

Andrej Karpathy (@karpathy)

# Reproduce GPT-2 (124M) in llm.c in 90 minutes for $20 ✨ The GPT-2 (124M) is the smallest model in the GPT-2 series released by OpenAI in 2019, and is actually quite accessible today, even for the GPU poor. For example, with llm.c you can now reproduce this model on one 8X

AI at Meta (@aiatmeta)

📝 New from FAIR: An Introduction to Vision-Language Modeling. Vision-language models (VLMs) are an area of research that holds a lot of potential to change our interactions with technology, however there are many challenges in building these types of models. Together with a set