Generative AI & RL Community (@rlcommunity8) Twitter Tweets • TwiCopy

Sergey Levine

2 years ago

Can we get LLMs to "hedge" and express uncertainty rather than hallucinate? For this we first have to understand why hallucinations happen. In new work led by Katie Kang we propose a model of hallucination that leads to a few solutions, including conservative reward models 🧵👇

Likes: 214 · Replies: 2 · Retweets: 28

Jeff Dean

@jeffdean

2 years ago

We're starting to roll out API support for Gemini 1.5 Pro for developers. We're excited to see what you build with the 1M token context window! We'll be onboarding people to the API slowly at first, and then we'll ramp it up. In the meantime, developers can try out Gemini 1.5

Likes: 1,1K · Replies: 94 · Retweets: 384

Nando de Freitas

@nandodf

2 years ago

An important take on intelligent machines today, arguing that they already have understanding and subjective experience, by Geoffrey Hinton | youtu.be/iHCeAotHZa4?si… via YouTube

Likes: 53 · Replies: 5 · Retweets: 10

Christopher Manning

@chrmanning

2 years ago

LLMs like ChatGPT are an amazingly powerful breakthrough in AI and a transformative general purpose technology, like electricity or the internet. LLMs will reshape work and our lives this decade. They are not just a blurry photocopier or an extruder of meaningless word sequences.

Likes: 550 · Replies: 16 · Retweets: 88

Fei-Fei Li

@drfeifei

2 years ago

One year ago, we first introduced BEHAVIOR-1K, which we hope will be an important step towards human-centered robotics. After our year-long beta, we’re thrilled to announce its full release, which our team just presented at NVIDIA #GTC2024. 1/n

Likes: 696 · Replies: 7 · Retweets: 143

Ali

@itsalichaudhry

2 years ago

Our acceleration towards AGI is much faster than many anticipate. And by AGI I mean God-level AI that can literally do anything, not just profit from stocks. The biggest and foremost risk of this is that we might not even realise when it's here because we don't know how it

Likes: 24 · Replies: 1 · Retweets: 4

Sergey Levine

@svlevine

2 years ago

A fun chat with Craig S. Smith from back in December at NeurIPS: youtu.be/Tk1pX_IMYzQ?si… Thanks Craig S. Smith for the chat, lots of fun questions. I hope I didn't ramble too much🙂

Likes: 77 · Replies: 4 · Retweets: 13

Matei Zaharia

@matei_zaharia

2 years ago

At Databricks, we've built an awesome model training and tuning stack. We've now used it to release DBRX, the best open source LLM on standard benchmarks to date, exceeding GPT-3.5 while running 2x faster than Llama-70B. databricks.com/blog/introduci…

Likes: 646 · Replies: 13 · Retweets: 130

Thomas Wolf

@thom_wolf

2 years ago

[75min talk] I finally recorded this lecture I gave two weeks ago because people kept asking me for a video, so here it is, enjoy: "The Little guide to building Large Language Models in 2024". Tried to keep it short and comprehensive – focusing on concepts that are crucial for

Likes: 1,1K · Replies: 14 · Retweets: 241

Andrew Ng

@andrewyng

2 years ago

Last week, I described four design patterns for AI agentic workflows that I believe will drive significant progress this year: Reflection, Tool use, Planning and Multi-agent collaboration. Instead of having an LLM generate its final output directly, an agentic workflow prompts

Likes: 2,2K · Replies: 101 · Retweets: 576

Jared Friedman

@snowmaker

2 years ago

(0/25) Here's a list of 25 YC companies that have trained their own AI models. Reading through these will give you a good sense of what the near future will look like.

Likes: 3,3K · Replies: 49 · Retweets: 591

Neel Nanda

@neelnanda5

2 years ago

Sparse autoencoders are currently a big deal in mech interp, but there's not a good, concise intro to what they are. I'm currently taking a stab at writing one! Here's the draft TLDR:

Likes: 347 · Replies: 8 · Retweets: 30

Percy Liang

@percyliang

2 years ago

As expected, lots of new models in the last few weeks. We're tracking them (along with datasets and applications) in the ecosystem graphs: crfm.stanford.edu/ecosystem-grap…

Likes: 197 · Replies: 4 · Retweets: 48

Asad Naveed

@dr_asadnaveed

2 years ago

This week, I tried ResearchPal (researchpal.co) and here's my review of it: It's a simple tool that quickly automates a lot of your research needs. Here are some specific use cases:

Likes: 317 · Replies: 10 · Retweets: 103

Richard Sutton

@richardssutton

2 years ago

Last week and this I graduated my 11th and 12th PhD students, Kenny Young and Abhishek Naik. Kenny will go work for a startup, maybe Astrus.ai or Equilibretechnologies.com. Abhishek’s next step is TBD, but he would like something in AI and space exploration.

Likes: 125 · Replies: 3 · Retweets: 14

Daniel Mason

@dgmason

2 years ago

1/ 📣 Big news from Anon! We've raised $6.5M from USV and Abstract to be the Integration Platform for the AI internet. Also announcing:
🌐 Initial customer launches
🌐 Expanded list of 10+ integrations
🌐 Public developer docs
🌐 Our "Messenger API" product
Read more 👇

Likes: 510 · Replies: 66 · Retweets: 44

Ali

@itsalichaudhry

2 years ago

We still don’t know why LLMs work so well or how to internally control their outputs! But a recent landmark paper from Anthropic on ‘Mapping the Mind of a Large Language Model’ attempts to make the inner workings of LLMs more transparent and interpretable. Why is this such a

Likes: 35 · Replies: 4 · Retweets: 11

Aran Komatsuzaki

@arankomatsuzaki

2 years ago

NVIDIA presents NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models. Achieves #1 on the MTEB leaderboard. arxiv.org/abs/2405.17428

Likes: 462 · Replies: 8 · Retweets: 96

Andrej Karpathy

@karpathy

2 years ago

# Reproduce GPT-2 (124M) in llm.c in 90 minutes for $20 ✨

The GPT-2 (124M) is the smallest model in the GPT-2 series released by OpenAI in 2019, and is actually quite accessible today, even for the GPU poor. For example, with llm.c you can now reproduce this model on one 8X

Likes: 5,5K · Replies: 156 · Retweets: 664

AI at Meta

@aiatmeta

2 years ago

📝 New from FAIR: An Introduction to Vision-Language Modeling. Vision-language models (VLMs) are an area of research that holds a lot of potential to change our interactions with technology; however, there are many challenges in building these types of models. Together with a set

Likes: 2,2K · Replies: 37 · Retweets: 487