Simon Storf (@syghmon)'s Twitter Profile
Simon Storf

@syghmon

AI/ML; building things

ID: 1632460463891001345

Joined: 05-03-2023 19:18:13

56 Tweets

116 Followers

153 Following

Simon Storf (@syghmon)'s Twitter Profile Photo

Building a platform to manage my personal research paper library: paper-bank.com (best viewed on desktop). It features a simple workflow to add and manage papers from arXiv. Feedback appreciated! cc: buildspace nights & weekends

François Chollet (@fchollet)'s Twitter Profile Photo

The question of whether LLMs can reason is, in many ways, the wrong question. The more interesting question is whether they are limited to memorization / interpolative retrieval, or whether they can adapt to novelty beyond what they know. (They can't, at least until you start

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

# RLHF is just barely RL

Reinforcement Learning from Human Feedback (RLHF) is the third (and last) major stage of training an LLM, after pretraining and supervised finetuning (SFT). My rant on RLHF is that it is just barely RL, in a way that I think is not too widely

Simon Storf (@syghmon)'s Twitter Profile Photo

Life is just a long meditation session, and whenever we're not meditating, we're merely distracted -- until we return to the practice once more.

Simon Storf (@syghmon)'s Twitter Profile Photo

But even if intelligence has a ceiling, that limit could be so far beyond us it's practically irrelevant. An intelligent system could surpass us so greatly that any theoretical bound becomes insignificant, with the bottleneck appearing far too late to matter. Am I missing something?

Simon Storf (@syghmon)'s Twitter Profile Photo

Language is a way to communicate reasoning. Communicating reasoning is not the same as reasoning. There is lots of evidence that current LLMs' "reasoning" is constrained to their dataset. Many such cases, cope and seethe.

Omar Khattab (@lateinteraction)'s Twitter Profile Photo

🧵What's next in DSPy 2.5? And DSPy 3.0?

I'm excited to share an early sketch of the DSPy Roadmap, a document we'll expand and maintain as more DSPy releases ramp up.

The goal is to communicate our objectives, milestones, & efforts and to solicit input—and help!—from everyone.

Simon Storf (@syghmon)'s Twitter Profile Photo

I understand MPC offers efficiency, but how do we reach superhuman performance in, for example, a very complex environment without letting NNs learn through trial and error? The way I see it, MPC is only good for well-understood, simple environments. What am I missing here?

lmsys.org (@lmsysorg)'s Twitter Profile Photo

Does style matter over substance in Arena? Can models "game" human preference through lengthy and well-formatted responses?

Today, we're launching style control in our regression model for Chatbot Arena — our first step in separating the impact of style from substance in