david rein (@idavidrein)'s Twitter Profile
david rein

@idavidrein

Sentio ergo sum. AI alignment research at NYU, early employee @cohere

ID:1019763768631447552

19-07-2018 02:00:03

3.8K Tweets

2.2K Followers

985 Following

david rein (@idavidrein)

I hate Fitbit for sleep tracking because I’ll feel great about getting to bed early and waking up 8.5 hours later, and then it’ll tell me that I actually got 6 hours with a sleep score of 67

david rein (@idavidrein)

if anyone wants to flirt with me while I'm completely oblivious, feel free! I won't know either way, so there's no pressure

Tsarathustra (@tsarnick)

Eric Schmidt: the point at which AI agents can talk to each other in a language we can't understand, we should unplug the computers

Jacob Pfau (@jacob_pfau)

Do models need to reason in words to benefit from chain-of-thought tokens?

In our experiments, the answer is no! Models can perform on par with CoT using repeated '...' filler tokens.
This raises alignment concerns: Using filler, LMs can do hidden reasoning not visible in CoT🧵

david rein (@idavidrein)

If the future of software is that it'll be created on-the-fly to suit your needs by your AI, isn't that going to be devastating to the software industry?

Ofir Press 🖋 (@OfirPress)

Predictions:

>=2 orgs will get 35% on SWE-bench by Aug 1, 2024.

A fully open source system will reach 35% by Nov 1, 2024. Probably based on SWE-agent + ACI improvements: debugger, better code retrieval, lang. server protocol. The LM will be finetuned on ~500 good trajectories

Hannah Rose Kirk (@hannahrosekirk)

Today we're launching PRISM, a new resource to diversify the voices contributing to alignment. We asked 1500 people around the world for their stated preferences over LLM behaviours, then we observed their contextual preferences in 8000 convos with 21 LLMs arxiv.org/abs/2404.16019

david rein (@idavidrein)

what are your favorite confidences? my favorite is definitely 85%. there are sooo many things that I think are 85% likely

kamilė (@kamilelukosiute)

We need to be spending more money on evals to get better measurements of accuracy and confidence.

I wrote about how to compare model performance with the most basic statistical tools. Although these tools don't capture all the nuance, it's better than the status quo.

Josh You (@justjoshinyou13)

The best model you can run locally on your phone is now about as good as the best model period as of two years ago
