Philipp Schoenegger (@schoeneggerphil) 's Twitter Profile
Philipp Schoenegger

@schoeneggerphil

Post Doc at the London School of Economics (LLMs, Forecasting, Behavioural Science, etc)

Forecaster (Geopolitics, Economics, Technology, etc)

ID: 2451358519

Link: http://philipp-schoenegger.weebly.com · Joined: 18-04-2014 13:14:55

2.2K Tweets

2.2K Followers

1.1K Following

Lucius Caviola (@luciuscaviola) 's Twitter Profile Photo

New preprint and blog post! We explore the social disincentives of warning about unlikely risks. Many people are reluctant to warn about large but unlikely risks because they could look bad if the risk doesn’t occur. 1/5

Sean Trott (@sean_trott) 's Twitter Profile Photo

Philipp Schoenegger Spencer Greenberg 🔍 Alexander Grishin Lucius Caviola Thanks for making the code/data so clear! Replicated your main analysis here (github.com/seantrott/pers…), and I also tried implementing the "need to beat" framework I introduced here (direct.mit.edu/opmi/article/d…).

Koenfucius 🔍 (@koenfucius) 's Twitter Profile Photo

Cool research by @schoeneggerphil et al. compares human laypeople/experts with LLMs/specialized AI systems predicting personality correlations. LLMs outperform laypeople but not experts; specialized AI outperforms individual experts, but not groups: buff.ly/3RymA3s

Philipp Schoenegger (@schoeneggerphil) 's Twitter Profile Photo

This AI Forecasting Benchmark Series from @Metaculus looks exciting! It will start with a $30k contest on July 8 and then go from there, providing a broad comparison of AI prediction performance (over time) to that of human forecasters. metaculus.com/notebooks/2552…

Kobi Hackenburg (@kobihackenburg) 's Twitter Profile Photo

‼️New preprint: Scaling laws for political persuasion with LLMs‼️ In a large pre-registered experiment (n=25,982), we find evidence that scaling the size of language models yields sharply diminishing persuasive returns: arxiv.org/abs/2406.14508 1/n

Ben Tappin (@ben_tappin) 's Twitter Profile Photo

Check out our preprint 👇 Ft. a nice foreword by GPT 🤖 "This groundbreaking study delves into the persuasiveness of AI-generated political messages, uncovering a log scaling law that suggests limited gains from increasing model size. A crucial read for AI ethics and policy."
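For intuition on what a log scaling law implies, here is a minimal Python sketch with entirely made-up numbers (the actual estimates are in the preprint): if persuasive effect grows roughly linearly in log(parameters), each tenfold increase in model size buys only a constant increment.

```python
import numpy as np

# Hypothetical illustration of a log scaling law for persuasion.
# All numbers are invented for illustration; see the preprint for real estimates.
sizes = np.array([1e8, 1e9, 1e10, 1e11])   # model sizes in parameters (made up)
effects = np.array([2.1, 3.0, 3.8, 4.5])   # persuasion shift, percentage points (made up)

# Least-squares fit of effect ~ a + b * log10(size); np.polyfit returns [slope, intercept].
b, a = np.polyfit(np.log10(sizes), effects, 1)
print(f"effect ≈ {a:.2f} + {b:.2f} * log10(parameters)")

# Each 10x in parameters adds only ~b percentage points, so gains flatten quickly.
```

Under this made-up fit, going from 10B to 100B parameters adds the same absolute bump as going from 100M to 1B, which is what "sharply diminishing persuasive returns" means in practice.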

Philipp Schoenegger (@schoeneggerphil) 's Twitter Profile Photo

The large-scale collaborative AI Persuasion Project has a website now! Head over there for information about what we're up to, our team, and (eventually) our outputs as they come out! sites.google.com/view/ai-persua…

Ramit Debnath (@ramitdebnath) 's Twitter Profile Photo

Excited to be collaborating on a very interesting global #responsibleAI project with 50+ leading researchers across behavioural sciences, CS, AI, ML, and more... Led by Philipp Schoenegger (LSE) et al. sites.google.com/view/ai-persua… ai@cam · Responsible AI UK · Gates Cambridge

Michael Story ⚓ (@mwstory) 's Twitter Profile Photo

In September 2022 Swift Centre forecasters looked at Biden's re-election chances under 14 different conditions and estimated he was likely to lose in all of them. Some of the most potentially damaging issues identified were Hunter Biden's legal issues and Biden Snr's health

Sean Trott (@sean_trott) 's Twitter Profile Photo

New paper in TACL with Cameron Jones and Ben Bergen comparing humans and LLMs on a range of tasks designed to assess theory of mind. Check out Cameron Jones’s thread for more details!

Koenfucius 🔍 (@koenfucius) 's Twitter Profile Photo

Large Language Models can produce responses to debate questions posed in a UK political TV debate show that were judged as more authentic and relevant than the original responses from the people who were impersonated—research by Steffen Herbold et al.: buff.ly/3LBlKzk

Forecasting Research Institute (@research_fri) 's Twitter Profile Photo

Today, we're releasing a report on our study using "conditional trees" to generate high-value forecasting questions about AI risk. Our method produced more informative questions than those on existing forecasting platforms. Here are the key findings: 🧵

Philip E. Tetlock (@ptetlock) 's Twitter Profile Photo

Can we figure out how to incentivize not just accuracy of forecasts but also quality of questions? Here's a new FRI report that describes a conditional tree method of generating high-quality questions that beats business-as-usual on a major forecasting platform. I hope this will…

Danny Halawi (@dannyhalawi15) 's Twitter Profile Photo

The results in "LLMs Are Superhuman Forecasters" don't hold when given another set of forecasting questions. I used their codebase (models, prompts, retrieval, etc.) to evaluate a new set of 324 questions—all opened after November 2023. Findings: Their Brier score: .195. Crowd…
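For readers unfamiliar with the metric: the Brier score is the mean squared error between probabilistic forecasts and binary outcomes (lower is better; always forecasting 50% scores 0.25). A minimal Python sketch with hypothetical numbers:

```python
import numpy as np

def brier_score(probs, outcomes):
    """Mean squared error between probabilistic forecasts and 0/1 resolutions."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    return float(np.mean((probs - outcomes) ** 2))

# Example: four binary questions (all numbers hypothetical).
model_probs = [0.70, 0.20, 0.90, 0.40]   # forecast probabilities
resolutions = [1, 0, 1, 1]               # how the questions resolved
print(brier_score(model_probs, resolutions))  # ≈ 0.125
```

Differences of a few hundredths are meaningful on this scale, which is why forecaster comparisons like the one above are reported to three decimal places.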