Philipp Schoenegger (@SchoeneggerPhil) Twitter Tweets • TwiCopy

Philipp Schoenegger

@SchoeneggerPhil

Decision Scientist at the London School of Economics and Political Science, studying Large Language Models and Forecasting; PhD from St Andrews '22

ID:2451358519

Link: http://philipp-schoenegger.weebly.com
Joined: 18-04-2014 13:14:55

2.8K Tweets

2.1K Followers

1.1K Following

Musa al-Gharbi

2 hours ago

MTurk is basically junk responses. People often lie about their background characteristics. And they often choose the same answer for most questions, regardless of content, such that you can ask people opposing questions and get completely incoherent results (even after screening

Co-CREATE

11 hours ago

✨NEW PROJECT ✨

We are delighted to announce the launch of Co-CREATE - an EU funded project which will examine the conditions for responsible research on Solar Radiation Modification.

Find out more on our brand-new website: co-create-project.eu

Ruben C. Arslan

1 day ago

Farid just posted an update on our preprint about construct and measure proliferation. We changed the discussion a bit to reflect our current thinking. And I updated the treemap plots to better capture the fragmentation in measurement in psychology.

Colonel Tasty

1 day ago

GPT 4 Turbo TikZ unicorn vs gpt2-chatbot TikZ unicorn

Warren Hatch

3 days ago

@beardedmiguel @PTetlock The Good Judgment Open crowd has done better than the futures, but the Superforecasters (on their closed client platform) have done even better with less volatility. Here are their forecasts for the next 3 meetings compared to the futures (the GJO question is cumulative):

hk

1 day ago

This is work from my doctoral advisor’s group at CMU! The lead author, Anthony Cheng, is a researcher to keep an eye on

Social Science Prediction Platform

3 days ago

New on the SSPP: Do financial incentives which had a positive effect on COVID-19 health outcomes generalize to other types of health behavior? Ray Duch invites your predictions! socialscienceprediction.org/predict/r/6e65…
📚Field: Econ
⏱️Duration: 10 min
📅Closes: May 17

François Fleuret

@francoisfleuret

3 days ago

Impressive Fig 2. @pranabkumar_ arxiv.org/abs/2402.07927

Rafa Bastos

3 days ago

Finished writing and editing my first book! It will become an online resource freely available to all. I teach a lot of psychometrics I learned so far, and teach how to run the analysis in R. Stay tuned, I'll probably post it next week. PS: Very proud of this cover I made.

Zico Kolter

5 days ago

There's been a lot of discussion on LLMs 'memorizing' training data, but we argue for more nuance in the definition of 'memorize'. This work advocates for adversarial prompts (and whether they can be shorter than the output) as a metric for assessing memorization.

Robert de Neufville

6 days ago

Forecasters at Swift Centre are much less optimistic than most projections of global coal consumption (I didn't participate in this forecast)

Séb Krier

1 week ago

🔮 New Google DeepMind paper exploring persuasion and manipulation in the context of language models. 👀

Existing safeguard approaches often focus on harmful outcomes of persuasion. This research argues for a deeper examination of the process of AI persuasion itself to

Philipp Schoenegger

@SchoeneggerPhil

1 week ago

Interesting preprint by David Rozado, showing that base models do not tend to have political skew, but that most conversational models skew left (and that this is straightforwardly steerable as seen with some fine-tuned models).

arxiv.org/pdf/2402.01789….

Ashutosh Mehra

1 week ago

Replying to Ilias Miraoui: That's the flip flop effect documented in this paper arxiv.org/abs/2311.08596.

It shows that models flip their answers 46% of the time on average when asked 'Are you sure?'

Mike A. Merrill

@Mike_A_Merrill

1 week ago

The question below is pretty easy for humans. Why can't GPT-4 get it right? In our new preprint we introduce 'time series reasoning' and show that modern language models are surprisingly bad at interpreting these critical data. arxiv.org/abs/2404.11757

Alexander Doria

1 week ago

As Llama 3 is working fine in French with a >95% English dataset, taking the opportunity to signal this great paper by Anton Schäfer et al.: counter-intuitively language imbalance in pre-training helps with cross-linguistic generation. arxiv.org/abs/2404.07982

Erik Løhre

1 week ago

We did a close replication but found instead that both experts and non-experts were more persuasive when they expressed certainty rather than uncertainty. This supports a confidence heuristic rather than the original incongruity hypothesis - people just seem to like certainty...

Gordon Hodson

@GordonHodsonPhD

1 week ago

Our longitudinal paper, now out, fails to find within-person change in attitudes following contact.

psycnet.apa.org/doiLanding?doi…

Philipp Schoenegger

@SchoeneggerPhil

1 week ago

Really cool preprint by Sean Trott on the wisdom of crowds and LLMs, introducing the framework of 'Number Needed To Beat' (NNTB), which captures the amount of human responses needed to achieve GPT-4 quality (studied here in a psycholinguistic context)!

osf.io/preprints/psya…

Philipp Schoenegger

@SchoeneggerPhil

1 week ago

Work is well underway in our 49-person strong LLM Persuasion team! It's been really great getting to work with so many incredibly talented people from all around!

(though I always feel slightly bad pinging the whole team across so many time zones for major updates)
