Peter Hase (@peterbhase) Twitter Tweets • TwiCopy

Peter Hase

@peterbhase

AI research @AnthropicAI. Interested in AI safety. PhD from UNC Chapel Hill (Google PhD Fellow). Previously: AI2, Google, Meta

ID: 1119252439050354688

Link: https://peterbhase.github.io/

Joined: 19-04-2019 14:52:30

399 Tweets

2.2K Followers

813 Following

🚨 We have postdoc openings at UNC 🙂 Exciting+diverse NLP/CV/ML topics**, freedom to create research agenda, competitive funding, very strong students, many collabs w/ other faculty & universities+companies, superb quality of life/weather. Please apply + help spread the word

Mohit Bansal

@mohitban47

3 months ago

🚨 Kudos to Peter on leading this long 1-year AI+Philosophy effort on fundamental problems with model editing in LLMs (inspired by work on belief revision in philosophy)! 👏👏 ➡️ 12 core challenges across 3 areas: (1) defining the model editing problem (w.r.t. background

Anisha Gunjal

@anisha_gunjal

3 months ago

Challenges of LLM fact-checking with atomic facts: 👉 Too little info? Atomic facts lack context 👉 Too much info? Risk losing error localization benefits We study this and present 🧬molecular facts🧬, which balance two key criteria: decontextuality and minimality. w/Greg Durrett

Serena Booth

@serenalbooth

2 months ago

I'll be at ICML next week! I'm looking to hire a postdoc and PhD students in human-centered RL and AI governance starting Fall 2025, so get in touch if you're interested in working with me at Brown! I'm also happy to advise on applying to the AAAS fellowship or AI policy roles.

Dima Krasheninnikov

@dmkrash

2 months ago

1/ Excited to finally tweet about our paper “Implicit meta-learning may lead LLMs to trust more reliable sources”, to appear at ICML 2024. Our results suggest that during training, LLMs better internalize text that appears useful for predicting other text (e.g. seems reliable).

Peter Hase

@peterbhase

2 months ago

With System 1.x reasoning, we show how to steer a model toward more or less verbalized reasoning – the model learns to balance explicit/implicit reasoning. Using this method, we can encourage an LLM to verbalize reasoning/planning that would otherwise remain hidden.

Adam Gleave

@argleave

2 months ago

Anthropic comes out against current SB1047, proposes refocusing it on liability post-catastrophe (tort law++) and safety transparency (releasing RSP-style plan). Would cut: pre-catastrophe enforcement; new regulatory division. Analysis & 🔗 in 🧵👇

Peter Hase

@peterbhase

2 months ago

Life update: I am starting a residency at Anthropic! I will be working on research in AI safety. I have also relocated to SF! You will now find me there.

Likes: 894 · Replies: 38 · Retweets: 22

Dan Hendrycks

@danhendrycks

2 months ago

New letter from Geoffrey Hinton, Yoshua Bengio, Lawrence @Lessig, and Stuart Russell urging Gov. Newsom to sign SB 1047. “We believe SB 1047 is an important and reasonable first step towards ensuring that frontier AI systems are developed responsibly, so that we can all better

Likes: 191 · Replies: 12 · Retweets: 34

Mohit Bansal

@mohitban47

a month ago

🚨 Check out an exciting batch of papers this week at #ACL2024! Say hi to some of our awesome students & collaborators who are attending in person, and feel free to ask about our postdoc openings too 🙂 Topics: -- multi-agent reasoning collaboration -- structured

Asma Ghandeharioun

@ghandeharioun

a month ago

🧵Responses to adversarial queries can still remain latent in a safety-tuned model. Why are they revealed sometimes, but not others? And what are the mechanics of this latent misalignment? Does it matter *who* the user is? (1/n)

Christopher Potts

@chrisgpotts

a month ago

The Linear Representation Hypothesis is now widely adopted despite its highly restrictive nature. Here, Csordás Róbert, Atticus Geiger, Christopher Manning & I present a counterexample to the LRH and argue for more expressive theories of interpretability: arxiv.org/abs/2408.10920

Likes: 282 · Replies: 10 · Retweets: 65

Senator Scott Wiener

@scott_wiener

a month ago

Anthropic sent a letter to the Governor sharing their analysis of SB 1047. Here are the main takeaways:

1️⃣ On balance, the bill is good.
2️⃣ Its compliance burdens for companies are reasonable.
3️⃣ Catastrophic risks from AI are real.
4️⃣ Federal action is uncertain at best.

Likes: 143 · Replies: 35 · Retweets: 16

Peter Hase

@peterbhase

a month ago

Really enjoyed going on this podcast with Daniel Filan! Tune in for 2 hrs of research discussion on AI safety and NLP

Likes: 36 · Replies: 0 · Retweets: 3

Usman Anwar

@usmananwar391

14 days ago

Our agenda paper on alignment and safety of LLMs just got published at TMLR: openreview.net/forum?id=oVTkO… 🥳 The revised version is also now on arxiv arxiv.org/abs/2404.09932.