Jam Kraprayoon (@jkraprayoon)'s Twitter Profile

Researcher @iapsAI Fellow @scientistsorg Oxford/LSE. AI governance and policy. Fmr international civil servant. Also poet.

ID: 1014864640839307265

Joined: 05-07-2018 13:32:40

332 Tweets

223 Followers

1.1K Following

Palisade Research (@palisadeai)

⛳️ Our new LLM Agent achieved 95% success on InterCode-CTF, a high-school level hacking benchmark, using simple prompting techniques. 🚀 This surpasses prior work by a large margin:

OAISIS (@oaisis_official)

After 11 months of work, we proudly announce Third Opinion: A free of charge expert consultation service for frontier AI professionals. To help you clarify if what you're seeing is cause for concern. Anonymous, without sharing confidential information 🧵 tinyurl.com/2rbk2w59

Tamay Besiroglu (@tamaybes)

1/11 I’m genuinely impressed by OpenAI’s 25.2% Pass@1 performance on FrontierMath—this marks a major leap from prior results and arrives about a year ahead of my median expectations.

Elliot Glazer (@elliotglazer)

1/12 FrontierMath’s three-part rating—Background (1–5), Creativity (hours of insight), and Execution (solution time)—lets us precisely gauge problem difficulty. These ratings help provide context on o3’s benchmark results.

David Lawrence (@dc_lawrence)

3. Building AI assurance in the UK: Jam Kraprayoon and Bill Anderson-Samways explain how the UK can lead in AI safety and AI opportunities at the same time by becoming a global leader in AI testing and assurance. ukdayone.org/briefings/assu…

Alejandro Cuadron (@alex_cuadron)

Surprising find: OpenAI's O1 - reasoning-high only hit 30% on SWE-Bench Verified - far below their 48.9% claim. Even more interesting: Claude achieves 53% in the same framework. Something's off with O1's "enhanced reasoning"... 🧵1/8

Dr. Chinasa T. Okolo (@chinasatokolo)

For our next piece within the “AI Safety and the Global Majority” series, Shaun K.E. Ee and Jam Kraprayoon discuss the growing ecosystem of AI safety in Southeast Asia and opportunities to strengthen AI governance throughout the region. Brookings Governance brookings.edu/articles/ai-sa…

Institute for AI Policy and Strategy (IAPS) (@iapsai)

AI “agents”—systems that can autonomously pursue goals—are advancing fast. If current trends continue, we could soon see millions of agents deployed across society. Are we ready? Here’s what you need to know, from a report by Jam Kraprayoon, Zoe Williams, and Rida Fayyaz. 👇

Joe O'Brien (@__j0e___)

I explore this question with Jeremy Dolan, Jay Kim, Jonah, Jeba Sania, Sebastian Becker, Jam Kraprayoon, and R. Cara Labrador, in our new report, available here: iaps.ai/research/ai-re…

Joe O'Brien (@__j0e___)

Surprise finding: Every single multi-agent research area ranked in the top 30. Experts see multi-agent systems as a critical, underexplored frontier for AI risks.

Peter Wildeford 🇺🇸🚀 (@peterwildeford)

The US must invest in AI assurance + security tech to stay competitive. Institute for AI Policy and Strategy (IAPS)'s memo with FAS (@scientistsorg) outlines 3 critical gaps (emergent behaviors, infra security, autonomous agents) + 3 solutions (coordinated R&D strategy, public-private consortium, frontier fellowships).