Ollie Stephenson (@technolliegist)'s Twitter Profile
Ollie Stephenson

@technolliegist

Associate Director of AI and Emerging Technology Policy, @scientistsorg. Views are my own.

ID: 750327515835731968

Joined: 05-07-2016 13:56:35

240 Tweets

1.1K Followers

1.1K Following

Yoshua Bengio (@yoshua_bengio)

This metric from METR shows the length of tasks AI agents can complete has been consistently exponentially increasing over the past 6 years, with a doubling time of around 7 months. Increased agency in AI systems can lead to numerous risks, and truly effective guardrails

METR (@metr_evals)

METR tested pre-release versions of o3 + o4-mini on tasks involving autonomy and AI R&D. For each model, we examined how capable it is on our tasks & how often it tries to “hack” them. We detail our findings in a new report, a summary of which is included in OpenAI's system card.

Ollie Stephenson (@technolliegist)

🚨 Hiring: AI & Emerging Tech Manager Federation of American Scientists🔬 🚨 Shape U.S. #AI policy—drive AI equity work, build S&T talent pipelines, tackle AI-safety & energy projects. 💼 $70k–$87.5k | Hybrid DC (2-3 days in office). Apply soon, ideally by May 5! → fas.org/career/ai-and-…

Ollie Stephenson (@technolliegist)

“When experts get together to make a…recommendation, it’s hard to ignore them; when they divide themselves into duelling groups, it becomes easier for decision-makers to dismiss both sides and do nothing. Currently, nothing appears to be the plan.” newyorker.com/culture/open-q…

Joe O'Brien (@__j0e___)

Commerce Secretary Lutnick recently announced the Center for AI Standards and Innovation (CAISI), reforming US AISI. Despite this, the U.S. govt still lacks capacity to evaluate & defend against risks from advanced AI. How do we build the infrastructure needed to meet this challenge?