mrinank 🍂 (@mrinanksharma) 's Twitter Profile
mrinank 🍂

@mrinanksharma

safeguards research lead @ anthropic ⭐️ poet, flautist, ecstatic dance DJ ⭐️ everything has to do with loving and not loving, rumi

ID: 1147066404824322049

http://www.mrinanksharma.net · Joined: 05-07-2019 08:55:17

1.1K Tweets

1.1K Followers

551 Following

Steven Adler (@sjgadler) 's Twitter Profile Photo

Anthropic announced they've activated "AI Safety Level 3 Protections" for their latest model. What does this mean, and why does it matter?

Let me share my perspective as OpenAI's former lead for dangerous capabilities testing. (Thread)

anita (@neats29) 's Twitter Profile Photo

I’ve noticed a particular dynamic in recent months where when there is an unsaid “truth” in the field between me and someone else or with myself, I feel this subtle tension, like things are not as they seem and I know it but idk what it is. and then as soon as the truth is…

Archana Burra (@archanaburra) 's Twitter Profile Photo

I’m hosting the Bay Area burbea sangha tomorrow at the alembic at 11:30! Please come by if you’d like to meditate with me

hoagy (@hoagycunningham) 's Twitter Profile Photo

New Anthropic blog: We benchmark approaches to making classifiers more cost-effective by reusing activations from the model being queried. We find that using linear probes or retraining just a single layer of the model can push the cost-effectiveness frontier. 🧵1/

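For intuition, here is a minimal, hypothetical sketch of the linear-probe idea from the tweet above: fit a lightweight linear classifier on activations the queried model already computes, so safety scoring adds almost no inference cost. The synthetic data, shapes, and names below are illustrative assumptions, not Anthropic's implementation.

```python
# Toy sketch of "linear probe as cheap classifier": reuse activations from
# the model being queried instead of running a separate safeguard model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for residual-stream activations at one layer. In practice these
# would come from a forward pass the model performs anyway (e.g. the hidden
# state of the final token), so the probe's marginal cost is tiny.
d_model, n_train = 512, 2000
X_train = rng.normal(size=(n_train, d_model))
y_train = rng.integers(0, 2, size=n_train)  # 1 = flagged, 0 = benign (synthetic labels)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# At inference time, scoring costs roughly one dot product per example.
x_new = rng.normal(size=(1, d_model))
print("p(flagged) =", probe.predict_proba(x_new)[0, 1])
```

The tweet's other option, retraining a single layer of the model, is the same idea with a larger trainable head; the benchmark question is where each sits on the cost-effectiveness frontier relative to a dedicated safeguard model that needs its own forward pass.
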
Anthropic (@anthropicai) 's Twitter Profile Photo

New Anthropic research: filtering out dangerous information at pretraining.

We’re experimenting with ways to remove information about chemical, biological, radiological and nuclear (CBRN) weapons from our models’ training data without affecting performance on harmless tasks.
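
The tweet doesn't spell out the method, but mechanically, pretraining-data filtering means scoring candidate documents and excluding high-risk ones before training. Below is a deliberately toy, hypothetical sketch of that pipeline shape; the keyword scorer, term list, and threshold are placeholders for whatever trained classifier the actual research uses, and the crucial evaluation step (checking that harmless-task performance is unaffected) is not shown.

```python
# Toy sketch of a pretraining-corpus filter: score each document for
# flagged content and drop documents above a risk threshold.
from typing import Iterable, Iterator

# Hypothetical placeholder terms; a real filter would use a trained
# classifier, not a keyword list.
FLAGGED_TERMS = ("term_a", "term_b")

def risk_score(doc: str) -> float:
    """Fraction of flagged terms appearing in the document (toy scorer)."""
    text = doc.lower()
    hits = sum(term in text for term in FLAGGED_TERMS)
    return hits / len(FLAGGED_TERMS)

def filter_corpus(docs: Iterable[str], threshold: float = 0.5) -> Iterator[str]:
    """Yield only documents scoring below the risk threshold."""
    for doc in docs:
        if risk_score(doc) < threshold:
            yield doc

corpus = ["a harmless recipe for bread", "a document mentioning term_a and term_b"]
print(list(filter_corpus(corpus)))  # keeps only the harmless document
```
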
Joe Hudson (@fu_joehudson) 's Twitter Profile Photo

Who you surround yourself with is incredibly important. My litmus test: Do I feel fortunate to be around them and do they feel fortunate to be around me? If that is not the case then an adjustment needs to be made. But the most important part is this: This seldom happens in…

Anthropic (@anthropicai) 's Twitter Profile Photo

Our collaboration with the US Center for AI Standards and Innovation (CAISI) and UK AI Security Institute (AISI) shows the importance of public-private partnerships in developing secure AI models.

Rosa Lewis (@rosaclewis) 's Twitter Profile Photo

New ebook Unlocking the Depths of Being.

Guiding you through seven parts of experience that, once connected to, unlock the subtler and more expansive layers of reality.

Fazl Barez (@fazlbarez) 's Twitter Profile Photo

🚨New AI Safety Course Autonomous Intelligent Machines & Systems @Oxford!

I’m thrilled to launch a new course called AI Safety & Alignment (AISAA) on the foundations & frontier research of making advanced AI systems safe and aligned at University of Oxford.
What to expect 👇
robots.ox.ac.uk/~fazl/aisaa/