mrinank 🍂 (@mrinanksharma) 's Twitter Profile
mrinank 🍂

@mrinanksharma

safeguards research lead @ anthropic ⭐️ poet, flautist, ecstatic dance DJ ⭐️ everything has to do with loving and not loving, rumi

ID: 1147066404824322049

http://www.mrinanksharma.net · Joined: 05-07-2019 08:55:17

1.1K Tweets

1.1K Followers

551 Following

Steven Adler (@sjgadler) 's Twitter Profile Photo

Anthropic announced they've activated "AI Safety Level 3 Protections" for their latest model. What does this mean, and why does it matter?

Let me share my perspective as OpenAI's former lead for dangerous capabilities testing. (Thread)

anita (@neats29) 's Twitter Profile Photo

I’ve noticed a particular dynamic in recent months where when there is an unsaid “truth” in the field between me and someone else or with myself, I feel this subtle tension, like things are not as they seem and I know it but idk what it is. and then as soon as the truth is…

Archana Burra (@archanaburra) 's Twitter Profile Photo

I’m hosting the Bay Area burbea sangha tomorrow at the alembic at 11:30! Please come by if you’d like to meditate with me

hoagy (@hoagycunningham) 's Twitter Profile Photo

New Anthropic blog: We benchmark approaches to making classifiers more cost-effective by reusing activations from the model being queried. We find that using linear probes or retraining just a single layer of the model can push the cost-effectiveness frontier. 🧵1/

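For intuition, here is a minimal, hypothetical sketch of the linear-probe idea from the tweet above: fit a lightweight linear classifier on activations the queried model already computes, so safety scoring adds almost no inference cost. The synthetic data, shapes, and names below are illustrative assumptions, not Anthropic's implementation.

```python
# Toy sketch of "linear probe as cheap classifier": reuse activations from
# the model being queried instead of running a separate safeguard model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for residual-stream activations at one layer. In practice these
# would come from a forward pass the model performs anyway (e.g. the hidden
# state of the final token), so the probe's marginal cost is tiny.
d_model, n_train = 512, 2000
X_train = rng.normal(size=(n_train, d_model))
y_train = rng.integers(0, 2, size=n_train)  # 1 = flagged, 0 = benign (synthetic labels)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# At inference time, scoring costs roughly one dot product per example.
x_new = rng.normal(size=(1, d_model))
print("p(flagged) =", probe.predict_proba(x_new)[0, 1])
```

The tweet's other option, retraining a single layer of the model, is the same idea with a larger trainable head; the benchmark question is where each sits on the cost-effectiveness frontier relative to a dedicated safeguard model that needs its own forward pass.
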
Anthropic (@anthropicai) 's Twitter Profile Photo

New Anthropic research: filtering out dangerous information at pretraining.

We’re experimenting with ways to remove information about chemical, biological, radiological and nuclear (CBRN) weapons from our models’ training data without affecting performance on harmless tasks.
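
The tweet doesn't spell out the method, but mechanically, pretraining-data filtering means scoring candidate documents and excluding high-risk ones before training. Below is a deliberately toy, hypothetical sketch of that pipeline shape; the keyword scorer, term list, and threshold are placeholders for whatever trained classifier the actual research uses, and the crucial evaluation step (checking that harmless-task performance is unaffected) is not shown.

```python
# Toy sketch of a pretraining-corpus filter: score each document for
# flagged content and drop documents above a risk threshold.
from typing import Iterable, Iterator

# Hypothetical placeholder terms; a real filter would use a trained
# classifier, not a keyword list.
FLAGGED_TERMS = ("term_a", "term_b")

def risk_score(doc: str) -> float:
    """Fraction of flagged terms appearing in the document (toy scorer)."""
    text = doc.lower()
    hits = sum(term in text for term in FLAGGED_TERMS)
    return hits / len(FLAGGED_TERMS)

def filter_corpus(docs: Iterable[str], threshold: float = 0.5) -> Iterator[str]:
    """Yield only documents scoring below the risk threshold."""
    for doc in docs:
        if risk_score(doc) < threshold:
            yield doc

corpus = ["a harmless recipe for bread", "a document mentioning term_a and term_b"]
print(list(filter_corpus(corpus)))  # keeps only the harmless document
```
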
Joe Hudson (@fu_joehudson) 's Twitter Profile Photo

Who you surround yourself with is incredibly important. My litmus test: Do I feel fortunate to be around them and do they feel fortunate to be around me? If that is not the case then an adjustment needs to be made. But the most important part is this: This seldom happens in…

Anthropic (@anthropicai) 's Twitter Profile Photo

Our collaboration with the US Center for AI Standards and Innovation (CAISI) and UK AI Security Institute (AISI) shows the importance of public-private partnerships in developing secure AI models.

Rosa Lewis (@rosaclewis) 's Twitter Profile Photo

New ebook Unlocking the Depths of Being.

Guiding you through seven parts of experience that, once connected to, unlock the subtler and more expansive layers of reality.

Fazl Barez (@fazlbarez) 's Twitter Profile Photo

🚨New AI Safety Course Autonomous Intelligent Machines & Systems @Oxford!

I’m thrilled to launch a new course called AI Safety & Alignment (AISAA) on the foundations & frontier research of making advanced AI systems safe and aligned at University of Oxford.
What to expect 👇
robots.ox.ac.uk/~fazl/aisaa/