Rakshit Trivedi (@rstriv) 's Twitter Profile
Rakshit Trivedi

@rstriv

Postdoctoral Associate at MIT

ID: 49674797

Joined: 22-06-2009 16:13:06

24 Tweets

47 Followers

166 Following

South Park Commons (@southpkcommons) 's Twitter Profile Photo

@jc_y42 wowed with Simular, an AI agent on your personal device that perceives, reasons, and takes action, working seamlessly across applications at the operating system level to enhance your productivity. Waitlist at simular.ai youtu.be/mmU4vIldbbs

Natasha Jaques (@natashajaques) 's Twitter Profile Photo

Our recent PNAS paper shows that widely used interpretability methods, when used to ask simple counterfactual questions about models like “if I pay down this credit card will my credit score increase?”, are provably no better than random guessing. This is really problematic bc...

Dylan HadfieldMenell (@dhadfieldmenell) 's Twitter Profile Photo

New piece in Tech Policy, with Sina Fazelpour and Luca (also on the other platforms): we get into the details of red-teaming in the context of AI systems. We make the case that the inherent subjectivity of assessing AI means that the details around the red team matter. techpolicy.press/red-teaming-ai…

Joel Z Leibo (@jzl86) 's Twitter Profile Photo

Happy to announce our new paper! We applied framework-based qualitative analysis to study the fidelity of language model free responses to human lived experience. doi.org/10.1371/journa…

Matthias Gerstgrasser (@mgerstgrasser) 's Twitter Profile Photo

But wait, there might be hope... 🌟 Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data w/ Rylan Schaeffer Apratim Dey Rafael Rafailov Sanmi Koyejo Dan Roberts Andrey Gromov Diyi Yang David Donoho arxiv.org/abs/2404.01413 2/N

Matthias Gerstgrasser (@mgerstgrasser) 's Twitter Profile Photo

💬Have you wanted to build an LLM chat app but don’t know where to start? I’ve done all the tedious bits for you! 🔧Check out github.com/mgerstgrasser/… 1/4

Dylan HadfieldMenell (@dhadfieldmenell) 's Twitter Profile Photo

New paper out from Algorithmic Alignment Group! The key takeaway: training against an adversary that perturbs intermediate latent activations *with a well-defined target* is quite effective at robustly removing the behavior.

Cooperative AI Foundation (@coop_ai) 's Twitter Profile Photo

We're excited to be partnering again with Apart Research for a hackathon next weekend in the run-up to the Concordia Contest at NeurIPS Conference! The challenge: advancing the cooperative intelligence of language model agents. Sign up here: apartresearch.com/event/the-conc….

Joel Z Leibo (@jzl86) 's Twitter Profile Photo

The Concordia hackathon starts tomorrow and runs through the weekend (apartresearch.com/event/the-conc…). Immediately after the hackathon, we will kick off the main NeurIPS contest next week! It will run through the fall.

Cooperative AI Foundation (@coop_ai) 's Twitter Profile Photo

In collaboration with colleagues from Google DeepMind, Massachusetts Institute of Technology (MIT), UC Berkeley, and UCL, we are excited to announce that the NeurIPS 2024 Concordia Contest is now open! Deadline: October 31st. Prizes: $10,000 + more. Further details: cooperativeai.com/contests/conco…. youtube.com/watch?v=Xtb1WZ…

Dylan HadfieldMenell (@dhadfieldmenell) 's Twitter Profile Photo

New conference on safe and ethical AI! Our goal is to convene a broad group of experts from academia, civil society, industry, media, and governments to discuss the latest developments in AI safety and ethics. Please apply here by Nov 24: iaseai.org/conference/app…

Cooperative AI Foundation (@coop_ai) 's Twitter Profile Photo

We’re proud to announce our first cohort of PhD fellows for 2025. We’re delighted to welcome these exceptional early career researchers to our community, and we look forward to supporting their contributions to cooperative AI. Fellows' profiles via the link in next post.

Gillian Hadfield (@ghadfield) 's Twitter Profile Photo

If you believe, as I do, that human intelligence is fundamentally cooperative—the capacity to integrate into and navigate human groups, to relate to other humans in complex ways, this seems very right: “high intelligence and creativity is more than just being a genius at maths”

Jakob Foerster (@j_foerst) 's Twitter Profile Photo

RL has always been the future and the future is now. Having an open-source version released _before_ major closed-source labs managed to rediscover this internally (as far as I know) is amazing.

Gillian Hadfield (@ghadfield) 's Twitter Profile Photo

Video from our tutorial NeurIPS Conference 2024 is up! Dylan HadfieldMenell Joel Z Leibo Rakshit Trivedi and I explore how frameworks from economics, institutional and political theory, and biological and cultural evolution can advance approaches to AI alignment neurips.cc/virtual/2024/t…

Atoosa Kasirzadeh (@dr_atoosa) 's Twitter Profile Photo

In this review paper, we advocate for the normalization of AI safety as an inherent component of AI development and deployment. AI safety should be a standard practice integrated into every stage of AI creation and deployment. Developing and deploying safe AI should be a

Cooperative AI Foundation (@coop_ai) 's Twitter Profile Photo

The development and widespread deployment of advanced AI agents will give rise to multi-agent systems of unprecedented complexity. A new report from staff at CAIF and a host of leading researchers explores the novel and under-appreciated risks these systems pose. Details below.

Daphne Cornelisse (@daphne_cor) 's Twitter Profile Photo

Sim agents are key for developing autonomous systems in safety-critical domains, like self-driving cars. We're open-sourcing sim agents that achieve a 99.8% success rate with < 0.8% failures on the Waymo Dataset. These agents are built through scaling self-play.

Cas (Stephen Casper) (@stephenlcasper) 's Twitter Profile Photo

🚨New paper led by Ariba Khan Lots of prior research has assumed that LLMs have stable preferences, align with coherent principles, or can be steered to represent specific worldviews. No ❌, no ❌, and definitely no ❌. We need to be careful not to anthropomorphize LLMs too much.
