Chris Cundy (@chriscundy)'s Twitter Profile
Chris Cundy

@chriscundy

Research Scientist at FAR AI.
PhD from Stanford University.
Hopefully making AI benefit humanity.

Views are my own.

ID: 891751545594707968

Link: http://cundy.me | Joined: 30-07-2017 20:05:11

342 Tweets

1.1K Followers

215 Following

Chris Cundy (@chriscundy):

Interesting how in the Mistral-Large release they don't use the CoT@32/CoT@8 results for Gemini Pro (and there are no Gemini Ultra results), but presumably recompute them without multiple CoT samples? I wonder if they tried multiple-CoT techniques with Mistral.
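For context, CoT@k refers to sampling k chain-of-thought completions and taking a majority vote over the extracted final answers. A minimal sketch of the idea (the `sample_answer` callable is a hypothetical stand-in for one model call, not any vendor's actual API):

```python
from collections import Counter

def cot_at_k(sample_answer, question, k=32):
    """Majority vote over k chain-of-thought samples.

    sample_answer is a hypothetical stand-in for a single model call
    that returns the final answer parsed from a sampled CoT completion.
    """
    answers = [sample_answer(question) for _ in range(k)]
    # Return the most common final answer across the k samples.
    return Counter(answers).most_common(1)[0][0]
```

Because the vote aggregates k samples, CoT@32 scores aren't directly comparable with single-sample results, which is presumably why a release would recompute baselines.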

Chris Cundy (@chriscundy):

An explanation for the Copilot issues recently: I think the decoding being used for Copilot forces emojis at the end of a message. Degeneration occurs because the decoding distribution differs from the 'correct' distribution. Implication: decoding strategies are exploitable.
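A toy illustration of the mismatch being described (a hypothetical three-token vocabulary and made-up logits, not Copilot's actual decoder): biasing logits to force a particular token at the end of a message means sampling from a distribution that differs from the model's own, and the gap can be quantified as a KL divergence.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl(p, q):
    # KL(p || q): how far the forced decoding distribution p is
    # from the model's own distribution q.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits over a tiny vocabulary: [".", "emoji", "word"]
model_logits = [2.0, -1.0, 1.0]
# The decoder adds a large bias to the emoji token at end-of-message.
forced_logits = [2.0, 4.0, 1.0]

p_model = softmax(model_logits)
p_forced = softmax(forced_logits)
divergence = kl(p_forced, p_model)  # positive whenever the bias changes the distribution
```

The tweet's point is that any nonzero divergence of this kind is a wedge: sampling repeatedly from the shifted distribution can push generations into regions the model assigns low probability, which is where degeneration shows up.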

Chris Cundy (@chriscundy):

Lots of interest in diffusion on discrete spaces. I think Aaron's approach is the most principled I've seen. Plus, lots of applications in conditional generation!

Chris Cundy (@chriscundy):

Has anyone come up with good ideas on how to use LLM assistants, given adversarial inputs? I've already read simonwillison.net/2023/Apr/25/du…, any more discussion?

Chris Cundy (@chriscundy):

Reading the report about many-shot jailbreaking reminds me of the paper from Wolf et al. last year: under weak assumptions, for a sufficiently long context, any LLM is jailbreakable. arxiv.org/pdf/2304.11082…

Mark Russinovich (@markrussinovich):

As part of our ongoing work on AI safety and security, we've discovered a powerful yet simple LLM jailbreak that exploits an intrinsic LLM behavior we call 'crescendo' and have demonstrated it on dozens of tasks across major LLMs and services: …ndo-the-multiturn-jailbreak.github.io

Chris Cundy (@chriscundy):

Happy to announce I was one of the winners of the OpenAI preparedness challenge! If anyone has a safety-related project that is bottlenecked on OpenAI credits, let me know--I'd be happy to help out openai.com/blog/frontier-…

kamilė (@kamilelukosiute):

We need to be spending more money on evals to get better measurements of accuracy and confidence.

I wrote about how to compare model performance with the most basic statistical tools. Although these tools don't capture all the nuance, it's better than the status quo.

Rishi Desai (@rishi_desai2):

🚀 Excited to share our latest #AI research from Stanford on privacy-constrained reinforcement learning, developed with Chris Cundy and Stefano Ermon! Our framework minimizes sensitive information exposure using mutual information regularizers. 🤖💡#AISTATS2024 1/6

Chris Cundy (@chriscundy):

I'm at ICLR this week -- if you're there too, let's chat! I'm interested in hearing more about sequence models and LLMs, particularly alignment, detecting deception, and scalable oversight.

Kelsey Piper (@kelseytuoc):

When you leave OpenAI, you get an unpleasant surprise: a departure deal where if you don't sign a lifelong nondisparagement commitment, you lose all of your vested equity: vox.com/future-perfect…

Chris Cundy (@chriscundy):

Improving to 60% on GPQA from the previous SoTA of 54% is *really* impressive -- the GPQA questions are very difficult! (That is, assuming no test-set leakage...)

Chris Cundy (@chriscundy):

Life update: I'm excited to announce that I defended my PhD last month and have joined FAR.AI as a research scientist! At FAR, I'm investigating scalable approaches to reduce catastrophic risks from AI.