Abhinav Rao (@aethersura) Twitter Tweets • TwiCopy

Nabeel S. Qureshi

a year ago

Here's an alternative framing: we trained Claude Opus to be moral and ethical, and despite our best attempts to jailbreak its morality, we failed. Conclusion: Claude Opus is aligned.

Arjun Choudhry

Excited to share TimeSeriesExam for systematic evaluation of the time series reasoning capabilities of LLMs. Think your LLM can reason about time series concepts? Take it for a spin on the TimeSeriesExam! Now publicly available on HuggingFace :)

Aditi Khandelwal

@aditi184

a year ago

😡 Absolutely disappointed with Overleaf. My account was deleted without my knowledge, and they’ve done nothing to help me recover it or transfer to my secondary email. Years of work, including all my CVs, SOPs, papers, etc., gone! This is unacceptable. #Overleaf

Abhinav Rao

@aethersura

a year ago

Bad actors can really mess with you just because! This is a harsh lesson to secure your accounts and follow good computer practices no matter who you are.

Koustava Goswami

@koustavagoswami

a year ago

I am hiring one PhD intern to work on LLM agents and reasoning. The goal is to improve LLM reasoning capability for question-answering and explainable tasks. If you are doing a PhD and have published first-author papers in these fields, please DM me. #NLP Adobe Research

Harshita Diddee

@ihsrahedid

10 months ago

Ever wondered which instruction selection strategy to choose for your custom setup? The answer might just be random sampling! In our recent #NAACL Findings paper, we show that popular strategies do not *consistently* beat random selection! Paper: shorturl.at/77ECJ 1/6

Akhila Yerukola

@akhila_yerukola

9 months ago

Did you know? Gestures to express universal concepts—like wishing for luck—vary WIDELY across cultures? 🤞means luck in US but deeply offensive in Vietnam 🚨 📣We introduce MC-SIGNS, a test bed to evaluate how LLMs/VLMs/T2I handle such nonverbal cues 📜: arxiv.org/abs/2502.17710

Language Technologies Institute | @CarnegieMellon

@ltiatcmu

8 months ago

Congrats to the Purpl3Pwn3rs and Team RedTWIZ! Both teams feature LTI students, and both are finalists in the inaugural Amazon Nova AI Challenge. Read about it here: lti.cmu.edu/news-and-event…

Jim Bohnslav

@jbohnslav

8 months ago

bytedance calling me GPU poor

A model trained for 665,000 H100 hours is called "cost efficient", "moderate computational resources"

nick.eth

@nicksdjohnson

8 months ago

Recently I was targeted by an extremely sophisticated phishing attack, and I want to highlight it here. It exploits a vulnerability in Google's infrastructure, and given their refusal to fix it, we're likely to see it a lot more. Here's the email I got:

nick.eth

@nicksdjohnson

8 months ago

Turns out easydmarc have a good writeup on this attack too: easydmarc.com/blog/google-sp…

Gauri

@geekytwoshoes

7 months ago

Thrilled to announce our paper "CAPTURE: Context-Aware Prompt Injection Testing and Robustness Enhancement" has been accepted to the ACL 2025 LLMSec Workshop! Looking forward to sharing our work on tackling prompt injection in LLMs. #ACL2025 #LLMSec #AIsecurity #NLP

Lindia Tjuatja

@lltjuatja

6 months ago

When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs: 🧵1/9
