Abhinav Rao (@aethersura)'s Twitter Profile
I am going to UMD.
I am at CMU.
I was at MSR.

ID: 1445268646675111936

Link: http://abhinavrao.netlify.app
Date: 05-10-2021 06:05:08

114 Tweets

231 Followers

519 Following

Nabeel S. Qureshi (@nabeelqu)

Here's an alternative framing: we trained Claude Opus to be moral and ethical, and despite our best attempts to jailbreak its morality, we failed. Conclusion: Claude Opus is aligned.

Arjun Choudhry (@arjun_7m)

Excited to share TimeSeriesExam for systematic evaluation of time series reasoning capabilities of LLMs. Think your LLM can reason on time series concepts? Take it for a spin on the TimeSeriesExam! Now publicly available on HuggingFace :)

Aditi Khandelwal (@aditi184)

😡 Absolutely disappointed with Overleaf. My account was deleted without my knowledge, and they’ve done nothing to help me recover it or transfer to my secondary email. Years of work, including all my CVs, SOPs, papers, etc., gone! This is unacceptable. #Overleaf

Abhinav Rao (@aethersura)

Bad actors can really mess with you just because! This is a harsh lesson to secure your accounts and follow good computer practices no matter who you are.

Koustava Goswami (@koustavagoswami)

I am hiring one PhD intern to work on LLM agents and reasoning. The goal is to improve LLM reasoning capability for question-answering and explainable tasks. If you are doing a PhD and have published first-author papers in these fields, please DM me. #NLP Adobe Research

Harshita Diddee (@ihsrahedid)

Ever wondered which instruction selection strategy to choose for your custom setup? The answer might just be random sampling! In our recent #NAACL Findings paper, we show that popular strategies do not *consistently* beat random selection! Paper: shorturl.at/77ECJ 1/6

Akhila Yerukola (@akhila_yerukola)

Did you know? Gestures to express universal concepts—like wishing for luck—vary WIDELY across cultures? 🤞means luck in US but deeply offensive in Vietnam 🚨 📣We introduce MC-SIGNS, a test bed to evaluate how LLMs/VLMs/T2I handle such nonverbal cues 📜: arxiv.org/abs/2502.17710

Language Technologies Institute | @CarnegieMellon (@ltiatcmu)

Congrats to the Purpl3Pwn3rs and Team RedTWIZ! Both teams feature LTI students, and both are finalists in the inaugural Amazon Nova AI Challenge. Read about it here: lti.cmu.edu/news-and-event…

Jim Bohnslav (@jbohnslav)

bytedance calling me GPU poor

A model trained for 665,000 H100 hours is called "cost efficient", "moderate computational resources"

nick.eth (@nicksdjohnson)

Recently I was targeted by an extremely sophisticated phishing attack, and I want to highlight it here. It exploits a vulnerability in Google's infrastructure, and given their refusal to fix it, we're likely to see it a lot more. Here's the email I got:

Gauri (@geekytwoshoes)

Thrilled to announce our paper "CAPTURE: Context-Aware Prompt Injection Testing and Robustness Enhancement" has been accepted to the ACL 2025 LLMSec Workshop! Looking forward to sharing our work on tackling prompt injection in LLMs. #ACL2025 #LLMSec #AIsecurity #NLP

Lindia Tjuatja (@lltjuatja)

When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs: 🧵1/9
