Robert Vacareanu (@robert_nlp)'s Twitter Profile
Robert Vacareanu

@robert_nlp

PhD from @UofArizona
Working on #nlproc
Past: Applied Scientist Intern @AWS (2022, 2023)

ID: 1511946939834667008

Joined: 07-04-2022 06:03:40

121 Tweets

236 Followers

1.1K Following

Robert Vacareanu (@robert_nlp)'s Twitter Profile Photo

Conference on Language Modeling Done! Thanks to everyone who stopped by!
Lastly, the poster + a short description of it are available here drive.google.com/drive/folders/…

#COLM #COLM2024 Conference on Language Modeling #NLP #NLProc
Roberta Raileanu (@robertarail)'s Twitter Profile Photo

I’m looking for a PhD intern for next year to work at the intersection of LLM-based agents and open-ended learning, as part of the Llama Research Team in London. If interested, please send me an email with a short paragraph with some research ideas and apply at the link below.

Prateek Yadav (@prateeky2806)'s Twitter Profile Photo

I'm on the job market! Please reach out if you are looking to hire someone to work on:
- RLHF
- Efficiency
- MoE/Modular models
- Synthetic Data
- Test-time compute
- other phases of pre/post-training

If you are not hiring, then I would appreciate a retweet! More details👇

Jacob Andreas (@jacobandreas)'s Twitter Profile Photo

Is your CS dept worried about what academic research should be in the age of LLMs? Hire one of my lab members! Leshem Choshen 🤖🤗, Pratyusha Sharma, and Ekin Akyürek are all on the job market with unique perspectives on the future of NLP: 🧵

Summer Yue (@summeryue0)'s Twitter Profile Photo

SEAL Visual-Understanding Leaderboard Launch 🏆

Today, we’re introducing VISTA—a new rubric-based visual task assessment benchmark that pushes beyond simple Q&A.

The leading models achieve under 40% on this eval, compared to a human baseline of ~55.4%. This highlights that
Summer Yue (@summeryue0)'s Twitter Profile Photo

🚀Big update: 4 new SEAL multilingual leaderboards are LIVE — Arabic, Chinese, Japanese, and Korean!
🌍 Arabic: Gemini 1.5 Pro (gemini-exp-1121) leads the pack
🏮 Chinese: Gemini 1.5 Pro (gemini-1.5-pro-exp-0827) holds the crown
💫 Japanese & Korean: o1-preview dominates
📊 See

Zifan (Sail) Wang (@_zifan_wang)'s Twitter Profile Photo

🧵 1/N) Excited to share our recent work at Scale AI, "Jailbreaking to Jailbreak (J2)".😈 We present a novel LLM-as-red-teamer approach in which a human jailbreaks a refusal-trained LLM to make it willing to jailbreak itself or other LLMs. We refer to this process as
Mihai Surdeanu (@msurd)'s Twitter Profile Photo

Our new paper in Findings of NAACL 2025, with Vlad Negru, Robert Vacareanu, Camelia Lemnaru, and Rodica Potolea, proposes a new, softer take on Natural Logic, where alignment is generated through text morphing. This yields robust performance across domains. arxiv.org/abs/2502.09567

Tanmoy Chakraborty (@tanmoy_chak)'s Twitter Profile Photo

**Kindly consider sharing the post** We are seeking opinions about the current quality of reviewing in *CL conferences. We (EMNLP 2025 PCs along with ACLRollingReview EiCs) are committed to improving the review quality. We are bringing a series of changes to the review process.

Diyi Yang (@diyi_yang)'s Twitter Profile Photo

Check out 🔥 EgoNormia: a benchmark for physical social norm understanding egonormia.org Can we really trust VLMs to make decisions that align with human norms? 👩‍⚖️ With EgoNormia, an 1800 ego-centric video 🥽 QA benchmark, we show that this is surprisingly challenging

MohammadHossein Rezaei (@mhrezaeics)'s Twitter Profile Photo

🔥 Excited to share EgoNormia! A benchmark for physical social norm understanding. Can we really trust VLMs to make decisions that align with human norms? 🌐 Check out our website for the answer: egonormia.org Proud to be part of this amazing team! 🚀

Prateek Yadav (@prateeky2806)'s Twitter Profile Photo

Excited to share our work on RSQ — enhancing quantization by focusing on the most impactful tokens.
- Rotate, Scale, Quantize: delivering strong performance
- Dynamic, attention-based token importance drives better efficiency
- Results across LLaMA3, Mistral, Qwen-2.5, and more
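
The tweet only names the high-level recipe, so here is a minimal, purely illustrative NumPy sketch of a generic rotate-scale-quantize pipeline with token-importance weighting. Everything below (the random orthogonal rotation, per-channel scaling, int4 round-to-nearest, and the importance-weighted reconstruction error) is an assumption for illustration only, not the algorithm from the RSQ paper.

```python
import numpy as np

def rotate_scale_quantize(W, X, token_importance, bits=4, seed=0):
    """Illustrative rotate -> scale -> quantize pipeline (NOT the RSQ algorithm).

    W: (d_in, d_out) weight matrix to quantize
    X: (n_tokens, d_in) calibration activations
    token_importance: (n_tokens,) weights, e.g. attention-derived scores
    """
    rng = np.random.default_rng(seed)
    # 1) Rotate: random orthogonal matrix (assumption; real methods often use Hadamard-like transforms)
    Q, _ = np.linalg.qr(rng.standard_normal((W.shape[0], W.shape[0])))
    W_rot = Q.T @ W          # rotate weights
    X_rot = X @ Q            # rotate activations consistently, so X_rot @ W_rot matches X @ W

    # 2) Scale: per-output-channel symmetric scale for round-to-nearest quantization
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(W_rot).max(axis=0, keepdims=True) / qmax

    # 3) Quantize: round to the integer grid, then dequantize for error measurement
    W_q = np.clip(np.round(W_rot / scale), -qmax - 1, qmax) * scale

    # Token-importance-weighted reconstruction error (where "focus on impactful tokens" could enter an objective)
    err = X_rot @ (W_q - W_rot)                      # (n_tokens, d_out) output deviation
    weighted_err = np.sum(token_importance[:, None] * err ** 2) / token_importance.sum()
    return W_q, Q, scale, weighted_err

# Toy usage: up-weight a few tokens when measuring quantization damage
W = np.random.randn(64, 32)
X = np.random.randn(128, 64)
imp = np.ones(128); imp[:8] = 10.0   # pretend the first 8 tokens matter most
_, _, _, e = rotate_scale_quantize(W, X, imp)
print(f"importance-weighted quantization error: {e:.4f}")
```

The weighting term is only meant to show one place where token importance could influence a quantization objective; the actual RSQ procedure should be taken from the paper.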

Amanda Bertsch (@abertsch72)'s Twitter Profile Photo

coming to a NAACL 2025 near you! 🌞 Looking forward to discussing with folks in Albuquerque :) The camera-ready is on arxiv now, with more models, more tasks, and more settings compared, including results comparing ICL to full finetuning! arxiv.org/abs/2405.00200

Zifan (Sail) Wang (@_zifan_wang)'s Twitter Profile Photo

Exciting that Scale AI is sponsoring the Agent Workshop at CMU in April. Students and researchers who work on agents, feel free to visit CMU to present your work! I will also be traveling to Pittsburgh to share my recent focus on agents, covering both capability and safety.
Stanford NLP Group (@stanfordnlp)'s Twitter Profile Photo

Look who we found hanging out in her new Stanford Engineering Gates Computer Science office!

We’re truly delighted to welcome Yejin Choi as a new Stanford NLP Group faculty member, starting full-time in September. ❤️

nlp.stanford.edu/people/
Francesco Orabona (@bremen79)'s Twitter Profile Photo

This is a turning point: I just proved a complex math result useful for my research using an LLM. I am not sure if I should be happy or scared...

MohammadHossein Rezaei (@mhrezaeics)'s Twitter Profile Photo

If you’re at NAACL today, I’ll be presenting this poster in Hall 3 from 2:00 – 3:30 PM. Paper link: aclanthology.org/2025.naacl-lon…

Francesco Orabona (@bremen79)'s Twitter Profile Photo

As promised, we put on Arxiv the proof we did with Gemini. arxiv.org/pdf/2505.20219

This shows that the Polyak stepsize not only will not reach the optimum, but it can cycle, when used without the knowledge of f*.

Gemini failed when prompted directly ("Find an example where the
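
For context on why knowing f* matters here, below is a brief sketch of the textbook Polyak stepsize in LaTeX. The definitions are the standard ones; the cycling claim is the tweet's, and the specific counterexample is the paper's contribution, not reproduced here.

```latex
% Textbook Polyak stepsize for minimizing a convex f with optimal value f^*:
% the step length uses the gap f(x_t) - f^*, so it requires knowing f^*.
\[
  \eta_t \;=\; \frac{f(x_t) - f^*}{\lVert \nabla f(x_t) \rVert^2},
  \qquad
  x_{t+1} \;=\; x_t - \eta_t \,\nabla f(x_t).
\]
% When f^* is unknown, a surrogate estimate \hat{f} is plugged in instead;
% the tweet's point is that with such a surrogate the iterates need not
% reach the optimum and can even cycle.
\[
  \hat{\eta}_t \;=\; \frac{f(x_t) - \hat{f}}{\lVert \nabla f(x_t) \rVert^2}
  \quad \text{(surrogate stepsize when } f^* \text{ is unavailable).}
\]
```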