Robert Vacareanu (@robert_nlp)'s Twitter Profile
Robert Vacareanu

@robert_nlp

PhD from @UofArizona
Working on #nlproc
Past: Applied Scientist Intern @AWS (2022, 2023)

ID: 1511946939834667008

Joined: 07-04-2022 06:03:40

121 Tweets

236 Followers

1.1K Following

Robert Vacareanu (@robert_nlp)'s Twitter Profile Photo

Conference on Language Modeling Done! Thanks to everyone who stopped by!
Lastly, the poster + a short description of it are available here drive.google.com/drive/folders/…

#COLM #COLM2024 Conference on Language Modeling #NLP #NLProc
Roberta Raileanu (@robertarail)'s Twitter Profile Photo

I’m looking for a PhD intern for next year to work at the intersection of LLM-based agents and open-ended learning, as part of the Llama Research Team in London. If interested, please send me an email with a short paragraph with some research ideas and apply at the link below.

Prateek Yadav (@prateeky2806)'s Twitter Profile Photo

I'm on the job market! Please reach out if you are looking to hire someone to work on:
- RLHF
- Efficiency
- MoE/Modular models
- Synthetic Data
- Test-time compute
- other phases of pre/post-training

If you are not hiring, then I would appreciate a retweet! More details👇

Jacob Andreas (@jacobandreas)'s Twitter Profile Photo

Is your CS dept worried about what academic research should be in the age of LLMs? Hire one of my lab members! Leshem Choshen 🤖🤗, Pratyusha Sharma, and Ekin Akyürek are all on the job market with unique perspectives on the future of NLP: 🧵

Summer Yue (@summeryue0)'s Twitter Profile Photo

SEAL Visual-Understanding Leaderboard Launch 🏆

Today, we’re introducing VISTA—a new rubric-based visual task assessment benchmark that pushes beyond simple Q&A.

The leading models achieve under 40% on this eval, compared to a human baseline of ~55.4%. This highlights that
Summer Yue (@summeryue0)'s Twitter Profile Photo

🚀Big update: 4 new SEAL multilingual leaderboards are LIVE — Arabic, Chinese, Japanese, and Korean!
🌍 Arabic: Gemini 1.5 Pro (gemini-exp-1121) leads the pack
🏮 Chinese: Gemini 1.5 Pro (gemini-1.5-pro-exp-0827) holds the crown
💫 Japanese & Korean: o1-preview dominates
📊 See

Zifan (Sail) Wang (@_zifan_wang)'s Twitter Profile Photo

🧵 1/N) Excited to share our recent work at Scale AI, "Jailbreaking to Jailbreak (J2)".😈 We present a novel LLM-as-red-teamer approach in which a human jailbreaks a refusal-trained LLM to make it willing to jailbreak itself or other LLMs. We refer to this process as
Mihai Surdeanu (@msurd)'s Twitter Profile Photo

Our new paper in Findings of NAACL 2025, with Vlad Negru, Robert Vacareanu, Camelia Lemnaru, and Rodica Potolea, proposes a new, softer take on Natural Logic, where alignment is generated through text morphing. This yields robust performance across domains. arxiv.org/abs/2502.09567

Tanmoy Chakraborty (@tanmoy_chak)'s Twitter Profile Photo

**Kindly consider sharing the post** We are seeking opinions about the current quality of reviewing in *CL conferences. We (EMNLP 2025 PCs along with ACLRollingReview EiCs) are committed to improving the review quality. We are bringing a series of changes to the review process.

Diyi Yang (@diyi_yang)'s Twitter Profile Photo

Check out 🔥 EgoNormia: a benchmark for physical social norm understanding egonormia.org Can we really trust VLMs to make decisions that align with human norms? 👩‍⚖️ With EgoNormia, an 1800 ego-centric video 🥽 QA benchmark, we show that this is surprisingly challenging

MohammadHossein Rezaei (@mhrezaeics)'s Twitter Profile Photo

🔥 Excited to share EgoNormia! A benchmark for physical social norm understanding. Can we really trust VLMs to make decisions that align with human norms? 🌐 Check out our website for the answer: egonormia.org Proud to be part of this amazing team! 🚀

Prateek Yadav (@prateeky2806)'s Twitter Profile Photo

Excited to share our work on RSQ — enhancing quantization by focusing on the most impactful tokens.
- Rotate, Scale, Quantize: delivering strong performance
- Dynamic, attention-based token importance drives better efficiency
- Results across LLaMA3, Mistral, Qwen-2.5, and more
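
The tweet only names the high-level recipe, so here is a minimal, purely illustrative NumPy sketch of a generic rotate-scale-quantize pipeline with token-importance weighting. Everything below (the random orthogonal rotation, per-channel scaling, int4 round-to-nearest, and the importance-weighted reconstruction error) is an assumption for illustration only, not the algorithm from the RSQ paper.

```python
import numpy as np

def rotate_scale_quantize(W, X, token_importance, bits=4, seed=0):
    """Illustrative rotate -> scale -> quantize pipeline (NOT the RSQ algorithm).

    W: (d_in, d_out) weight matrix to quantize
    X: (n_tokens, d_in) calibration activations
    token_importance: (n_tokens,) weights, e.g. attention-derived scores
    """
    rng = np.random.default_rng(seed)
    # 1) Rotate: random orthogonal matrix (assumption; real methods often use Hadamard-like transforms)
    Q, _ = np.linalg.qr(rng.standard_normal((W.shape[0], W.shape[0])))
    W_rot = Q.T @ W          # rotate weights
    X_rot = X @ Q            # rotate activations consistently, so X_rot @ W_rot matches X @ W

    # 2) Scale: per-output-channel symmetric scale for round-to-nearest quantization
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(W_rot).max(axis=0, keepdims=True) / qmax

    # 3) Quantize: round to the integer grid, then dequantize for error measurement
    W_q = np.clip(np.round(W_rot / scale), -qmax - 1, qmax) * scale

    # Token-importance-weighted reconstruction error (where "focus on impactful tokens" could enter an objective)
    err = X_rot @ (W_q - W_rot)                      # (n_tokens, d_out) output deviation
    weighted_err = np.sum(token_importance[:, None] * err ** 2) / token_importance.sum()
    return W_q, Q, scale, weighted_err

# Toy usage: up-weight a few tokens when measuring quantization damage
W = np.random.randn(64, 32)
X = np.random.randn(128, 64)
imp = np.ones(128); imp[:8] = 10.0   # pretend the first 8 tokens matter most
_, _, _, e = rotate_scale_quantize(W, X, imp)
print(f"importance-weighted quantization error: {e:.4f}")
```

The weighting term is only meant to show one place where token importance could influence a quantization objective; the actual RSQ procedure should be taken from the paper.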

Amanda Bertsch (@abertsch72)'s Twitter Profile Photo

coming to a NAACL 2025 near you! 🌞 Looking forward to discussing with folks in Albuquerque :) The camera-ready is on arxiv now, with more models, more tasks, and more settings compared, including results comparing ICL to full finetuning! arxiv.org/abs/2405.00200

Zifan (Sail) Wang (@_zifan_wang)'s Twitter Profile Photo

Exciting that Scale AI is sponsoring the Agent Workshop at CMU in April. Students and researchers who work on agents, feel free to visit CMU to present your work! I will also be traveling to Pittsburgh to share my recent focus on agents, covering both capability and safety.
Stanford NLP Group (@stanfordnlp)'s Twitter Profile Photo

Look who we found hanging out in her new Stanford Engineering Gates Computer Science office!

We’re truly delighted to welcome Yejin Choi as a new Stanford NLP Group faculty member, starting full-time in September. ❤️

nlp.stanford.edu/people/
Francesco Orabona (@bremen79)'s Twitter Profile Photo

This is a turning point: I just proved a complex math result useful for my research using an LLM. I am not sure if I should be happy or scared...

MohammadHossein Rezaei (@mhrezaeics)'s Twitter Profile Photo

If you’re at NAACL today, I’ll be presenting this poster in Hall 3 from 2:00 – 3:30 PM. Paper link: aclanthology.org/2025.naacl-lon…

Francesco Orabona (@bremen79)'s Twitter Profile Photo

As promised, we put on Arxiv the proof we did with Gemini. arxiv.org/pdf/2505.20219

This shows that the Polyak stepsize not only will not reach the optimum, but it can cycle, when used without the knowledge of f*.

Gemini failed when prompted directly ("Find an example where the
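
For context on why knowing f* matters here, below is a brief sketch of the textbook Polyak stepsize in LaTeX. The definitions are the standard ones; the cycling claim is the tweet's, and the specific counterexample is the paper's contribution, not reproduced here.

```latex
% Textbook Polyak stepsize for minimizing a convex f with optimal value f^*:
% the step length uses the gap f(x_t) - f^*, so it requires knowing f^*.
\[
  \eta_t \;=\; \frac{f(x_t) - f^*}{\lVert \nabla f(x_t) \rVert^2},
  \qquad
  x_{t+1} \;=\; x_t - \eta_t \,\nabla f(x_t).
\]
% When f^* is unknown, a surrogate estimate \hat{f} is plugged in instead;
% the tweet's point is that with such a surrogate the iterates need not
% reach the optimum and can even cycle.
\[
  \hat{\eta}_t \;=\; \frac{f(x_t) - \hat{f}}{\lVert \nabla f(x_t) \rVert^2}
  \quad \text{(surrogate stepsize when } f^* \text{ is unavailable).}
\]
```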