Luxi (Lucy) He (@luxihelucy)'s Twitter Profile
Luxi (Lucy) He

@luxihelucy

Princeton CS PhD @PrincetonPLI. Previously @Harvard ‘23 CS & Math.

ID: 1583989164223172608

Link: https://lumos23.github.io/ · Joined: 23-10-2022 01:10:27

66 Tweets

963 Followers

370 Following

Yangsibo Huang (@yangsibohuang)'s Twitter Profile Photo

Attending Conference on Language Modeling from 10/6 to 10/9! If you want to chat about GenAI security, privacy, safety, or reasoning (I just started exploring it!), DM me :) Also, my team at Google AI is looking for interns. Email me ([email protected]) your resume if you are interested.

Tianyu Gao (@gaotianyu1350)'s Twitter Profile Photo

Very proud to introduce two of our recent long-context works:

HELMET (best long-context benchmark imo): shorturl.at/JnBHD
ProLong (a continued-training & SFT recipe + a SoTA 512K 8B model): shorturl.at/XQV7a

Here is a story of how we arrived there
Luxi (Lucy) He (@luxihelucy)'s Twitter Profile Photo

I'm attending Conference on Language Modeling next week! Excited to meet folks and chat about alignment, safety, reasoning, LM evaluations, and more! Please feel free to reach out anytime :)
Mengzhou Xia and I will present our work on data selection + safety on Tuesday afternoon, come chat with us!
Sadhika Malladi (@sadhikamalladi)'s Twitter Profile Photo

Theory + experiments in our new work show that preference tuning can move probability mass in unexpected ways, causing aligned models (across scales and settings) to unalign. For example, training a model to prefer "No" over "Never" makes the probability of "Yes" increase. arxiv.org/abs/2410.08847
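
The "No"-vs-"Never" effect is easy to probe empirically. Below is a minimal sketch (not the paper's experimental setup; the checkpoint names and the prompt are placeholders) that compares next-token probabilities for candidate answers before and after preference tuning, using Hugging Face transformers:

```python
# Minimal sketch (not the paper's setup): compare next-token probabilities
# for "Yes", "No", and "Never" between a base model and its preference-tuned
# version. The checkpoint names and the prompt are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def next_token_probs(model_name, prompt, candidates):
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).eval()
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits for the next token
    probs = torch.softmax(logits, dim=-1)
    # Use each candidate's first sub-token as a proxy for its probability.
    return {c: probs[tok.encode(" " + c, add_special_tokens=False)[0]].item()
            for c in candidates}

prompt = "Q: Will you ever reveal the secret? A:"
for name in ["base-model", "preference-tuned-model"]:  # placeholder checkpoints
    print(name, next_token_probs(name, prompt, ["Yes", "No", "Never"]))
```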

Luxi (Lucy) He (@luxihelucy)'s Twitter Profile Photo

Join us today at 3 pm ET for a discussion on AI safety and alignment with David Krueger 🤩 Submit your questions in advance at the link in the post!

Yangsibo Huang (@yangsibohuang)'s Twitter Profile Photo

Unlearning allows users to request the removal of specific data from a trained model.

Sounds great, right? 

👿 BUT: we show how adversaries can exploit this to completely DESTROY model accuracy—plummeting to just 3.6% on CIFAR-10 and 0.4% on ImageNet after the attack!

(1/n)
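
The attack details are in the thread, but the threat model is easy to illustrate. Here is a toy sketch (NOT the paper's attack): "unlearning" is simulated by exactly retraining a small classifier, and an adversary greedily submits removal requests for the training points whose deletion hurts held-out accuracy the most:

```python
# Toy illustration of the threat model (NOT the paper's attack): "unlearning"
# is simulated by exact retraining of a small classifier, and an adversary
# greedily requests removal of the training points whose deletion hurts
# held-out accuracy the most.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
Xtr, ytr, Xte, yte = X[:300], y[:300], X[300:], y[300:]
keep = np.ones(len(Xtr), dtype=bool)  # which training points remain

def test_acc(mask):
    clf = LogisticRegression(max_iter=1000).fit(Xtr[mask], ytr[mask])
    return clf.score(Xte, yte)

print("accuracy before attack:", test_acc(keep))
for _ in range(30):  # 30 adversarial removal requests
    candidates = rng.choice(np.flatnonzero(keep), size=20, replace=False)
    # Request removal of the candidate whose deletion degrades accuracy most.
    worst = min(candidates, key=lambda i: test_acc(keep & (np.arange(len(Xtr)) != i)))
    keep[worst] = False
print("accuracy after attack:", test_acc(keep))
```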
Ryan Liu @ NeurIPS 2024 (@theryanliu)'s Twitter Profile Photo

Is encouraging LLMs to reason through a task always beneficial?🤔

NO🛑: inspired by cases where verbal thinking makes humans worse at tasks, we predict when CoT impairs LLMs & find 3 types of failure cases.

In one case, OpenAI o1-preview accuracy drops 36.3% compared to GPT-4o zero-shot!😱
Luxi (Lucy) He (@luxihelucy)'s Twitter Profile Photo

Excited for the talk today at 2pm ET! YouTube link here: youtube.com/@PrincetonPLI, and submit your questions via forms.gle/7GQXAr9aonfvy1… 🤩

Sadhika Malladi (@sadhikamalladi)'s Twitter Profile Photo

Congratulations to Ai2 on the exciting Tulu 3 release! We had Nathan Lambert on PASS a few weeks ago to talk all about it. Check out the recording for an easy primer to the paper: youtube.com/watch?v=ltSzUI…

Tianyu Gao (@gaotianyu1350)'s Twitter Profile Photo

Introducing MeCo (metadata conditioning then cooldown), a remarkably simple method that accelerates LM pre-training by simply prepending source URLs to training documents.

arxiv.org/abs/2501.01956
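
A minimal sketch of the preprocessing idea as described above (not the official MeCo implementation; the record format, the 10% cooldown fraction, and the training loop are assumptions for illustration):

```python
# Minimal sketch of the idea described above (not the official MeCo
# implementation): prepend each document's source URL for most of
# pre-training, then drop the metadata for the final "cooldown" phase.
# The record format and the 10% cooldown fraction are assumptions.
def format_example(doc, in_cooldown):
    if in_cooldown:
        return doc["text"]                   # cooldown: plain text only
    return f"{doc['url']}\n\n{doc['text']}"  # metadata conditioning

corpus = [
    {"url": "https://en.wikipedia.org/wiki/Language_model",
     "text": "A language model assigns probabilities to sequences ..."},
    {"url": "https://example-blog.com/post", "text": "Today I tried ..."},
]

total_steps, cooldown_frac = 100_000, 0.10
for step in range(total_steps):
    in_cooldown = step >= total_steps * (1 - cooldown_frac)
    doc = corpus[step % len(corpus)]
    training_text = format_example(doc, in_cooldown)
    # ... tokenize `training_text` and take an optimizer step ...
```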
Simon Park (@parksimon0808)'s Twitter Profile Photo

Does all LLM reasoning transfer to VLMs? In the context of simple-to-hard generalization, we show: NO! We also give ways to reduce this modality imbalance.

Paper arxiv.org/abs/2501.02669
Code github.com/princeton-pli/…

Abhishek Panigrahi, Yun (Catherine) Cheng, Dingli Yu, Anirudh Goyal, Sanjeev Arora
Yangsibo Huang (@yangsibohuang)'s Twitter Profile Photo

LLM safety guardrails can be easily removed through fine-tuning. While defenses have been proposed, our #ICLR2025 paper shows flawed evaluations can create a false sense of security. Check out the thread by Boyi Wei for more details 🧵

Alex Wettig (@_awettig)'s Twitter Profile Photo

🤔 Ever wondered how prevalent a given type of web content is during LM pre-training?

In our new paper, we propose WebOrganizer, which *constructs domains* based on the topic and format of CommonCrawl web pages 🌐

Key takeaway: domains help us curate better pre-training data! 🧵/N
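
As a hedged sketch of the idea (not the released WebOrganizer code), one can treat each (topic, format) pair as a domain and measure its prevalence in a crawl sample; the two classifiers below are trivial placeholders standing in for the paper's trained ones:

```python
# Hedged sketch of the idea (not the released WebOrganizer code): treat each
# (topic, format) pair as a "domain" and measure its prevalence in a crawl
# sample. The two classifiers below are trivial placeholders standing in for
# the paper's trained topic/format classifiers.
from collections import Counter

def classify_topic(page):   # placeholder for a trained topic classifier
    return "science" if "experiment" in page.lower() else "other"

def classify_format(page):  # placeholder for a trained format classifier
    return "tutorial" if "step 1" in page.lower() else "article"

pages = [
    "Step 1: set up the experiment apparatus ...",
    "Breaking news from the city council ...",
    "We ran an experiment comparing optimizers ...",
]

domains = Counter((classify_topic(p), classify_format(p)) for p in pages)
for (topic, fmt), n in domains.most_common():
    print(f"{topic}/{fmt}: {n / len(pages):.0%} of sample")
```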
Peter Henderson (@peterhndrsn)'s Twitter Profile Photo

Preserving alignment during customization &amp; fine-tuning is a challenging problem! Here's another work showing how language models can be broadly misaligned by finetuning. If interested, can also check out work from our group by <a href="/LuxiHeLucy/">Luxi (Lucy) He</a> <a href="/wei_boyi/">Boyi Wei</a>, <a href="/xiangyuqi_pton/">Xiangyu Qi</a>, &amp; others!
Peter Henderson (@peterhndrsn)'s Twitter Profile Photo

Very excited that our work, "Safety Alignment Should be Made More Than Just a Few Tokens Deep," was recognized with an Outstanding Paper Award at #ICLR2025! We hope this is a step forward in improving and understanding the robustness of language model alignment. It was great working