Haoran Xu (@fe1ixxu)'s Twitter Profile
Haoran Xu

@fe1ixxu

PhD student in CS @jhuclsp | Intern @Microsoft Research | ex-intern @Meta AI and @Amazon Alexa AI

ID: 899148379279818752

Website: http://www.fe1ixxu.com | Joined: 20-08-2017 05:57:33

64 Tweets

372 Followers

169 Following

Lingfeng Shen (@lingfeng_nlp)'s Twitter Profile Photo

Is In-Context Learning (ICL) equivalent to Gradient Descent (GD)? There is a common belief that applying ICL in #LLM functions like GD-based fine-tuning. But does this hold in real-world LLMs? 🤔 Find out in our latest paper: arxiv.org/abs/2310.08540

JHU CLSP (@jhuclsp)'s Twitter Profile Photo

“Condensing Multilingual Knowledge with Lightweight Language-Specific Modules” Draft: arxiv.org/abs/2305.13993 By: Haoran Xu and authors TLDR: We propose lightweight yet parameter-efficient language-specific modules and further fuse multilingual knowledge into a shared module.

JHU Computer Science (@jhucompsci)'s Twitter Profile Photo

Multi-language mastery: minimized hardware, maximized efficiency! Johns Hopkins computer scientists (feat. Haoran Xu & Kenton Murray) introduce a new method to reduce the size of multilingual language models. hub.jhu.edu/2023/12/07/mul…

Lingfeng Shen (@lingfeng_nlp)'s Twitter Profile Photo

So happy to share that our paper 'The Trickle-down Impact of Reward (In-)consistency on RLHF' (arxiv.org/abs/2309.16155…) has been accepted by ICLR this year. #ICLR #RLHF I believe that we should explore/enhance RLHF through a data-centric perspective! JHU CLSP

Young (@yjkim362)'s Twitter Profile Photo

Opening up a new generation of machine translation leveraging the power of LLMs! It's now in #ICLR2024 (w/ Haoran Xu, Amr Sharaf, Hany Awadalla). Teaser: Another breakthrough is coming, soonish..

Young (@yjkim362)'s Twitter Profile Photo

We love DPO for its elegance and simplicity. So, we are making it even better! By eliminating the reference model, the loss function becomes contrastive and we call it CPO (Contrastive Preference Optimization). It's even more effective at our target task than DPO!
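To make the idea concrete, here is a rough sketch (not the authors' implementation) of how dropping the reference model turns the DPO objective into a CPO-style contrastive loss. The β and NLL-weight values are placeholder hyperparameters, and real implementations compute sequence log-probs from the model rather than taking them as floats:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    # DPO compares the policy's log-probs of the preferred (w) and
    # dispreferred (l) outputs against a frozen reference model.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(sigmoid(margin))

def cpo_loss(logp_w, logp_l, beta=0.1, nll_weight=1.0):
    # CPO drops the reference model: the margin contrasts the policy's
    # own log-probs directly, plus an NLL (behavior-cloning) term that
    # anchors the model on the preferred output.
    prefer = -math.log(sigmoid(beta * (logp_w - logp_l)))
    nll = -logp_w  # negative log-likelihood of the preferred sequence
    return prefer + nll_weight * nll
```

Note that removing the reference model also halves the forward passes needed per training pair, which is part of CPO's efficiency argument.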

Weiting (Steven) Tan (@weiting_nlp)'s Twitter Profile Photo

Is your model struggling with high latency and huge memory costs for real-time sequence processing? 🚀 Introducing STAR: A transformer-based model for streaming seq2seq transduction with compression. arxiv.org/abs/2402.01172 #NLProc #Streaming #Seq2seq #Compression #SpeechToText

Lingfeng Shen (@lingfeng_nlp)'s Twitter Profile Photo

📢 Happy to share that our paper on #LLM safety in multilingual contexts has been accepted at #ACL 2024! ✨ We show the difficulty of alleviating multilingual safety issues in LLMs through standard alignment methods. arxiv.org/abs/2401.13136 🧵1/7

Lingfeng Shen (@lingfeng_nlp)'s Twitter Profile Photo

Super excited that our work got picked for an #Oral presentation at #ICML this year! Had an awesome time collaborating with Aayush Mishra and Daniel Khashabi 🕊️ at JHU CLSP. Pity I can't make it to Vienna because of visa issues😅

Haoran Xu (@fe1ixxu)'s Twitter Profile Photo

We recently had multiple rounds of discussions with the SimPO authors regarding the lack of comparison to CPO in their main paper. We both agree that it was an unintentional oversight, and they will update the paper to address it. We appreciate their positive and prompt response.

Haoran Xu (@fe1ixxu)'s Twitter Profile Photo

Here’s some better news: Combining CPO and SimPO can likely improve the model! Check out more details in our GitHub code: github.com/fe1ixxu/CPO_SI…
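One plausible way to combine the two objectives, sketched below under stated assumptions: SimPO contributes a length-normalized margin with a target reward margin γ, while CPO contributes the extra NLL term on the preferred output. The exact loss in the linked repo may differ, and β, γ, and the NLL weight are placeholders:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cpo_simpo_loss(logp_w, len_w, logp_l, len_l,
                   beta=2.0, gamma=0.5, nll_weight=1.0):
    # SimPO-style margin: length-normalized sequence log-probs minus a
    # target reward margin gamma (no reference model, as in CPO).
    margin = beta * (logp_w / len_w - logp_l / len_l) - gamma
    prefer = -math.log(sigmoid(margin))
    # CPO-style anchor: NLL on the preferred output (length-normalized
    # here to stay on the same scale as the margin term).
    nll = -logp_w / len_w
    return prefer + nll_weight * nll
```

As with CPO alone, there is still no reference model, so training cost per preference pair stays the same while the margin becomes length-aware.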

Young (@yjkim362)'s Twitter Profile Photo

"What if Phi meets MoE?" I am super excited to share our new Phi-3.5-MoE. Phi-3.5-MoE is a 16 x 3.8B MoE model that activates only 6.6B params with 2 experts per token. MMLU score of 78.9! It outperforms Llama-3.1 8B, Gemma-2-9B, and Gemini-1.5-Flash, and is close to GPT-4o-mini. MIT license.
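The "16 experts, 2 active" arithmetic comes from top-2 gating: a router scores all experts but only the two highest-gated ones run per token, so roughly 6.6B of the total parameters are active. A minimal sketch of that routing step, with a hypothetical score function and scalar "experts" standing in for real feed-forward blocks:

```python
import math

def top2_moe(score_fn, experts, x):
    # Router scores every expert for input x, softmaxes the scores,
    # then activates only the top-2 experts and mixes their outputs
    # by renormalized gate weights. The other experts never run,
    # which is why a 16 x 3.8B model can activate only ~6.6B params.
    scores = [score_fn(i, x) for i in range(len(experts))]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # stable softmax
    total = sum(exps)
    gates = [e / total for e in exps]
    top2 = sorted(range(len(experts)), key=lambda i: gates[i],
                  reverse=True)[:2]
    norm = sum(gates[i] for i in top2)
    return sum((gates[i] / norm) * experts[i](x) for i in top2)
```

In a real transformer the router is a learned linear layer over the token's hidden state and each expert is a full feed-forward block; this sketch only shows the select-and-mix logic.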