Lei Li (@lileics) 's Twitter Profile
Lei Li

@lileics

Generative AI for language and science. MT, LLM, GenAI Safety, Drug Discovery

ID: 128674863

Link: https://www.cs.cmu.edu/~leili
Joined: 01-04-2010 21:20:24

750 Tweets

5.5K Followers

433 Following

Lei Li (@lileics) 's Twitter Profile Photo

4/n 3. some weekly exercise could help. biking, running, skiing… I like my current city because we can have all different activities through four seasons.

Lei Li (@lileics) 's Twitter Profile Photo

5/n 4. there is an impossible triangle: fame, wealth, and work freedom. accepting this will lead to a more peaceful mind.

Lei Li (@lileics) 's Twitter Profile Photo

6/n 5. whether you like it or not, some social interactions and conversations with various people (colleagues, friends, relatives, neighbors) could help.

Jian Ma (@jmuiuc) 's Twitter Profile Photo

Proper and meaningful benchmark datasets are crucial for advancing genomic LLMs/FMs, and ML methods for genomics in general. Fantastic collab w/ Lei Li's group. Amazing work led by Wenduo Cheng, Zhenqiao Song, Yang Zhang

Graham Neubig (@gneubig) 's Twitter Profile Photo

Are you interested in getting started in research related to LLMs, agents, speech, safety, fairness, or other aspects of language technology? At Language Technologies Institute | @CarnegieMellon we're hosting an internship program for pre-doctoral students interested in these areas! lti.cs.cmu.edu/news-and-event…

Lei Li (@lileics) 's Twitter Profile Photo

A new comprehensive multilingual (and multitask) evaluation suite for LLMs (covering 17 diverse languages), developed by Xu Huang and folks! Check out BenchMAX at github.com/CONE-MT/BenchM…

Lei Li (@lileics) 's Twitter Profile Photo

A newly baked Dr.! Congratulations to Wenda Xu on successfully defending his PhD thesis "On Evaluation and Efficient Post-training for LLMs". Highly recommend his slides, which cover RL training, better KD, LLM/text generation evaluation, and bias in LLM-as-a-judge: docs.google.com/presentation/d…

Lei Li (@lileics) 's Twitter Profile Photo

The 2nd Generative AI and Biology workshop will be co-located with ICML 2025 in Vancouver this year (July 18/19, 2025). CFP: genbio-workshop.github.io/2025/ We have a fantastic lineup of speakers: Mengdi Wang, Eric Xing, Marinka Zitnik, Stefano Ermon, Minkai Xu, Zhenqiao Song

Lei Li (@lileics) 's Twitter Profile Photo

Excited to visit ABQ! We are presenting six papers at #NAACL2025 on simultaneous translation/speech translation, inference-time optimization, finding lottery tickets in LLMs, AI text detection, and language agents for task planning. I am here for the full week. Feel free to DM.

Lei Li (@lileics) 's Twitter Profile Photo

I will give a talk at 11:15am today in Ruidoso at #NAACL2025 about KS-Lottery: finding a small number of token embeddings in an LLM that are effective for fine-tuning. Surprising finding: 18 tokens are enough for fine-tuning!

Lei Li (@lileics) 's Twitter Profile Photo

Kexun is presenting OSCA - Optimal Sample Compute Allocation at #NAACL2025 in Hall 3 (#50). The paper presents an optimization algorithm to find optimal configurations for LLM inference. arxiv.org/abs/2410.22480

Lei Li (@lileics) 's Twitter Profile Photo

Can AI text detectors identify LLM-generated code, paper reviews, abstracts, translations, and summaries? Brian is presenting a new study of existing AI text detectors on LLM-generated content at #NAACL2025. TL;DR: all existing detectors work poorly. arxiv.org/abs/2412.05139

Lei Li (@lileics) 's Twitter Profile Photo

Simultaneous translation always aims to reduce latency while retaining translation quality, but measuring latency turns out to be non-trivial. Xi and Siqi’s new work proposes a highly accurate method, CA*, to measure latency in ST by taking actual inference time into account. #NAACL25

Lei Li (@lileics) 's Twitter Profile Photo

How to reduce latency for simultaneous (text) translation? Siqi proposes the TAF method: the key idea is to forecast source-side continuations of the utterance before the actual input arrives, and then use majority voting to generate possible translations. arxiv.org/abs/2410.22499 #NAACL2025

Lei Li (@lileics) 's Twitter Profile Photo

Better than LoRA! You only need to train as few as 18 token embeddings of LLaMA to achieve superior translation performance on new languages. KS-Lottery provides a statistically sound method to find an extremely small number of LLM embedding parameters to fine-tune!

Lei Li (@lileics) 's Twitter Profile Photo

We are organizing the Generative AI for Biology workshop at #ICML2025. We welcome submissions of any relevant work on AI for biomolecules, AI models for bio systems, AI and experiments, agents for bio discovery, new datasets and tools, etc. The deadline is May 25th. genbio-workshop.github.io/2025/

Lei Li (@lileics) 's Twitter Profile Photo

Just delivered 4 lectures (50 mins each, a total of 3 hours 20 mins) in a row at the Advanced Course on Data Science and Machine Learning (acdl2025.icas.events). Wonderful to have conversations with the ACDL participants! Thanks to the directors, Giuseppe Nicosia and Panos Pardalos
