Wei Xu (@cocoweixu)'s Twitter Profile
Wei Xu

@cocoweixu

CS professor @GeorgiaTech @gtcomputing @ICatGT @mlatgt. Natural language processing, machine learning, LLMs, social media research.

ID: 237918251

Website: https://cocoxu.github.io/

Joined: 13-01-2011 23:15:12

1,1K Tweet

10,10K Followers

1,1K Following

Wei Xu (@cocoweixu)

It was a pleasure to host Mike Lewis from Meta to speak at Georgia Tech's ML Seminar this week. His insights into training LLMs, based on his firsthand experience with successfully developing Llama-3 🦙, were incredibly engaging and informative. Thanks, Mike Lewis!

Yu Lu Liu 🦋@ liuyulu.bsky.social (@liu_yu_lu)

Human-centered Evaluation and Auditing of Language models (HEAL) workshop is back for #CHI2025, with this year's special theme: “Mind the Context”! Come join us on this bridge between #HCI and #NLProc! Submission deadline: Feb 17 AoE. More info at heal-workshop.github.io.
Chao Jiang (@chaojiang06)

I defended my doctoral thesis, "Studying Text Revision in Scientific Writing," yesterday! 🎓 A big thanks to my advisor Wei Xu for training me to become a researcher, and thanks to my respected committee: Alan Ritter, kartik goyal, Violet Peng & Dr. Cheng Li from DeepMind!
Ziang Xiao (@ziangxiao)

Dr. Susu Zhang (sites.google.com/view/susuzhang/) and I are recruiting a postdoc to work on Gen AI evaluation design and assessment. If you want to join the team, please contact me and apply through the DSAI fellowship!

Jim Fan (@drjimfan)

This is the most gut-wrenching blog I've read, because it's so real and so close to heart. The author is no longer with us. I'm in tears. AI is not supposed to be 200B weights of stress and pain. It used to be a place of coffee-infused eureka moments, of exciting late-night arxiv

Tarek Naous (@tareknaous)

What causes entity-related cultural biases in LMs? Is it just pre-training data? Our latest paper shows how varying linguistic phenomena exhibited by entities (such as word sense in Arabic) impact the cross-cultural performance of LMs. arxiv.org/abs/2501.04662

Tu Vu (@tuvllms)

📢📢 If you're interested in a full-time Research Scientist/Engineer position at Google DeepMind (Mountain View, CA) working on connecting retrieval to Gemini, RAG, generative retrieval, open-book QA, etc., please email me at [email protected] with your CV and/or website.

Georgia Tech Computing (@gtcomputing)

As artificial intelligence (AI) continues to evolve, its impact on society becomes increasingly profound. To gain insights into the trends shaping the AI landscape in 2025, we spoke with Wei Xu (@cocoweixu), an associate professor at Georgia Tech’s School of Interactive Computing
Mohit (@mohit_r9a)

🚨 Just out: Targeted data curation for SFT and RLHF is a significant cost factor 💰 for improving LLM performance during post-training. How should you allocate your data annotation budgets between SFT and Preference Data? We ran 1000+ experiments to find out! 1/7
Hanna Hajishirzi (@hannahajishirzi)

Excited to drive innovation and push the boundaries of open, scientific AI research & development! 🚀 Join us at Ai2 to shape the future of OLMo, Molmo, Tulu, and more. We’re hiring at all levels—apply now! 👇 #AI #Hiring Research Engineer job-boards.greenhouse.io/thealleninstit… Research

Jungsoo Park (@jungsoo___park)

🚨 Just Out: Can LLMs extract experimental data about themselves from scientific literature to improve understanding of their behavior? We propose a semi-automated approach for large-scale, continuously updatable meta-analysis to uncover intriguing behaviors in frontier LLMs. 🧵
Jonathan Zheng (@jonathanqzheng)

🚨 o3-mini vastly outperforms DeepSeek-R1 on an unseen probabilistic reasoning task! Introducing k-anonymity estimation: a novel task to assess privacy risks in sensitive texts. Unlike conventional math and logical reasoning, this is difficult for both humans and AI models. 1/7
Alan Ritter (@alan_ritter)

Wondering what review scores you need to get accepted at ACL? Maybe this data from NAACL 2025 can help: gist.github.com/aritter/8b65a9…

Agam A. Shah (@shahagam4)

Thrilled to share our new preprint: "Beyond the Reported Cutoff: Where LLMs Fall Short on Financial Knowledge". We evaluated 197,011 revenue questions across 17,621 U.S. companies (1980–2022) using 6 top LLMs. Key insights 🧵
Wei Xu (@cocoweixu)

I am giving a keynote at PrivateNLP Workshop (sites.google.com/view/privatenl…) at #NAACL2025 (Sunday 9am CT).
* GPT4-v is a performant geolocator, predicting exact GPS coordinates of image > any SOTA
* LLMs can estimate privacy risk based on probabilistic reasoning > chain-of-thoughts
Andrej Karpathy (@karpathy)

An attempt to explain (current) ChatGPT versions. I still run into many, many people who don't know that: - o3 is the obvious best thing for important/hard things. It is a reasoning model that is much stronger than 4o and if you are using ChatGPT professionally and not using o3

Geyang Guo (@cherylolguo)

❤️🌎 Introducing CARE: Multilingual Multicultural Human Preference Learning
3490 culturally relevant prompts + 31.7k Human/AI-written responses rated by multilingual speakers
💡 Key insights:
- Even a small amount of cultural data improves popular LLMs consistently.
- Deepseek-v3