GX Xu (@gx_nlp)'s Twitter Profile
GX Xu

@gx_nlp

Research Engineer @ Redhat AI Innovation

ID: 1542500294340186112

Joined: 30-06-2022 13:28:40

26 Tweets

63 Followers

313 Following

Ruibo Liu (@ruiboliu)'s Twitter Profile Photo

🎲Life is a game. Play by your rules! 🎮 Stable Alignment enables LM to learn social norms from simulated everyday interactions in a social game! 👫 Check this out 👇: arxiv.org/abs/2305.16960

Matt Shumer (@mattshumer_)'s Twitter Profile Photo

Here is an incredible Claude 3 prompt for engineers. Use it to speed up any code by identifying inefficiencies and rectifying them: --- <prompt_explanation> You are a world expert in making code run faster. You use any resource you can to do so. Given some code, first, explain

Jeremy Howard (@jeremyphoward)'s Twitter Profile Photo

I'd given up using ChatGPT for all but the most basic tasks -- I just wasn't getting answers that were good enough to be of practical use to me. But Claude 3 Opus is being genuinely useful, and it's making me use LLM chat again. Thanks Anthropic!

Inflection AI (@inflectionai)'s Twitter Profile Photo

Evaluation is everything! While testing Inflection-2.5, we found that MT-Bench has a bunch of incorrect answers. Here we share the corrections for everyone to use, and we release a new Physics GRE benchmark for people to try out. inflection.ai/inflection-2-5

Jeremy Howard (@jeremyphoward)'s Twitter Profile Photo

Today, with Tim Dettmers, Hugging Face, & @mobius_labs, we're releasing FSDP/QLoRA, a new project that lets you efficiently train very large (70b) models on a home computer with consumer gaming GPUs. 1/🧵 answer.ai/posts/2024-03-…

Brendan Dolan-Gavitt (@moyix)'s Twitter Profile Photo

I gave Claude 3 the entire source of a small C GIF decoding library I found on GitHub, and asked it to write me a Python function to generate random GIFs that exercised the parser. Its GIF generator got 92% line coverage in the decoder and found 4 memory safety bugs and one hang.
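The generator the tweet describes could look something like the sketch below: emit a valid GIF signature so the decoder gets past its magic-number check, then random bytes to exercise deeper parsing paths. All names here (`random_gif`, `GIF_HEADER`) are hypothetical stand-ins, not code from the tweet; a real grammar-aware generator would also construct logical screen descriptors and image blocks.

```python
import random

GIF_HEADER = b"GIF89a"  # valid GIF signature so the parser proceeds

def random_gif(rng, max_body_len=64):
    """Hypothetical fuzz-input generator: a real GIF header followed by
    random bytes, aimed at exercising a GIF decoder's parsing code."""
    body = bytes(rng.randrange(256) for _ in range(rng.randrange(max_body_len)))
    return GIF_HEADER + body

rng = random.Random(42)
sample = random_gif(rng)
assert sample.startswith(GIF_HEADER)
```

In practice such inputs would be fed to the decoder under a coverage tool (and a sanitizer, to surface the memory-safety bugs the tweet mentions).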

swyx (@swyx)'s Twitter Profile Photo

I've now had multiple >20min phone calls with AI therapists and it feels completely natural. Every AI Engineer should be building their own therapist rn, and voice is the right medium. forget typing. go on a long walk and talk thru your day, your childhood, your dreams,

Elron Bandel (@elronbandel)'s Twitter Profile Photo

A personal note: Unitxt originated within Leshem Choshen's fusing team, aiming to streamline the sharing of academic outputs, primarily through model weights but also data. In the process of training various models on numerous datasets, we encountered significant challenges related

GX Xu (@gx_nlp)'s Twitter Profile Photo

Even a powerful LLM like Claude 3 Opus breaks under the simplest attacks and starts hallucinating “non-existent” context about “steps”. The kind of mistake a human five-year-old wouldn’t make. 😉

GX Xu (@gx_nlp)'s Twitter Profile Photo

TLDR: Looking for an RLHF method that combines the best of PPO and DPO, trains stably, and gives amazing results? BRAIn theoretically unites DPO and PPO and is empirically shown to outperform both! An earlier pre-print of the ICML paper is available now 🔥

GX Xu (@gx_nlp)'s Twitter Profile Photo

A new RL alignment method, here’s Gaurav’s excellent blog that explains why BRAIn is more stable and gives better performance than PPO and DPO 🔥

lmarena.ai (formerly lmsys.org) (@lmarena_ai)'s Twitter Profile Photo

Congrats Google DeepMind on the new Gemma-2 27B & 9B release! Gemma-2 was tested in the Arena under the codename "*late-june-chatbots" and now out of stealth. Its early result matches the best open models (Llama-3-70B, Nemotron-340B) with only 27B parameters! Impressively,

Red Hat AI (@redhat_ai)'s Twitter Profile Photo

.Red Hat AI Innovation team just dropped a new research paper on inference-time scaling! 🚨 All built on vLLM. Paper and code here: …abilistic-inference-scaling.github.io Cheers to paper authors Akash Srivastava, Kai Xu, GX Xu, Shivchander Sudalairaj, and Isha Puri!

Isha Puri (@ishapuri101)'s Twitter Profile Photo

[1/x] can we scale small, open LMs to o1 level? Using classical probabilistic inference methods, YES! Joint MIT CSAIL / Red Hat AI Innovation Team work introduces a particle filtering approach to scaling inference w/o any training! check out …abilistic-inference-scaling.github.io

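The core loop of a particle-filtering approach to inference scaling can be sketched in a few lines: keep a population of partial generations, weight them by a reward/verifier score, and resample so compute concentrates on promising candidates. This is a minimal toy, not the paper's implementation; `propose` and `reward` are hypothetical stand-ins for an LLM step function and a process reward model.

```python
import math
import random

def particle_filter_search(propose, reward, n_particles=8, n_steps=3, seed=0):
    """Toy particle filter over partial generations."""
    rng = random.Random(seed)
    particles = [""] * n_particles
    for _ in range(n_steps):
        # Extend every particle by one generation step.
        particles = [propose(p, rng) for p in particles]
        # Weight by exponentiated reward and resample proportionally,
        # so high-reward partial generations get duplicated.
        weights = [math.exp(reward(p)) for p in particles]
        total = sum(weights)
        particles = rng.choices(particles, weights=[w / total for w in weights],
                                k=n_particles)
    return max(particles, key=reward)

# Demo with stand-ins: "generation" appends a random digit, and the
# reward prefers strings whose digits sum high.
def propose(prefix, rng):
    return prefix + str(rng.randint(0, 9))

def reward(text):
    return sum(int(c) for c in text) / 10 if text else 0.0

best = particle_filter_search(propose, reward)
print(best)  # a 3-digit string biased toward large digits
```

The appeal of the approach in the tweet is that all the extra compute goes into sampling and scoring at inference time; the underlying model needs no additional training.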
Hao Wang (@hw_haowang)'s Twitter Profile Photo

[1/x] 🚀 We're excited to share our latest work on improving inference-time efficiency for LLMs through KV cache quantization---a key step toward making long-context reasoning more scalable and memory-efficient.

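The basic idea behind KV cache quantization can be illustrated with a minimal sketch: store the attention key/value activations in 8 bits plus a scale factor, and dequantize on the fly. This is a generic per-tensor symmetric int8 scheme for illustration only, not the specific method from the linked work.

```python
import numpy as np

def quantize_int8(x):
    """Per-tensor symmetric int8 quantization: map floats into [-127, 127]."""
    max_abs = float(np.abs(x).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover an approximation of the original floats."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
kv = rng.standard_normal((4, 16)).astype(np.float32)  # stand-in K/V block
q, scale = quantize_int8(kv)
kv_hat = dequantize_int8(q, scale)

print(q.nbytes, kv.nbytes)  # 64 256: int8 storage is 4x smaller than float32
print(bool(np.abs(kv - kv_hat).max() < 0.05))
```

The memory saving is what makes long-context inference cheaper: the KV cache grows linearly with sequence length, so shrinking each cached activation by 4x directly extends the context a given GPU can hold.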
Red Hat AI (@redhat_ai)'s Twitter Profile Photo

LLM inference is too slow, too expensive, and too hard to scale. 🚨 Introducing llm-d, a Kubernetes-native distributed inference framework, to change that—using vLLM, smart scheduling, and disaggregated compute. Here’s how it works—and how you can use it today:

Red Hat AI (@redhat_ai)'s Twitter Profile Photo

Random Samples, our weekly seminar series that bridges the gap between cutting-edge AI research and real-world application, continues this Friday, July 18!

Title: 
Grounding Feedback is All You Need: Aligning Small Vision-Language Models

Abstract: 
While recent vision-language