Namgyu Ho (@itsnamgyu)'s Twitter Profile
Namgyu Ho

@itsnamgyu

PhD student at OSI LAB @kaist_ai. Prev @LG_AI_Research.
Solving problems for next-gen LLMs. Previously showed that LLMs are reasoning teachers.

ID: 2579393588

Link: http://namgyu.com | Joined: 20-06-2014 23:12:19

218 Tweets

1.1K Followers

358 Following

Alice Oh (@aliceoh)'s Twitter Profile Photo

My student Na Yeon Lee will be at #naacl2024 to present our paper on "Exploring Cross-Cultural Differences in English Hate Speech Annotations" arxiv.org/pdf/2308.16705

We start with the simple fact that English is widely spoken globally, but current NLP datasets are focused on how
Rohan Paul (@rohanpaul_ai)'s Twitter Profile Photo

Another 'WOW' paper - up to 20x improvement in inference throughput with Block Transformer compared to vanilla transformers at equivalent perplexity. 🤯

How ❓ By MASSIVELY reducing KV cache IO overhead from quadratic to linear with respect to context length, solving a key
Namgyu Ho (@itsnamgyu)'s Twitter Profile Photo

Has anyone tried few-shot prompting a base model with existing jailbreaking techniques? Could give us a few more papers.

Namgyu Ho (@itsnamgyu)'s Twitter Profile Photo

EXAONE 3.0 7.8B beats Llama 3 8B and Gemma 2 9B on MT-Bench and more! Amazing work by my previous collaborators and mentors at LG AI Research 🚀

Namgyu Ho (@itsnamgyu)'s Twitter Profile Photo

Apple's recent foundation models apply heavy quantization, with LoRA adapters to mitigate the resulting performance degradation, to fit ~3B models on device. Check out the latest 🔬 details on QLoRA techniques from *the* efficiency expert 🧙‍♂️ from Korea.