Aran Komatsuzaki (@arankomatsuzaki)'s Twitter Profile
Aran Komatsuzaki

@arankomatsuzaki

@TeraflopAI

ID:794433401591693312

Link: https://arankomatsuzaki.wordpress.com/about-me/ · Joined: 04-11-2016 06:57:37

4.9K Tweets

95.7K Followers

78 Following

Christian Holz(@cholz) 's Twitter Profile Photo

new research: Ultra Inertial Poser: Scalable full-body tracking in the wild using sparse sensing


No cameras—just 6 wearables (IMU+UWB) for our graph model to estimate poses

We are the first to process raw IMU signals and need no proprietary sensors for 3D orientation

Aran Komatsuzaki(@arankomatsuzaki) 's Twitter Profile Photo

Self-Play Preference Optimization for Language Model Alignment

SPPO serves as the RLHF counterpart of SPIN and outperforms iterative DPO, Snorkel AI, Self-Rewarding LM, GPT-4 0613 etc

arxiv.org/abs/2405.00675

Nikita Drobyshev(@NikDrob23) 's Twitter Profile Photo

I am thrilled to announce that my latest paper has been accepted at CVPR 2024:
EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars
🔗 Project Page: neeek2303.github.io/EMOPortraits/

Aran Komatsuzaki(@arankomatsuzaki) 's Twitter Profile Photo

Scale AI presents A Careful Examination of LLM Performance on Grade School Arithmetic

- Evaluate existing LLMs on a new test set of GSM8K
- Observe accuracy drops of up to 13%, with models like Phi and Mistral showing evidence of systematic overfitting

arxiv.org/abs/2405.00332

Aran Komatsuzaki(@arankomatsuzaki) 's Twitter Profile Photo

Btw this multiple token training is not a panacea. The performance gain depends on the target task.

It leads to no perf gain or slight degradation on some multiple choice questions.

It leads to minor improvement on summarization and little to no improvement on arithmetic…

Chujie Zheng @ ICLR 2024(@ChujieZheng) 's Twitter Profile Photo

✨New Paper Alert✨
Excited to introduce ExPO, an extremely simple method to boost LLMs' alignment with human preference, via weak-to-strong model extrapolation
👇
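The core idea behind weak-to-strong extrapolation can be sketched in a few lines: treat alignment training as a direction in weight space from the weaker SFT checkpoint toward the aligned checkpoint, then step further along it. A minimal numpy sketch, assuming per-parameter dicts and a hypothetical extrapolation strength `alpha` (this is an illustration of the idea, not the paper's exact recipe):

```python
import numpy as np

def expo_extrapolate(theta_sft, theta_aligned, alpha=0.3):
    """Weak-to-strong extrapolation sketch:
    theta = theta_aligned + alpha * (theta_aligned - theta_sft).

    theta_sft, theta_aligned: dicts mapping parameter names to arrays.
    alpha: hypothetical extrapolation strength.
    """
    return {
        name: theta_aligned[name] + alpha * (theta_aligned[name] - theta_sft[name])
        for name in theta_aligned
    }
```

The appeal is that it needs no extra training: only two existing checkpoints and a scalar.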

Ziming Liu(@ZimingLiu11) 's Twitter Profile Photo

Aran Komatsuzaki Thanks for sharing our work! In case anyone's interested in digging more, here's my tweet: twitter.com/ZimingLiu11/st…

Aran Komatsuzaki(@arankomatsuzaki) 's Twitter Profile Photo

KAN: Kolmogorov–Arnold Networks

Proposes an alternative to MLP that outperforms in terms of accuracy and interpretability

arxiv.org/abs/2404.19756
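In a KAN, the learnable pieces sit on the edges rather than the nodes: each edge (i, j) carries its own trainable 1-D function, and a node just sums its incoming edges. A minimal sketch, using degree-3 polynomials for the edge functions for simplicity (the paper parameterizes them as B-splines plus a base function; the names here are hypothetical):

```python
import numpy as np

def kan_layer(x, coeffs):
    """KAN-style layer sketch: y_j = sum_i phi_ij(x_i), where
    phi_ij(t) = sum_k coeffs[i, j, k] * t**k is a per-edge polynomial.

    x:      (in_dim,) input vector
    coeffs: (in_dim, out_dim, degree+1) learnable per-edge coefficients
    """
    # powers[i, k] = x_i ** k
    powers = np.stack([x**k for k in range(coeffs.shape[-1])], axis=-1)
    # phi[i, j] = sum_k coeffs[i, j, k] * x_i**k
    phi = np.einsum("ik,ijk->ij", powers, coeffs)
    return phi.sum(axis=0)  # (out_dim,)
```

Contrast with an MLP layer, where the edge weights are scalars and a fixed nonlinearity is applied at the node.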

Aran Komatsuzaki(@arankomatsuzaki) 's Twitter Profile Photo

Meta presents Iterative Reasoning Preference Optimization

Increasing accuracy for Llama-2-70B-Chat:
- 55.6% -> 81.6% on GSM8K
- 12.5% -> 20.8% on MATH
- 77.8% -> 86.7% on ARC-Challenge

arxiv.org/abs/2404.19733

Tanishq Mathew Abraham, Ph.D.(@iScienceLuvr) 's Twitter Profile Photo

gpt2-chatbot → Generative Pretrained Transformer 2 Chatbot

This is clearly a scaled up version of the Transformer 2 architecture!! ↓

Aran Komatsuzaki(@arankomatsuzaki) 's Twitter Profile Photo

Meta presents Better & Faster Large Language Models via Multi-token Prediction

- training language models to predict multiple future tokens at once results in higher sample efficiency
- up to 3x faster at inference

arxiv.org/abs/2404.19737
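The training objective is easy to sketch: attach n output heads to the shared trunk, where head k predicts the token k steps ahead, and sum the per-head cross-entropies. A hedged numpy sketch (names and shapes are assumptions for illustration, not the paper's implementation):

```python
import numpy as np

def multi_token_loss(hidden, heads, targets):
    """Multi-token prediction loss sketch.

    hidden:  (T, d) shared trunk hidden states
    heads:   list of n (d, V) output matrices; head k predicts token t+k+1
    targets: (T, n) future-token ids, targets[t, k] = token at position t+k+1
    Returns the summed mean cross-entropy over all n heads.
    """
    total = 0.0
    for k, W in enumerate(heads):
        logits = hidden @ W                                  # (T, V)
        logits -= logits.max(axis=-1, keepdims=True)          # stabilize
        logp = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
        total += -logp[np.arange(len(hidden)), targets[:, k]].mean()
    return total
```

At inference the extra heads can be dropped (recovering standard next-token decoding) or used for self-speculative decoding, which is where the speedup comes from.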

Aran Komatsuzaki(@arankomatsuzaki) 's Twitter Profile Photo

MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model

proj: dai-wenxun.github.io/MotionLCM-page/
abs: arxiv.org/abs/2404.19759

TeraflopAI(@TeraflopAI) 's Twitter Profile Photo

Awesome to see Joseph Spisak, AI Product Director, Meta, mention our previous research, YaRN, on stage at the Weights & Biases Fully Connected conference. We have another very exciting long-context release coming soon.

Weiyan Shi(@shi_weiyan) 's Twitter Profile Photo

🚨New Paper🚨
We propose
1⃣CultureBank🌎 dataset sourced from TikTok & Reddit
2⃣An extensible pipeline to build cultural knowledge bases
3⃣Evaluation of LLMs’ cultural awareness
4⃣Insights into culturally-aware LLMs

Project: culturebank.github.io
Data: shorturl.at/hrtwP

Aran Komatsuzaki(@arankomatsuzaki) 's Twitter Profile Photo

Apple presents OpenELM

- An efficient LM family with open-source training and inference framework
- Performs on par with OLMo while requiring 2x fewer pre-training tokens

repo: github.com/apple/corenet
hf: huggingface.co/apple/OpenELM
abs: arxiv.org/abs/2404.14619

Aran Komatsuzaki(@arankomatsuzaki) 's Twitter Profile Photo

SnapKV: LLM Knows What You are Looking for Before Generation

- Automatically compresses KV caches
- Consistent decoding speed with a 3.6x increase in generation speed and an 8.2x enhancement in memory efficiency

repo: github.com/FasterDecoding…
abs: arxiv.org/abs/2404.14469
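The gist of this style of KV-cache compression can be sketched simply: score each cached position by how much attention it receives from a recent "observation window" of queries, then keep only the top-scoring keys/values. A hedged numpy sketch (the function name, pooling choice, and shapes are assumptions for illustration, not the paper's exact algorithm):

```python
import numpy as np

def compress_kv(keys, values, obs_attn, keep=4):
    """KV-cache compression sketch.

    keys, values: (T, d) cached key/value tensors for one head
    obs_attn:     (W, T) attention weights from the last W queries
    keep:         number of positions to retain
    """
    scores = obs_attn.sum(axis=0)               # pooled importance per position
    idx = np.sort(np.argsort(scores)[-keep:])   # top-k positions, original order
    return keys[idx], values[idx]
```

Because the retained cache is much smaller, decoding memory and per-step attention cost shrink while the most-attended context is preserved.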
