Aran Komatsuzaki (@arankomatsuzaki) Twitter Tweets • TwiCopy

Aran Komatsuzaki

@arankomatsuzaki

@TeraflopAI

ID:794433401591693312

Link: https://arankomatsuzaki.wordpress.com/about-me/

Joined: 04-11-2016 06:57:37

4,8K Tweets

94,6K Followers

78 Following

Xiuyu Li

2 weeks ago

Handling long context in LLMs is expensive, but can we cut the cost by learning them offline for a specific set/genre of documents?

Introducing LLoCO, our new technique that learns documents offline through context compression and in-domain finetuning using LoRA, which achieves…

Likes: 17 · Replies: 0

Jiayi Pan

2 weeks ago

Thanks Aran for sharing!
AI feedback will enable autonomous evaluation and improvement of language agents at scale. We have a thread here if you wanna learn more :)

twitter.com/pan_jiayipan/s…

Likes: 2 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

Autonomous Evaluation and Refinement of Digital Agents

Improves WebArena's GPT-4 SotA agent by 30%+ and CogAgent on iOS by 75%, using only a VLM-based evaluator and no extra supervision

repo: github.com/Berkeley-NLP/A…
abs: arxiv.org/abs/2404.06474

Likes: 27 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications

Presents a runtime for LLMs with intuitive undo and damage-confinement abstractions, enabling safer deployment of LLM agents in practice

repo: github.com/ShishirPatil/g…
abs:…

Likes: 11 · Replies: 0

Zhibin Gou

2 weeks ago

🔥 Not All Tokens Are What You Need!
🚀 Releasing the Rho-1 series, including the first 1B LLM to hit 40.6% on MATH.

Rho-1 introduces Selective Language Modeling (#SLM) for token-level pretraining data selection.

Thanks to AK and Aran Komatsuzaki for sharing our work!

Likes: 76 · Replies: 0

Iker García-Ferrero

2 weeks ago

We have released the training corpus, models, and many multilingual evaluation benchmarks on Hugging Face: huggingface.co/collections/Hi…

We hope this project begins a wave of multilingual models for the medical domain!

Likes: 27 · Replies: 0

Ruibo Liu

2 weeks ago

Thanks Aran for sharing our work!

This is a survey paper I’ve been thinking about for a long time, as we have seen an increasing need for synthetic data. As we will probably run out of fresh tokens soon, the audience of this paper should be everyone who cares about AI progress.

Likes: 91 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

AssemblyAI presents Conformer-1: Robust ASR via Large-Scale Semisupervised Bootstrapping

Presents an end-to-end ASR model trained on 570k hours of speech data

arxiv.org/abs/2404.07341

Likes: 14 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

Google presents Best Practices and Lessons Learned on Synthetic Data for Language Models

Provides an overview of synthetic data research, discussing its applications, challenges, and future directions

arxiv.org/abs/2404.07503

Likes: 538 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples

Several LLMs (e.g., GPT-4) perform on par w/ supervised methods like Random Forest on regression

repo: github.com/robertvacarean…
abs: arxiv.org/abs/2404.07544

Likes: 66 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

Medical mT5: An Open-Source Multilingual Text-to-Text LLM for The Medical Domain

arxiv.org/abs/2404.07613

Likes: 81 · Replies: 0

Tianbao Xie

2 weeks ago

Thanks Aran for sharing!! 🤗🤗

Let’s work together towards a general-purpose computer agent once again, this time with multimodal language model agents.

(Will have an official post later.)

Likes: 42 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

ByteDance presents InfiCoder-Eval

InfiCoder-Eval comprises 270 carefully picked high-quality StackOverflow questions, covering 18 programming languages, for evaluating code LLMs

proj: infi-coder.github.io/inficoder-eval/
abs: arxiv.org/abs/2404.07940

Likes: 77 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

Microsoft presents Rho-1: Not All Tokens Are What You Need

RHO-1-1B and 7B achieve SotA results of 40.6% and 51.8% on the MATH dataset, respectively, matching DeepSeekMath with only 3% of the pretraining tokens.

repo: github.com/microsoft/rho
abs: arxiv.org/abs/2404.07965

Likes: 280 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

The first-of-its-kind scalable, real computer environment for multimodal agents, supporting task setup, execution-based evaluation, and interactive learning across various operating…

Likes: 281 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

Apple presents Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

Substantial improvements over SotA VLMs, thanks to its high-resolution scaling and fine-grained visual processing

arxiv.org/abs/2404.07973

Likes: 124 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

LLoCO: Learning Long Contexts Offline

- Significantly outperforms ICL while using 30x fewer tokens during inference
- Achieves up to 7.62x speed-up and substantially reduces the cost of long document QA

arxiv.org/abs/2404.07979

Likes: 103 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback

Proposes an approach that improves controllable generation by explicitly optimizing pixel-level cycle consistency

proj: liming-ai.github.io/ControlNet_Plu…
abs: arxiv.org/abs/2404.07987

Likes: 82 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

The only true long-context language modeling is message-passing via gradient descent (sometimes w/ retrieval)

Likes: 24 · Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

2 weeks ago

Attention international students in the U.S. exploring work options!

Did you know that unlike OPT or full-time CPT, part-time CPT (under 20 hours/week) is not subject to a cap?

This means that you can work for as many semesters as you want w/o affecting your future CPT / OPT!

Likes: 19 · Replies: 0