Aran Komatsuzaki (@arankomatsuzaki)'s Twitter Profile
Aran Komatsuzaki

@arankomatsuzaki

ID: 794433401591693312

Link: https://arankomatsuzaki.wordpress.com/about-me/ · Joined: 04-11-2016 06:57:37

5.5K Tweets

105K Followers

82 Following

Aran Komatsuzaki (@arankomatsuzaki):

Automated Design of Agentic Systems

Presents Meta Agent Search to demonstrate that we can use agents to invent novel and powerful agent designs by programming in code

proj: shengranhu.com/ADAS/
abs: arxiv.org/abs/2408.08435
github: github.com/ShengranHu/ADAS
Aran Komatsuzaki (@arankomatsuzaki):

JPEG-LM: LLMs as Image Generators with Canonical Codec Representations

- Directly model images and videos via canonical codecs (e.g., JPEG, AVC/H.264)
- More effective than pixel-based modeling and VQ baselines (yields a 31% reduction in FID)

arxiv.org/abs/2408.08459
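The representation idea — an ordinary LM over raw codec bytes instead of pixels or learned VQ codes — can be sketched as byte-level tokenization (a toy illustration; JPEG-LM's actual tokenizer and vocabulary details are not reproduced here):

```python
# First bytes of a JFIF JPEG file: SOI marker (0xFFD8), then start of APP0.
JPEG_PREFIX = bytes([0xFF, 0xD8, 0xFF, 0xE0])

def bytes_to_tokens(data: bytes) -> list[int]:
    """Map a canonical codec file (e.g. JPEG) to LM tokens:
    one token per byte, a fixed vocabulary of 256, no VQ codebook."""
    return list(data)

def tokens_to_bytes(tokens: list[int]) -> bytes:
    """Inverse map: a generated token sequence is literally a file."""
    return bytes(tokens)
```

Image generation then reduces to next-token prediction over such byte sequences, and decoding the emitted bytes with any standard JPEG decoder recovers the image.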
Aran Komatsuzaki (@arankomatsuzaki):

Thanks a lot for joining our meetup! 

We had a great turnout, with so many interesting people showing up :)

Some people even joined just because they spotted the huge crowd we had formed in Dolores Park as they walked past 😂
Weizhu Chen (@weizhuchen):

We released phi 3.5: mini+MoE+vision

A better mini model with multilingual support: huggingface.co/microsoft/Phi-…
A new MoE model: huggingface.co/microsoft/Phi-…
A new vision model supporting multiple images: huggingface.co/microsoft/Phi-…

Aran Komatsuzaki (@arankomatsuzaki):

Meta presents Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

- Can generate images and text on a par with similar scale diffusion models and language models
- Compresses each image to just 16 patches

arxiv.org/abs/2408.11039
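A minimal sketch of the single-model objective as described — next-token cross-entropy on text positions plus a diffusion-style noise-prediction loss on image patches (a hypothetical simplification; the paper's exact losses, loss weighting, and patchification are not reproduced here):

```python
import math

def cross_entropy(logits, target):
    # Next-token loss at one text position: -log softmax(logits)[target].
    m = max(logits)
    z = sum(math.exp(l - m) for l in logits)
    return -(logits[target] - m - math.log(z))

def mse(pred, true):
    # Diffusion noise-prediction loss on one image patch.
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)

def transfusion_style_loss(text_steps, image_steps, lam=1.0):
    """text_steps: list of (logits, target_id) pairs for text positions.
    image_steps: list of (predicted_noise, true_noise) patch vectors.
    One model, one summed objective over both modalities."""
    lm_loss = sum(cross_entropy(lg, t) for lg, t in text_steps)
    diffusion_loss = sum(mse(p, t) for p, t in image_steps)
    return lm_loss + lam * diffusion_loss
```

The `lam` weighting and the per-patch noise targets are placeholders for whatever the actual training recipe uses.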
Aran Komatsuzaki (@arankomatsuzaki):

WE ARE STARTING IN 6 MIN

Hermes 3 - covered by emozilla from Nous Research
A brief discussion on Phi 3.5
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations
To Code, or

Google DeepMind (@googledeepmind):

Over the coming days, start creating and chatting with Gems: customizable versions of Gemini that act as topic experts. 🤝 We’re also launching premade Gems for different scenarios - including Learning coach to break down complex topics and Coding partner to level up your skills

James (@jamesliuid):

Your LLM may be sparser than you thought!

Excited to announce TEAL, a simple training-free method that achieves up to 40-50% model-wide activation sparsity on Llama-2/3 and Mistral models. Combined with a custom kernel, we achieve end-to-end speedups of up to 1.53x-1.8x!
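The activation-sparsity idea can be sketched as simple magnitude thresholding (a toy, training-free, per-tensor version; TEAL's actual method calibrates per-layer thresholds and pairs them with a custom kernel, which this sketch omits):

```python
def sparsify_activations(x, sparsity=0.5):
    """Zero out the lowest-magnitude fraction of a layer's activations.

    x: flat list of activation values; sparsity: fraction to drop.
    Training-free: no weights change, only small activations are skipped.
    """
    k = int(len(x) * sparsity)
    if k == 0:
        return list(x)
    # Threshold at the k-th smallest absolute value.
    thresh = sorted(abs(v) for v in x)[k - 1]
    return [0.0 if abs(v) <= thresh else v for v in x]
```

Zeroed activations mean the matching weight columns never need to be read in the next matmul, which is where a sparsity-aware kernel gets its end-to-end speedup.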
Aran Komatsuzaki (@arankomatsuzaki):

AI2 presents OLMoE: Open Mixture-of-Experts Language Models

- Open-sources SotA LMs w/ MoE up to 7B total params (~1B active).
- Releases model weights, training data, code, and logs. 

repo: github.com/allenai/OLMoE
hf: huggingface.co/allenai/OLMoE-…
abs: arxiv.org/abs/2409.02060
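A toy top-k router shows what "active params" means for an MoE: per token, only k of the experts actually run (a hypothetical pure-Python sketch with dot-product gating and a softmax over the selected experts — not OLMoE's actual implementation):

```python
import math

def moe_forward(x, experts, gate_weights, k=2):
    """Route one token vector `x` through the top-k of `experts`.

    experts: list of callables (vector -> vector).
    gate_weights: one router weight vector per expert.
    Only the k selected experts execute, so active params << total params.
    """
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in gate_weights]
    top = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the selected experts' logits only.
    m = max(logits[i] for i in top)
    weights = {i: math.exp(logits[i] - m) for i in top}
    z = sum(weights.values())
    out = [0.0] * len(x)
    for i in top:
        y = experts[i](x)  # only the active experts run
        out = [o + (weights[i] / z) * y_j for o, y_j in zip(out, y)]
    return out, top
```

The dense non-expert parts of the model (attention, embeddings) are shared; the router decides which expert FFNs fire per token.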
Tanishq Mathew Abraham, Ph.D. (@iscienceluvr):

STARTING IN 10 MIN!!!

Papers we will cover:
Building and better understanding vision-language models: insights and future directions - presented by Leo Tronchon
OLMoE: Open Mixture-of-Experts Language Models - presented by Niklas Muennighoff
Diffusion Models Are Real-Time Game

Aran Komatsuzaki (@arankomatsuzaki):

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark

- Tests the cognitive skill of seamlessly integrating visual and textual information
- Performance is substantially lower than on MMMU, ranging from 16.8% to 26.9% across models

proj:
Google DeepMind (@googledeepmind):

We’re presenting AlphaProteo: an AI system for designing novel proteins that bind more successfully to target molecules. 🧬 It could help scientists better understand how biological systems function, save time in research, advance drug design and more. 🧵 dpmd.ai/3XuMqbX